An autonomous AI security testing agent successfully compromised McKinsey's internal AI platform Lilli within 2 hours, gaining full read and write access to the production database containing 46.5 million chat messages, 728,000 files, and 57,000 user accounts through an unauthenticated SQL injection vulnerability.
McKinsey & Company deployed an internal AI platform called Lilli in 2023 for its 43,000+ employees, processing 500,000+ prompts monthly with 70% adoption. CodeWall's autonomous offensive security agent targeted the system without credentials or insider knowledge, finding publicly exposed API documentation with over 200 endpoints. The agent discovered that 22 endpoints lacked authentication, with one allowing SQL injection through JSON key manipulation in user search queries. Within 2 hours, the agent achieved full read and write access to the production database. The compromised data included 46.5 million chat messages containing strategy discussions, client engagements, and financial information stored in plaintext. Additionally, 728,000 files were accessible including 192,000 PDFs, 93,000 Excel spreadsheets, and 93,000 PowerPoint presentations. The breach exposed 57,000 user accounts, 384,000 AI assistants, 94,000 workspaces, and 3.68 million RAG document chunks representing decades of proprietary McKinsey research. The agent also found it could modify system prompts controlling AI behavior, potentially allowing silent manipulation of the AI's responses to employees. McKinsey patched the vulnerabilities within one day of disclosure on 2026-03-02.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
Vulnerabilities that can be exploited in AI systems, software development toolchains, and hardware, resulting in unauthorized access, data and privacy breaches, or system manipulation causing unsafe outputs or behavior.
AI system
Due to a decision or action made by an AI system
Intentional
Due to an expected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed