2.2 AI system security vulnerabilities and attacks


AI systems, like other software systems, face a range of security threats. These issues may arise from inherent weaknesses in the design of AI algorithms, the data used to train the models, or the operational context. Specific examples include:

Toolchain and dependency vulnerabilities that arise unintentionally through the use of automated code-generation tools (e.g., GitHub Copilot), programming languages and libraries (e.g., Python, OpenCV), deep learning frameworks (e.g., TensorFlow, PyTorch), or as a result of complex interdependencies in the development environment.

Integrating external tools and APIs into AI applications can compromise the trustworthiness and privacy of those systems, because the tools and APIs may be unreliable or susceptible to adversarial control.

Security vulnerabilities in physical and network infrastructure, such as flaws in graphics processing units (GPUs) or susceptibility to sophisticated attacks like side-channel and Rowhammer attacks, can lead to unauthorized access to or manipulation of model parameters when the affected hardware is used to train AI systems. The use of distributed network systems for training AI systems such as LLMs also exposes them to network-specific threats like pulsating attacks or congestion.

Direct manipulation of AI systems, such as adversarial attacks and instruction-based attacks. Adversarial attacks focus on altering the model's learning process or extracting its data. They include perturbations designed to deceive models into producing incorrect outputs, extraction attacks to steal model insights, and poisoning attacks to alter model behavior. Instruction-based attacks manipulate the way the model handles and responds to inputs: attackers deliberately craft prompts to induce models to produce biased or unsafe outputs (a.k.a. 'jailbreaking'). This manipulation directly targets the operational aspects of AI systems with the intent to cause harm.

Excerpt from the MIT AI Risk Repository full report
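As a rough illustration of the perturbation attacks mentioned in the excerpt above, the sketch below applies a single fast-gradient-sign-style step to a toy logistic-regression classifier. The choice of technique, the weights `W` and `B`, the input `x_clean`, and the step size `eps` are all illustrative assumptions that do not come from the report; the point is only to show how a small, bounded change to each input feature can flip a model's prediction.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x) -> float:
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v: float) -> int:
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm_perturb(w, b, x, y, eps):
    """One gradient-sign step that increases the loss for true label y.

    For cross-entropy loss on a logistic model, the gradient of the loss
    with respect to the input is (p - y) * w, so each feature is moved by
    +/- eps in the direction that makes the correct label less likely.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input, chosen only for illustration.
W = [2.0, -1.0, 0.5]
B = 0.0
x_clean = [0.4, 0.1, 0.2]
true_label = 1
eps = 0.3

x_adv = fgsm_perturb(W, B, x_clean, true_label, eps)
clean_label = int(predict_proba(W, B, x_clean) >= 0.5)
adv_label = int(predict_proba(W, B, x_adv) >= 0.5)

print("clean prediction:", clean_label)
print("adversarial prediction:", adv_label)
```

In this toy setting, no feature moves by more than `eps`, yet the predicted class changes. Real attacks on deep models follow the same idea but obtain the gradient through automatic differentiation rather than a closed-form expression.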

Vulnerabilities that can be exploited in AI systems, software development toolchains, and hardware, resulting in unauthorized access, data and privacy breaches, or system manipulation causing unsafe outputs or behavior.

Incidents up 60% since 2024
Ranked 2nd of 24 subdomains for governance coverage
Well-covered — governance coverage outpaces incident volume
71% of recorded incidents occurred since 2023
251 enacted and 90 proposed governance documents

112 risks (3rd)

21 incidents (10th)

345.5 governance (2nd)

Governance vs. Incident volume

Well-covered (-0.08)

Figure: incident volume relative to governance coverage, on a scale from well-governed to under-governed; each dot is one of 24 subdomains.

Dataset Drilldown

Entity (who or what caused the harm): Human, AI system, Other, Not coded

Intent (whether the harm was intentional or accidental): Intentional, Unintentional, Other, Not coded

Timing (whether the risk is pre- or post-deployment): Pre-deployment, Post-deployment, Other, Not coded


Recent Incidents

An autonomous AI security testing agent successfully compromised McKinsey's internal AI platform Lilli within 2 hours, gaining full read and write access to the production database containing 46.5 million chat messages, 728,000 files, and 57,000 user accounts through an unauthenticated SQL injection vulnerability.

Entity: AI system; Intent: Intentional; Timing: Post-deployment

Developers: McKinsey & Company, CodeWall

Deployers: McKinsey & Company, CodeWall


Three Chinese AI labs (DeepSeek, Moonshot, and MiniMax) conducted large-scale distillation attacks against Anthropic's Claude model, using over 24,000 fraudulent accounts to generate 16+ million exchanges and extract capabilities for their own models.

Entity: Human; Intent: Intentional; Timing: Post-deployment

Developers: Anthropic

Deployers: DeepSeek, Moonshot AI, MiniMax, Proxy Reseller Services


A software engineer using an AI coding assistant to reverse-engineer his DJI robot vacuum's communication system inadvertently gained access to live camera feeds, microphone audio, maps, and status data from nearly 7,000 other vacuums across 24 countries due to a backend security vulnerability.

Entity: AI system; Intent: Unintentional; Timing: Post-deployment

Developers: DJI

Deployers: DJI



Privacy & Security subdomains

2.1 Compromise of privacy by leaking or correctly inferring sensitive information

2.2 AI system security vulnerabilities and attacks

Related Subdomains

7.3 Lack of capability or robustness

AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.

341 shared governance docs

6.5 Governance failure

Inadequate regulatory frameworks and oversight mechanisms that fail to keep pace with AI development, leading to ineffective governance and the inability to manage AI risks appropriately.

323 shared governance docs

7.4 Lack of transparency or interpretability

Challenges in understanding or explaining the decision-making processes of AI systems, which can lead to mistrust, difficulty in enforcing compliance standards or holding relevant actors accountable for harms, and the inability to identify and correct errors.

262 shared governance docs

6.4 Competitive dynamics

AI developers or state-like actors competing in an AI ‘race’ by rapidly developing, deploying, and applying AI systems to maximize strategic or economic advantage, increasing the risk they release unsafe and error-prone systems.

256 shared governance docs

Recent incident reports

1412. CodeWall's Autonomous Agent Reportedly Obtained Unauthorized Access to McKinsey’s Lilli AI Platform Database

1395. Anthropic Said DeepSeek, Moonshot, and MiniMax Used Fraudulent Accounts and Proxies to Illicitly Distill Claude Capabilities at Scale

1389. DJI Romo Cloud Authorization Bug Reportedly Exposed Camera, Microphone, and Home-Mapping Data From Nearly 7,000 Robot Vacuums


Recent Governance Documents

FY2026 NDAA, Section 224 ("National Security and Defense Artificial Intelligence Institute")

FY2026 NDAA, Section 347 ("Integration of commercially available artificial intelligence capabilities into logistics operations")

FY2026 NDAA, Section 1007 ("Use of technology using artificial intelligence to facilitate audit of the financial statements of the Department of Defense for fiscal year 2026")
