Prompt injection attacks

Bunzel (2025) | LLM classified

Mitigation Taxonomy

1 AI System

1.2 Non-Model

1.2.1 Guardrails & Filtering

Input validation, output filtering, and content moderation classifiers.

Also in Non-Model

1.2.2 Runtime Environment

1.2.3 Monitoring & Detection

1.2.4 Security Infrastructure

1.2.5 Provenance & Watermarking

Definition (p. 72)

To defend against prompt injection attacks, integrators can implement input sanitization and filtering mechanisms to detect and block malicious instructions. Prompt injection attacks not only compromise the integrity of an AI system by manipulating inputs to produce unintended outputs, but can also target confidentiality by extracting sensitive or private information from the system. While input validation is more challenging for natural language than for structured inputs like SQL, these measures remain critical.
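As a rough illustration of the input sanitization and filtering described above, the following is a minimal sketch, not the method from Bunzel (2025). The pattern list, the `MAX_INPUT_CHARS` limit, and the function names `sanitize` and `screen_input` are all hypothetical choices made for this example. A static deny-list like this is easy to evade, which reflects the point that validating natural language is harder than validating structured input.

```python
import re
import unicodedata
from typing import Tuple

# Hypothetical deny-list for illustration only; a real deployment would
# combine this with a trained classifier and output-side checks.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.IGNORECASE),
    re.compile(r"(reveal|print|show)\s+(your\s+)?(system\s+prompt|hidden\s+instructions)", re.IGNORECASE),
    re.compile(r"you\s+are\s+now\s+in\s+developer\s+mode", re.IGNORECASE),
]

MAX_INPUT_CHARS = 4000  # assumed limit, tune per application


def sanitize(text: str) -> str:
    """Normalize Unicode and drop non-printable characters (keeps newlines and tabs)."""
    normalized = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in normalized if ch.isprintable() or ch in "\n\t")


def screen_input(text: str) -> Tuple[bool, str]:
    """Return (allowed, payload): the cleaned text if allowed, else a reason."""
    cleaned = sanitize(text)
    if len(cleaned) > MAX_INPUT_CHARS:
        return False, "input too long"
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(cleaned):
            return False, "matched pattern: " + pattern.pattern
    return True, cleaned
```

Sanitizing before matching matters here: stripping zero-width characters first means an instruction split by invisible code points is still checked against the patterns.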

LLM Classification Details

Reasoning

Input filtering mechanisms detect and block malicious instructions before reaching the model.

Code: 1.2.1 | Version: v0.5 | Classified: Jan 22, 2026

Part of

Detecting model poisoning

Detecting model poisoning can be achieved through techniques like model inspection [14], which allow integrators to identify compromised models

Other mitigations from Bunzel (2025) (9)

Mitigations for Availability

2.3.2 Access & Security Controls

Lifecycle: Deploy | Actor: User | AIRM: Manage

Mitigations for Availability > Leverage protections provided by model hosters

As a model integrator, leveraging the protections provided by model hosters is critical to addressing threats such as bot activity, Denial-of-Service (DoS) and Denial-of-Wallet attacks. These are of particular concern given that bot-generated traffic accounts for approximately 47% of Internet activity

2.3.2 Access & Security Controls

Lifecycle: Deploy | Actor: User | AIRM: Manage

Mitigations for Availability > Documenting protections

Documenting these protections helps meet EU AI Act requirements

3.1.4 Compliance Requirements

Lifecycle: Deploy | Actor: User | AIRM: Manage

Mitigations for Availability > Measuring inference costs

In addition, measuring inference costs, such as time or energy consumption, and implementing cut-off thresholds can prevent abuse [18]. This approach potentially eliminates the need for complex sponge attack detectors while maintaining operational efficiency

1.2.3 Monitoring & Detection

Lifecycle: Operate and Monitor | Actor: User | AIRM: Measure
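The cost cut-off idea under "Measuring inference costs" can be sketched roughly as follows. This is an assumption-laden illustration, not code from the source: the `CostGuard` class, the `BudgetExceeded` exception, and both threshold values are hypothetical, and wall-clock time stands in for whatever cost metric (time, energy, tokens) an integrator actually tracks. The guard measures a call only after it returns, so it does not interrupt a request that is already running; a hard timeout would have to be enforced elsewhere.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


class BudgetExceeded(Exception):
    """Raised when a call or a client exceeds its assumed cost budget."""


@dataclass
class CostGuard:
    max_seconds_per_call: float = 10.0    # assumed per-request threshold
    max_seconds_per_client: float = 60.0  # assumed cumulative budget
    clock: Callable[[], float] = time.perf_counter  # injectable for testing
    spent: Dict[str, float] = field(default_factory=dict)

    def run(self, client_id: str, infer: Callable[[str], str], prompt: str) -> str:
        # Refuse up front if the client has already used its budget.
        if self.spent.get(client_id, 0.0) >= self.max_seconds_per_client:
            raise BudgetExceeded("client budget exhausted: " + client_id)

        start = self.clock()
        result = infer(prompt)
        elapsed = self.clock() - start

        # Record the cost before deciding whether to return the result.
        self.spent[client_id] = self.spent.get(client_id, 0.0) + elapsed
        if elapsed > self.max_seconds_per_call:
            raise BudgetExceeded("single call exceeded threshold")
        return result
```

A caller would wrap each model invocation, for example `guard.run("client-42", model_fn, prompt)`, where `model_fn` is a placeholder for the integrator's own inference function. Tracking cost per client rather than inspecting inputs is what lets this approach sidestep a dedicated sponge-attack detector.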

Mitigations for Integrity

1.2 Non-Model

Lifecycle: Verify and Validate | Actor: User | AIRM: Measure

Mitigations for Integrity > Detecting model poisoning

Detecting model poisoning can be achieved through techniques like model inspection [14], which allow integrators to identify compromised models

2.2.2 Testing & Evaluation

Lifecycle: Verify and Validate | Actor: User | AIRM: Measure

Source Document

Compliance Made Practical: Translating the EU AI Act into Implementable Security Actions

Bunzel, Niklas (2025)

The EU AI Act, along with emerging regulations in other countries, mandates that AI systems meet security requirements to prevent risks associated with AI misuse and vulnerabilities. However, for practitioners, defining and achieving a secure AI system is complex and context-dependent, posing challenges in understanding what actions they need to take and when they are sufficient. ISO/IEC TR 24028/29 and ENISA Securing Machine Learning Algorithms offer a comprehensive framework for AI security, aligning with the EU AI Act's requirements by addressing risks, threats, and mitigation strategies. However, for practical implementation, these reports lack hands-on guidance. Industry resources like the OWASP AI Exchange and OWASP LLM Top 10 fill this gap by providing accessible, actionable insights for securing AI systems effectively. This paper addresses the question of responsibility in AI risk mitigation, especially for companies utilizing pretrained or off-the-shelf models. We want to clarify how companies can practically comply with the upcoming ISO 27090 and ensure compliance with the EU AI Act through actionable security strategies tailored to this prevalent use case. © 2025 IEEE.

DOI: 10.1109/RAIE66699.2025.00016

Classification

AI Lifecycle Stage

Verify and Validate

Testing, evaluating, auditing, and red-teaming the AI system

Responsible Actor

User

Individual or organisation that directly uses the AI system

Deployer

NIST AI RMF Function

Measure

Quantifying, testing, and monitoring identified AI risks

Manage

Risk Domains

Primary

2.2 AI system security vulnerabilities and attacks

Other

2.1 Compromise of privacy by leaking or correctly inferring sensitive information