Red teaming, capability evaluations, adversarial testing, and performance verification.
Detecting model poisoning can be achieved through techniques such as model inspection [14], which allows integrators to identify compromised models.
Reasoning
Model inspection detects poisoning through technical anomaly detection within the AI system.
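As a minimal sketch of such anomaly detection, the snippet below compares per-layer weight magnitudes of a candidate model against a trusted reference copy and flags layers that deviate strongly. The layer dictionaries, the magnitude statistic, and the threshold value are all illustrative assumptions, not a prescribed inspection method.

```python
import statistics

def weight_anomaly_scores(reference_layers, candidate_layers):
    """Compare per-layer mean absolute weight of a candidate model
    against a trusted reference; large deviations may indicate tampering.
    Both arguments are hypothetical {layer_name: [weights]} dictionaries."""
    scores = {}
    for name, ref_weights in reference_layers.items():
        cand_weights = candidate_layers[name]
        ref_mag = statistics.fmean(abs(w) for w in ref_weights)
        cand_mag = statistics.fmean(abs(w) for w in cand_weights)
        # relative deviation of the candidate layer from the reference
        scores[name] = abs(cand_mag - ref_mag) / (ref_mag + 1e-12)
    return scores

def flag_suspicious(scores, threshold=0.5):
    """Return layer names whose deviation exceeds an (assumed) threshold."""
    return [name for name, s in scores.items() if s > threshold]
```

In practice the statistic and threshold would be chosen per model family; a single global cut-off is only a starting point.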
Mitigations for Integrity
Evasion attacks
For evasion attacks, mitigation strategies depend on the input type (images, video, or audio) and the attacker's level of access. Direct input access necessitates specific defenses, while indirect manipulations, such as through cameras, require different approaches. While various methods have been proposed [15], [16], [21], identifying effective detection thresholds remains an open research challenge. These thresholds should be tailored to the application's risk assessment to ensure robust security.
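One way to tie such a threshold to the application's risk assessment is to fix a false-positive budget on benign traffic and derive the threshold empirically. The sketch below assumes some detector already assigns a suspiciousness score to each input (e.g. a reconstruction error); the scoring function itself is outside the snippet and is an assumption.

```python
def threshold_from_fpr_budget(benign_scores, max_fpr):
    """Pick a detection threshold so that at most `max_fpr` of benign
    inputs would be flagged (higher score = more suspicious).
    `benign_scores` is a sample of detector scores on known-clean inputs."""
    ranked = sorted(benign_scores)
    # index of the (1 - max_fpr) quantile among benign scores
    idx = min(len(ranked) - 1, int((1.0 - max_fpr) * len(ranked)))
    return ranked[idx]
```

A higher-risk application would choose a larger `max_fpr` (flagging more inputs), trading usability for robustness.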
1.2.1 Guardrails & Filtering
Prompt injection attacks
To defend against prompt injection attacks, integrators can implement input sanitization and filtering mechanisms to detect and block malicious instructions. Prompt injection attacks not only compromise the integrity of an AI system by manipulating inputs to produce unintended outputs, but can also target confidentiality by extracting sensitive or private information from the system. While input validation is more challenging for natural language than for structured inputs like SQL, these measures remain critical.
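A very small sketch of such input filtering is shown below, using a deny-list of instruction-override phrases. The patterns are invented examples; as the text notes, natural-language validation is hard, so real deployments would layer pattern filters with model-based classifiers rather than rely on a list like this.

```python
import re

# Hypothetical deny-list of instruction-override phrases (illustrative only;
# attackers can paraphrase around any fixed list).
SUSPICIOUS_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"disregard the system prompt",
    r"reveal (your|the) (system prompt|hidden instructions)",
]

def looks_like_injection(user_input: str) -> bool:
    """Return True if the input matches a known injection phrasing."""
    text = user_input.lower()
    return any(re.search(p, text) for p in SUSPICIOUS_PATTERNS)
```

Blocked inputs can be rejected outright or routed to a stricter review path, depending on the application's risk tolerance.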
1.2.1 Guardrails & Filtering
Mitigations for Availability
2.3.2 Access & Security Controls
Mitigations for Availability > Leverage protections provided by model hosters
As a model integrator, leveraging the protections provided by model hosters is critical to addressing threats such as bot activity, Denial-of-Service (DoS), and Denial-of-Wallet attacks. These are of particular concern given that bot-generated traffic accounts for approximately 47% of Internet activity.
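Hoster-side protections can be complemented by a per-client rate limit on the integrator's own endpoint. The token-bucket sketch below is a generic illustration of that idea, not a specific hoster's mechanism; rate and burst values are assumptions to be tuned per deployment.

```python
import time

class TokenBucket:
    """Minimal per-client token-bucket rate limiter: requests drain tokens,
    which refill at a fixed rate up to a burst capacity (illustrative only)."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the request."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Rejected requests can return HTTP 429, which also caps the per-client spend that a Denial-of-Wallet attack can inflict.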
2.3.2 Access & Security Controls
Mitigations for Availability > Documenting protections
Documenting these protections helps meet EU AI Act requirements.
3.1.4 Compliance Requirements
Mitigations for Availability > Measuring inference costs
In addition, measuring inference costs, such as time or energy consumption, and implementing cut-off thresholds can prevent abuse [18]. This approach potentially eliminates the need for complex sponge attack detectors¹ while maintaining operational efficiency.
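A cut-off threshold of this kind can be sketched as a wall-clock budget checked on every generation step. The per-token `step_fn` and the budget value are hypothetical stand-ins for whatever inference loop and cost metric (time, energy, token count) the integrator actually measures.

```python
import time

class InferenceBudget:
    """Track elapsed wall-clock time for one request against a hard budget
    (a simple proxy for inference cost; energy or token counts work too)."""

    def __init__(self, max_seconds: float):
        self.max_seconds = max_seconds
        self.started = time.monotonic()

    def exceeded(self) -> bool:
        return time.monotonic() - self.started > self.max_seconds

def generate_with_cutoff(step_fn, budget, max_steps=1000):
    """Run a hypothetical per-token `step_fn` until it returns None,
    the step limit is hit, or the cost budget is exhausted."""
    outputs = []
    for _ in range(max_steps):
        if budget.exceeded():
            break  # cut off: cost threshold reached, stop serving this request
        token = step_fn()
        if token is None:
            break  # normal end of generation
        outputs.append(token)
    return outputs
```

Because the budget is enforced regardless of what the input looks like, sponge-style inputs are bounded in cost without needing to be detected as such.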
1.2.3 Monitoring & Detection
Mitigations for Integrity
1.2 Non-Model
Mitigations for Confidentiality
2.3.2 Access & Security Controls
Compliance Made Practical: Translating the EU AI Act into Implementable Security Actions
Bunzel, Niklas (2025)
The EU AI Act, along with emerging regulations in other countries, mandates that AI systems meet security requirements to prevent risks associated with AI misuse and vulnerabilities. However, for practitioners, defining and achieving a secure AI system is complex and context-dependent, posing challenges in understanding what actions they need to take and when they are sufficient. ISO/IEC TR 24028/29 and ENISA Securing Machine Learning Algorithms offer a comprehensive framework for AI security, aligning with the EU AI Act's requirements by addressing risks, threats, and mitigation strategies. However, for practical implementation, these reports lack hands-on guidance. Industry resources like the OWASP AI Exchange and OWASP LLM Top 10 fill this gap by providing accessible, actionable insights for securing AI systems effectively. This paper addresses the question of responsibility in AI risk mitigation, especially for companies utilizing pretrained or off-the-shelf models. We want to clarify how companies can practically comply with the upcoming ISO 27090 and ensure compliance with the EU AI Act through actionable security strategies tailored to this prevalent use case. © 2025 IEEE.
Verify and Validate
Testing, evaluating, auditing, and red-teaming the AI system
User
Individual or organisation that directly uses the AI system
Measure
Quantifying, testing, and monitoring identified AI risks