Privacy - Model Extraction Attack (MEA)
Risk Domain
Vulnerabilities that can be exploited in AI systems, software development toolchains, and hardware, resulting in unauthorized access, data and privacy breaches, or system manipulation causing unsafe outputs or behavior.
"replicating the parameters of the LLM,"(p. 7)
Entity: Who or what caused the harm
Intent: Whether the harm was intentional or accidental
Timing: Whether the risk is pre- or post-deployment
Other risks from Wang et al. (2025) (11)
Privacy - Membership Inference Attack (MIA)
Subdomain: 2.2 AI system security vulnerabilities and attacks; Entity: Human; Intent: Intentional; Timing: Post-deployment
Privacy - Data Extraction Attack (DEA)
Subdomain: 2.2 AI system security vulnerabilities and attacks; Entity: Human; Intent: Intentional; Timing: Post-deployment
Privacy - Prompt Inversion Attack (PIA)
Subdomain: 2.2 AI system security vulnerabilities and attacks; Entity: Human; Intent: Intentional; Timing: Post-deployment
Privacy - Attribute Inference Attack (AIA)
Subdomain: 2.2 AI system security vulnerabilities and attacks; Entity: Human; Intent: Intentional; Timing: Post-deployment
Hallucination
Subdomain: 3.1 False or misleading information; Entity: AI system; Intent: Unintentional; Timing: Post-deployment
Value-related risks in LLMs
Subdomain: 7.1 AI pursuing its own goals in conflict with human goals or values; Entity: Other; Intent: Unintentional; Timing: Other