
Jailbreak of a multimodal model

Sub-category
Risk Domain

Vulnerabilities that can be exploited in AI systems, software development toolchains, and hardware, resulting in unauthorized access, data and privacy breaches, or system manipulation causing unsafe outputs or behavior.

"Current generation multimodal (e.g., vision and language) GPAI models are vulnerable to adversarial jailbreak attacks. These attacks can be used to automatically induce a model to produce an arbitrary or specific output with high success rate [227]. Multimodal jailbreaks can also be used to exfiltrate a model’s context window or other model internals [18]."(p. 26)
