Skip to main content
Home/Risks/Gipiškis2024/Model weight leak

Model weight leak

Sub-category
Risk Domain

Vulnerabilities that can be exploited in AI systems, software development toolchains, and hardware, resulting in unauthorized access, data and privacy breaches, or system manipulation causing unsafe outputs or behavior.

"Model weights or access to them can be leaked when initial access is granted only to a select group of individuals, such as institutional researchers [209]. This risk can increase as more people gain access, and identifying the source of the leak becomes more difficult. The availability of leaked model weights makes various attacks on systems that use the leaked AI model easier to implement, such as finding adversarial examples, elicitation of dangerous capabilities, and extraction of confidential information present in the training data. The avail- ability of model weights might also enable the misuse of the AI system using the leaked model to produce harmful or illegal content [67]."(p. 43)

Part of Cybersecurity

Other risks from Gipiškis2024 (144)