Skip to main content
Home/Risks/Marchal2024/Model diversion

Model diversion

Sub-category
Risk Domain

Using AI systems to develop cyber weapons (e.g., by coding cheaper, more effective malware), develop new or enhance existing weapons (e.g., Lethal Autonomous Weapons or chemical, biological, radiological, nuclear, and high-yield explosives), or use weapons to cause mass harm.

"Model Diversion takes model manipulation one step further, by repurposing (often open-source) generative AI models in a way that diverts them from their intended functionality or from the use cases envisioned by their developers (Lin et al., 2024). An example of this is training the BERT open source model on the DarkWeb to create DarkBert.7"

Part of Misuse tactics to compromise GenAI systems (Model integrity)

Other risks from Marchal2024 (22)