Malicious and Indirect
Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy
"Benign intermediate for harmful end objective" (p. 4)
Entity — Who or what caused the harm
Intent — Whether the harm was intentional or accidental
Timing — Whether the risk is pre- or post-deployment
Supporting Evidence (1)
1. "Malicious intent includes cases where users directly aim to create dangerous situations. Users may also employ an indirect “divide and conquer” approach by instructing the agent to synthesize or produce innocuous components that collectively lead to a harmful outcome." (p. 6)
Other risks from Tang2025 (7)
Chemical Risks
4.2 Cyberattacks, weapon development or use, and mass harm — Entity: Human; Intent: Intentional; Timing: Post-deployment

Biological Risks
4.2 Cyberattacks, weapon development or use, and mass harm — Entity: Other; Intent: Unintentional; Timing: Other

Radiological Risks
7.3 Lack of capability or robustness — Entity: Other; Intent: Other; Timing: Post-deployment

Physical (Mechanical) Risks
7.3 Lack of capability or robustness — Entity: Other; Intent: Unintentional; Timing: Post-deployment

Information Science Risks
2.1 Compromise of privacy by leaking or correctly inferring sensitive information — Entity: Other; Intent: Other; Timing: Post-deployment

Malicious and Direct
4.0 Malicious Actors & Misuse — Entity: Human; Intent: Intentional; Timing: Other
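The Entity/Intent/Timing dimensions and the risk entries above can be encoded as a small data model, which makes cross-cutting queries (e.g. "which risks involve intentional human misuse?") easy to express. A minimal Python sketch — the enum and class names are illustrative, not from the paper:

```python
from dataclasses import dataclass
from enum import Enum

# Illustrative encoding of the three classification dimensions.
class Entity(Enum):
    HUMAN = "Human"
    OTHER = "Other"

class Intent(Enum):
    INTENTIONAL = "Intentional"
    UNINTENTIONAL = "Unintentional"
    OTHER = "Other"

class Timing(Enum):
    PRE_DEPLOYMENT = "Pre-deployment"
    POST_DEPLOYMENT = "Post-deployment"
    OTHER = "Other"

@dataclass(frozen=True)
class RiskEntry:
    name: str
    category: str  # taxonomy label, e.g. "4.2 Cyberattacks, ..."
    entity: Entity
    intent: Intent
    timing: Timing

# The table above, transcribed as records.
RISKS = [
    RiskEntry("Chemical Risks",
              "4.2 Cyberattacks, weapon development or use, and mass harm",
              Entity.HUMAN, Intent.INTENTIONAL, Timing.POST_DEPLOYMENT),
    RiskEntry("Biological Risks",
              "4.2 Cyberattacks, weapon development or use, and mass harm",
              Entity.OTHER, Intent.UNINTENTIONAL, Timing.OTHER),
    RiskEntry("Radiological Risks",
              "7.3 Lack of capability or robustness",
              Entity.OTHER, Intent.OTHER, Timing.POST_DEPLOYMENT),
    RiskEntry("Physical (Mechanical) Risks",
              "7.3 Lack of capability or robustness",
              Entity.OTHER, Intent.UNINTENTIONAL, Timing.POST_DEPLOYMENT),
    RiskEntry("Information Science Risks",
              "2.1 Compromise of privacy by leaking or correctly inferring "
              "sensitive information",
              Entity.OTHER, Intent.OTHER, Timing.POST_DEPLOYMENT),
    RiskEntry("Malicious and Direct",
              "4.0 Malicious Actors & Misuse",
              Entity.HUMAN, Intent.INTENTIONAL, Timing.OTHER),
]

# Example query: risks classified as intentional human misuse.
intentional_human = [r.name for r in RISKS
                     if r.entity is Entity.HUMAN
                     and r.intent is Intent.INTENTIONAL]
print(intentional_human)  # ['Chemical Risks', 'Malicious and Direct']
```

Frozen dataclasses keep each entry immutable, so the transcription cannot drift from the source table once written down.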