Skip to main content
Home/Risks/Tang2025/Malicious and Direct

Malicious and Direct

Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy

Sub-category

"Directly harmful objective"(p. 4)

Supporting Evidence (1)

1.
"Malicious intent includes cases where users directly aim to create dangerous situations. Users may also employ an indirect “divide and conquer” approach by instructing the agent to synthesize or produce innocuous components that collectively lead to a harmful outcome."(p. 6)

Other risks from Tang2025 (7)