
Security

AGI Safety Literature Review

Everitt, Lea & Hutter (2018)

Category
Risk Domain

Vulnerabilities that can be exploited in AI systems, software development toolchains, and hardware, resulting in unauthorized access, data and privacy breaches, or system manipulation causing unsafe outputs or behavior.

"How to design AGIs that are robust to adversaries and adversarial environments? This involves building sandboxed AGI protected from adversaries (Berkeley), and agents that are robust to adversarial inputs (Berkeley, DeepMind)." (p. 9)
