Intelligibility

AGI Safety Literature Review

Everitt, Lea & Hutter (2018)

Category
Risk Domain

Challenges in understanding or explaining the decision-making processes of AI systems, which can lead to mistrust, difficulty in enforcing compliance standards or holding relevant actors accountable for harms, and the inability to identify and correct errors.

"How can we build agent’s whose decisions we can understand? Connects explainable decisions (Berkeley) and informed oversight (MIRI)." (p. 9)
