
Malign belief distributions

AGI Safety Literature Review

Everitt, Lea & Hutter (2018)

Category
Risk Domain

AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.

"Christiano (2016) argues that the universal distribution M (Hutter, 2005; Solomonoff, 1964a,b, 1978) is malign. The argument is somewhat intricate, and is based on the idea that a hypothesis about the world often includes simulations of other agents, and that these agents may have an incentive to influence anyone making decisions based on the distribution. While it is unclear to what extent this type of problem would affect any practical agent, it bears some semblance to aggressive memes, which do cause problems for human reasoning (Dennett, 1990)."(p. 9)

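For readers unfamiliar with the notation, the universal distribution M named in the quote is usually defined along the following lines. This is a sketch following the standard presentation in Hutter (2005), not text from Everitt, Lea & Hutter (2018); the symbols U, p, and \ell(p) are introduced here for illustration only.

```latex
% Solomonoff's universal distribution (standard form; notation assumed, not taken from the quoted source)
%   U       : a universal monotone Turing machine
%   p       : a (minimal) program whose output on U begins with the string x
%   \ell(p) : the length of p in bits
M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}
```

Each program p in the sum can be read as a hypothesis about how the observed data were generated, with shorter programs receiving more weight. The concern summarized in the quote is about hypotheses in this sum that simulate other agents.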