AI systems may fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or domains that require moral reasoning.
"Christiano (2016) argues that the universal distribution M (Hutter, 2005; Solomonoff, 1964a,b, 1978) is malign. The argument is somewhat intricate, and is based on the idea that a hypothesis about the world often includes simulations of other agents, and that these agents may have an incentive to influence anyone making decisions based on the distribution. While it is unclear to what extent this type of problem would affect any practical agent, it bears some semblance to aggressive memes, which do cause problems for human reasoning (Dennett, 1990)."(p. 9)
Other risks from Everitt, Lea & Hutter (2018) (p. 8):
Value specification → 7.1 AI pursuing its own goals in conflict with human goals or values
Reliability → 7.1 AI pursuing its own goals in conflict with human goals or values
Corrigibility → 7.1 AI pursuing its own goals in conflict with human goals or values
Security → 2.2 AI system security vulnerabilities and attacks
Safe learning → 7.3 Lack of capability or robustness
Intelligibility → 7.4 Lack of transparency or interpretability