Type 2: Bigger than expected
Risk Domain
AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures with significant consequences, especially in critical applications or areas requiring moral reasoning.
Harm can result from AI that was not expected to have a large impact at all, such as a lab leak, a surprisingly addictive open-source product, or an unexpected repurposing of a research prototype. (p. 3)
Entity: Who or what caused the harm
Intent: Whether the harm was intentional or accidental
Timing: Whether the risk is pre- or post-deployment
Supporting Evidence (1)
1. Example: "A chat-bot is created to help users talk about stressors in their personal life. A 6-month beta test shows that users claim a large benefit from talking to the bot, and almost never regret using it, so an open source version of the bot is made available online, which can be downloaded and used for free even without an internet connection. The software “goes viral”, attracting many more users than expected, until over 50% of young adults aged 20 to 30 become regular users of the bot’s advice. When the bot gives the same advice to multiple members of the same friend group, they end up taking it much more seriously than in the beta tests (which didn’t recruit whole groups of friends). As a result of the bot’s frequent advice to “get some distance from their stressors”, many people begin to consider dropping out of college or quitting their jobs. Ordinarily this would be a passing thought, but finding that many of their friends were contemplating the same decisions (due to the influence of the bot), they feel more socially comfortable making the change. Many groups of friends collectively decide to leave their jobs or schools. Public education suffers, and unemployment rates increase." (p. 9)
Other risks from Critch & Russell (2023) (5)
Type 1: Diffusion of responsibility
Risk domain: 6.5 Governance failure | Entity: AI system | Intent: Unintentional | Timing: Other
Type 3: Worse than expected
Risk domain: 7.3 Lack of capability or robustness | Entity: AI system | Intent: Unintentional | Timing: Post-deployment
Type 4: Willful indifference
Risk domain: 6.4 Competitive dynamics | Entity: Human | Intent: Unintentional | Timing: Post-deployment
Type 5: Criminal weaponization
Risk domain: 4.2 Cyberattacks, weapon development or use, and mass harm | Entity: Human | Intent: Intentional | Timing: Post-deployment
Type 6: State Weaponization
Risk domain: 4.2 Cyberattacks, weapon development or use, and mass harm | Entity: Human | Intent: Intentional | Timing: Post-deployment