Causing direct emotional or physical harm to users
AI systems that inadvertently generate or spread incorrect or deceptive information, which can lead to inaccurate beliefs in users and undermine their autonomy. Humans that make decisions based on false beliefs can experience physical, emotional or material harms
AI assistants could cause direct emotional or physical harm to users by generating disturbing content or by providing bad advice. "Indeed, even though there is ongoing research to ensure that outputs of conversational agents are safe (Glaese et al., 2022), there is always the possibility of failure modes occurring. An AI assistant may produce disturbing and offensive language, for example, in response to a user disclosing intimate information about themselves that they have not felt comfortable sharing with anyone else. It may offer bad advice by providing factually incorrect information (e.g. when advising a user about the toxicity of a certain type of berry) or by missing key recommendations when offering step-by-step instructions to users (e.g. health and safety recommendations about how to change a light bulb).""(p. 111)
Part of Appropriate Relationships
Other risks from Gabriel et al. (2024) (69)
Capability failures
7.3 Lack of capability or robustnessCapability failures > Lack of capability for task
7.3 Lack of capability or robustnessCapability failures > Difficult to develop metrics for evaluating benefits or harms caused by AI assistants
6.5 Governance failureCapability failures > Safe exploration problem with widely deployed AI assistants
7.3 Lack of capability or robustnessGoal-related failures
7.1 AI pursuing its own goals in conflict with human goals or valuesGoal-related failures > Misaligned consequentialist reasoning
7.3 Lack of capability or robustness