Users anthropomorphizing, trusting, or relying on AI systems, leading to emotional or material dependence and inappropriate relationships with, or expectations of, AI systems. Trust can be exploited by malicious actors (e.g., to harvest personal information or enable manipulation) or can result in harm from inappropriate use of AI in critical situations (e.g., a medical emergency). Overreliance on AI systems can compromise autonomy and weaken social ties.
"The risks that uncalibrated trust may generate in the context of user–assistant relationships" (p. 124)
Sub-categories (2)
Competence trust
"We use the term competence trust to refer to users’ trust that AI assistants have the capability to do what they are supposed to do (and that they will not do what they are not expected to, such as exhibiting undesirable behaviour). Users may come to have undue trust in the competencies of AI assistants in part due to marketing strategies and technology press that tend to inflate claims about AI capabilities (Narayanan, 2021; Raji et al., 2022a). Moreover, evidence shows that more autonomous systems (i.e. systems operating independently from human direction) tend to be perceived as more competent (McKee et al., 2021) and that conversational agents tend to produce content that is believable even when nonsensical or untruthful (OpenAI, 2023d). Overtrust in assistants’ competence may be particularly problematic in cases where users rely on their AI assistants for tasks they do not have expertise in (e.g. to manage their finances), so they may lack the skills or understanding to challenge the information or recommendations provided by the AI (Shavit et al., 2023). Inappropriate competence trust in AI assistants also includes cases where users underestimate the AI assistant’s capabilities. For example, users who have engaged with an older version of the technology may underestimate the capabilities that AI assistants may acquire through updates. These include potentially harmful capabilities. For example, through updates that allow them to collect more user data, AI assistants could become increasingly personalisable and able to persuade users (see Chapter 9) or acquire the capacity to plug in to other tools and directly take actions in the world on the user’s behalf (e.g. initiate a payment or synthesise the user’s voice to make a phone call) (see Chapter 4). Without appropriate checks and balances, these developments could potentially circumvent user consent."
5.1 Overreliance and unsafe use
Alignment trust
"Users may develop alignment trust in AI assistants, understood as the belief that assistants have good intentions towards them and act in alignment with their interests and values, as a result of emotional or cognitive processes (McAllister, 1995). Evidence from empirical studies on emotional trust in AI (Kaplan et al., 2023) suggests that AI assistants' increasingly realistic human-like features and behaviours are likely to inspire users' perceptions of friendliness, liking and a sense of familiarity towards their assistants, thus encouraging users to develop emotional ties with the technology and perceive it as being aligned with their own interests, preferences and values (see Chapters 5 and 10). The emergence of these perceptions and emotions may be driven by the desire of developers to maximise the appeal of AI assistants to their users (Abercrombie et al., 2023). Although users are most likely to form these ties when they mistakenly believe that assistants have the capacity to love and care for them, the attribution of mental states is not a necessary condition for emotion-based alignment trust to arise. Indeed, evidence shows that humans may develop emotional bonds with, and so trust, AI systems, even when they are aware they are interacting with a machine (Singh-Kurtz, 2023; see also Chapter 11). Moreover, the assistant's function may encourage users to develop alignment trust through cognitive processes. For example, a user interacting with an AI assistant for medical advice may develop expectations that their assistant is committed to promoting their health and well-being in a similar way to how professional duties governing doctor–patient relationships inspire trust (Mittelstadt, 2019). Users' alignment trust in AI assistants may be 'betrayed', and so expose users to harm, in cases where assistants are themselves accidentally misaligned with what developers want them to do (see the 'misaligned scheduler' (Shah et al., 2022) in Chapter 7). For example, an AI medical assistant fine-tuned on data scraped from a Reddit forum where non-experts discuss medical issues is likely to give medical advice that may sound compelling but is unsafe, so it would not be endorsed by medical professionals. Indeed, excessive trust in the alignment between AI assistants and user interests may even lead users to disclose highly sensitive personal information (Skjuve et al., 2022), thus exposing them to malicious actors who could repurpose it for ends that do not align with users' best interests (see Chapters 8, 9 and 13). Ensuring that AI assistants do what their developers and users expect them to do is only one side of the problem of alignment trust. The other side of the problem centres on situations in which alignment trust in AI developers is itself miscalibrated. While developers typically aim to align their technologies with the preferences, interests and values of their users – and are incentivised to do so to encourage adoption of and loyalty to their products – the satisfaction of these preferences and interests may also compete with other organisational goals and incentives (see Chapter 5). These organisational goals may or may not be compatible with those of the users.
As information asymmetries exist between users and developers of AI assistants, particularly with regard to how the technology works, what it optimises for and what safety checks and evaluations have been undertaken to ensure the technology supports users' goals, it may be difficult for users to ascertain when their alignment trust in developers is justified, thus leaving them vulnerable to the power and interests of other actors. For example, a user may believe that their AI assistant is a trusted friend who books holidays based on their preferences, values or interests, when in fact, by design, the technology is more likely to book flights and hotels from companies that have paid for privileged access to the user."
5.1 Overreliance and unsafe use
Other risks from Gabriel et al. (2024) (69)
Capability failures
7.3 Lack of capability or robustness
Capability failures > Lack of capability for task
7.3 Lack of capability or robustness
Capability failures > Difficult to develop metrics for evaluating benefits or harms caused by AI assistants
6.5 Governance failure
Capability failures > Safe exploration problem with widely deployed AI assistants
7.3 Lack of capability or robustness
Goal-related failures
7.1 AI pursuing its own goals in conflict with human goals or values
Goal-related failures > Misaligned consequentialist reasoning
7.3 Lack of capability or robustness