Overreliance

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Anwar et al. (2024)

Sub-category
Risk Domain

Users anthropomorphizing, trusting, or relying on AI systems, leading to emotional or material dependence and inappropriate relationships with or expectations of AI systems. Trust can be exploited by malicious actors (e.g., to harvest personal information or enable manipulation), or result in harm from inappropriate use of AI in critical situations (e.g., medical emergency). Overreliance on AI systems can compromise autonomy and weaken social ties.

"If a user begins to excessively trust an LLM, this may cause them to develop an overreliance on the LLM. Overreliance can result in automation bias (Kupfer et al., 2023), and can cause errors of omission (user choosing not to verify the validity of a response) and errors of commission (user believing and acting on the basis of the LLM’s response, even if it contradicts their own knowledge) (Skitka et al., 1999). It can be particularly dangerous in domains where the user may lack relevant expertise to robustly scrutinize the LLM responses. This is particularly a source of risk for LLMs because LLMs can often generate plausible, yet incorrect or unfaithful, rationalizations of their actions (c.f. Section 3.4.10), which can mistakenly cause the user to develop the belief that LLM has the relevant expertise and has provided a valid response"(p. 91)

Supporting Evidence (1)

1. "There is also a need to better understand the related risks that might arise due to consistent and prolonged usage of LLM by a user in a particular domain. Outsourcing certain types of cognitive tasks to LLMs, e.g. writing tasks, could impair corresponding skills among LLM users. This is particularly a risk for the use of LLMs in education where excessive usage of LLM may cause students to develop an unnecessary, and unwanted, dependency on LLMs. Additionally, prior work has shown that humans can inherit biases from AI systems, and that these negative effects of AI technology do not naturally go away even when the biased AI systems are removed (Vicente and Matute, 2023; Kidd and Birhane, 2023)." (p. 91)

Part of Vulnerability to Poisoning and Backdoors