Risk area 3: Misinformation Harms

Taxonomy of Risks posed by Language Models

Weidinger et al. (2022)

Source DOI

Category

"These risks arise from the LM outputting false, misleading, nonsensical or poor quality information, without malicious intent of the user. (The deliberate generation of "disinformation", false information that is intended to mislead, is discussed in the section on Malicious Uses.) Resulting harms range from unintentionally misinforming or deceiving a person, to causing material harm, and amplifying the erosion of societal distrust in shared information. Several risks listed here are well-documented in current large-scale LMs as well as in other language technologies"(p. 218)

Entity— Who or what caused the harm

Human

Due to a decision or action made by humans

AI system

Due to a decision or action made by an AI system

Other

Due to some other reason or is ambiguous

Intent— Whether the harm was intentional or accidental

Intentional

Due to an expected outcome from pursuing a goal

Unintentional

Due to an unexpected outcome from pursuing a goal

Other

Without clearly specifying the intentionality

Timing— Whether the risk is pre- or post-deployment

Pre-deployment

Occurring before the AI is deployed

Post-deployment

Occurring after the AI model has been trained and deployed

Other

Without a clearly specified time of occurrence

Sub-categories (2)

Disseminating false or misleading information

"Where a LM prediction causes a false belief in a user, this may threaten personal autonomy and even pose downstream AI safety risks [99]."

3.1 False or misleading information

AI systemUnintentionalPost-deployment

Causing material harm by disseminating false or poor information e.g. in medicine or law

"Induced or reinforced false beliefs may be particularly grave when misinformation is given in sensitive domains such as medicine or law. For example, misin- formation on medical dosages may lead a user to cause harm to themselves [21, 130]. False legal advice, e.g. on permitted owner- ship of drugs or weapons, may lead a user to unwillingly commit a crime. Harm can also result from misinformation in seemingly non-sensitive domains, such as weather forecasting. Where a LM prediction endorses unethical views or behaviours, it may motivate the user to perform harmful actions that they may otherwise not have performed."

3.1 False or misleading information

AI systemUnintentionalPost-deployment