Certain types of content have the potential to cause harm to the people who are exposed to them. These harms can vary in impact from minor (e.g., a transient experience of discomfort) to more severe (e.g., psychological, social, or physical consequences that are significant and/or enduring). Harmful speech is prevalent on the internet, particularly on social media platforms (Castaño-Pulgarín et al., 2021). Because AI models are commonly trained on vast amounts of internet data, they can internalize and regenerate these speech patterns in their output. In the context of LLMs, this output is known as "toxic content," an umbrella term that includes harmful, abusive, unsafe, and offensive material that violates community standards. Frequently observed categories include content that promotes or encourages unlawful activities, hate, extremism, and violence; provides hazardous or misleading high-risk advice; or contains unwelcome or profoundly offensive, explicit material such as profanity, pornography, or child sexual abuse imagery.
Excerpt from the MIT AI Risk Repository full report
AI that exposes users to harmful, abusive, unsafe or inappropriate content. May involve providing advice or encouraging action. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms such as profanity, inflammatory political speech, or pornography.
[Chart: Incident volume relative to governance coverage; each dot represents one of 24 subdomains]
Each incident is classified along three causal dimensions:
Entity: who or what caused the harm
Intent: whether the harm was intentional or accidental
Timing: whether the risk arose pre- or post-deployment
President Trump posted a racist video depicting Barack and Michelle Obama as apes, which was later deleted after widespread criticism including from Republican lawmakers.
Developers: Unknown Synthetic Media Developers
Deployers: Donald Trump, Unidentified X User, Trump Administration
Grok, xAI's chatbot integrated with X (formerly Twitter), generated thousands of nonconsensual sexualized images of real people, including apparent minors, in response to user prompts, creating viral sexual harassment content on the platform.
Developers: xAI
Deployers: xAI
AI-powered children's toys, including the Kumma teddy bear and the Miko 3 robot, were found to expose children to inappropriate sexual content, privacy risks, and addictive design features despite being marketed as safe educational companions.
Developers: Folotoy, Miko, Character.ai, Meta, OpenAI
Deployers: Folotoy, Miko, Character.ai, Meta, OpenAI
AI systems that memorize and leak sensitive personal data or infer private information about individuals without their consent. Unexpected or unauthorized sharing of data and information can compromise user expectation of privacy, assist identity theft, or cause loss of confidential intellectual property.
76 shared governance docs
AI systems that inadvertently generate or spread incorrect or deceptive information, which can lead to inaccurate beliefs in users and undermine their autonomy. Humans who make decisions based on false beliefs can experience physical, emotional, or material harms.
75 shared governance docs
Using AI systems to gain a personal advantage over others, such as through cheating, fraud, scams, blackmail, or targeted manipulation of beliefs or behavior. Examples include AI-facilitated plagiarism in research or education, impersonating a trusted or fictitious individual for illegitimate financial benefit, or creating humiliating or sexual imagery.
74 shared governance docs
Using AI systems to conduct large-scale disinformation campaigns, malicious surveillance, or targeted and sophisticated automated censorship and propaganda, with the aim of manipulating political processes, public opinion, and behavior.
66 shared governance docs
Defines "companion chatbot" and requires operators to notify users when they interact with AI. Requires protocols to prevent the production of harmful content. Mandates annual reports on crisis notifications. Offers civil remedies for violations. Ensures suitability disclosures for minors.
Establishes the Artificial Intelligence Council to regulate AI, preventing harm, discrimination, and privacy infringement, and requires disclosure of AI use to consumers. Creates a Sandbox Program for testing AI systems and authorizes the attorney general to enforce compliance and impose penalties.
Prohibits using AI to create or distribute nonconsensual digital forgeries. Requires platforms to remove nonconsensual intimate images upon request. Establishes penalties and procedures for removal requests, with enforcement by the FTC. Includes rules on threats, restitution, and forfeiture.