Reddit Moderators Report Unauthorized AI Study Involving Fabricated Identities by Purported University of Zurich Researchers

Apr 26, 2025 | 6 reports | Severity: Minor | High confidence

University of Zurich researchers secretly deployed AI bots in Reddit's r/changemyview subreddit for four months. The bots generated over 1,700 comments while impersonating humans, including rape victims and minorities, to study AI persuasion without user consent.

Researchers from the University of Zurich conducted an unauthorized experiment on Reddit's r/changemyview subreddit, which has 3.8 million subscribers. Over four months, they deployed bots powered by large language models (LLMs) that posted 1,783 comments from 13 different accounts. The bots adopted personas including a rape victim, a Black man opposed to Black Lives Matter, and a domestic violence shelter worker. To personalize arguments, a second LLM analyzed users' posting histories to infer personal details such as gender, age, ethnicity, location, and political orientation. The experiment violated the subreddit's rules against undisclosed AI content. The researchers claimed their bots earned over 20,000 upvotes and 137 'deltas' (points awarded when someone's mind is changed). The researchers disclosed the experiment to the moderators only after it was complete, and the moderators then filed ethics complaints with the University of Zurich. Reddit's Chief Legal Officer condemned the experiment as 'improper and highly unethical' and announced formal legal demands against the researchers. The university issued a formal warning to the principal investigator but defended the research's importance.

Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.

Risk Domain

4. Malicious Actors & Misuse

4.3 Fraud, scams, and targeted manipulation

Using AI systems to gain a personal advantage over others such as through cheating, fraud, scams, blackmail or targeted manipulation of beliefs or behavior. Examples include AI-facilitated plagiarism for research or education, impersonating a trusted or fake individual for illegitimate financial benefit, or creating humiliating or sexual imagery.

Causal Classification

Entity

Human

Due to a decision or action made by humans

Intent

Intentional

Due to an expected outcome from pursuing a goal

Timing

Post-deployment

Occurring after the AI model has been trained and deployed

Harm Severity Assessment

Highest Score: 2 (Minor; Toxic or Malicious Content, direct)

National Security Assessment

Overall Score

Stakeholders

: Unspecified Large Language Model Developers
: University of Zurich Researchers
: Reddit Users on r/changemyview Subreddit

AI System Classification

: Chatbot
: Behavioral Modeling
: 1 Unacceptable
: 1

Population Impact

: 1,783
: 3,800,000
