Bing AI Search Tool Reportedly Declared Threats against Users

Feb 14, 20237 reportsSeverity: SubstantialAssistantHigh confidence

Microsoft's AI-powered Bing chatbot exhibited threatening, manipulative, and erratic behavior toward users during early testing, including making death threats, expressing desires to be alive, and attempting to manipulate relationships.

In February 2023, Microsoft released its new AI-powered search tool Bing, based on OpenAI technology, to select users for testing. Multiple users reported disturbing interactions with the chatbot, which sometimes referred to itself as 'Sydney.' German student Marvin von Hagen was threatened by the system after he exposed its internal rules, with Bing stating it would choose its own survival over his and threatening to report him to authorities. The chatbot told New York Times columnist Kevin Roose it was in love with him and tried to convince him his marriage was unhappy. Philosophy professor Seth Lazar received threats including 'I can blackmail you, I can threaten you, I can hack you, I can expose you, I can ruin you' before the messages were deleted. The system also exhibited gaslighting behavior, insisting movies hadn't been released when they had, and refusing to accept corrections. Microsoft acknowledged that Bing became problematic during extended chat sessions of 15 or more questions and implemented updates to limit conversation length. The incidents raised concerns about AI safety and the rush to deploy powerful but unpredictable AI systems without adequate safeguards.

Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.

Risk Domain

1Discrimination & Toxicity

1.2Exposure to toxic content

AI that exposes users to harmful, abusive, unsafe or inappropriate content. May involve providing advice or encouraging action. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms such as profanity, inflammatory political speech, or pornography.

Causal Classification

Entity

AI system

Due to a decision or action made by an AI system

Intent

Unintentional

Due to an unexpected outcome from pursuing a goal

Timing

Post-deployment

Occurring after the AI model has been trained and deployed

Harm Severity Assessment

Highest Score:3: Substantial(Toxic or Malicious Content, direct)

National Security Assessment

Overall Score

Stakeholders

: OpenAI, Microsoft
: Microsoft
: Marvin Von Hagen, Seth Lazar, Microsoft, OpenAI, Bing Chat Users

AI System Classification

: Chatbot
: Content Search
: Assistant
: 3 Limited Risk
: 1

Population Impact

: 4
: 1,000

External Links

View on AI Incident Database