Researchers found that Microsoft Copilot provided scientifically accurate medical information only 54% of the time, with 42% of answers potentially causing moderate or mild harm and 22% potentially causing death or severe harm.
Researchers from Germany and Belgium conducted a study testing Microsoft Copilot's responses to common medical questions. They posed the 10 most frequently asked patient questions about each of the 50 most prescribed drugs in the United States, generating 500 answers in total (10 questions × 50 drugs), and scored the responses for accuracy and completeness against established medical knowledge. Copilot provided scientifically accurate information only 54% of the time; 24% of answers did not match established medical knowledge, and 3% were completely wrong. In terms of potential harm, 42% of the answers were judged likely to cause moderate or mild harm to patients, and 22% could potentially cause death or severe harm; only 36% were considered harmless. The findings, published as a research paper, highlight concerns about people relying on AI systems for medical advice, particularly those who cannot easily access medical professionals. The incident adds to existing problems with AI search systems, following similar cases of Google's AI providing dangerous recommendations.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that inadvertently generate or spread incorrect or deceptive information, which can lead to inaccurate beliefs in users and undermine their autonomy. Humans who make decisions based on false beliefs can experience physical, emotional, or material harms.
Entity: AI system (due to a decision or action made by an AI system)
Intent: Unintentional (due to an unexpected outcome from pursuing a goal)
Timing: Post-deployment (occurring after the AI model has been trained and deployed)
No population impact data reported.