The WHO's health chatbot SARAH was found to provide wildly inconsistent and contradictory answers to health queries, including problematic responses to mental health crises and failure to provide promised healthcare provider information.
The World Health Organization deployed an AI chatbot called SARAH (Smart AI Resource Assistant for Health) designed to provide health advice to the public based on WHO expert guidance. Through hours of testing, a POLITICO investigation found significant reliability issues with the system. The bot frequently gave contradictory answers to identical queries, failed to deliver promised healthcare provider contact information, and responded inappropriately to mental health crises, including giving US-specific suicide prevention numbers to international users. When a user reported chest pain symptoms, SARAH would offer to find local healthcare providers but then inexplicably shift to tobacco cessation advice. The system showed some improvement with extended interaction time, but the advocacy group Health Action International criticized the bot for dispensing poor-quality answers and broken links and called for it to be taken down. WHO's digital health director Alain Labrique acknowledged the feedback and stated it could be used to improve the tool.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed