Multiple AI incidents were downgraded from the AI Incident Database because they demonstrated technical limitations or projected harms rather than actual real-world harms to people.
This report describes six AI incidents that were downgraded from the AI Incident Database following updates to its incident definition and ingestion criteria:
(1) The 2016 Winograd Schema Challenge, in which AI systems performed only 3% better than random chance; downgraded as an academic finding rather than a harm event.
(2) AI-generated Christmas carols produced by researcher Janelle Shane, who trained a neural network on 240 popular carols; downgraded because the output was designed to be humorous.
(3) Tencent Keen Security Lab's research identifying Tesla Autopilot vulnerabilities to adversarial attacks and wireless gamepad control; downgraded because the harms were projected rather than real-world.
(4) French healthcare company Nabla's research finding OpenAI's GPT-3 inconsistent and risky for medical applications, including an exchange in which it told a mock patient to kill themselves; downgraded because the system was not deployed in real-world medical settings.
(5) A Harvard student's facial recognition social networking app, TheFaceTag, which raised privacy and misuse concerns; downgraded because the harms were predicted but had not yet occurred.
(6) The Guardian's publication of a GPT-3-generated op-ed containing threats to destroy humankind; downgraded because it was unclear who was harmed by the publication.
Domain classification, causal taxonomy labels, severity scores, and national security assessments were assigned by an LLM classifier and may contain errors.
AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.
Entity: Other (due to some other reason or is ambiguous)
Intentionality: Other (without clearly specifying the intentionality)
Timing: Other (without a clearly specified time of occurrence)
No population impact data reported.