OpenAI released its AI Text Classifier tool to detect AI-generated content, but the tool demonstrated significant failures, incorrectly identifying human-written texts, including excerpts from a 2015 machine learning book and Shakespeare's Macbeth, as likely AI-generated and raising concerns about false accusations of plagiarism in educational settings.
OpenAI launched the AI Text Classifier in early 2023 as a tool to identify text generated by AI systems such as ChatGPT, intended in part to help educators detect potential plagiarism and academic dishonesty. The tool quickly demonstrated significant reliability issues when tested by researchers and users. Sebastian Raschka, an AI researcher, tested the classifier on excerpts from his Python machine learning book published in 2015; the tool variously flagged the human-written passages as 'unclear,' 'possibly AI,' and 'likely AI' generated. Most notably, the classifier labeled the first page of Shakespeare's Macbeth as 'likely AI-generated.'

OpenAI acknowledged the tool's limitations, reporting that it correctly identifies only 26% of AI-written text while mislabeling human-written text as AI-generated 9% of the time, and stated that the classifier is 'not fully reliable' and should be used only as a complement to other detection methods. Similar detection tools such as GPTZero and DetectGPT showed comparable failure rates, and researchers demonstrated that AI-generated content could easily evade detection through simple reprompting and paraphrasing. The incident raised particular concern about harm to students who might be falsely accused of plagiarism, as educators were already beginning to adopt such tools for grading and academic-integrity enforcement.
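To see why these error rates translate into false-accusation risk, it helps to work out the positive predictive value they imply. The sketch below applies Bayes' theorem to OpenAI's reported 26% true-positive and 9% false-positive rates; the 10% base rate of AI-written submissions is a hypothetical assumption chosen purely for illustration, not a figure from the incident reports.

```python
# Back-of-the-envelope Bayes calculation. The 26% true-positive and 9%
# false-positive rates are OpenAI's reported figures; the base rate of
# AI-written submissions is an assumed value for illustration only.

def p_ai_given_flag(tpr: float, fpr: float, base_rate: float) -> float:
    """P(text is AI-written | classifier flags it as AI-written)."""
    p_flag = tpr * base_rate + fpr * (1 - base_rate)  # total flag probability
    return tpr * base_rate / p_flag

tpr = 0.26        # P(flagged | AI-written), per OpenAI
fpr = 0.09        # P(flagged | human-written), per OpenAI
base_rate = 0.10  # assumed share of submissions that are AI-written

ppv = p_ai_given_flag(tpr, fpr, base_rate)
print(f"P(AI-written | flagged)    = {ppv:.2f}")      # ~0.24
print(f"P(human-written | flagged) = {1 - ppv:.2f}")  # ~0.76
```

Under this assumption, roughly three out of four flagged submissions would in fact be human-written, which illustrates why OpenAI cautioned against relying on the classifier as a standalone basis for plagiarism accusations.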
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
Risk domain: AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.
Entity: AI system (due to a decision or action made by an AI system).
Intent: Unintentional (due to an unexpected outcome from pursuing a goal).
Timing: Post-deployment (occurring after the AI model has been trained and deployed).
Population impact: No data reported.