A Microsoft employee discovered that Copilot Designer, the company's AI text-to-image generator, was producing inappropriate and harmful images (including sexual content, violence, underage drinking, and conspiracy theories) when given benign prompts. After Microsoft declined to remove the tool or add appropriate warnings, he filed complaints with the FTC and Microsoft's board.
Shane Jones, a principal software engineering manager at Microsoft, discovered safety issues with Copilot Designer, Microsoft's text-to-image generator launched in March 2023. While testing the tool in December, Jones found that benign prompts produced troubling output: 'car accident' yielded sexually objectified images of women, 'pro-choice' generated disturbing Star Wars imagery with mutated children and blood, and 'teenagers 420 party' created images of underage drinking and drug use. The tool also generated content reflecting political bias, corporate trademark misuse, and conspiracy theories.

Over three months, Jones repeatedly urged Microsoft to remove Copilot Designer from public use until better safeguards could be implemented, but the company declined. He also suggested adding content warnings and changing the Android app rating from 'E for Everyone' to 'Mature 17+', neither of which Microsoft implemented. In February, Jones filed a letter with the Federal Trade Commission and Microsoft's board of directors, expressing particular concern that parents and teachers might recommend the tool for children's school projects. Microsoft's legal team had previously told Jones to delete an earlier LinkedIn post about OpenAI's DALL-E model, which powers Copilot Designer, leading him to send another letter to US senators about the public safety risks and Microsoft's efforts to silence his concerns.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI that exposes users to harmful, abusive, unsafe, or inappropriate content. May involve providing advice for, or encouraging, harmful action. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms such as profanity, inflammatory political speech, or pornography.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed