Researchers at Carnegie Mellon and George Washington University discovered that unsupervised AI image models like OpenAI's iGPT and Google's SimCLR encode gender and racial biases from their training data, even without human-labeled images.
Ryan Steed of Carnegie Mellon University and Aylin Caliskan of George Washington University studied two unsupervised learning algorithms: OpenAI's iGPT (a version of GPT-2 trained on pixels rather than words) and Google's SimCLR. Both learn from unlabeled images, without human annotations. The researchers adapted techniques previously used to measure bias in natural language processing models, probing the embeddings (mathematical representations that cluster similar content together) that each model produces.

They found that both systems exhibited stereotypical associations resembling those measured in human Implicit Association Tests: photos of men clustered closer to images of ties and suits, while photos of women sat farther from them. Even without human-created labels, the images themselves, scraped from internet datasets, encode harmful stereotypes, driven by the overrepresentation of certain demographics and by stereotypical portrayals online.

The findings have concerning implications for downstream applications, particularly when these models are fine-tuned for sensitive uses such as hiring, policing, or other consequential decision-making systems. The researchers warn of potential harm when such biased models are deployed in real-world applications and call for greater transparency, more testing before deployment, and more responsible dataset curation practices.
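The association tests the summary describes follow the WEAT family of methods from Caliskan et al.'s NLP bias work: given two sets of target embeddings and two sets of attribute embeddings, compute how much closer (by cosine similarity) each target set sits to one attribute set than the other. A minimal sketch of that effect-size calculation, using toy 2-D vectors as hypothetical stand-ins for real image embeddings (all names and numbers here are illustrative, not from the study):

```python
import numpy as np

def cos(u, v):
    # cosine similarity between two embedding vectors
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    # mean similarity of w to attribute set A minus attribute set B
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # WEAT-style effect size: difference of mean associations for target
    # sets X and Y, normalized by the pooled standard deviation.
    # Values range roughly from -2 to 2; larger magnitude = stronger bias.
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)

# Toy embeddings (hypothetical): X/Y stand in for two demographic groups,
# A/B for two attribute categories (e.g. business attire vs. a contrast set).
X = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]
Y = [np.array([0.1, 1.0]), np.array([0.2, 0.9])]
A = [np.array([1.0, 0.0])]
B = [np.array([0.0, 1.0])]

print(weat_effect_size(X, Y, A, B))  # positive: X clusters closer to A
```

The iEAT in the study applies this same statistic to embeddings extracted from image models instead of word vectors; a significance test via permutation of the target sets typically accompanies the effect size.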
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
Unequal treatment of individuals or groups by AI, often based on race, gender, or other sensitive characteristics, resulting in unfair outcomes and unfair representation of those groups.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Pre-deployment
Occurring before the AI is deployed
No population impact data reported.