Researchers discovered that word embeddings trained on Google News articles exhibit significant gender stereotypes, and that these biases can be amplified when the embeddings are used in downstream machine learning applications.
Researchers analyzed word embeddings, a popular framework for representing text as vectors in machine learning and natural language processing tasks. They found that embeddings trained on Google News articles exhibited female/male gender stereotypes to a disturbing extent. The study showed that gender bias can be captured geometrically as a direction in the embedding space, and that gender-neutral words are linearly separable from gender-definitional words. The embeddings encoded problematic associations, such as 'receptionist' with 'female', alongside legitimate ones, such as 'queen' with 'female'. To address this, the researchers developed methodology and algorithms to 'debias' embeddings by removing gender stereotypes while preserving useful properties such as the clustering of related concepts and performance on analogy tasks. They defined metrics quantifying both direct and indirect gender bias, and used crowd-worker evaluations and standard benchmarks to demonstrate that their debiasing algorithms significantly reduced gender bias while maintaining the embeddings' usefulness for downstream applications.
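The geometric idea can be sketched in a few lines of code. Below is a minimal Python illustration, not the authors' implementation: it estimates a gender direction by averaging normalized difference vectors of definitional pairs (the paper uses PCA over such pairs), neutralizes a gender-neutral word by removing its component along that direction, and equalizes a definitional pair so its two words differ only along the gender direction. The word list and toy random vectors are assumptions for demonstration only.

```python
import numpy as np

def gender_direction(emb, pairs):
    # Simplified stand-in for the paper's PCA-based estimate:
    # average the unit difference vectors of definitional pairs.
    diffs = [emb[a] - emb[b] for a, b in pairs]
    g = np.mean([d / np.linalg.norm(d) for d in diffs], axis=0)
    return g / np.linalg.norm(g)

def neutralize(v, g):
    # Remove the component of v along the (unit) gender direction g.
    v_debiased = v - np.dot(v, g) * g
    return v_debiased / np.linalg.norm(v_debiased)

def equalize(v1, v2, g):
    # Re-center a definitional pair so both words share the same
    # gender-neutral part and differ only along g (1-d bias subspace).
    mu = (v1 + v2) / 2
    mu_orth = mu - np.dot(mu, g) * g
    scale = np.sqrt(max(1.0 - np.linalg.norm(mu_orth) ** 2, 0.0))
    sign = np.sign(np.dot(v1 - v2, g))
    return mu_orth + sign * scale * g, mu_orth - sign * scale * g

# Toy 4-d unit embeddings, for illustration only.
rng = np.random.default_rng(0)
emb = {}
for w in ["she", "he", "woman", "man", "receptionist"]:
    v = rng.normal(size=4)
    emb[w] = v / np.linalg.norm(v)

g = gender_direction(emb, [("she", "he"), ("woman", "man")])
before = np.dot(emb["receptionist"], g)
after = np.dot(neutralize(emb["receptionist"], g), g)
print(f"bias component before: {before:+.3f}, after: {after:+.3f}")  # after = 0
```

The printed projection onto g plays the role of a (simplified) direct-bias measurement: a gender-neutral word like 'receptionist' should score near zero after debiasing, while equalized pairs like 'she'/'he' retain their gender component by design.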
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
Unequal treatment of individuals or groups by AI, often based on race, gender, or other sensitive characteristics, resulting in unfair outcomes and unfair representation of those groups.
Entity: AI system (due to a decision or action made by an AI system)
Intent: Unintentional (due to an unexpected outcome from pursuing a goal)
Timing: Post-deployment (occurring after the AI model has been trained and deployed)
No population impact data reported.