AI Spam Filters Allegedly Block Legitimate Emails Based on Biased Keyword Detection

Oct 22, 2020 | 1 report | Severity: Severe | Tool | High confidence

Microsoft Outlook's spam filter was found to mark emails as spam based on single words like 'Nigeria', discriminating against Nigerian students and other groups through machine learning algorithms trained on biased data.

AlgorithmWatch ran an experiment in which it sent hundreds of emails to 10 inboxes at Gmail, Yahoo, Outlook, GMX and LaPoste, using accounts created specifically for the test. The results showed that Microsoft Outlook's spam filter marked emails as spam on the basis of single words. An internship application from a Nigerian student was filtered when it contained 'Nigeria' but delivered when the word was removed. A sex education program description was filtered with 'sex' but delivered without it. An excerpt from a Joe Biden speech on student debt was filtered until words like 'loan', 'investment' and 'billion' were removed. The other email providers did not display this behavior.

Microsoft declined to comment on the findings. The researchers concluded that machine learning algorithms had likely identified these words as discriminators between spam and legitimate messages. Microsoft does not make its training dataset available for review, so this could not be verified directly.

SpamAssassin, an open-source spam filter, was found to have similar issues. Its default configuration flags terms such as 'Ivory Coast', 'Nigeria' and 'Nigerian government' as potentially spammy, and lists the phrase 'Oprah!' as potentially spammy, although that rule is inactive. In SpamAssassin's 15-year-old public corpus, which is still widely used for training, 59 out of 1,397 spam emails were from Nigerians while none were in the legitimate email folder.
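The two mechanisms described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by AlgorithmWatch, Outlook or SpamAssassin. The function names, the `toy_filter` stand-in and the size of the legitimate folder are all hypothetical; only the 59 and 1,397 figures come from the report. The first function shows why a word seen in 59 spam messages and no legitimate ones ends up with a strongly spammy weight in a simple smoothed word-frequency model. The second mirrors the experiment's approach of removing one word at a time and checking whether the verdict changes.

```python
import math
import re


def word_log_likelihood_ratio(spam_hits, spam_total, ham_hits, ham_total, alpha=1.0):
    """Laplace-smoothed log of P(word | spam) / P(word | ham).

    Positive values mean the word is more typical of spam in the training data.
    """
    p_word_given_spam = (spam_hits + alpha) / (spam_total + 2 * alpha)
    p_word_given_ham = (ham_hits + alpha) / (ham_total + 2 * alpha)
    return math.log(p_word_given_spam / p_word_given_ham)


def find_trigger_words(text, is_spam):
    """Return the words whose removal alone flips the verdict from spam to not spam."""
    if not is_spam(text):
        return []
    triggers = []
    # dict.fromkeys keeps each distinct word once, in order of first appearance.
    for word in dict.fromkeys(re.findall(r"\w+", text)):
        without_word = re.sub(rf"\b{re.escape(word)}\b", "", text)
        if not is_spam(without_word):
            triggers.append(word)
    return triggers


def toy_filter(text):
    """Hypothetical stand-in for a biased filter; NOT any real provider's logic."""
    return "nigeria" in text.lower()


# 59 of 1,397 spam messages are the figures from the report.
# ASSUMPTION: the legitimate folder size is unknown here, so an equal size is used.
HYPOTHETICAL_HAM_TOTAL = 1397
weight = word_log_likelihood_ratio(59, 1397, 0, HYPOTHETICAL_HAM_TOTAL)
print(f"log-likelihood ratio: {weight:.2f}")

message = "Internship application from a student in Nigeria"
print(find_trigger_words(message, toy_filter))
```

Under the equal-size assumption, the smoothed model treats the word as roughly 60 times more likely to appear in spam than in legitimate mail, which is enough for a single word to dominate the verdict on a short message. The removal test needs only black-box access to a filter, which is why it can be applied to a provider that does not publish its training data.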

Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.

Risk Domain

1 Discrimination & Toxicity

1.1 Unfair discrimination and misrepresentation

Unequal treatment of individuals or groups by AI, often based on race, gender, or other sensitive characteristics, resulting in unfair outcomes and unfair representation of those groups.

Causal Classification

Entity

AI system

Due to a decision or action made by an AI system

Intent

Unintentional

Due to an unexpected outcome from pursuing a goal

Timing

Post-deployment

Occurring after the AI model has been trained and deployed

Harm Severity Assessment

Highest Score: 4 (Severe; Harm to Civil Rights, inferred)

National Security Assessment

Overall Score

Stakeholders

: Yahoo, Microsoft, LaPoste, Google, GMX
: Yahoo, Outlook, LaPoste, GMX, Gmail
: Yahoo! Mail Users, Microsoft Outlook Users, LaPoste Users, GMX Users, Gmail Users

AI System Classification

: Spam Filtering
: Tool
: 2 High Risk
: 1

Population Impact

: 1
: 1,000,000

External Links

View on AI Incident Database