ETS Used Allegedly Flawed Voice Recognition Evidence to Accuse and Assess Scale of Cheating, Causing Thousands to be Deported from the UK

Jan 1, 2014 | 1 report | Severity: Substantial | High confidence

ETS voice recognition software used by the UK Home Office to identify cheating in English language tests incorrectly flagged 97% of 58,000 tests as suspicious, leading to the deportation of more than 2,500 people and the forced departure of at least 7,200 more, many of whom were innocent.

Between 2011 and 2014, Educational Testing Service (ETS) administered the Test of English for International Communication (TOEIC) at more than 100 test centers in the UK for visa applications. After a 2014 BBC Panorama investigation exposed fraud at two London test centers, the UK Home Office asked ETS to assess the scale of cheating across all centers using voice recognition software. ETS analyzed 58,000 test recordings and flagged 97% as suspicious, classifying 33,663 as 'invalid' (definite cheating) and 22,476 as 'questionable'. On the basis of this analysis, the Home Office revoked the visas of everyone with an invalid test, leading to the deportation of more than 2,500 people and the forced departure of at least 7,200 more.

Subsequent investigations revealed serious flaws in ETS's evidence: test recordings had become separated from the individuals who took the tests, carried no metadata that would allow verification, and the associated candidate details contained significant errors. ETS staff had known about organized cheating at some centers for almost two years before the Panorama broadcast but did not inform the Home Office, in order to protect test fee income.

By 2019, more than 3,700 people had successfully appealed their cases, many by proving that the voice recordings were in fact their own. The scandal cost taxpayers £21 million, while ETS paid a £1.6 million settlement in 2018.
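The headline figures above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration using the numbers quoted in this report; the variable names are hypothetical, and because 58,000 is a rounded total and the removal counts are lower bounds ("more than", "at least"), the results are approximate.

```python
# Figures as quoted in the report. TOTAL_TESTS is a rounded number,
# and the removal counts are lower bounds, so results are approximate.
TOTAL_TESTS = 58_000
INVALID = 33_663            # classified as 'invalid'
QUESTIONABLE = 22_476       # classified as 'questionable'
DEPORTED = 2_500            # "more than 2,500"
FORCED_DEPARTURES = 7_200   # "at least 7,200"

flagged = INVALID + QUESTIONABLE
flagged_share = flagged / TOTAL_TESTS
not_flagged = TOTAL_TESTS - flagged
people_removed = DEPORTED + FORCED_DEPARTURES

print(f"Flagged tests:     {flagged:,} ({flagged_share:.1%} of {TOTAL_TESTS:,})")
print(f"Not flagged:       {not_flagged:,}")
print(f"People removed:    at least {people_removed:,}")
```

Under these assumptions the two flagged categories sum to 56,139 tests, which rounds to the 97% quoted above, and the combined removals come to 9,700, matching the Population Impact figure listed further down.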

Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.

Risk Domain

7. AI System Safety, Failures & Limitations

7.3 Lack of capability or robustness

AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.

Causal Classification

Entity

AI system

Due to a decision or action made by an AI system

Intent

Unintentional

Due to an unexpected outcome from pursuing a goal

Timing

Post-deployment

Occurring after the AI model has been trained and deployed

Harm Severity Assessment

Highest Score: 3: Substantial (Financial Loss, direct)

National Security Assessment

Overall Score

Stakeholders

: ETS
: ETS
: UK ETS Past Test Takers, UK ETS Test Takers, UK Home Office

AI System Classification

: Cheating Detection
: Voice Recognition
: 2 High Risk
: 1

Population Impact

: 9,700
: 58,000

External Links

View on AI Incident Database