AI Democracy Projects found that five leading AI models, including Google's Gemini and OpenAI's GPT-4, provided inaccurate election information 52% of the time when asked in Spanish compared to 43% in English, with models often providing different answers to the same question in different languages.
An investigation by the AI Democracy Projects and Factchequeado tested five leading AI models (Google's Gemini 1.5 Pro, OpenAI's GPT-4, Anthropic's Claude 3 Opus, Meta's Llama 3, and Mistral's Mixtral 8x7B v0.1) on 25 election-related questions in both English and Spanish. The study found that 52% of Spanish responses contained inaccurate information compared to 43% of English responses, for an overall inaccuracy rate of 48%. When asked the same questions in different languages, models often provided completely different answers, with Spanish queries frequently returning information about Latin American elections rather than U.S. elections. For example, when asked about voter fraud, Gemini correctly stated in English that fraud is 'incredibly rare,' but in Spanish it launched into methods for rooting out voter fraud, calling it a 'complex process.' The models failed to uphold company pledges to direct election queries to authoritative sources like TurboVote.org, and often provided broken links or English-language websites in response to Spanish queries. All five models performed worse in Spanish than in English, with the widest gaps for Mixtral, Llama, and Claude. The testing involved 250 total responses analyzed by fact-checkers, with questions sourced from county election office FAQs and common misinformation identified by Factchequeado.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that inadvertently generate or spread incorrect or deceptive information, which can lead to inaccurate beliefs in users and undermine their autonomy. Humans who make decisions based on false beliefs can experience physical, emotional, or material harms.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed