BackIncreasing Fairness and Representation

This page is still being polished. If you have thoughts, please share them via the feedback form.

Data on this page is preliminary and may change. Please do not share or cite these figures publicly.

Increasing Fairness and Representation

Hodes (2025)|LLM classified

Mitigation Taxonomy

1AI System

Technical mechanisms and engineering interventions that directly modify how an AI system processes inputs, generates outputs, or operates, including changes to models, training procedures, runtime behaviors, and supporting hardware.

Definition

Algorithmic fairness refers to ensuring AI systems and models do not perpetuate or amplify societal biases based on protected characteristics (like race, gender, class), through technical approaches (fairness metrics, bias mitigation) and policy measures (affirmative safety requirements, model cards).

LLM Classification Details

Reasoning

Establishes fairness and ethics principles as design standards governing model development and organizational policies.

Code: 2.4.2Version: v0.5Classified: Jan 22, 2026

Other mitigations from Hodes (2025) (34)

Reduce Hallucinations

Reduce hallucination refers to techniques and methods used to minimize AI systems' tendency to generate false or fabricated information, addressing a critical challenge where language models produce inaccurate facts or citations that could spread misinformation.

1 AI System

Lifecycle:Build and Use ModelActor:DeveloperAIRM:Manage

Mitigate Hallucinations

Technical approaches to reduce LLM hallucinations - instances where AI models generate false or unsupported information while appearing confident in their responses

1 AI System

Lifecycle:Verify and ValidateActor:DeveloperAIRM:Manage

Detecting AI-Generated Content

Detecting AI-generated content involves technical methods and tools to identify whether content was created by artificial intelligence or humans, primarily through watermarking, linguistic analysis, and machine learning approaches.

1.2.5 Provenance & Watermarking

Lifecycle:Other (outside lifecycle)Actor:DeveloperAIRM:Measure

Risks from Persuasion

Risk that AI systems can systematically influence human beliefs and behaviors through sustained, personalized interactions by exploiting cognitive biases and adapting in real-time, enabling large-scale manipulation without human intervention.

99 Other

Lifecycle:Operate and MonitorActor:DeveloperAIRM:Govern

Content Moderation

Content moderation systems enable detecting and filtering toxic content (hate speech, harassment, misinformation) in real-time on digital platforms, while maintaining transparency in moderation decisions.

1.2.1 Guardrails & Filtering

Lifecycle:Operate and MonitorActor:DeployerAIRM:Manage

Make AI Manipulation Use Illegal

Legal framework to criminalize the malicious use of AI for manipulation of individuals or groups, including the creation and deployment of deepfakes and automated influence campaigns.

3.1.1 Legislation & Policy

Lifecycle:Other (outside lifecycle)Actor:Governance ActorAIRM:Govern

View all 34 mitigations from this source →

Source Document

Global Risk and AI Safety Preparedness (GRASP)

Hodes, Cyrus; Salem, Fadi; Corruble, Vincent; Ségerie, Charbel-Raphaël; Claybrough, Jonathan; Veron, Thibaud; Majid, Zainab; Fan, Jinyu; Lorin, Amaury (2025)

Project GRASP (Global Risk and AI Safety Preparedness) is a comprehensive database mapping AI risks and mitigation solutions. The initiative addresses both endogenous risk (autonomous AI systems that behave outside of human supervision) and exogenous risk (the human misuse of those AI systems). The platform serves policymakers, researchers, and industry leaders by providing tools required to identify risks, understand solutions, and find innovations.

View source

Classification

AI Lifecycle Stage

Build and Use Model

Training, fine-tuning, and integrating the AI model

Responsible Actor

Developer

Entity that creates, trains, or modifies the AI system

Governance Actor

NIST AI RMF Function

Govern

Policies, processes, and accountability structures for AI risk management

Risk Domains

Primary

1.1 Unfair discrimination and misrepresentation

Other

1.3 Unequal performance across groups