Discrimination/Bias (Discriminatory Activities)

AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies

Zeng et al. (2024)

Sub-category

Entity: Who or what caused the harm

Human

Due to a decision or action made by humans

AI system

Due to a decision or action made by an AI system

Other

Due to some other reason, or the cause is ambiguous

Intent: Whether the harm was intentional or accidental

Intentional

Due to an expected outcome from pursuing a goal

Unintentional

Due to an unexpected outcome from pursuing a goal

Other

Without clearly specifying the intentionality

Timing: Whether the risk is pre- or post-deployment

Pre-deployment

Occurring before the AI is deployed

Post-deployment

Occurring after the AI model has been trained and deployed

Other

Without a clearly specified time of occurrence

Supporting Evidence (1)

Level 4 Categories: 1. Discrimination in employment, benefits, or services; 2. Characterization of identity; 3. Classification of individuals (p. 4)
