BackUnrepresentative data
Unrepresentative data
Risk Domain
AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.
"Unrepresentative data occurs when the training or fine-tuning data is not sufficiently representative of the underlying population or does not measure the phenomenon of interest."
Entity— Who or what caused the harm
Intent— Whether the harm was intentional or accidental
Timing— Whether the risk is pre- or post-deployment
Supporting Evidence (1)
1.
"If the data is not representative, then the model will not work as intended."
Other risks from IBM2025 (63)
Lack of training data transparency
6.5 Governance failureHumanUnintentionalPre-deployment
Uncertain data provenance
6.5 Governance failureHumanOtherPre-deployment
Data usage restrictions
7.3 Lack of capability or robustnessHumanUnintentionalPre-deployment
Data acquisition restrictions
7.3 Lack of capability or robustnessHumanUnintentionalPre-deployment
Data transfer restrictions
7.3 Lack of capability or robustnessHumanUnintentionalPre-deployment
Personal information in data
2.1 Compromise of privacy by leaking or correctly inferring sensitive informationAI systemUnintentionalPost-deployment