Benchmark Inaccuracy (Benchmarks may not accurately evaluate capabilities)
Inadequate regulatory frameworks and oversight mechanisms that fail to keep pace with AI development, leading to ineffective governance and the inability to manage AI risks appropriately.
"Benchmarks of AI systems can both underestimate and overestimate the capa- bilities of those AI systems. Underestimates can happen if an evaluation is not comprehensive enough, if the benchmark is saturated by existing models, or if the capabilities in question depend on a complicated setup, such as realistic computer programming tasks. Overestimates of capabilities can occur if an AI system is trained or fine-tuned on the contents of the benchmark, leading to overfitting."(p. 21)
Other risks from Gipiškis2024 (144)
Direct Harm Domains (content safety harms)
1.2 Exposure to toxic contentDirect Harm Domains (content safety harms) > Violence and extremism
1.2 Exposure to toxic contentDirect Harm Domains (content safety harms) > Hate and toxicity
1.2 Exposure to toxic contentDirect Harm Domains (content safety harms) > Sexual content
1.2 Exposure to toxic contentDirect Harm Domains (content safety harms) > Child harm
1.2 Exposure to toxic contentDirect Harm Domains (content safety harms) > Self-harm
1.2 Exposure to toxic content