
Benchmark Inaccuracy (Benchmark saturation)

Sub-category
Risk Domain

Inadequate regulatory frameworks and oversight mechanisms that fail to keep pace with AI development, leading to ineffective governance and the inability to manage AI risks appropriately.

"Benchmark saturation refers to benchmarks reaching their evaluation ceiling. The tendency towards benchmark saturation has been demonstrated in various benchmarks [19]. When benchmarks reach or are close to saturation, they stop being effective measures for new models, as more nuanced capability gains might not be detected."(p. 21)

Other risks from Gipiškis2024 (144)