
Benchmarking (Benchmark leakage or data contamination)

Sub-category
Risk Domain

Inadequate regulatory frameworks and oversight mechanisms that fail to keep pace with AI development, leading to ineffective governance and the inability to manage AI risks appropriately.

"Benchmark leakage [235, 224, 221, 161] can happen when an AI model is trained or fine-tuned with evaluation-related data. This can lead to an unreliable model evaluation, especially if the data contains question-answer pairs from benchmarks." (Gipiškis2024, p. 19)

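One common way to look for this kind of leakage is to test whether benchmark items share long word sequences with the training data. The sketch below is a minimal illustration of that n-gram overlap heuristic, not a method taken from the quoted source. The function names, the sample data, and the choice of n are all hypothetical, and a real check would need to handle paraphrased or reformatted items that exact matching misses.

```python
import re


def ngrams(text, n=8):
    """Return the set of word n-grams in lowercased, punctuation-stripped text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Texts shorter than n words yield an empty set and are never flagged.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def contamination_rate(benchmark_items, training_docs, n=8):
    """Fraction of benchmark items sharing at least one n-gram with the training docs."""
    if not benchmark_items:
        return 0.0
    train_grams = set()
    for doc in training_docs:
        train_grams |= ngrams(doc, n)
    flagged = sum(1 for item in benchmark_items if ngrams(item, n) & train_grams)
    return flagged / len(benchmark_items)


# Hypothetical data: the first benchmark item appears verbatim in a training document.
benchmark = [
    "What is the capital of France? Answer: Paris",
    "How many legs does a spider have? Answer: eight",
]
training = [
    "Quiz dump: what is the capital of France? Answer: Paris. Next question follows.",
    "Spiders are arachnids that spin webs.",
]

rate = contamination_rate(benchmark, training, n=5)
print(rate)  # 0.5: one of the two items overlaps with the training data
```

A nonzero rate here only signals verbatim overlap. A rate of zero does not show that the evaluation is clean, since leaked question-answer pairs may have been reworded.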