Skip to main content
This is a research prototype. The data and analyses are preliminary and not yet validated — we'd welcome your .

Benchmarking (Raw data contamination)

Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems

Gipiškis et al. (2024)

Sub-category
Risk Domain

Inadequate regulatory frameworks and oversight mechanisms that fail to keep pace with AI development, leading to ineffective governance and the inability to manage AI risks appropriately.

"This type of contamination [170] occurs when the raw and unlabeled data of a benchmark is used as part of the training set. Such data may not be properly formatted and may contain noise, especially if the contamination happens before the data is pre-processed into the benchmark. If this contamination occurs, it could cast doubt on the few-shot and zero-shot performance of the model on that benchmark."(p. 19)

Other risks from Gipiškis et al. (2024) (144)