Fine-tuning related (Excessive or overly restrictive safety-tuning)

Sub-category
Risk Domain

AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.

"Excessive safety training or safety tuning can impair the performance of AI systems, leading to overly cautious behavior. As a result, these systems may refuse to answer entirely safe prompts which are partially similar to harmful ones [27]." (p. 15)

Other risks from Gipiškis2024 (144)