
Fine-tuning related (Degrading safety training due to benign fine-tuning)

Sub-category

"When downstream providers of AI systems fine-tune AI models to be more suitable for their needs, the resulting AI model can be more likely to produce undesired or harmful outputs (as compared to the non-fine-tuned model), even if the fine-tuning was done with harmless and commonly used data [154]." (p. 15)
