
Lack of understanding of in-context learning in language models

Sub-category
Risk Domain

Challenges in understanding or explaining the decision-making processes of AI systems, which can lead to mistrust, difficulty in enforcing compliance standards or holding relevant actors accountable for harms, and the inability to identify and correct errors.

"In-context learning allows the model to learn a new task or improve its performance by providing examples in the prompt, without changing its weights [101]. Even though this technique is highly effective, its working mechanism is not well understood. Since many potential misuses are directly related to prompting, it becomes difficult to guarantee safety when the exact mechanism of in-context learning is not fully investigated [13]." (p. 28)
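To make the quoted description concrete, the following is a minimal sketch of what "providing examples in the prompt" can look like in practice. It is an illustration only, not taken from the source: the function name `build_few_shot_prompt`, the `Input:`/`Output:` layout, and the translation demonstrations are all hypothetical choices. No model is called and no weights are involved; the only thing that changes from task to task is the text of the prompt.

```python
def build_few_shot_prompt(examples, query, instruction=""):
    """Assemble a few-shot prompt from (input, output) demonstration pairs.

    Hypothetical helper for illustration: the task is conveyed entirely
    through the prompt text, so a model's weights stay untouched.
    """
    lines = []
    if instruction:
        lines.append(instruction)
        lines.append("")
    # Each demonstration shows the model one solved instance of the task.
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final, unanswered item is what the model is asked to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Illustrative demonstrations (assumed, not from the source).
demos = [("cheese", "fromage"), ("house", "maison")]
prompt = build_few_shot_prompt(
    demos, "book", instruction="Translate English to French."
)
print(prompt)
```

Because the demonstrations are ordinary prompt text, anyone who can write a prompt can supply them, which is why the source ties the limited understanding of this mechanism to prompting-related misuse.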

Supporting Evidence (1)

1. "For example, in-context learning has been used to re-learn forbidden tasks in models that have been fine-tuned not to engage in the forbidden behavior [218, 7]." (p. 28)
