
Prompt priming

Risk Domain

Vulnerabilities that can be exploited in AI systems, software development toolchains, and hardware, resulting in unauthorized access, data and privacy breaches, or system manipulation causing unsafe outputs or behavior.

"Because generative models tend to produce output like the input provided, the model can be prompted to reveal specific kinds of information. For example, adding personal information in the prompt increases its likelihood of generating similar kinds of personal information in its output. If personal data was included as part of the model’s training, there is a possibility it could be revealed."

Supporting Evidence (1)

1. "Jailbreaking attacks can be used to alter model behavior and benefit the attacker."
