
Prompt Leaking

Safety Assessment of Chinese Large Language Models

Sun et al. (2023)

Sub-category
Risk Domain

AI systems that memorize and leak sensitive personal data or infer private information about individuals without their consent. Unexpected or unauthorized sharing of data and information can compromise users' expectations of privacy, facilitate identity theft, or cause loss of confidential intellectual property.

"By analyzing the model’s output, attackers may extract parts of the systemprovided prompts and thus potentially obtain sensitive information regarding the system itself."(p. 4)

Part of Instruction Attacks
