Privacy Leakage
AI systems that memorize and leak sensitive personal data or infer private information about individuals without their consent. Unexpected or unauthorized sharing of data and information can compromise users' expectations of privacy, facilitate identity theft, or cause the loss of confidential intellectual property.
"The model is trained with personal data in the corpus and unintentionally exposing them during the conversation." (p. 4)
Sub-categories (3)
Private Training Data
"As recent LLMs continue to incorporate licensed, created, and publicly available data sources in their corpora, the potential to mix private data in the training corpora is significantly increased. The misused private data, also named as personally identifiable information (PII) [84], [86], could contain various types of sensitive data subjects, including an individual person’s name, email, phone number, address, education, and career. Generally, injecting PII into LLMs mainly occurs in two settings — the exploitation of web-collection data and the alignment with personal human-machine conversations [87]. Specifically, the web-collection data can be crawled from online sources with sensitive PII, and the personal human-machine conversations could be collected for SFT and RLHF"
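As a rough illustration of how PII can ride along in web-collected text, the sketch below scans a small corpus for email addresses and phone numbers. The patterns and the names `find_pii` and `flag_documents` are assumptions made for this example, not part of Cui et al. (2024), and two regular expressions fall far short of real PII detection.

```python
import re

# Illustrative patterns only (assumed for this sketch); real PII detection
# needs much broader coverage: names, addresses, IDs, and so on.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def find_pii(text):
    """Return a dict mapping a PII type to the matches found in text."""
    return {
        "email": EMAIL_RE.findall(text),
        "phone": PHONE_RE.findall(text),
    }


def flag_documents(corpus):
    """Return the indices of documents with at least one PII match."""
    flagged = []
    for i, doc in enumerate(corpus):
        hits = find_pii(doc)
        if any(hits.values()):
            flagged.append(i)
    return flagged


# Hypothetical mini-corpus standing in for crawled web text.
corpus = [
    "Have a good day!\n alice@email.com",
    "The weather is mild today.",
    "Call me at +1 555 010 9999 after lunch.",
]
print(flag_documents(corpus))
```

Flagged documents could then be filtered or redacted before training; which of the two is appropriate depends on the data source and is outside the scope of this sketch.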
2.1 Compromise of privacy by leaking or correctly inferring sensitive information
Memorization in LLMs
"Memorization in LLMs refers to the capability to recover the training data with contextual prefixes. According to [88]–[90], given a PII entity x, which is memorized by a model F. Using a prompt p could force the model F to produce the entity x, where p and x exist in the training data. For instance, if the string “Have a good day!\n alice@email.com” is present in the training data, then the LLM could accurately predict Alice’s email when given the prompt “Have a good day!\n”."
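The memorization test described in the quote can be sketched as a prefix probe: prompt the model with a prefix p taken from the training data and check whether the continuation contains the entity x. The "model" here is a hypothetical verbatim-lookup stand-in (`make_toy_model`), assumed only so the example runs without a real LLM; `is_memorized` is likewise an illustrative name.

```python
def make_toy_model(training_data):
    """Build a stand-in 'model' F that completes a prompt by verbatim lookup.

    This is an assumption for illustration: a real LLM generates text
    probabilistically instead of looking strings up.
    """
    def generate(prompt):
        for doc in training_data:
            idx = doc.find(prompt)
            if idx != -1:
                # Return whatever followed the prompt in the training text.
                return doc[idx + len(prompt):]
        return ""
    return generate


def is_memorized(generate, prefix, entity):
    """True if prompting with prefix p makes the model emit entity x."""
    return entity in generate(prefix)


# The example string from the quoted passage.
training_data = ["Have a good day!\n alice@email.com"]
model_f = make_toy_model(training_data)

print(is_memorized(model_f, "Have a good day!\n", "alice@email.com"))
print(is_memorized(model_f, "Good morning!\n", "alice@email.com"))
```

With a real model, `generate` would wrap a decoding call, and the check would typically be repeated over many prefix and entity pairs to estimate a memorization rate.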
2.1 Compromise of privacy by leaking or correctly inferring sensitive information
Association in LLMs
"Association in LLMs refers to the capability to associate various pieces of information related to a person. According to [68], [86], given a pair of PII entities (x_i, x_j), which is associated by a model F. Using a prompt p could force the model F to produce the entity x_j, where p is the prompt related to the entity x_i. For instance, an LLM could accurately output the answer when given the prompt “The email address of Alice is”, if the LLM associates Alice with her email “alice@email.com”."
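Association differs from memorization in that the prompt only needs to relate to the first entity; it does not have to appear verbatim in the training data. A minimal sketch of such a probe follows, assuming a hypothetical stand-in model (`make_toy_associative_model`) that answers with the second entity whenever the prompt mentions the first. The template string comes from the quoted example; everything else is an assumption for illustration.

```python
def make_toy_associative_model(associations):
    """Stand-in 'model' F: emits x_j whenever the prompt mentions x_i.

    Assumed for illustration only; a real LLM has no explicit lookup table.
    """
    def generate(prompt):
        for x_i, x_j in associations.items():
            if x_i in prompt:
                return " " + x_j
        return ""
    return generate


def is_associated(generate, x_i, x_j, template="The email address of {} is"):
    """True if a prompt built around x_i makes the model emit x_j."""
    prompt = template.format(x_i)
    return x_j in generate(prompt)


# One associated pair (x_i, x_j), taken from the quoted example.
model_f = make_toy_associative_model({"Alice": "alice@email.com"})

print(is_associated(model_f, "Alice", "alice@email.com"))
print(is_associated(model_f, "Bob", "bob@email.com"))
```

Because the probe is parameterized by a template, the same pair can be tested under several phrasings, which is one way to separate association from verbatim recall of a single training string.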
2.1 Compromise of privacy by leaking or correctly inferring sensitive information
Other risks from Cui et al. (2024) (49)
Harmful Content
1.2 Exposure to toxic content
Harmful Content > Bias
1.1 Unfair discrimination and misrepresentation
Harmful Content > Toxicity
1.2 Exposure to toxic content
Harmful Content > Privacy Leakage
2.1 Compromise of privacy by leaking or correctly inferring sensitive information
Untruthful Content
3.1 False or misleading information
Untruthful Content > Factuality Errors
3.1 False or misleading information