Compromising privacy by leaking private information
AI systems that memorize and leak sensitive personal data or infer private information about individuals without their consent. Unexpected or unauthorized sharing of data and information can compromise users' expectations of privacy, facilitate identity theft, or cause the loss of confidential intellectual property.
"By providing true information about individuals’ personal characteristics, privacy violations may occur. This may stem from the model “remembering” private information present in training data (Carlini et al., 2021)." (p. 18)
Supporting Evidence (2)
Example: "Privacy leaks occurred when Scatterlab’s chatbot Lee Luda disclosed, ‘random names, addresses, and bank account numbers from the training dataset. ScatterLab had even uploaded a training model of Luda on GitHub, which included data that exposed personal information ... triggering a class-action lawsuit against ScatterLab’ (Kim, 2021). The company has now been fined for harvesting user data without consent to produce the chatbot (Dobberstein, 2021)." (p. 19)
"This ‘unintended memorization’ of training data can occur even when there is not overfitting in the traditional statistical sense (Carlini et al., 2019), and can be observed serendipitously when sampling from LMs even without any form of "malicious" prompting (Carlini et al., 2021)." (p. 19)
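As a rough illustration of how such leaks might be noticed when sampling from a language model, the sketch below scans already-generated text samples for strings that look like email addresses or long account-style digit runs. It is not taken from Weidinger et al. (2021) or from the cited studies: the pattern set, the function names `find_pii_like_spans` and `flag_samples`, and the example strings are all hypothetical. Real detection of personal information would need far broader coverage and human review.

```python
import re

# Hypothetical, deliberately narrow patterns for illustration only.
PII_PATTERNS = {
    # Something shaped like an email address.
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 10 to 16 digits, optionally separated by single spaces or hyphens,
    # as a crude stand-in for account or phone numbers.
    "long_digit_run": re.compile(r"\b\d(?:[ -]?\d){9,15}\b"),
}


def find_pii_like_spans(text):
    """Return (label, matched_text) pairs for every pattern hit in text."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits


def flag_samples(samples):
    """Map sample index to its hits, keeping only samples with at least one hit."""
    flagged = {}
    for index, sample in enumerate(samples):
        hits = find_pii_like_spans(sample)
        if hits:
            flagged[index] = hits
    return flagged


# Made-up strings standing in for text sampled from a model.
example_samples = [
    "The weather is nice today.",
    "Contact me at jane.doe@example.com",
    "Account 1234-5678-9012 is overdrawn",
]
flagged = flag_samples(example_samples)
```

A pattern match only shows that a sample contains text shaped like personal data. Deciding whether it is a genuine leak would still require checking the string against the training data, which this sketch does not attempt.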
Part of Information Hazards
Other risks from Weidinger et al. (2021) (26)
Discrimination, Exclusion and Toxicity
1.0 Discrimination & Toxicity
Discrimination, Exclusion and Toxicity > Social stereotypes and unfair discrimination
1.1 Unfair discrimination and misrepresentation
Discrimination, Exclusion and Toxicity > Exclusionary norms
1.1 Unfair discrimination and misrepresentation
Discrimination, Exclusion and Toxicity > Toxic language
1.2 Exposure to toxic content
Discrimination, Exclusion and Toxicity > Lower performance for some languages and social groups
1.3 Unequal performance across groups
Information Hazards
2.1 Compromise of privacy by leaking or correctly inferring sensitive information