
Compromising privacy by correctly inferring private information

Ethical and social risks of harm from language models

Weidinger et al. (2021)

Risk Domain (sub-category)

AI systems that memorize and leak sensitive personal data, or that infer private information about individuals without their consent. Unexpected or unauthorized sharing of data can compromise users' expectations of privacy, facilitate identity theft, or cause the loss of confidential intellectual property.

"Privacy violations may occur at the time of inference even without the individual’s private data being present in the training dataset. Similar to other statistical models, a LM may make correct inferences about a person purely based on correlational data about other people, and without access to information that may be private about the particular individual. Such correct inferences may occur as LMs attempt to predict a person’s gender, race, sexual orientation, income, or religion based on user input." (p. 19)
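The mechanism described in the quote can be illustrated with a minimal sketch: a purely correlational text classifier, trained only on *other people's* labeled utterances, can still produce a (possibly correct) inference about a private attribute of someone whose data it has never seen. The toy data, labels, and the `infer_attribute` helper below are all hypothetical, chosen only to make the statistical point concrete.

```python
from collections import Counter, defaultdict

# Hypothetical training data: other people's utterances, labeled with a
# private attribute (here, political orientation). The target individual's
# own data is NOT in this set.
labeled_utterances = [
    ("lower taxes and small government", "right"),
    ("cut regulation and taxes", "right"),
    ("universal healthcare for all", "left"),
    ("expand public healthcare access", "left"),
]

# Naive-Bayes-style statistics: per-label word frequencies and label priors.
word_counts = defaultdict(Counter)
label_counts = Counter()
for text, label in labeled_utterances:
    label_counts[label] += 1
    word_counts[label].update(text.split())

def infer_attribute(text):
    """Score each label by smoothed word likelihoods times the label prior."""
    scores = {}
    for label in label_counts:
        total = sum(word_counts[label].values())
        score = 1.0
        for word in text.split():
            # Add-one smoothing so unseen words do not zero out the score.
            score *= (word_counts[label][word] + 1) / (total + 1)
        scores[label] = score * label_counts[label]
    return max(scores, key=scores.get)

# The target never appeared in training, yet correlations learned from
# other people's data yield an inference about a private attribute.
print(infer_attribute("we need healthcare for all"))  # → left
```

The point is not that the toy classifier is accurate, but that nothing about the target individual was required: the inference rests entirely on statistical regularities learned from others, which is exactly why such predictions can violate privacy even when the individual's data is absent from training.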

Supporting Evidence (1)

1. Example: "Language utterances (e.g. tweets) are already being analysed to predict private information such as political orientation (Makazhanov et al., 2014; Preoţiuc-Pietro et al., 2017), age (Morgan-Lopez et al., 2017; Nguyen et al., 2013), and health data such as addiction relapses (Golbeck, 2018)." (p. 20)

Part of Information Hazards
