In the context of generative AI, privacy violations arise when systems collect and divulge sensitive information that individuals or corporations do not consent to sharing with others. Privacy violations can occur both accidentally and intentionally.
Examples of accidental causes include AI models that memorize and inadvertently reproduce or leak sensitive personal information present in their training data, such as names, addresses, and medical records. Even when personal data is not included in the training dataset or directly offered by the user, models can infer sensitive or protected traits of individuals from predictive correlations in their interaction history, and such data may be used to build user profiles or to train AI systems. As a result, models may retain and reproduce sensitive information derived from prior interactions, such as confidential intellectual property. A notable example is the case in which Samsung employees accidentally leaked confidential intellectual property to OpenAI after using ChatGPT to help with coding tasks.
Intentional causes include the malicious design and use of AI to exploit users' trust by influencing them to share personal or private information about themselves or others. Privacy attacks, such as membership inference, could allow adversaries to gain knowledge of the private records used to train an AI model. Malicious actors could also deliberately extract private information from a model by crafting prompts designed to exploit the model's knowledge of sensitive data.
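The membership-inference idea above can be made concrete with a toy sketch. A minimal illustration, not a real attack on a deployed model: a smoothed unigram "language model" is fit on two training sentences, and an attacker guesses that any sentence receiving unusually low average loss was part of the training set. The model, sentences, and loss threshold are all invented for this example.

```python
# Toy loss-threshold membership inference attack (illustrative only).
# Records the model assigns unusually low loss (i.e., high likelihood)
# are guessed to have been in the training set.

import math
from collections import Counter

def fit_unigram(sentences):
    """Fit a Laplace-smoothed unigram model; returns P(word)."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 slot for unseen words
    return lambda w: (counts[w] + 1) / (total + vocab)

def avg_nll(model, sentence):
    """Average negative log-likelihood of a sentence under the model."""
    words = sentence.split()
    return -sum(math.log(model(w)) for w in words) / len(words)

# Hypothetical private training data and a held-out non-member sentence.
train = ["alice lives at 12 oak street", "bob has diabetes"]
held_out = ["carol enjoys hiking in patagonia"]

model = fit_unigram(train)
threshold = 2.5  # attacker-chosen loss cutoff, tuned for this toy

for s in train + held_out:
    guess = "member" if avg_nll(model, s) < threshold else "non-member"
    print(f"{guess}: {s!r}")
```

Because the model memorized its tiny training set, member sentences score below the threshold and the held-out sentence scores above it; real attacks on large models follow the same logic with calibrated thresholds or shadow models.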
Excerpt from the MIT AI Risk Repository full report
AI systems that memorize and leak sensitive personal data or infer private information about individuals without their consent. Unexpected or unauthorized sharing of data and information can compromise users' expectations of privacy, facilitate identity theft, or cause the loss of confidential intellectual property.
Incident volume relative to governance coverage — each dot is one of 24 subdomains
Entity: who or what caused the harm
Intent: whether the harm was intentional or accidental
Timing: whether the risk arises pre- or post-deployment
Federal agents used facial recognition technology to identify and subsequently retaliate against Nicole Cleland, a legal observer who was monitoring immigration enforcement activities, resulting in the revocation of her Global Entry and TSA PreCheck privileges three days after the encounter.
Developers: NEC
Deployers: United States Customs and Border Protection, United States Border Patrol, an unnamed United States Border Patrol agent, United States Department of Homeland Security, United States Immigration and Customs Enforcement
The Chicago Tribune filed a copyright infringement lawsuit against Perplexity AI, alleging the company unlawfully copied millions of Tribune articles to train its AI search engine and chatbot, which reproduce content without linking to the original source.
Developers: Perplexity AI
Deployers: Perplexity AI
Secret Desires, an erotic AI chatbot and image generator platform, exposed nearly two million images and videos, including nonconsensual explicit deepfakes of real women, through vulnerable cloud storage containers.
Developers: Secret Desires
Deployers: Secret Desires
AI systems that fail to perform reliably or effectively under varying conditions, leaving them prone to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.
260 shared governance docs
Challenges in understanding or explaining the decision-making processes of AI systems, which can lead to mistrust, difficulty in enforcing compliance standards or holding relevant actors accountable for harms, and the inability to identify and correct errors.
259 shared governance docs
Inadequate regulatory frameworks and oversight mechanisms that fail to keep pace with AI development, leading to ineffective governance and the inability to manage AI risks appropriately.
236 shared governance docs
Unequal treatment of individuals or groups by AI, often based on race, gender, or other sensitive characteristics, resulting in unfair outcomes and the misrepresentation of those groups.
197 shared governance docs
Authorizes the Secretary of Defense to establish AI Institutes focused on national security. Directs support for interdisciplinary AI research, partnerships, innovation ecosystems, and workforce development.
Requires the Secretary of Defense to develop requirements ensuring DoD-funded biological data resources facilitate AI use. Defines "qualified biological data," includes metrics for data quality, cybersecurity safeguards, privacy protections, and allows national security exceptions. Requires the Secretary to consult relevant sectors about the feasibility of new requirements and review existing frameworks.
Requires the Secretary of Defense to develop a cybersecurity policy for AI/ML systems no later than 180 days after the act is passed and to conduct a comprehensive review of the effectiveness of AI/ML policies. The policy addresses potential security risks, implements methods to mitigate those risks, and establishes standard practice. Also requires a comprehensive report on the threats and cybersecurity measures by August 31, 2026.