In the context of generative AI, privacy violations arise when systems collect and divulge sensitive information that individuals or corporations do not consent to sharing with others. Privacy violations can occur both accidentally and intentionally.
Examples of accidental causes include AI models that memorize and inadvertently reproduce or leak sensitive personal information present in their training data, such as names, addresses, and medical records. Even when personal data is not included in the training dataset or directly offered by the user, models can infer sensitive or protected traits of individuals from predictive correlations within their history of interactions, and that history can also be used to build profiles of users or to train AI systems. As a result, models may save and reproduce sensitive information derived from prior interactions, such as classified intellectual property. A notable example is the case where Samsung employees accidentally leaked confidential intellectual property to OpenAI after using ChatGPT to help with coding tasks.
Intentional causes include the malicious design and use of AI to exploit users' trust by influencing them to share personal or private information about themselves or others. Privacy attacks, such as membership inference, could allow adversaries to gain knowledge of the private records used to train an AI model. Malicious actors could also deliberately extract private information from a model by crafting prompts designed to exploit the model's knowledge of sensitive data.
Excerpt from the MIT AI Risk Repository full report
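To make the membership inference attack mentioned in the excerpt more concrete, here is a minimal sketch of one common variant, a loss-threshold attack. It assumes the adversary can obtain the model's loss on each candidate record and guesses that low-loss records were part of the training set. The function names, the loss values, and the threshold are all hypothetical and chosen only for illustration; they do not come from the report or from any real model.

```python
from typing import Sequence


def infer_membership(losses: Sequence[float], threshold: float) -> list[bool]:
    """Guess that a record was in the training set when its loss is below the threshold."""
    return [loss < threshold for loss in losses]


def attack_accuracy(guesses: Sequence[bool], truth: Sequence[bool]) -> float:
    """Fraction of records whose membership was guessed correctly."""
    correct = sum(g == t for g, t in zip(guesses, truth))
    return correct / len(truth)


# Hypothetical per-record losses: models often fit training records more
# closely, so members tend to show lower loss than unseen records.
member_losses = [0.05, 0.10, 0.02, 0.40]
non_member_losses = [0.90, 0.35, 1.20, 0.75]

losses = member_losses + non_member_losses
truth = [True] * len(member_losses) + [False] * len(non_member_losses)

# The threshold is an assumption; a real adversary would have to calibrate it.
guesses = infer_membership(losses, threshold=0.3)
accuracy = attack_accuracy(guesses, truth)
print(f"attack accuracy: {accuracy:.3f}")
```

In this toy data the attack misclassifies only the one training record with an unusually high loss. The wider the gap between a model's loss on training records and on unseen records, the more such an attack can reveal, which is why memorization and membership inference are closely linked.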
AI systems that memorize and leak sensitive personal data or infer private information about individuals without their consent. Unexpected or unauthorized sharing of data and information can compromise user expectation of privacy, assist identity theft, or cause loss of confidential intellectual property.
Incident volume relative to governance coverage; each dot is one of 24 subdomains
Entity: Who or what caused the harm
Intent: Whether the harm was intentional or accidental
Timing: Whether the risk is pre- or post-deployment
Meta's AI-powered smart glasses captured intimate footage including people using bathrooms, having sex, and handling sensitive documents, which was then reviewed by human contractors in Kenya without the subjects' knowledge or consent.
Developers: Meta, Meta AI
Deployers: Meta, Meta AI, Sama
Elon Musk's AI chatbot Grok doxxed adult performer Siri Dahl by revealing her legal name and birth date to users, compromising personal information she had spent thousands of dollars trying to keep private.
Developers: xAI
Deployers: xAI
NPR host David Greene sued Google alleging that its NotebookLM AI tool's male podcast voice was trained on his voice without permission, with multiple colleagues and listeners identifying the resemblance.
Developers: Google
Deployers: Google
Challenges in understanding or explaining the decision-making processes of AI systems, which can lead to mistrust, difficulty in enforcing compliance standards or holding relevant actors accountable for harms, and the inability to identify and correct errors.
273 shared governance docs
AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.
273 shared governance docs
Inadequate regulatory frameworks and oversight mechanisms that fail to keep pace with AI development, leading to ineffective governance and the inability to manage AI risks appropriately.
249 shared governance docs
Unequal treatment of individuals or groups by AI, often based on race, gender, or other sensitive characteristics, resulting in unfair outcomes and unfair representation of those groups.
205 shared governance docs
Authorizes the Secretary of Defense to establish AI Institutes focused on national security. Directs support for interdisciplinary AI research, partnerships, innovation ecosystems, and workforce development.
Requires the Secretary of Defense to develop requirements ensuring DoD-funded biological data resources facilitate AI use. Defines "qualified biological data," includes metrics for data quality, cybersecurity safeguards, privacy protections, and allows national security exceptions. Requires the Secretary to consult relevant sectors about the feasibility of new requirements and review existing frameworks.
Requires the Secretary of Defense to develop a cybersecurity policy for AI/ML systems no later than 180 days after the act is passed, and to conduct a comprehensive review of the effectiveness of the AI/ML policies. The policy addresses potential security risks, implements methods to mitigate those risks, and establishes a standard policy. Requires a comprehensive report on the threats and cybersecurity measures by August 31, 2026.