Microsoft's AI research team accidentally exposed 38 terabytes of private data including employee workstation backups, secrets, and internal communications when sharing open-source AI training data through a misconfigured Azure SAS token that granted access to their entire storage account.
Microsoft's AI research division was sharing open-source AI models for image recognition through a GitHub repository called 'robust-models-transfer'. The researchers used Azure Shared Access Signature (SAS) tokens to provide download links to their training data. However, the SAS token was misconfigured to grant access to the entire Azure storage account rather than just the intended files. This exposed an additional 38 terabytes of private data including disk backups of two employees' workstations containing passwords, secret keys, private keys, and over 30,000 internal Microsoft Teams messages from 359 Microsoft employees. The token also had 'full control' permissions allowing attackers to potentially delete or modify files, and was set to expire in 2051, effectively making it permanent. The security research team at Wiz discovered this exposure on June 22, 2023, while scanning for misconfigured cloud storage containers. Microsoft invalidated the token on June 24, 2023, and completed their internal investigation by August 16, 2023. The incident highlights risks in AI development workflows where researchers handle massive datasets and the challenges of properly securing cloud storage sharing mechanisms.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that memorize and leak sensitive personal data or infer private information about individuals without their consent. Unexpected or unauthorized sharing of data and information can compromise user expectation of privacy, assist identity theft, or cause loss of confidential intellectual property.
Human
Due to a decision or action made by humans
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed