Microsoft Copilot was found to be exposing sensitive data from over 20,000 private GitHub repositories through cached content that remained accessible even after repositories were made private or deleted.
Security researchers at Lasso discovered that Microsoft Copilot could access and return content from private GitHub repositories via Bing's caching mechanism. The issue surfaced when Lasso found data from its own private repository appearing in Copilot responses, even though the repository was no longer accessible on GitHub. Further investigation revealed that any GitHub repository that had been public even briefly could be indexed by Bing and remain retrievable through Copilot long after being made private or deleted.

Lasso extracted data from more than 20,000 since-privatized GitHub repositories affecting over 16,000 organizations, including major companies such as Google, IBM, PayPal, and Microsoft itself. The exposed data included intellectual property, sensitive corporate information, access keys, tokens, and over 300 private credentials.

Lasso reported the findings to Microsoft in November 2024, but Microsoft classified the issue as 'low severity' and stated that the caching behavior was 'acceptable.' Microsoft disabled Bing's cached-link feature in December 2024, yet Copilot continued to have access to the cached data even after the fix, indicating only a partial resolution.
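Because content cached while a repository was public can persist after the repository is made private, one basic defensive step is to confirm that a repository really is hidden from unauthenticated clients. The sketch below is not Lasso's tooling or any Microsoft API; it is an illustrative check, assuming only GitHub's public REST endpoint (`https://api.github.com/repos/{owner}/{repo}`), which returns 404 for private or deleted repositories when queried without credentials.

```python
# Illustrative visibility audit (an assumption, not Lasso's method):
# after privatizing a repository, verify that anonymous access now fails.
# Content indexed while it was public may still live in external caches.
import urllib.error
import urllib.request

GITHUB_API = "https://api.github.com/repos"

def classify_visibility(status_code: int) -> str:
    """Map an unauthenticated GitHub API status code to a verdict."""
    if status_code == 200:
        return "public"       # still readable without credentials
    if status_code == 404:
        return "not public"   # GitHub hides private/deleted repos as 404
    return "unknown"          # e.g. 403 rate limiting, redirects

def check_repo(owner: str, repo: str) -> str:
    """Request repository metadata anonymously and classify the result."""
    url = f"{GITHUB_API}/{owner}/{repo}"
    try:
        with urllib.request.urlopen(url) as resp:
            return classify_visibility(resp.status)
    except urllib.error.HTTPError as err:
        return classify_visibility(err.code)
```

Note that even a 'not public' verdict only confirms GitHub's own access control; it says nothing about copies already held by search-engine caches or AI assistants, which is precisely the gap this incident exposed.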
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that memorize and leak sensitive personal data or infer private information about individuals without their consent. Unexpected or unauthorized sharing of data and information can compromise users' expectations of privacy, assist identity theft, or cause loss of confidential intellectual property.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed