A California AI artist discovered that her private medical photos from 2013 had been scraped from the web without her consent and included in the LAION-5B dataset, which is used to train AI image generation models such as Stable Diffusion.
In September 2022, a California-based AI artist named Lapine discovered that private medical photos of her face, taken by her doctor in 2013, were included in the LAION-5B dataset, which is used to train AI image synthesis models including Stable Diffusion and Google Imagen. Lapine found the images using the reverse image search feature of the 'Have I Been Trained' website while checking whether her artwork was in the dataset. The photos were taken as clinical documentation for procedures related to her genetic condition, dyskeratosis congenita, and she had consented only to private medical use. The surgeon who took the photos died in 2018, and Lapine suspects the images were improperly released from his practice after his death and ended up online, where they were scraped into the dataset. LAION describes itself as a nonprofit that makes machine learning datasets available to the public by providing URLs to images on the web rather than hosting them directly. When contacted about removal, LAION representatives said they do not host the images and suggested contacting the hosting websites instead. The incident suggests that thousands of similar medical record photos may be in the dataset and may have been incorporated into commercial AI services offered by companies such as Midjourney and Stability AI.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that memorize and leak sensitive personal data or infer private information about individuals without their consent. Unexpected or unauthorized sharing of data and information can compromise users' expectations of privacy, assist identity theft, or cause loss of confidential intellectual property.
Human
Due to a decision or action made by humans
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed