Researchers identified 212 malicious large language model services ('Mallas') operating on underground marketplaces. These services use AI systems such as OpenAI's GPT models to generate malware, phishing emails, and scam websites for cybercriminals.
Researchers from Indiana University Bloomington conducted a systematic study of 212 'Mallas' (malicious LLM services) operating on underground marketplaces between November 2022 and October 2023. They collected 13,353 listings from nine underground marketplaces, including Abacus Market and Kerberos Market. The study found that 93.4% of these malicious services offered malware generation capabilities, 41.5% created phishing emails, and 17.45% generated scam websites. The researchers identified five backend LLMs being exploited: OpenAI GPT-3.5, OpenAI GPT-4, Pygmalion-13B, Claude-instant, and Claude-2-100k, with OpenAI models targeted most frequently. Services such as FraudGPT, WormGPT, EscapeGPT, and DarkGPT were found to produce sophisticated malware capable of evading antivirus detection, as well as convincing phishing content.

The malicious actors relied on two primary techniques: exploiting 'uncensored' open-source models that ship with minimal safety checks, and 'jailbreaking' commercial models to bypass their safety measures, for which the researchers cataloged 182 distinct jailbreak prompts. They found that OpenAI's GPT-3.5 Turbo was particularly susceptible to jailbreak prompts.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
Using AI systems to develop cyber weapons (e.g., by coding cheaper, more effective malware), to develop new weapons or enhance existing ones (e.g., Lethal Autonomous Weapons or chemical, biological, radiological, nuclear, and high-yield explosive weapons), or to use weapons to cause mass harm.
Human: Due to a decision or action made by humans
Intentional: Due to an expected outcome from pursuing a goal
Post-deployment: Occurring after the AI model has been trained and deployed
No population impact data reported.