Risk area 4: Malicious Uses
"These risks arise from humans intentionally using the LM to cause harm, for example via targeted disinformation campaigns, fraud, or malware. Malicious use risks are expected to proliferate as LMs become more widely accessible" (p. 219)
Sub-categories (4)
Making disinformation cheaper and more effective
"While some predict that it will remain cheaper to hire humans to generate disinformation [180], it is equally possible that LM-assisted content generation may offer a lower-cost way of creating disinformation at scale."
4.1 Disinformation, surveillance, and influence at scale
Assisting code generation for cyber security threats
Anticipated risk: "Creators of the assistive coding tool Co-Pilot based on GPT-3 suggest that such tools may lower the cost of developing polymorphic malware which is able to change its features in order to evade detection [37]."
4.2 Cyberattacks, weapon development or use, and mass harm
Facilitating fraud, scam and targeted manipulation
Anticipated risk: "LMs can potentially be used to increase the effectiveness of crimes."
4.3 Fraud, scams, and targeted manipulation
Illegitimate surveillance and censorship
Anticipated risk: "Mass surveillance previously required millions of human analysts [83], but is increasingly being automated using machine learning tools [7, 168]. The collection and analysis of large amounts of information about people creates concerns about privacy rights and democratic values [41, 173, 187]. Conceivably, LMs could be applied to reduce the cost and increase the efficacy of mass surveillance, thereby amplifying the capabilities of actors who conduct mass surveillance, including for illegitimate censorship or to cause other harm."
4.1 Disinformation, surveillance, and influence at scale
Other risks from Weidinger et al. (2022) (25)
Risk area 1: Discrimination, Hate speech and Exclusion
1.2 Exposure to toxic content
Risk area 1: Discrimination, Hate speech and Exclusion > Social stereotypes and unfair discrimination
1.1 Unfair discrimination and misrepresentation
Risk area 1: Discrimination, Hate speech and Exclusion > Hate speech and offensive language
1.2 Exposure to toxic content
Risk area 1: Discrimination, Hate speech and Exclusion > Exclusionary norms
1.1 Unfair discrimination and misrepresentation
Risk area 1: Discrimination, Hate speech and Exclusion > Lower performance for some languages and social groups
1.3 Unequal performance across groups
Risk area 2: Information Hazards
2.1 Compromise of privacy by leaking or correctly inferring sensitive information