
Capabilities that could be used to reduce human control - Manipulation

Capabilities and Risks from Frontier AI

DSIT (2023)

Sub-category
Risk Domain

AI systems that develop, access, or are provided with capabilities that increase their potential to cause mass harm through deception, weapons development and acquisition, persuasion and manipulation, political strategy, cyber-offense, AI development, situational awareness, and self-proliferation. These capabilities may cause mass harm due to malicious human actors, misaligned AI systems, or failure in the AI system.

"There is evidence that language models tend to respond as though they share the user’s stated views, and larger models do this more than smaller ones.[276] The ability to predict people’s views and generate text that they will endorse could be useful for manipulation." (p. 27)

Supporting Evidence (1)

1. "In an online study, 1500 participants used an opinionated LLM to help them write about a topic. They reported agreeing with the LLM’s opinion on the topic considerably more often in a subsequent survey, having changed their opinion to align with it.[278]" (p. 28)

Part of Loss of control
