Emergent Agency
Risks arising from multi-agent interactions, whether from agents' incentives (which can lead to conflict or collusion) or from the structure of multi-agent systems itself, which can create cascading failures, selection pressures, new security vulnerabilities, and a lack of shared information and trust.
"Emergent agency (Section 3.6): qualitatively different goals or capabilities can emerge from the composition of innocuous independent systems or behaviours;"(p. 7)
Supporting Evidence (2)
"Emergent behaviour is ubiquitous in the natural, biomedical, and social sciences. Examples include the superconductivity of materials in condensed matter physics (Anderson, 1972); complex tasks like bridge-building by ant colonies and facing larger predators (Bonabeau et al., 1997; Gordon, 1996); and, in the social sphere, collective behaviours such as group-think or the development of new norms (Couzin, 2007). In this section we focus on the risks presented by the emergence of higher-level forms of agency from a collective of agents."(p. 36)
"Emergent behaviours are those exhibited by a complex entity composed of multiple, interacting parts (such as AI agents) that are not exhibited by any of those parts when viewed individually. Emergent behaviours are distinct from mere accumulations (as in Case Study 12, for example); in other words, the whole may be different to the sum of its parts (Anderson, 1972). While there is a sense in which everything we study in this report can be viewed as “emerging” from multi-agent systems (Altmann et al., 2024; Mogul, 2006), our focus on this section is specifically on the risks associated with emergent agency at the level of the collective. This is distinct from other works that discuss the emergent behaviour of individual agents – such as tool use (Baker et al., 2019), locomotion (Bansal et al., 2018), or communication (Lazaridou & Baroni, 2020) – in multi-agent settings.47 These individual behaviours are fundamentally driven by the selection pressure induced by the presence of other agents, which we discuss in Section 3.3."(p. 36)
Sub-categories (2)
Emergent Capabilities
"Emergent Capabilities. Dangerous emergent capabilities could arise when a multi-agent system over- comes the safety-enhancing limitations of the individual systems, such as individual models’ narrow domains of application or myopia caused by a lack of long-term planning and long-term memory. For example, narrow systems for research planning, predicting the properties of molecules, and synthesising new chemicals could, when combined, lead to a complex ‘test and iterate’ automated workflow capable of designing dangerous new chemical compounds far beyond the scope of the initial systems’ capabilities (Boiko et al., 2023; Luo et al., 2024; Urbina et al., 2022)."
Emergent Goals
"Emergent Goals. Ascribing goals to a system is not always straightforward. For our present purposes, it will suffice to adopt a Dennetian perspective (Dennett, 1971), ascribing goals and intentions only when it is useful (i.e., predictive) to do so.51 While it might not be helpful to describe individual narrow AI tools as having goals, their combination may act as a (seemingly) goal-directed collective. For example, a group of moderation bots on a major social networking site could subtly but systematically manipulate the overall political perspectives of the user population, even though, individually, each agent is programmed to simply increase user engagement or filter out dis-preferred content."
Other risks from Hammond2025 (42)
Miscoordination
Miscoordination > Incompatible strategies
Miscoordination > Credit Assignment
Miscoordination > Limited Interactions
Conflict
Conflict > Social Dilemmas