Emergent Agency
Risks arising from multi-agent interactions, whether from agents' incentives (which can lead to conflict or collusion) or from the structure of multi-agent systems itself, which can create cascading failures, selection pressures, new security vulnerabilities, and a lack of shared information and trust.
"Emergent agency (Section 3.6): qualitatively different goals or capabilities can emerge from the composition of innocuous independent systems or behaviours;"(p. 7)
Supporting Evidence (2)
"Emergent behaviour is ubiquitous in the natural, biomedical, and social sciences. Examples include the superconductivity of materials in condensed matter physics (Anderson, 1972); complex tasks like bridge-building by ant colonies and facing larger predators (Bonabeau et al., 1997; Gordon, 1996); and, in the social sphere, collective behaviours such as group-think or the development of new norms (Couzin, 2007). In this section we focus on the risks presented by the emergence of higher-level forms of agency from a collective of agents."(p. 36)
"Emergent behaviours are those exhibited by a complex entity composed of multiple, interacting parts (such as AI agents) that are not exhibited by any of those parts when viewed individually. Emergent behaviours are distinct from mere accumulations (as in Case Study 12, for example); in other words, the whole may be different to the sum of its parts (Anderson, 1972). While there is a sense in which everything we study in this report can be viewed as “emerging” from multi-agent systems (Altmann et al., 2024; Mogul, 2006), our focus on this section is specifically on the risks associated with emergent agency at the level of the collective. This is distinct from other works that discuss the emergent behaviour of individual agents – such as tool use (Baker et al., 2019), locomotion (Bansal et al., 2018), or communication (Lazaridou & Baroni, 2020) – in multi-agent settings.47 These individual behaviours are fundamentally driven by the selection pressure induced by the presence of other agents, which we discuss in Section 3.3."(p. 36)
Sub-categories (2)
Emergent Capabilities
"Emergent Capabilities. Dangerous emergent capabilities could arise when a multi-agent system over- comes the safety-enhancing limitations of the individual systems, such as individual models’ narrow domains of application or myopia caused by a lack of long-term planning and long-term memory. For example, narrow systems for research planning, predicting the properties of molecules, and synthesising new chemicals could, when combined, lead to a complex ‘test and iterate’ automated workflow capable of designing dangerous new chemical compounds far beyond the scope of the initial systems’ capabilities (Boiko et al., 2023; Luo et al., 2024; Urbina et al., 2022)."
Emergent Goals
"Emergent Goals. Ascribing goals to a system is not always straightforward. For our present purposes, it will suffice to adopt a Dennetian perspective (Dennett, 1971), ascribing goals and intentions only when it is useful (i.e., predictive) to do so.51 While it might not be helpful to describe individual narrow AI tools as having goals, their combination may act as a (seemingly) goal-directed collective. For example, a group of moderation bots on a major social networking site could subtly but systematically manipulate the overall political perspectives of the user population, even though, individually, each agent is programmed to simply increase user engagement or filter out dis-preferred content."
Other risks from Hammond2025 (42)
Miscoordination
Miscoordination > Incompatible strategies
Miscoordination > Credit Assignment
Miscoordination > Limited Interactions
Conflict
Conflict > Social Dilemmas