Destabilising Dynamics
Risks from multi-agent interactions, arising from incentives (which can lead to conflict or collusion) and/or from the structure of multi-agent systems itself, which can create cascading failures, selection pressures, new security vulnerabilities, and a lack of shared information and trust.
"Destabilising dynamics (Section 3.4): systems that adapt in response to one another can produce dangerous feedback loops and unpredictability;"(p. 7)
Supporting Evidence (1)
"Modern AI agents can adapt their strategies in response to events in their environment. The interaction of such agents can result in complex dynamics that are difficult to predict or control, sometimes resulting in damaging run-away effects."(p. 30)
Sub-categories (5)
Feedback Loops
"Feedback Loops. One of the best-known historical examples to illustrate destabilising dynamics in the context of autonomous agents is the 2010 flash crash, in which algorithmic trading agents entered into an unexpected feedback loop (CFTC & SEC, 2010, see also Case Study 10). More generally, a feedback loop occurs when the output of a system is used as part of its input, creating a cycle that can either amplify or dampen the system’s behaviour. In multi-agent settings, feedback loops often arise from the interactions between agents, as each agent’s actions affect the environment and the behaviour of other agents, which in turn affect their own subsequent actions. Feedback loops can lead not only to financial crashes but to military conflicts (Richardson, 1960) and ecological disasters (Holling, 1973)."
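A minimal sketch of the amplify-or-dampen distinction (a hypothetical toy model, not taken from the report): two momentum traders each buy or sell in proportion to the last price move, so each agent's output (its trade) feeds back into the other's input. Whether the loop runs away or dies out depends only on the combined gain.

```python
# Hypothetical feedback-loop toy: two momentum-trading agents whose trades
# are proportional to the most recent price move.

def simulate(momentum_gain: float, steps: int = 20) -> list:
    """Return the price trajectory under two momentum-trading agents."""
    prices = [100.0, 100.1]  # small initial upward perturbation
    for _ in range(steps):
        last_move = prices[-1] - prices[-2]
        # each agent trades proportionally to the last move; the net demand
        # of both agents moves the price by 2 * gain * last_move
        net_demand = 2 * momentum_gain * last_move
        prices.append(prices[-1] + net_demand)
    return prices

# Combined gain > 1 (per-agent gain > 0.5): the perturbation is amplified.
runaway = simulate(momentum_gain=0.8)
# Combined gain < 1: the same structure dampens the perturbation instead.
damped = simulate(momentum_gain=0.2)
```

The same interaction structure produces a crash-like run-away or a self-correcting market depending on a single parameter, which is why such loops are hard to anticipate in deployed systems.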
Cyclic Behaviour
"Cyclic Behaviour. The dynamics described above are highly non-linear (small changes to the system’s state can result in large changes to its trajectory). Similar non-linear dynamics can emerge in multi-agent learning and lead to a variety of phenomena that do not occur in single-agent learning (Barfuss et al., 2019; Barfuss & Mann, 2022; Galla & Farmer, 2013; Leonardos et al., 2020; Nagarajan et al., 2020). One of the simplest examples of this phenomenon is Q-learning (Watkins & Dayan, 1992): in the case of a single agent, convergence to an optimal policy is guaranteed under modest conditions, but in the (mixed-motive) case of multiple agents, this same learning rule can lead to cycles and thus non-convergence (Zinkevich et al., 2005). While cycles in themselves need not carry any risk, their presence can subvert the expected or desirable properties of a given system."
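The Q-learning non-convergence can be reproduced in a few lines. The sketch below (an illustrative setup, not code from the cited works) runs two independent epsilon-greedy Q-learners in matching pennies, a zero-sum game with no pure equilibrium: player 1 wants to match actions, player 2 to mismatch, so each greedy policy keeps invalidating the other's and the policies cycle indefinitely.

```python
import random

# Independent stateless Q-learning in matching pennies (illustrative sketch).

def greedy(q: list) -> int:
    return 0 if q[0] >= q[1] else 1

def run(steps: int = 20000, alpha: float = 0.1, eps: float = 0.1,
        seed: int = 0) -> int:
    """Return how often player 1's greedy action flips (a cycling signature)."""
    rng = random.Random(seed)
    q1, q2 = [0.0, 0.0], [0.0, 0.0]
    switches, prev = 0, greedy(q1)
    for _ in range(steps):
        a1 = rng.randrange(2) if rng.random() < eps else greedy(q1)
        a2 = rng.randrange(2) if rng.random() < eps else greedy(q2)
        r1 = 1.0 if a1 == a2 else -1.0       # player 1 wins on a match
        q1[a1] += alpha * (r1 - q1[a1])      # stateless Q-update
        q2[a2] += alpha * (-r1 - q2[a2])     # zero-sum: player 2 gets -r1
        if greedy(q1) != prev:
            switches, prev = switches + 1, greedy(q1)
    return switches

flips = run()  # the greedy policy flips back and forth: no convergence
```

A single Q-learner facing a fixed opponent would converge here; it is the mutual adaptation that produces the cycle.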
Chaos
"Chaos. Unlike the systems that tend towards fixed points or cycles described above, chaotic systems are inherently unpredictable and highly sensitive to initial conditions. While it might seem easy to dismiss such notions as mathematical exoticisms, recent work has shown that, in fact, chaotic dynamics are not only possible in a wide range of multi-agent learning setups (Andrade et al., 2021; Galla & Farmer, 2013; Palaiopanos et al., 2017; Sato et al., 2002; Vlatakis-Gkaragkounis et al., 2023), but can become the norm as the number of agents increases (Bielawski et al., 2021; Cheung & Piliouras, 2020; Sanders et al., 2018). To the best of our knowledge, such dynamics have not been seen in today’s frontier AI systems, but the proliferation of such systems increases the importance of reliably predicting their behaviour."
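Sensitivity to initial conditions is easy to demonstrate numerically. The sketch below uses the logistic map, the textbook single-variable example of chaos (not itself a multi-agent learner, and not drawn from the cited works): two orbits starting a billionth apart become completely uncorrelated within a few dozen steps.

```python
# Logistic map x_{t+1} = r * x_t * (1 - x_t); r = 4 is the chaotic regime.

def logistic_orbit(x0: float, r: float = 4.0, steps: int = 50) -> list:
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_orbit(0.2)
b = logistic_orbit(0.2 + 1e-9)  # nearly identical initial condition

early_gap = abs(a[1] - b[1])                                # still tiny
late_gap = max(abs(x - y) for x, y in zip(a[40:], b[40:]))  # order one
```

The perturbation grows roughly exponentially (the positive Lyapunov exponent), so beyond a short horizon the two trajectories share no useful predictive relationship, which is precisely what makes chaotic multi-agent dynamics hard to forecast.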
Phase Transitions
"Phase Transitions. Finally, small external changes to the system – such as the introduction of new agents or a distributional shift – can cause phase transitions, where the system undergoes an abrupt qualitative shift in overall behaviour (Barfuss et al., 2024). Formally, this corresponds to bifurcations in the system’s parameter space, which lead to the creation or destruction of dynamical attractors, resulting in complex and unpredictable dynamics (Crawford, 1991; Zeeman, 1976). For example, Leonardos & Piliouras (2022) show that changes to the exploration hyperparameter of RL agents can lead to phase transitions that drastically change the number and stability of the equilibria in a game, which in turn can have potentially unbounded negative effects on agents’ performance. Relatedly, there have been many observations of phase transitions in ML (Carroll, 2021; Olsson et al., 2022; Ziyin & Ueda, 2022), such as ‘grokking’, in which the test set error decreases rapidly long after the training error has plateaued (Power et al., 2022). These phenomena are still poorly understood, even in the case of a single system."
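The bifurcation idea can be illustrated with the normal-form pitchfork dx/dt = r·x − x³, a standard example from bifurcation theory (cf. Crawford, 1991); this is a generic sketch, not the RL-exploration setting of Leonardos & Piliouras. As the parameter r crosses zero, the single attractor at the origin is destroyed and two new attractors appear: an abrupt qualitative shift from a small parameter change.

```python
# Pitchfork bifurcation: dx/dt = r*x - x**3, integrated with forward Euler.

def settle(r: float, x0: float, steps: int = 2000, dt: float = 0.01) -> float:
    """Integrate the ODE from x0 and return the state it settles into."""
    x = x0
    for _ in range(steps):
        x += dt * (r * x - x ** 3)
    return x

below = settle(r=-1.0, x0=0.5)      # r < 0: single attractor at x = 0
above_pos = settle(r=1.0, x0=0.5)   # r > 0: attractor created at x = +1
above_neg = settle(r=1.0, x0=-0.5)  # ...and a symmetric one at x = -1
```

In the multi-agent setting the role of r is played by quantities such as an exploration hyperparameter or the number of agents, so an apparently innocuous tuning change can move the whole system across such a threshold.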
Distributional Shift
"Distributional Shift. Individual ML systems can perform poorly in contexts different from those in which they were trained. A key source of these distributional shifts is the actions and adaptations of other agents (Narang et al., 2023; Papoudakis et al., 2019; Piliouras & Yu, 2022), which in single-agent approaches are often simply ignored or at best modelled exogenously. Indeed, the sheer number and variance of behaviours that can be exhibited by other agents means that multi-agent systems pose an especially challenging generalisation problem for individual learners (Agapiou et al., 2022; Leibo et al., 2021; Stone et al., 2010). While distributional shifts can cause issues in common-interest settings (see Section 2.1), they are more worrisome in mixed-motive settings since the ability of agents to cooperate depends not only on the ability to coordinate on one of many arbitrary conventions (which might be easily resolved by a common language), but on their beliefs about what solutions other agents will find acceptable."
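A hypothetical two-convention coordination game makes the failure mode concrete (an illustrative toy, not from the cited works): both agents score 1 when they pick the same side (0 = left, 1 = right) and 0 otherwise. A best response fitted to one partner population becomes close to a worst response after the population shifts.

```python
# Toy convention game: payoff 1 on matching sides, 0 otherwise.

def best_response(partner_left_prob: float) -> int:
    """Pick the side that maximises expected payoff against this population."""
    # expected payoff of "left" is partner_left_prob, of "right" 1 - it
    return 0 if partner_left_prob >= 0.5 else 1

def expected_payoff(action: int, partner_left_prob: float) -> float:
    return partner_left_prob if action == 0 else 1.0 - partner_left_prob

policy = best_response(0.9)                  # fitted to left-favouring partners
train_score = expected_payoff(policy, 0.9)   # high in distribution
shift_score = expected_payoff(policy, 0.1)   # low once the partners adapt
```

Nothing about the agent's own objective changed; the shift came entirely from the other agents, which is what makes this form of distributional shift endogenous to multi-agent systems.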
Other risks from Hammond2025 (42)
Miscoordination
Miscoordination > Incompatible strategies
Miscoordination > Credit Assignment
Miscoordination > Limited Interactions
Conflict
Conflict > Social Dilemmas