
Miscoordination

Category
Risk Domain

Risks from multi-agent interactions, due to incentives (which can lead to conflict or collusion) and/or the structure of multi-agent systems, which can create cascading failures, selection pressures, new security vulnerabilities, and a lack of shared information and trust.

"Miscoordination arises when agents, despite a mutual and clear objective, cannot align their behaviours to achieve this objective. Unlike the case of differing objectives, in common-interest settings there is a more easily well-defined notion of ‘optimal’ behaviour and we describe agents as miscoordinating to the extent that they fall short of this optimum. Note that for common-interest settings it is not sufficient for agents’ objectives to be the same in the sense of being symmetric (e.g., when two agents both want the same prize, but only one can win). Rather, agents must have identical preferences over outcomes (e.g., when two agents are on the same team and win a prize as a team or not at all)." (p. 10)
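The symmetric-versus-identical distinction drawn in the quote can be made concrete with a minimal sketch. The outcome labels and payoffs below are illustrative, not taken from Hammond et al.:

```python
# Illustration of the distinction above (hypothetical payoffs):
# "symmetric" objectives are not enough for common interest; agents
# must have identical preferences over outcomes.

# Symmetric but competitive: both want the prize, only one can win.
# Outcome -> (payoff to agent 1, payoff to agent 2)
prize_race = {"agent1_wins": (1, 0), "agent2_wins": (0, 1)}

# Common interest: the team wins the prize together or not at all.
team_prize = {"team_wins": (1, 1), "team_loses": (0, 0)}

# In the symmetric game, the agents rank outcomes differently...
assert prize_race["agent1_wins"][0] != prize_race["agent1_wins"][1]
# ...while in the common-interest game their preferences coincide
# on every outcome.
assert all(p1 == p2 for p1, p2 in team_prize.values())
```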

Supporting Evidence (1)

1.
"The simplest kind of cooperation failures are those in which all agents have (approximately) the same objectives. Even in such common-interest settings, however, miscoordination abounds. While it is reasonable to expect that these problems will tend to be addressed as the general capabilities of AI systems (such as communication and reasoning about others) improve, they may still present risks in the near-term." (p. 10)

Sub-categories (3)

Incompatible strategies

"Incompatible Strategies. Even if all agents can perform well in isolation, miscoordination can still occur due to the agents choosing incompatible strategies (Cooper et al., 1990). Competitive (i.e., two-player zero-sum) settings allow designers to produce agents that are maximally capable without taking other players into account. Crucially, this is possible because playing a strategy at equilibrium in the zero-sum setting guarantees a certain payoff, even if other players deviate from the equilibrium (Nash, 1951). On the other hand, common-interest (and mixed-motive) settings often allow a vast number of mutually incompatible solutions (Schelling, 1980), which is worsened in partially observable environments (Bernstein et al., 2002; Reif, 1984)."

7.6 Multi-agent risks
AI system · Unintentional · Post-deployment
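The "vast number of mutually incompatible solutions" can be sketched with the simplest possible coordination game. The actions and payoffs below are a toy illustration, not drawn from the cited works:

```python
# Toy common-interest coordination game (hypothetical payoffs):
# both agents get 1 if they pick the same action, 0 otherwise.
# (A, A) and (B, B) are both optimal equilibria, yet they are
# mutually incompatible: an agent that converged on A miscoordinates
# with a partner that converged on B.

def payoff(a1: str, a2: str) -> int:
    """Shared payoff: coordination succeeds only on matching actions."""
    return 1 if a1 == a2 else 0

# Each convention is individually optimal with a like-minded partner...
assert payoff("A", "A") == 1
assert payoff("B", "B") == 1
# ...yet pairing agents from different conventions yields the worst
# outcome for both, despite identical preferences over outcomes.
assert payoff("A", "B") == 0
```

Note the contrast with the zero-sum case described in the quote: there, an equilibrium strategy guarantees its payoff regardless of what the opponent does, so this equilibrium-selection problem does not arise.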

Credit Assignment

"Credit Assignment. While agents can often learn to jointly solve tasks and thus avoid coordination failures, learning is made more challenging in the multi-agent setting due to the problem of credit assignment (Du et al., 2023; Li et al., 2025, see also Section 3.1 on information asymmetries and Section 3.4, which discusses distributional shift). That is, in the presence of other learning agents, it can be unclear which agents’ actions caused a positive or negative outcome to obtain, especially if the environment is complex. Moreover, in multi-principal settings, agents may not have been trained together and therefore need to generalise to new co-players and collaborators based on their prior experience (Agapiou et al., 2022; Leibo et al., 2021; Stone et al., 2010)."

7.6 Multi-agent risks
AI system · Unintentional · Post-deployment
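One standard way the credit-assignment problem is formalised in cooperative multi-agent learning (a general technique, not one attributed to Hammond et al.) is the difference reward: each agent is credited with the change in team reward caused by replacing its action with a default. A minimal sketch, with an illustrative team-reward function:

```python
# Sketch of difference rewards for multi-agent credit assignment.
# Each agent i receives D_i = G(joint) - G(joint with agent i's
# action replaced by a default), isolating its marginal contribution
# to the shared team reward. Payoffs here are illustrative.
from typing import Callable, List, Sequence


def difference_rewards(
    joint: Sequence[str],
    team_reward: Callable[[Sequence[str]], float],
    default: str = "noop",
) -> List[float]:
    g = team_reward(joint)
    rewards = []
    for i in range(len(joint)):
        counterfactual = list(joint)
        counterfactual[i] = default  # remove agent i's contribution
        rewards.append(g - team_reward(counterfactual))
    return rewards


# Hypothetical team reward: one point per agent that "work"s.
team = lambda acts: float(sum(a == "work" for a in acts))

# Agent 2 slacked; the difference rewards attribute the shortfall to
# it, whereas the raw shared reward (2.0 for everyone) would not
# distinguish which agent's action caused the outcome.
print(difference_rewards(["work", "noop", "work"], team))  # [1.0, 0.0, 1.0]
```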

Limited Interactions

"Limited Interactions. Sometimes learning from historical interactions with the relevant agents may not be possible, or may be possible using only limited interactions. In such cases, some other form of information exchange is required for agents to be able to reliably coordinate their actions, such as via communication (Crawford & Sobel, 1982; Farrell & Rabin, 1996a) or a correlation device (Aumann, 1974, 1987). While advances in language modelling mean that there are likely to be fewer settings in which the inability of advanced AI systems to communicate leads to miscoordination, situations that require split-second decisions or where communication is too costly could still produce failures. In these settings, AI agents must solve the problem of ‘zero-shot’ (or, more generally, ‘few-shot’) coordination (Emmons et al., 2022; Hu et al., 2020; Stone et al., 2010; Treutlein et al., 2021; Zhu et al., 2021)."

7.6 Multi-agent risks
AI system · Unintentional · Post-deployment
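The correlation device mentioned in the quote (Aumann, 1974) can be sketched in a few lines: agents that cannot communicate can still coordinate reliably if both condition on a commonly observed signal using a commonly known rule. The signal, rule, and payoffs below are illustrative:

```python
# Sketch of a correlation device: a shared public signal lets
# non-communicating agents break the symmetry between equally good
# but incompatible equilibria. Names and payoffs are illustrative.
import random


def payoff(a1: str, a2: str) -> int:
    # Coordination game: agents succeed only on matching actions.
    return 1 if a1 == a2 else 0


def agent(signal: int) -> str:
    # Both agents apply the same commonly known rule to the signal.
    return "A" if signal % 2 == 0 else "B"


rng = random.Random(0)
results = []
for _ in range(100):
    signal = rng.randrange(1000)  # publicly observed by both agents
    results.append(payoff(agent(signal), agent(signal)))

# Conditioning on the shared signal guarantees coordination every
# round; independent random choices would fail about half the time.
assert all(r == 1 for r in results)
```

Zero-shot coordination is the harder case where no such device (and no shared training history) is available, so agents must fall back on conventions they can expect an unfamiliar partner to infer.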
