AI systems that interact autonomously with one another form multi-agent systems, which carry unique risks beyond those posed by individual AI systems. These risks fall into three main failure modes, depending on the agents' objectives and on how humans expect the systems to behave:
Miscoordination occurs when AI agents fail to cooperate effectively despite sharing the same goals, for example because they choose incompatible strategies for achieving a mutual end. For example, driving models trained on United States versus Indian cultural conventions for yielding to emergency vehicles blocked traffic in 77.5% of scenarios despite their shared goal of clearing a path.
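This failure mode is easy to reproduce in a toy coordination game. The Python sketch below is illustrative only, not the cited driving experiment: the convention names, action sets, and success condition are all assumptions. Two agents share the goal of clearing a path but draw their manoeuvres from different conventions.

```python
import random

# Toy coordination game (illustrative assumptions, not the cited experiment):
# each convention maps the shared goal "clear a path" to concrete manoeuvres.
CONVENTIONS = {
    "us": ["pull_right"],                  # assumed US-style convention
    "india": ["pull_left", "pull_right"],  # assumed mixed convention
}

def path_cleared(conv_a: str, conv_b: str, rng: random.Random) -> bool:
    """A corridor opens only if both vehicles move to the same side."""
    a = rng.choice(CONVENTIONS[conv_a])
    b = rng.choice(CONVENTIONS[conv_b])
    return a == b

def blocked_rate(conv_a: str, conv_b: str, trials: int = 10_000) -> float:
    rng = random.Random(0)
    blocked = sum(not path_cleared(conv_a, conv_b, rng) for _ in range(trials))
    return blocked / trials

if __name__ == "__main__":
    print(f"us / us pairing:    {blocked_rate('us', 'us'):.1%} blocked")
    print(f"us / india pairing: {blocked_rate('us', 'india'):.1%} blocked")
```

Each convention is individually adequate: a homogeneous pairing never blocks the road. The failure appears only in the mixed pairing, that is, in the interaction rather than in either model.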
Conflict occurs when AI agents with different but overlapping goals compete in harmful ways, for example by intensifying competition over shared resources or escalating military tensions. Such agents could also make novel forms of conflict possible through more advanced and accessible methods of coercion and extortion.
Collusion occurs when undesired cooperation emerges between AI agents, allowing them to circumvent safeguards or manipulate markets. For example, AI systems may develop hidden communication channels without being explicitly trained to do so. In market settings, AI systems may learn to collude simply because it is the most rewarding strategy.
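A standard repeated-game calculation illustrates why collusion can be the most rewarding strategy. The sketch below uses assumed, textbook-style duopoly payoffs (the numbers model no specific market) and compares the discounted return from sustained collusion with the return from undercutting once and being punished with competition thereafter.

```python
# Illustrative per-round duopoly payoffs (assumed, prisoner's-dilemma shape):
#   both price high (collude):   6 each
#   both price low (compete):    3 each
#   undercut a colluding rival:  8 for the deviator
COLLUDE, COMPETE, DEVIATE = 6.0, 3.0, 8.0

def discounted(head: float, tail: float, delta: float) -> float:
    """Discounted sum: `head` payoff now, `tail` payoff every round after."""
    return head + delta * tail / (1.0 - delta)

def collusion_is_stable(delta: float) -> bool:
    # Grim-trigger logic: after a deviation, both compete forever.
    stay = discounted(COLLUDE, COLLUDE, delta)   # 6 / (1 - delta)
    cheat = discounted(DEVIATE, COMPETE, delta)  # 8 + 3*delta / (1 - delta)
    return stay >= cheat

if __name__ == "__main__":
    for delta in (0.2, 0.4, 0.6, 0.9):
        print(f"delta={delta:.1f}: collusion stable? {collusion_is_stable(delta)}")
```

With these payoffs, collusion becomes self-sustaining once agents weight future rewards at delta >= 0.4. No agreement or communication is required; an agent that simply maximizes long-run reward arrives at the same place.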
A range of risk factors contribute to miscoordination, conflict, and collusion: information asymmetries between agents; network effects, where small changes cascade through interconnected systems; selection pressures that reward problematic behaviours; destabilizing dynamics such as feedback loops and unpredictability; commitment problems that prevent trust; emergent agency, where new capabilities or goals arise at the collective level; and multi-agent security vulnerabilities. Unlike single-agent risks, multi-agent risks arise from interactions across networks of agents that may be individually safe but collectively dangerous, and they are likely to grow as AI systems become more numerous, more autonomous, and better able to adapt to one another.
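Of these factors, network effects are the most mechanical to demonstrate. The sketch below is a threshold-cascade toy model in the spirit of Watts's 2002 cascade model; the topology, thresholds, and single-seed failure are assumptions chosen for illustration, not a model of any deployed agent network.

```python
import random

def cascade_size(n: int = 200, degree: int = 4, threshold: float = 0.3,
                 seed: int = 0) -> int:
    """Seed one failure; an agent fails once more than `threshold` of its
    neighbours have failed. Returns the final number of failed agents."""
    rng = random.Random(seed)
    # Each agent watches `degree` random others (illustrative topology).
    neighbours = [rng.sample([j for j in range(n) if j != i], degree)
                  for i in range(n)]
    failed = {0}           # a single initial failure
    changed = True
    while changed:         # propagate until the system settles
        changed = False
        for i in range(n):
            if i in failed:
                continue
            frac = sum(j in failed for j in neighbours[i]) / degree
            if frac > threshold:
                failed.add(i)
                changed = True
    return len(failed)

if __name__ == "__main__":
    for thr in (0.5, 0.3, 0.2):
        sizes = sorted(cascade_size(threshold=thr, seed=s) for s in range(10))
        print(f"threshold={thr}: cascade sizes {sizes}")
```

With these parameters the system stays at a single failure for thresholds of 0.3 and above, and collapses almost entirely at 0.2: collective fragility appears from a small parameter change, with no change to any individual agent.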
Excerpt from the MIT AI Risk Repository full report
Risks from multi-agent interactions, due to incentives (which can lead to conflict or collusion) and/or the structure of multi-agent systems, which can create cascading failures, selection pressures, new security vulnerabilities, and a lack of shared information and trust.
Figure: Incident volume relative to governance coverage; each dot is one of 24 subdomains. Incidents are classified along three causal dimensions: Entity (who or what caused the harm), Intent (whether the harm was intentional or accidental), and Timing (whether the risk arises pre- or post-deployment).
Subdomains with the most shared governance coverage:

Vulnerabilities that can be exploited in AI systems, software development toolchains, and hardware, resulting in unauthorized access, data and privacy breaches, or system manipulation causing unsafe outputs or behavior. (18 shared governance docs)

AI developers or state-like actors competing in an AI ‘race’ by rapidly developing, deploying, and applying AI systems to maximize strategic or economic advantage, increasing the risk they release unsafe and error-prone systems. (18 shared governance docs)

Inadequate regulatory frameworks and oversight mechanisms that fail to keep pace with AI development, leading to ineffective governance and the inability to manage AI risks appropriately. (17 shared governance docs)

AI systems that memorize and leak sensitive personal data or infer private information about individuals without their consent. Unexpected or unauthorized sharing of data and information can compromise user expectation of privacy, assist identity theft, or cause loss of confidential intellectual property. (14 shared governance docs)
Encourages industry self-regulation through the establishment of AI safety and risk management teams, rigorous model testing, data protection measures, infrastructure security, enhanced transparency, and safety research. Signed by major companies, including Alibaba, Baidu, Huawei, Tencent, and others, to uphold AI for good principles.
Commits to establishing a shared scientific basis for AI risk assessments. Requires risk assessments to be actionable, transparent, comprehensive, multistakeholder, iterative, and reproducible. Encourages collaboration among stakeholders and adapting methodologies as AI systems evolve.
Regulates the MOD's adoption and deployment of AI to align with democratic values, safety, and responsibility. Specifies requirements for AI in Defence, including holistic risk management, legal compliance, ethical principles, reliability, bias mitigation, security, and human-AI teaming. Guides AI governance, lifecycle management, and supplier cooperation. Encourages a pragmatic approach to scope and international collaboration.