Laws, mandates, and enforcement mechanisms requiring state authority to create or enforce.
Technical oversight in AI regulation involves a set of components that together keep AI systems within ethical and safety bounds: transparency and explainability, auditing and monitoring, accountability mechanisms, and safety standards with certification processes.

• Transparency and Explainability: Transparency is a cornerstone of responsible AI governance. For AI systems to be regulated effectively, their decision-making processes must be interpretable by human operators and auditors (Bengio et al., 2024b; Bommasani et al., 2024b). Explainability refers to the ability to trace and understand how an AI system arrives at its conclusions. This is particularly important in high-stakes fields such as healthcare and criminal justice, where opaque decision-making can lead to harmful consequences. The push for transparency aligns with regulatory frameworks such as the European Union's GDPR, which mandates that individuals have the right to an explanation of AI-driven decisions (Bommasani et al., 2024a).

• Auditing and Monitoring: Auditing AI systems is essential for identifying potential biases, operational flaws, and security vulnerabilities. Audits can be performed at various stages of system development, from pre-deployment assessments to continuous monitoring once systems are in operation (Bengio et al., 2024b; Bommasani et al., 2024b). Continuous monitoring ensures that AI systems remain compliant with ethical guidelines and legal requirements over time. Monitoring frameworks should include mechanisms for tracking data quality, decision-making processes, and model performance, especially in dynamic environments where models learn and adapt (Bengio et al., 2024b); a minimal monitoring sketch follows this list.

• Accountability Mechanisms: Accountability ensures that developers and operators of AI systems are responsible for the outcomes their technologies produce. One major proposal in this area is mandatory incident reporting for high-risk AI applications (Bengio et al., 2024b), which would require companies and organizations to disclose failures or unethical outcomes produced by their AI systems; an illustrative incident-report schema also follows this list. In addition, clear guidelines must define liability when AI systems cause harm, particularly where the harm could have been anticipated or prevented through proper oversight (Bommasani et al., 2024a).

• Safety Standards and Certification: Developing safety standards and certification processes for AI systems is a critical element of technical oversight. These standards should rest on international cooperation so that regulatory approaches are harmonized across jurisdictions. Certification would involve third-party assessments verifying that AI systems meet established safety and ethical benchmarks before deployment in critical settings (Bommasani et al., 2024a). Such standards should cover data privacy, algorithmic fairness, and robustness against adversarial attacks (Bengio et al., 2024b; Bommasani et al., 2024b).
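As a minimal, hypothetical sketch of what the continuous monitoring described above could look like in practice, the Python below compares a deployed model's recent decision distribution against a pre-deployment baseline using the Population Stability Index (PSI). The function names, the example data, and the 0.2 alert threshold are illustrative assumptions, not taken from the cited frameworks.

```python
# Hypothetical monitoring sketch: detect drift in a deployed model's decision
# distribution relative to a pre-deployment audit baseline. All names and the
# alert threshold are illustrative assumptions.
import math
from collections import Counter

def distribution(labels: list[str]) -> dict[str, float]:
    """Empirical label distribution from a list of observed decisions."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}

def psi_score(expected: dict[str, float], observed: dict[str, float]) -> float:
    """Population Stability Index between baseline and recent distributions.
    Labels absent from the baseline are ignored in this simplified version."""
    psi = 0.0
    for label, e in expected.items():
        e = max(e, 1e-6)                        # avoid log(0)
        o = max(observed.get(label, 0.0), 1e-6)
        psi += (o - e) * math.log(o / e)
    return psi

# Baseline captured during the pre-deployment audit.
baseline = distribution(["approve"] * 80 + ["deny"] * 20)
# Recent production decisions collected by the monitoring pipeline.
recent = distribution(["approve"] * 55 + ["deny"] * 45)

drift = psi_score(baseline, recent)
if drift > 0.2:  # 0.2 is a commonly used PSI alert level (an assumption here)
    print(f"ALERT: decision distribution drifted (PSI = {drift:.3f}); escalate to audit")
```

In a real deployment this check would run on a schedule against logged decisions, and a breach would trigger the kind of audit described in the bullet above rather than a print statement.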
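The accountability bullet above mentions mandatory incident reporting for high-risk AI applications. As an illustration only, the sketch below shows one plausible shape for such a report; the schema, field names, and severity scale are assumptions, since none of the cited proposals specifies a format.

```python
# Hypothetical incident-report record for a high-risk AI system. The schema is
# an illustrative assumption, not a format defined by any cited proposal.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AIIncidentReport:
    system_name: str
    deployer: str
    severity: int                  # 1 (minor) .. 5 (critical); assumed scale
    description: str
    harmed_parties: list[str] = field(default_factory=list)
    anticipated: bool = False      # could the harm have been foreseen?
    reported_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def to_disclosure(self) -> dict:
        """Flatten the report for submission to a (hypothetical) regulator."""
        return {
            "system": self.system_name,
            "deployer": self.deployer,
            "severity": self.severity,
            "description": self.description,
            "harmed_parties": self.harmed_parties,
            "anticipated": self.anticipated,
            "reported_at": self.reported_at.isoformat(),
        }

report = AIIncidentReport(
    system_name="loan-scoring-v3",
    deployer="ExampleBank",
    severity=4,
    description="Systematic denial of applicants from one postcode region.",
    harmed_parties=["loan applicants"],
    anticipated=True,
)
print(report.to_disclosure())
```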
Proposals
Value Misalignment
99.9 · Other · Value Misalignment > Mitigating social bias
1 · AI System · Value Misalignment > Privacy protection
1 · AI System · Value Misalignment > Methods for mitigating toxicity
1 · AI System · Value Misalignment > Methods for mitigating LLM amorality
1 · AI System · Robustness to attack

Large Language Model Safety: A Holistic Survey
Shi, Dan; Shen, Tianhao; Huang, Yufei; Li, Zhigen; Leng, Yongqi; Jin, Renren; Liu, Chuang; Wu, Xinwei; Guo, Zishan; Yu, Linhao; Shi, Ling; Jiang, Bojian; Xiong, Deyi (2024)
The rapid development and deployment of large language models (LLMs) have introduced a new frontier in artificial intelligence, marked by unprecedented capabilities in natural language understanding and generation. However, the increasing integration of these models into critical applications raises substantial safety concerns, necessitating a thorough examination of their potential risks and associated mitigation strategies. This survey provides a comprehensive overview of the current landscape of LLM safety, covering four major categories: value misalignment, robustness to adversarial attacks, misuse, and autonomous AI risks. In addition to the comprehensive review of the mitigation methodologies and evaluation resources on these four aspects, we further explore four topics related to LLM safety: the safety implications of LLM agents, the role of interpretability in enhancing LLM safety, the technology roadmaps proposed and adhered to by a number of AI companies and institutes for LLM safety, and AI governance aimed at LLM safety with discussions on international cooperation, policy proposals, and prospective regulatory directions. Our findings underscore the necessity for a proactive, multifaceted approach to LLM safety, emphasizing the integration of technical solutions, ethical considerations, and robust governance frameworks. This survey is intended to serve as a foundational resource for academic researchers, industry practitioners, and policymakers, offering insights into the challenges and opportunities associated with the safe integration of LLMs into society. Ultimately, it seeks to contribute to the safe and beneficial development of LLMs, aligning with the overarching goal of harnessing AI for societal advancement and well-being. A curated list of related papers is publicly available at https://github.com/tjunlp-lab/Awesome-LLM-Safety-Papers.
Other (outside lifecycle): Outside the standard AI system lifecycle
Governance Actor: Regulator, standards body, or oversight entity shaping AI policy
Govern: Policies, processes, and accountability structures for AI risk management