Technical and operational risks
AI systems may fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.
"To date, technical limitations and vulnerabilities are present in most generative AI models in various contexts. Consequently, malicious users find it easier to breach an AI system’s safety and ethical guardrails to execute harmful actions.223 Normal user behavior—actions within an AI system’s intended use—can also lead to harmful outcomes. Whether these harmful outcomes result from normal or malicious use, they stem from the inherent limitations of current technology, which future advancements may overcome. This section examines the technical vulnerabilities that can affect AI models, the tendency of generative AI models to generate inaccurate information, and the inherent opacity of these AI systems, which complicates the understanding and mitigation of these difficulties."(p. 60)
Sub-categories (6)
Technical vulnerabilities (Robustness - unexpected behaviour)
"There is no assurance that generative AI models will consistently behave as their developers and users intend. Unwanted content is not necessarily due to intentional adversarial behavior. Generative AI models can unexpectedly produce potentially harmful content, including materials that are racist, discriminatory, or sexually explicit, or that promote violence, terrorism, or hate."
7.3 Lack of capability or robustness
Technical vulnerabilities (Robustness - vulnerability to jailbreaking)
"Individuals can manipulate models into performing actions that violate the model’s usage restrictions—a phenomenon known as “jailbreaking.” These manipulations may result in causing the model to perform tasks that the developers have explicitly prohibited (see section 3.2.1.). For instance, users may ask the model to provide information on how to conduct illegal activities— asking for detailed instructions on how to build a bomb or create highly toxic drugs."
2.2 AI system security vulnerabilities and attacks
Technical vulnerabilities (The risk of misalignment)
"To assess whether an AI model is reliable or robust, it is crucial to consider whether the model is “aligned.” “Alignment” focuses on whether an AI model effectively operates in accordance with the goals established by its designers.238 A misaligned AI model may pursue some objectives, but not the intended ones. Therefore, misaligned AI models can malfunction and cause harm."
7.1 AI pursuing its own goals in conflict with human goals or values
Factually incorrect content (inaccuracies and fabricated sources)
"One of the most vexing problems associated with AI models is that they occasionally present false information as if it is factual—often with authoritative-sounding text and fabricated quotes and sources. This unpredictable phenomenon of generating false information is well known to AI researchers, who have termed such erroneous output with the euphemistic label “hallucination.” "
3.1 False or misleading information
Opacity (the black box problem)
"Opacity surrounding the technical, internal decision-making processes of generative AI models is popularly known as the “black box problem.”277 Generative AI models, most ubiquitously built on deep neural networks with hundreds of billions of internal connections,278 have become so complex that their internal decision-making processes are no longer traceable or interpretable to even the most advanced expert observers. This means that, while the inputs and outputs of a system can be observed, developers cannot explain in detail why specific inputs correspond to specific outputs."
7.4 Lack of transparency or interpretability
Opacity (industry opacity)
"Opacity is not solely due to the technological complexity that limits developers’ and users’ understanding of how generative models function on a technical level. It is further exacerbated by the practices of organizations and companies that are advancing the field. Many are private companies that choose to withhold from the public many of the precise characteristics of their most advanced models."
6.4 Competitive dynamics
Other risks from G'sell (2024) (33)
Legal challenges
2.1 Compromise of privacy by leaking or correctly inferring sensitive information
Legal challenges > Privacy and data collection concerns (collecting personal information or personally identifiable information)
2.1 Compromise of privacy by leaking or correctly inferring sensitive information
Legal challenges > Privacy and data collection concerns (data protection concerns)
2.1 Compromise of privacy by leaking or correctly inferring sensitive information
Legal challenges > Copyright challenges (training models using copyrighted output)
6.3 Economic and cultural devaluation of human effort
Legal challenges > Copyright challenges (copyright-infringing output)
6.3 Economic and cultural devaluation of human effort
Environmental, economic, and societal challenges
6.0 Socioeconomic & Environmental