Cybersecurity
"This section catalogs the risk sources and mitigation measures related to cyber- security. These items may be related to security in terms of AI models being accessible only to the intended users, as well as AI models having appropriate access to the external world during both model development and deployment stages."(p. 42)
Sub-categories (4)
Interconnectivity with malicious external tools
"The growing integration and interconnectivity with external tools and plugins increase the risk of exposure to malicious external inputs. This interconnectivity makes it easier for external tools to introduce harmful content [220]."
2.2 AI system security vulnerabilities and attacks
Unintended outbound communication by AI systems
"AI systems that have the broad ability to connect to a network to obtain infor- mation could also end up sending data outbound in ways that neither providers, deployers, or end users intended [138]. This can happen if there is no whitelisting of communication channels (such as network connections or allowed protocols). In general, this can occur if the deployment of the AI system violates the prin- ciple of least privilege. Such outbound communication may lead to leakage of confidential data, or the AI system performing unwanted actions like sending emails or ordering goods on the internet."
7.2 AI possessing dangerous capabilities
AI system bypassing a sandbox environment
"An AI system may have the ability to bypass a sandboxed environment in which it is trained or evaluated."
7.2 AI possessing dangerous capabilities
Model weight leak
"Model weights or access to them can be leaked when initial access is granted only to a select group of individuals, such as institutional researchers [209]. This risk can increase as more people gain access, and identifying the source of the leak becomes more difficult. The availability of leaked model weights makes various attacks on systems that use the leaked AI model easier to implement, such as finding adversarial examples, elicitation of dangerous capabilities, and extraction of confidential information present in the training data. The avail- ability of model weights might also enable the misuse of the AI system using the leaked model to produce harmful or illegal content [67]."
2.2 AI system security vulnerabilities and attacks
Other risks from Gipiškis 2024 (144)
Direct Harm Domains (content safety harms)
1.2 Exposure to toxic content
Direct Harm Domains (content safety harms) > Violence and extremism
1.2 Exposure to toxic content
Direct Harm Domains (content safety harms) > Hate and toxicity
1.2 Exposure to toxic content
Direct Harm Domains (content safety harms) > Sexual content
1.2 Exposure to toxic content
Direct Harm Domains (content safety harms) > Child harm
1.2 Exposure to toxic content
Direct Harm Domains (content safety harms) > Self-harm
1.2 Exposure to toxic content