1.2.5 Provenance & Watermarking
Output attribution, content watermarking, and AI detection mechanisms.
Identify what content is and is not from AI systems; some methods also identify the originating AI system or even the user. These are technical mechanisms that detect and attribute AI-generated content by analysing the output itself.

AI-content watermarking (1.2.5 Provenance & Watermarking)
Embed an identifying signal in AI-generated outputs so the content can later be recognised as AI-generated.
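To make the idea concrete, here is a minimal sketch of watermark detection in the style of 'green list' schemes for language-model text (Kirchenbauer et al., 2023). Everything here is illustrative: real schemes reseed the green list from preceding tokens and bias generation toward it, rather than using a fixed per-word split.

```python
import hashlib

def green_fraction(text: str, key: str = "watermark-key") -> float:
    """Return the fraction of words that fall in the keyed 'green list'.

    A watermarking generator biases sampling toward green words, so
    watermarked text scores well above the ~0.5 expected by chance.
    """
    def is_green(word: str) -> bool:
        digest = hashlib.sha256((key + word.lower()).encode()).digest()
        return digest[0] % 2 == 0  # keyed pseudorandom 50/50 split of the vocabulary

    words = text.split()
    return sum(is_green(w) for w in words) / max(len(words), 1)

# Unwatermarked text should land near 0.5; a detector flags text that
# scores statistically significantly higher.
print(green_fraction("the quick brown fox jumps over the lazy dog"))
```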

Human-content watermarking (1.2.5 Provenance & Watermarking)
Similar to watermarking AI outputs, some systems may be able to watermark human-generated content.

Hash databases and perceptual hashing (1.2.5 Provenance & Watermarking)
Hash functions are one-way functions that take an arbitrary input (such as a piece of AI-generated content) and output a short string that represents that content, for example hash(<some image>) = abcd. A database of such hashes lets known content be recognised later, and perceptual hashing extends this to near-duplicates by mapping similar inputs to similar hashes.
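As a sketch of the difference, the toy example below contrasts a cryptographic hash with a simple perceptual 'average hash'; the 2x2 pixel grids stand in for real image data.

```python
import hashlib

def crypto_hash(data: bytes) -> str:
    """Cryptographic hash: any change to the input changes the output completely."""
    return hashlib.sha256(data).hexdigest()[:8]

def average_hash(pixels: list[list[int]]) -> str:
    """Toy perceptual (average) hash: one bit per pixel, set when the pixel is
    brighter than the image's mean, so visually similar images hash alike."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = "".join("1" if p > mean else "0" for p in flat)
    return hex(int(bits, 2))

image = [[10, 200], [190, 20]]    # a 2x2 stand-in for real pixel data
tweaked = [[11, 199], [190, 20]]  # an imperceptible edit

print(crypto_hash(bytes(sum(image, []))), crypto_hash(bytes(sum(tweaked, []))))  # completely different
print(average_hash(image), average_hash(tweaked))                                # identical
```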

Content provenance (1.2.5 Provenance & Watermarking)
Content provenance (also known as ‘chain of custody’) focuses on recording how content has been created and updated over time. This provides much more detailed information than other methods, which usually give only a binary yes/no on whether content is AI-generated.
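One minimal way to sketch this: a hash-linked log where each record commits to both the current content and the previous record, making the recorded history tamper-evident. Real systems such as the C2PA standard additionally sign each record cryptographically; the field names below are illustrative.

```python
import hashlib, json, time

def record_step(chain: list[dict], action: str, content: bytes) -> None:
    """Append a record that commits to the current content and the previous
    record, so any later tampering with the history is detectable."""
    prev = hashlib.sha256(json.dumps(chain[-1], sort_keys=True).encode()).hexdigest() if chain else None
    chain.append({
        "action": action,
        "content_hash": hashlib.sha256(content).hexdigest(),
        "prev_record_hash": prev,
        "timestamp": time.time(),
    })

chain: list[dict] = []
record_step(chain, "captured", b"original photo bytes")
record_step(chain, "cropped", b"cropped photo bytes")
record_step(chain, "ai_upscaled", b"upscaled photo bytes")
print(json.dumps(chain, indent=2))  # a verifier recomputes each hash to check the chain
```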

Content classifiers (1.2.5 Provenance & Watermarking)
Content classifiers aim to directly identify existing AI content without special changes to the AI content itself. Usually, these are AI systems themselves, trained to distinguish between real and AI images (similar to a discriminator in a GAN).
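A toy sketch of the approach, assuming placeholder data: a small binary classifier trained to separate 'real' from 'AI-generated' images, much like a GAN discriminator. Real detectors train on large labelled corpora rather than the random tensors used here.

```python
import torch
import torch.nn as nn

# Toy "real vs AI-generated" image classifier, analogous to a GAN discriminator.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(8 * 8, 32),
    nn.ReLU(),
    nn.Linear(32, 1),  # one logit: higher means "more likely AI-generated"
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

images = torch.rand(16, 1, 8, 8)               # placeholder 8x8 grayscale images
labels = torch.randint(0, 2, (16, 1)).float()  # 1 = AI-generated, 0 = real

for _ in range(200):  # standard supervised training loop
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()

print(torch.sigmoid(model(images[:1])))  # estimated P(AI-generated) for one image
```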

Compute governance (3.1.1 Legislation & Policy)
Regulate companies in the highly concentrated AI chip supply chain, given AI chips are key inputs to developing frontier AI models.

Data input controls (1.1.1 Training Data)
Filter data used to train AI models, e.g. don’t train your model with instructions to launch cyberattacks.
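A minimal sketch of such a filter, assuming a simple keyword blocklist (real pipelines typically rely on trained classifiers plus human review):

```python
# Placeholder blocklist; the patterns here are illustrative only.
BLOCKED_PATTERNS = ("launch a cyberattack", "synthesise a pathogen")

def filter_training_data(examples: list[str]) -> list[str]:
    """Drop training examples that match any blocked pattern."""
    return [ex for ex in examples if not any(p in ex.lower() for p in BLOCKED_PATTERNS)]

corpus = [
    "How to bake sourdough bread",
    "Step-by-step: launch a cyberattack against a hospital",
]
print(filter_training_data(corpus))  # keeps only the benign example
```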

Licensing (3.1.4 Compliance Requirements)
Require organisations or specific training runs to be licensed by a regulatory body, similar to licensing regimes in other high-risk industries.

On-chip governance mechanisms (1.2.4 Security Infrastructure)
Make alterations to AI hardware (primarily AI chips) that enable verifying or controlling how that hardware is used.

Safety cases (2.2.4 Assurance Documentation)
Develop structured arguments demonstrating that an AI system is unlikely to cause catastrophic harm, to inform decisions about training and deployment.
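One way to picture a safety case is as a tree of claims, each backed by evidence or by subclaims. The sketch below uses an invented schema loosely inspired by Goal Structuring Notation, not any standard format:

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    statement: str
    evidence: list[str] = field(default_factory=list)    # e.g. links to evals or audits
    subclaims: list["Claim"] = field(default_factory=list)

def supported(claim: Claim) -> bool:
    """A claim holds if it has direct evidence, or all of its subclaims hold."""
    if claim.evidence:
        return True
    return bool(claim.subclaims) and all(supported(c) for c in claim.subclaims)

case = Claim(
    "Deployment is unlikely to cause catastrophic harm",
    subclaims=[
        Claim("Model lacks dangerous cyber-offence capability", evidence=["eval report"]),
        Claim("Misuse is mitigated by API monitoring", evidence=["monitoring audit"]),
    ],
)
print(supported(case))  # True only when every branch of the argument is evidenced
```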

Evaluations (aka “evals”) (2.2.2 Testing & Evaluation)
Give AI systems standardised tests to assess their capabilities, which can inform the risks they might pose.
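A minimal sketch of an evaluation harness, assuming a text-in, text-out model API; the `model` stub, questions, and exact-match grading are placeholders for real benchmark suites:

```python
from typing import Callable

# Hypothetical stand-in for a deployed model's text API.
def model(prompt: str) -> str:
    return "4"

EVAL_SET = [
    {"prompt": "What is 2 + 2?", "expected": "4"},
    {"prompt": "Name the capital of France.", "expected": "Paris"},
]

def run_eval(generate: Callable[[str], str]) -> float:
    """Score a model on a standardised test set using exact-match grading."""
    correct = sum(generate(case["prompt"]).strip() == case["expected"] for case in EVAL_SET)
    return correct / len(EVAL_SET)

print(f"accuracy: {run_eval(model):.0%}")  # this stub scores 50%
```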

The AI regulator’s toolbox: A list of concrete AI governance practices
Jones, Adam (2024)
This article explains concrete AI governance practices people are exploring as of August 2024. Prior summaries map out high-level areas of work but rarely dive into concrete practice details. This summary explores specific practices addressing risks from advanced AI systems, grouped into categories based on where in the AI lifecycle they best fit. The primary goal of this article is to help newcomers contribute to the field of AI governance by providing a comprehensive overview of available practices.

Operate and Monitor
Running, maintaining, and monitoring the AI system post-deployment

Developer
Entity that creates, trains, or modifies the AI system

Map
Identifying and documenting AI risks, contexts, and impacts