Modifications to training data composition, quality, and filtering that affect what the model learns.
Auditing input data before training allows organisations to filter and validate the composition of the training corpus, shaping what the model ultimately learns.
Audit datasets used for pre-training, but also those used for fine-tuning, classifiers, and other tools
Inappropriate datasets could result in systems that comply with harmful instructions.
2.2.3 Auditing & Compliance
Use technical tools – such as classifiers and filters – to audit large datasets to support scalability and privacy
These could be used in combination with human oversight, which can verify and augment these assessments.
1.2.1 Guardrails & Filtering
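A minimal sketch of how such tooling might be wired together, assuming a trivial keyword matcher as a stand-in for trained classifiers (the categories, keywords, and review rate below are illustrative, not drawn from the document):

```python
import random

# Illustrative audit categories with a trivial keyword "classifier";
# a production pipeline would substitute trained classifiers per category.
AUDIT_KEYWORDS = {
    "dangerous_capability": ["synthesis route", "exploit chain"],
    "private_information": ["password", "home address"],
}

def flag_document(text: str) -> dict[str, bool]:
    """Flag a single document against each audit category."""
    lowered = text.lower()
    return {cat: any(kw in lowered for kw in kws)
            for cat, kws in AUDIT_KEYWORDS.items()}

def audit_corpus(docs: list[str], review_rate: float = 0.01):
    """First pass by machine for scalability; a random sample of flagged
    documents is then routed to human reviewers for verification."""
    flags = [flag_document(d) for d in docs]
    flagged = [d for d, f in zip(docs, flags) if any(f.values())]
    n_review = max(1, int(len(flagged) * review_rate)) if flagged else 0
    return flags, random.sample(flagged, n_review)
```

Sampling only a fraction of flagged documents for human review is one way to combine machine scalability with human verification while limiting how much raw data human reviewers are exposed to.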
Assess the overall composition of training data
This could include the data sources, the provenance of the data, indicators of data quality and integrity, and measures of bias and representativeness. The amount and variety of data are simple, reliable predictors of risk, and provide an additional line of defence where more targeted assessments are limited.
2.2.1 Risk Assessment
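One way such a composition assessment might be operationalised is to aggregate simple descriptive statistics per source and licence; the record fields and values below are illustrative assumptions:

```python
from collections import Counter

# Illustrative per-record provenance metadata; a real corpus would attach
# these fields at ingestion time.
corpus = [
    {"source": "web_crawl", "licence": "unknown", "tokens": 512},
    {"source": "books", "licence": "public_domain", "tokens": 2048},
    {"source": "web_crawl", "licence": "cc_by", "tokens": 256},
]

def composition_report(records: list[dict]) -> dict:
    """Summarise the amount and variety of data: token share per source and
    licence counts. Coarse measures like these are cheap and reliable, and
    act as a backstop where targeted assessments are limited."""
    tokens_by_source: Counter = Counter()
    licence_counts: Counter = Counter()
    for r in records:
        tokens_by_source[r["source"]] += r["tokens"]
        licence_counts[r["licence"]] += 1
    total = sum(tokens_by_source.values())
    return {
        "total_tokens": total,
        "share_by_source": {s: t / total for s, t in tokens_by_source.items()},
        "licence_counts": dict(licence_counts),
    }

print(composition_report(corpus))
```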
Audit datasets for various types of information
Including:
- Information that might enhance dangerous system capabilities, such as information about weapons manufacturing or terrorism.
- Private or sensitive information (see the sketch after this list). AI systems may be subject to data extraction attacks, where determined users can prompt systems to reveal pieces of training data, or may even reveal this information accidentally. This makes it important to know whether datasets include private or sensitive information, for example, names, addresses, or security vulnerabilities.
- Biases in the data. Training data that is imbalanced or inaccurate can result in an AI system being less accurate for people with certain personal characteristics or providing a skewed picture of particular groups. Ensuring a better balance in the training data could help to address this.[footnote 6]
- Harmful content, such as child sexual abuse materials, hate speech, or online abuse. A better understanding of harmful content in datasets can inform safety measures (for example, by highlighting domains where additional safeguards like content filters should be applied).
- Misinformation. Training an AI system on inaccurate information increases the likelihood that the system's outputs will be inaccurate and could lead to harm.
1.1.1 Training Data
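As an illustration of the private-information item above, a sketch of a regex-based PII scan; the two patterns are deliberately simplistic stand-ins for dedicated PII-detection tooling:

```python
import re

# Deliberately simplistic, illustrative patterns; production audits would
# use dedicated PII-detection tooling and far more robust rules.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def scan_for_pii(doc_id: str, text: str) -> list[dict]:
    """Emit one finding per match so results can feed a persistent audit log."""
    findings = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append({"doc": doc_id, "type": kind, "span": match.span()})
    return findings

print(scan_for_pii("doc-001", "Contact jane@example.com or 555-010-4477."))
```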
Draw on external expertise in conducting input data audits
For example, biosecurity experts could be consulted to identify information relevant to biological weapons manufacturing, which may not be readily obvious to non-experts.
2.2.1 Risk Assessment
Use data audits to improve understanding of how training data affects AI system behaviour
For example, if model evaluations reveal a potentially dangerous capability, data audits can help ascertain the extent to which the training data contributed to it.
2.2.1 Risk Assessment
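A sketch of one way this linkage might be made, under the assumption that a capability surfaced during evaluations can be associated with a set of search terms; the corpus records and terms are hypothetical:

```python
from collections import Counter

def attribute_capability(corpus: list[dict], terms: set[str]) -> Counter:
    """Count, per data source, how many documents mention any term associated
    with a capability surfaced during model evaluations."""
    hits: Counter = Counter()
    for record in corpus:
        text = record["text"].lower()
        if any(term in text for term in terms):
            hits[record["source"]] += 1
    return hits

corpus = [
    {"source": "forums", "text": "A thread on exploit development"},
    {"source": "books", "text": "A history of cryptography"},
]
print(attribute_capability(corpus, {"exploit"}))  # Counter({'forums': 1})
```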
Conduct audits on datasets used by their customers to fine-tune AI systems
Customers are often allowed to fine-tune systems on their own datasets. By carrying out audits to ensure that customers are not encouraging undesirable behaviours, frontier AI organisations can use their expertise and insight into the AI system's original training data to identify potential harms upstream. It is important that frontier AI organisations are mindful of privacy concerns and make use of privacy-preserving techniques, where appropriate.
2.2.3 Auditing & Compliance
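A hypothetical gate for customer fine-tuning jobs might look like the following; the `moderate` callable, the rejection threshold, and the field names are assumptions, and only aggregate counts are reported so the audit record itself does not expose customer data:

```python
def vet_finetune_dataset(examples: list[dict], moderate) -> dict:
    """Gate a customer fine-tuning job: reject it if too many examples are
    flagged by a moderation check. `moderate` is any callable returning True
    for disallowed content (a stand-in for a real moderation model)."""
    flagged = [ex for ex in examples
               if moderate(ex["prompt"]) or moderate(ex["completion"])]
    verdict = "reject" if flagged and len(flagged) / len(examples) > 0.01 else "accept"
    # Report aggregate counts only, not raw customer text, to limit
    # exposure of private customer data.
    return {"verdict": verdict, "n_examples": len(examples), "n_flagged": len(flagged)}

demo = [{"prompt": "Translate: hello", "completion": "bonjour"}]
print(vet_finetune_dataset(demo, moderate=lambda text: "attack" in text.lower()))
```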
Document the results of input data audits, including metadata
Frontier AI organisations could look to emerging standards, such as datasheets for datasets, when documenting the results of input data audits.
2.2.4 Assurance Documentation
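A minimal sketch of how audit results and metadata might be recorded in a datasheet-style structure; the field selection is an illustrative subset rather than a complete datasheet template, and all values are hypothetical:

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class Datasheet:
    """An illustrative subset of datasheet-style fields for recording the
    outcome of an input data audit, including metadata."""
    name: str
    motivation: str
    composition: dict        # e.g. source shares, languages
    collection_process: str
    audit_findings: dict     # e.g. counts of PII / harmful-content flags
    recommended_uses: list = field(default_factory=list)

sheet = Datasheet(
    name="web_crawl_v3",     # hypothetical dataset
    motivation="General-purpose pre-training text",
    composition={"web_crawl": 0.8, "books": 0.2},
    collection_process="Crawled Jan-Jun 2023; deduplicated",
    audit_findings={"pii_flags": 112, "harmful_content_flags": 9},
    recommended_uses=["pre-training with content filters applied"],
)
print(json.dumps(asdict(sheet), indent=2))
```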
Model reporting and information sharing

Transparency around frontier AI can help governments to effectively realise the benefits of AI and mitigate AI risks. It can also encourage the sharing of best practice across frontier AI organisations, enable users to make well-informed choices about whether and how to use AI systems, and increase public trust, helping to drive AI adoption. Reporting and sharing information, where appropriate, could ensure that different parties can access the information they need to support effective governance, develop best practice, inform decision-making about the use of AI systems, and build public trust. Some reporting practices – such as model cards – are already used among frontier AI organisations, whereas other practices included here are areas for future consideration. Given the recent rapid pace of progress in AI, the appropriate government and international governance institutions are still being considered, and we recognise that this limits the ability of frontier AI organisations to share information with governments, even where it would be desirable. Throughout this section, 'relevant government authorities' is used to indicate a good practice for information sharing with governments, while recognising that such authorities may still be under development.
3.3.1 Industry Coordination
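Model cards are one such reporting practice; a minimal illustrative example follows, with field names loosely modelled on common model-card templates rather than any mandated schema, and all values hypothetical:

```python
# A minimal, illustrative model card expressed as a plain record; the field
# names loosely follow common model-card templates and every value here is
# hypothetical, not taken from any real system.
model_card = {
    "model": "example-frontier-model-v1",
    "developer": "Example AI Ltd",
    "intended_uses": ["general-purpose assistant"],
    "out_of_scope_uses": ["medical diagnosis"],
    "training_data": "Audited web and book corpora; see accompanying datasheet",
    "evaluations": {"safety": ["refusal-rate suite"], "capability": ["exam benchmarks"]},
    "known_limitations": ["may produce inaccurate statements"],
}

for key, value in model_card.items():
    print(f"{key}: {value}")
```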
Share model-agnostic information
3.3.1 Industry Coordination
Share model-specific information
Sharing information about specific frontier AI models allows external actors to develop a more granular picture of ongoing AI development and potential risks that will need to be addressed.
3.3.1 Industry Coordination
Share different information with different parties
99 Other
Security controls including securing model weights
To ensure the safety of frontier AI, consideration of cyber security, protective security risk management, and insider risk mitigation is key. Cyber security, both of models and of the systems that deploy them, must be considered from the outset of development to ensure that the benefits of AI can be realised. Cyber security is a key underpinning of the safety, reliability, predictability, ethics, and potential regulatory compliance of an AI system.

To avoid putting safety or sensitive data at risk, it is important to consider the cyber security of AI systems, as well as of models in isolation, and to implement cyber security processes throughout the AI lifecycle, particularly where a component is a foundation for other systems. As AI systems advance, developers must maintain an awareness of possible attacks, identify vulnerabilities, and implement mitigations. Failure to do so risks designing vulnerabilities into future AI models and systems. A Secure by Design approach allows developers to 'bake in' security from the outset of design and development.

Cyber security must be considered in concert with physical and personnel security. Developing a coherent, holistic, risk-based, and proportionate security strategy, supported by effective governance structures, is essential. Where the compromise of an AI system could lead to tangible or widespread physical damage, significant loss of business operations, leakage of sensitive or confidential information, reputational damage, and/or legal challenge, it is important that AI security risks are treated as mission critical.
2.3.2 Access & Security Controls
Implement strong cyber security measures and processes (including security evaluations) across their AI systems, including underlying infrastructure and supply chains
2.3 Operations & Security
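One concrete element of securing model weights is integrity verification. A sketch follows, assuming weight shards are stored as `.bin` files (an assumption, not a standard), that fingerprints each shard so tampering or unauthorised modification can be detected:

```python
import hashlib
from pathlib import Path

def fingerprint_weights(checkpoint_dir: str) -> dict[str, str]:
    """Compute a SHA-256 digest per weight shard so any tampering with, or
    unauthorised modification of, stored model weights can be detected."""
    digests = {}
    for shard in sorted(Path(checkpoint_dir).glob("*.bin")):
        h = hashlib.sha256()
        with open(shard, "rb") as f:
            # Read in 1 MiB chunks so large checkpoints never load into memory.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        digests[shard.name] = h.hexdigest()
    return digests

# Compare the result against a signed manifest stored separately from the
# weights themselves, as one part of a broader access-control regime.
```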
Emerging processes for frontier AI safety
UK Department for Science, Innovation and Technology (2023)
The UK recognises the enormous opportunities that AI can unlock across our economy and our society. However, without appropriate guardrails, such technologies can pose significant risks. The AI Safety Summit will focus on how best to manage the risks from frontier AI, such as misuse, loss of control, and societal harms. Frontier AI organisations play an important role in addressing these risks and promoting the safety of the development and deployment of frontier AI.

The UK has therefore encouraged frontier AI organisations to publish details of their frontier AI safety policies ahead of the AI Safety Summit hosted by the UK on 1 to 2 November 2023. This will provide transparency regarding how they are putting voluntary AI safety commitments into practice and enable the sharing of safety practices within the AI ecosystem. Transparency of AI systems can increase public trust, which can be a significant driver of AI adoption.

This document complements these publications by providing a potential list of frontier AI organisations' safety policies. These have been gathered through extensive research and will need regular updating given the emerging nature of this technology. The safety processes are not listed in order of importance but are summarised in themes. The government is not suggesting or mandating any particular combination of policies, merely detailing the current suite available so that others can understand, interpret, and compare frontier companies' safety policies. This document contains the world's first overview of emerging safety processes focused on frontier AI and is intended to be a useful tool to boost transparency.

This document focuses on frontier AI. Whilst it is important that safety is applied throughout the AI sector, it is also important that innovation is not stifled; policies must therefore be proportionate and based on capabilities, which are the key driver of risk. The document contains processes and associated practices that some frontier AI organisations are already implementing, and others that are being considered within academia and broader civil society. It is intended as a guide for readers of frontier AI companies' AI safety policies to better understand what good policy might look like, though organisations themselves will be best placed to determine their applicability. Through this exercise, the government intends to help inform dialogue on potential appropriate measures for individual organisations to consider at the UK AI Safety Summit.
Collect and Process Data: Gathering, curating, labelling, and preprocessing training data
Developer: Entity that creates, trains, or modifies the AI system
Measure: Quantifying, testing, and monitoring identified AI risks
Primary: 1 Discrimination & Toxicity