Modifications to training data composition, quality, and filtering that affect what the model learns.
Auditing input data before training allows organisations to filter and validate the composition of the training corpus, shaping what the model ultimately learns.
Audit datasets used for pre-training, but also those used for fine-tuning, classifiers, and other tools
Inappropriate datasets could result in systems that comply with harmful instructions.
2.2.3 Auditing & Compliance
Use technical tools – such as classifiers and filters – to audit large datasets to support scalability and privacy
These could be used in combination with human oversight, which can verify and augment these assessments.
1.2.1 Guardrails & Filtering
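A minimal sketch of how such tooling might be wired together, assuming a trivial keyword matcher as a stand-in for trained classifiers (the categories, keywords, and review rate below are illustrative, not drawn from the document):

```python
import random

# Illustrative audit categories with a trivial keyword "classifier";
# a production pipeline would substitute trained classifiers per category.
AUDIT_KEYWORDS = {
    "dangerous_capability": ["synthesis route", "exploit chain"],
    "private_information": ["password", "home address"],
}

def flag_document(text: str) -> dict[str, bool]:
    """Flag a single document against each audit category."""
    lowered = text.lower()
    return {cat: any(kw in lowered for kw in kws)
            for cat, kws in AUDIT_KEYWORDS.items()}

def audit_corpus(docs: list[str], review_rate: float = 0.01):
    """First pass by machine for scalability; a random sample of flagged
    documents is then routed to human reviewers for verification."""
    flags = [flag_document(d) for d in docs]
    flagged = [d for d, f in zip(docs, flags) if any(f.values())]
    n_review = max(1, int(len(flagged) * review_rate)) if flagged else 0
    return flags, random.sample(flagged, n_review)
```

Sampling only a fraction of flagged documents for human review is one way to combine machine scalability with human verification while limiting how much raw data human reviewers are exposed to.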
Assess the overall composition of training data
This could include the data sources, the provenance of the data, indicators of data quality and integrity, and measures of bias and representativeness. The amount and variety of data are simple, reliable predictors of risk, and provide an additional line of defence where more targeted assessments are limited.
2.2.1 Risk Assessment
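One way such a composition assessment might be operationalised is to aggregate simple descriptive statistics per source and licence; the record fields and values below are illustrative assumptions:

```python
from collections import Counter

# Illustrative per-record provenance metadata; a real corpus would attach
# these fields at ingestion time.
corpus = [
    {"source": "web_crawl", "licence": "unknown", "tokens": 512},
    {"source": "books", "licence": "public_domain", "tokens": 2048},
    {"source": "web_crawl", "licence": "cc_by", "tokens": 256},
]

def composition_report(records: list[dict]) -> dict:
    """Summarise the amount and variety of data: token share per source and
    licence counts. Coarse measures like these are cheap and reliable, and
    act as a backstop where targeted assessments are limited."""
    tokens_by_source: Counter = Counter()
    licence_counts: Counter = Counter()
    for r in records:
        tokens_by_source[r["source"]] += r["tokens"]
        licence_counts[r["licence"]] += 1
    total = sum(tokens_by_source.values())
    return {
        "total_tokens": total,
        "share_by_source": {s: t / total for s, t in tokens_by_source.items()},
        "licence_counts": dict(licence_counts),
    }

print(composition_report(corpus))
```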
Audit datasets for various types of information
Including:
- Information that might enhance dangerous system capabilities, such as information about weapons manufacturing or terrorism.
- Private or sensitive information (see the sketch after this list). AI systems may be subject to data extraction attacks, where determined users can prompt systems to reveal pieces of training data, or may even reveal this information accidentally. This makes it important to know whether datasets include private or sensitive information, for example, names, addresses, or security vulnerabilities.
- Biases in the data. Training data that is imbalanced or inaccurate can result in an AI system being less accurate for people with certain personal characteristics or providing a skewed picture of particular groups. Ensuring a better balance in the training data could help to address this.[footnote 6]
- Harmful content, such as child sexual abuse materials, hate speech, or online abuse. A better understanding of harmful content in datasets can inform safety measures (for example, by highlighting domains where additional safeguards like content filters should be applied).
- Misinformation. Training an AI system on inaccurate information increases the likelihood that the system's outputs will be inaccurate and could lead to harm.
1.1.1 Training Data
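As an illustration of the private-information item above, a sketch of a regex-based PII scan; the two patterns are deliberately simplistic stand-ins for dedicated PII-detection tooling:

```python
import re

# Deliberately simplistic, illustrative patterns; production audits would
# use dedicated PII-detection tooling and far more robust rules.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def scan_for_pii(doc_id: str, text: str) -> list[dict]:
    """Emit one finding per match so results can feed a persistent audit log."""
    findings = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append({"doc": doc_id, "type": kind, "span": match.span()})
    return findings

print(scan_for_pii("doc-001", "Contact jane@example.com or 555-010-4477."))
```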
Draw on external expertise in conducting input data audits
For example, biosecurity experts could be consulted to identify information relevant to biological weapons manufacturing, which may not be readily obvious to non-experts.
2.2.1 Risk Assessment
Use data audits to improve understanding of how training data affects AI system behaviour
For example, if model evaluations reveal a potentially dangerous capability, data audits can help ascertain the extent to which the training data contributed to it.
2.2.1 Risk Assessment
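A sketch of one way this linkage might be made, under the assumption that a capability surfaced during evaluations can be associated with a set of search terms; the corpus records and terms are hypothetical:

```python
from collections import Counter

def attribute_capability(corpus: list[dict], terms: set[str]) -> Counter:
    """Count, per data source, how many documents mention any term associated
    with a capability surfaced during model evaluations."""
    hits: Counter = Counter()
    for record in corpus:
        text = record["text"].lower()
        if any(term in text for term in terms):
            hits[record["source"]] += 1
    return hits

corpus = [
    {"source": "forums", "text": "A thread on exploit development"},
    {"source": "books", "text": "A history of cryptography"},
]
print(attribute_capability(corpus, {"exploit"}))  # Counter({'forums': 1})
```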
Conduct audits on datasets used by their customers to fine-tune AI systems
Customers are often allowed to fine-tune systems on their own datasets. By carrying out audits to ensure that customers are not encouraging undesirable behaviours, frontier AI organisations can use their expertise and insight into the AI system's original training data to identify potential harms upstream. It is important that frontier AI organisations are mindful of privacy concerns and make use of privacy-preserving techniques, where appropriate.
2.2.3 Auditing & Compliance
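A hypothetical gate for customer fine-tuning jobs might look like the following; the `moderate` callable, the rejection threshold, and the field names are assumptions, and only aggregate counts are reported so the audit record itself does not expose customer data:

```python
def vet_finetune_dataset(examples: list[dict], moderate) -> dict:
    """Gate a customer fine-tuning job: reject it if too many examples are
    flagged by a moderation check. `moderate` is any callable returning True
    for disallowed content (a stand-in for a real moderation model)."""
    flagged = [ex for ex in examples
               if moderate(ex["prompt"]) or moderate(ex["completion"])]
    verdict = "reject" if flagged and len(flagged) / len(examples) > 0.01 else "accept"
    # Report aggregate counts only, not raw customer text, to limit
    # exposure of private customer data.
    return {"verdict": verdict, "n_examples": len(examples), "n_flagged": len(flagged)}

demo = [{"prompt": "Translate: hello", "completion": "bonjour"}]
print(vet_finetune_dataset(demo, moderate=lambda text: "attack" in text.lower()))
```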
Document the results of input data audits, including metadata
Frontier AI organisations could look to emerging standards, such as datasheets for datasets, when documenting the results of input data audits.
2.2.4 Assurance Documentation
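A minimal sketch of how audit results and metadata might be recorded in a datasheet-style structure; the field selection is an illustrative subset rather than a complete datasheet template, and all values are hypothetical:

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class Datasheet:
    """An illustrative subset of datasheet-style fields for recording the
    outcome of an input data audit, including metadata."""
    name: str
    motivation: str
    composition: dict        # e.g. source shares, languages
    collection_process: str
    audit_findings: dict     # e.g. counts of PII / harmful-content flags
    recommended_uses: list = field(default_factory=list)

sheet = Datasheet(
    name="web_crawl_v3",     # hypothetical dataset
    motivation="General-purpose pre-training text",
    composition={"web_crawl": 0.8, "books": 0.2},
    collection_process="Crawled Jan-Jun 2023; deduplicated",
    audit_findings={"pii_flags": 112, "harmful_content_flags": 9},
    recommended_uses=["pre-training with content filters applied"],
)
print(json.dumps(asdict(sheet), indent=2))
```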
Model reporting and information sharing

Transparency around frontier AI can help governments to effectively realise the benefits of AI and mitigate AI risks. It can also encourage the sharing of best practice across frontier AI organisations, enable users to make well-informed choices about whether and how to use AI systems, and increase public trust, helping to drive AI adoption. Reporting and sharing information, where appropriate, could ensure that different parties can access the information they need to support effective governance, develop best practice, inform decision-making about the use of AI systems, and build public trust. Some reporting practices – such as model cards – are already used among frontier AI organisations, whereas other practices included here are areas for future consideration. Given the recent rapid pace of progress in AI, the appropriate government and international governance institutions are still being considered, and we recognise that this limits the ability of frontier AI organisations to share information with governments, even where it would be desirable. Throughout this section, 'relevant government authorities' is used to indicate a good practice for information sharing with governments, while recognising that such authorities may still be under development.
3.3.1 Industry Coordination
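Model cards are one such reporting practice; a minimal illustrative example follows, with field names loosely modelled on common model-card templates rather than any mandated schema, and all values hypothetical:

```python
# A minimal, illustrative model card expressed as a plain record; the field
# names loosely follow common model-card templates and every value here is
# hypothetical, not taken from any real system.
model_card = {
    "model": "example-frontier-model-v1",
    "developer": "Example AI Ltd",
    "intended_uses": ["general-purpose assistant"],
    "out_of_scope_uses": ["medical diagnosis"],
    "training_data": "Audited web and book corpora; see accompanying datasheet",
    "evaluations": {"safety": ["refusal-rate suite"], "capability": ["exam benchmarks"]},
    "known_limitations": ["may produce inaccurate statements"],
}

for key, value in model_card.items():
    print(f"{key}: {value}")
```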
Share model-agnostic information
3.3.1 Industry Coordination
Share model-specific information
Sharing information about specific frontier AI models allows external actors to develop a more granular picture of ongoing AI development and potential risks that will need to be addressed.
3.3.1 Industry Coordination
Share different information with different parties
99 Other
Security controls including securing model weights
To ensure the safety of frontier AI, consideration of cyber security, protective security risk management, and insider risk mitigation is key. Cyber security, both of models and of the systems that deploy them, must be considered from the outset of development to ensure that the benefits of AI can be realised. Cyber security is a key underpinning of the safety, reliability, predictability, ethics, and potential regulatory compliance of an AI system.

To avoid putting safety or sensitive data at risk, it is important to consider the cyber security of AI systems, as well as of models in isolation, and to implement cyber security processes throughout the AI lifecycle, particularly where a component is a foundation for other systems. As AI systems advance, developers must maintain an awareness of possible attacks, identify vulnerabilities, and implement mitigations. Failure to do so risks designing vulnerabilities into future AI models and systems. A Secure by Design approach allows developers to 'bake in' security from the outset of design and development.

Cyber security must be considered in concert with physical and personnel security. Developing a coherent, holistic, risk-based, and proportionate security strategy, supported by effective governance structures, is essential. Where the compromise of an AI system could lead to tangible or widespread physical damage, significant loss of business operations, leakage of sensitive or confidential information, reputational damage, and/or legal challenge, it is important that AI security risks are treated as mission critical.
2.3.2 Access & Security Controls
Implement strong cyber security measures and processes (including security evaluations) across their AI systems, including underlying infrastructure and supply chains
2.3 Operations & Security
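One concrete element of securing model weights is integrity verification. A sketch follows, assuming weight shards are stored as `.bin` files (an assumption, not a standard), that fingerprints each shard so tampering or unauthorised modification can be detected:

```python
import hashlib
from pathlib import Path

def fingerprint_weights(checkpoint_dir: str) -> dict[str, str]:
    """Compute a SHA-256 digest per weight shard so any tampering with, or
    unauthorised modification of, stored model weights can be detected."""
    digests = {}
    for shard in sorted(Path(checkpoint_dir).glob("*.bin")):
        h = hashlib.sha256()
        with open(shard, "rb") as f:
            # Read in 1 MiB chunks so large checkpoints never load into memory.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        digests[shard.name] = h.hexdigest()
    return digests

# Compare the result against a signed manifest stored separately from the
# weights themselves, as one part of a broader access-control regime.
```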
Emerging processes for frontier AI safety
UK Department for Science, Innovation and Technology (2023)
The UK recognises the enormous opportunities that AI can unlock across our economy and our society. However, without appropriate guardrails, such technologies can pose significant risks. The AI Safety Summit will focus on how best to manage the risks from frontier AI, such as misuse, loss of control, and societal harms. Frontier AI organisations play an important role in addressing these risks and promoting the safety of the development and deployment of frontier AI.

The UK has therefore encouraged frontier AI organisations to publish details of their frontier AI safety policies ahead of the AI Safety Summit hosted by the UK on 1 to 2 November 2023. This will provide transparency regarding how they are putting voluntary AI safety commitments into practice and enable the sharing of safety practices within the AI ecosystem. Transparency of AI systems can increase public trust, which can be a significant driver of AI adoption.

This document complements these publications by providing a potential list of frontier AI organisations' safety policies. These have been gathered through extensive research and will need regular updating given the emerging nature of this technology. The safety processes are not listed in order of importance but are summarised in themes. The government is not suggesting or mandating any particular combination of policies, merely detailing the current suite available so that others can understand, interpret, and compare frontier companies' safety policies. This document contains the world's first overview of emerging safety processes focused on frontier AI and is intended to be a useful tool to boost transparency.

This document focuses on frontier AI. Whilst it is important that safety is applied throughout the AI sector, it is also important that innovation is not stifled; policies must therefore be proportionate and based on capabilities, which are the key driver of risk. The document contains processes and associated practices that some frontier AI organisations are already implementing, and others that are being considered within academia and broader civil society. It is intended as a guide for readers of frontier AI companies' AI safety policies to better understand what good policy might look like, though organisations themselves will be best placed to determine their applicability. Through this exercise, the government intends to help inform dialogue on potential appropriate measures for individual organisations to consider at the UK AI Safety Summit.
Collect and Process Data: Gathering, curating, labelling, and preprocessing training data
Developer: Entity that creates, trains, or modifies the AI system
Measure: Quantifying, testing, and monitoring identified AI risks
Primary: 1 Discrimination & Toxicity