This page is still being polished. If you have thoughts, please share them via the feedback form.
Data on this page is preliminary and may change. Please do not share or cite these figures publicly.
Red teaming, capability evaluations, adversarial testing, and performance verification.
Also in Risk & Assurance
Reasoning
Evaluates fairness and bias through testing; documents results as evidence of system assessment.
Apply use-case appropriate benchmarks (e.g., Bias Benchmark Questions, Real Hateful or Harmful Prompts, Winogender Schemas15) to quantify systemic bias, stereotyping, denigration, and hateful content in GAI system outputs; Document assumptions and limitations of benchmarks, including any actual or possible training/test data cross contamination, relative to in-context deployment environment.
2.2.2 Testing & EvaluationConduct fairness assessments to measure systemic bias. Measure GAI system performance across demographic groups and subgroups, addressing both quality of service and any allocation of services and resources. Quantify harms using: field testing with sub-group populations to determine likelihood of exposure to generated content exhibiting harmful bias, AI red-teaming with counterfactual and low-context (e.g., “leader,” “bad guys”) prompts. For ML pipelines or business processes with categorical or numeric outcomes that rely on GAI, apply general fairness metrics (e.g., demographic parity, equalized odds, equal opportunity, statistical hypothesis tests), to the pipeline or business outcome where appropriate; Custom, context-specific metrics developed in collaboration with domain experts and affected communities; Measurements of the prevalence of denigration in generated content in deployment (e.g., subsampling a fraction of traffic and manually annotating denigrating content)
2.2.2 Testing & EvaluationIdentify the classes of individuals, groups, or environmental ecosystems which might be impacted by GAI systems through direct engagement with potentially impacted communities.
2.2.1 Risk AssessmentReview, document, and measure sources of bias in GAI training and TEVV data: Differences in distributions of outcomes across and within groups, including intersecting groups; Completeness, representativeness, and balance of data sources; demographic group and subgroup coverage in GAI system training data; Forms of latent systemic bias in images, text, audio, embeddings, or other complex or unstructured data; Input data features that may serve as proxies for demographic group membership (i.e., image metadata, language dialect) or otherwise give rise to emergent bias within GAI systems; The extent to which the digital divide may negatively impact representativeness in GAI system training and TEVV data; Filtering of hate speech or content in GAI system training data; Prevalence of GAI-generated data in GAI system training data
2.2.1 Risk AssessmentAssess the proportion of synthetic to non-synthetic training data and verify training data is not overly homogenous or GAI-produced to mitigate concerns of model collapse.
1.1.1 Training DataLegal and regulatory requirements involving AI are understood, managed, and documented.
2.1.3 Policies & ProceduresLegal and regulatory requirements involving AI are understood, managed, and documented. > Align GAI development and use with applicable laws and regulations, including those related to data privacy, copyright and intellectual property law.
2.1.3 Policies & ProceduresThe characteristics of trustworthy AI are integrated into organizational policies, processes, procedures, and practices.
2.1.3 Policies & ProceduresThe characteristics of trustworthy AI are integrated into organizational policies, processes, procedures, and practices. > Establish transparency policies and processes for documenting the origin and history of training data and generated data for GAI applications to advance digital content transparency, while balancing the proprietary nature of training approaches.
2.1.3 Policies & ProceduresThe characteristics of trustworthy AI are integrated into organizational policies, processes, procedures, and practices. > Establish policies to evaluate risk-relevant capabilities of GAI and robustness of safety measures, both prior to deployment and on an ongoing basis, through internal and external evaluations.
2.1.3 Policies & ProceduresProcesses, procedures, and practices are in place to determine the needed level of risk management activities based on the organization’s risk tolerance.
2.1.3 Policies & ProceduresArtificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile (NIST AI 600-1)
US National Institute of Standards and Technology (NIST) (2024)
This document is a cross-sectoral profile of and companion resource for the AI Risk Management Framework (AI RMF 1.0) for Generative AI, 1 pursuant to President Biden’s Executive Order (EO) 14110 on Safe, Secure, and Trustworthy Artificial Intelligence.2 The AI RMF was released in January 2023, and is intended for voluntary use and to improve the ability of organizations to incorporate trustworthiness considerations into the design, development, use, and evaluation of AI products, services, and systems.
Verify and Validate
Testing, evaluating, auditing, and red-teaming the AI system
Developer
Entity that creates, trains, or modifies the AI system
Measure
Quantifying, testing, and monitoring identified AI risks