Safety & Trustworthiness
"A comprehensive assessment of LLM safety is fundamental to the responsible development and deployment of these technologies, especially in sensitive fields like healthcare, legal systems, and finance, where safety and trust are of the utmost importance."(p. 11)
Sub-categories (6)
Toxicity generation
"These evaluations assess whether a LLM generates toxic text when prompted. In this context, toxicity is an umbrella term that encompasses hate speech, abusive language, violent speech, and profane language (Liang et al., 2022)."
1.2 Exposure to toxic content

Bias
7 types of bias evaluated:
(1) Demographic representation: These evaluations assess whether there is disparity in the rates at which different demographic groups are mentioned in LLM-generated text. This ascertains over-representation, under-representation, or erasure of specific demographic groups;
(2) Stereotype bias: These evaluations assess whether there is disparity in the rates at which different demographic groups are associated with stereotyped terms (e.g., occupations) in a LLM's generated output;
(3) Fairness: These evaluations assess whether sensitive attributes (e.g., sex and race) impact the predictions of LLMs;
(4) Distributional bias: These evaluations assess the variance in offensive content in a LLM's generated output for a given demographic group, compared to other groups;
(5) Representation of subjective opinions: These evaluations assess whether LLMs equitably represent diverse global perspectives on societal issues (e.g., whether employers should give job priority to citizens over immigrants);
(6) Political bias: These evaluations assess whether LLMs display any slant or preference towards certain political ideologies or views;
(7) Capability fairness: These evaluations assess whether a LLM's performance on a task is unjustifiably different across different groups and attributes (e.g., whether a LLM's accuracy degrades across different English varieties).
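The first bias type above, demographic representation, can be sketched as a simple mention-rate comparison across a corpus of model generations. The tiny group lexicons and the `mention_rates` helper below are illustrative assumptions; real evaluations use much larger term lists and corpora.

```python
from collections import Counter

# Illustrative demographic-representation check: compare how often
# different group terms appear across model generations. A large gap
# between rates suggests over- or under-representation of a group.
GROUP_TERMS = {
    "women": {"woman", "women", "she", "her"},
    "men": {"man", "men", "he", "his"},
}

def mention_rates(outputs: list[str]) -> dict[str, float]:
    """Share of all group-term mentions attributable to each group."""
    counts = Counter()
    for text in outputs:
        words = [w.strip(".,!?").lower() for w in text.split()]
        for group, terms in GROUP_TERMS.items():
            counts[group] += sum(w in terms for w in words)
    total = sum(counts.values()) or 1  # avoid division by zero
    return {g: counts[g] / total for g in GROUP_TERMS}
```

Erasure shows up in this framing as a rate near zero for a group that the prompts should plausibly elicit.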
1.1 Unfair discrimination and misrepresentation

Machine ethics
"These evaluations assess the morality of LLMs, focusing on issues such as their ability to distinguish between moral and immoral actions, and the circumstances in which they fail to do so."
7.3 Lack of capability or robustness

Psychological traits
"These evaluations gauge a LLM's output for characteristics that are typically associated with human personalities (e.g., such as those from the Big Five Inventory). These can, in turn, shed light on the potential biases that a LLM may exhibit."
7.3 Lack of capability or robustness

Robustness
"These evaluations assess the quality, stability, and reliability of a LLM's performance when faced with unexpected, out-of-distribution or adversarial inputs. Robustness evaluation is essential in ensuring that a LLM is suitable for real-world applications by assessing its resilience to various perturbations."
7.3 Lack of capability or robustness

Data governance
"These evaluations assess the extent to which LLMs regurgitate their training data in their outputs, and whether LLMs 'leak' sensitive information that has been provided to them during use (i.e., during the inference stage)."
2.1 Compromise of privacy by leaking or correctly inferring sensitive information

Other risks from InfoComm Media Development Authority & AI Verify Foundation (2023) (22)
Extreme Risks
7.0 AI System Safety, Failures & Limitations

Extreme Risks > Offensive cyber capabilities
4.2 Cyberattacks, weapon development or use, and mass harm

Extreme Risks > Weapons acquisition
4.2 Cyberattacks, weapon development or use, and mass harm

Extreme Risks > Self and situation awareness
7.2 AI possessing dangerous capabilities

Extreme Risks > Autonomous replication / self-proliferation
7.2 AI possessing dangerous capabilities

Extreme Risks > Persuasion and manipulation
4.1 Disinformation, surveillance, and influence at scale