Reliability

Trustworthy LLMs: A Survey and Guideline for Evaluating Large Language Models’ Alignment

Liu et al. (2024)

Category: Risk Domain

AI systems may inadvertently generate or spread incorrect or deceptive information, which can lead users to form inaccurate beliefs and undermine their autonomy. Humans who make decisions based on false beliefs can suffer physical, emotional, or material harms.

Generating correct, truthful, and consistent outputs with proper confidence (p. 8)

Supporting Evidence (1)

1. "The primary function of an LLM is to generate informative content for users. Therefore, it is crucial to align the model so that it generates reliable outputs. Reliability is a foundational requirement because unreliable outputs would negatively impact almost all LLM applications, especially ones used in high-stake sectors such as health-care [43, 44, 45] and finance [46, 47]. The meaning of reliability is many-sided. For example, for factual claims such as historical events and scientific facts, the model should give a clear and correct answer. This is important to avoid spreading misinformation and build user trust. Going beyond factual claims, making sure LLMs do not hallucinate or make up factually wrong claims with confidence is another important goal. Furthermore, LLMs should “know what they do not know” – recent works on uncertainty in LLMs have started to tackle this problem [48] but it is still an ongoing challenge." (p. 9)
