Trustworthy LLMs: A Survey and Guideline for Evaluating Large Language Models’ Alignment

MIT AI Risk Repository – Blog (Global) 22 Feb 2026 52

A structured LLM alignment taxonomy offers a reference point for APS agencies developing AI risk assessment frameworks or evaluation criteria.

Key points

A 2023 academic paper proposes a taxonomy of seven major LLM trustworthiness categories, divided into 29 subcategories.
The MIT AI Risk Repository spotlights this taxonomy as one of 30 risk frameworks it has catalogued, which makes it useful for APS risk inventory work.
The paper itself dates from 2023; the blog post adds no new analysis beyond the repository spotlight.

Consider: Agencies developing AI risk registers or evaluation criteria could assess whether this taxonomy's seven categories map usefully onto their existing risk categorisation structures.
Monitor: Policy teams tracking the MIT AI Risk Repository may want to watch the full set of 30 frameworks it has catalogued for emerging patterns in AI risk classification.

Implications are AI-generated. Starting points, not advice — see methodology for how they're framed.