Model Evaluation for Extreme Risks

MIT AI Risk Repository – Blog (Global) 6 Feb 2026 55

Agencies developing AI risk frameworks can draw on this taxonomy of dangerous capabilities to stress-test their own assessment approaches.

Key points

A 2023 paper proposes embedding model evaluation for dangerous capabilities and alignment into AI governance processes.
Nine dangerous capability categories are identified, including cyber-offense, deception, self-proliferation, and situational awareness.
The MIT AI Risk Repository surfaces this paper as one of 25 risk frameworks, useful reference material for agencies building AI risk taxonomies.
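
One way an agency might stress-test its own framework against the taxonomy is a simple coverage check. The sketch below is illustrative only: it includes just the four categories named in this summary (the paper's remaining five are not reproduced here), and the function name `coverage_gaps` and the sample agency framework are hypothetical.

```python
# Hypothetical sketch: only the four capability categories named in this
# summary are listed; the paper's other five are intentionally omitted.
NAMED_CATEGORIES = {
    "cyber-offense",
    "deception",
    "self-proliferation",
    "situational awareness",
}


def coverage_gaps(framework_risks):
    """Return the named categories the framework does not mention, sorted."""
    covered = {risk.strip().lower() for risk in framework_risks}
    return sorted(NAMED_CATEGORIES - covered)


# Invented example of an agency's existing risk list.
agency_framework = ["Cyber-offense", "Deception", "Privacy breach"]
print(coverage_gaps(agency_framework))
```

An exact-name match like this is deliberately crude. In practice an agency would likely need to map its own terminology onto the paper's categories by hand before any comparison is meaningful.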

Consider: Agencies developing AI risk assessment or procurement evaluation frameworks may want to consult this taxonomy of dangerous capabilities as a reference point.
Monitor: Risk and assurance teams could monitor the MIT AI Risk Repository as it catalogues further frameworks, given its utility as a curated evidence base for AI governance work.

Implications are AI-generated. Starting points, not advice — see methodology for how they're framed.