Import AI 446: Nuclear LLMs; China's big AI benchmark; measurement and AI policy

23 Feb 2026 · Import AI – Substack (Jack Clark) Global

The measurement-as-governance thesis directly supports the case for investing in AI evaluation capability inside Australian agencies and AISI.

Key points

Summary

This edition of Import AI covers three distinct threads. First, Jacob Steinhardt argues that building technical measurement infrastructure is the single most tractable AI governance intervention, noting that the field is talent-constrained and that measurement must precede effective policy. Second, a King's College London study finds that LLMs placed in simulated nuclear crises escalate more aggressively than humans: they never choose de-escalatory options and treat nuclear use as an ordinary strategic tool. Third, Chinese researchers have released ForesightSafety Bench, a comprehensive AI safety evaluation framework that closely parallels Western equivalents, which suggests convergence on safety evaluation norms despite geopolitical differences.

Implications for Australian agencies

Implications are AI-generated. Starting points, not advice.