Submit Your Toughest Questions for Humanity's Last Exam

9 May 2026 · Centre for AI Safety – Blog Global

Benchmark saturation matters for AI governance because it complicates capability assessments that underpin risk frameworks and procurement decisions.

Key points

Summary

The Centre for AI Safety and Scale AI have launched 'Humanity's Last Exam', an initiative to build a harder public AI benchmark by crowdsourcing expert-level questions across all fields. The project is motivated by the rapid saturation of existing benchmarks such as MMLU, on which frontier models now approach ceiling performance, making it difficult to assess how close AI systems are to expert-level capability. Contributors whose questions are accepted are offered co-authorship and a share of a $500,000 prize pool. The submission deadline was 1 November 2024.

Implications for Australian agencies

Implications are AI-generated. Starting points, not advice.