Superhuman Automated Forecasting

9 May 2026 · Centre for AI Safety – Blog Global

AI forecasting tools are approaching crowd-level accuracy, a near-term capability that APS policy and risk teams may encounter in vendor proposals or decision-support contexts.

Summary

The Centre for AI Safety has published FiveThirtyNine, a GPT-4o-based forecasting bot that matches the accuracy of crowd forecasters on a 177-question Metaculus evaluation set: 87.7% accuracy versus the crowd's 87.0%. To respond to arbitrary queries, the bot combines structured web search, reason-weighing, and bias-adjusted probability outputs. CAIS positions the tool as a potential aid for policymakers and public discourse, citing advantages in speed and cost over prediction markets. Known limitations include the risk of automation bias, the absence of fine-tuning, poor performance on very recent events, and no option to reject invalid queries.
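To put the headline figures in proportion, the short Python sketch below converts the two reported accuracies into approximate question counts and compares the gap with a rough standard error for a 177-question sample. It is a back-of-envelope illustration only, not CAIS's evaluation method. The function names are hypothetical, and the standard error assumes the questions are independent, which the source does not state.

```python
import math

# Figures taken from the summary above.
N_QUESTIONS = 177        # size of the Metaculus evaluation set
BOT_ACCURACY = 0.877     # reported accuracy of the bot
CROWD_ACCURACY = 0.870   # reported accuracy of the crowd


def approx_correct(accuracy: float, n: int) -> float:
    """Approximate number of correct answers implied by a reported accuracy."""
    return accuracy * n


def binomial_standard_error(accuracy: float, n: int) -> float:
    """Standard error of a proportion (assumption: questions are independent)."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n)


# Difference between bot and crowd, expressed in questions rather than percent.
gap_in_questions = (
    approx_correct(BOT_ACCURACY, N_QUESTIONS)
    - approx_correct(CROWD_ACCURACY, N_QUESTIONS)
)

# Rough sampling uncertainty on the bot's accuracy at this sample size.
bot_se = binomial_standard_error(BOT_ACCURACY, N_QUESTIONS)

print(f"Gap between bot and crowd: about {gap_in_questions:.1f} questions")
print(f"Standard error on bot accuracy: about {bot_se * 100:.1f} percentage points")
```

Under these assumptions the 0.7-point gap corresponds to roughly one question out of 177 and is smaller than the sampling uncertainty, which is consistent with describing the bot as matching the crowd rather than beating it.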

Implications for Australian agencies

Implications are AI-generated. Treat them as starting points, not advice.