Import AI 453: Breaking AI agents; MirrorCode; and ten views on gradual disempowerment
AI agent attack vectors and long-horizon coding capability both have direct implications for how APS agencies assess AI deployment risk and vendor assurance.
Key points
- The MirrorCode benchmark shows AI can autonomously reimplement complex software exceeding 16,000 lines of code.
- Google DeepMind paper identifies six attack genres against AI agents, with technical and legal mitigations proposed.
- AI agent security is framed as an ecosystem-level problem requiring standards, liability reform, and red teaming.
Summary
This edition of Import AI covers three items. First, METR and Epoch AI's MirrorCode benchmark demonstrates that frontier models can autonomously reimplement sophisticated software—a capability previously requiring weeks of human expert effort. Second, a Google DeepMind paper categorises six attack genres against AI agents—including content injection, semantic manipulation, and systemic attacks—alongside a layered mitigation framework spanning technical controls, ecosystem standards, and legal liability. Third, the Windfall Policy Atlas catalogues 48 policy responses to transformative AI across five categories, providing a navigable tool for policy exploration.
Implications for Australian agencies
- [Consider] Agencies deploying or evaluating AI agents may want to assess their current security posture against the six attack genres identified by Google DeepMind, particularly for agentic use cases with external data access.
- [Monitor] AI governance teams may want to track MirrorCode and similar benchmarks as an evidence base for assessing vendors' AI capability claims when procuring agentic software tools.
Implications are AI-generated. Starting points, not advice.
"Import AI 453: Breaking AI agents; MirrorCode; and ten views on gradual disempowerment"
Source: Import AI – Substack (Jack Clark)
Published: 13 April 2026
URL: https://importai.substack.com/p/import-ai-453-breaking-ai-agents
Retrieved from SIMS, 18 May 2026.