Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements

MIT AI Risk Repository – Blog (Global) 18 Sep 2024 55

Provides a structured taxonomy of LLM safety risks that APS governance and risk teams can use when assessing generative AI deployments.

Key points

MIT AI Risk Repository summarises a survey identifying seven core safety risks in generative language models.
Risk categories include toxic content, hallucination, privacy leakage, and malicious use, all directly relevant to APS AI governance frameworks.
Survey is from 2023 (arXiv:2302.09270); useful as a taxonomy reference but not cutting-edge given rapid field evolution.

Consider: Governance and risk teams could assess whether this taxonomy aligns with or usefully supplements existing agency AI risk frameworks and assessment templates.
Monitor: Policy teams may want to monitor the MIT AI Risk Repository more broadly as a curated source of risk frameworks relevant to responsible AI work.

Implications are AI-generated. Starting points, not advice — see methodology for how they're framed.