
Quoting a member of Anthropic’s alignment-science team

Simon Willison

The point of the blackmail exercise was to have something to describe to policymakers—results that are visceral enough to land with people, and make misalignment risk actually salient in practice for people who had never thought about it before.

— A member of Anthropic’s alignment-science team, as told to Gideon Lewis-Kraus

Tags: ai-ethics, anthropic, claude, generative-ai, ai, llms


Posted 16th March 2026 at 9:38 pm
