
An update on recent Claude Code quality reports

Simon Willison · Archived Apr 24, 2026


Posted 24th April 2026 at 1:31 am

It turns out the high volume of complaints that Claude Code was providing worse quality results over the past two months was grounded in real problems. The models themselves were not to blame, but three separate issues in the Claude Code harness caused complex but material problems which directly affected users.

Anthropic's postmortem describes these in detail. This one in particular stood out to me:

"On March 26, we shipped a change to clear Claude's older thinking from sessions that had been idle for over an hour, to reduce latency when users resumed those sessions. A bug caused this to keep happening every turn for the rest of the session instead of just once, which made Claude seem forgetful and repetitive."

I frequently have Claude Code sessions which I leave for an hour (or often a day or longer) before returning to them. Right now I have 11 of those (according to ps aux | grep 'claude ') and that's after closing down dozens more the other day. I estimate I spend more time prompting in these "stale" sessions than sessions that I've recently started!

If you're building agentic systems it's worth reading this article in detail: the kinds of bugs that affect harnesses are deeply complicated, even if you put aside the inherent non-deterministic nature of the models themselves.
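
To make the idle-session bug quoted above a little more concrete, here is a minimal Python sketch of one way a "clear once after an hour idle" rule could end up firing on every turn. This is purely an illustration: the `Session` class, the `turn_buggy` and `turn_fixed` functions, and the stale-timestamp cause are all assumptions invented for this example, not Anthropic's code or the actual root cause.

```python
from dataclasses import dataclass, field

IDLE_THRESHOLD_S = 3600  # "idle for over an hour", per the quoted postmortem


@dataclass
class Session:
    # Hypothetical session model; none of these names come from Claude Code.
    last_active: float
    thinking: list = field(default_factory=list)
    clears: int = 0


def turn_buggy(session: Session, now: float, new_thinking: str) -> None:
    # Assumed bug shape: last_active is never refreshed, so once the session
    # has been idle for an hour the check stays true on every later turn.
    if now - session.last_active > IDLE_THRESHOLD_S:
        session.thinking.clear()
        session.clears += 1
    session.thinking.append(new_thinking)


def turn_fixed(session: Session, now: float, new_thinking: str) -> None:
    if now - session.last_active > IDLE_THRESHOLD_S:
        session.thinking.clear()
        session.clears += 1
    session.last_active = now  # refresh so the clear happens only once
    session.thinking.append(new_thinking)


def simulate(turn, n_turns: int = 3) -> Session:
    # The user returns after two hours, then sends one turn per minute.
    session = Session(last_active=0.0, thinking=["old reasoning"])
    start = 2 * IDLE_THRESHOLD_S
    for i in range(n_turns):
        turn(session, start + i * 60, f"thought {i}")
    return session
```

In this toy model, `simulate(turn_buggy)` clears the thinking on all three turns and retains only the latest thought, while `simulate(turn_fixed)` clears once on resume and then accumulates thoughts normally, which matches the "forgetful and repetitive" symptom described in the quote.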