claude opus 4.8 ships with honesty improvements

source: simon willison: claude opus 4.8: "a modest but tangible improvement"

level: technical

anthropic released claude opus 4.8, described as a modest but tangible improvement over its predecessor. the company noted that users will find it a minor incremental upgrade, and they are working on lower-cost models with similar capabilities. pricing remains at $5 per million input tokens and $25 per million output tokens, with a fast mode at double the cost, now reduced from previous models. the context window stays at 1 million tokens, with a 128,000 token max output, and both knowledge and training data cutoffs are january 2026.

a key focus is honesty. the model is about four times less likely to let flaws in its own code go unremarked, and it had the lowest incorrect rate on benchmarks by abstaining when uncertain rather than guessing. early testers report it flags uncertainties more often and makes fewer unsupported claims. this aligns with anthropic's training to avoid unsupported statements, addressing a common issue where models confidently assert progress without evidence.

new features include mid-conversation system messages, allowing updated instructions without restating the full prompt, which preserves cache hits and reduces costs in agentic loops. the minimum cacheable prompt length dropped from 4,096 to 1,024 tokens, improving efficiency. these changes aim to make long-running conversations more practical and cost-effective for developers building on the api.

why it matters: reduced hallucination and mid-conversation steering can make ai agents more reliable and cheaper to run in production.

source: simon willison: claude opus 4.8: "a modest but tangible improvement"