source: Simon Willison: Quoting OpenAI
level: technical
OpenAI announced a limited preview of its GPT-5.6 model series, introducing three variants: Sol, the flagship model; Terra, a balanced option for everyday tasks; and Luna, a fast, low-cost model. Terra matches GPT-5.5 performance at half the price, while Luna provides strong capabilities at the lowest cost. The preview is initially restricted to a small group of trusted partners, with broader availability planned in the coming weeks after coordination with the U.S. government.
Pricing is set per million tokens: Sol costs $5 for input and $30 for output, Terra costs $2.50 for input and $15 for output, and Luna costs $1 for input and $6 for output. The series also introduces more predictable prompt caching, including explicit cache breakpoints and a minimum cache life of 30 minutes. Cache writes are billed at 1.25 times the uncached input rate, while cache reads keep a 90% discount on cached input.
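To make the pricing concrete, here is a minimal Python sketch of a per-request cost estimate built only from the figures quoted above. The function name `request_cost`, its parameters, and the lowercase model keys are hypothetical names chosen for illustration and are not part of any OpenAI API. It also assumes that a cache read is billed at 10% of the uncached input rate, which is one reading of the "90% discount".

```python
# Per-million-token prices in USD, as quoted in the announcement:
# model key -> (input price, output price). Keys are illustrative only.
PRICES = {
    "sol": (5.00, 30.00),
    "terra": (2.50, 15.00),
    "luna": (1.00, 6.00),
}

CACHE_WRITE_MULTIPLIER = 1.25  # cache writes: 1.25x the uncached input rate
CACHE_READ_MULTIPLIER = 0.10   # assumption: 90% discount means 10% of the input rate


def request_cost(model, uncached_in=0, cache_write=0, cache_read=0, output=0):
    """Estimate the USD cost of one request from its token counts."""
    in_rate, out_rate = PRICES[model]
    # Convert every input category into "uncached-equivalent" tokens.
    billable_input = (
        uncached_in
        + cache_write * CACHE_WRITE_MULTIPLIER
        + cache_read * CACHE_READ_MULTIPLIER
    )
    return (in_rate * billable_input + out_rate * output) / 1_000_000


if __name__ == "__main__":
    # A Terra request that reuses a 100,000-token cached prefix,
    # adds 2,000 fresh input tokens, and produces 500 output tokens.
    cost = request_cost("terra", uncached_in=2_000, cache_read=100_000, output=500)
    print(f"estimated cost: ${cost:.4f}")
```

Keeping the multipliers as named constants makes it easy to adjust the estimate if the billing rules turn out to differ from this reading.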
The tiered approach allows developers to choose models based on cost and performance needs, from high-end reasoning to budget-friendly applications. The caching improvements aim to reduce latency and costs for repeated prompts, making the models more efficient for production use. OpenAI's engagement with the government suggests a focus on safety and controlled rollout before wider release.
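As a rough check on when caching pays off for repeated prompts, the quoted multipliers give a simple break-even. This is a sketch under stated assumptions: the first request pays the 1.25x write rate on a shared prefix, and every later request within the cache lifetime is a cache hit billed at 10% of the input rate. The symbols are introduced here for illustration: $p$ is the uncached input price per token, $N$ the prefix length in tokens, and $k$ the number of later requests that reuse the prefix.

```latex
\begin{align*}
C_{\text{uncached}} &= (k + 1)\,N p \\
C_{\text{cached}}   &= 1.25\,N p + 0.10\,k\,N p \\
C_{\text{cached}} < C_{\text{uncached}}
  &\iff 1.25 + 0.10\,k < 1 + k \\
  &\iff k > \frac{0.25}{0.90} \approx 0.28
\end{align*}
```

Under these assumptions a single reuse of the prefix ($k = 1$) already makes caching cheaper, and the result is independent of the model tier because $p$ cancels out.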
Why it matters: The new pricing and caching features help AI developers optimize costs and performance when integrating large language models into applications.