OpenAI announced that GPT-5.1 is now available to developers through the API on all paid tiers, with pricing aligned with the existing GPT-5 model. Developers can therefore switch an existing GPT-5 integration to GPT-5.1 and gain stronger reasoning and instruction following without a higher per-call price and without changes to their cost structure or quota configuration.
OpenAI also launched GPT-5.1-Codex and GPT-5.1-Codex-Mini, two models designed for long-running coding and agentic development work such as code generation, refactoring, and automated development workflows. With base prices unchanged, OpenAI has extended prompt cache retention to a maximum of 24 hours for GPT-5.1 and its Codex variants. The same long context can then be reused across multi-turn sessions or ongoing tasks, which lowers overall cost and reduces the latency of requests that would otherwise start from a cold cache.
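As a minimal sketch of what the switch might look like in practice, the helper below builds a request body as a plain dictionary and sends nothing over the network. The field names `model`, `instructions`, and `input` are modeled on the shape of OpenAI's Responses API, and `prompt_cache_retention` with the value "24h" is an assumed parameter name for the extended retention; the function name `build_request` is hypothetical. Check all of these against the official API reference before relying on them.

```python
import json


def build_request(user_input, instructions, model="gpt-5.1", cache_retention="24h"):
    """Build a request body as a dict. Nothing is sent over the network."""
    payload = {
        # Moving from GPT-5 to GPT-5.1 is a change of this one string.
        "model": model,
        # Long, stable context goes first so it can be reused across calls.
        "instructions": instructions,
        # The part that changes from call to call.
        "input": user_input,
    }
    if cache_retention is not None:
        # Assumed parameter name and value; verify in the official docs.
        payload["prompt_cache_retention"] = cache_retention
    return payload


if __name__ == "__main__":
    request = build_request(
        user_input="Refactor utils.py to remove duplicated helpers.",
        instructions="You are a careful coding assistant.",
        model="gpt-5.1-codex",
    )
    print(json.dumps(request, indent=2))
```

Keeping the model name as a single argument means an existing GPT-5 integration can be pointed at GPT-5.1 or one of the Codex variants without touching the rest of the call site.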
FAQs
Q: What is the price change of GPT-5.1 in the API?
A: OpenAI states that GPT-5.1 is billed the same as GPT-5, with the same unit prices and rate limits. In short, the capabilities are upgraded while the price stays the same.
Q: What are gpt-5.1-codex and gpt-5.1-codex-mini mainly used for?
A: Both models are optimized for long-running coding tasks and suit scenarios such as coding agents, automated refactoring, and large-scale project migrations. Compared with GPT-5.1, they focus more on the stability and sustainability of engineering workflows.
Q: What is the benefit of extending the prompt cache to 24 hours?
A: In complex projects, long system prompts or large codebase contexts can stay in the prompt cache and be reused across calls for up to 24 hours, so developers do not pay the full input price for the same context on every request. This lowers the context cost of long sessions and long-running tasks and reduces request latency.
Q: Does 24-hour caching only work for GPT-5.1?
A: The extended prompt cache duration currently applies mainly to GPT-5.1 and related models in its family, including gpt-5.1-codex and gpt-5.1-codex-mini. The exact scope is defined by the official documentation.
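To make the cost effect of caching concrete, here is a rough back-of-the-envelope sketch. Every number in it is hypothetical: the price per million tokens, the discount applied to cached tokens, and the token counts are placeholders rather than OpenAI's published rates. It also assumes that every call after the first hits the cache within the retention window, which real workloads may not achieve.

```python
def estimate_input_cost(prefix_tokens, new_tokens_per_call, calls,
                        price_per_million, cached_discount):
    """Return (uncached_cost, cached_cost) for the input tokens of a session.

    prefix_tokens: size of the shared context reused on every call
    cached_discount: fraction taken off the price of cached tokens (0.0 to 1.0)
    """
    per_token = price_per_million / 1_000_000

    # Without caching, every call pays full price for prefix plus new tokens.
    uncached = calls * (prefix_tokens + new_tokens_per_call) * per_token

    # With caching, the first call pays full price for the prefix and each
    # later call pays the discounted rate for it (assumes every later call hits).
    later_calls = calls - 1
    prefix_cost = prefix_tokens * per_token * (1 + later_calls * (1 - cached_discount))
    cached = prefix_cost + calls * new_tokens_per_call * per_token
    return uncached, cached


if __name__ == "__main__":
    # Hypothetical figures chosen only to illustrate the arithmetic.
    uncached, cached = estimate_input_cost(
        prefix_tokens=50_000,
        new_tokens_per_call=1_000,
        calls=20,
        price_per_million=1.00,
        cached_discount=0.9,
    )
    print(f"uncached: {uncached:.3f}  cached: {cached:.3f}")
```

The longer the shared prefix is relative to the new tokens in each call, the larger the gap between the two figures, which is why a longer retention window matters most for sessions built around a large system prompt or codebase context.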