    Developers gain major speed and cost savings with new GPT-5.1 update

    By admin · November 14, 2025 · 10 min read

    Image credit: Elyse Betters Picaro/ZDNET


    ZDNET’s key takeaways

    • GPT-5.1 speeds up coding with adaptive and no-reasoning modes.
    • New prompt caching cuts API costs for embedded app developers.
    • New tools make AI agents more capable inside modern IDEs.

    OpenAI is back with a new 5.1 update to its GPT-5 large language model. GPT-5 was introduced in August, which is decades ago in AI's warp-speed version of our universe.

    OpenAI is, of course, using AI to help it code faster. After all, it’s in a race with the other big players to get that trajillion-dollar valuation. Besides, it’s been proven beyond a shadow of a doubt that AI coding, in the hands of a professional coder, is an almost magical force multiplier and project accelerator.

    (Disclosure: Ziff Davis, ZDNET’s parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

    Also: OpenAI’s GPT-5.1 makes ChatGPT ‘warmer’ and smarter – how its upgraded modes work now

    For an overview of GPT-5.1's benefits for consumer chatbot users, read Senior Editor Sabrina Ortiz's explainer. But if you're interested in using AI in your coding, or embedding it in your software, keep reading. This release has some tangible speed and cost-saving benefits.

    In this article, we’re talking about GPT-5.1 in the API. In other words, we’re looking at sending prompts to the AI via a program’s function call, and getting back a result as the return value to that call.
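
    To make that function-call framing concrete, here's a minimal sketch using OpenAI's official Python SDK and its Responses API. Treat the "gpt-5.1" model string as my assumption for this release; check the model names your account actually exposes.

        # Minimal sketch: send a prompt, get the model's answer back as a return value.
        # Assumes the official openai Python package and OPENAI_API_KEY in the environment;
        # the "gpt-5.1" model name is an assumption, so confirm it against OpenAI's model list.
        from openai import OpenAI

        client = OpenAI()

        response = client.responses.create(
            model="gpt-5.1",
            input="What WP-CLI command shows the list of installed plugins?",
        )

        print(response.output_text)  # the model's reply, returned like any function result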

    This API-driven AI functionality works within the software products developers build, and because developer tools themselves also use the API to provide intelligence, the update makes those tools more useful too. This also benefits developers using OpenAI's Codex coding agent, because Codex is now available in a 5.1 release.

    Also: The best free AI courses and certificates for upskilling in 2025 – and I’ve tried them all

    JetBrains, for example, is a maker of excellent development tools. Although I moved off the JetBrains platform because VS Code is much more widely used (and I often need to talk to you about it), JetBrains products are still some of my favorites. In fact, using VS Code, I sometimes miss some of JetBrains' features.

    That’s why it was so interesting when Denis Shiryaev, head of AI DevTools Ecosystem at JetBrains, described the company’s experience with this new GPT-5.1 release in an OpenAI blog post. He said, “GPT 5.1 isn’t just another LLM — it’s genuinely agentic, the most naturally autonomous model I’ve ever tested.”

    “It writes like you, codes like you, effortlessly follows complex instructions, and excels in front-end tasks, fitting neatly into your existing codebase,” he said.

    Let’s look at some of the reasons why GPT-5.1 is getting such an enthusiastic response.

    Adaptive reasoning

    I found coding with GPT-5 to be astonishingly powerful, but occasionally tedious. No matter what I asked the AI, the response took time. Even the simplest question could take a few minutes to return an answer. That's because every query, simple or complex, was sent to the same model and got the same heavyweight reasoning treatment.

    GPT-5.1 evaluates each prompt and, based on whether the question is basically easy or hard, adjusts how much cognitive effort it puts into the answer. This means simple questions no longer suffer the delay that was so frustrating with the older coding model.

    Here’s a prompt I gave GPT-5 just a few days ago: “Please check my work. I’ve been renaming EDD_SL_Plugin_Updater so that each plugin using it has a unique name to avoid conflicts. I updated the class name in the updater file, updated the updater file name, and then updated references to the file and class in the plugin’s main file. Can you check the plugins and be sure there are no errors? Report back to me if you find anything and don’t make any changes.”

    Also: 10 ChatGPT prompt tricks I use – to get the best results, faster

    That’s a big request, requiring the AI to scan something like 12,000 files and give me an analysis. It should use all the thinking power it can muster.

    By contrast, a prompt like "What WP-CLI command shows the list of installed plugins?" is a really simple request. It's basically a documentation lookup that requires no real intelligence at all. It's just a quick time-saver prompt, so I don't have to switch to the browser and run a Google search.

    Responses to the quick question come back faster, and the process uses fewer tokens. Tokens are the units in which the model's input, output, and processing are measured, and API calls are billed by the token, which means that simple convenience questions cost less to ask.
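
    Because billing is per token, you can see this directly in the usage numbers the API returns. Here's a rough sketch; the usage field names (input_tokens, output_tokens, total_tokens) and the "gpt-5.1" model string are my assumptions about the Responses API, so verify them against the current documentation.

        # Sketch: compare token usage (and therefore cost) for an easy prompt versus a hard one.
        from openai import OpenAI

        client = OpenAI()

        def ask(prompt: str) -> str:
            response = client.responses.create(model="gpt-5.1", input=prompt)
            u = response.usage  # token accounting for this call (assumed field names)
            print(f"input={u.input_tokens} output={u.output_tokens} total={u.total_tokens}")
            return response.output_text

        # A documentation lookup should come back quickly and cheaply...
        ask("What WP-CLI command shows the list of installed plugins?")

        # ...while a codebase audit justifies far more reasoning effort, and far more tokens.
        ask("Scan the plugins directory and report any remaining references to the old "
            "EDD_SL_Plugin_Updater class name. Don't make any changes.")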

    There’s one other aspect of this that’s pretty powerful, which is what OpenAI describes as “more persistent deep reasoning.” Nothing sucks more than having a long conversation with the AI, and then having it lose track of what you were talking about. Now, OpenAI says the AI can stay on track longer.

    ‘No reasoning’ mode

    This is another one of those cases where I feel OpenAI could benefit from some solid product management for its product naming. This mode doesn’t turn off context understanding, quality code writing, or understanding instructions. It just turns off deep, chain-of-thought style analysis. They should call it “don’t overthink” mode.

    Think of it this way. We all have a friend who overthinks every single issue or action. It bogs them down, takes them forever to get simple things done, and often leads to analysis paralysis. There’s a time for big thinking, and there’s a time to just choose paper or plastic and move on.

    Also: I teamed up two AI tools to solve a major bug – but they couldn’t do it without me

    This new no-reasoning mode lets the AI skip its usual step-by-step deliberation and jump straight to an answer. It's ideal for simple lookups or basic tasks. This cuts latency (the time to respond) dramatically and makes for a quicker, more fluid coding experience.

    Combining no-reasoning mode with adaptive reasoning means the AI can take the time to answer hard questions, but can rapid-fire respond to simpler ones.
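
    If you'd rather not leave the choice to the model, the API also lets you request minimal deliberation explicitly. A small sketch follows; the reasoning-effort parameter and its "none" value are my reading of how OpenAI describes this mode, so confirm the exact knob in the API reference.

        # Sketch: ask for a no-deliberation answer to a simple lookup question.
        # The reasoning={"effort": "none"} setting is an assumption based on OpenAI's
        # description of no-reasoning mode; check the API reference for the exact values.
        from openai import OpenAI

        client = OpenAI()

        quick = client.responses.create(
            model="gpt-5.1",
            reasoning={"effort": "none"},  # skip step-by-step deliberation, just answer
            input="What WP-CLI command shows the list of installed plugins?",
        )

        print(quick.output_text)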

    Extended prompt caching

    Another speed boost (with an accompanying cost reduction) is extended prompt caching. When an AI is given a prompt, it first has to use its natural language processing capabilities to parse that prompt and figure out what it's being asked.

    This is no small feat. It’s taken AI researchers decades to get AIs to the point that they can understand natural language, as well as the context and subtle meanings of what’s being said.

    So, when a prompt is issued, the AI has to do some real work to tokenize it and build an internal representation from which to construct a response. That work has a real resource cost.

    Also: 10 ChatGPT Codex secrets I only learned after 60 hours of pair programming with it

    If a question gets re-asked during a session, and the same or similar prompt has to be reinterpreted, that cost is incurred again. Keep in mind that we’re not only talking about prompts that a programmer gives an API, but prompts that run inside an application, which may often be repeated during application use.

    Take, for example, a detailed prompt for a customer support agent, which has to process the same set of basic starting rules for every customer interaction. That prompt might take thousands of tokens just to parse, and would need to be done thousands of times a day.

    By caching the prompt (and OpenAI is now doing this for 24 hours), the prompt gets compiled once and then is available for reuse. The speed improvements and cost savings could be considerable.
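
    In practice, that means structuring your calls so the long, unchanging instructions form a stable prefix the cache can reuse, with only the per-customer question varying. Here's a sketch under those assumptions; the prompt_cache_key parameter and the 24-hour retention behavior are my reading of the caching feature described above, so verify the exact names and limits in the docs.

        # Sketch: keep the big static system prompt identical across calls so it can be
        # cached, and vary only the customer's question. The prompt_cache_key parameter
        # is an assumption about how requests are grouped for cache reuse.
        from openai import OpenAI

        client = OpenAI()

        SUPPORT_AGENT_RULES = """You are the support agent for Example Gadgets.
        ... several thousand tokens of policies, tone rules, and product data ...
        """

        def answer_customer(question: str) -> str:
            response = client.responses.create(
                model="gpt-5.1",
                instructions=SUPPORT_AGENT_RULES,     # identical prefix on every call
                input=question,                       # the only part that changes
                prompt_cache_key="support-agent-v1",  # assumed: groups calls for cache reuse
            )
            return response.output_text

        print(answer_customer("My smartwatch won't pair with my phone."))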

    Better business case for design-ins

    All of these improvements provide OpenAI with a better business case to present to customers for design-ins. Design-in is a fairly old term of art, used to describe when a component is designed into a product.

    Probably the most famous (and most consequential) design-in was when IBM chose the Intel 8088 CPU for the original IBM PC back in 1981. That one decision launched the entire x86 ecosystem and fueled Intel’s success in processors for decades.

    Today, Nvidia is the beneficiary of enormous design-in decisions on the part of data center operators, hungry for the most AI processing power they can find. That demand has pushed Nvidia to become the world’s most valuable company in terms of market cap, somewhere north of $5 trillion.

    Also: I got 4 years of product development done in 4 days for $200, and I’m still stunned

    OpenAI benefits from design-ins as well. CapCut is a video app with 361 million downloads in 2025. Temu is a shopping app with 438 million downloads in 2025. If, for example, either company were to embed AI into their app, and if they were to do so using API calls from OpenAI, OpenAI would stand to make a ton of cash from the cumulative volume of API calls and their associated billing.

    But as with physical components, the cost of goods sold is always an issue with design-ins. Every fraction of a cent in COGS can increase the overall end price or dangerously impact margins.

    So, bottom line, if OpenAI can substantially reduce the cost of API calls and still deliver AI value, as it seems to have done with GPT-5.1, there’s a much better chance it can make the case for including GPT-5.1 in developers’ products.

    More new capabilities

    The GPT-5.1 release also includes better coding performance. The AI is more steerable and biddable, meaning that it follows directions better. If only my pup could be more biddable, we wouldn’t have the constant painful yapping when the mail is delivered.

    The coding AI does less unnecessary overthinking, is more conversational during tool-calling sequences, and behaves in a generally friendlier way during those interactions. There's also a new apply_patch tool that helps with multi-step coding sequences and agentic actions, along with a new shell tool that's better at generating command-line commands and then evaluating and acting on their output.
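
    Here's roughly what enabling those tools might look like from the API side. The tool type strings ("apply_patch" and "shell") are my assumptions based on OpenAI's description of the new tools, and your application still has to execute whatever patches or commands the model proposes; check the tools documentation for the exact declarations and the required execution loop.

        # Sketch: declare the new coding tools for an agentic session. Tool type names
        # are assumptions; the model returns tool calls that your code must carry out
        # (apply the patch, run the command) and report back on.
        from openai import OpenAI

        client = OpenAI()

        response = client.responses.create(
            model="gpt-5.1",
            tools=[
                {"type": "apply_patch"},  # let the model propose multi-file code edits
                {"type": "shell"},        # let the model suggest command-line commands
            ],
            input="Rename the EDD_SL_Plugin_Updater class across this plugin and update every reference.",
        )

        # Inspect what the model wants to do before doing it.
        for item in response.output:
            print(item.type)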

    Also: OpenAI has new agentic coding partner for you now: GPT-5-Codex

    I’m pretty pumped about this new release. Since I’m already using GPT-5, it will be nice to see how much more responsive it is with GPT-5.1 now.

    What about you? Have you tried using GPT-5 or the new GPT-5.1 models in your coding or development workflow? Are you seeing the kinds of speed or cost improvements OpenAI is promising, or are you still evaluating whether these changes matter for your projects? How important are features like adaptive reasoning, no reasoning mode, or prompt caching when you’re deciding which AI model to build into your tools or products? Let us know in the comments below.

    You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.
