OpenAI has rolled out server-side changes to GPT-5 that markedly speed up the model’s responses in Cursor, a popular AI-assisted code editor. Developers testing the update report that 95th-percentile latency has been roughly halved after improvements to caching and API throughput. The upgrade arrives alongside a pricing revision that reduces the cost of cached or repeated input tokens to one-tenth of the standard rate, down from one-quarter previously. Users say the lower fee makes long programming sessions and automated refactoring significantly cheaper. While the base GPT-5 model now executes faster, developers note that capability varies by subscription tier. ChatGPT Plus and Team accounts provide a smaller context window and a lower "GPT-5 Thinking" setting than Pro accounts, which continue to offer the higher-reasoning configuration used in Cursor’s “gpt-5-high” option.
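To see why the cached-token discount matters for long sessions, here is a minimal cost sketch. The per-token base rate and token counts below are hypothetical placeholders, not OpenAI's actual price list; only the discount fractions (one-quarter before, one-tenth after) come from the report above.

```python
# Hypothetical illustration of the cached-input discount described above.
# BASE_INPUT_RATE is an assumed price, not OpenAI's published rate.
BASE_INPUT_RATE = 1.25 / 1_000_000  # assumed dollars per input token

def session_cost(fresh_tokens: int, cached_tokens: int, cached_fraction: float) -> float:
    """Cost of one request: fresh tokens at the full rate, cached tokens discounted."""
    return (fresh_tokens * BASE_INPUT_RATE
            + cached_tokens * BASE_INPUT_RATE * cached_fraction)

# A long refactoring session that resends 900k cached tokens plus 100k fresh ones.
old = session_cost(100_000, 900_000, 0.25)  # previous pricing: cached at 1/4
new = session_cost(100_000, 900_000, 0.10)  # revised pricing: cached at 1/10
print(f"old: ${old:.5f}  new: ${new:.5f}")
```

Because refactoring loops resend mostly unchanged context, nearly all input tokens hit the cache, so the per-request cost drops substantially under the new fraction.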
On the consilience of evidence to date, I think I've been correct that Grok 4 is the first «next generation model»: it's on the same tier as GPT-5, and its shortcomings are mainly because xAI has little experience building and shipping LLM products (and little accumulated good data). https://t.co/Z4SJrRzXZD https://t.co/SLeBazFCUO
It appears that base GPT-5 is actually a small/medium-size model, but it makes up for this downside in thinking prowess. GPT-5-thinking high is unrecognizably better than the base model. It also appears Opus 4.1 is the largest thinking model in existence and it is exceptionally
GPT-5 thinking in the Plus tier is not the same as GPT-5 thinking in the Pro tier. The "thinking" (reasoning) setting is higher in Pro. https://t.co/UQkHLHhWs2 https://t.co/ZYXEVB8a9T