Summary
OpenAI published details on a new WebSocket mode for the Responses API that keeps a persistent connection alive for multi-step agent loops. The company says the change made agentic workflows up to 40% faster end to end and helped GPT-5.3-Codex-Spark hit much higher effective throughput without forcing developers to change API shapes.
What changed
OpenAI introduced a persistent WebSocket transport for the Responses API so agent loops can reuse connection state instead of sending a fresh synchronous request for each tool step.
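To see why reusing a connection helps, the cost difference can be sketched with a toy latency model. All numbers below are illustrative assumptions, not OpenAI's measurements: the point is only that per-step connection setup scales with the number of tool calls, while a persistent connection pays it once.

```python
# Illustrative latency model (assumed numbers, not OpenAI's): compares an
# agent loop that opens a fresh connection for every tool step against one
# that reuses a single persistent WebSocket connection.

HANDSHAKE_MS = 120.0   # assumed cost to establish a new connection (TCP + TLS)
STEP_MS = 300.0        # assumed model/tool time per step (same in both modes)
STEPS = 10             # tool-call steps in one agent loop

def per_request_total(steps: int) -> float:
    """Chain of synchronous calls: every step pays the setup cost again."""
    return steps * (HANDSHAKE_MS + STEP_MS)

def persistent_total(steps: int) -> float:
    """Persistent connection: one handshake up front, then only step costs."""
    return HANDSHAKE_MS + steps * STEP_MS

if __name__ == "__main__":
    a, b = per_request_total(STEPS), persistent_total(STEPS)
    print(f"per-request: {a:.0f} ms, persistent: {b:.0f} ms, "
          f"saved: {100 * (a - b) / a:.0f}%")
```

The savings grow with the number of steps, which is why the gain shows up most in long multi-step agent loops rather than single-shot requests.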
Why it matters
As coding and workflow agents get faster, per-request API overhead becomes a real bottleneck. The change turns transport design into a product differentiator for agent platforms and lowers latency for tools built around repeated tool-call loops.
Evidence excerpt
OpenAI says it made agent loops using the API 40% faster end to end by building a persistent WebSocket connection instead of a chain of synchronous API calls.