Summary

OpenAI published details on a new WebSocket mode for the Responses API that keeps a persistent connection alive for multi-step agent loops. The company says the change made agentic workflows up to 40% faster end to end and helped GPT-5.3-Codex-Spark hit much higher effective throughput without forcing developers to change API shapes.

What changed

OpenAI introduced a persistent WebSocket transport for the Responses API so agent loops can reuse connection state instead of sending a fresh synchronous request for each tool step.
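The shape of that loop can be sketched with Python's standard library. This is an illustrative toy, not OpenAI's actual API: `toy_model_server`, `agent_loop`, and the `tool_step_N` messages are all invented stand-ins. The point is the transport pattern itself: the connection is opened once and every tool step reuses it, instead of each step paying fresh connection setup.

```python
import asyncio

async def toy_model_server(reader, writer):
    # Toy stand-in for a model endpoint: answers each request on the
    # same long-lived connection until the client disconnects.
    while True:
        line = await reader.readline()
        if not line:
            break
        writer.write(b"result:" + line)
        await writer.drain()
    writer.close()

async def agent_loop(steps):
    server = await asyncio.start_server(toy_model_server, "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]

    # Persistent connection: opened once, reused for every tool step,
    # rather than a new synchronous request per step.
    reader, writer = await asyncio.open_connection(host, port)
    results = []
    for i in range(steps):
        writer.write(f"tool_step_{i}\n".encode())
        await writer.drain()
        results.append((await reader.readline()).decode().strip())

    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return results

results = asyncio.run(agent_loop(3))
print(results)
```

A real WebSocket transport adds framing, ping/pong keepalives, and reconnection logic on top of this pattern, but the latency win comes from the same place: connection state survives across steps.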

Why it matters

As coding and workflow agents get faster, per-request API overhead becomes a proportionally larger share of total latency. This upgrade matters because it turns transport design into a product differentiator for agent platforms and lowers latency for tools built around repeated tool-call loops.
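A back-of-envelope model shows why. The numbers below are assumptions for illustration (not OpenAI measurements): suppose each synchronous request pays a fixed setup cost on top of the model's compute time, while a persistent connection pays that cost once.

```python
# Illustrative latency model; setup_ms and compute_ms are assumed values.
setup_ms = 120    # per-request connection overhead (handshake, setup)
compute_ms = 300  # model/tool compute time per agent step
steps = 20

# New connection for every step vs. one connection reused throughout.
per_request = steps * (setup_ms + compute_ms)
persistent = setup_ms + steps * compute_ms

savings = 1 - persistent / per_request
print(f"{per_request} ms vs {persistent} ms ({savings:.0%} saved)")
```

The more steps an agent loop runs, and the cheaper each individual step gets, the more the fixed per-request overhead dominates, which is exactly the regime fast coding agents are moving into.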

Evidence excerpt

OpenAI says it made agent loops using the API 40% faster end to end by building a persistent WebSocket connection instead of a chain of synchronous API calls.

Sources