Summary

Vercel updated AI Gateway so teams can sort the providers behind a model by cost, time to first token, or throughput at request time. The new routing control turns provider order into an explicit policy knob for cost-sensitive, latency-sensitive, or long-output workloads without requiring application changes.

What changed

Vercel added provider sorting controls to AI Gateway for cost-, latency-, and throughput-based routing decisions.
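
To make the three sort criteria concrete, here is a minimal sketch of metric-based provider ordering. It is not Vercel's API or implementation: the `Provider` fields, the `order_providers` function, the policy names, and all figures are hypothetical and chosen only for illustration. The one assumption it encodes is that lower is better for cost and time to first token, while higher is better for throughput.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    # All fields and values are hypothetical, for illustration only.
    name: str
    cost_per_million_tokens: float  # USD; lower is better
    ttft_ms: float                  # time to first token; lower is better
    tokens_per_second: float        # throughput; higher is better


# Each policy maps to a key where a smaller value means "try this provider first".
SORT_KEYS = {
    "cost": lambda p: p.cost_per_million_tokens,
    "ttft": lambda p: p.ttft_ms,
    "throughput": lambda p: -p.tokens_per_second,  # negate so the fastest sorts first
}


def order_providers(providers, sort_by):
    """Return providers in the order they would be tried under the given policy."""
    if sort_by not in SORT_KEYS:
        raise ValueError(f"unknown sort policy: {sort_by!r}")
    return sorted(providers, key=SORT_KEYS[sort_by])


providers = [
    Provider("provider-a", 3.00, 450.0, 80.0),
    Provider("provider-b", 2.50, 700.0, 60.0),
    Provider("provider-c", 3.20, 300.0, 120.0),
]

for policy in ("cost", "ttft", "throughput"):
    print(policy, [p.name for p in order_providers(providers, policy)])
```

With these made-up numbers, the same three providers come out in a different order under each policy, which is why the choice of sort criterion acts as a routing decision.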

Why it matters

Inference gateways are becoming optimization layers rather than simple API brokers. This update lets teams steer multi-provider model traffic according to concrete operational goals such as budget, responsiveness, or output speed, which matters more as agent workloads grow larger and run more continuously.

Evidence excerpt

Vercel says AI Gateway can now sort providers behind a model by cost, time to first token, or throughput at request time.

Sources