signal insight

Vercel adds cost, latency, and throughput sorting to AI Gateway provider routing

Vercel updated AI Gateway so teams can sort providers behind a model by cost, time to first token, or throughput at request time. The release also exposes routing metadata that shows execution order and deprioritized providers, making gateway policy easier to inspect and tune.

Published May 15, 2026 · Updated May 19, 2026 · 1 source

Vercel, AI Gateway, enterprise controls, feature update, high impact

enterprise-controls, routing, cost-control, latency, inference, feature-update

Impact: high
Confidence: 98%
Change type: feature update
First seen: May 15, 2026
Last updated: May 19, 2026
Audience: developers, platform teams, MLOps teams
Status: Published

Summary

What changed

Vercel added provider sorting controls to AI Gateway for cost-, latency-, and throughput-based routing decisions, plus routing metadata for inspection.
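To make the idea of request-time sorting concrete, here is a minimal TypeScript sketch of ranking candidate providers by one of the three criteria. It is not the AI Gateway API: the `SortKey` and `ProviderStats` types, the `sortProviders` function, the provider names, and all numbers are hypothetical and exist only to illustrate the ordering logic a gateway might apply.

```typescript
// Illustrative sketch only. None of these names come from Vercel's API.
type SortKey = "cost" | "ttft" | "throughput";

interface ProviderStats {
  name: string;
  costPerMTokUsd: number; // price per million tokens; lower is better
  ttftMs: number;         // time to first token in ms; lower is better
  tokensPerSec: number;   // output throughput; higher is better
}

// Each scorer returns a number where a smaller value means "try this provider first".
const scorers: Record<SortKey, (p: ProviderStats) => number> = {
  cost: (p) => p.costPerMTokUsd,
  ttft: (p) => p.ttftMs,
  throughput: (p) => -p.tokensPerSec, // negate so higher throughput ranks first
};

// Returns a new array in execution order; the input array is left unchanged.
function sortProviders(providers: ProviderStats[], key: SortKey): ProviderStats[] {
  const score = scorers[key];
  return [...providers].sort((a, b) => score(a) - score(b));
}

// Made-up sample data for three providers serving the same model.
const providers: ProviderStats[] = [
  { name: "provider-a", costPerMTokUsd: 3.0, ttftMs: 420, tokensPerSec: 160 },
  { name: "provider-b", costPerMTokUsd: 1.2, ttftMs: 650, tokensPerSec: 60 },
  { name: "provider-c", costPerMTokUsd: 2.1, ttftMs: 300, tokensPerSec: 140 },
];

const orderByCost = sortProviders(providers, "cost").map((p) => p.name);
const orderByTtft = sortProviders(providers, "ttft").map((p) => p.name);
const orderByThroughput = sortProviders(providers, "throughput").map((p) => p.name);
```

With this sample data, each criterion produces a different execution order, which is why choosing the sort key per request can matter: a batch job might sort by cost while an interactive agent loop sorts by time to first token.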

Why it matters

Inference gateways are turning into optimization layers rather than simple API brokers. Teams can now express routing policy in terms of budget, latency, and output speed directly in the gateway, which is useful for agent loops, high-volume workloads, and multi-provider reliability strategies.

Evidence excerpt

Vercel says AI Gateway can now sort providers behind a model by cost, time to first token, or throughput at request time, and responses include routing metadata that explains execution order and any deprioritized providers.

Sources

vercel.com