Summary
OpenAI described Multipath Reliable Connection, a network protocol it says is already deployed on its largest NVIDIA GB200 training supercomputers and has been used to train multiple OpenAI models. The company is also contributing the MRC specification to the Open Compute Project as part of a broader push to make large-scale Ethernet training fabrics more resilient.
What changed
OpenAI publicly detailed MRC, said it is already in production on major training clusters, and released the specification as an Open Compute Project contribution.
Why it matters
This is a real infrastructure signal, not a paper-only research note. OpenAI is effectively saying networking reliability is now a first-order constraint on frontier training, and it is trying to shape the shared protocol stack for clusters that run well beyond 100,000 GPUs.
Evidence excerpt
OpenAI says MRC is already deployed across its largest NVIDIA GB200 supercomputers, has been used to train multiple OpenAI models, and is now available as an Open Compute Project contribution.