Summary
MiMo-V2.5 Voice launched on April 27, 2026, with an open-source 8B-parameter ASR model from Xiaomi aimed at bilingual Chinese-English transcription, Chinese dialects, code-switched speech, and song lyrics. The launch is positioned as voice infrastructure for real-world speech products rather than as a narrow benchmark demo.
What changed
Xiaomi MiMo released MiMo-V2.5-ASR, an open-source speech recognition model focused on Chinese dialects, code-switched speech, noisy audio, and song lyrics.
Why it matters
Voice products often fail in the messy conditions that benchmarks underweight, especially in multilingual and dialect-heavy workflows. Xiaomi is pushing open voice infrastructure toward the production-grade ASR scenarios that usually keep teams tied to closed APIs.
Evidence excerpt
The launch announcement describes MiMo-V2.5-ASR as an 8B open-source speech recognition model from Xiaomi that handles Mandarin, English, eight Chinese dialects, code-switched speech, and song lyrics.