(25 Qs, updated 16 April 2026)
This page shows a test of many Kinyarwanda speech-to-text systems.

For each system, we play the same Kinyarwanda audio and capture its text output.
We then compare that text to a trusted reference answer and give a score between 0 and 1.
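
The page does not say which metric produces the 0 to 1 score. As a rough illustration only, the sketch below assumes a word-level edit distance normalized by the longer of the two texts, so identical texts score 1.0 and completely different texts score 0.0. The function names `edit_distance` and `similarity_score` are hypothetical and are not taken from the benchmark itself.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the system output
                curr[j - 1] + 1,     # extra word in the system output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def similarity_score(reference, hypothesis):
    """Assumed metric: 1 minus the normalized word edit distance, in [0, 1]."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref and not hyp:
        return 1.0  # two empty texts are treated as identical
    return 1.0 - edit_distance(ref, hyp) / max(len(ref), len(hyp))
```

Under this assumed metric, a four-word reference with one substituted word in the output scores 0.75. The real benchmark may use a different measure, so treat this only as a way to build intuition for the scale.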

| # | Workflow | Accuracy (Mean) | Latency (Median, s) |
|---|---|---|---|
| 2 | Mbaza+Gem3.1pro (A2T) | 0.89 | 10.18 |
| 6 | Omni+Gem3.1pro (A2T) | 0.90 | 11.13 |
| 9 | Mbaza+Gem3.1flash-lite (A2T) | 0.87 | 3.88 |
| 10 | Omni+Deepseek3.2 | 0.86 | 8.63 |
| 11 | Omni+KimiK2.5 | 0.82 | 12.61 |
| 12 | Omni+MiniMax2.7 | 0.62 | 6.82 |
| 13 | Omni+GLM5.1 | 0.79 | 10.81 |
| 14 | Mbaza+MiniMax2.7 | 0.50 | 5.13 |
| 15 | Mbaza+KimiK2.5 | 0.82 | 13.83 |
| 16 | Mbaza+GPTOSS 120B | 0.83 | 6.50 |
| 17 | Mbaza+Qwen3.6+ | 0.84 | 9.54 |

A higher score means the system's output is closer to the reference text, which usually indicates a more accurate transcription.
You can use these scores to:
- See which system gets the best score
- Compare different models and pipelines side by side
- Choose the best system for your app, research, or product
- Download all results for deeper analysis
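
As one possible way to act on the uses listed above, the sketch below copies the rows of the table into a Python list and picks the top-scoring workflow, with or without a latency limit. The helper names and the idea of a latency budget are illustrative assumptions, not part of the benchmark.

```python
# (workflow, mean accuracy, median latency in seconds), copied from the table.
RESULTS = [
    ("Mbaza+Gem3.1pro (A2T)", 0.89, 10.18),
    ("Omni+Gem3.1pro (A2T)", 0.90, 11.13),
    ("Mbaza+Gem3.1flash-lite (A2T)", 0.87, 3.88),
    ("Omni+Deepseek3.2", 0.86, 8.63),
    ("Omni+KimiK2.5", 0.82, 12.61),
    ("Omni+MiniMax2.7", 0.62, 6.82),
    ("Omni+GLM5.1", 0.79, 10.81),
    ("Mbaza+MiniMax2.7", 0.50, 5.13),
    ("Mbaza+KimiK2.5", 0.82, 13.83),
    ("Mbaza+GPTOSS 120B", 0.83, 6.50),
    ("Mbaza+Qwen3.6+", 0.84, 9.54),
]


def best_by_accuracy(results):
    """Return the row with the highest mean accuracy."""
    return max(results, key=lambda row: row[1])


def best_within_latency(results, max_latency_s):
    """Return the most accurate row whose median latency fits the budget,
    or None if no workflow is fast enough."""
    eligible = [row for row in results if row[2] <= max_latency_s]
    return max(eligible, key=lambda row: row[1]) if eligible else None


if __name__ == "__main__":
    print(best_by_accuracy(RESULTS))
    print(best_within_latency(RESULTS, max_latency_s=7.0))
```

Ranking on accuracy alone and ranking under a latency limit can give different answers, which is why both helpers are shown. The 7-second budget is an arbitrary example value.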