Model Selection
Model selection is the product decision of choosing which speech model should handle the current dictation job.
What it means
Model selection is the layer where product UX meets speech infrastructure. It decides which recognizer the app will trust for a given user, language, or machine.
What good model selection looks like
The best UI makes tradeoffs legible without turning the product into an engineer-only tool. Users should understand why a choice is recommended and what they give up by switching.
Why it matters in Mallo
Mallo serves users with different Macs, different languages, and different tolerance for latency. Exposing model selection well is part of making the app feel adaptable rather than opinionated in the wrong places.
FAQ
What should a user weigh when selecting a model?
Language coverage, startup time, local hardware cost, transcription quality, and how often the user switches contexts.
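These criteria split naturally into hard constraints (language coverage, hardware budget, startup tolerance) and a preference to maximize (quality). A minimal sketch of that weighing, with hypothetical model names and numbers that are illustrative only, not Mallo's actual catalog:

```python
from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str
    languages: set[str]   # language coverage
    startup_s: float      # startup time, seconds
    ram_gb: float         # local hardware cost
    quality: float        # transcription quality, 0-1

def pick_model(models, language, ram_budget_gb, max_startup_s):
    """Filter by hard constraints, then prefer the highest quality."""
    eligible = [
        m for m in models
        if language in m.languages
        and m.ram_gb <= ram_budget_gb
        and m.startup_s <= max_startup_s
    ]
    return max(eligible, key=lambda m: m.quality, default=None)

# Illustrative catalog entries
catalog = [
    ModelProfile("tiny", {"en", "de"}, 0.5, 1.0, 0.6),
    ModelProfile("large", {"en", "de", "ja"}, 4.0, 8.0, 0.95),
]
print(pick_model(catalog, "ja", ram_budget_gb=16, max_startup_s=5).name)
# → large
```

A user who switches contexts often effectively re-runs this choice with different constraints, which is why frequency of switching belongs on the list.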
Should apps auto-select the model for users?
Auto-selection can be helpful, but users still need a clear explanation and an escape hatch when the default is not working for them.
Why is model selection a glossary term instead of just a setting?
Because it shapes real workflow behavior; it is not just a technical preference buried in a menu.
Sources
- whisper.cpp (GitHub)
- Qwen3-ASR-Toolkit (GitHub)
- NVIDIA NeMo ASR Models (NVIDIA)