Dictation
Clear definitions for Mac dictation, voice typing, speech recognition, cleanup passes, and the workflow vocabulary you need to use Mallo well.
Start here
Dictation is a speech-to-text workflow where spoken words are converted into written text.
Local-first speech recognition keeps audio processing on your device by default instead of sending every utterance to a remote server.
Speech recognition is the system that turns spoken audio into text tokens a computer can work with.
Voice typing means speaking instead of pressing keys so spoken words become typed text inside an app.
All terms
12 published terms
Cursor insertion means generated text lands directly at the active caret position inside the app you are already using.
Dictation is a speech-to-text workflow where spoken words are converted into written text.
Dictionary replacement is a rule-based text cleanup step that swaps known terms into the forms you want after speech is recognized.
Hold-to-Talk means dictation runs only while you keep a shortcut pressed, giving you tight start-and-stop control.
Local-first speech recognition keeps audio processing on your device by default instead of sending every utterance to a remote server.
Multilingual dictation means a speech-to-text workflow can handle more than one language in everyday writing.
Speech recognition is the system that turns spoken audio into text tokens a computer can work with.
A speech model is the engine that predicts text from audio and largely determines speed, language fit, and accuracy tradeoffs.
Speech-to-text is the process of converting spoken audio into written text.
Toggle dictation starts with one shortcut press and keeps listening until the user stops it with another action.
Voice typing means speaking instead of pressing keys so spoken words become typed text inside an app.
whisper.cpp is an on-device inference runtime used to run Whisper-family speech models locally.
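To make the dictionary replacement entry above more concrete, here is a minimal Python sketch of a rule-based cleanup pass that runs on text after recognition. It is an illustration only, not Mallo's implementation: the `RULES` table, its entries, and the `apply_dictionary` function are hypothetical names chosen for this example.

```python
import re

# Hypothetical rules: the form a recognizer might emit -> the form you want.
RULES = {
    "whisper cpp": "whisper.cpp",
    "mac os": "macOS",
    "git hub": "GitHub",
}

def apply_dictionary(text: str, rules: dict[str, str]) -> str:
    """Swap known phrases for their preferred forms in a single pass."""
    if not rules:
        return text
    # Case-insensitive lookup keyed by the lowercased phrase.
    lookup = {phrase.lower(): preferred for phrase, preferred in rules.items()}
    # Longest phrases first, so a longer rule wins over a shorter overlapping one.
    alternation = "|".join(
        re.escape(phrase) for phrase in sorted(lookup, key=len, reverse=True)
    )
    # The lookarounds keep matches on whole words ("mac os" will not match "mac osx").
    pattern = re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)
    # One pass over the text, so a replacement is never re-matched by another rule.
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)

print(apply_dictionary("I installed Whisper CPP on mac os", RULES))
```

Matching everything in one pass is a deliberate choice in this sketch: applying rules one after another could let the output of one rule be rewritten by a later one.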