Whisper is the default backend, with the best accuracy and broadest language coverage. It uses Metal GPU acceleration on Apple Silicon for fast inference and supports 10+ model sizes, from Tiny (75 MB) to Large V3 (2.9 GB).
Parakeet is the fastest backend on Apple Silicon. It runs on the dedicated Neural Engine, leaving the GPU free for other tasks. It features CTC vocabulary boosting: your dictionary entries and prompt words bias the decoder at the acoustic level.
Apple Speech is a native Apple framework available on macOS Tahoe and later. It uses Apple's built-in speech recognition models with system-level optimization.
All models are downloaded once and cached locally. Switch between them at any time.
| Model | Size | Speed | Notes |
|---|---|---|---|
| Tiny | 75 MB | Fastest | Quick, lower accuracy |
| Base | 142 MB | Fast | Good for simple dictation |
| Small | 466 MB | Medium | Balanced |
| Medium | 1.5 GB | Slow | High accuracy |
| Large V3 | 2.9 GB | Slowest | Maximum accuracy |
| Large V3 Turbo | 1.5 GB | Fast | 8x faster than Large V3 |
| Large V3 Turbo Q5 | 547 MB | Fast | Default: best balance of speed, size, and accuracy |
| Large V3 Q5 | 1.1 GB | Medium | Quantized, smaller file |
| Distil Large V3 | 756 MB | Very Fast | 6x faster than Large V3 |
| Distil Small (EN) | 166 MB | Very Fast | English only |
Start with Whisper (the default). It offers the best accuracy and language coverage. If you want maximum speed and primarily dictate in English, try Parakeet. Apple Speech requires macOS Tahoe or later.
The default Large V3 Turbo Q5 (547 MB) offers the best balance of speed, accuracy, and file size. Use smaller models (Tiny, Base) for speed on older hardware. Use Large V3 (2.9 GB) for maximum accuracy in challenging audio.
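The guidance above can be written down as a small decision rule. The following is only an illustrative sketch: `recommend_model` and its parameters are hypothetical and not part of Whisperer, the sizes are copied from the model table (with 2.9 GB rounded to 2900 MB), and letting older hardware take precedence over challenging audio is an assumption.

```python
# File sizes in MB, copied from the model table (2.9 GB approximated as 2900 MB).
MODEL_SIZE_MB = {
    "Tiny": 75,
    "Base": 142,
    "Large V3 Turbo Q5": 547,
    "Large V3": 2900,
}

DEFAULT_MODEL = "Large V3 Turbo Q5"


def recommend_model(older_hardware: bool = False,
                    challenging_audio: bool = False) -> str:
    """Hypothetical helper that mirrors the recommendation in the text."""
    if older_hardware:
        # Assumption: on older hardware, speed wins even if the audio is hard.
        return "Tiny"
    if challenging_audio:
        # Largest model, recommended for maximum accuracy.
        return "Large V3"
    # Otherwise keep the default, the best balance of speed, accuracy, and size.
    return DEFAULT_MODEL


if __name__ == "__main__":
    choice = recommend_model(challenging_audio=True)
    print(f"{choice} ({MODEL_SIZE_MB[choice]} MB)")
```

In practice the choice is made in the app's settings; the sketch only makes the trade-off explicit.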
Yes. Whisperer supports hot-swapping: change your transcription backend or model size, and the new engine loads while the old one is released. No app restart is needed.
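One way to picture hot-swapping is a manager that holds a reference to the active engine and replaces it on demand. This is a minimal sketch, not Whisperer's actual implementation: `Engine` and `EngineManager` are hypothetical names, and the ordering (load the new engine, switch over, then release the old one) is an assumption.

```python
class Engine:
    """Stand-in for a loaded transcription engine (hypothetical)."""

    def __init__(self, backend: str, model: str):
        self.backend = backend
        self.model = model
        self.released = False

    def release(self) -> None:
        # A real engine would free model weights and GPU/Neural Engine buffers here.
        self.released = True


class EngineManager:
    """Holds the active engine and swaps it without restarting the app."""

    def __init__(self, backend: str, model: str):
        self.current = Engine(backend, model)

    def swap(self, backend: str, model: str) -> Engine:
        # Assumed order: load the replacement first so an engine is always available.
        new_engine = Engine(backend, model)
        old_engine, self.current = self.current, new_engine
        # Release the previous engine once the new one is active.
        old_engine.release()
        return old_engine


if __name__ == "__main__":
    manager = EngineManager("Whisper", "Large V3 Turbo Q5")
    previous = manager.swap("Whisper", "Tiny")
    print(manager.current.model, previous.released)
```

Loading before releasing briefly keeps both engines in memory; an implementation that is short on memory might release the old engine first instead.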
Models use between ~100 MB (Tiny) and ~3 GB (Large V3) of memory. Whisperer checks available system memory before loading a model and warns you if there is not enough. The default model (Large V3 Turbo Q5) uses about 600 MB.
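The pre-load memory check can be sketched as a comparison between a model's approximate footprint and the memory currently available. The function name `check_memory` is hypothetical, the footprints are the rough figures quoted above, and the available memory is passed in as a parameter rather than read from the system, so this illustrates the idea rather than how Whisperer does it.

```python
# Approximate memory footprints in MB, taken from the figures quoted above.
APPROX_MEMORY_MB = {
    "Tiny": 100,
    "Large V3 Turbo Q5": 600,
    "Large V3": 3000,
}


def check_memory(model: str, available_mb: int) -> tuple[bool, str]:
    """Hypothetical pre-load check: returns (ok, message)."""
    needed_mb = APPROX_MEMORY_MB[model]
    if available_mb < needed_mb:
        return False, (
            f"Warning: {model} needs about {needed_mb} MB, "
            f"but only {available_mb} MB is available."
        )
    return True, f"{model} fits: about {needed_mb} MB of {available_mb} MB available."


if __name__ == "__main__":
    ok, message = check_memory("Large V3", available_mb=2000)
    print(ok, message)
```

A caller would typically fall back to a smaller model or ask the user to close other apps when the check fails.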