Hi all — I'm the developer of Vox Dictum, just released on the Mac App Store. I build under Cobalt InFX (cobaltinfx.com).
I kept running into the same problem: I record a lot of meetings and interviews, and every cloud transcription service meant uploading confidential conversations to someone else's server. So I built a transcription app that runs entirely on the Mac: no cloud, no accounts, nothing leaves the device.
What it does:
- Transcribes audio/video recordings on-device using WhisperKit
- Identifies who said what (speaker diarisation via Pyannote) — rename a speaker once and it updates everywhere
- Generates AI summaries on-device (Qwen3 via MLX) — key decisions, action items, topics, all processed locally
- Handles 57 languages including Urdu, Arabic, and Hindi
- Detects and helps resolve overlapping speech
- Exports transcripts as TXT, Markdown, SRT (with speaker names), and JSON
It's Apple Silicon only (M1 or later), macOS 15.0+. There's a free tier with unlimited transcription, speaker labelling, editing, and export — no trial period, no nag screens. Pro adds larger models and the AI summaries.
Mac App Store: https://apps.apple.com/gb/app/vox-dictum/id6761995146?mt=12
Happy to answer any questions, and I'm genuinely interested in feedback, especially from anyone who transcribes regularly. This is v1.1; I'm actively working on the next release.
M Ozair
cobaltinfx.com


