Speechmatics (batch transcription)
Transcribe an audio file via Speechmatics' async batch API. Submits a job, polls until complete, then writes the transcript.
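The submit, poll, and write steps correspond to three HTTP calls. The sketch below is not the contents of transcribe.sh. It assumes the public Speechmatics batch endpoints (POST /jobs, GET /jobs/<id>, GET /jobs/<id>/transcript) with bearer-token auth, and the helper names submit_job, get_job, and get_transcript are hypothetical.

```shell
# Hypothetical helpers; transcribe.sh itself may be structured differently.
BASE_URL="${SPEECHMATICS_BASE_URL:-https://asr.api.speechmatics.com/v2}"

# submit_job FILE CONFIG_JSON
# Uploads the audio plus a JSON job config; prints the raw JSON response,
# which includes the id of the new job.
submit_job() {
  curl -sS -X POST "$BASE_URL/jobs" \
    -H "Authorization: Bearer $SPEECHMATICS_API_KEY" \
    -F "data_file=@$1" \
    -F "config=$2"
}

# get_job JOB_ID
# Prints the job details, including its current status.
get_job() {
  curl -sS "$BASE_URL/jobs/$1" \
    -H "Authorization: Bearer $SPEECHMATICS_API_KEY"
}

# get_transcript JOB_ID FORMAT   (FORMAT: txt, json-v2, or srt)
get_transcript() {
  curl -sS "$BASE_URL/jobs/$1/transcript?format=$2" \
    -H "Authorization: Bearer $SPEECHMATICS_API_KEY"
}
```

A caller would run submit_job once, call get_job on an interval until the job reports a finished status, then redirect get_transcript into the output file.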
Quick start
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a
Defaults:
- Language: en
- Operating point: enhanced (better accuracy; use standard for faster/cheaper)
- Output: <input>.txt in the same directory
- Poll interval: 3s, timeout: 600s
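The poll interval and timeout defaults can be pictured as a simple loop. This is a minimal sketch, not the script's actual code: poll_until_done is a hypothetical name, the status command is injected so the loop can be tried without network access, and the terminal statuses (done, rejected, deleted, expired) are assumed from the Speechmatics job status values.

```shell
# poll_until_done STATUS_CMD [INTERVAL] [TIMEOUT]
# STATUS_CMD is any command or function that prints the current job status.
# Returns 0 when the job is done, 1 if it ended in a failed state,
# and 2 if the timeout elapsed first.
poll_until_done() {
  local status_cmd="$1"
  local interval="${2:-3}"    # seconds between checks (documented default)
  local timeout="${3:-600}"   # seconds before giving up (documented default)
  local elapsed=0 status
  while :; do
    status=$("$status_cmd")
    case "$status" in
      "done") return 0 ;;
      rejected|deleted|expired)
        echo "job ended with status: $status" >&2
        return 1 ;;
    esac
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s" >&2
      return 2
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}
```

In real use, STATUS_CMD would wrap the HTTP status lookup for one job. Elapsed time is counted from the sleep intervals only, so the effective timeout is slightly longer than the nominal value once request latency is included.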
Useful flags
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --language da
{baseDir}/scripts/transcribe.sh /path/to/meeting.wav --operating-point standard
{baseDir}/scripts/transcribe.sh /path/to/call.mp3 --diarization speaker
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --format json --out /tmp/transcript.json
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --format srt --out /tmp/subs.srt
{baseDir}/scripts/transcribe.sh /path/to/long.wav --timeout 1800
Formats: txt (default, plain text), json (Speechmatics json-v2 with word timings), srt (subtitles).
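One plausible way the flags above translate into API inputs is sketched here. It assumes that --language, --operating-point, and --diarization map onto the language, operating_point, and diarization fields of the Speechmatics transcription_config, and that --format json requests the json-v2 transcript. The function names api_format and build_job_config are hypothetical.

```shell
# api_format FORMAT
# Maps the script's --format value to the API's transcript format name.
api_format() {
  case "$1" in
    txt)  echo "txt" ;;
    json) echo "json-v2" ;;
    srt)  echo "srt" ;;
    *)    echo "unsupported format: $1" >&2; return 1 ;;
  esac
}

# build_job_config [LANGUAGE] [OPERATING_POINT] [DIARIZATION]
# Prints a job config as JSON. Values are assumed to be simple tokens
# (en, enhanced, speaker) that need no JSON escaping.
build_job_config() {
  local language="${1:-en}"
  local operating_point="${2:-enhanced}"
  local diarization="${3:-}"
  local tc
  tc="\"language\":\"$language\",\"operating_point\":\"$operating_point\""
  if [ -n "$diarization" ]; then
    tc="$tc,\"diarization\":\"$diarization\""
  fi
  printf '{"type":"transcription","transcription_config":{%s}}\n' "$tc"
}
```

With no arguments, build_job_config prints the config for the documented defaults (en, enhanced, no diarization).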
API key
The script reads the API key from (in order):
1. --api-key <key> flag
2. SPEECHMATICS_API_KEY environment variable (set by openclaw from the entry below)
3. skills.entries.speechmatics.apiKey in $OPENCLAW_CONFIG_PATH (default ~/.openclaw/openclaw.json)
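The lookup order can be sketched as a single function. This is an illustration under assumptions rather than the script's real logic: resolve_api_key is a hypothetical name, and the config read is a naive line-oriented sed that expects the speechmatics entry to be laid out one key per line, as in the example config. A proper JSON5 parser would be more robust.

```shell
# resolve_api_key [FLAG_VALUE]
# Prints the first key found, checking the flag value, then the environment,
# then the openclaw config file. Returns 1 if none is set.
resolve_api_key() {
  local flag_value="${1:-}"
  local config_path="${OPENCLAW_CONFIG_PATH:-$HOME/.openclaw/openclaw.json}"
  local key=""
  if [ -n "$flag_value" ]; then
    key="$flag_value"
  elif [ -n "${SPEECHMATICS_API_KEY:-}" ]; then
    key="$SPEECHMATICS_API_KEY"
  elif [ -r "$config_path" ]; then
    # Only look between the "speechmatics:" line and the next closing brace,
    # so an apiKey belonging to another entry is not picked up.
    key=$(sed -n \
      '/speechmatics"\{0,1\}[[:space:]]*:/,/}/ s/.*apiKey"\{0,1\}[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
      "$config_path" | head -n 1)
  fi
  [ -n "$key" ] || return 1
  printf '%s\n' "$key"
}
```

Passing the key on the command line makes it visible in process listings, so the environment variable or config entry is usually the safer choice on shared machines.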
Configure via openclaw.json:
{
  skills: {
    entries: {
      speechmatics: {
        apiKey: "SPEECHMATICS_KEY_HERE",
      },
    },
  },
}
Override the API base (e.g. EU region or a proxy) with --base-url or SPEECHMATICS_BASE_URL. Default: https://asr.api.speechmatics.com/v2.
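A minimal sketch of how the base URL might be resolved follows. The precedence of the flag over the environment variable is an assumption (the text above does not state which wins), and resolve_base_url and the example.com host are hypothetical.

```shell
DEFAULT_BASE_URL="https://asr.api.speechmatics.com/v2"

# resolve_base_url [FLAG_VALUE]
# Assumed precedence: --base-url value, then SPEECHMATICS_BASE_URL,
# then the documented default.
resolve_base_url() {
  local url="${1:-${SPEECHMATICS_BASE_URL:-$DEFAULT_BASE_URL}}"
  # Drop one trailing slash so "$url/jobs" never contains "//".
  printf '%s\n' "${url%/}"
}
```

For example, with SPEECHMATICS_BASE_URL set to https://proxy.example.com/v2/ and no flag given, the function prints https://proxy.example.com/v2.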
Notes
- Supports most common audio formats (wav, mp3, m4a, ogg, flac, mp4, etc.) — Speechmatics transcodes server-side.
- File size limit: 2 GB per job.
- Batch jobs complete at roughly a 1:10 ratio of wall-clock time to audio duration on enhanced; standard is faster.
- Always confirm any destructive follow-up (e.g. replying based on a transcript) before acting.
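The size limit and the 1:10 estimate lend themselves to a quick pre-flight check before uploading. The sketch below is illustrative only: check_size and estimate_seconds are hypothetical names, 2 GB is assumed to mean 2 * 1024^3 bytes, and the estimate simply divides the audio duration by ten, which is a rough guide rather than a guarantee.

```shell
# check_size FILE [LIMIT_BYTES]
# Returns 1 if FILE cannot be read or is larger than the limit.
check_size() {
  local file="$1"
  local limit="${2:-2147483648}"   # assumption: 2 GB = 2 * 1024^3 bytes
  local size
  size=$(wc -c < "$file") || return 1
  size=$((size + 0))               # normalize padding some wc versions add
  if [ "$size" -gt "$limit" ]; then
    echo "$file is $size bytes, over the $limit byte limit" >&2
    return 1
  fi
}

# estimate_seconds AUDIO_SECONDS
# Rough processing time on enhanced at a 1:10 ratio, rounded up.
estimate_seconds() {
  echo $(( ($1 + 9) / 10 ))
}
```

If the estimate for a long recording approaches the 600s default timeout, raising --timeout leaves some headroom.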