speechmatics

Transcribe audio files (voice notes, recordings, podcasts) to text via the Speechmatics batch transcription API. Use when the user asks to transcribe audio, convert speech to text, or get a transcript of a recording, and Speechmatics is configured.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "speechmatics" with this command: npx skills add coreyh/speechmatics

Speechmatics (batch transcription)

Transcribe an audio file via the Speechmatics asynchronous batch API. The script submits a job, polls until the job completes, then writes the transcript to a file.
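
The poll step of that submit, poll, write cycle can be sketched as below. This is an illustration only, not the code in transcribe.sh: poll_until_done and job_status are hypothetical names, and the status values ("done", "rejected") and the GET <base>/jobs/<id> lookup are assumptions about the Speechmatics batch API rather than details taken from this skill.

```shell
#!/usr/bin/env bash
# Sketch of the poll step only; not the code in transcribe.sh.

# poll_until_done <interval_s> <timeout_s> <status-command...>
# Runs the status command repeatedly until it prints "done".
# Returns 0 on done, 1 on rejected, 2 if the status command fails, 3 on timeout.
poll_until_done() {
  local interval="$1" timeout="$2"
  shift 2
  local elapsed=0 state
  while :; do
    state="$("$@")" || return 2
    case "$state" in
      "done")     return 0 ;;
      "rejected") return 1 ;;
    esac
    if [ "$elapsed" -ge "$timeout" ]; then
      return 3
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))   # counts time spent sleeping only
  done
}

# Assumed status lookup: GET <base>/jobs/<id> with a bearer token, reading
# .job.status from the JSON reply. Requires curl and jq.
job_status() {
  local job_id="$1"
  curl -fsS -H "Authorization: Bearer ${SPEECHMATICS_API_KEY}" \
    "${SPEECHMATICS_BASE_URL:-https://asr.api.speechmatics.com/v2}/jobs/${job_id}" \
    | jq -r '.job.status'
}

# Possible use with a 3s interval and a 600s timeout:
#   poll_until_done 3 600 job_status "$job_id"
```

Passing the status command as arguments keeps the loop independent of the HTTP call, so it can be exercised with a stub before any real job is submitted.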

Quick start

{baseDir}/scripts/transcribe.sh /path/to/audio.m4a

Defaults:

  • Language: en
  • Operating point: enhanced (better accuracy; use standard for faster, cheaper processing)
  • Output: <input>.txt in the same directory
  • Poll interval: 3s, timeout: 600s

Useful flags

{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --language da
{baseDir}/scripts/transcribe.sh /path/to/meeting.wav --operating-point standard
{baseDir}/scripts/transcribe.sh /path/to/call.mp3 --diarization speaker
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --format json --out /tmp/transcript.json
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --format srt --out /tmp/subs.srt
{baseDir}/scripts/transcribe.sh /path/to/long.wav --timeout 1800

Formats: txt (default, plain text), json (Speechmatics json-v2 with word timings), srt (subtitles).

API key

The script reads the API key from (in order):

  1. --api-key <key> flag
  2. SPEECHMATICS_API_KEY environment variable (set by openclaw from the entry below)
  3. skills.entries.speechmatics.apiKey in $OPENCLAW_CONFIG_PATH (default ~/.openclaw/openclaw.json)
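
The lookup order above could be implemented roughly as follows. This is a hedged sketch, not the script's actual code: resolve_api_key and read_key_from_config are hypothetical names, and the config reader is a deliberately naive awk scan that assumes the apiKey line appears after the speechmatics entry, as in the example config in this section.

```shell
#!/usr/bin/env bash
# Sketch of the documented lookup order: flag, then environment, then config.

# Naive config reader (assumption): prints the first apiKey value that
# appears at or after the first line mentioning "speechmatics".
read_key_from_config() {
  local cfg="${OPENCLAW_CONFIG_PATH:-$HOME/.openclaw/openclaw.json}"
  [ -r "$cfg" ] || return 1
  awk '
    /speechmatics/ { in_entry = 1 }
    in_entry && /apiKey/ {
      sub(/.*apiKey"?[ \t]*:[ \t]*"/, "")   # drop everything up to the opening quote
      sub(/".*/, "")                        # drop the closing quote and the rest
      print
      exit
    }
  ' "$cfg"
}

# resolve_api_key <value-of---api-key-or-empty>
# Prints the first key found; returns 1 if none of the sources has one.
resolve_api_key() {
  local flag_value="${1:-}" key
  if [ -n "$flag_value" ]; then
    printf '%s\n' "$flag_value"
    return 0
  fi
  if [ -n "${SPEECHMATICS_API_KEY:-}" ]; then
    printf '%s\n' "$SPEECHMATICS_API_KEY"
    return 0
  fi
  key="$(read_key_from_config)" || true
  if [ -n "$key" ]; then
    printf '%s\n' "$key"
    return 0
  fi
  echo "no Speechmatics API key found" >&2
  return 1
}
```

A real implementation would likely use a proper JSON5 parser for the config file; the awk scan is only there to keep the sketch free of extra dependencies.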

Configure via openclaw.json:

{
  skills: {
    entries: {
      speechmatics: {
        apiKey: "SPEECHMATICS_KEY_HERE",
      },
    },
  },
}

Override the API base (e.g. EU region or a proxy) with --base-url or SPEECHMATICS_BASE_URL. Default: https://asr.api.speechmatics.com/v2.
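
A minimal sketch of how such an override might be resolved. The function name resolve_base_url is hypothetical, and two behaviours are assumptions not stated above: the flag is taken to win over the environment variable (mirroring the API key order), and a single trailing slash is stripped so that paths can be appended safely.

```shell
#!/usr/bin/env bash
# resolve_base_url <value-of---base-url-or-empty>
# Assumed precedence: flag, then SPEECHMATICS_BASE_URL, then the documented default.
resolve_base_url() {
  local flag_value="${1:-}"
  local default_url="https://asr.api.speechmatics.com/v2"
  local url="${flag_value:-${SPEECHMATICS_BASE_URL:-$default_url}}"
  printf '%s\n' "${url%/}"   # strip one trailing slash, if present
}

# Example with a placeholder proxy host (not a real Speechmatics endpoint):
#   SPEECHMATICS_BASE_URL="https://proxy.example.com/v2/" resolve_base_url ""
```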

Notes

  • Supports most common audio formats (wav, mp3, m4a, ogg, flac, mp4, etc.); Speechmatics transcodes server-side.
  • File size limit: 2 GB per job.
  • On enhanced, batch jobs typically finish in roughly one tenth of the audio duration in wall-clock time (about 6 minutes for a 60-minute recording); standard is faster.
  • Always confirm any destructive follow-up (e.g. replying based on a transcript) before acting.
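
The processing-time note can be turned into a rough --timeout estimate. The sketch below assumes the one-tenth ratio holds and adds a safety margin; suggest_timeout is a hypothetical helper and the 120-second default margin is an arbitrary choice, not a value from the skill.

```shell
#!/usr/bin/env bash
# suggest_timeout <audio_seconds> [margin_seconds]
# Prints ceil(audio_seconds / 10) plus a margin (default 120, an assumption).
suggest_timeout() {
  local audio_s="$1" margin_s="${2:-120}"
  local est=$(( (audio_s + 9) / 10 ))   # integer ceiling of audio_s / 10
  echo $(( est + margin_s ))
}

suggest_timeout 3600   # one hour of audio  -> 480
suggest_timeout 7200   # two hours of audio -> 840
```

Under these assumptions a one-hour recording fits within the default 600s timeout, while a two-hour recording would need a larger value passed with --timeout.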

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
