whisper-transcribe

Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "whisper-transcribe" with this command: npx skills add JosunLP/whisper-transcribe

Whisper Transcribe

Transcribe audio with scripts/transcribe.sh:

# Basic (auto-detect language, base model)
scripts/transcribe.sh recording.mp3

# German, small model, SRT subtitles
scripts/transcribe.sh --model small --language de --format srt lecture.wav

# Batch process, all formats
scripts/transcribe.sh --format all --output-dir ./transcripts/ *.mp3

# Word-level timestamps
scripts/transcribe.sh --timestamps interview.m4a
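
The wrapper's internals are not shown here, so the following is only a sketch of how options like those above could map onto the upstream `whisper` CLI. The function name `build_whisper_cmd`, the default format `txt`, and the default output directory `.` are assumptions for illustration, not taken from scripts/transcribe.sh. It prints the command as a dry run instead of executing it.

```shell
# Hypothetical dry-run helper: prints the whisper command that wrapper-style
# options might translate to. The option mapping is an assumption; the real
# scripts/transcribe.sh may differ.
build_whisper_cmd() (
  model="base"      # default model per the examples above
  language=""       # empty means: let Whisper auto-detect the language
  format="txt"      # assumed default, not confirmed by the source
  outdir="."        # assumed default, not confirmed by the source
  timestamps=0
  files=""

  while [ "$#" -gt 0 ]; do
    case "$1" in
      --model)      model="$2";        shift 2 ;;
      --language)   language="$2";     shift 2 ;;
      --format)     format="$2";       shift 2 ;;
      --output-dir) outdir="$2";       shift 2 ;;
      --timestamps) timestamps=1;      shift ;;
      *)            files="$files $1"; shift ;;
    esac
  done

  cmd="whisper --model $model --output_format $format --output_dir $outdir"
  if [ -n "$language" ]; then cmd="$cmd --language $language"; fi
  if [ "$timestamps" -eq 1 ]; then cmd="$cmd --word_timestamps True"; fi

  # Plain string joining: file names containing spaces would need quoting
  # before this output could be executed.
  printf '%s%s\n' "$cmd" "$files"
)
```

Calling `build_whisper_cmd --model small --language de --format srt lecture.wav` prints `whisper --model small --output_format srt --output_dir . --language de lecture.wav`.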

Models

Model | RAM | Speed | Accuracy | Best for
tiny | ~1GB | ⚡⚡⚡ | ★★ | Quick drafts, known language
base | ~1GB | ⚡⚡ | ★★★ | General use (default)
small | ~2GB | | ★★★★ | Good accuracy
medium | ~5GB | 🐢 | ★★★★★ | High accuracy
large | ~10GB | 🐌 | ★★★★★ | Best accuracy (slow on Pi)
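
One way to use the RAM column is to pick the largest model that fits the memory you can spare. The helper below is a hypothetical sketch (`pick_model` is not part of this skill). It takes available RAM in whole gigabytes and applies the approximate figures from the table. Because tiny and base share the same ~1GB figure, it prefers base for its higher accuracy.

```shell
# Hypothetical helper: choose a model name from available RAM (whole GB),
# using the approximate thresholds listed in the Models table.
pick_model() (
  ram_gb="$1"
  if   [ "$ram_gb" -ge 10 ]; then echo large
  elif [ "$ram_gb" -ge 5 ];  then echo medium
  elif [ "$ram_gb" -ge 2 ];  then echo small
  elif [ "$ram_gb" -ge 1 ];  then echo base
  else
    echo "not enough RAM for any model" >&2
    exit 1   # exits only this function's subshell, giving a non-zero status
  fi
)
```

For example, `scripts/transcribe.sh --model "$(pick_model 4)" recording.mp3` would select the small model.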

Output Formats

  • txt — Plain text transcript
  • srt — SubRip subtitles (for video)
  • vtt — WebVTT subtitles
  • json — Detailed JSON with timestamps and confidence
  • all — Generate all formats at once
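
To see which files a run should leave behind, the sketch below lists the expected output paths for a given format. It assumes the wrapper follows the upstream `whisper` CLI convention of naming each output after the input file's base name with the format as the extension, and that `all` expands to the four formats listed above. Both points are assumptions about scripts/transcribe.sh, and `expected_outputs` is a hypothetical name.

```shell
# Hypothetical helper: print the output paths a transcription is expected
# to produce. Arguments: <format> <output-dir> <audio-file>
expected_outputs() (
  format="$1"
  outdir="${2%/}"              # drop one trailing slash, if present
  name=$(basename "$3")
  stem="${name%.*}"            # strip the last extension only

  if [ "$format" = "all" ]; then
    formats="txt srt vtt json" # the four formats listed in this section
  else
    formats="$format"
  fi

  for ext in $formats; do
    printf '%s/%s.%s\n' "$outdir" "$stem" "$ext"
  done
)
```

For instance, `expected_outputs srt ./transcripts/ lecture.wav` prints `./transcripts/lecture.srt`.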

Requirements

  • whisper CLI (pip install openai-whisper)
  • ffmpeg (for audio decoding)
  • First run downloads the model (~150MB for base)
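
A quick pre-flight check can confirm that both tools are on the PATH before the first run. This is a minimal sketch; `missing_tools` is a hypothetical helper name and not part of the skill.

```shell
# Hypothetical helper: print the names of any required tools that are not
# found on PATH, and return a non-zero status if at least one is missing.
missing_tools() (
  missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done

  if [ -n "$missing" ]; then
    printf '%s\n' "${missing# }"   # strip the leading space
    exit 1                         # exits only this function's subshell
  fi
)
```

A typical call would be `missing_tools whisper ffmpeg || echo "install the tools listed above first"`.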
