telegram-voice-transcribe

Transcribe Telegram voice messages and audio notes into text using the OpenAI Whisper API. Use when (1) a user sends a voice message or audio note via Telegram and you need to read or understand its content, (2) you receive a message with a voice file_id in the Telegram message metadata, or (3) the user explicitly asks you to transcribe an audio file. Produces the transcript as plain text so you can respond naturally. Requires the OPENAI_API_KEY environment variable; TELEGRAM_BOT_TOKEN is needed only for file_id mode. NOT for live audio streams, video transcription, or non-Telegram audio pipelines (although the script also accepts local files and URLs).

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy the following and send it to your AI assistant to install this skill:

Install skill "telegram-voice-transcribe" with this command: npx skills add dreadterror/telegram-voice-transcribe

telegram-voice-transcribe

Transcribe Telegram voice notes into text via OpenAI Whisper (whisper-1).

Quick Workflow

  1. Detect a voice message: look for voice.file_id or audio.file_id in the inbound message metadata.
  2. Run the transcription script:
    python3 ~/openclaw/skills/telegram-voice-transcribe/scripts/transcribe.py \
      --file-id <file_id> --language es
    
  3. Read the JSON output; the transcript field contains the text.
  4. Respond to the user based on the transcript content (treat it like typed text).
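
Step 1 of the workflow above can be sketched as follows. This is a minimal illustration, assuming the inbound message arrives as a Python dict shaped like a Telegram Bot API Message object; the helper name extract_file_id and the sample message are hypothetical and not part of the skill.

```python
from typing import Optional


def extract_file_id(message: dict) -> Optional[str]:
    """Return the file_id of a voice message or audio note, or None if absent."""
    # Telegram puts voice notes under "voice" and music/audio files under "audio".
    for key in ("voice", "audio"):
        media = message.get(key)
        if isinstance(media, dict) and media.get("file_id"):
            return media["file_id"]
    return None


# Hypothetical inbound message, trimmed to the relevant fields.
message = {"message_id": 1, "voice": {"file_id": "EXAMPLE_FILE_ID", "duration": 4}}
print(extract_file_id(message))
```

A None result means the message carries no voice or audio attachment, so it can be handled as ordinary text.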

Script Modes

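Mode             | Flag               | When to use
-----------------|--------------------|-----------------------------------------
Telegram file_id | --file-id <id>     | Standard case: voice message in Telegram
Local file       | --file <path>      | Testing, or file already downloaded
URL              | --url <https://...> | Audio hosted externally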

Always pass --language es for Spanish speakers to improve speed and accuracy.
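
The three modes can be turned into a command line with a small helper. This is a sketch, not the skill's own code: build_command and MODE_FLAGS are hypothetical names, the script path is the one shown in the Quick Workflow, and it assumes --language can be combined with any of the three mode flags.

```python
import os
from typing import List, Optional

# Script location as given in the Quick Workflow section.
SCRIPT = os.path.expanduser(
    "~/openclaw/skills/telegram-voice-transcribe/scripts/transcribe.py"
)

# Mode name (hypothetical label) -> flag documented for the script.
MODE_FLAGS = {"file_id": "--file-id", "file": "--file", "url": "--url"}


def build_command(mode: str, value: str, language: Optional[str] = None) -> List[str]:
    """Build the argv list for one transcription run."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["python3", SCRIPT, MODE_FLAGS[mode], value]
    if language:
        # Assumption: --language is accepted alongside every mode flag.
        cmd += ["--language", language]
    return cmd


print(build_command("file_id", "EXAMPLE_FILE_ID", language="es"))
```

The resulting list can be passed to subprocess.run(cmd, capture_output=True, text=True) to collect the script's output.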

Output

{"transcript": "Hola, necesito que hagas un cambio en el juego", "language": "es", "duration_s": 4.2}

If an error key is present in the output, surface its message to the user and check the setup.
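
Reading the output might look like the sketch below. It assumes the script prints a single JSON object on stdout and that the value of the error key is a human-readable message; read_transcript and TranscriptionError are hypothetical names introduced for illustration.

```python
import json


class TranscriptionError(RuntimeError):
    """Raised when the script output carries an error key."""


def read_transcript(stdout: str) -> str:
    """Return the transcript text, or raise if the script reported an error."""
    result = json.loads(stdout)
    if "error" in result:
        # Assumption: the error value is a message suitable for showing to the user.
        raise TranscriptionError(str(result["error"]))
    return result["transcript"]


sample = (
    '{"transcript": "Hola, necesito que hagas un cambio en el juego", '
    '"language": "es", "duration_s": 4.2}'
)
print(read_transcript(sample))
```

Raising an exception on the error key keeps the failure path separate from the normal reply path, so an error is never mistaken for transcript text.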

Environment Requirements

  • OPENAI_API_KEY — required (set via openclaw configure)
  • TELEGRAM_BOT_TOKEN — required for --file-id mode

See references/setup.md for full setup, hooks integration, costs, and local Whisper alternative.
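
A pre-flight check for the variables listed above could be sketched like this. The function name missing_env and the mode label "file_id" are hypothetical; the rule it encodes (OPENAI_API_KEY always, TELEGRAM_BOT_TOKEN only for --file-id mode) is the one stated in this section.

```python
import os
from typing import List, Mapping


def missing_env(mode: str, env: Mapping[str, str] = os.environ) -> List[str]:
    """List the required environment variables that are unset or empty."""
    required = ["OPENAI_API_KEY"]
    if mode == "file_id":
        # The bot token is only needed to download the file from Telegram.
        required.append("TELEGRAM_BOT_TOKEN")
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
print(missing_env("file_id", {"OPENAI_API_KEY": "sk-example"}))
```

Running the check before invoking the script lets the assistant report a missing variable directly instead of waiting for the script to fail.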

Error Handling

Error                        | Fix
-----------------------------|----------------------------------------------------------
OPENAI_API_KEY not set       | Configure key via openclaw configure --section env
TELEGRAM_BOT_TOKEN required  | Add bot token to env
openai package not installed | pip install openai
Telegram 400 Bad Request     | file_id expired; Telegram file_ids expire after ~48h
File too large               | Whisper API limit is 25MB; split audio or use local Whisper
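
The table can also be used programmatically to attach a suggested fix to an error message. The sketch below assumes the script's error text contains the phrases listed in the Error column; the exact wording the script emits is not confirmed here, and suggest_fix is a hypothetical helper.

```python
from typing import Optional

# (phrase expected in the error message, suggested fix), taken from the table.
FIXES = [
    ("OPENAI_API_KEY not set", "Configure key via openclaw configure --section env"),
    ("TELEGRAM_BOT_TOKEN required", "Add bot token to env"),
    ("openai package not installed", "pip install openai"),
    ("400 Bad Request", "file_id expired; ask the user to resend the voice message"),
    ("File too large", "Whisper API limit is 25MB; split audio or use local Whisper"),
]


def suggest_fix(error: str) -> Optional[str]:
    """Return the fix for the first known phrase found in the error, else None."""
    lowered = error.lower()
    for phrase, fix in FIXES:
        if phrase.lower() in lowered:
            return fix
    return None


print(suggest_fix("Telegram 400 Bad Request: invalid file_id"))
```

Unknown errors return None, in which case the raw message should still be surfaced to the user.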

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Speech to Text

Transcribe or translate audio files to text using a public Hugging Face Whisper Space over Gradio. Use when the user sends voice notes, audio attachments, me...

General

Feishu Voice Loop

Accept text or voice input, transcribe if needed, generate natural OpenAI TTS speech, and send audio output to Feishu chat or web player.

General

MLX Audio Server

Local 24x7 OpenAI-compatible API server for STT/TTS, powered by MLX on your Mac.

General

Webchat Voice Full Stack

One-step full-stack installer for OpenClaw WebChat voice input with local speech-to-text. Orchestrates three focused skills in order: local STT backend (fast...
