openrouter-transcribe

Transcribe audio files via OpenRouter using audio-capable models (Gemini, GPT-4o-audio, etc).

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "openrouter-transcribe" with this command: npx skills add obviyus/openrouter-transcribe

OpenRouter Audio Transcription

Transcribe audio files using OpenRouter's chat completions API with input_audio content type. Works with any audio-capable model.

Quick start

{baseDir}/scripts/transcribe.sh /path/to/audio.m4a

Output goes to stdout.

Useful flags

# Custom model (default: google/gemini-2.5-flash)
{baseDir}/scripts/transcribe.sh audio.ogg --model openai/gpt-4o-audio-preview

# Custom instructions
{baseDir}/scripts/transcribe.sh audio.m4a --prompt "Transcribe with speaker labels"

# Save to file
{baseDir}/scripts/transcribe.sh audio.m4a --out /tmp/transcript.txt

# Custom caller identifier (for OpenRouter dashboard)
{baseDir}/scripts/transcribe.sh audio.m4a --title "MyApp"

How it works

  1. Converts audio to WAV (mono, 16kHz) using ffmpeg
  2. Base64 encodes the audio
  3. Sends to OpenRouter chat completions with input_audio content
  4. Extracts transcript from response

API key

Set OPENROUTER_API_KEY env var, or configure in ~/.clawdbot/clawdbot.json:

{
  skills: {
    "openrouter-transcribe": {
      apiKey: "YOUR_OPENROUTER_KEY"
    }
  }
}

Headers

The script sends identification headers to OpenRouter:

  • X-Title: Caller name (default: "Peanut/Clawdbot")
  • HTTP-Referer: Reference URL (default: "https://clawdbot.com")

These show up in your OpenRouter dashboard for tracking.

Troubleshooting

ffmpeg format errors: The script uses a temp directory (not mktemp -t file.wav) because macOS's mktemp adds random suffixes after the extension, breaking format detection.

Argument list too long: Large audio files produce huge base64 strings that exceed shell argument limits. The script writes to temp files (--rawfile for jq, @file for curl) instead of passing data as arguments.

Empty response: If you get "Empty response from API", the script will dump the raw response for debugging. Common causes:

  • Invalid API key
  • Model doesn't support audio input
  • Audio file too large or corrupted

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Hit Preview EN

Analyzes English short video scripts for TikTok, YouTube Shorts, and Instagram Reels to predict hit potential and suggest improvements with AI or local fallb...

Registry SourceRecently Updated
General

Local Document AI OpenVINO

Parse local PDFs and document images with PaddleOCR-VL or PaddleOCR-VL-1.5 on OpenVINO, then route the structured parse into downstream document-to-data or d...

Registry SourceRecently Updated
General

Auto Study

Use when performing study tasks on browser-based platforms such as Yuketang, Xuexitong, Zhihuishu, and Pintia, including answering quizzes and page actions.

Registry SourceRecently Updated
General

宠物体态健康分析技能

Identifies obesity, emaciation, external injuries, skin abnormalities, and abnormal mental states, helping pet owners detect health issues promptly. | 宠物体态健康...

Registry SourceRecently Updated