leanvox

Generate speech (TTS), transcribe audio (STT), create voice-overs, and produce multi-speaker dialogue via the LeanVox API. Use when: (1) converting text to speech, (2) transcribing audio files, (3) cloning a voice from a sample, (4) generating multi-speaker dialogue, (5) creating voice-overs from transcripts, (6) browsing or searching curated voices. Supports Standard (fast, $5/1M chars), Pro (expressive, $10/1M chars), and Max (instruction-based, $30/1M chars) model tiers with 238+ curated voices. Requires LEANVOX_API_KEY from api.leanvox.com.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "leanvox" with this command: npx skills add leanvox/leanvox-skill/leanvox-leanvox-skill-leanvox

LeanVox TTS/STT API Skill

Authentication

Set the LEANVOX_API_KEY environment variable. Get a key at https://api.leanvox.com (free $1.00 signup credit).

export LEANVOX_API_KEY="your-key-here"
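
A wrapper script can fail fast when the key is missing instead of sending an unauthenticated request. The following is a minimal sketch; `require_leanvox_key` is a hypothetical helper name and is not part of the skill's scripts.

```shell
# Hypothetical helper (not part of the skill's scripts).
# Returns 0 when LEANVOX_API_KEY is set and non-empty, 1 otherwise.
require_leanvox_key() {
  if [ -z "${LEANVOX_API_KEY:-}" ]; then
    echo "LEANVOX_API_KEY is not set; get a key at https://api.leanvox.com" >&2
    return 1
  fi
  return 0
}
```

A script could call `require_leanvox_key || exit 1` before invoking any of the commands below.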

Model Tiers — Pick the Cheapest That Works

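Tier     | Cost         | Best For                            | Voice Support
---------|--------------|-------------------------------------|------------------------------------------
Standard | $5/1M chars  | Fast narration, notifications, bulk | 2 built-in voices (af_heart, am_michael)
Pro      | $10/1M chars | Expressive, natural, podcasts       | 238+ curated voices with cloning
Max      | $30/1M chars | Creative, instruction-driven        | Describe voice via text prompt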

Default to Standard unless the user needs specific voices (→ Pro) or voice design from text description (→ Max).
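
The per-tier prices translate into a simple per-request cost estimate. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, not part of the skill's scripts, and it assumes that cost scales linearly with character count and that the 100-character billing minimum listed under Key Constraints applies to each request.

```shell
# Hypothetical helper: estimate the cost in USD of a single request.
# Usage: estimate_cost <standard|pro|max> <char_count>
estimate_cost() {
  local tier="$1"
  local chars="$2"
  local rate
  case "$tier" in
    standard) rate=5 ;;
    pro)      rate=10 ;;
    max)      rate=30 ;;
    *) echo "unknown tier: $tier" >&2; return 1 ;;
  esac
  # Text shorter than 100 characters is billed as 100 characters.
  if [ "$chars" -lt 100 ]; then
    chars=100
  fi
  # rate is USD per 1,000,000 characters.
  LC_ALL=C awk -v r="$rate" -v c="$chars" 'BEGIN { printf "%.6f\n", r * c / 1000000 }'
}
```

For example, `estimate_cost pro 50` prints `0.001000`, because the 50 characters are billed as 100.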

Quick Reference

Generate Speech

scripts/tts.sh "Hello world" --model standard --voice af_heart --output hello.mp3

Transcribe Audio

scripts/stt.sh audio.mp3                          # sync (< 5 min)
scripts/stt.sh audio.mp3 --async                   # async (> 5 min)

Multi-Speaker Dialogue

scripts/dialogue.sh dialogue.json --output conversation.mp3

Voice-Over (transcribe → edit → re-voice)

scripts/voiceover.sh input.mp3 --voice podcast_conversational_female

Browse Voices

scripts/voices.sh --category podcast --gender female

Clone a Voice

scripts/clone.sh reference.wav "Text to speak in cloned voice" --output cloned.mp3

Endpoint Details

For full API reference including all parameters, see references/api-reference.md. For the complete curated voice catalog, see references/voice-catalog.md.

Key Constraints

  • Max text length: 10,000 Unicode characters per request
  • Async threshold: Use async for text > 5,000 chars or audio files > 5 minutes
  • Billing minimum: 100 characters (shorter text billed as 100)
  • Audio format: Returns MP3 via presigned URL (download separately, no auth header)
  • Rate limits: 60 RPM (free), 1,000 RPM (paid)
  • Duration estimate: 1,000 chars ≈ 1 minute of audio output
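
The limits above can be checked before a request is sent. The sketch below uses a hypothetical helper, `plan_tts_request`, which is not part of the skill's scripts. Given a character count, it rejects text over the 10,000-character limit, suggests sync or async using the 5,000-character threshold, and estimates audio length from the rule of thumb that 1,000 chars ≈ 1 minute. It assumes the caller has already counted characters rather than bytes, since the two differ for non-ASCII text.

```shell
# Hypothetical helper: plan a TTS request from a character count.
# Usage: plan_tts_request <char_count>
# Prints "<mode> <estimated_seconds>", or fails if the text is too long.
plan_tts_request() {
  local chars="$1"
  if [ "$chars" -gt 10000 ]; then
    echo "text is $chars chars; split it into requests of at most 10000" >&2
    return 1
  fi
  local mode="sync"
  if [ "$chars" -gt 5000 ]; then
    mode="async"
  fi
  # 1,000 chars is roughly 60 seconds of audio (integer estimate).
  local seconds=$(( chars * 60 / 1000 ))
  echo "$mode $seconds"
}
```

For example, `plan_tts_request 8000` prints `async 480`. In a UTF-8 locale, `printf '%s' "$text" | wc -m` gives a character count that can serve as the input, although the API's own counting may differ in edge cases.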

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
