# Voice Bridge Light

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "Voice Bridge Light" with this command: npx skills add fangbb-coder/voice-bridge-light

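Lightweight voice bridging service that exposes an OpenAI-compatible STT/TTS HTTP API. Speech recognition runs locally with Whisper; speech synthesis uses either Edge TTS (online, the default) or Piper (local, fully offline).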

Features

  • Text-to-speech (TTS): Edge TTS (online) or Piper (local)
  • Speech-to-text (STT): local recognition with Whisper
  • OpenAI-compatible API: follows the OpenAI Audio API request format
  • Lightweight deployment: minimal dependencies, easy to install

Usage

Installation

pip install -r requirements.txt

Start Service

Start with the default engine (Edge TTS):

python api_server.py

Start with Piper (requires a downloaded model file):

TTS_ENGINE=piper PIPER_MODEL=models/piper/zh_CN-huayan-medium.onnx python api_server.py

API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /health | GET | Health check |
| /audio/speech | POST | TTS speech synthesis |
| /audio/transcriptions | POST | STT speech recognition |

Configuration Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| VOICE_BRIDGE_HOST | 0.0.0.0 | Listen address |
| VOICE_BRIDGE_PORT | 18790 | Listen port |
| TTS_ENGINE | edge | TTS engine: edge or piper |
| EDGE_VOICE | zh-CN-XiaoxiaoNeural | Edge TTS voice |
| PIPER_MODEL | models/piper/zh_CN-huayan-medium.onnx | Piper model path |
| STT_MODEL | base | Whisper model size |
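
The way these variables combine with their defaults can be sketched in a few lines of Python. This is only an illustration of the lookup, not the code in api_server.py: the `load_config` helper and the validation of `TTS_ENGINE` are assumptions, while the variable names and default values are taken from the table above.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the configuration table.
DEFAULTS = {
    "VOICE_BRIDGE_HOST": "0.0.0.0",
    "VOICE_BRIDGE_PORT": "18790",
    "TTS_ENGINE": "edge",
    "EDGE_VOICE": "zh-CN-XiaoxiaoNeural",
    "PIPER_MODEL": "models/piper/zh_CN-huayan-medium.onnx",
    "STT_MODEL": "base",
}


def load_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: overlay environment values on the defaults."""
    env = os.environ if env is None else env
    cfg = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    # The port arrives as a string from the environment; convert it once here.
    cfg["VOICE_BRIDGE_PORT"] = int(cfg["VOICE_BRIDGE_PORT"])
    # Assumed validation: the table lists only these two engines.
    if cfg["TTS_ENGINE"] not in ("edge", "piper"):
        raise ValueError(f"TTS_ENGINE must be 'edge' or 'piper', got {cfg['TTS_ENGINE']!r}")
    return cfg


if __name__ == "__main__":
    print(load_config())
```

Passing a plain dict instead of `os.environ` makes the lookup easy to check without touching the real environment.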

TTS Request Example

curl -X POST http://localhost:18790/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Hello, world!",
    "voice": "zh-CN-XiaoxiaoNeural",
    "response_format": "mp3"
  }' \
  --output speech.mp3
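
The same request can be issued from Python using only the standard library. The sketch below mirrors the curl call; the helper names `build_tts_request` and `synthesize_to_file` are hypothetical, and it assumes the service is listening on the default port and returns raw audio bytes in the response body.

```python
import json
import urllib.request

BASE_URL = "http://localhost:18790"  # default port from the configuration table


def build_tts_request(text, voice="zh-CN-XiaoxiaoNeural",
                      response_format="mp3", base_url=BASE_URL):
    """Build (but do not send) the POST /audio/speech request."""
    payload = {"input": text, "voice": voice, "response_format": response_format}
    return urllib.request.Request(
        f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text, path, **kwargs):
    """Send the request and write the returned bytes to `path`.

    Assumes the response body is the audio file itself.
    """
    with urllib.request.urlopen(build_tts_request(text, **kwargs), timeout=30) as resp:
        audio = resp.read()
    with open(path, "wb") as fh:
        fh.write(audio)
    return len(audio)


if __name__ == "__main__":
    # Requires a running Voice Bridge Light instance.
    print(synthesize_to_file("Hello, world!", "speech.mp3"), "bytes written")
```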

STT Request Example

curl -X POST http://localhost:18790/audio/transcriptions \
  -F "file=@speech.mp3" \
  -H "Content-Type: multipart/form-data"
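
For callers that cannot shell out to curl, the upload can be sketched with the Python standard library. curl's -F option generates the multipart boundary automatically; here it is built by hand. The helper names are hypothetical, and because the response schema is not documented in this listing, the sketch returns the body as raw text instead of parsing it.

```python
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:18790"  # default port from the configuration table


def encode_multipart_file(field, filename, data, boundary,
                          content_type="application/octet-stream"):
    """Encode a single file field as a multipart/form-data body."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail


def build_stt_request(audio_bytes, filename="speech.mp3", base_url=BASE_URL):
    """Build (but do not send) the POST /audio/transcriptions request."""
    boundary = uuid.uuid4().hex
    body = encode_multipart_file("file", filename, audio_bytes, boundary, "audio/mpeg")
    return urllib.request.Request(
        f"{base_url}/audio/transcriptions",
        data=body,
        # The boundary must be declared in the header so the server can split parts.
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(path):
    """Upload the file at `path` and return the response body as text."""
    with open(path, "rb") as fh:
        req = build_stt_request(fh.read(), filename=os.path.basename(path))
    with urllib.request.urlopen(req, timeout=120) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Requires a running Voice Bridge Light instance and an existing speech.mp3.
    print(transcribe("speech.mp3"))
```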

OpenClaw Integration

Configure in openclaw.json:

{
  "tts": {
    "enabled": true,
    "provider": "local-piper",
    "baseUrl": "http://127.0.0.1:18790",
    "apiKey": "local",
    "voice": "zh-CN-XiaoxiaoNeural"
  }
}
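
The `baseUrl` in this block has to match the port the service listens on (`VOICE_BRIDGE_PORT`). As a small illustration, the block can be generated from the port instead of being typed by hand. The `openclaw_tts_block` helper is hypothetical; the keys and values are the ones shown above, and nothing here is part of OpenClaw itself.

```python
import json


def openclaw_tts_block(port=18790, voice="zh-CN-XiaoxiaoNeural", provider="local-piper"):
    """Hypothetical helper: build the "tts" block with baseUrl derived from the port."""
    return {
        "tts": {
            "enabled": True,
            "provider": provider,
            "baseUrl": f"http://127.0.0.1:{port}",
            "apiKey": "local",
            "voice": voice,
        }
    }


if __name__ == "__main__":
    # Print the fragment to merge into openclaw.json.
    print(json.dumps(openclaw_tts_block(), indent=2))
```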

Dependencies

  • Python 3.8+
  • edge-tts (Edge TTS)
  • faster-whisper (Whisper STT)
  • soundfile (audio processing)
  • Flask + Flask-CORS (web service)

Service Management

systemd Service (Recommended)

[Unit]
Description=Voice Bridge Light - STT/TTS HTTP API
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/root/.openclaw/workspace/skills/voice-bridge-light
ExecStart=/usr/bin/python3 api_server.py
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

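Save the unit as /etc/systemd/system/voice-bridge-light.service, then reload systemd and enable and start the service: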

systemctl daemon-reload
systemctl enable voice-bridge-light.service
systemctl start voice-bridge-light.service

Performance

  • TTS latency: < 1 s (Edge TTS also needs a network connection)
  • STT latency: depends on audio length; roughly real-time on CPU
  • Memory usage: ~300-500 MB, mostly from the Whisper model

Notes

  • Edge TTS requires internet access to Microsoft services
  • Piper requires its model files to be downloaded before first use
  • The Whisper model loads slowly on first run; a warm-up request is recommended
  • In production, run the service under systemd
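
One way to handle the slow first start is to poll the health endpoint before sending real traffic. The sketch below is an assumption-laden example: `is_healthy` and `wait_until_healthy` are hypothetical helpers, and this listing does not say whether GET /health waits for the Whisper model to load, so a short transcription request afterwards remains the more reliable warm-up.

```python
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:18790/health"  # default port from the configuration table


def is_healthy(url=HEALTH_URL, timeout=2.0):
    """Return True if GET /health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status.
        return False


def wait_until_healthy(check=is_healthy, attempts=30, interval=1.0, sleep=time.sleep):
    """Poll `check` until it succeeds.

    Returns the 1-based attempt number that succeeded, or 0 if none did.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        if attempt < attempts:
            sleep(interval)
    return 0


if __name__ == "__main__":
    n = wait_until_healthy()
    print("service ready" if n else "service did not come up", f"(attempt {n})")
```

The `check` and `sleep` parameters are injectable so the retry loop can be exercised without a running server.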

License

MIT
