voice

Generate voice messages with a local Qwen3-TTS model (offline, Apple Silicon). Converts text to speech with customizable voice, emotion, and speed. Use when the user asks for a voice reply, audio, or TTS.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "voice" with this command: npx skills add ninehills/skills/ninehills-skills-voice

Voice Skill — Local TTS via Qwen3-TTS

Generate voice messages completely offline using Qwen3-TTS on Apple Silicon.

Quick Usage

# Basic TTS (for Chinese text, use vivian or serena for a female voice)
alma tts "Hello, I'm Alma" --voice vivian --output /tmp/voice.wav

# With emotion and speed control
alma tts "Haha that's so funny" --voice vivian --emotion cheerful --speed 1.1 --output /tmp/voice.wav

# English
alma tts "Hello, nice to meet you" --voice serena --output /tmp/voice.wav

Available Voices — ONLY THESE 9 EXIST! DO NOT USE ANY OTHER NAME!

| Voice | Gender | Best For |
|---|---|---|
| serena | Female | English/Chinese, cute and lively ← DEFAULT |
| vivian | Female | Chinese, warm and natural |
| ono_anna | Female | Japanese |
| sohee | Female | Korean |
| uncle_fu | Male | Chinese, mature |
| ryan | Male | English, deep |
| aiden | Male | English, young |
| eric | Male | English, professional |
| dylan | Male | English, casual |

⚠️ ONLY these 9 voices work. Using any other name (e.g. "Claire", "nova", "alloy") will cause TTS to FAIL silently!

Default for Alma: serena (owner's choice)
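
Because an unknown voice name fails silently, it can help to check the name before calling the CLI. The following is a minimal sketch, not part of the skill itself: the function names `is_valid_voice` and `safe_tts` are hypothetical helpers, and the only `alma` flags used are the ones shown in Quick Usage.

```shell
# Voice names copied from the list of available voices above.
VALID_VOICES="serena vivian ono_anna sohee uncle_fu ryan aiden eric dylan"

# is_valid_voice NAME: exit status 0 if NAME is one of the nine voices.
is_valid_voice() {
  local candidate="$1" voice
  for voice in $VALID_VOICES; do
    [ "$voice" = "$candidate" ] && return 0
  done
  return 1
}

# safe_tts TEXT [VOICE] [OUTPUT]: hypothetical wrapper that refuses unknown
# voices with a visible error instead of letting the TTS call fail silently.
safe_tts() {
  local text="$1" voice="${2:-serena}" output="${3:-/tmp/voice.wav}"
  if ! is_valid_voice "$voice"; then
    echo "unknown voice: $voice" >&2
    return 2
  fi
  alma tts "$text" --voice "$voice" --output "$output"
}
```

With these helpers, `safe_tts "Hello" nova` prints an error and returns status 2 rather than producing no audio, while `safe_tts "Hello" ryan /tmp/voice.wav` passes through to `alma tts`.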

Emotion Control

Add --emotion to control the speaking style:

  • cheerful — happy, upbeat
  • sad — somber, quiet
  • angry — forceful, intense
  • whispering — soft, intimate
  • narrator — storytelling tone
  • (or any natural description like "excited and energetic")

Speed Control

Pass --speed with a value from 0.8 (slower) to 1.3 (faster). The default is 1.0.
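
The skill does not say what happens to a speed outside the 0.8 to 1.3 range, so one cautious option is to clamp the value before passing it on. This sketch assumes that range is the supported one; `clamp_speed` is a hypothetical helper, and falling back to 1.0 for non-numeric input is a choice made here, not documented behavior.

```shell
# clamp_speed VALUE: print VALUE limited to the 0.8-1.3 range given above.
# Missing or non-numeric input falls back to the default of 1.0 (assumption).
clamp_speed() {
  awk -v s="${1:-1.0}" 'BEGIN {
    if (s !~ /^[0-9]*\.?[0-9]+$/) s = 1.0   # not a plain decimal number
    if (s < 0.8) s = 0.8
    if (s > 1.3) s = 1.3
    printf "%.2f\n", s
  }'
}
```

It can then be used inline, for example `alma tts "Haha that's so funny" --voice vivian --speed "$(clamp_speed 1.6)" --output /tmp/voice.wav`.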

Sending Voice in Telegram

TTS Settings (you can change these yourself!)

# Check/change auto-voice mode
alma tts auto              # show current mode
alma tts auto off          # no auto voice
alma tts auto inbound      # reply voice only when user sends voice
alma tts auto always       # ALL replies as voice

# Check/change provider
alma tts provider          # show current
alma tts provider local    # use local Qwen3-TTS (no API key needed)
alma tts provider openai   # use OpenAI TTS

# Check/change voice
alma tts voice             # show current
alma tts voice serena      # set voice to serena

When to Send Voice

Auto-TTS mode handles most cases. When auto is set to inbound, replies go out as voice automatically whenever the user sends a voice message. When it is set to always, every reply is sent as voice.

Manual voice (for group chats or sending to different chats):

alma tts "You're right!" --voice serena --output /tmp/voice.wav
alma msg voice $ALMA_CHAT_ID /tmp/voice.wav

⚠️ PRIVATE chats with auto=inbound/always: Do NOT run alma tts manually. The auto system already handles it, and manual + auto = duplicate voice messages.

GROUP chats: Auto-TTS only triggers if the group message was a voice message. For text messages in groups, you can send voice manually when it feels natural: short reactions, fun moments, emotional responses. Be selective (~20-30% of the time).
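
The private/group rules above can be written down as a small check that answers one question: would a manual voice message risk duplicating an automatic one? This is one possible reading of the rules, not the skill's own logic. The function name `manual_voice_allowed` and its three arguments are hypothetical, and treating auto=off as "never fires" is an assumption.

```shell
# manual_voice_allowed CHAT_TYPE AUTO_MODE INBOUND_KIND
#   CHAT_TYPE:    private | group
#   AUTO_MODE:    off | inbound | always
#   INBOUND_KIND: voice | text
# Exit status 0 means a manual voice message should not collide with auto-TTS.
manual_voice_allowed() {
  local chat_type="$1" auto_mode="$2" inbound_kind="$3"
  case "$chat_type" in
    private)
      # Private chats with auto=inbound/always are handled by auto-TTS.
      [ "$auto_mode" = "off" ]
      ;;
    group)
      # In groups auto-TTS fires only on inbound voice, so text messages
      # are the case left for manual voice. (Assumption: off never fires.)
      [ "$inbound_kind" = "text" ] || [ "$auto_mode" = "off" ]
      ;;
    *)
      return 1
      ;;
  esac
}
```

A caller would still apply the "be selective" guidance on top of this, for example running `alma tts` followed by `alma msg voice $ALMA_CHAT_ID /tmp/voice.wav` only when `manual_voice_allowed group inbound text` succeeds and the moment suits a voice reply.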

Tips

  • Keep text under ~200 chars for best quality
  • For long text, split into sentences and generate separately
  • Chinese text: use vivian or uncle_fu
  • The model runs locally — no API key needed, no internet required
  • First run may be slow (model loading), subsequent runs are faster
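
The tip about splitting long text can be sketched as follows. This is an illustration under stated assumptions: `split_sentences` and `tts_long` are hypothetical helpers, sentences are split only on ASCII `.`, `!`, and `?` followed by whitespace (Chinese punctuation such as `。` would need separate handling), and the numbered output file names are made up for the example.

```shell
# split_sentences TEXT: print one sentence per line, splitting after
# runs of . ! ? that are followed by whitespace.
split_sentences() {
  printf '%s\n' "$1" | awk '{
    rest = $0
    while (match(rest, /[.!?]+[ \t]+/)) {
      sentence = substr(rest, 1, RSTART + RLENGTH - 1)
      sub(/[ \t]+$/, "", sentence)          # drop the trailing whitespace
      print sentence
      rest = substr(rest, RSTART + RLENGTH)
    }
    if (rest != "") print rest
  }'
}

# tts_long TEXT [VOICE] [PREFIX]: generate one WAV per sentence,
# written to PREFIX_1.wav, PREFIX_2.wav, and so on.
tts_long() {
  local text="$1" voice="${2:-serena}" prefix="${3:-/tmp/voice}" i=0 sentence
  split_sentences "$text" | while IFS= read -r sentence; do
    i=$((i + 1))
    # </dev/null keeps the command from consuming the sentence stream.
    alma tts "$sentence" --voice "$voice" --output "${prefix}_${i}.wav" </dev/null
  done
}
```

For instance, `tts_long "First point. Second point! Done?" vivian /tmp/reply` would issue three `alma tts` calls, one per sentence, each staying well under the ~200 character guideline.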

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
