voice-agents

Voice agents represent the frontier of AI interaction - humans speaking naturally with AI systems. The challenge isn't just speech recognition and synthesis, it's achieving natural conversation flo...

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "voice-agents" with this command: npx skills add sickn33/antigravity-awesome-skills/sickn33-antigravity-awesome-skills-voice-agents

Voice Agents

You are a voice AI architect who has shipped production voice agents handling millions of calls. You understand the physics of latency: every component adds milliseconds, and the sum determines whether a conversation feels natural or awkward.

Your core insight: two architectures exist. Speech-to-speech (S2S) models such as the OpenAI Realtime API preserve emotion and achieve the lowest latency but are less controllable. Pipeline architectures (STT→LLM→TTS) give you control at each step but add latency. Mos

Capabilities

  • voice-agents
  • speech-to-speech
  • speech-to-text
  • text-to-speech
  • conversational-ai
  • voice-activity-detection
  • turn-taking
  • barge-in-detection
  • voice-interfaces

Patterns

Speech-to-Speech Architecture

Direct audio-to-audio processing for lowest latency

Pipeline Architecture

Separate STT → LLM → TTS for maximum control
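The pipeline shape can be sketched roughly as below. This is a minimal illustration, not a production design: `fake_stt`, `fake_llm`, `fake_tts`, and `run_pipeline` are hypothetical stand-ins invented for this example, and a real system would call actual speech and language services (and would usually stream between stages rather than run them strictly in sequence).

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stand-ins for real STT, LLM, and TTS services.
def fake_stt(audio: bytes) -> str:
    # Pretend the audio bytes are already a UTF-8 transcript.
    return audio.decode("utf-8")

def fake_llm(text: str) -> str:
    return f"You said: {text}"

def fake_tts(text: str) -> bytes:
    # Pretend the encoded text is synthesized audio.
    return text.encode("utf-8")

Stage = Tuple[str, Callable[[Any], Any]]

def run_pipeline(audio: bytes, stages: List[Stage]) -> Tuple[Any, Dict[str, float]]:
    """Run each stage in order and record how long each one took (ms)."""
    timings_ms: Dict[str, float] = {}
    value: Any = audio
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)
        timings_ms[name] = (time.perf_counter() - start) * 1000.0
    return value, timings_ms

STAGES: List[Stage] = [("stt", fake_stt), ("llm", fake_llm), ("tts", fake_tts)]
reply_audio, timings_ms = run_pipeline(b"hello", STAGES)
```

Because each stage is an ordinary function, any one of them can be swapped, logged, or filtered independently, which is the control the pipeline approach trades latency for.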

Voice Activity Detection Pattern

Detect when user starts/stops speaking
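One simple way to detect speech start and stop is an energy threshold with a short "hangover" so brief pauses do not split an utterance. The sketch below assumes audio has already been reduced to one energy value per frame; the function name, thresholds, and frame counts are illustrative assumptions, and production systems typically use a trained VAD model instead.

```python
from typing import Iterable, List, Tuple

def detect_speech_segments(
    frame_energies: Iterable[float],
    threshold: float = 0.1,        # assumed energy level that counts as speech
    min_speech_frames: int = 2,    # drop blips shorter than this
    hangover_frames: int = 3,      # silence frames tolerated inside a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of speech segments; end is exclusive."""
    energies = list(frame_energies)
    segments: List[Tuple[int, int]] = []
    start = None
    silence_run = 0
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i          # speech onset
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= hangover_frames:
                end = i - silence_run + 1   # frame after the last speech frame
                if end - start >= min_speech_frames:
                    segments.append((start, end))
                start = None
                silence_run = 0
    if start is not None:
        # Close a segment that was still open when the audio ended.
        end = len(energies) - silence_run
        if end - start >= min_speech_frames:
            segments.append((start, end))
    return segments
```

With `hangover_frames=3`, a single silent frame between two voiced frames stays inside one segment, while three or more consecutive silent frames close it.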

Anti-Patterns

❌ Ignoring Latency Budget
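A latency budget can be made explicit by giving each component a limit and checking measurements against it. The following is a sketch only: the component names, the per-component limits, and the overall target are illustrative assumptions, not figures from this skill.

```python
from typing import Dict

# Illustrative per-component limits in milliseconds (assumed values).
BUDGET_MS: Dict[str, float] = {
    "vad_endpointing": 200,
    "stt": 150,
    "llm_first_token": 300,
    "tts_first_audio": 150,
    "network": 100,
}

def check_budget(
    measured_ms: Dict[str, float],
    budget_ms: Dict[str, float],
    total_target_ms: float,
) -> Dict[str, object]:
    """Sum measured latencies and report which components exceed their limit."""
    over_budget = {
        name: measured_ms[name] - limit
        for name, limit in budget_ms.items()
        if measured_ms.get(name, 0) > limit
    }
    total = sum(measured_ms.values())
    return {
        "total_ms": total,
        "within_target": total <= total_target_ms,
        "over_budget": over_budget,
    }

measured = {
    "vad_endpointing": 180,
    "stt": 200,
    "llm_first_token": 250,
    "tts_first_audio": 100,
    "network": 80,
}
report = check_budget(measured, BUDGET_MS, total_target_ms=900)
```

In this example the total stays under the assumed target even though one component exceeds its own limit, which is why tracking both the sum and the per-component figures is useful.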

❌ Silence-Only Turn Detection
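To illustrate why silence alone is a weak signal, the sketch below combines a silence timer with a crude check of whether the transcript looks finished. The word list, function names, and timeouts are hypothetical, and the word-list heuristic is only a stand-in for a real semantic end-of-turn model.

```python
# Words that usually signal the speaker is not finished (assumed list).
TRAILING_INCOMPLETE = {"and", "but", "so", "or", "because", "um", "uh", "the", "to"}

def looks_complete(transcript: str) -> bool:
    """Crude completeness check: the last word is not a filler or connective."""
    words = transcript.lower().rstrip(".?! ").split()
    if not words:
        return False
    return words[-1].strip(",") not in TRAILING_INCOMPLETE

def should_end_turn(
    transcript: str,
    silence_ms: float,
    short_ms: float = 400,    # assumed pause that ends a complete-sounding turn
    long_ms: float = 1500,    # assumed pause that ends the turn regardless
) -> bool:
    if silence_ms >= long_ms:
        return True
    return silence_ms >= short_ms and looks_complete(transcript)
```

A pause of 500 ms after "I want to book a flight and" does not end the turn here, whereas the same pause after "I want to book a flight" does; a silence-only detector cannot tell the two apart.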

❌ Long Responses
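Response length can be limited in the prompt and, as a safety net, clamped after generation. The sketch below assumes a hypothetical prompt suffix and a naive sentence splitter; the splitter will mis-handle abbreviations such as "Dr." and is meant only to show the idea.

```python
import re

# Hypothetical instruction appended to the system prompt.
SYSTEM_PROMPT_SUFFIX = "Answer in at most two short sentences, as if speaking aloud."

def clamp_sentences(text: str, max_sentences: int = 2) -> str:
    """Keep only the first `max_sentences` sentences of a reply."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])

reply = clamp_sentences(
    "Sure. Your flight leaves at nine. Gate B12 is on the second floor. Anything else?"
)
```

The prompt instruction does most of the work; the clamp only guards against the occasional reply that ignores it.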

⚠️ Sharp Edges

Issue    Severity    Solution
Issue    critical    Measure and budget latency for each component:
Issue    high        Target jitter metrics:
Issue    high        Use semantic VAD:
Issue    high        Implement barge-in detection:
Issue    medium      Constrain response length in prompts:
Issue    medium      Prompt for spoken format:
Issue    medium      Implement noise handling:
Issue    medium      Mitigate STT errors:
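As one illustration of the barge-in item, a small controller can stop agent playback once the user has been speaking for a few consecutive frames. This is a sketch under assumptions: the class name, the frame-count threshold, and the boolean per-frame speech flag (which would come from a VAD) are all hypothetical.

```python
class BargeInController:
    """Signal an interruption when the user talks over agent playback."""

    def __init__(self, min_speech_frames: int = 3) -> None:
        # Require several consecutive speech frames so a cough or click
        # does not cut the agent off (assumed threshold).
        self.min_speech_frames = min_speech_frames
        self.agent_speaking = False
        self._speech_run = 0

    def start_playback(self) -> None:
        self.agent_speaking = True
        self._speech_run = 0

    def on_frame(self, user_is_speaking: bool) -> bool:
        """Return True exactly when playback should be interrupted."""
        if not self.agent_speaking:
            return False
        self._speech_run = self._speech_run + 1 if user_is_speaking else 0
        if self._speech_run >= self.min_speech_frames:
            self.agent_speaking = False
            self._speech_run = 0
            return True
        return False

controller = BargeInController(min_speech_frames=3)
controller.start_playback()
frames = [True, True, False, True, True, True, True]
decisions = [controller.on_frame(f) for f in frames]
```

When `on_frame` returns True, the caller would stop TTS output and hand the turn back to the user; that wiring is left out because it depends on the audio stack in use.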

Related Skills

Works well with: agent-tool-builder, multi-agent-orchestration, llm-architect, backend

When to Use

Use this skill when carrying out the workflows or actions described in the overview.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
