video-producer-agent

Use this skill to create complete videos with voiceover and music. Triggers: "create video", "product video", "explainer video", "promo video", "demo video", "training video", "ad video", "commercial", "marketing video", "video with voiceover", "video with music", "brand video", "testimonial video" Orchestrates: script, voiceover, background music, video clips/images, and final assembly.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "video-producer-agent" with this command: npx skills add michaelboeding/skills/michaelboeding-skills-video-producer-agent

Video Producer

Create complete videos with voiceover, music, and visuals.

This is an orchestrator skill that combines:

  • Script/storyboard generation (Claude)
  • Voiceover synthesis (Gemini TTS)
  • Background music (Lyria)
  • Video clip generation (Veo 3.1) or image animation
  • Final assembly (FFmpeg via media-utils)

Workflow

Step 1: Gather Requirements (REQUIRED)

⚠️ DO NOT skip this step. DO NOT run init_project.py until you have ALL answers.

Use interactive questioning — ask ONE question at a time, wait for the response, then ask the next. This creates a collaborative spec-driven process.

Question Flow

⚠️ Use the AskUserQuestion tool for each question below. Do not just print questions in your response — use the tool to create interactive prompts with the options shown.

Q1: Subject

"I'll create that video! First — what's it about?

(e.g., product launch, brand story, tutorial, explainer — or describe your own)"

Wait for response.

Q2: Duration

"How long should the video be?

  • 15 seconds (quick hook)
  • 30 seconds (standard ad)
  • 60 seconds (explainer)
  • 2+ minutes (detailed)
  • Or specify your own duration"

Wait for response.

Q3: Style

"What visual style?

  • Premium/luxury
  • Fun/playful
  • Corporate/professional
  • Dramatic/cinematic
  • Minimal/clean
  • Or describe your own style"

Wait for response.

Q4: Assets

"Do you have existing images or video clips to use?

  • No, generate everything
  • Yes, I have images (provide paths)
  • Yes, I have video clips (provide paths)"

Wait for response.

Q5: Audio Strategy

"How should we handle audio?

  • Custom — I generate voiceover + background music
  • Veo native — Use Veo's built-in dialogue/SFX/ambient
  • Silent — No audio, add later"

Wait for response.

Q6: Voice (if custom audio)

"What voice tone for the voiceover?

  • Professional
  • Friendly/warm
  • Energetic
  • Calm/soothing
  • Dramatic
  • Or describe your own tone"

Wait for response.

Q7: Music (if custom audio)

"What music vibe?

  • Modern electronic
  • Cinematic/epic
  • Upbeat pop
  • Ambient/chill
  • Corporate
  • Or describe your own style"

Wait for response.

Q8: Format

"What aspect ratio?

  • 16:9 (YouTube, web)
  • 9:16 (TikTok, Reels, Shorts)
  • 1:1 (Instagram feed)"

Wait for response.

Q9: Resolution

"What resolution?

  • 720p (faster generation)
  • 1080p (standard HD)"

Wait for response.

Q10: Model

"Which Veo model?

  • veo-3.1 — Latest, highest quality (default)
  • veo-3.1-fast — Faster generation, slightly lower quality
  • veo-3 — Previous generation
  • veo-3-fast — Previous gen, faster"

Wait for response.

Quick Reference

QuestionDetermines
SubjectScene content and prompts
DurationScene count (Veo clips must be 4, 6, or 8 seconds)
StyleVisual prompts and music selection
AssetsGenerate vs use existing
Audiocustom, veo_audio, or silent
VoiceTTS voice selection
MusicLyria prompt
FormatAspect ratio for Veo
Resolution720p or 1080p output quality
Modelveo-3.1, veo-3.1-fast, veo-3, veo-3-fast

Step 2: Initialize Project

Once you have the user's answers, initialize the project with their preferences:

python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-producer-agent/scripts/init_project.py \
  --name "Product Launch Video" \
  --duration 30 \
  --aspect-ratio 16:9 \
  --audio-strategy custom \
  --scenes 5

Step 3: Configure project.json

Edit project.json with scene prompts, voiceover text, and music style based on user's answers.

Step 4: Assemble the Video

python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-producer-agent/scripts/assemble.py \
  --project ~/my_video_project/

Project Structure

When you initialize a project, this folder structure is created:

my_project/
├── project.json          # Configuration: scenes, voiceover, music, settings
├── storyboard.md         # Planning document for the video
├── scenes/               # Generated video clips from Veo
│   ├── scene1_intro.mp4
│   ├── scene2_features.mp4
│   └── scene3_cta.mp4
├── audio/                # Audio assets
│   ├── voiceover.wav     # Generated voiceover
│   ├── background_music.wav  # Generated music
│   └── final_mix.mp3     # Mixed audio track
├── work/                 # Intermediate files (auto-generated)
│   ├── silent_scene1.mp4
│   ├── video_concatenated.mp4
│   └── ...
└── output/               # Final deliverables
    └── product_launch_video_final.mp4

Scripts

init_project.py

Initialize a new video project with folder structure and templates.

# Basic project
python3 init_project.py --name "My Video" --duration 30

# Create in specific directory
python3 init_project.py --name "Demo Video" --output ~/Videos/

# Vertical video for social
python3 init_project.py --name "Instagram Reel" --aspect-ratio 9:16 --duration 15

# Use Veo's native audio (no custom voiceover/music)
python3 init_project.py --name "Cinematic Scene" --audio-strategy veo_audio

# More scenes
python3 init_project.py --name "Long Video" --duration 60 --scenes 5

Options:

OptionDefaultDescription
--namerequiredProject name
--outputcurrent dirParent directory
--duration30Target duration in seconds
--aspect-ratio16:916:9, 9:16, 1:1, 4:3
--audio-strategycustomcustom, veo_audio, silent
--scenes3Number of scene placeholders

assemble.py

Orchestrate the full video assembly pipeline.

# Full pipeline (generate everything + assemble)
python3 assemble.py --project ~/my_project/

# Skip generation (use existing scene/audio files)
python3 assemble.py --project ~/my_project/ --skip-generation

# Dry run (show what would be done)
python3 assemble.py --project ~/my_project/ --dry-run

Pipeline steps:

  1. Generate video scenes (Veo 3.1)
  2. Strip audio from scenes (if custom audio)
  3. Generate voiceover (Gemini TTS)
  4. Generate background music (Lyria)
  5. Mix voiceover + music
  6. Concatenate video clips
  7. Merge audio with video
  8. Output final video

project.json Configuration

{
  "name": "Product Launch Video",
  "duration_target": 30,
  "aspect_ratio": "16:9",
  "resolution": "720p",
  "audio_strategy": "custom",
  
  "scenes": [
    {
      "id": 1,
      "name": "scene1_hero",
      "prompt": "Cinematic slow zoom on premium product, dramatic lighting, high-end commercial style",
      "duration": 6,
      "notes": "Music only, no voiceover"
    },
    {
      "id": 2,
      "name": "scene2_features",
      "prompt": "Product features demonstration, sleek animations, modern tech aesthetic",
      "duration": 8,
      "notes": "Voiceover starts here"
    },
    {
      "id": 3,
      "name": "scene3_cta",
      "prompt": "Product with logo on clean background, call to action moment",
      "duration": 6,
      "notes": "Music swells, voiceover ends"
    }
  ],
  
  "voiceover": {
    "enabled": true,
    "text": "Introducing the future of audio. Crystal clear sound. All-day comfort. Experience the difference.",
    "voice": "Charon",
    "style": "Professional, confident, premium brand voice"
  },
  
  "music": {
    "enabled": true,
    "prompt": "modern electronic, premium, sleek, product showcase, subtle bass",
    "duration": 35,
    "bpm": 100,
    "brightness": 0.6
  },
  
  "assembly": {
    "transition": "fade",
    "transition_duration": 0.5,
    "music_volume": 0.3,
    "fade_in": 1.0,
    "fade_out": 2.0
  }
}

Audio Strategies

StrategyDescriptionUse When
customStrip Veo audio, add custom voiceover + musicMost videos
veo_audioKeep Veo's generated audio (dialogue, SFX)Cinematic scenes, dialogues
silentStrip audio, output silent videoAdding audio later

Workflow: Creating a Video

Step 1: Initialize Project

python3 init_project.py --name "Wireless Earbuds Promo" --duration 30

Step 2: Plan the Storyboard

Edit storyboard.md to plan your video structure:

## Scene 1: Hero Reveal (0-5s)
- Visual: Earbuds emerging from shadow, premium lighting
- Audio: Music only (dramatic intro)

## Scene 2: Sound Quality (5-12s)
- Visual: Sound waves, person enjoying music
- Audio: Voiceover: "Crystal clear sound. Immersive bass."

## Scene 3: Comfort (12-20s)
- Visual: Close-up of fit, person running
- Audio: Voiceover: "All-day comfort. Secure fit."

## Scene 4: CTA (20-30s)
- Visual: Product + logo
- Audio: Voiceover: "Experience the difference." + music swell

Step 3: Configure project.json

Fill in the scene prompts, voiceover text, and music style based on your storyboard.

Step 4: Assemble

python3 assemble.py --project ~/wireless_earbuds_promo/

Step 5: Review and Iterate

Check the output in output/. If adjustments needed:

# Re-run with existing scenes (just re-mix audio)
python3 assemble.py --project ~/wireless_earbuds_promo/ --skip-generation

What You Can Create

TypeExample
Product video30s hero video showcasing a product
Explainer videoHow-to or feature explanation
Promo/ad videoMarketing advertisement
Demo videoProduct demonstration
Training videoInternal training content
TestimonialCustomer quote style video
Brand videoCompany/brand story

Prerequisites

  • GOOGLE_API_KEY - For Veo (video), Gemini TTS (voice), Lyria (music)
  • FFmpeg installed: brew install ffmpeg

Video Styles & Music Pairings

StyleMusic PromptVoice
Premium/Luxury"elegant, minimal, ambient, sophisticated"Charon (informative)
Tech/Modern"electronic, futuristic, clean, innovative"Kore (firm)
Fun/Playful"upbeat, cheerful, acoustic, positive"Puck (upbeat)
Corporate"professional, inspiring, orchestral lite"Orus (firm)
Lifestyle"chill, aspirational, indie, warm"Aoede (breezy)
Dramatic/Cinematic"epic, orchestral, emotional, building"Gacrux (mature)

Common Video Structures

Product Video (30s)

0-5s:   Hero shot (music only)
5-15s:  Features (voiceover + music)
15-25s: Lifestyle/use case (voiceover + music)
25-30s: Logo + CTA (music fade)

Explainer Video (60s)

0-5s:   Hook/problem statement
5-20s:  Solution introduction
20-45s: How it works (3 steps)
45-55s: Benefits summary
55-60s: CTA

Testimonial Video (45s)

0-5s:   Intro/name card
5-35s:  Testimonial quote (multiple scenes)
35-45s: Product shot + logo

Manual Workflow (Without Scripts)

If you prefer to run each step manually:

Generate scenes:

python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
  --batch scenes.json

Generate voiceover:

python3 ${CLAUDE_PLUGIN_ROOT}/skills/voice-generation/scripts/gemini_tts.py \
  --text "Your voiceover text..." \
  --voice Charon \
  --style "Professional, warm"

Generate music:

python3 ${CLAUDE_PLUGIN_ROOT}/skills/music-generation/scripts/lyria.py \
  --prompt "modern electronic, premium" \
  --duration 35

Strip audio from clips:

python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_strip_audio.py \
  -i scene*.mp4

Concatenate videos:

python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_concat.py \
  -i silent_*.mp4 --transition fade -o video.mp4

Mix audio:

python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/audio_mix.py \
  --voice voiceover.wav --music music.wav -o audio.mp3

Merge audio + video:

python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_audio_merge.py \
  --video video.mp4 --audio audio.mp3 -o final.mp4

Input Files You Can Provide

File TypeHow It's Used
Product imagesAnimate with Veo as first frame
Logo (PNG)Overlay on final scene
Existing voiceoverPlace in audio/voiceover.wav
Brand musicPlace in audio/background_music.wav
Video clipsPlace in scenes/ folder
Script/copyUse for voiceover text

Limitations

  • Veo video duration: Max 8 seconds per clip (concatenate for longer)
  • Veo 3.1 includes audio: All clips have generated audio (strip if using custom)
  • Processing time: Video generation takes 1-3 minutes per clip
  • Resolution: Currently 720p or 1080p (1080p for 8s only)

Error Handling

ErrorSolution
"GOOGLE_API_KEY not set"Set up API key per README
"FFmpeg not found"Install: brew install ffmpeg
"project.json not found"Run init_project.py first
Video generation timeoutRetry, or use shorter duration
Audio/video sync issuesAdjust scene durations

Example Prompts

Simple:

"Create a 30-second product video for my new coffee maker"

Detailed:

"Create a 45-second product video for our new wireless earbuds. Premium, luxury feel. I have product photos attached. Professional male voiceover. Modern electronic music. 16:9 for YouTube."

With project:

"Initialize a video project called 'App Demo' with 5 scenes, 60 seconds total, vertical format for TikTok"

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

add-to-xcode

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

patent-lawyer-agent

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

audio-producer-agent

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

review-analyst-agent

No summary provided by upstream source.

Repository SourceNeeds Review