MV Render
Render music videos with waveform visualization and synced lyrics from audio + lyrics input.
Prerequisites
- Remotion project in the scripts/ directory of this skill
- Node.js + npm dependencies installed
- ffprobe available (for audio duration detection)
First-Time Setup
Before first use, check and install dependencies:
1. Check Node.js
node --version
2. Install npm dependencies
cd {project_root}/{.claude or .codex}/skills/acestep-simplemv/scripts && npm install
3. Check ffprobe
ffprobe -version
If ffprobe is not available, install ffmpeg (which includes ffprobe):
- Windows: choco install ffmpeg, or download from https://ffmpeg.org/download.html and add to PATH
- macOS: brew install ffmpeg
- Linux: sudo apt-get install ffmpeg (Debian/Ubuntu) or sudo dnf install ffmpeg (Fedora)
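The three checks above can be combined into a single preflight pass. The following is a minimal sketch; the helper name check_deps is hypothetical and is not part of render-mv.sh.

```shell
# Hypothetical helper (not part of this skill): report tools missing from PATH.
# Prints one missing tool name per line; returns 0 only if all are present.
check_deps() {
  local missing=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage, with the tools named in the prerequisites:
#   check_deps node npm ffprobe || echo "install the tools listed above" >&2
```

Because the function only reports and never installs, it is safe to run before every render.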
Quick Start
cd {project_root}/{.claude or .codex}/skills/acestep-simplemv/
./scripts/render-mv.sh --audio /path/to/song.mp3 --lyrics /path/to/song.lrc --title "Song Title"
Output: MP4 file at out/<audio_basename>.mp4 (or custom --output path).
Script Usage
./scripts/render-mv.sh --audio <file> --lyrics <lrc_file> --title "Title" [options]
Options:
  --audio        Audio file path (absolute paths supported)
  --lyrics       LRC format lyrics file (timestamped)
  --lyrics-json  JSON lyrics file [{start, end, text}] (alternative to --lyrics)
  --title        Video title (default: "Music Video")
  --subtitle     Subtitle text
  --credit       Bottom credit text
  --offset       Lyric timing offset in seconds (default: -0.5)
  --output       Output file path (default: out/<audio_basename>.mp4)
  --codec        h264|h265|vp8|vp9 (default: h264)
  --background   Background image file path (if omitted, uses animated gradient)
  --browser      Custom browser executable path (Chrome/Edge/Chromium)
  --max-size     Max output file size in MB (e.g. 24). Auto-compresses if exceeded.
                 Use for IM platforms (WhatsApp ≤ 16MB, Discord ≤ 25MB, Telegram ≤ 50MB)

Environment variables:
  BROWSER_EXECUTABLE  Path to browser executable (overrides auto-detection)
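To make the --offset option concrete, the sketch below converts an LRC timestamp of the common form mm:ss.xx to seconds and applies an offset. It assumes the offset is simply added to each timestamp; the script's actual handling may differ. The function name lrc_to_seconds is hypothetical.

```shell
# Hypothetical helper: convert an LRC timestamp (mm:ss.xx) to seconds,
# then add an offset. Assumption: the offset is added to every timestamp.
#   $1 = timestamp such as 01:23.45
#   $2 = offset in seconds (defaults to -0.5, the documented --offset default)
lrc_to_seconds() {
  awk -v ts="$1" -v off="${2:--0.5}" 'BEGIN {
    split(ts, p, ":")               # p[1] = minutes, p[2] = seconds
    printf "%.2f\n", p[1] * 60 + p[2] + off
  }'
}

# lrc_to_seconds 00:12.50     -> 12.00  (12.5 s shifted 0.5 s earlier)
# lrc_to_seconds 01:23.45 0   -> 83.45  (no shift)
```

Under this assumption, a negative offset makes each lyric line appear slightly before it is sung.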
Browser Detection
Remotion requires a Chromium-based browser for rendering. The script auto-detects browsers in this priority order:
1. BROWSER_EXECUTABLE environment variable
2. --browser CLI argument
3. Remotion cache (chrome-headless-shell, downloaded by Remotion)
4. System Chrome (auto-uses --chrome-mode=chrome-for-testing)
5. System Edge (pre-installed on Windows 10/11, auto-uses --chrome-mode=chrome-for-testing)
6. System Chromium (auto-uses --chrome-mode=chrome-for-testing)
Important: Recent versions of Chrome/Edge removed the old headless mode. When using regular Chrome/Edge/Chromium, the script automatically sets --chrome-mode=chrome-for-testing (which uses --headless=new). When using chrome-headless-shell, it uses the default headless-shell mode (which uses --headless=old). This is handled transparently.
If no browser is found, Remotion will attempt to download chrome-headless-shell from Google servers. This will fail if Google servers are inaccessible from your network.
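The priority order above amounts to a first-match search. The sketch below illustrates that idea only: the function name pick_browser and the Linux command names it probes (google-chrome, microsoft-edge, chromium) are assumptions, and the Remotion cache step is omitted because its location is not stated here.

```shell
# Hypothetical sketch of a first-match browser search (not the script's code).
#   $1 = value given via --browser (may be empty)
# Prints the first candidate that is an executable file; returns 1 if none is.
pick_browser() {
  local cli_arg="$1"
  local candidate
  for candidate in \
    "${BROWSER_EXECUTABLE:-}" \
    "$cli_arg" \
    "$(command -v google-chrome 2>/dev/null)" \
    "$(command -v microsoft-edge 2>/dev/null)" \
    "$(command -v chromium 2>/dev/null)"; do
    if [ -n "$candidate" ] && [ -x "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1  # nothing found: Remotion would then try to download a browser
}
```

Listing the environment variable first mirrors the documented order, in which BROWSER_EXECUTABLE takes precedence over --browser.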
Workarounds for restricted networks
Since Edge is pre-installed on Windows 10/11, it should be auto-detected without any manual configuration. The script automatically detects Chrome/Edge and uses the correct headless mode. If auto-detection fails:
Option 1: Set environment variable
export BROWSER_EXECUTABLE="/path/to/msedge.exe"
Option 2: Pass as CLI argument
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "Song" --browser "/path/to/msedge.exe"
Option 3: Enable proxy and let Remotion download chrome-headless-shell
Examples
Basic render
./scripts/render-mv.sh --audio /tmp/abc123_1.mp3 --lyrics /tmp/abc123.lrc --title "夜桜"
Custom output path
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "My Song" --output /tmp/my_mv.mp4
With subtitle and credit
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "Song" --subtitle "Artist Name" --credit "Generated by ACE-Step"
With background image
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "Song" --background /path/to/cover.jpg
Compress for Discord upload (max 25MB)
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "Song" --max-size 24
Compress for WhatsApp (max 16MB)
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "Song" --max-size 15
IM Platform Upload Limits
When sending MV to chat platforms, use --max-size to auto-compress:
| Platform | Limit | Recommended --max-size |
| --- | --- | --- |
| WhatsApp | 16MB | 15 |
| Discord (free) | 25MB | 24 |
| Telegram | 50MB | 48 |
| Slack (free) | 1GB | - |
The compression uses ffmpeg two-pass encoding to achieve the best quality within the size constraint.
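Two-pass encoding works toward a target bitrate, so the size limit has to be turned into a bitrate budget first. The arithmetic can be sketched as below; the 128 kbit/s audio reservation, the 1 MB = 8192 kbit convention, and the function name are assumptions rather than the script's actual settings.

```shell
# Hypothetical helper: video bitrate (kbit/s) that fits a size budget.
#   $1 = max size in MB, $2 = duration in whole seconds,
#   $3 = audio bitrate in kbit/s to reserve (assumed default: 128)
# Uses 1 MB = 1024 KiB = 8192 kbit and integer (floor) division.
video_kbps_for_size() {
  local max_mb="$1"
  local duration_s="$2"
  local audio_kbps="${3:-128}"
  echo $(( max_mb * 8192 / duration_s - audio_kbps ))
}

# video_kbps_for_size 24 90    -> 2056  (24 MB budget, 90 s song)
# video_kbps_for_size 15 180   -> 554   (15 MB budget, 3 min song)

# A typical ffmpeg two-pass run with that bitrate (illustrative only):
#   ffmpeg -y -i in.mp4 -c:v libx264 -b:v 2056k -pass 1 -an -f null /dev/null
#   ffmpeg -i in.mp4 -c:v libx264 -b:v 2056k -pass 2 -c:a aac -b:a 128k out.mp4
```

The recommended --max-size values sit slightly below each platform limit, which leaves headroom for container overhead.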
Container / Docker Font Support
When running in containers (e.g. OpenClaw), CJK fonts may not be pre-installed, causing lyrics to render as □ boxes. The script automatically:
- Detects if CJK fonts are available (via fc-list)
- Attempts to install fonts-noto-cjk (Debian/Ubuntu), font-noto-cjk (Alpine), or google-noto-sans-cjk-fonts (Fedora/RHEL)
- Falls back with a warning and manual install instructions if auto-install fails
If auto-install doesn't work, manually install fonts before rendering:
Debian/Ubuntu
apt-get install -y fonts-noto-cjk
Alpine
apk add font-noto-cjk
Fedora/RHEL
dnf install -y google-noto-sans-cjk-fonts
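To check by hand whether a container already has usable fonts, fontconfig can be asked for fonts that cover a given language. This is a minimal sketch of such a check, not the script's own detection logic; the function name is hypothetical.

```shell
# Hypothetical helper: does fontconfig know a font covering this language?
#   $1 = language tag such as zh, ja, or ko
# Returns 0 if at least one font matches, 1 if none, 2 if fc-list is missing.
has_font_for_lang() {
  local lang="$1"
  command -v fc-list >/dev/null 2>&1 || return 2
  [ -n "$(fc-list ":lang=$lang" 2>/dev/null)" ]
}

# Usage:
#   has_font_for_lang zh || echo "no Chinese-capable font found" >&2
```

Checking zh, ja, and ko separately is useful because a font may cover only one of the three.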
File Naming
IMPORTANT: Name the output after the audio file's job ID so that renders do not overwrite each other. Do NOT use arbitrary names like --output my_song.mp4.
If --output is omitted, the default name is derived from the audio filename (out/<audio_basename>.mp4). When passing --output explicitly, build the path from the job ID:
- Audio: acestep_output/{job_id}_1.mp3
- Lyrics: acestep_output/{job_id}_1.lrc
- Video: pass --output acestep_output/{job_id}.mp4 (use the job ID from the audio file)
Example: if audio is chatcmpl-abc123_1.mp3, pass --output acestep_output/chatcmpl-abc123.mp4
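The job-ID-based path can be derived from the audio path with plain parameter expansion. The sketch assumes audio files are always named {job_id}_{n}.{ext}, as in the example above; the function name is hypothetical.

```shell
# Hypothetical helper: build the --output path from an audio path.
# Assumption: the audio file is named {job_id}_{n}.{ext}.
output_for_audio() {
  local audio="$1"
  local base
  base="${audio##*/}"   # strip directory       -> chatcmpl-abc123_1.mp3
  base="${base%.*}"     # strip extension       -> chatcmpl-abc123_1
  base="${base%_*}"     # strip trailing _{n}   -> chatcmpl-abc123
  echo "acestep_output/${base}.mp4"
}

# output_for_audio acestep_output/chatcmpl-abc123_1.mp3
#   -> acestep_output/chatcmpl-abc123.mp4
```

Only the last underscore-delimited segment is removed, so job IDs that themselves contain underscores are preserved.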
Title Guidelines
- Keep --title short and single-line (max ~50 chars, auto-truncated)
- Use --subtitle for additional info
- Do NOT put newlines in --title
Good: --title "Open Source" --subtitle "ACE-Step v1.5"
Bad: --title "Open Source - ACE-Step v1.5\nCelebrating Music AI"
Notes
- Audio files with absolute paths are auto-copied to public/ by render.mjs
- Duration is auto-detected via ffprobe
- Typical render time: ~1-2 minutes for a 90s song
- Output resolution: 1920x1080, 30fps
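For reference, duration detection with ffprobe and the resulting frame count at 30fps can be sketched as follows. The function names are hypothetical, and rounding the frame count up is an assumption; render.mjs may compute it differently.

```shell
# Print the duration of a media file in seconds (floating point).
audio_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Hypothetical helper: frames needed for a duration, rounded up.
#   $1 = duration in seconds, $2 = frames per second (default: 30)
frames_for() {
  awk -v d="$1" -v fps="${2:-30}" 'BEGIN {
    n = d * fps
    f = int(n)
    if (f < n) f++      # round up so the tail of the audio is not cut off
    print f
  }'
}

# frames_for 90     -> 2700
# frames_for 90.5   -> 2715
# Combined: frames_for "$(audio_duration /path/to/song.mp3)"
```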