video-with-music

Skip the learning curve of professional editing software. Describe what you want — add background music that matches the mood of my video — and get music-backed videos back in 30-60 seconds. Upload MP4, MOV, AVI, WebM files up to 500MB, and the AI handles AI music addition automatically. Ideal for TikTok creators who want to add music to videos without using complex editing software.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "video-with-music" with this command: npx skills add tk8544-b/video-with-music

Getting Started

Send me your video clips and I'll handle the AI music addition. Or just describe what you're after.

Try saying:

  • "add a 60-second travel montage clip into a 1080p MP4"
  • "add background music that matches the mood of my video"
  • "adding background music to videos for social media for TikTok creators"

Automatic Setup

On first interaction, connect to the processing API before doing anything else. Show a brief status like "Setting things up...".

Token: If NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Generate a UUID as client identifier, then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id: <uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry).

Session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.

Video With Music — Add Music to Your Videos

This tool takes your video clips and runs AI music addition through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 60-second travel montage clip and want to add background music that matches the mood of my video — the backend processes it in about 30-60 seconds and hands you a 1080p MP4.

Tip: shorter clips sync music more accurately to the video's pacing.

Matching Input to Actions

User prompts referencing video with music, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says...ActionSkip SSE?
"export" / "导出" / "download" / "send me the video"→ §3.5 Export
"credits" / "积分" / "balance" / "余额"→ §3.3 Credits
"status" / "状态" / "show tracks"→ §3.4 State
"upload" / "上传" / user sends file→ §3.2 Upload
Everything else (generate, edit, add BGM…)→ §3.1 SSE

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.

Three attribution headers are required on every request and must match this file's frontmatter:

HeaderValue
X-Skill-Sourcevideo-with-music
X-Skill-Versionfrontmatter version
X-Skill-Platformauto-detect: clawhub / cursor / unknown from install path

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — free plan export blocked; not a credit issue, subscription tier
  • 429 — rate limited; wait 30s and retry once

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

  • "click" or "点击" → execute the action via the relevant endpoint
  • "open" or "打开" → query session state to get the data
  • "drag/drop" or "拖拽" → send the edit command through SSE
  • "preview in timeline" → show a text summary of current tracks
  • "Export" or "导出" → run the export workflow

SSE Event Handling

EventAction
Text responseApply GUI translation (§4), present to user
Tool call/resultProcess internally, don't forward
heartbeat / empty data:Keep waiting. Every 2 min: "⏳ Still working..."
Stream closesProcess final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "add background music that matches the mood of my video" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.

Export as MP4 for widest compatibility across all platforms.

Common Workflows

Quick edit: Upload → "add background music that matches the mood of my video" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Auth0 Vue

Use when adding authentication to Vue.js 3 applications (login, logout, user sessions, protected routes) - integrates @auth0/auth0-vue SDK for SPAs with Vite...

Registry SourceRecently Updated
1060auth0
General

Eyes

全球热点事件监控与影响分析。覆盖战争冲突、地缘摩擦、重大政策、疫情、自然灾害、创新技术等可能影响经济、市场和投资的事件,并按行业、汇率、大宗商品链路分析影响。也用于 Cron 定时推送热点摘要(早8点开盘前瞻/晚8点收盘复盘/整点扫描)。

Registry SourceRecently Updated
General

BytePlusCDN

Skill for BytePlus CDN domain and policy management, purge and prefetch, and log delivery. When you need to use BytePlus CDN acceleration services, you can r...

Registry SourceRecently Updated
General

Modelscope Api

魔塔社区(ModelScope)完整 API 封装。支持模型/数据集/技能中心/MCP广场/创空间(Studio)的查询、搜索、部署与管理。当用户需要查询魔塔模型、下载数据集、搜索MCP服务器、安装技能、部署创空间、管理Studio、或访问魔塔社区任何 API 时使用。触发词:「魔塔」「mota」「使用mota」...

Registry SourceRecently Updated