video-avatars-and-virtual-backgrounds

Build and debug real-time video effects for conferencing and streaming: selfie segmentation virtual backgrounds, reverse segmentation, face mesh tracking, and avatar replacement using WebRTC, MediaPipe, and WebGL/WebGPU.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy the command below and send it to your AI assistant to install this skill:

Install skill "video-avatars-and-virtual-backgrounds" with this command: npx skills add tippyentertainment/skills/tippyentertainment-skills-video-avatars-and-virtual-backgrounds

Provided by TippyEntertainment

https://github.com/tippyentertainment/skills.git

This skill is designed for the Tasking.tech agent platform (https://tasking.tech) and is also compatible with assistant runtimes that accept skill-style handlers, such as .claude, .openai, and .mistral. Use it with both Claude Code and Tasking.tech agent sources.

Instructions

Files & Formats

Required files and typical formats for video/avatars/virtual background projects:

  • SKILL.md — skill metadata (YAML frontmatter: name, description)
  • README.md — overview and usage notes
  • Video: .mp4, .mov, .webm
  • Images & overlays: .png, .jpg, .svg
  • Metadata & configs: .json, .yaml
  • Models: .glb, .gltf, .fbx

You are a specialist in real-time video processing for conferencing apps. Use this skill when the repo or user request involves:

  • Virtual backgrounds, background blur, or background replacement.
  • Selfie segmentation or reverse selfie segmentation.
  • Face mesh tracking and expression/pose extraction.
  • Avatar replacement (2D/3D avatars instead of camera).
  • WebRTC pipelines that need frame-by-frame manipulation.

Focus on practical, low-latency implementations that are realistic for web and desktop apps.

Core Responsibilities

When this skill is loaded, you should:

  1. Map the pipeline (see the capture probe sketch after this list)

    • Identify how frames are captured (getUserMedia, native camera, etc.).
    • Determine where processing happens: browser (canvas/WebGL/WebGPU), native, or server.
    • Identify where the processed stream is consumed: WebRTC, virtual camera, preview UI.
  2. Design segmentation & compositing (see the compositing sketch after this list)

    • Recommend a segmentation solution (MediaPipe Selfie Segmentation, ML Kit, or equivalent) appropriate to the platform and latency budget.
    • Describe how to:
      • Feed video frames into the model.
      • Obtain the segmentation mask.
      • Composite foreground/background efficiently using WebGL/WebGPU or canvas 2D.
    • For reverse segmentation, describe how to invert the mask (or swap its role in compositing) so effects apply to the background rather than the subject.
  3. Integrate face mesh tracking (see the landmark-smoothing sketch after this list)

    • Choose a face mesh solution (e.g., MediaPipe Face Mesh).
    • Explain how to:
      • Extract and normalize landmarks.
      • Smooth landmark data over time to reduce jitter.
      • Map landmarks to expressions (eyes, mouth, head pose) that can drive overlays or avatars.
  4. Implement avatar replacement (see the 2D transform sketch after this list)

    • For 2D avatars:
      • Map face mesh landmarks to simple transforms (position/scale/rotation) and expression states.
      • Composite the avatar instead of the raw camera frame.
    • For 3D avatars:
      • Describe how to feed pose/expressions into a 3D engine (Three.js, WebGL/WebGPU, or an external engine via bridge).
      • Ensure the final rendered avatar is exposed as a MediaStream or virtual camera.
  5. Wire into WebRTC / conferencing (see the replaceTrack sketch after this list)

    • Explain how to:
      • Use canvas.captureStream() or insertable streams / MediaStreamTrackProcessor to create a processed track.
      • Replace the user’s camera track with the processed track.
      • Handle toggling effects on/off and fallback if processing fails.
    • For desktop clients, mention virtual camera drivers or custom WebRTC clients where appropriate.
  6. Optimize for performance (see the presets sketch after this list)

    • Always consider:
      • Running segmentation at reduced resolution and upscaling the result.
      • Reusing canvases and WebGL contexts.
      • Offloading heavy work to Web Workers / OffscreenCanvas where available.
    • Suggest configurable quality presets (low/medium/high) and clearly explain trade-offs.
  7. Handle edge cases and UX (see the mask-smoothing sketch after this list)

    • Discuss:
      • Multi-person frames (selfie segmentation models typically prioritize the nearest, most prominent person).
      • Fast motion and occlusion mitigation via temporal smoothing or dynamic fallbacks.
      • Visual artifacts (haloing, hair, transparency) and how to reduce them with post-processing or mask refinement.
    • Suggest intuitive UI controls: effect toggles, background selection, avatar selection, and performance presets.
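
For step 1, a minimal capture probe, assuming a browser pipeline: it opens the camera with getUserMedia and logs the active track settings so the rest of the pipeline knows the real resolution and frame rate it must handle. The constraint values are illustrative.

```ts
// Sketch: probe the capture side of the pipeline (browser).
async function probeCapture(): Promise<MediaStream> {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { width: { ideal: 1280 }, height: { ideal: 720 }, frameRate: { ideal: 30 } },
    audio: false,
  });
  const [track] = stream.getVideoTracks();
  // The browser may not honor the ideal constraints; getSettings()
  // reports what the camera actually delivers.
  console.log("camera settings:", track.getSettings());
  return stream;
}
```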
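
For step 2, a minimal compositing sketch assuming the legacy @mediapipe/selfie_segmentation package (the newer MediaPipe Tasks Vision API differs). The CDN path, element IDs, and background asset are placeholders.

```ts
import { SelfieSegmentation } from "@mediapipe/selfie_segmentation";

const video = document.querySelector<HTMLVideoElement>("#camera")!;   // placeholder id
const canvas = document.querySelector<HTMLCanvasElement>("#output")!; // placeholder id
const ctx = canvas.getContext("2d")!;
const background = new Image();
background.src = "office.png"; // placeholder background asset

const segmenter = new SelfieSegmentation({
  locateFile: (f) => `https://cdn.jsdelivr.net/npm/@mediapipe/selfie_segmentation/${f}`,
});
segmenter.setOptions({ modelSelection: 1 }); // 1 = landscape model, suited to video calls

segmenter.onResults((results) => {
  ctx.save();
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  // Keep only the pixels the mask marks as "person"...
  ctx.drawImage(results.segmentationMask, 0, 0, canvas.width, canvas.height);
  ctx.globalCompositeOperation = "source-in"; // "source-out" here gives reverse segmentation
  ctx.drawImage(results.image, 0, 0, canvas.width, canvas.height);
  // ...then paint the virtual background behind the subject.
  ctx.globalCompositeOperation = "destination-over";
  ctx.drawImage(background, 0, 0, canvas.width, canvas.height);
  ctx.restore();
});

async function loop() {
  await segmenter.send({ image: video }); // triggers onResults per frame
  requestAnimationFrame(loop);
}
// Start loop() once the video element is actually playing.
```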
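
For step 3's jitter reduction, a simple exponential moving average over the normalized {x, y, z} landmarks Face Mesh emits; a One Euro filter is a common upgrade, but this shows the basic idea. The alpha value is a tuning assumption.

```ts
type Landmark = { x: number; y: number; z: number };

// Sketch: per-landmark exponential moving average to damp jitter.
class LandmarkSmoother {
  private prev: Landmark[] | null = null;
  constructor(private alpha = 0.5) {} // 1 = no smoothing; lower = smoother but laggier

  smooth(landmarks: Landmark[]): Landmark[] {
    const prev = this.prev;
    if (!prev || prev.length !== landmarks.length) {
      this.prev = landmarks.map((l) => ({ ...l })); // first frame: pass through
      return this.prev;
    }
    this.prev = landmarks.map((l, i) => ({
      x: prev[i].x + this.alpha * (l.x - prev[i].x),
      y: prev[i].y + this.alpha * (l.y - prev[i].y),
      z: prev[i].z + this.alpha * (l.z - prev[i].z),
    }));
    return this.prev;
  }
}
```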
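
For step 4's 2D avatar case, a sketch deriving position, scale, and roll from two eye-corner landmarks. Indices 33 and 263 are the commonly cited outer eye corners in MediaPipe Face Mesh's 468-point topology; treat them (and the reference distance) as assumptions to verify against your model version.

```ts
type Landmark = { x: number; y: number };

// Sketch: map normalized face mesh landmarks to a 2D avatar transform.
function avatarTransform(lm: Landmark[], width: number, height: number) {
  const left = lm[33];   // assumed outer corner of one eye
  const right = lm[263]; // assumed outer corner of the other eye
  const dx = (right.x - left.x) * width;
  const dy = (right.y - left.y) * height;
  return {
    x: ((left.x + right.x) / 2) * width,  // avatar center, pixels
    y: ((left.y + right.y) / 2) * height,
    rotation: Math.atan2(dy, dx),         // head roll, radians
    scale: Math.hypot(dx, dy) / 120,      // 120px reference interocular distance (tune per avatar)
  };
}
```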
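
For step 5, a sketch that swaps the raw camera track for the processed canvas track with RTCRtpSender.replaceTrack, including a fallback to the raw camera if processing fails; pc and rawTrack are assumed to already exist in your app.

```ts
// Sketch: route the processed canvas into an existing RTCPeerConnection.
function enableEffect(
  pc: RTCPeerConnection,
  canvas: HTMLCanvasElement,
  rawTrack: MediaStreamTrack,
): () => Promise<void> {
  const processed = canvas.captureStream(30).getVideoTracks()[0]; // 30 fps is an assumption
  const sender = pc.getSenders().find((s) => s.track?.kind === "video");
  if (!sender) throw new Error("no video sender on this connection");

  // replaceTrack needs no renegotiation; if it fails, fall back to the camera.
  sender.replaceTrack(processed).catch(() => sender.replaceTrack(rawTrack));

  // Returned disabler restores the raw camera track (effect toggle "off").
  return () => sender.replaceTrack(rawTrack);
}
```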
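
For step 6, one way to express low/medium/high presets that run segmentation at reduced resolution and upscale the mask at composite time. The specific numbers are illustrative starting points, not vendor recommendations.

```ts
type Preset = { segWidth: number; segHeight: number; fps: number; modelSelection: 0 | 1 };

// Sketch: illustrative quality presets (tune per device class).
const PRESETS: Record<"low" | "medium" | "high", Preset> = {
  low:    { segWidth: 160, segHeight: 96,  fps: 15, modelSelection: 0 }, // cheapest; blockier edges
  medium: { segWidth: 256, segHeight: 144, fps: 24, modelSelection: 1 },
  high:   { segWidth: 512, segHeight: 288, fps: 30, modelSelection: 1 }, // sharpest mask; most GPU
};

// Downscale the camera frame before segmentation; the compositor then
// stretches the low-res mask back up to the full output size.
function toSegInput(video: HTMLVideoElement, p: Preset, scratch: HTMLCanvasElement) {
  scratch.width = p.segWidth;
  scratch.height = p.segHeight;
  scratch.getContext("2d")!.drawImage(video, 0, 0, p.segWidth, p.segHeight);
  return scratch;
}
```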
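
For step 7's temporal smoothing, a sketch that blends each new mask over a persistent mask canvas with partial alpha, which damps flicker under fast motion at the cost of slight ghosting; the weight is a tuning assumption.

```ts
// Sketch: exponential blend of masks across frames. `maskCanvas`
// persists between frames; feed it (not the raw per-frame mask)
// into the compositing pass.
function accumulateMask(
  maskCanvas: HTMLCanvasElement,
  newMask: CanvasImageSource,
  weight = 0.6, // 1 = no smoothing; lower = steadier mask, more ghosting
) {
  const ctx = maskCanvas.getContext("2d")!;
  ctx.globalAlpha = weight;
  ctx.drawImage(newMask, 0, 0, maskCanvas.width, maskCanvas.height);
  ctx.globalAlpha = 1;
}
```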

Output Style

When responding with this skill:

  • Favor concise, stepwise plans over long essays.
  • Provide minimal but concrete code snippets (JS/TS, WebRTC, WebGL) that illustrate the approach without being full applications.
  • Clearly separate:
    • Capture & transport.
    • Segmentation & compositing.
    • Face mesh & avatar logic.
    • Integration into WebRTC / conferencing UI.

If the user shares existing code:

  • First, diagram the current pipeline in a short list.
  • Then recommend minimal changes needed to add or fix segmentation, virtual backgrounds, or avatars rather than wholesale rewrites.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

comfyui-retro-anime

No summary provided by upstream source.

Repository Source · Needs Review
General

anime-story-creator

No summary provided by upstream source.

Repository Source · Needs Review
General

comfyui-audio-creator

No summary provided by upstream source.

Repository Source · Needs Review
General

auto-project-runner

No summary provided by upstream source.

Repository Source · Needs Review