mediapipe-usage

Google MediaPipe Usage (Web / Pose Landmarker)

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy the command below and send it to your AI assistant to install this skill:

npx skills add liuchiawei/agent-skills/liuchiawei-agent-skills-mediapipe-usage


Quick Start

  • Install @mediapipe/tasks-vision and resolve the WASM assets from a CDN

  • Create a PoseLandmarker with createFromOptions

  • Use detect() for a single image, or detectForVideo() in a throttled requestAnimationFrame loop

Setup

Install (prefer pnpm):

pnpm add @mediapipe/tasks-vision

WASM root: resolve the vision task files from a CDN when creating the task:

```typescript
const vision = await FilesetResolver.forVisionTasks(
  "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm",
);
```

Create the Pose Landmarker Task

Use PoseLandmarker.createFromOptions(vision, options):

```typescript
import { PoseLandmarker, FilesetResolver } from "@mediapipe/tasks-vision";

const vision = await FilesetResolver.forVisionTasks(
  "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm",
);

const poseLandmarker = await PoseLandmarker.createFromOptions(vision, {
  baseOptions: {
    modelAssetPath: modelUrl, // see reference.md for lite/full/heavy URLs
    delegate: "GPU", // falls back to CPU if unavailable
  },
  runningMode: "VIDEO", // or "IMAGE" for single images
  numPoses: 1,
  minPoseDetectionConfidence: 0.5,
  minPosePresenceConfidence: 0.5,
  minTrackingConfidence: 0.5,
});
```

  • runningMode: "IMAGE" for a single image → use detect(image). "VIDEO" for a stream → use detectForVideo(video, timestamp).

  • baseOptions.modelAssetPath: URL to a .task model (lite / full / heavy). See reference.md for URLs.

  • delegate: "GPU" is preferred; some environments fall back to CPU.

Run the Task

Single image (runningMode IMAGE):

```typescript
const result = poseLandmarker.detect(imageElement);
```

Video / webcam (runningMode VIDEO):

Call detectForVideo(video, timestamp) inside a requestAnimationFrame loop. Throttle by time (e.g. ~33 ms between frames) to avoid excessive work:

```typescript
let lastFrameTime = 0;

function detectLoop() {
  const now = performance.now();
  if (video.readyState >= 2 && now - lastFrameTime > 33) {
    lastFrameTime = now;
    const result = poseLandmarker.detectForVideo(video, now);
    if (result.landmarks?.length) {
      const landmarks = result.landmarks[0]; // first person
      // use landmarks
    }
  }
  requestAnimationFrame(detectLoop);
}
requestAnimationFrame(detectLoop);
```

Result Shape

  • result.landmarks: array of poses; each pose is a NormalizedLandmark[] with 33 points. Each landmark has x and y (normalized to 0–1 relative to image width and height), z (depth, with the midpoint of the hips as origin), and visibility (0–1).

  • result.worldLandmarks: 3D coordinates in meters, same indices, hip midpoint as origin.

  • Single person: use result.landmarks[0].
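A minimal sketch of consuming this result shape. The interfaces and the mock `result` literal below are illustrative stand-ins for the library's types, not real detector output:

```typescript
// Structural types matching the result shape described above.
interface NormalizedLandmark {
  x: number; // 0–1, relative to image width
  y: number; // 0–1, relative to image height
  z: number; // depth, hip midpoint as origin
  visibility: number; // 0–1
}

interface PoseResult {
  landmarks: NormalizedLandmark[][];
  worldLandmarks?: NormalizedLandmark[][];
}

// Index 0 is the nose in the 33-point pose topology.
const NOSE = 0;

// Return the nose landmark of the first detected person, or null
// when no person was detected in the frame.
function nosePosition(result: PoseResult): NormalizedLandmark | null {
  if (!result.landmarks.length) return null;
  return result.landmarks[0][NOSE];
}

// Hypothetical mock result with a single pose for illustration:
const mock: PoseResult = {
  landmarks: [[{ x: 0.5, y: 0.3, z: -0.1, visibility: 0.98 }]],
};
```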

Practical Patterns (Know-how)

  • State machine: idle → loading (load model) → ready (can start) → active (webcam + detection) → error. When switching model variants, close() the old PoseLandmarker instance and create a new one.

  • Throttle: run detectForVideo only when performance.now() - lastFrameTime > 33 (≈30 fps) to avoid blocking the main thread.

  • Smoothing: apply a smoothing factor (e.g. 0.3) to derived values (pitch, bank) to reduce jitter; use a dead zone (in degrees) to ignore small movements.

  • Confidence: use each landmark's visibility; ignore or downweight points below a threshold. Helper: getLandmark(landmarks, index, minConfidence) returns the point only if visibility >= minConfidence.

  • Gestures: e.g. "hands forward" = compare shoulder vs. wrist z; "hands overhead" = compare wrist y to shoulder y. Use consecutive-frame counters for toggles (e.g. require N frames in a pose before firing an action).
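The smoothing, confidence, and gesture patterns above can be sketched as small pure helpers. Names and thresholds (SMOOTHING, DEAD_ZONE_DEG, GestureCounter) are our own illustrative choices, not part of the MediaPipe API:

```typescript
interface Landmark { x: number; y: number; z: number; visibility: number }

// Exponential smoothing: blend the new value into the previous one.
const SMOOTHING = 0.3;
function smooth(prev: number, next: number): number {
  return prev + SMOOTHING * (next - prev);
}

// Dead zone: treat small angles (in degrees) as zero to suppress jitter.
const DEAD_ZONE_DEG = 3;
function applyDeadZone(deg: number): number {
  return Math.abs(deg) < DEAD_ZONE_DEG ? 0 : deg;
}

// Return a landmark only when it is confidently visible.
function getLandmark(
  landmarks: Landmark[],
  index: number,
  minConfidence = 0.5,
): Landmark | null {
  const lm = landmarks[index];
  return lm && lm.visibility >= minConfidence ? lm : null;
}

// Require N consecutive frames in a pose before firing, so a single
// noisy frame cannot toggle an action. Fires exactly once per streak.
class GestureCounter {
  private count = 0;
  constructor(private requiredFrames: number) {}
  update(inPose: boolean): boolean {
    this.count = inPose ? this.count + 1 : 0;
    return this.count === this.requiredFrames;
  }
}
```

Per-frame, you would smooth each derived angle, pass it through the dead zone, and feed boolean pose checks into a GestureCounter per gesture.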

Cleanup

  • Stop the webcam: stream.getTracks().forEach(t => t.stop()).

  • Release the task: poseLandmarker.close() when done or before creating a new instance.
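A teardown helper combining both steps might look like the sketch below. The minimal structural types (TrackLike, StreamLike, Closable) are hypothetical stand-ins for MediaStream and PoseLandmarker so the function stays self-contained:

```typescript
interface TrackLike { stop(): void }
interface StreamLike { getTracks(): TrackLike[] }
interface Closable { close(): void }

// Stop every webcam track (releases the camera) and close the task
// (frees WASM/GPU resources). Safe to call with nulls.
function teardown(stream: StreamLike | null, landmarker: Closable | null): void {
  stream?.getTracks().forEach((t) => t.stop());
  landmarker?.close();
}
```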

Additional Resources

  • For the full 33 landmark indices and skeleton connections, see reference.md

  • For a minimal example, skeleton overlay, and landmark-to-control mapping, see examples.md

  • Official docs: Pose Landmarker Web JS, Setup guide for web

