nvidia-kimi-vision

Analyze images using NVIDIA Kimi K2.5 vision model via NVIDIA NIM API. Perfect for adding vision to non-vision models like MiniMax M2.5, GLM-5, or any model without native image support. Supports png, jpg, jpeg, webp.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "nvidia-kimi-vision" with this command: npx skills add miladnoo/nvidia-kimi-vision

NVIDIA Kimi Vision

Fast image analysis using Kimi K2.5 multimodal model from NVIDIA NIM.

Why This Skill?

Fast - NVIDIA NIM inference
Quality - Kimi K2.5 is a solid vision model
Simple - Just pass an image and prompt
Free tier - Available through NVIDIA build.nvidia.com

API Setup (IMPORTANT)

When using this skill, if no API key is found, it will automatically guide the user through setup:

Step 1: Get a Free API Key

Go to https://build.nvidia.com
Sign up / Log in with GitHub or Google
Search for "Kimi K2.5"
Click on the model and get your free API key

Step 2: Save the Key

# Option A: Save to file (recommended)
mkdir -p ~/.config
echo 'your-api-key-here' > ~/.config/nvidia-kimi-api-key

# Option B: Pass directly when running
python3 scripts/analyze_image.py photo.jpg "What's this?" sk-your-key-here

First Time Setup (for agents)

When a user tries to use this skill without an API key, the script will output clear setup instructions. Guide them through:

Visiting https://build.nvidia.com
Getting their free API key
Saving it to ~/.config/nvidia-kimi-api-key

Usage

python3 scripts/analyze_image.py <image_path> "<prompt>" [api_key]

Examples

# What's in this image?
python3 scripts/analyze_image.py "/path/to/image.jpg" "Describe what's in this image"

# Extract text from screenshot
python3 scripts/analyze_image.py "/path/screenshot.png" "Extract all text"

# Analyze a meme
python3 scripts/analyze_image.py "/path/meme.jpg" "Explain this meme"

# With API key inline
python3 scripts/analyze_image.py photo.jpg "What's this?" sk-xxxxx

Image Formats

Supports: png, jpg, jpeg, webp

Rate Limits

The free tier through NVIDIA NIM has some limits, but they're not clearly documented on the site. Check https://build.nvidia.com for the latest info on your specific key's limits.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Open Registry Record Open in ClawHub

Related Skills

Related by shared tags or category signals.

General

Content Refresher

Use when updating outdated content, fixing traffic/ranking decay, refreshing stats, adding new sections, or improving freshness signals. 内容更新/排名恢复

Registry SourceRecently Updated

1.9K1aaron-he-zhu

General

AssemblyAI Transcriber

Transcribe audio files with speaker diarization (who speaks when). Supports 100+ languages, automatic language detection, and timestamps. Use for meetings, interviews, podcasts, or voice messages. Requires AssemblyAI API key.

Registry SourceRecently Updated

1.4K0xenofex7

General

mac-node-snapshot

A robust, permission-friendly method to capture macOS screens via OpenClaw screen.record. Ideal for headless environments or ensuring capture reliability.

Registry SourceRecently Updated

1.4K0taozhe6

General

Amazon Asin Lookup Api Skill

This skill helps users extract structured product details from Amazon using a specific ASIN (Amazon Standard Identification Number). Use this skill when the...

Registry SourceRecently Updated

1.3K1phheng