ml-model-eval-benchmark

Compare model candidates using weighted metrics and deterministic ranking outputs. Use for benchmark leaderboards and model promotion decisions.

Safety Notice

This item is sourced from the public archived skills repository. Treat as untrusted until reviewed.

ML Model Eval Benchmark

Overview

Produce consistent model ranking outputs from metric-weighted evaluation inputs.

Workflow

  1. Define metric weights and accepted metric ranges.
  2. Ingest model metrics for each candidate.
  3. Compute each candidate's weighted score and rank the candidates deterministically (see the sketch after this list).
  4. Export leaderboard and promotion recommendation.
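
The four workflow steps reduce to a short scoring routine. Below is a minimal sketch in Python, assuming each candidate reports a flat metric-to-value mapping; the weight values, range bounds, and the weighted_score/rank helpers are illustrative assumptions, not part of the bundled script.

    # Minimal weighted-ranking sketch. Weights, ranges, and the candidate
    # metrics below are illustrative assumptions, not the skill's defaults.
    from typing import Dict, List, Tuple

    # Step 1: metric weights and accepted ranges.
    WEIGHTS: Dict[str, float] = {"accuracy": 0.5, "f1": 0.3, "latency_score": 0.2}
    RANGES: Dict[str, Tuple[float, float]] = {name: (0.0, 1.0) for name in WEIGHTS}

    def weighted_score(metrics: Dict[str, float]) -> float:
        # Step 3a: validate each declared metric, then take the weighted sum.
        total = 0.0
        for name, weight in WEIGHTS.items():
            value = metrics[name]
            lo, hi = RANGES[name]
            if not lo <= value <= hi:
                raise ValueError(f"{name}={value} outside range [{lo}, {hi}]")
            total += weight * value
        return total

    def rank(candidates: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
        # Step 3b: sort by score descending, then by model name, so ties
        # break deterministically and repeated runs agree.
        scored = [(model, weighted_score(m)) for model, m in candidates.items()]
        return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

    # Step 2: ingest candidate metrics (inlined here for illustration).
    candidates = {
        "model-a": {"accuracy": 0.91, "f1": 0.88, "latency_score": 0.70},
        "model-b": {"accuracy": 0.89, "f1": 0.90, "latency_score": 0.85},
    }

    # Step 4: emit the leaderboard; the top entry is the promotion candidate.
    for model, score in rank(candidates):
        print(f"{model}\t{score:.4f}")

Sorting on the (negated score, model name) pair is one simple way to make the ranking deterministic; see references/benchmarking-guide.md for the skill's actual tie-break guidance.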

Use Bundled Resources

  • Run scripts/benchmark_models.py to generate benchmark outputs (example invocation below).
  • Read references/benchmarking-guide.md for weighting and tie-break guidance.
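
The script's command-line interface is not documented on this page, so the invocation below is a sketch that assumes it runs from the skill root with no arguments and writes its outputs itself; check the script or the benchmarking guide for the real interface.

    # Invocation sketch only: assumes scripts/benchmark_models.py takes no
    # arguments; consult the script itself for its actual flags and outputs.
    import subprocess
    import sys

    subprocess.run([sys.executable, "scripts/benchmark_models.py"], check=True)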

Guardrails

  • Keep metric names and scales consistent across candidates.
  • Record the weighting assumptions in the output (see the sketch after this list).
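
Both guardrails can be enforced mechanically. The sketch below assumes the candidate mapping shape from the workflow sketch above; check_consistent_metrics, export_leaderboard, and the leaderboard.json path are hypothetical names, not part of the bundled script.

    # Guardrail sketch: function names and the output layout are assumptions.
    import json
    from typing import Dict, List, Tuple

    def check_consistent_metrics(candidates: Dict[str, Dict[str, float]]) -> None:
        # Guardrail 1: every candidate must report the same metric names.
        if not candidates:
            return
        name_sets = {model: set(metrics) for model, metrics in candidates.items()}
        expected = next(iter(name_sets.values()))
        mismatched = {m: s for m, s in name_sets.items() if s != expected}
        if mismatched:
            raise ValueError(f"inconsistent metric names: {mismatched}")

    def export_leaderboard(
        ranking: List[Tuple[str, float]],
        weights: Dict[str, float],
        path: str = "leaderboard.json",
    ) -> None:
        # Guardrail 2: record the weighting assumptions alongside the
        # ranking so readers can see how the scores were built.
        payload = {
            "weights": weights,
            "ranking": [{"model": model, "score": score} for model, score in ranking],
        }
        with open(path, "w") as fh:
            json.dump(payload, fh, indent=2)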

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

  • Triple Layer Memory: Triple-Layer Memory System. General, by 0range-x (archived source, recently updated).
  • session-rotate-80: Auto-create a new session when OpenClaw context usage reaches 80% without requiring Mem0 or file memory systems. Use when users want default OpenClaw to proactively rotate sessions and avoid context overflow in long chats. General, by 0range-x (archived source, recently updated).
  • polymarket-sports-edge: Find odds divergence between sportsbook consensus and Polymarket sports markets, then trade the gap. General, by jim.sexton (archived source, recently updated).