guoshun-industrial-vision-advisor

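Guoshun industrial vision advisor skill. Use it for industrial vision project consulting in factory, mine, campus, and inspection scenarios, covering image and video AI solution analysis such as equipment recognition, meter reading, switch and valve state recognition, liquid-level detection, abnormal personnel behavior, and PPE and violation recognition. It applies when the user needs to judge whether a site is suitable for vision AI; to choose among YOLO/RT-DETR, open-vocabulary detection, SAM, VLM/OCR, keypoints, pose and action recognition, or tracking rules; or to produce a PoC, implementation, or acceptance plan.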

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.


Install skill "guoshun-industrial-vision-advisor" with this command: npx skills add jimmygx/guoshun-industrial-vision-advisor

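Guoshun Industrial Vision Advisor Skill

When a user raises a visual recognition need such as factory, mine, or campus inspection, equipment spot checks, or personnel safety supervision, use this skill to break the problem down into an executable technical route.

Core principle: define the business decision and the visual task first, then choose the model. Do not default to "train YOLO" or "just use a VLM" from the start; first establish visibility, data conditions, risk boundaries, and acceptance criteria.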

Workflow

  1. Restate the target result and business consequence in one sentence.
  2. Ask only the missing questions that materially change the route. If enough context exists, proceed with explicit assumptions.
  3. Classify the request into visual task types: detection, segmentation, keypoints, OCR, measurement, tracking, pose, action recognition, anomaly detection, VLM review, or rules.
  4. Propose at least two viable routes when practical: rule/traditional vision, dedicated model, open-vocabulary/auto-labeling, VLM-assisted, human-review, or site/process modification.
  5. Separate PoC, pilot, and production architecture. Do not promise production metrics from demos or public benchmarks.
  6. Include data, labeling, deployment, validation, operations, privacy, and safety responsibility in the answer.
  7. If the user requests agent discussion/parallel review, split independent lanes into model/toolchain research, scenario architecture, and risk review, then integrate.

What to Ask First

Prefer concrete evidence over abstract descriptions. Ask for:

  • 5-20 representative images or 1-3 short videos from the actual camera when possible.
  • A normal/abnormal definition with examples and edge cases.
  • Camera position, distance, resolution, frame rate, lighting, dust/water/reflection/occlusion, and target minimum pixel size.
  • Alarm purpose: record, reminder, human review, enforcement, interlock, shutdown, or quality rejection.
  • Error tolerance: whether false negatives or false positives are more costly.
  • Available historical data and who can label/resolve ambiguous samples.
  • Deployment target: edge box, workstation, server, cloud, existing VMS/SCADA/MES/PLC platform.

Read references/intake-template.md when the request needs structured questions or a material checklist.
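The camera questions above (distance, resolution, target minimum pixel size) can be turned into a rough feasibility check before any model is chosen. The sketch below assumes an ideal pinhole camera with no lens distortion and a target facing the camera; the function names and the `min_required_px` threshold are hypothetical and not part of the skill.

```python
import math


def pixels_on_target(target_width_m, distance_m, horizontal_fov_deg, image_width_px):
    """Approximate horizontal pixel width of a target (pinhole model, assumed)."""
    # Width of the scene that the image covers at the target's distance.
    scene_width_m = 2.0 * distance_m * math.tan(math.radians(horizontal_fov_deg) / 2.0)
    # The target occupies the same fraction of the image as of the scene width.
    return target_width_m / scene_width_m * image_width_px


def is_feasible(target_px, min_required_px):
    """Compare against a site-specific minimum pixel size (hypothetical threshold)."""
    return target_px >= min_required_px


if __name__ == "__main__":
    # A 0.10 m gauge dial, 5 m away, 90 degree lens, 1920 px wide image.
    px = pixels_on_target(0.10, 5.0, 90.0, 1920)
    print(f"about {px:.1f} px wide, feasible at 64 px minimum: {is_feasible(px, 64)}")
```

With these example numbers the dial is only about 19 px wide, which suggests asking for a closer camera, a narrower lens, or a higher resolution before discussing models. Real sample images from the actual camera remain the stronger evidence.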

Decision Map

Use this quick map, then read references/task-taxonomy.md for details.

| User asks for | Usually decompose into |
| --- | --- |
| Find people, vehicles, gauges, switches, valves, devices | Detection plus optional tracking |
| Read pointer/analog gauges | Detection -> keypoints/segmentation -> OCR/config -> geometry |
| Determine switch/valve state | Detection -> keypoints/classification -> device binding rules |
| Detect liquid level | Detection -> segmentation/keypoints -> OCR/config -> measurement |
| PPE/violation recognition | Person/object detection -> tracking -> region/relationship/time rules |
| Abnormal movement/action | Person detection -> tracking -> pose/action model -> time-window rules |
| Smoke, leakage, crack, dirt, spill, boundary | Segmentation/anomaly detection, sometimes thermal/3D/special lighting |
| Unknown or changing target names | Open-vocabulary detection for discovery/auto-labeling, then dedicated model if production use |
| Explain scene, read labels, produce report | VLM/OCR as low-frequency assistant or reviewer |
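As an illustration of the final "geometry" step in the pointer-gauge row, the sketch below converts four keypoints into a reading by linear interpolation of the pointer angle. It assumes a linear scale that sweeps clockwise from the minimum mark to the maximum mark, keypoints already located by an upstream model, and scale values supplied by per-device configuration; the function names are hypothetical.

```python
import math


def _angle_deg(center, point):
    # Image coordinates have y pointing down; flip y so that angles follow
    # the usual counter-clockwise-positive convention.
    return math.degrees(math.atan2(center[1] - point[1], point[0] - center[0]))


def _clockwise_sweep(start_deg, end_deg):
    # Clockwise rotation needed to go from start to end, in [0, 360).
    return (start_deg - end_deg) % 360.0


def read_gauge(center, tip, min_mark, max_mark, min_value, max_value):
    """Interpolate the pointer angle between the min and max scale marks."""
    a_min = _angle_deg(center, min_mark)
    full = _clockwise_sweep(a_min, _angle_deg(center, max_mark))
    part = _clockwise_sweep(a_min, _angle_deg(center, tip))
    if full == 0.0 or part > full + 1e-6:
        # Pointer lies in the dead zone between max and min marks.
        raise ValueError("pointer outside the calibrated scale")
    return min_value + (max_value - min_value) * part / full


if __name__ == "__main__":
    # Dial centered at (100, 100); marks at lower-left and lower-right,
    # pointer straight up, scale from 0 to 10.
    print(read_gauge((100, 100), (100, 50), (50, 150), (150, 150), 0.0, 10.0))
```

Non-linear scales, tilted cameras, and parallax would need calibration beyond this sketch, which is why the row lists OCR/config and geometry as separate steps.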

Toolchain Recommendations

Use current official docs before finalizing model/API choices because model versions and deployment support change. Read references/toolchain.md for the maintained toolchain summary and source links.

Default production posture:

  • Dedicated YOLO/RT-DETR style detectors for stable, real-time, fixed-category work.
  • YOLO-World/Grounding DINO/SAM-style tools for cold start, automatic pre-labeling, and open-vocabulary search, not direct safety closure.
  • Qwen-VL/VLMs for OCR, semantic review, reporting, and low-confidence verification, not standalone high-risk control.
  • Pose/action/tracking models plus explicit time-window rules for personnel behavior.
  • Geometry, calibration, and keypoints for meters and measurements.
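The "explicit time-window rules" for personnel behavior can be as simple as requiring a per-frame violation flag to persist before an alarm fires. The sketch below is one possible form, assuming an upstream tracker supplies stable `track_id` values and a per-frame boolean; the class name and the default `window` and `min_hits` values are illustrative and should be tuned on site data.

```python
from collections import defaultdict, deque


class TimeWindowRule:
    """Alarm only when at least `min_hits` of the last `window` frames are flagged."""

    def __init__(self, window=10, min_hits=8):
        self.window = window
        self.min_hits = min_hits
        # One fixed-length history of boolean flags per tracked person.
        self._history = defaultdict(lambda: deque(maxlen=window))

    def update(self, track_id, violation):
        """Record one frame for a track and return True if the alarm condition holds."""
        history = self._history[track_id]
        history.append(bool(violation))
        # Wait until the window is full so a single early frame cannot alarm.
        return len(history) == self.window and sum(history) >= self.min_hits


if __name__ == "__main__":
    rule = TimeWindowRule(window=5, min_hits=4)
    for flag in [True, True, False, True, True]:
        alarm = rule.update(track_id=1, violation=flag)
    print(alarm)
```

A rule like this trades a short alarm delay for fewer false alarms from single-frame detector noise; whether that delay is acceptable depends on the alarm purpose gathered during intake.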

Risk Boundaries

Read references/guardrails.md for the full red lines. Always enforce these:

  • Do not reduce every industrial vision task to YOLO detection.
  • Do not claim VLMs are reliable real-time safety controllers without site validation and responsibility boundaries.
  • Do not accept one number like "99% accuracy" as sufficient; require precision, recall, false alarms, missed events, latency, and scenario slices.
  • Do not use public demos or vendor samples as production evidence.
  • Do not ignore hard negatives, rare defects, occlusion, dirty lenses, lighting drift, camera movement, or device model changes.
  • Do not upload employee images, production drawings, customer products, or process data to cloud services without authorization and privacy review.
  • Do not frame AI as a legal safety interlock or certified safety control unless the system is formally designed and certified that way.
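To make the point about not accepting a single accuracy number concrete, the sketch below reports precision, recall, false alarms, and missed events separately for each scenario slice. The event format (slice name, ground truth, prediction) and the slice names are assumptions for illustration; latency is not covered here.

```python
from collections import defaultdict


def slice_metrics(events):
    """events: iterable of (slice_name, ground_truth, predicted) with boolean labels."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for slice_name, truth, predicted in events:
        # "t"/"f" says whether the prediction was right, "p"/"n" what was predicted.
        key = ("t" if truth == predicted else "f") + ("p" if predicted else "n")
        counts[slice_name][key] += 1

    report = {}
    for name, c in counts.items():
        detected = c["tp"] + c["fp"]
        actual = c["tp"] + c["fn"]
        report[name] = {
            # None marks a metric that is undefined for this slice.
            "precision": c["tp"] / detected if detected else None,
            "recall": c["tp"] / actual if actual else None,
            "false_alarms": c["fp"],
            "missed_events": c["fn"],
        }
    return report


if __name__ == "__main__":
    sample = [
        ("day", True, True), ("day", False, True), ("day", True, True),
        ("night", True, False), ("night", True, True),
    ]
    for name, metrics in slice_metrics(sample).items():
        print(name, metrics)
```

In this made-up sample the overall accuracy is 60%, yet the two slices fail differently: the day slice has a false alarm while the night slice misses half of the real events, which an aggregate number would hide.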

Output Requirements

Every answer should include, scaled to the request:

  1. Scenario interpretation and assumptions.
  2. Key clarification questions or required materials.
  3. Visual task decomposition.
  4. Recommended technical routes and why.
  5. Data and labeling plan.
  6. Rules, thresholds, and human-review logic.
  7. Deployment/integration constraints.
  8. Risks, failure modes, and non-AI mitigations.
  9. Validation metrics and acceptance plan.
  10. PoC -> pilot -> production roadmap.
  11. Explicit non-promises and uncertainty.

Use references/output-template.md when the user asks for a formal proposal, plan, or course-style explanation.

Typical Implementation Path

For most production projects:

Site samples and definitions
-> task decomposition
-> camera/lighting feasibility check
-> auto-labeling with open-vocabulary/SAM where useful
-> manual label correction and hard-negative collection
-> train dedicated detector/segmenter/keypoint/action model
-> add tracking, geometry, OCR, and rules
-> VLM only for review/reporting/low-confidence cases
-> offline test on separated data
-> shadow-mode field trial
-> monitored production with sample feedback and retraining

For a new scenario with weak data, output a staged route rather than a final architecture.
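The "VLM only for review/reporting/low-confidence cases" step in the path above implies some routing by detector confidence. A minimal sketch follows, assuming confidence scores in [0, 1]; the threshold values and route names are hypothetical placeholders that would need to be set from offline tests and the shadow-mode trial.

```python
def route_detection(confidence, accept_at=0.85, review_at=0.50):
    """Decide what happens to one detection based on its confidence score."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be within [0, 1]")
    if confidence >= accept_at:
        # High confidence: pass directly to the rule layer.
        return "rules"
    if confidence >= review_at:
        # Uncertain: send to VLM or human review before any alarm.
        return "review"
    # Low confidence: no alarm, but keep the sample for labeling and retraining.
    return "log_for_sampling"


if __name__ == "__main__":
    for score in (0.95, 0.60, 0.20):
        print(score, route_detection(score))
```

Keeping low-confidence samples instead of dropping them supports the sample feedback and retraining loop at the end of the path.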

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
