dataset-curator

This skill ensures that the data you feed to your AI is clean, accurate, and safe.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "dataset-curator" with this command: npx skills add famaoai-creator/gemini-skills/famaoai-creator-gemini-skills-dataset-curator

Capabilities

  1. Data Cleaning & Structuring
  • Removes duplicates, boilerplate, and noisy text from knowledge bases.

  • Converts unstructured documents into clean Markdown or JSON/Vector-friendly formats.

  2. Privacy Audit
  • Scans datasets for PII (Personally Identifiable Information) before they are sent to LLMs or vector databases.
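The two capabilities above can be illustrated with a minimal Python sketch. This is not the skill's actual implementation, which the listing does not show. The names `PII_PATTERNS`, `normalize`, `dedupe`, and `find_pii` are hypothetical, and the two regular expressions are deliberately simple assumptions that would miss many real-world PII formats.

```python
import hashlib
import re

# Hypothetical, deliberately simple patterns (assumption): real PII
# detection needs much broader coverage than these two examples.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so near-identical passages compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def dedupe(passages: list[str]) -> list[str]:
    """Keep the first occurrence of each passage, compared after normalization."""
    seen: set[str] = set()
    unique: list[str] = []
    for passage in passages:
        key = hashlib.sha256(normalize(passage).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(passage)
    return unique


def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for every pattern hit in the text."""
    hits: list[tuple[str, str]] = []
    for label, pattern in PII_PATTERNS.items():
        hits.extend((label, match.group(0)) for match in pattern.finditer(text))
    return hits
```

A caller might run `dedupe` over the passages of a knowledge base first, then pass each surviving passage to `find_pii` and hold back anything that returns a non-empty list for manual review.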

Usage

  • "Clean up the knowledge/ directory and structure it for better RAG performance."

  • "Audit this customer feedback dataset for sensitive info before we use it for AI training."
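As a rough idea of what the first request might produce, the sketch below walks a directory and turns each text file into JSON Lines records that a retrieval pipeline could index. It is an illustration under assumptions, not the skill's own behavior: the function names `build_records` and `to_jsonl`, the record fields, the file suffixes, and the blank-line chunking rule are all hypothetical.

```python
import json
from pathlib import Path


def build_records(root: Path, suffixes=(".md", ".txt")) -> list[dict]:
    """Read matching files under root and emit one record per paragraph."""
    records: list[dict] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        # Split on blank lines: a simple stand-in (assumption) for a real
        # chunking strategy tuned to the retrieval system in use.
        chunks = [chunk.strip() for chunk in text.split("\n\n") if chunk.strip()]
        for index, chunk in enumerate(chunks):
            records.append(
                {
                    "source": path.relative_to(root).as_posix(),
                    "chunk": index,
                    "text": chunk,
                }
            )
    return records


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as JSON Lines, one JSON object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)
```

Keeping the source path and chunk index in every record lets a later audit trace any retrieved passage back to the file it came from.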

Knowledge Protocol

  • This skill adheres to knowledge/orchestration/knowledge-protocol.md. It automatically integrates the Public, Confidential (Company/Client), and Personal knowledge tiers, prioritizing the most specific tier and ensuring that confidential or personal content does not leak into public outputs.
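The contents of knowledge-protocol.md are not shown here, so the following is only one plausible reading of the tier rule, sketched in Python. It assumes that Personal is more specific than Confidential, which is more specific than Public, and that public outputs may draw only on the Public tier. The names `Tier`, `resolve`, and `public_view` are hypothetical.

```python
from enum import IntEnum
from typing import Dict, Optional, Tuple


class Tier(IntEnum):
    """Knowledge tiers ordered from least to most specific (assumed ordering)."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    PERSONAL = 2


Knowledge = Dict[Tier, Dict[str, str]]


def resolve(key: str, knowledge: Knowledge) -> Optional[Tuple[Tier, str]]:
    """Return the value from the most specific tier that defines the key."""
    for tier in sorted(Tier, reverse=True):
        value = knowledge.get(tier, {}).get(key)
        if value is not None:
            return tier, value
    return None


def public_view(key: str, knowledge: Knowledge) -> Optional[str]:
    """Return only what the Public tier holds, so other tiers never leak."""
    return knowledge.get(Tier.PUBLIC, {}).get(key)
```

Under these assumptions, internal work would call `resolve`, while anything destined for a public output would go through `public_view` instead.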

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

  • data-transformer

  • local-reviewer

  • completeness-scorer

  • api-fetcher

Each of these is categorized as General and carries the trust labels "Repository Source" and "Needs Review". No summary is provided by the upstream source for any of them.