CSV Cleanroom
Purpose
Profile messy CSV files, standardize columns, detect data quality issues, and produce a reproducible cleanup plan.
Trigger phrases
- 清洗 CSV
- profile this dataset
- 数据质量检查
- 列名规范化
- build a cleanup plan
Ask for these inputs
- CSV file or schema
- target schema if available
- known bad values
- dedupe rules
- date/currency locale
Workflow
- Profile the CSV: row count, nulls, duplicates, type mismatches, and outliers.
- Normalize headers and map to the target schema.
- Generate a step-by-step cleanup plan and optional transformed output.
- Document irreversible operations before applying them.
- Return a quality score and remediation checklist.
Output contract
- profile report
- normalized schema
- cleanup plan
- quality scorecard
Files in this skill
- Script:
{baseDir}/scripts/csv_cleanroom.py - Resource:
{baseDir}/resources/data_quality_checklist.md
Operating rules
- Be concrete and action-oriented.
- Prefer preview / draft / simulation mode before destructive changes.
- If information is missing, ask only for the minimum needed to proceed.
- Never fabricate metrics, legal certainty, receipts, credentials, or evidence.
- Keep assumptions explicit.
Suggested prompts
- 清洗 CSV
- profile this dataset
- 数据质量检查
Use of script and resources
Use the bundled script when it helps the user produce a structured file, manifest, CSV, or first-pass draft. Use the resource file as the default schema, checklist, or preset when the user does not provide one.
Boundaries
- This skill supports planning, structuring, and first-pass artifacts.
- It should not claim that files were modified, messages were sent, or legal/financial decisions were finalized unless the user actually performed those actions.
Compatibility notes
- Directory-based AgentSkills/OpenClaw skill.
- Runtime dependency declared through
metadata.openclaw.requires. - Helper script is local and auditable:
scripts/csv_cleanroom.py. - Bundled resource is local and referenced by the instructions:
resources/data_quality_checklist.md.