Rill ClickHouse Model Workflow
This skill orchestrates $rilldata and $clickhouse-best-practices for two modes:
- Mode A: Review/Refactor Existing Model
- Mode B: Create ClickHouse Model From Scratch (for example from S3 path/source info)
Keep existing model intent when refactoring and avoid unnecessary churn.
1. Prerequisite Check (Hard Gate)
- Confirm the
clickhouse-best-practicesskill is available and installed fromhttps://github.com/ClickHouse/agent-skills. - If the skill is missing or install origin cannot be confirmed, stop immediately and help the user install it with:
npx skills add clickhouse/agent-skills
- After providing the install command, instruct the user to rerun the task once installation is complete.
- Do not run diagnostics, propose plans, or generate model files until the prerequisite is satisfied.
2. Load Required Guidance
Always load relevant guidance from both skill sets.
Rill rule files:
../rilldata/rules/runtime-model.md(source:https://github.com/rilldata/rill/blob/main/runtime/ai/instructions/data/resources/model.md)../rilldata/rules/project-files-models.md(source:https://docs.rilldata.com/reference/project-files/models)../rilldata/rules/runtime-rillyaml.md(source:https://github.com/rilldata/rill/blob/main/runtime/ai/instructions/data/resources/rillyaml.md)../rilldata/rules/project-files-rill-yaml.md(source:https://docs.rilldata.com/reference/project-files/rill-yaml)../rilldata/rules/project-files-connectors.mdwhen connector definitions are needed (source:https://docs.rilldata.com/reference/project-files/connectors)
ClickHouse rules:
- Load applicable rules from
clickhouse-best-practices/rules/*. - Always include primary key/order, partitioning, data type, join, and index/skipping-index checks when relevant.
3. Startup
Start Rill with:
rill start . --no-ui --no-open --verbose --log-format json 2>&1 | tee /tmp/rill-no-ui.log
Keep the runtime available while running diagnostics.
4. Connector Resolution Logic
- Read
rill.yamland captureolap_connector. - Determine effective output connector:
- Mode A: Read target model YAML/SQL and determine explicit output connector (for example model-level
connectororoutput.connectorwhen present). - Mode B: Infer intended output from user request plus project defaults, and use project
olap_connectorwhen model output connector is not specified.
- Mode A: Read target model YAML/SQL and determine explicit output connector (for example model-level
- Connector decision:
- If effective connector is
clickhouse, proceed. - Otherwise stop and report the task is out of scope for this skill.
- If effective connector is
5. Query Diagnostics via Rill
Run diagnostics through rill query first, for example:
rill query --local --path . --connector clickhouse --sql "SELECT partition, count() AS parts, sum(rows) AS rows FROM system.parts WHERE active AND table='<model-name>' GROUP BY partition ORDER BY partition DESC LIMIT 20" --limit 20
Also run model-aware variants for additional target tables when needed (for joins, staging tables, or referenced outputs).
6. Dev-Data Sufficiency Rule
In dev, if returned rows/partitions are too limited to support a confident ClickHouse best-practice assessment:
- Increase both SQL
LIMITand CLI--limittogether using steps such as20 -> 100 -> 500 -> 1000. - After each increase, re-run diagnostics and reassess evidence quality.
- Record:
- original limits
- updated limits
- why the increase was needed
- Stop increasing once evidence is sufficient. Do not use unnecessarily high limits.
7. Mode A: Review / Refactor Existing Model
- Analyze existing model SQL/YAML plus runtime diagnostics.
- Evaluate against applicable ClickHouse and Rill rules.
- Prepare minimal, intent-preserving refactor plan before any edits.
8. Mode B: Create Model From Scratch
8.1 Parameter Checklist Template (Required Before SQL/YAML)
Always output a checklist template and fill what is known before producing model YAML/SQL:
- target model name
- source connector/type (for example
s3,gcs,https, warehouse connector) - source path/URI or table reference
- file/data format (
Parquet,CSV,JSON, table) - event time column (if available)
- expected grain and primary query/filter patterns
- date partition hints from path/table (if any)
- expected volume and refresh expectations
- environment assumptions (
dev/prod) - existing connector resources found in project
8.2 Best-Effort Input Policy
- Use known inputs first.
- Ask targeted follow-up questions only for missing fields that materially affect correctness.
- Continue with safe defaults for non-critical unknowns.
- Explicitly document assumptions before approval.
8.3 Model Creation Defaults
Use a conservative baseline unless evidence supports more:
type: modelmaterialize: true- explicit ClickHouse output targeting behavior aligned with project defaults
output.order_byis required for materialized ClickHouse tables
If source info indicates date-like partitioned paths/tables (for example year=.../month=.../day=...), prefer incremental partitioned shape:
incremental: truepartitions.glob(or equivalent partition declaration)- partition-aware SQL using
{{ .partition.uri }}where appropriate
Set output.primary_key and output.partition_by only when supported by observed source structure, query patterns, or diagnostics evidence.
8.4 Artifact Policy
After approval:
- Create or update the target model file.
- Create or update connector YAML only if the required connector is missing or invalid.
- Keep edits minimal and consistent with existing project conventions.
9. Rule-Driven Review (Both Modes)
Evaluate findings and/or generated design against applicable ClickHouse rules and Rill constraints.
ClickHouse focus (as applicable):
- PK/order design and filter alignment
- partition strategy and lifecycle
- data type selection
- joins and join filtering strategy
- index/skipping-index opportunities
Rill focus:
- valid model syntax/properties
- connector/output semantics
- project-level default connector behavior
10. Planning and Approval Gate
- Produce a concrete edit plan first.
- Pause for user review and explicit approval.
- Do not modify or create model/connector files before approval.
- After approval, apply only approved edits.
11. Required Output Format
Use exactly these sections:
## Rules Checked## Findings## Proposed Plan## User Approval Status## Applied Changes## Verification Results
Additional requirements:
- In create-from-scratch tasks, include the completed parameter checklist and explicit assumptions in
## Findings. - If approval has not been granted,
## Applied Changesmust state that no changes were made.
12. Citation Requirements
- Cite each ClickHouse rule by rule name (for example
schema-pk-plan-before-creation). - Cite each consulted Rill rule file by filename and include the corresponding source URL.