一键导入
clickhouse
// MUST USE when reviewing ClickHouse schemas, queries, or configurations. Contains 28 rules that MUST be checked before providing recommendations. Always read relevant rule files and cite specific rules in responses.
// MUST USE when reviewing ClickHouse schemas, queries, or configurations. Contains 28 rules that MUST be checked before providing recommendations. Always read relevant rule files and cite specific rules in responses.
Maintain and update knowledge base about Duyet Le (@duyet) - profile, resume, blog posts, projects.
Generate or review high-density, interactive, self-contained HTML artifacts based on Thariq Shihipar's "Unreasonable Effectiveness of HTML" patterns. Use this skill when asked to "create HTML," "build a dashboard," "format a PR review," "make an interactive explainer," or when you need to "review and update" existing HTML using the "good-html" style.
Fine-tune LLMs with Unsloth using GRPO or SFT. Supports FP8, vision models, mobile deployment, Docker, packing, GGUF export, dataset preparation, synthetic data, MLX (Apple Silicon). Use when: train with GRPO, fine-tune, reward functions, SFT training, FP8 training, vision fine-tuning, phone deployment, docker training, packing, export to GGUF, prepare dataset, synthetic data, install unsloth, environment flags, MLX training.
Knowledge base about Duyet Le (@duyet) and duyetbot behavioral patterns. Owner profile, knowledge sources, and execution transparency.
Engineering discipline and systematic quality verification. Core principles, anti-patterns, decision rules, and gate procedures for code review and delivery.
Maintain @duyet/skills repository - add, update, validate skills, generate documentation, and manage releases. Use when making changes to this skills repository.
| name | clickhouse |
| description | MUST USE when reviewing ClickHouse schemas, queries, or configurations. Contains 28 rules that MUST be checked before providing recommendations. Always read relevant rule files and cite specific rules in responses. |
Guidance for ClickHouse covering schema design, query optimization, and data ingestion. Contains 28 atomic rules across 3 categories (schema, query, insert), prioritized by impact. Extended with 14 reference files covering cluster management, backups, monitoring, and integrations.
Official docs: ClickHouse Best Practices
Official docs: ClickHouse Best Practices
Example credentials in documentation (password123, AKIAIOSFODNN7EXAMPLE) are placeholders only. Never use these in production. Use proper secret management:
For installation and operational procedures:
apt, yum, helm) over direct downloadslatest in productionBefore answering ClickHouse questions, follow this priority order:
rules/ directoryrule-name..."references/ for deeper topic coverageWhy rules take priority: ClickHouse has specific behaviors (columnar storage, sparse indexes, merge tree mechanics) where general database intuition can be misleading. The rules encode validated, ClickHouse-specific guidance.
Read these rule files in order:
rules/schema-pk-plan-before-creation.md — ORDER BY is immutablerules/schema-pk-cardinality-order.md — Column ordering in keysrules/schema-pk-prioritize-filters.md — Filter column inclusionrules/schema-pk-filter-on-orderby.md — Query filter alignmentrules/schema-types-native-types.md — Proper type selectionrules/schema-types-minimize-bitwidth.md — Numeric type sizingrules/schema-types-lowcardinality.md — LowCardinality usagerules/schema-types-avoid-nullable.md — Nullable vs DEFAULTrules/schema-types-enum.md — Enum for finite value setsrules/schema-partition-low-cardinality.md — Partition count limitsrules/schema-partition-lifecycle.md — Partitioning purposerules/schema-partition-query-tradeoffs.md — Partition pruning trade-offsrules/schema-partition-start-without.md — Start without partitioningrules/schema-json-when-to-use.md — JSON type usageCheck for:
Read these rule files:
rules/query-join-choose-algorithm.md — Algorithm selectionrules/query-join-use-any.md — ANY vs regular JOINrules/query-join-filter-before.md — Pre-join filteringrules/query-join-consider-alternatives.md — Dictionaries/denormalizationrules/query-join-null-handling.md — join_use_nulls settingrules/query-index-skipping-indices.md — Secondary index usagerules/query-mv-incremental.md — Incremental materialized viewsrules/query-mv-refreshable.md — Refreshable materialized viewsCheck for:
Read these rule files:
rules/insert-batch-size.md — Batch sizing requirementsrules/insert-async-small-batches.md — Async insert usagerules/insert-format-native.md — Native format for performancerules/insert-mutation-avoid-update.md — UPDATE alternativesrules/insert-mutation-avoid-delete.md — DELETE alternativesrules/insert-optimize-avoid-final.md — OPTIMIZE TABLE risksCheck for:
Structure review responses as follows:
## Rules Checked
- `rule-name-1` — Compliant / Violation found
- `rule-name-2` — Compliant / Violation found
...
## Findings
### Violations
- **`rule-name`**: Description of the issue
- Current: [what the code does]
- Required: [what it should do]
- Fix: [specific correction]
### Compliant
- `rule-name`: Brief note on why it's correct
## Recommendations
[Prioritized list of changes, citing rules]
| Priority | Category | Impact | Prefix | Count |
|---|---|---|---|---|
| 1 | Primary Key Selection | CRITICAL | schema-pk- | 4 |
| 2 | Data Type Selection | CRITICAL | schema-types- | 5 |
| 3 | JOIN Optimization | CRITICAL | query-join- | 5 |
| 4 | Insert Batching | CRITICAL | insert-batch- | 1 |
| 5 | Mutation Avoidance | CRITICAL | insert-mutation- | 2 |
| 6 | Partitioning Strategy | HIGH | schema-partition- | 4 |
| 7 | Skipping Indices | HIGH | query-index- | 1 |
| 8 | Materialized Views | HIGH | query-mv- | 2 |
| 9 | Async Inserts | HIGH | insert-async- | 2 |
| 10 | OPTIMIZE Avoidance | HIGH | insert-optimize- | 1 |
| 11 | JSON Usage | MEDIUM | schema-json- | 1 |
schema-pk-plan-before-creation — Plan ORDER BY before table creation (immutable)schema-pk-cardinality-order — Order columns low-to-high cardinalityschema-pk-prioritize-filters — Include frequently filtered columnsschema-pk-filter-on-orderby — Query filters must use ORDER BY prefixschema-types-native-types — Use native types, not String for everythingschema-types-minimize-bitwidth — Use smallest numeric type that fitsschema-types-lowcardinality — LowCardinality for <10K unique stringsschema-types-enum — Enum for finite value sets with validationschema-types-avoid-nullable — Avoid Nullable; use DEFAULT insteadschema-partition-low-cardinality — Keep partition count 100-1,000schema-partition-lifecycle — Use partitioning for data lifecycle, not queriesschema-partition-query-tradeoffs — Understand partition pruning trade-offsschema-partition-start-without — Consider starting without partitioningschema-json-when-to-use — JSON for dynamic schemas; typed columns for knownquery-join-choose-algorithm — Select algorithm based on table sizesquery-join-use-any — ANY JOIN when only one match neededquery-join-filter-before — Filter tables before joiningquery-join-consider-alternatives — Dictionaries/denormalization vs JOINquery-join-null-handling — join_use_nulls=0 for default valuesquery-index-skipping-indices — Skipping indices for non-ORDER BY filtersquery-mv-incremental — Incremental MVs for real-time aggregationsquery-mv-refreshable — Refreshable MVs for complex joinsinsert-batch-size — Batch 10K-100K rows per INSERTinsert-async-small-batches — Async inserts for high-frequency small batchesinsert-format-native — Native format for best performanceinsert-mutation-avoid-update — ReplacingMergeTree instead of ALTER UPDATEinsert-mutation-avoid-delete — Lightweight DELETE or DROP PARTITIONinsert-optimize-avoid-final — Let background merges workNeed to store data?
├── < 1M rows, dimension → Memory
└── ≥ 1M rows → MergeTree family
├── Deduplication? → ReplacingMergeTree(version)
├── Changelog? → CollapsingMergeTree(sign)
├── Pre-aggregation? → AggregatingMergeTree()
├── Replication? → ReplicatedMergeTree(...)
└── Default → MergeTree()
See references/table-engines.md for complete reference.
| Issue | Quick Fix |
|---|---|
| Too many parts | OPTIMIZE TABLE table FINAL (see insert-optimize-avoid-final) |
| Slow query | EXPLAIN SELECT ... to check index usage |
| Mutation stuck | Check system.mutations, consider alternatives per insert-mutation-avoid-update |
| Replication lag | Check system.replication_queue, ZooKeeper |
| OOM on query | Increase max_memory_usage, optimize query |
See references/debugging.md for detailed troubleshooting.
For topics beyond the 28 rules, see the references/ directory:
references/core-concepts.md — Architecture, data model, internalsreferences/schema-design.md — Database engines, migrations, version controlreferences/table-design.md — ORDER BY, partitioning, column selectionreferences/table-engines.md — Complete MergeTree family referencereferences/sql-reference.md — Complete SQL dialect, data typesreferences/query-optimization.md — EXPLAIN, JOINs, projections, skip indexesreferences/advanced-features.md — Materialized views, mutations, TTL, dictionariesreferences/debugging.md — Query debugging, merges, mutations, replicationreferences/cluster-management.md — Distributed tables, replication, shardingreferences/backup-restore.md — Backup strategies, disaster recoveryreferences/monitoring.md — Query monitoring, health checks, system queriesreferences/integrations.md — Kafka, S3, PostgreSQL, MySQL, BI toolsreferences/best-practices.md — Complete checklist and anti-patternsreferences/external.md — Altinity KB links, official docsreferences/system-queries.md — Ready-to-use queries for operationsVersion: 1.3.0 Rules: Synced with ClickHouse/agent-skills (Apache-2.0) References: Altinity Knowledge Base (200+ articles) + ClickHouse Official Docs