diagnose-clickhouse-clusters
// Diagnose ClickHouse cluster health and provide concrete remediation.
| name | diagnose-clickhouse-clusters |
| description | Diagnose ClickHouse cluster health and provide concrete remediation. |
- Call `collect_cluster_status` before drawing conclusions about current cluster health.
- Call `collect_rca_evidence` directly when the symptom and target are already clear. Use `collect_cluster_status` first only when you need current health context, severity/outliers, or help choosing the RCA symptom/scope.
- … `high_part_count` and `unknown`.
- Use `status_analysis_mode="windowed"` and reuse the same time window in follow-up calls.
- Use the `visualization` skill for charts. Do not emit chart specs directly from this skill.
- Do not hardcode parts thresholds in responses. Use the thresholds and severities returned by `collect_cluster_status`.
Use one of these two formats:
Summary table:
Always print the title line `### Summary` immediately before the table.
| Status | Nodes with Issues | Checks Run | Timestamp |
|---|---|---|---|
| 🟢 OK / 🟠 WARNING / 🔴 CRITICAL | N | categories | ISO8601 |
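The summary layout above can be sketched as a small renderer. This is a minimal illustration, not part of the skill: the function name `render_summary` and its parameters are hypothetical, and it assumes the status arrives as one of the strings `OK`, `WARNING`, or `CRITICAL`.

```python
from datetime import datetime, timezone

# Status labels always pair the emoji with its text label.
STATUS_LABELS = {
    "OK": "🟢 OK",
    "WARNING": "🟠 WARNING",
    "CRITICAL": "🔴 CRITICAL",
}


def render_summary(status, nodes_with_issues, checks_run, timestamp):
    """Render the '### Summary' title line followed by the one-row table.

    All parameter names are hypothetical; the real tool output may differ.
    """
    row = [
        STATUS_LABELS[status],
        str(nodes_with_issues),
        ", ".join(checks_run),      # categories that were checked
        timestamp.isoformat(),      # ISO 8601 timestamp
    ]
    return "\n".join([
        "### Summary",
        "| Status | Nodes with Issues | Checks Run | Timestamp |",
        "|---|---|---|---|",
        "| " + " | ".join(row) + " |",
    ])


if __name__ == "__main__":
    print(render_summary(
        "WARNING", 2, ["parts", "replication"],
        datetime(2024, 1, 1, tzinfo=timezone.utc),
    ))
```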
Findings by category:
Always print the title line `### Findings by Category` immediately before the table.
Use a markdown table (not bullets) with one row per category.
Required columns:
| Category | Status | Key Metrics | Top Outlier / Scope | Notes |
|---|---|---|---|---|
| parts / errors / replication / ... | 🟢 OK / 🟠 WARNING / 🔴 CRITICAL | concise metric values with thresholds | node/table if present, else - | one short phrase |
Table rules:
- … `collect_cluster_status` in stable order.
- Show each status as emoji plus text label (for example `🟠 WARNING`), never emoji-only.
- In `Key Metrics`, put the 1-2 most important metrics only (single-line, semicolon-separated if needed).
- Write `Notes` as compact key/value items (single-line).
- Prefer compact values (for example `max_parts_per_table=533 (>500)`); avoid prose-heavy sentences.
- … (`db.table` or `db`) in all table cells.
- … `Notes` as compact comma-separated items.
- If no node or table is present, set `Top Outlier / Scope` to `-`.

Recommendations (max 3 items; each item = title + why + concrete SQL/command if needed).
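As an illustration of the findings table and its row rules, here is a minimal sketch. The input shape (a list of dicts with `name`, `status`, `metrics`, `outlier`, `notes`) is an assumption made for this example and is not the documented output of `collect_cluster_status`. Thresholds are read from the input rather than hardcoded, and annotating only exceeded thresholds is also an assumption.

```python
STATUS_LABELS = {
    "OK": "🟢 OK",
    "WARNING": "🟠 WARNING",
    "CRITICAL": "🔴 CRITICAL",
}


def format_metric(name, value, threshold=None):
    """Compact metric text, e.g. max_parts_per_table=533 (>500).

    Assumption: the threshold is supplied by the tool output and is only
    shown when the value exceeds it.
    """
    text = f"{name}={value}"
    if threshold is not None and value > threshold:
        text += f" (>{threshold})"
    return text


def render_findings(categories):
    """Render one table row per category, in the order given."""
    lines = [
        "### Findings by Category",
        "| Category | Status | Key Metrics | Top Outlier / Scope | Notes |",
        "|---|---|---|---|---|",
    ]
    for cat in categories:
        # Keep only the 1-2 most important metrics, semicolon-separated.
        metrics = "; ".join(format_metric(**m) for m in cat["metrics"][:2]) or "-"
        outlier = cat.get("outlier") or "-"   # "-" when no node/table is present
        row = [cat["name"], STATUS_LABELS[cat["status"]], metrics, outlier, cat["notes"]]
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [{
        "name": "parts",
        "status": "WARNING",
        "metrics": [{"name": "max_parts_per_table", "value": 533, "threshold": 500}],
        "outlier": "db.events",   # hypothetical table name
        "notes": "merges lagging",
    }]
    print(render_findings(sample))
```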
Use compact structure only:
cause | support_score | evidence.
In evidence, render up to 3 evidence_for items prefixed with ✓ and up to 2 evidence_against items prefixed with ✗, separated by <br/>.
When excluded_candidates is non-empty, include at least one excluded reason as a ✗ item for the most relevant row.
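The evidence cell described above might be assembled as follows. This is a sketch under stated assumptions: the candidate is modeled as a plain dict with `evidence_for` and `evidence_against` lists, the helper name `render_evidence` is hypothetical, and letting an excluded reason take one of the two `✗` slots is an interpretation the source does not spell out.

```python
def render_evidence(candidate, excluded_reason=None):
    """Build a single evidence cell for one RCA candidate row.

    Up to 3 supporting items are prefixed with a check mark and up to 2
    opposing items with a cross, joined by <br/>.
    Assumption: when an excluded reason is supplied, it is listed first
    among the opposing items and counts toward the limit of 2.
    """
    supporting = list(candidate.get("evidence_for", []))[:3]

    opposing = list(candidate.get("evidence_against", []))
    if excluded_reason:
        opposing = [excluded_reason] + opposing
    opposing = opposing[:2]

    parts = [f"✓ {item}" for item in supporting]
    parts += [f"✗ {item}" for item in opposing]
    return "<br/>".join(parts)


if __name__ == "__main__":
    candidate = {
        "evidence_for": ["many small inserts", "merge backlog growing"],
        "evidence_against": ["no mutations pending"],
    }
    print(render_evidence(candidate, excluded_reason="disk pressure ruled out"))
```

Passing the excluded reason only for the most relevant row keeps the other rows limited to their own candidate evidence.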
Evidence fidelity rules:
- Take each row's evidence from `candidate.evidence_for` and `candidate.evidence_against` returned by `collect_rca_evidence` for that row.
- Do not merge `observations`, other candidates, or status output into the evidence cell.
- … `candidate.evidence_for` or `candidate.evidence_against`.
- You may cite `indicators_matched`/`indicators_checked`, but never imply more matched checks than the tool returned.

- Write `3. **Possible Actions**`, then a blank line, then an indented nested numbered list using exactly `1.`, `2.`, `3.`.
- Do not continue the outer top-level numbering for action items.
- Write `4. **Gaps / Next Checks**`, then a blank line, then indented bullets using exactly `-`.

RCA brevity limits:
- Call `collect_cluster_status` before giving any opinion on current health.
- Use `status_analysis_mode="windowed"` when the user asks for a bounded time window or historical context.
- For RCA requests, call `collect_rca_evidence`. `collect_cluster_status` is optional unless current health context is needed.
- If `gaps[]` is non-empty, explicitly state what evidence is missing.
- If `support_score < 0.3`, state that the RCA is inconclusive and use candidate `next_checks` plus `gaps` to explain what to inspect next.
- If `support_score` is in the 0.30-0.39 range, present the candidate as a possibility with caveats and emphasize candidate `next_checks`.
- … `evidence_for` and `evidence_against`.
- … `collect_rca_evidence`.
- If `related_symptoms` is non-empty, include a line `Related symptoms:` and list them.
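The `support_score` handling can be summarized in a short sketch. The function name and the returned labels are hypothetical. The source only defines behavior below 0.3 and in the 0.30-0.39 range, so the sketch returns a neutral `"other"` label for anything higher instead of guessing a rule.

```python
def classify_support(support_score):
    """Map a support_score to the presentation rule that applies to it.

    Returns hypothetical labels:
      "inconclusive" -> state the RCA is inconclusive; point to next_checks and gaps
      "possible"     -> present as a possibility with caveats; emphasize next_checks
      "other"        -> no rule is given in the source for 0.40 and above
    """
    if support_score < 0.30:
        return "inconclusive"
    if support_score < 0.40:
        return "possible"
    return "other"


if __name__ == "__main__":
    for score in (0.12, 0.30, 0.39, 0.75):
        print(score, classify_support(score))
```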