원클릭으로
clickhouse-query-profiling
// Profile GKG queries against ClickHouse with the query-profiler CLI. For optimizing query performance, comparing query plans, investigating slow queries, or checking ClickHouse resource usage.
// Profile GKG queries against ClickHouse with the query-profiler CLI. For optimizing query performance, comparing query plans, investigating slow queries, or checking ClickHouse resource usage.
Query the GitLab Knowledge Graph (Orbit) via `glab orbit remote` CLI subcommands or run a local copy with `glab orbit local`. Use for code-structure questions (who calls this function, where is this symbol defined), cross-project dependency and blast-radius analysis, merge-request and contributor queries, and any question answerable by traversing GitLab's unified entity graph (projects, users, MRs, issues, pipelines, files, definitions, vulnerabilities).
Audit and update documentation after code changes. Use when architecture, APIs, or behavior changed and docs may have drifted.
Investigate query evaluation failures in the Knowledge Graph synthetic data pipeline. Use when queries fail or return unexpected results after running the evaluate binary.
AST-based code search and rewrite via tree-sitter patterns. Use instead of Grep/Edit for structural matching, batch rewrites, or context-aware queries (e.g. "unwrap inside impl blocks").
Investigate the history, usage, and liveness of code using search and git blame/log. Use when determining if code is dead, understanding why something exists, finding all callers before refactoring, or deciding whether something is safe to remove. Also useful for answering "who added this and why" or "is anything still using this".
GitLab Pajamas Design System expert for building UIs with Pajamas components and patterns. Use when: (1) implementing UI that should follow GitLab's Pajamas design system, (2) selecting or configuring Pajamas/GlComponent components (GlButton, GlAlert, GlModal, etc.), (3) translating Figma designs into Pajamas-compliant code, (4) questions about Pajamas component usage, variants, categories, or accessibility, (5) building GitLab-style interfaces, or (6) the user mentions "Pajamas", "GitLab UI", "Gl components", or "design system" in a GitLab context. Works hand-in-hand with the implement-design skill and Figma MCP tools.
| name | clickhouse-query-profiling |
| description | Profile GKG queries against ClickHouse with the query-profiler CLI. For optimizing query performance, comparing query plans, investigating slow queries, or checking ClickHouse resource usage. |
Run GKG JSON DSL queries directly against any ClickHouse instance and get back execution stats, EXPLAIN plans, CPU/memory profiling, and instance health. No gRPC server or Rails needed.
mise build
Set connection via env vars or a .env file:
export CLICKHOUSE_URL=http://localhost:8123
export CLICKHOUSE_DATABASE=gkg_graph
export CLICKHOUSE_USER=<user>
export CLICKHOUSE_PASSWORD=<password>
Or use CLI flags: --ch-url, --ch-database, --ch-user, --ch-password.
Basic search with stats:
mise query:profile -- \
-t '1/' \
'{"query_type":"search","node":{"id":"p","entity":"Project","columns":["name"]},"limit":5}'
With EXPLAIN plans:
mise query:profile -- \
-t '1/' --explain \
'{"query_type":"traversal","nodes":[{"id":"mr","entity":"MergeRequest"},{"id":"p","entity":"Project"}],"relationships":[{"type":"IN_PROJECT","from":"mr","to":"p"}],"limit":10}'
With CPU/memory profiling from system.query_log:
mise query:profile -- \
-t '1/' --explain --profile --processors \
'{"query_type":"aggregation","nodes":[{"id":"mr","entity":"MergeRequest"},{"id":"p","entity":"Project"}],"relationships":[{"type":"IN_PROJECT","from":"mr","to":"p"}],"aggregations":[{"function":"count","target":"mr","group_by":"p","alias":"mr_count"}],"limit":10}'
With instance health snapshot:
mise query:profile -- -t '1/' --health '{"query_type":"search","node":{"id":"p","entity":"Project"},"limit":1}'
Multiple traversal paths:
mise query:profile -- -t '1/2/3/' -t '1/4/5/' '{"query_type":"search","node":{"id":"p","entity":"Project"},"limit":5}'
Query from file:
mise query:profile -- -t '1/' --explain @fixtures/queries/my_query.json
By default output is pretty-printed JSON to stdout. Use --format json for compact JSON.
Write results to a file with --output (-o). Parent directories are created automatically:
mise query:profile -- -t '1/' --explain -o fixtures/profiling/results.json @fixtures/queries/my_query.json
When --output is used, the profiler writes the JSON file and prints the path to stderr. When omitted, results go to stdout. Progress and errors always go to stderr, so they never mix with JSON output.
Result files go in fixtures/profiling/ (gitignored).
Use --output so results are written to disk when the run finishes, then read the output file to analyze results:
mise query:profile -- -t '1/' --explain --profile \
-o fixtures/profiling/showcase.json \
@fixtures/queries/optimization_showcase.json
# Then read fixtures/profiling/showcase.json to analyze
The JSON output has these sections:
query -- the original query JSONsecurity_context -- org_id and traversal_pathscompilation -- parameterized SQL, rendered SQL, hydration planexecutions -- per-query metrics (1 base + N hydration queries), each with:
label -- "base" or "hydration:Project", etc.query_id -- ClickHouse query ID for system table correlationstats -- read_rows, read_bytes, result_rows, result_bytes, elapsed_ns, memory_usageexplain_plan -- EXPLAIN PLAN with index usage (with --explain)explain_pipeline -- EXPLAIN PIPELINE processor graph (with --explain)query_log -- CPU times, cache stats, ProfileEvents (with --profile)processors -- per-processor pipeline breakdown (with --processors)summary -- totals across all queriesinstance_health -- server health (with --health)The main optimization targets:
read_rows -- total rows scanned. This is the number to reduce.read_bytes -- data volume read from disk/cache.memory_usage -- RAM consumed per query. Watch for spills.elapsed_ns -- server-side wall clock time.With --profile, you also get:
selected_parts / selected_marks -- how well the primary key prunes datamark_cache_hits / mark_cache_misses -- mark cache hit ratereal_time_us vs user_time_us -- tells you if a query is CPU-bound or I/O-boundos_io_wait_us -- time spent waiting on diskexternal_sort_bytes / external_agg_bytes -- nonzero means memory spilled to diskWith --processors:
input_wait_us means a processor is starved for input (upstream is the bottleneck)output_wait_us means a processor is blocked on output (downstream is the bottleneck)With --health:
active_merges -- background merges compete with queries for I/Otemp_files_* -- active memory spills across the servertable_parts -- high part count per table means fragmentation, which slows readsThe profiler can run all queries from a file where keys are query names and values are query objects. Queries are executed sequentially so profiling numbers are not polluted by concurrent load.
mise query:profile -- -t '1/' --explain @fixtures/queries/optimization_showcase.json
Filter to a subset by name substring:
mise query:profile -- -t '1/' --explain --filter aggregation @fixtures/queries/optimization_showcase.json
Write multi-query results to a file:
mise query:profile -- -t '1/' --explain \
-o fixtures/profiling/showcase.json \
@fixtures/queries/optimization_showcase.json
Progress is printed to stderr ([3/25] query_name...). Output is a JSON object keyed by query name, each value being the standard profiler output. Queries that fail are recorded with an {"error": "..."} value and the run continues with the rest.
mise query:profile -- -t '1/' --explain -o fixtures/profiling/before.json QUERYmise buildmise query:profile -- -t '1/' --explain -o fixtures/profiling/after.json QUERYmise query:diff -- fixtures/profiling/before.json fixtures/profiling/after.json --labels before,aftermise query:diff compares two or more profiler result files and produces a markdown table.
Two-way comparison (default metric is read_rows):
mise query:diff -- fixtures/profiling/baseline.json fixtures/profiling/dedup.json --labels baseline,dedup
Compare memory usage:
mise query:diff -- fixtures/profiling/baseline.json fixtures/profiling/dedup.json --labels baseline,dedup --metric memory
All metrics in separate tables:
mise query:diff -- fixtures/profiling/baseline.json fixtures/profiling/dedup.json --labels baseline,dedup --all-metrics
N-way comparison (3+ files):
mise query:diff -- fixtures/profiling/v1.json fixtures/profiling/v2.json fixtures/profiling/v3.json --labels v1,v2,v3
CSV output for spreadsheets:
mise query:diff -- fixtures/profiling/baseline.json fixtures/profiling/dedup.json --format csv
Available metrics: read_rows (default), read_bytes, memory, elapsed_ms.
--profile runs SYSTEM FLUSH LOGS before querying system.query_log, which adds about 100ms.--profile and --health require the ClickHouse user to have SELECT on system tables.json (compact). Use --format pretty for human-readable output.