with one click
cpu-profile
Profile a ClickHouse query using the sampling query profiler and system.trace_log. Use when the user wants to find CPU hotspots, analyze where time is spent in a query, or investigate performance bottlenecks.
Menu
Profile a ClickHouse query using the sampling query profiler and system.trace_log. Use when the user wants to find CPU hotspots, analyze where time is spent in a query, or investigate performance bottlenecks.
| name | cpu-profile |
| description | Profile a ClickHouse query using the sampling query profiler and system.trace_log. Use when the user wants to find CPU hotspots, analyze where time is spent in a query, or investigate performance bottlenecks. |
| argument-hint | ["query_id or query text"] |
| disable-model-invocation | false |
| allowed-tools | Task, Bash, Read, Grep, Glob, AskUserQuestion |
Profile a ClickHouse query using the built-in sampling query profiler (system.trace_log).
Collects CPU stack traces at configurable intervals and analyzes them to find hotspots.
$ARGUMENTS (optional): Either a query_id to analyze existing traces, or a SQL query to execute with profiling enabled.If $ARGUMENTS looks like a UUID (e.g., a1b2c3d4-e5f6-...), treat it as a query_id and skip to Step 3.
If $ARGUMENTS is a SQL query or query description, proceed to Step 2.
If $ARGUMENTS is empty, ask the user:
If the user wants to see recent slow queries:
SELECT
query_id,
query_duration_ms,
formatReadableSize(memory_usage) AS peak_memory,
left(query, 120) AS query_preview
FROM system.query_log
WHERE type = 'QueryFinish'
AND event_date >= today() - 1
AND query_duration_ms > 1000
AND query NOT LIKE '%system.%'
ORDER BY query_duration_ms DESC
LIMIT 20
SETTINGS allow_introspection_functions = 1
Generate a unique query ID and run the query with aggressive profiling settings (100us sampling = ~10,000 samples/sec).
Use clickhouse-client in non-interactive mode with an explicit --query_id to avoid any race with concurrent queries:
PROFILE_QID="cpu-profile-$(uuidgen)"
clickhouse-client --query_id "$PROFILE_QID" -q "
SELECT ...
SETTINGS query_profiler_cpu_time_period_ns = 100000,
query_profiler_real_time_period_ns = 100000
"
Alternatively, if running interactively, parse the Query id: <uuid> line that clickhouse-client prints before each query.
After execution, verify the query completed and collect metadata:
SELECT query_id, query_duration_ms, formatReadableSize(memory_usage) AS peak_memory
FROM system.query_log
WHERE type = 'QueryFinish' AND query_id = '{query_id}'
SETTINGS allow_introspection_functions = 1
Wait 2 seconds for trace_log to flush, then proceed to Step 3.
Run these analyses in parallel using Task tool (3 tasks):
SELECT
count() AS samples,
round(100.0 * count() / (SELECT count() FROM system.trace_log WHERE query_id = '{query_id}' AND trace_type = 'CPU'), 2) AS pct,
demangle(addressToSymbol(trace[1])) AS function
FROM system.trace_log
WHERE query_id = '{query_id}'
AND trace_type = 'CPU'
GROUP BY function
ORDER BY samples DESC
LIMIT 30
SETTINGS allow_introspection_functions = 1
SELECT
count() AS samples,
arrayStringConcat(
arrayMap(x -> demangle(addressToSymbol(x)), trace),
'\n '
) AS stack
FROM system.trace_log
WHERE query_id = '{query_id}'
AND trace_type = 'CPU'
GROUP BY trace
ORDER BY samples DESC
LIMIT 15
SETTINGS allow_introspection_functions = 1
SELECT
concat(
arrayStringConcat(
arrayReverse(arrayMap(x -> demangle(addressToSymbol(x)), trace)),
';'
),
' ',
toString(count())
)
FROM system.trace_log
WHERE query_id = '{query_id}'
AND trace_type = 'CPU'
GROUP BY trace
ORDER BY count() DESC
SETTINGS allow_introspection_functions = 1
FORMAT TSVRaw
Save this output to tmp/cpu_profile_{query_id}.collapsed for optional flamegraph generation.
Also collect metadata:
SELECT
count() AS total_samples,
min(event_time_microseconds) AS first_sample,
max(event_time_microseconds) AS last_sample,
dateDiff('millisecond', min(event_time_microseconds), max(event_time_microseconds)) AS profile_duration_ms
FROM system.trace_log
WHERE query_id = '{query_id}' AND trace_type = 'CPU'
SETTINGS allow_introspection_functions = 1
Using outputs from all three agents, produce a structured report:
Ask the user:
trace_type = 'Real' to find wall-clock hotspots (IO waits, lock contention)flamegraph.pl is available, render SVG:
flamegraph.pl --title "CPU Profile: {query_id}" --countname samples --width 1800 \
tmp/cpu_profile_{query_id}.collapsed > tmp/cpu_flamegraph_{query_id}.svg
Or suggest using https://www.speedscope.app with the collapsed file.addressToLine for source file:line mapping:
SELECT
count() AS samples,
demangle(addressToSymbol(trace[1])) AS function,
addressToLine(trace[1]) AS source_location
FROM system.trace_log
WHERE query_id = '{query_id}' AND trace_type = 'CPU'
GROUP BY function, source_location
ORDER BY samples DESC
LIMIT 30
SETTINGS allow_introspection_functions = 1
Repeat drill-down until user selects "Done".
query_profiler_cpu_time_period_ns setting controls sampling frequency. Default is 1,000,000,000 (1 sample/sec). Use 100,000 (100us) for detailed profiling of short queries, 1,000,000 (1ms) for longer queries.trace_type = 'CPU' counts CPU time; trace_type = 'Real' counts wall-clock time (includes IO waits).allow_introspection_functions = 1 is required for addressToSymbol, demangle, addressToLine.clickhouse-common-static-dbg package must be installed for symbol resolution.FROM clusterAllReplicas(default, system.trace_log) to collect traces from all nodes.system.trace_log are stored as arrays of addresses, with index 1 being the innermost (leaf) frame./cpu-profile โ Interactive: choose a query to profile/cpu-profile a1b2c3d4-e5f6-7890-abcd-ef1234567890 โ Analyze existing traces for a query_id/cpu-profile SELECT count() FROM lineitem WHERE l_shipdate > '1995-01-01' โ Execute and profile a queryInvestigate a ClickHouse CI failure end-to-end from a PR or S3 report URL. Fetches the failed tests and their output, searches for an existing tracking GitHub issue, classifies each as flaky vs a real regression using play.clickhouse.com master history, downloads and reads the harness artifacts only for failures that history does not explain, and reports a root-cause hypothesis. Read-only first pass โ never commits, pushes, or edits.
Extract the inner ELF from a ClickHouse self-extracting `clickhouse` binary, including when its architecture differs from the host (e.g. to load an aarch64 CI core dump on an x86 workstation). Use when gdb/lldb needs real symbols from a downloaded CI/release binary, or when self-extraction by running the binary is not possible because of an architecture mismatch.
Create a ClickHouse git worktree with submodules hardlinked from the main repo. Use when the user wants to create a new worktree for ClickHouse development.
Fix the "CH Inc sync" job in a pull request. Diagnoses the failure as one of three types โ conflicts found, build failed, or tests failed โ and handles each accordingly.
Review a ClickHouse Pull Request for correctness, safety, performance, and compliance. Use when the user wants to review a PR or diff.
Continue work on an existing PR - resolve conflicts, fix CI failures, address reviewer feedback, and push updates. Use when the user wants to pick up and advance a pull request.