تشغيل أي مهارة في Manus بنقرة واحدة

$pwd:

profiling-statement-fingerprints

Name: Profiling Statement Fingerprints
Author: cockroachdb

// Ranks and analyzes statement fingerprints using aggregated SQL statistics from crdb_internal.statement_statistics to identify slow, resource-intensive, or error-prone query patterns. Use when investigating historical performance trends, identifying optimization opportunities, or diagnosing recurring slowness without DB Console access.

تشغيل في Manus

$ git log --oneline --stat

stars:١

forks:١

updated:٩ أبريل ٢٠٢٦ في ٠٢:٥٤

مستكشف الملفات

4 ملفات

SKILL.md

readonly

related-skills.json

نفس المستودع

benchmarking-transaction-patterns.md

from "cockroachdb/cursor-plugin"

Guides benchmarking and comparing explicit multi-statement transactions versus single-statement CTE transactions in CockroachDB, with fair test methodology, contention analysis, and performance interpretation. Use when comparing transaction formulations, benchmarking CockroachDB workloads under contention, investigating retry pressure, or deciding whether to rewrite multi-step application flows into single SQL statements.

2026-05-121

analyzing-schema-change-storage-risk.md

from "cockroachdb/cursor-plugin"

Estimates storage requirements for CockroachDB online schema change backfills using SHOW RANGES WITH DETAILS, KEYS, INDEXES. Use before CREATE INDEX, ADD COLUMN with INDEX/UNIQUE, ALTER PRIMARY KEY, CREATE MATERIALIZED VIEW, CREATE TABLE AS, REFRESH, or SET LOCALITY on tables with large per-index footprints, to avoid mid-backfill disk exhaustion.

2026-05-031

setting-up-local-cluster.md

from "cockroachdb/cursor-plugin"

Downloads and starts a local CockroachDB cluster for development using the official binary. Use when a developer needs a local CockroachDB instance, when no cluster is available, or when setting up a new development environment.

2026-04-241

designing-application-transactions.md

from "cockroachdb/cursor-plugin"

Guides application developers in designing correct and performant transaction patterns for CockroachDB, covering transaction lifetime, implicit vs explicit transactions, retry handling with exponential backoff, pushing invariants into SQL, selective pessimistic locking, set-based operations, connection pooling, prepared statements, keyset pagination, follower reads, and separating business logic from database logic. Use when building applications on CockroachDB, designing transaction workflows, handling retries, optimizing application-layer database interactions, or configuring connection pools.

2026-04-091

designing-multi-region-applications.md

from "cockroachdb/cursor-plugin"

Guides developers in selecting and implementing multi-region patterns for CockroachDB applications, covering active-passive vs active-active architectures, REGIONAL BY ROW, GLOBAL tables, manual geo-partitioning with lease preferences, and live demo setup with validation queries. Use when designing multi-region database topologies, choosing between REGIONAL BY ROW and manual partitioning, building multi-region demos, or optimizing cross-region latency.

2026-04-091

analyzing-range-distribution.md

from "cockroachdb/cursor-plugin"

Analyzes CockroachDB range distribution across tables and indexes using SHOW RANGES to identify range count, size patterns, leaseholder placement, and replication health. Use when investigating hotspots, uneven data distribution, range fragmentation, or validating zone configuration effects without DB Console access.

2026-04-091

package.json

"author": "cockroachdb"

"repository": "cockroachdb/cursor-plugin"

فتح مستودع GitHub عرض مستودعات المنشئ

$ install --global

$ download --local

تشغيل في Manus

$ useful --forSOC

مصممو قواعد البياناتمهن الحاسوب والرياضيات15-1243L4

name	profiling-statement-fingerprints
description	Ranks and analyzes statement fingerprints using aggregated SQL statistics from crdb_internal.statement_statistics to identify slow, resource-intensive, or error-prone query patterns. Use when investigating historical performance trends, identifying optimization opportunities, or diagnosing recurring slowness without DB Console access.
compatibility	Requires SQL access with VIEWACTIVITY or VIEWACTIVITYREDACTED cluster privilege. Uses crdb_internal.statement_statistics (production-safe). Execution statistics fields are sampled and may be sparse.
metadata	{"author":"cockroachdb","version":"1.0"}

Profiling Statement Fingerprints

Analyzes historical statement performance patterns using aggregated SQL statistics to identify slow, resource-intensive, or error-prone query fingerprints. Uses crdb_internal.statement_statistics for time-windowed analysis of latency, CPU, contention, admission delays, and failure rates - entirely via SQL without requiring DB Console access.

Complement to triaging-live-sql-activity: This skill analyzes historical patterns; for immediate triage of currently running queries, see triaging-live-sql-activity.

When to Use This Skill

Identify slowest statement fingerprints over past hours/days/weeks
Find queries with high CPU consumption, contention, or admission waits
Investigate performance regressions or plan changes
Locate full table scans or missing indexes via index recommendations
Analyze resource consumption by application or database
SQL-only historical analysis without DB Console access

For immediate incident response: Use triaging-live-sql-activity to triage currently running queries and cancel runaway work. For transaction-level analysis: Use profiling-transaction-fingerprints to analyze retry patterns, commit latency, and statement composition at the transaction boundary. For background job monitoring: Use monitoring-background-jobs for long-running schema changes and automatic jobs excluded from statement statistics.

Prerequisites

SQL connection to CockroachDB cluster
VIEWACTIVITY or VIEWACTIVITYREDACTED cluster privilege for cluster-wide visibility
Statement statistics collection enabled (default): sql.stats.automatic_collection.enabled = true

Check collection status:

SHOW CLUSTER SETTING sql.stats.automatic_collection.enabled;  -- Should return: true

See triaging-live-sql-activity permissions reference for RBAC setup (same privileges).

Core Concepts

Statement Fingerprints vs Live Queries

Statement fingerprint: Normalized SQL pattern with parameterized constants (e.g., SELECT * FROM users WHERE id = $1 vs SELECT * FROM users WHERE id = 123)

Key differences:

Time scope: Historical hourly buckets vs real-time current state
Granularity: Aggregated pattern statistics vs individual execution instances

Time-Series Bucketing

aggregated_ts: Hourly UTC buckets (e.g., 2026-02-21 14:00:00 = 14:00-14:59 executions) Data retention: Default ~7 days (check sql.stats.persisted_rows.max) Best practice: Always filter by time window: WHERE aggregated_ts > now() - INTERVAL '24 hours'

Aggregated vs Sampled Metrics

Metric Category	JSON Path	Scope	Use Case
Aggregated	`statistics.statistics.*`	All executions	Latency, row counts, execution counts
Sampled	`statistics.execution_statistics.*`	~10% sample	CPU, contention, admission wait, memory/disk

Critical: Always check sampled metrics presence: WHERE (statistics->'execution_statistics'->>'cnt') IS NOT NULL

JSON Field Extraction

Operators:

->: Extract JSON object (returns JSON)
->>: Extract as text (returns text)
::TYPE: Cast to specific type

Examples:

metadata->>'db'                                              -- Database name
(statistics->'statistics'->>'cnt')::INT                      -- Execution count
(statistics->'statistics'->'runLat'->>'mean')::FLOAT8        -- Mean latency (seconds)
(statistics->'execution_statistics'->'cpuSQLNanos'->>'mean')::FLOAT8 / 1e9  -- CPU (convert nanos to seconds)

Units: Latency = seconds, CPU/admission = nanoseconds (÷ 1e9), Memory/disk = bytes (÷ 1048576 for MB)

See JSON field reference for complete schema.

Core Diagnostic Queries

Query 1: Top Statements by Mean Run Latency

SELECT
  fingerprint_id,
  metadata->>'db' AS database,
  metadata->>'query' AS query_text,
  (statistics->'statistics'->>'cnt')::INT AS execution_count,
  (statistics->'statistics'->'runLat'->>'mean')::FLOAT8 AS mean_run_lat_seconds,
  (statistics->'statistics'->'runLat'->>'max')::FLOAT8 AS max_run_lat_seconds,
  (metadata->>'fullScan')::BOOL AS full_scan,
  metadata->'index_recommendations' AS index_recommendations,
  aggregated_ts
FROM crdb_internal.statement_statistics
WHERE aggregated_ts > now() - INTERVAL '24 hours'
  AND (statistics->'statistics'->'runLat'->>'mean')::FLOAT8 > 1.0  -- > 1 second mean latency
ORDER BY (statistics->'statistics'->'runLat'->>'mean')::FLOAT8 DESC
LIMIT 20;

Focus: Slowest queries; check full_scan and index_recommendations for optimization opportunities.

Query 2: Admission Control Impact

SELECT
  fingerprint_id,
  metadata->>'db' AS database,
  metadata->>'query' AS query_text,
  (statistics->'statistics'->>'cnt')::INT AS execution_count,
  (statistics->'execution_statistics'->'admissionWaitTime'->>'mean')::FLOAT8 / 1e9 AS mean_admission_wait_seconds,
  (statistics->'statistics'->'runLat'->>'mean')::FLOAT8 AS mean_run_lat_seconds,
  aggregated_ts
FROM crdb_internal.statement_statistics
WHERE aggregated_ts > now() - INTERVAL '24 hours'
  AND (statistics->'execution_statistics'->>'cnt') IS NOT NULL
  AND (statistics->'execution_statistics'->'admissionWaitTime'->>'mean')::FLOAT8 > 0
ORDER BY (statistics->'execution_statistics'->'admissionWaitTime'->>'mean')::FLOAT8 DESC
LIMIT 20;

Interpretation: High admission wait = cluster at resource limits (CPU, memory, I/O). Ratio > 1.0 (wait > runtime) indicates severe queueing.

Query 3: Plan Hash Diversity

SELECT
  fingerprint_id,
  metadata->>'db' AS database,
  metadata->>'query' AS query_text,
  COUNT(DISTINCT plan_hash) AS distinct_plan_count,
  array_agg(DISTINCT plan_hash ORDER BY plan_hash) AS plan_hashes,
  SUM((statistics->'statistics'->>'cnt')::INT) AS total_executions
FROM crdb_internal.statement_statistics
WHERE aggregated_ts > now() - INTERVAL '7 days'
GROUP BY fingerprint_id, metadata->>'db', metadata->>'query'
HAVING COUNT(DISTINCT plan_hash) > 1
ORDER BY COUNT(DISTINCT plan_hash) DESC, SUM((statistics->'statistics'->>'cnt')::INT) DESC
LIMIT 20;

Interpretation: Multiple plans indicate instability from schema changes, statistics updates, or routing changes. Performance can vary significantly between plans.

Query 4: High Contention Statements

SELECT
  fingerprint_id,
  metadata->>'db' AS database,
  metadata->>'app' AS application,
  substring(metadata->>'query', 1, 150) AS query_preview,
  (statistics->'statistics'->>'cnt')::INT AS execution_count,
  (statistics->'execution_statistics'->'contentionTime'->>'mean')::FLOAT8 / 1e9 AS mean_contention_seconds,
  ROUND(
    ((statistics->'execution_statistics'->'contentionTime'->>'mean')::FLOAT8 / 1e9) /
    NULLIF((statistics->'statistics'->'runLat'->>'mean')::FLOAT8, 0) * 100, 2
  ) AS contention_pct_of_runtime,
  aggregated_ts
FROM crdb_internal.statement_statistics
WHERE aggregated_ts > now() - INTERVAL '24 hours'
  AND (statistics->'execution_statistics'->>'cnt') IS NOT NULL
  AND (statistics->'execution_statistics'->'contentionTime'->>'mean')::FLOAT8 > 0
ORDER BY (statistics->'execution_statistics'->'contentionTime'->>'mean')::FLOAT8 DESC
LIMIT 20;

Interpretation: >20% contention = transaction conflicts, hot row access. Remediate with batching, transaction boundary changes, or schema redesign.

Query 5: High CPU Consumers

SELECT
  fingerprint_id,
  metadata->>'db' AS database,
  substring(metadata->>'query', 1, 150) AS query_preview,
  (statistics->'execution_statistics'->'cpuSQLNanos'->>'mean')::FLOAT8 / 1e9 AS mean_cpu_seconds,
  (statistics->'statistics'->>'cnt')::INT AS total_executions,
  ROUND(
    ((statistics->'execution_statistics'->'cpuSQLNanos'->>'mean')::FLOAT8 / 1e9) *
    (statistics->'statistics'->>'cnt')::INT, 2
  ) AS estimated_total_cpu_seconds,
  (metadata->>'fullScan')::BOOL AS full_scan,
  aggregated_ts
FROM crdb_internal.statement_statistics
WHERE aggregated_ts > now() - INTERVAL '24 hours'
  AND (statistics->'execution_statistics'->>'cnt') IS NOT NULL
  AND (statistics->'execution_statistics'->'cpuSQLNanos'->>'mean')::FLOAT8 > 0
ORDER BY estimated_total_cpu_seconds DESC
LIMIT 20;

Focus: estimated_total_cpu_seconds shows cluster impact. High mean CPU often correlates with full_scan = true.

Query 6: Memory and Disk Spill Detection

SELECT
  fingerprint_id,
  metadata->>'db' AS database,
  substring(metadata->>'query', 1, 150) AS query_preview,
  (statistics->'execution_statistics'->'maxMemUsage'->>'mean')::FLOAT8 / 1048576 AS mean_mem_mb,
  (statistics->'execution_statistics'->'maxMemUsage'->>'max')::FLOAT8 / 1048576 AS max_mem_mb,
  (statistics->'execution_statistics'->'maxDiskUsage'->>'mean')::FLOAT8 / 1048576 AS mean_disk_mb,
  (statistics->'execution_statistics'->'maxDiskUsage'->>'max')::FLOAT8 / 1048576 AS max_disk_mb,
  metadata->>'stmtType' AS statement_type,
  aggregated_ts
FROM crdb_internal.statement_statistics
WHERE aggregated_ts > now() - INTERVAL '24 hours'
  AND (statistics->'execution_statistics'->>'cnt') IS NOT NULL
  AND (statistics->'execution_statistics'->'maxDiskUsage'->>'mean')::FLOAT8 > 0  -- Has disk spills
ORDER BY (statistics->'execution_statistics'->'maxDiskUsage'->>'mean')::FLOAT8 DESC
LIMIT 20;

Interpretation: Disk usage > 0 = memory spill (~100-1000x slower than in-memory). Common for large aggregations, sorts, hash joins. Fix with indexes or increased sql.distsql.temp_storage.workmem.

Query 7: Error-Prone Statements

SELECT
  fingerprint_id,
  metadata->>'db' AS database,
  substring(metadata->>'query', 1, 150) AS query_preview,
  (statistics->'statistics'->>'cnt')::INT AS total_executions,
  COALESCE((statistics->'statistics'->>'failureCount')::INT, 0) AS failure_count,
  ROUND(
    COALESCE((statistics->'statistics'->>'failureCount')::INT, 0)::NUMERIC /
    NULLIF((statistics->'statistics'->>'cnt')::INT, 0) * 100, 2
  ) AS failure_rate_pct,
  aggregated_ts
FROM crdb_internal.statement_statistics
WHERE aggregated_ts > now() - INTERVAL '24 hours'
  AND (statistics->'statistics'->>'cnt')::INT > 10
  AND COALESCE((statistics->'statistics'->>'failureCount')::INT, 0) > 0
ORDER BY failure_rate_pct DESC, failure_count DESC
LIMIT 20;

Common causes: Constraint violations, query timeouts, transaction retry errors (40001), permission denied.

Common Workflows

Workflow 1: Slowness Investigation

Identify slow fingerprints: Run Query 1 with 24h window, focus on mean_run_lat_seconds > 5 and high execution counts
Check for full scans: Filter full_scan = true, review index_recommendations
Correlate to applications: Group by metadata->>'app', contact teams with specific patterns
Cross-reference live activity: If ongoing, use triaging-live-sql-activity to cancel runaway queries

Workflow 2: Contention Analysis

Find high-contention statements: Run Query 4, focus on contention_pct_of_runtime > 20%
Check plan stability: Run Query 3 for contending fingerprints (plan changes affect lock order)
Remediate: Batch operations, use SELECT FOR UPDATE, partition hot tables, denormalize schema

Workflow 3: Admission Control Debugging

Identify admission waits: Run Query 2, calculate wait ratio
Correlate with CPU: Run Query 5 for same window, cross-reference fingerprint IDs
Analyze time patterns: Group by aggregated_ts to find peak periods
Triage: Short-term: spread batch jobs; Long-term: add capacity, optimize queries

Workflow 4: Memory Spill Investigation

Find spilling statements: Run Query 6, focus on max_disk_mb > 100
Analyze patterns: Identify large GROUP BY, ORDER BY, hash joins
Remediate: Add indexes, increase workmem (with caution), rewrite queries, use materialized views

Safety Considerations

Read-only operations: All queries are SELECT statements against production-approved crdb_internal.statement_statistics.

Performance impact:

Consideration	Impact	Mitigation
Large table	Many rows with high statement diversity	Always use time filters and `LIMIT`
JSON parsing	CPU overhead	Use specific time windows, avoid tight loops
Broad windows	7-day queries = more rows	Default to 24h; expand only when needed

Privacy: Use VIEWACTIVITYREDACTED to redact query constants in multi-tenant environments.

Troubleshooting

Issue	Cause	Fix
Empty results	No data or stats collection disabled	Check `sql.stats.automatic_collection.enabled = true`
`column does not exist`	JSON field typo or version mismatch	Verify field names; check CockroachDB version
NULL in sampled metrics	Metric not sampled in bucket	Filter: `WHERE (statistics->'execution_statistics'->>'cnt') IS NOT NULL`
Query text shows `<hidden>`	Using VIEWACTIVITYREDACTED	Expected; use VIEWACTIVITY if authorized
"invalid input syntax for type json"	Malformed JSON path	Check operators: `->` for JSON, `->>` for text
Very slow query	Large table, no time filter	Always add time window and LIMIT
Empty `index_recommendations`	No recommendations or optimal	Normal if indexes exist

Key Considerations

Time windows: Default to 24h; expand to 7d for trends
Sampled metrics: Not all executions captured; check sample size (cnt)
JSON safety: Use defensive NULL checks; handle type casting errors
Privacy: Use VIEWACTIVITYREDACTED in production
Performance: Always include time filters and LIMIT
Complement to live triage: Use together for complete coverage (historical + real-time)
Data retention: Default ~7 days; verify with sql.stats.persisted_rows.max
Plan instability: Multiple plan hashes indicate optimizer/schema changes

References

Skill references:

JSON field schema and extraction
Metrics catalog and units
SQL query variations
RBAC and privileges (shared with triaging-live-sql-activity)

Official CockroachDB Documentation:

Related skills:

triaging-live-sql-activity - For immediate triage of currently running queries
profiling-transaction-fingerprints - For transaction-level analysis including retry patterns and commit latency

profiling-statement-fingerprints

المزيد من هذا المستودع

المزيد من هذا المستودع

Profiling Statement Fingerprints

When to Use This Skill

Prerequisites

Core Concepts

Statement Fingerprints vs Live Queries

Time-Series Bucketing

Aggregated vs Sampled Metrics

JSON Field Extraction

Core Diagnostic Queries

Query 1: Top Statements by Mean Run Latency

Query 2: Admission Control Impact

Query 3: Plan Hash Diversity

Query 4: High Contention Statements

Query 5: High CPU Consumers

Query 6: Memory and Disk Spill Detection

Query 7: Error-Prone Statements

Common Workflows

Workflow 1: Slowness Investigation

Workflow 2: Contention Analysis

Workflow 3: Admission Control Debugging

Workflow 4: Memory Spill Investigation

Safety Considerations

Troubleshooting

Key Considerations

References

Profiling Statement Fingerprints

When to Use This Skill

Prerequisites

Core Concepts

Statement Fingerprints vs Live Queries

Time-Series Bucketing

Aggregated vs Sampled Metrics

JSON Field Extraction

Core Diagnostic Queries

Query 1: Top Statements by Mean Run Latency

Query 2: Admission Control Impact

Query 3: Plan Hash Diversity

Query 4: High Contention Statements

Query 5: High CPU Consumers

Query 6: Memory and Disk Spill Detection

Query 7: Error-Prone Statements

Common Workflows

Workflow 1: Slowness Investigation

Workflow 2: Contention Analysis

Workflow 3: Admission Control Debugging

Workflow 4: Memory Spill Investigation

Safety Considerations

Troubleshooting

Key Considerations

References