Name: Wukongim Perf Triage
Author: WuKongIM

name: wukongim-perf-triage
description: Use when investigating WuKongIM wk-sim, wkbench dev-sim, Docker Compose three-node cluster performance, throughput, timeout, sendack, recv, fanout, Raft, delivery, data-plane, or distributed benchmark regressions.

WuKongIM Performance Triage

Core Rule

Use evidence-first triage for WuKongIM three-node Docker Compose and wk-sim performance work. Do not tune configuration or change code before collecting evidence, classifying the failure, and testing one falsifiable hypothesis.

Before running a benchmark, read docs/development/PERF_TRIAGE.md for current project commands, scenario presets, and evidence layout.

Required Flow

1. Define scenario, clean-vs-accumulated mode, duration, and success criteria.
2. Establish a healthy smoke-default baseline before high-rate runs.
3. Collect status timeline, Compose logs, split node logs, metrics, pprof, Docker stats, git revision, and Compose config.
4. Classify the issue before changing workload, config, or code.
5. Write one falsifiable hypothesis and one expected observation.
6. Run one minimal experiment that changes exactly one variable.
7. Change code only after evidence points to a code defect and a regression test can fail before the fix.
8. Verify with the target scenario plus smoke-default and sampled-correctness.
9. Record concise findings in docs/development/WKSIM_STRESS_FINDINGS.md.
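
The single-variable rule for experiments can be checked mechanically by diffing the baseline settings against the experiment settings before a run. The following is a minimal sketch, not project tooling: `changed_variables`, `assert_single_variable`, and the setting names in the usage example (`send_rate`, `payload_bytes`) are hypothetical and are not wk-sim or wkbench options.

```python
from __future__ import annotations

from typing import Any, Mapping


def changed_variables(baseline: Mapping[str, Any], experiment: Mapping[str, Any]) -> list[str]:
    """Return the sorted names of settings whose values differ between two runs."""
    keys = set(baseline) | set(experiment)
    missing = object()  # sentinel so an added or removed key counts as a change
    return sorted(
        k for k in keys
        if baseline.get(k, missing) != experiment.get(k, missing)
    )


def assert_single_variable(baseline: Mapping[str, Any], experiment: Mapping[str, Any]) -> str:
    """Raise ValueError unless exactly one setting changed; return its name."""
    changed = changed_variables(baseline, experiment)
    if len(changed) != 1:
        raise ValueError(f"expected exactly one changed variable, got {changed}")
    return changed[0]


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    baseline = {"scenario": "person-hotpath", "send_rate": 1000, "payload_bytes": 256}
    experiment = {**baseline, "send_rate": 2000}
    print(assert_single_variable(baseline, experiment))
```

Running the check before the experiment, rather than after, keeps an accidental second change from contaminating the evidence.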

Hard Stops

- If smoke-default fails, stop high-rate testing and diagnose baseline health first.
- If send_errors > 0, recv_errors > 0, or last_error is non-empty, classify the failure before optimization.
- Do not compare clean-stack and accumulated-stack runs as the same evidence class.
- Do not blame the server when wk-sim is CPU-bound or concurrency-limited.
- Do not change more than one workload/config/code variable per experiment.
- Do not add local-only deployment branches; single-node means single-node cluster.
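
Several of these stops can be expressed as a gate evaluated before any optimization work. The sketch below is one possible shape under stated assumptions: the `RunStatus` record and the `hard_stops` function are hypothetical, and only the `send_errors`, `recv_errors`, and `last_error` field names come from this guide. How those values are read from wk-sim status output is project-specific and not shown.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RunStatus:
    """Hypothetical snapshot of one benchmark run."""
    scenario: str
    stack_mode: str  # "clean" or "accumulated"
    passed: bool
    send_errors: int = 0
    recv_errors: int = 0
    last_error: str = ""


def hard_stops(smoke: RunStatus, target: RunStatus, baseline: RunStatus | None = None) -> list[str]:
    """Return the hard stops that apply; an empty list means triage may continue."""
    stops = []
    if not smoke.passed:
        stops.append("smoke-default failed: diagnose baseline health before high-rate runs")
    if target.send_errors > 0 or target.recv_errors > 0 or target.last_error:
        stops.append("errors present: classify the failure before optimizing")
    if baseline is not None and baseline.stack_mode != target.stack_mode:
        stops.append("clean and accumulated runs are different evidence classes: do not compare")
    return stops
```

The gate returns every stop that applies instead of failing on the first one, so a report can list all blockers at once.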

Isolation Guide

Use isolated scenarios before mixed traffic:

Scenario	Use For
`smoke-default`	baseline startup, readiness, and default workload health
`sampled-correctness`	recv correctness and cross-node delivery
`person-hotpath`	personal channel send, metadata refresh, sendack, routing
`group-fanout`	subscriber expansion, delivery tag, cross-node fanout
`mixed-highrate`	final contention across gateway, data-plane, Raft/store, delivery, GC
accumulated data	storage growth, replay, metadata refresh, history-sensitive issues

Classification Hints

- send_errors: send/append/forwarding path; inspect leader routing, channel runtime, Raft append/apply.
- recv_errors after sendack success: delivery path; inspect presence, delivery tag, fanout, recv matching.
- Early-window timeouts with channelmeta.bootstrap: cold start or warmup gap.
- Clean passes but accumulated fails: storage, replay, metadata refresh, or scan issue.
- Service idle while wk-sim is busy: benchmark client bottleneck.
- connected_users stable but active_users flaps: online churn or reconnect instability; inspect gateway/session repair and target logs before tuning throughput.
- All containers saturated: local Docker capacity boundary.
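
The hints above can be read as a lookup from observations to candidate categories. The sketch below encodes them as plain rules. It rests on assumptions: the `Observation` fields are hypothetical names for facts an investigator has already established from evidence (only `send_errors` and `recv_errors` appear in this guide), and checking environment and client limits first is an ordering choice made here, not one the guide prescribes. The output is a list of candidates to investigate, not a verdict.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical summary of what the collected evidence shows."""
    send_errors: int = 0
    recv_errors: int = 0
    sendack_ok: bool = True
    early_timeouts_with_bootstrap: bool = False
    clean_passes: bool = True
    accumulated_passes: bool = True
    service_idle: bool = False
    sim_busy: bool = False
    connected_users_stable: bool = True
    active_users_flapping: bool = False
    all_containers_saturated: bool = False


def classify(obs: Observation) -> list[str]:
    """Return every candidate category whose hint matches the observation."""
    hints = []
    # Environment and client limits first (an assumed ordering).
    if obs.all_containers_saturated:
        hints.append("local Docker capacity boundary")
    if obs.service_idle and obs.sim_busy:
        hints.append("benchmark client bottleneck")
    if obs.early_timeouts_with_bootstrap:
        hints.append("cold start or warmup gap")
    if obs.clean_passes and not obs.accumulated_passes:
        hints.append("storage, replay, metadata refresh, or scan issue")
    if obs.connected_users_stable and obs.active_users_flapping:
        hints.append("online churn or reconnect instability")
    if obs.send_errors > 0:
        hints.append("send/append/forwarding path")
    if obs.recv_errors > 0 and obs.sendack_ok:
        hints.append("delivery path")
    return hints
```

When more than one category matches, the report's confidence field should reflect that ambiguity, and the next experiment should aim to separate the candidates.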

Required Report Shape

## Scenario
- workload:
- clean or accumulated:
- duration:
- success criteria:

## Evidence
- status:
- compose logs:
- app logs:
- error logs:
- warn logs:
- metrics:
- pprof:
- docker stats:

## Classification
- category:
- confidence:
- reason:

## Hypothesis
- hypothesis:
- falsification test:

## Next Experiment
- one variable to change:
- expected result:
- stop condition:

## Fix Eligibility
- code change needed: yes/no
- reason:
- required regression test:
