| name | ui-visual-regression |
| description | Automates visual regression testing for DB Console UI changes. Compares screenshots and network requests between the current branch and its merge base using roachprod, playwright-cli, and ImageMagick. Use when the user wants to verify UI changes haven't introduced visual or behavioral regressions. |
| allowed-tools | Bash(roachprod:*), Bash(playwright-cli:*), Bash(compare:*), Bash(diff:*), Bash(git:*), Bash(./dev:*), Bash(npm:*), Bash(which:*), Bash(curl:*), Bash(mkdir:*), Bash(cp:*), Bash(ls:*), Bash(tail:*), Bash(sleep:*), Bash(rm:*), Bash(gh:*), Bash(grep:*), Bash(date:*) |
UI Visual Regression Testing
Automates screenshot-based visual comparison and network request diffing between the current branch and its merge base commit.
Prerequisites
Check these before starting and install any missing tools:
which roachprod
which playwright-cli
which compare
- roachprod: Must be on PATH. Part of CockroachDB dev tooling.
- playwright-cli: Install with npm install -g @playwright/cli if missing. The old package playwright-cli is deprecated.
- ImageMagick: Install with brew install imagemagick if compare is not found.
- ./dev: CockroachDB dev build tool (in repo root).
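The three checks above can be folded into one pass that names every missing tool. This is a minimal sketch: the function name missing_tools is hypothetical, and it uses command -v, the POSIX counterpart of which.

```shell
# Print the name of each tool that is not on PATH, one per line.
# missing_tools is a hypothetical helper name, not part of any tool above.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Usage: missing_tools roachprod playwright-cli compare
# Empty output means every listed prerequisite was found.
```

Install whatever the helper prints before moving on to the workflow.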
Workflow
Follow these steps in order. Present the test plan (Step 3) to the user and wait for explicit approval before proceeding.
Step 1: Determine Output Directory
Each invocation stores results in a unique timestamped subdirectory to avoid overwriting previous runs:
RUN_ID=$(date +%Y%m%d-%H%M%S)
RUN_DIR="/tmp/ui-regression-test/${RUN_ID}"
mkdir -p "${RUN_DIR}"
All before/after/diff screenshots and network logs for this run go under ${RUN_DIR}/.
Step 2: Analyze Git Diff for Changed UI Pages
Identify which DB Console pages are affected by the current branch's changes.
Check both committed and uncommitted changes:
git log --oneline master..HEAD
git status --short
If there are commits ahead of master, use the commit diff:
git diff --name-only $(git merge-base HEAD master)..HEAD
If all changes are uncommitted (no commits ahead of master), use the working tree diff:
git diff --name-only HEAD
git ls-files --others --exclude-standard
Filter results for files under:
pkg/ui/workspaces/db-console/src/views/
pkg/ui/workspaces/cluster-ui/src/
Map changed file paths to DB Console URL routes using the reference table below.
For cluster-ui changes, trace the changed exports to their db-console consumers by searching for imports of the changed module in pkg/ui/workspaces/db-console/.
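The filtering step can be sketched as a small stdin filter. The name filter_ui_paths is hypothetical; the two directory prefixes are the ones listed above.

```shell
# Keep only paths under the two UI source trees named above.
# Reads file paths on stdin, one per line. filter_ui_paths is a hypothetical name.
# "|| true" keeps an empty result from counting as a failure.
filter_ui_paths() {
  grep -E '^pkg/ui/workspaces/(db-console/src/views|cluster-ui/src)/' || true
}

# Usage (committed changes):
#   git diff --name-only "$(git merge-base HEAD master)"..HEAD | filter_ui_paths
```

An empty result suggests the branch does not touch any DB Console page directly, which is worth stating in the test plan.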
Step 3: Create and Present Test Plan
Before taking screenshots, present the user with:
- List of changed files detected
- Routes to test (mapped from changed files)
- Viewport size (default: 1920x1080)
- Any routes needing parameters and the defaults you'll use
- Whether an existing local cluster was detected and what will happen to it
Wait for the user to explicitly approve the plan before proceeding. The user may want to adjust routes, viewport size, or other parameters.
Step 4: Set Up Local Cluster
roachprod list | grep local
roachprod destroy local
roachprod create local -n 3
roachprod stage local release latest
roachprod start local --insecure
Step 4b: Create Test Data
After starting the cluster, create test data so pages render with populated content rather than empty states. Tailor the data to the pages being tested.
For database/table/index pages:
roachprod sql local:1 --insecure -- -e "
CREATE TABLE defaultdb.test_table (id INT PRIMARY KEY, name STRING, value INT, INDEX idx_name (name), INDEX idx_value (value));
INSERT INTO defaultdb.test_table SELECT i, 'name_' || i::STRING, i * 10 FROM generate_series(1, 1000) AS g(i);
"
For pages that show statement fingerprints or index usage, run a workload to generate stats:
roachprod sql local:1 --insecure -- -e "
SELECT * FROM defaultdb.test_table WHERE id = 1;
SELECT * FROM defaultdb.test_table WHERE name = 'name_50';
SELECT * FROM defaultdb.test_table WHERE value = 500;
SELECT count(*) FROM defaultdb.test_table;
"
Important: Statement statistics are flushed periodically (not immediately). Wait at least 60 seconds after running the workload before capturing screenshots that show statement fingerprints or index usage data. Because this workflow captures the "after" screenshots first (Step 6), the wait must elapse before Step 6 begins; time spent starting and compiling the dev server in Step 5 counts toward it.
Step 5: Start Dev Server
Important: roachprod adminui local reports https:// URLs, but insecure clusters actually serve HTTP. Always use http:// when --insecure was used.
roachprod adminui local
curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:29001/
./dev ui watch --db http://127.0.0.1:29001
Wait for the dev server to finish compiling. Monitor output for "Compiled successfully":
tail -3 <background-task-output-file> | grep "Compiled successfully"
The dev server runs on http://localhost:3000.
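Waiting for compilation can be scripted by polling the dev server's output. The sketch below assumes that output was redirected to a log file; the function name wait_for_compile and its arguments are hypothetical. The skip argument ignores earlier "Compiled successfully" lines, which matters after the checkout in Step 7.

```shell
# Poll a log file until "Compiled successfully" appears after a given line.
# Arguments: log file, number of leading lines to ignore, timeout in seconds.
# wait_for_compile is a hypothetical helper; the log file is wherever the
# ./dev ui watch output was redirected (an assumption).
wait_for_compile() {
  log=$1
  skip=$2
  remaining=$3
  while :; do
    if awk -v skip="$skip" \
      'NR > skip && /Compiled successfully/ { found = 1 } END { exit(found ? 0 : 1) }' \
      "$log" 2>/dev/null; then
      return 0
    fi
    if [ "$remaining" -le 0 ]; then
      return 1
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
}

# Usage:
#   wait_for_compile dev-server.log 0 300          # initial compile
#   seen=$(wc -l < dev-server.log)                 # before git checkout
#   wait_for_compile dev-server.log "$seen" 300    # after git checkout
```

A non-zero return means the timeout expired without a fresh success message, so check the log for compilation errors.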
Step 6: Capture Screenshots & Network Requests — Current Branch (After)
mkdir -p "${RUN_DIR}/after/<page-name>"
For each route in the test plan:
playwright-cli open http://localhost:3000
playwright-cli resize 1920 1080
playwright-cli goto "http://localhost:3000/#/<route>"
sleep 10
playwright-cli snapshot
playwright-cli network
cp .playwright-cli/network-<timestamp>.log "${RUN_DIR}/after/<page-name>/<scenario>-network.txt"
playwright-cli screenshot --filename="${RUN_DIR}/after/<page-name>/<scenario>.png"
playwright-cli close
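The capture sequence above can be wrapped in a function so that both phases run identical commands. This is a sketch under stated assumptions: the names page_name_for_route and capture_route are hypothetical, the page-name convention is only one possible choice, and the newest .playwright-cli/network-*.log file is assumed to belong to the capture that just ran.

```shell
# Turn a route such as "/metrics/overview/cluster" into a directory-safe
# page name such as "metrics-overview-cluster". This naming convention is
# a hypothetical choice, not something the tools require.
page_name_for_route() {
  printf '%s\n' "$1" | sed -e 's|^/*||' -e 's|/*$||' -e 's|[^A-Za-z0-9._-]|-|g'
}

# Capture one route with the same playwright-cli commands listed above.
# Arguments: phase ("after" or "before"), route (with leading slash),
# scenario name. Assumes RUN_DIR is set as in Step 1.
capture_route() {
  phase=$1; route=$2; scenario=$3
  out_dir="${RUN_DIR}/${phase}/$(page_name_for_route "$route")"
  mkdir -p "$out_dir"
  playwright-cli open http://localhost:3000
  playwright-cli resize 1920 1080
  playwright-cli goto "http://localhost:3000/#${route}"
  sleep 10
  playwright-cli snapshot
  playwright-cli network
  # Assumption: the most recently written network log is from this capture.
  newest_log=$(ls -t .playwright-cli/network-*.log | head -n 1)
  cp "$newest_log" "${out_dir}/${scenario}-network.txt"
  playwright-cli screenshot --filename="${out_dir}/${scenario}.png"
  playwright-cli close
}

# Usage: capture_route after /overview/list default
```

Calling the same function with before in Step 8 keeps file names aligned between the two phases, which the comparison steps depend on.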
Step 7: Checkout Base Commit
git rev-parse --abbrev-ref HEAD
git merge-base HEAD master
git stash -u
git checkout <merge-base>
The ./dev ui watch process will detect file changes and recompile automatically. Wait for recompilation to finish before proceeding.
Step 8: Capture Screenshots & Network Requests — Base (Before)
Wait for the dev server to finish recompiling after the checkout. Monitor for "Compiled successfully" in the dev server output.
mkdir -p "${RUN_DIR}/before/<page-name>"
Repeat the same capture process as Step 6, saving to ${RUN_DIR}/before/<page-name>/ instead of ${RUN_DIR}/after/<page-name>/. Apply the data-loading verification described under "Page shows loading state or empty data" in Troubleshooting: confirm that the page has fully rendered its data before taking the screenshot. The Redux-based data fetching path (on the base commit) may take longer to load than SWR; retry if the snapshot shows loading skeletons or an empty state.
Step 9: Return to Original Branch
git checkout <original-branch>
git stash pop
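One caveat with the stash in Step 7: when the working tree is clean, git stash -u creates no entry, and a later git stash pop would restore an older, unrelated stash. A minimal guard, assuming the hypothetical helper name stash_was_created, compares the stash ref before and after stashing.

```shell
# Decide whether `git stash -u` actually created a new stash entry by
# comparing the stash ref before and after. stash_was_created is a
# hypothetical helper; both arguments come from:
#   git rev-parse -q --verify refs/stash || true
stash_was_created() {
  before=$1
  after=$2
  [ -n "$after" ] && [ "$after" != "$before" ]
}

# Usage around Steps 7 and 9:
#   before=$(git rev-parse -q --verify refs/stash || true)
#   git stash -u
#   after=$(git rev-parse -q --verify refs/stash || true)
#   git checkout <merge-base>
#   ... capture the "before" screenshots ...
#   git checkout <original-branch>
#   if stash_was_created "$before" "$after"; then git stash pop; fi
```

Skipping the pop when nothing was stashed leaves any pre-existing stash entries untouched.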
Step 10: Compare Screenshots with ImageMagick
mkdir -p "${RUN_DIR}/diff"
For each page and scenario captured in Steps 6 and 8:
compare -metric AE \
"${RUN_DIR}/before/<page-name>/<scenario>.png" \
"${RUN_DIR}/after/<page-name>/<scenario>.png" \
"${RUN_DIR}/diff/<page-name>-<scenario>-diff.png" 2>&1
For pages with differences, generate a highlighted diff:
compare -highlight-color red -lowlight-color none \
"${RUN_DIR}/before/<page-name>/<scenario>.png" \
"${RUN_DIR}/after/<page-name>/<scenario>.png" \
"${RUN_DIR}/diff/<page-name>-<scenario>-highlighted.png"
Interpreting Pixel Differences
When the diff image shows pixel differences, view the before and after screenshots side by side and classify each differing area. Not all data that changes between captures is acceptable — distinguish between truly dynamic values and stable data that indicates a regression.
Capture ordering and temporal differences: The "after" screenshots (current branch) are captured first, then the base commit's "before" screenshots are captured later. This means the "before" screenshots will always show a longer cluster uptime. Keep this in mind when interpreting differences — uptime, memory usage, and other time-dependent values will naturally differ between captures because of this ordering, not because of code changes.
Truly dynamic (acceptable differences):
- Timestamps and "time ago" values (e.g., "3 minutes" vs "6 minutes" uptime)
- Exact byte counts for memory/disk that fluctuate with system activity
Stable data (differences indicate a regression):
- Replica counts — these stabilize within seconds of cluster startup and should be similar between captures
- Capacity usage percentages — should be consistent (e.g., "0 %" vs "NaN %" is a bug, not dynamic data)
- Node status badges (LIVE, SUSPECT, DEAD) — must match the actual cluster state
- Column values showing "NaN", "undefined", empty, or missing where the baseline shows real values
- Version strings, node names, node IDs
- Values that differ by a large factor (e.g., 2x) between before and after — this suggests a data aggregation bug, not dynamic data
Rule of thumb: If a value is NaN, empty, undefined, or structurally different from the baseline (not just a slightly different number), treat it as a regression until proven otherwise. A value that went from a real number to NaN is never "dynamic data" — it means the data pipeline is broken.
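The rule of thumb lends itself to a quick text scan in addition to the pixel diff. The sketch below assumes the page snapshot text has been saved to a file by whatever means; find_broken_values is a hypothetical helper name.

```shell
# List lines of saved snapshot text that contain the whole words
# "NaN" or "undefined", prefixed with their line numbers.
# find_broken_values is a hypothetical helper; how the snapshot text
# reaches the file is left to the caller (an assumption).
find_broken_values() {
  grep -nwE 'NaN|undefined' "$1" || true
}

# Usage: find_broken_values after-overview-snapshot.txt
```

Treat hits as prompts for manual inspection rather than proof of a regression, since either word can also appear in legitimate page content such as SQL text.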
Step 11: Compare Network Requests
For each page:
diff "${RUN_DIR}/before/<page-name>/<scenario>-network.txt" \
"${RUN_DIR}/after/<page-name>/<scenario>-network.txt"
Categorize differences:
- New requests: API calls added (expected for new features, flag if unexpected)
- Removed requests: API calls that no longer happen (may indicate broken data fetching)
- Changed endpoints: URL or method changes
- Status code changes: Requests that now fail or succeed differently
- Request count changes: Same endpoint called more/fewer times (watch for N+1 regressions)
Pay special attention to:
- Requests to _status/* and _admin/* endpoints (CockroachDB internal APIs)
- Any requests returning 4xx/5xx that previously returned 2xx
- Duplicate requests suggesting SWR deduplication isn't working
When comparing, ignore:
- Request order: SWR and Redux dispatch requests differently; order alone is not a regression
- Health check count: _admin/v1/health is polled on an interval; the count depends on timing
- Polling requests: Focus on the distinct set of endpoints called, not repeated polling calls
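To compare the distinct set of endpoints while ignoring order and polling repeats, each log can be normalized before diffing. The exact format of the playwright-cli network log is not specified here, so this sketch assumes only that each request line contains an HTTP method token and an absolute URL token; distinct_endpoints is a hypothetical helper. It also drops query strings, which is a simplification that can hide parameter changes.

```shell
# Reduce a network log to a sorted, de-duplicated list of "METHOD /path".
# Assumption: each request line contains a method token (optionally wrapped
# in square brackets) and an http(s) URL token, in any position.
distinct_endpoints() {
  awk '{
    method = ""; url = ""
    for (i = 1; i <= NF; i++) {
      tok = $i
      gsub(/\[|\]/, "", tok)               # strip optional brackets
      if (tok ~ /^(GET|POST|PUT|DELETE|PATCH)$/) method = tok
      else if (tok ~ "^https?://") url = tok
    }
    if (method != "" && url != "") {
      sub("^https?://[^/]+", "", url)      # drop scheme and host
      sub(/\?.*$/, "", url)                # drop query string
      print method, url
    }
  }' | sort -u
}

# Usage:
#   distinct_endpoints < before-network.txt > before-endpoints.txt
#   distinct_endpoints < after-network.txt  > after-endpoints.txt
#   diff before-endpoints.txt after-endpoints.txt
```

Request counts and status codes still need the raw diff, because this normalization discards both.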
Step 12: Cleanup & Report Results
Cleanup:
roachprod destroy local
rm -rf .playwright-cli/
Report with three sections:
Visual Comparison Summary
- Pages with zero pixel differences (no visual regression)
- Pages with differences: pixel count, percentage, path to diff image
- For each differing area, classify as:
  - Truly dynamic (timestamps, fluctuating byte counts): acceptable
  - Stable data regression (NaN, empty, missing, or structurally different values for replicas, capacity, status badges, etc.): flag as a bug
- Include paths to before/after/diff screenshots for manual inspection
- Note the output directory: ${RUN_DIR}
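The pixel count and percentage for the summary can be derived from the AE output of compare in Step 10. The helpers below are a sketch with hypothetical names. compare exits with 0 when the images are similar, 1 when they differ, and 2 on error, so only 2 should be treated as a failure. Taking just the first token of the AE output is a precaution for versions that print extra text after the count, which is an assumption.

```shell
# Map ImageMagick compare's exit status to a label.
classify_compare_exit() {
  case "$1" in
    0) echo "similar" ;;
    1) echo "different" ;;
    *) echo "error" ;;
  esac
}

# Convert an AE result into a percentage of the image area.
# Arguments: raw AE output, image width, image height.
# Only the first whitespace-separated token of the AE output is used.
pct_changed() {
  printf '%s\n' "$1" | awk -v w="$2" -v h="$3" \
    '{ printf "%.2f\n", ($1 / (w * h)) * 100 }'
}

# Usage:
#   status=0
#   ae=$(compare -metric AE before.png after.png diff.png 2>&1) || status=$?
#   classify_compare_exit "$status"
#   pct_changed "$ae" 1920 1080
```

For the default 1920x1080 viewport, 20736 differing pixels corresponds to 1.00 percent of the image.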
Network Request Summary
- Pages with identical network behavior
- Pages with network changes: categorized list of differences
- Flag potentially problematic changes (removed requests, new errors, duplicate calls)
Overall Assessment
- Whether the changes appear to be a safe refactor (same visual output, same or equivalent network behavior)
- Any regressions that need investigation
File Path to Route Mapping
Use this table to map changed files to DB Console routes for testing.
| File Path Pattern | Route | Default Test URL |
|---|---|---|
| views/reports/containers/network/ | /reports/network/:nodeId | /#/reports/network/region |
| views/cluster/containers/clusterOverview/ | /overview/list | /#/overview/list |
| views/cluster/containers/nodeGraphs/ | /metrics/:dashboard/cluster | /#/metrics/overview/cluster |
| views/databases/ | /databases | /#/databases |
| views/sqlActivity/ | /sql-activity | /#/sql-activity |
| views/jobs/ | /jobs | /#/jobs |
| views/insights/ | /insights | /#/insights |
| views/reports/containers/settings/ | /reports/settings | /#/reports/settings |
| views/hotRanges/ | /hotranges | /#/hotranges |
For cluster-ui changes, trace exports back to db-console consumers:
grep -r "from.*cluster-ui.*<changed-module>" \
pkg/ui/workspaces/db-console/src/
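The table above can be encoded as a lookup so that changed paths map to test URLs mechanically. This is a sketch; default_test_url is a hypothetical name, and the patterns and URLs are copied from the table without additions.

```shell
# Print the default test URL for a changed file path, using the
# path patterns from the mapping table. Returns 1 for unmapped paths.
# default_test_url is a hypothetical helper name.
default_test_url() {
  case "$1" in
    *views/reports/containers/network/*)         echo "/#/reports/network/region" ;;
    *views/cluster/containers/clusterOverview/*) echo "/#/overview/list" ;;
    *views/cluster/containers/nodeGraphs/*)      echo "/#/metrics/overview/cluster" ;;
    *views/databases/*)                          echo "/#/databases" ;;
    *views/sqlActivity/*)                        echo "/#/sql-activity" ;;
    *views/jobs/*)                               echo "/#/jobs" ;;
    *views/insights/*)                           echo "/#/insights" ;;
    *views/reports/containers/settings/*)        echo "/#/reports/settings" ;;
    *views/hotRanges/*)                          echo "/#/hotranges" ;;
    *) return 1 ;;
  esac
}

# Usage: default_test_url pkg/ui/workspaces/db-console/src/views/jobs/index.tsx
```

Paths that return 1 need manual mapping, for example cluster-ui files traced to their db-console consumers with the grep command.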
Troubleshooting
Proxy errors (EPROTO, Error occurred while trying to proxy)
The dev server proxy target uses the wrong protocol. Insecure roachprod clusters serve HTTP, not HTTPS. Verify with:
curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:29001/
curl -s -o /dev/null -w "%{http_code}" https://127.0.0.1:29001/
Restart ./dev ui watch with --db http://... instead of https://.
Dev server recompilation errors after checkout
When checking out the base commit, the dev server may briefly show compilation errors as files change in stages (e.g., an import references a file that was removed by the stash before the checkout restores the old version). Wait for the final compilation to settle — the last "Compiled successfully" message is what matters.
playwright-cli not found after install
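The @playwright/cli npm package installs the playwright-cli binary. If which playwright-cli still fails after npm install -g @playwright/cli, check that your npm global bin directory is on PATH. The npm bin command was removed in npm 9; on current versions the global bin directory is the bin/ subdirectory of the global prefix on macOS and Linux:
npm prefix -g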
Page shows loading state or empty data
Some pages (especially those using Redux sagas) may take longer to load data on initial render. Do not rely on a fixed sleep duration. After waiting, check the snapshot output for indicators of incomplete loading:
- Loading skeletons or shimmer bars
- "Nodes (0)" or "No data to display"
- Spinners or progress indicators
If found, wait another 10 seconds and retry the snapshot. Repeat up to 3 times before flagging the page as potentially broken.
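The loading check can be sketched as a text scan over saved snapshot output. The first two patterns are quoted from the list above; "skeleton" and "spinner" are guesses at how such elements might be labelled in the snapshot and are assumptions, as is the helper name snapshot_looks_loaded.

```shell
# Return success when saved snapshot text shows none of the
# incomplete-loading indicators. Returns 2 if the file cannot be read.
# "skeleton" and "spinner" are assumed labels, not confirmed ones.
snapshot_looks_loaded() {
  if [ ! -r "$1" ]; then
    return 2
  fi
  ! grep -qiE 'Nodes \(0\)|No data to display|skeleton|spinner' "$1"
}

# Usage (retry up to 3 times, 10 seconds apart):
#   tries=0
#   until snapshot_looks_loaded snapshot.txt || [ "$tries" -ge 3 ]; do
#     sleep 10
#     # re-run playwright-cli snapshot and save its text to snapshot.txt
#     tries=$((tries + 1))
#   done
```

If the loop ends with the check still failing, flag the page as potentially broken instead of taking the screenshot.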
Notes
- DB Console uses hash routing (/#/path), so URLs include the # prefix
- Each invocation creates a timestamped subdirectory under /tmp/ui-regression-test/ to preserve history across runs
- The ./dev ui watch process runs in the background and hot-reloads on file changes
- Some pages require data in the cluster to render meaningfully (e.g., SQL Activity needs running queries)
- This skill uses roachprod (not roachdev) for all cluster operations
- playwright-cli creates a .playwright-cli/ directory in the working directory; clean it up after the test
- When stashing, always use git stash -u to include untracked files (new files not yet committed)
- ImageMagick compare returns exit code 1 when images differ; this is expected behavior, not an error
- The "after" (current branch) is captured before the "before" (base commit), so the base commit screenshots will always show longer cluster uptime — this is expected and should not be flagged as a regression