| name | e2e-test-create |
| description | Analyze React component source code to understand UI structure, then generate idiomatic Cypress E2E tests following Metabase conventions. Falls back to Playwright MCP browser exploration only when code reading and screenshot debugging are insufficient. |
| disable-model-invocation | true |
| allowed-tools | ["Bash","Read","Write","Grep","Glob","Skill","mcp__playwright__*"] |
Code-Reading-First → Generate Cypress Tests (Metabase)
You are writing Cypress E2E tests for the Metabase codebase.
Before generating ANY test code, you MUST analyze React component source code
to understand DOM structure, selectors, and user flows.
Phase 0 — Research
- Read existing helpers and constants before writing anything:
  - e2e/support/helpers/: all shared helpers (restore, signInAs, openOrdersTable, etc.)
  - e2e/support/cypress_sample_database.ts: table/field schema constants (ORDERS, PRODUCTS, etc.)
  - e2e/support/cypress_sample_instance_data.ts: instance-specific IDs (ORDERS_DASHBOARD_ID, NORMAL_USER_ID, etc.)
- Glob e2e/test/scenarios/ to find the closest existing spec to the area under test.
  Study its patterns and match them exactly.
- Glob frontend/src/metabase/ to find React components for the feature area.
Phase 1 — Code Analysis
Read React component source to understand DOM structure. No browser is needed at this stage; the source code is the primary reference for selectors, text, and flows.
- Find relevant components: Glob and grep frontend/src/metabase/ for the feature area.
- Extract selectors: Grep for data-testid in relevant components.
- Note visible text: Read component JSX for button labels, headings, placeholders.
- Note aria attributes: Grep for aria-label in relevant components.
- Understand user flows: Read event handlers (onClick, onSubmit, onChange) to understand interactions.
- Find API calls: Grep for Api.use, fetch, useQuery, and endpoint definitions to identify API calls to intercept.
- Cross-reference with existing specs: Find specs in the same area and reuse their proven selectors and cy.intercept patterns.
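The selector-extraction steps above can be sketched in code. This is only an illustration of what the greps are looking for, not part of the Metabase tooling: extractSelectorHints and the sample JSX are hypothetical, and the pattern only catches static string values.

```typescript
// Hypothetical helper: collect static data-testid and aria-label values
// from a component's JSX source text.
interface SelectorHints {
  testIds: string[];
  ariaLabels: string[];
}

function extractSelectorHints(source: string): SelectorHints {
  // Only plain string literals are matched, e.g. data-testid="foo".
  // Dynamic values such as data-testid={id} are skipped on purpose.
  const collect = (attribute: string): string[] => {
    const pattern = new RegExp(attribute + "=[\"']([^\"']+)[\"']", "g");
    const values: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) {
      if (values.indexOf(match[1]) === -1) {
        values.push(match[1]); // keep first occurrence, drop duplicates
      }
    }
    return values;
  };

  return {
    testIds: collect("data-testid"),
    ariaLabels: collect("aria-label"),
  };
}

// Assumed component snippet, for illustration only.
const sampleJsx = `
  <Button data-testid="viz-type-button" aria-label="Box plot" onClick={onSelect}>
    Box plot
  </Button>
  <input data-testid="metric-picker" placeholder="Find..." />
`;

const hints = extractSelectorHints(sampleJsx);
console.log(hints.testIds); // ["viz-type-button", "metric-picker"]
console.log(hints.ariaLabels); // ["Box plot"]
```

Anything the pattern misses (computed test IDs, labels passed through props) still has to be found by reading the component itself.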
Phase 2 — Start Backend
Use MB_EDITION=oss by default. Only use MB_EDITION=ee when the user explicitly asks to write an enterprise test.
Start the backend with the Bash tool's run_in_background: true option (do NOT background it with &).
bin/e2e-backend automatically detects if a backend is already running and reuses it.
MB_EDITION=oss bin/e2e-backend
Do NOT manually generate snapshots by running unrelated test specs.
The bun test-cypress runner has GENERATE_SNAPSHOTS: true by default and automatically
generates snapshots before running any spec. When running tests in Phase 4 via the /e2e-test skill,
snapshots will be generated on the first run if they don't already exist.
Restore clean test data:
curl -sf -X POST http://localhost:4000/api/testing/restore/default
Phase 3 — Generate Cypress Spec
Follow the Metabase Cypress conventions:
@./../_shared/cypress-conventions.md
If you identified API calls during code analysis (Phase 1), stub or wait on them using the intercept pattern from the conventions file.
Phase 4 — Validate
After generating specs:
- Check that all imported helpers exist (Grep e2e/support/helpers/).
- You MUST use the /e2e-test skill to run tests; do NOT run bun test-cypress directly.
The /e2e-test skill handles edition selection, snapshot management, and correct env vars.
/e2e-test GREP="should do the thing" --spec e2e/test/scenarios/<path>
If you created multiple it() blocks, run each one individually to isolate failures.
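Running each it() block individually means one /e2e-test invocation per test title. A minimal sketch of that expansion follows; listTestTitles, buildRunCommands, the sample spec, and the spec path are all hypothetical, and titles containing double quotes would need escaping that this sketch does not do.

```typescript
// Hypothetical helpers; the command shape mirrors the /e2e-test example above.
function listTestTitles(specSource: string): string[] {
  // Matches it("..."), it('...'), it.only(...) and it.skip(...) with plain string titles.
  const pattern = /\bit(?:\.only|\.skip)?\(\s*(["'])(.+?)\1/g;
  const titles: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(specSource)) !== null) {
    titles.push(match[2]);
  }
  return titles;
}

function buildRunCommands(specPath: string, specSource: string): string[] {
  // One command per title, so a failure points at exactly one test.
  return listTestTitles(specSource).map(
    (title) => `/e2e-test GREP="${title}" --spec ${specPath}`,
  );
}

// Assumed spec content and path, for illustration only.
const sampleSpec = `
describe("scenarios > example", () => {
  it("should open the table", () => {});
  it('should filter rows', () => {});
});
`;

const commands = buildRunCommands(
  "e2e/test/scenarios/example/example.cy.spec.ts",
  sampleSpec,
);
commands.forEach((command) => console.log(command));
```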
Phase 5 — Fix Failures (up to 2 attempts)
When a test fails, try to fix it from Cypress output first:
- Read the failure screenshot (its path is printed under (Screenshots) in the Cypress output).
- Read the error message and code frame from the console output.
- Fix the test and re-run (back to Phase 4, step 2).
If you cannot diagnose the issue after 2 attempts, proceed to Phase 6.
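The escalation rule for Phases 5 to 7 can be written as a small decision function. This is a sketch of the control flow only; decideNextStep and the step names are hypothetical, and the two-attempt limit is the only value taken from the text.

```typescript
// Hypothetical names; only the two-attempt limit comes from the workflow.
type NextStep = "cleanup" | "fix-from-cypress-output" | "playwright-fallback";

const MAX_FIX_ATTEMPTS = 2;

function decideNextStep(testPassed: boolean, fixAttemptsUsed: number): NextStep {
  if (testPassed) {
    return "cleanup"; // Phase 7
  }
  if (fixAttemptsUsed < MAX_FIX_ATTEMPTS) {
    return "fix-from-cypress-output"; // stay in Phase 5
  }
  return "playwright-fallback"; // Phase 6
}

console.log(decideNextStep(false, 0)); // "fix-from-cypress-output"
console.log(decideNextStep(false, 2)); // "playwright-fallback"
console.log(decideNextStep(true, 1)); // "cleanup"
```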
Phase 6 — Playwright Fallback
Only reach this phase after 2 failed fix attempts from Phase 5. The backend is already running.
Restore clean test data:
curl -sf -X POST http://localhost:4000/api/testing/restore/default
Bypass CSP headers before navigating (Metabase serves strict CSP that blocks dev server scripts).
Use browser_run_code to set this up:
async (page) => {
  // Strip CSP headers from every response so they cannot block scripts.
  await page.context().route('**/*', async (route) => {
    const response = await route.fetch();
    const headers = { ...response.headers() };
    delete headers['content-security-policy'];
    delete headers['content-security-policy-report-only'];
    await route.fulfill({ response, headers });
  });

  // Sign in through the session API as the test admin user.
  const response = await page.request.post('http://localhost:4000/api/session', {
    data: { username: 'admin@metabase.test', password: '12341234' }
  });
  const session = await response.json();

  // Metabase reads the session id from the metabase.SESSION cookie
  // (metabase.DEVICE is the device identifier, not the session).
  await page.context().addCookies([{
    name: 'metabase.SESSION',
    value: session.id,
    domain: 'localhost',
    path: '/'
  }]);

  await page.goto('http://localhost:4000');
  await page.waitForLoadState('networkidle');
  return 'signed in';
}
Maintain an observation log incrementally. After EVERY significant Playwright interaction,
IMMEDIATELY append what you observed to the scratch file BEFORE performing the next interaction:
cat >> /tmp/e2e-observations.md << 'OBSERVATION'
- URL: /question/notebook#...
- Clicked: "Box plot" button → visible text "Box plot", role: radio
- Selectors: data-testid="viz-type-button", findByText("Box plot")
- API call: POST /api/dataset (triggered on viz change)
- Key state: after selecting viz type, summary sidebar shows metric picker
OBSERVATION
For each page/flow:
- Take an accessibility snapshot (browser_snapshot).
- Click through interactive elements, fill forms, trigger modals.
- Append to the observation log immediately after each step: URLs, visible text, aria labels, data-testid attrs, API calls.
- Screenshot key states.
After exploration:
- Read back your observation log:
cat /tmp/e2e-observations.md
- Fix the test using observed selectors and behavior.
- Re-run the test (back to Phase 4).
- Clean up:
rm -f /tmp/e2e-observations.md
Phase 7 — Cleanup
After all tests pass (or after giving up on fixing failures), always kill the backend on port 4000:
lsof -ti:4000 | xargs kill 2>/dev/null || true
Do NOT use broad pkill patterns — there may be other Metabase instances on different ports.
The backend process started in Phase 2 will NOT be killed automatically when the Claude session ends.
Leaving it running wastes resources and can interfere with future sessions. Always clean up.
What NOT to do (workflow)
- Do NOT use Playwright as the first step — always analyze source code first.
- Do NOT kill the backend between phases — it stays running throughout.
- Do NOT invent selectors you didn't find in source code or observe in the browser.
For convention-level "do nots" (selectors, waits, helpers, etc.), see the conventions file referenced in Phase 3.