e2e
// Write and run web E2E tests (Playwright) using TDD — locations, patterns, commands, and debugging.
| name | e2e |
| description | Write and run web E2E tests (Playwright) using TDD — locations, patterns, commands, and debugging. |
Write E2E tests using TDD (Red-Green-Refactor). Always run the tests you create and watch them fail before implementing.
- /tdd: Follow the Red-Green-Refactor cycle when writing tests.
- /verify: Run after completing tests to ensure everything passes across the monorepo.
- /playwright-cli: Interactive browser automation. Use to validate features against the dev server before writing tests, and to debug failing tests with --debug=cli.
apps/web/e2e/
├── fixtures/
│   ├── backend.ts           # Worker-scoped backend + frontend process
│   ├── test-base.ts         # Extended fixture (apiClient, seedData, testPage)
│   └── office-fixture.ts    # Office fixtures (officeApi, officeSeed with workspace+agent)
├── helpers/
│   ├── api-client.ts        # HTTP client for seeding data (read for available methods)
│   └── office-api-client.ts # Office-specific API client (onboarding, issues, agents)
├── pages/                   # Page objects (read for available pages and methods)
└── tests/                   # Spec files (*.spec.ts), grouped by feature
    ├── task/                # Task creation, deletion, archiving, environment, subtasks
    ├── kanban/              # Kanban board, mobile kanban, preview panel
    ├── session/             # Session lifecycle, resume, recovery, multi-session, layout
    ├── workflow/            # Workflow steps, settings, automation, import/export
    ├── git/                 # Git changes panel, commits, diffs, symlinks
    ├── pr/                  # PR detection, watchers, changes panel
    ├── terminal/            # Terminal agent, keyboard, settings
    ├── chat/                # Quick chat, message queue, clarification, markdown, toolbar
    ├── settings/            # Config management, agent profiles, editor integration
    └── review/              # Code review diffs
Each worker gets an isolated backend, frontend, database, and mock agent — no Docker, no API keys needed.
Always run headless (make test-e2e). Never use --headed, e2e:headed, or test-e2e-headed — headed mode requires a display and will fail in agent environments.
pnpm e2e:run (managed runner: builds, runs, tears down)
e2e/scripts/run-e2e.sh handles the build, the run, and cleanup in one command. Use it instead of stitching the steps together. It auto-selects docker vs host, runs N shards concurrently, enforces strict WS accounting by default (matching CI), and never leaves root-owned artifacts behind.
cd apps/web
pnpm e2e:run # auto: docker if daemon + CI image available, else host; builds first
pnpm e2e:run tests/task/my-test.spec.ts # single file (extra args pass through to Playwright)
pnpm e2e:run --shards 3 # 3 shards concurrently on this machine (isolated)
pnpm e2e:run --no-build -- --grep "task creation" # skip rebuild; forward flags after --
pnpm e2e:docker # force the docker CI image (full isolation from a host dev instance)
pnpm e2e:clean # remove build/test artifacts, incl. root-owned ones from prior docker runs
The runner solves the sharp edges hand-rolling would hit: in docker it builds the CGO backend on the host and runs it in the runtime image (forward-compatible when the host glibc ≤ the image's — the usual case; it smoke-tests this and only falls back to the build image if the host is newer), builds the FE standalone on the host, pre-creates the standalone symlinks as relative links so in-container global-setup doesn't recreate them as root, and keeps Playwright output container-local. See apps/web/e2e/README.md → "the managed runner".
make test-e2e # all tests, headless (host)
cd apps && pnpm --filter @kandev/web e2e -- tests/task/my-test.spec.ts # single file
cd apps && pnpm --filter @kandev/web e2e -- --grep "task creation" # by name
CRITICAL: E2E tests run against the production build (.next/standalone/), not dev mode. After any frontend code change, you must rebuild before running tests (pnpm e2e:run does this for you):
make build-web # ~30s, required after every frontend change
Without this, tests run against stale code and failures are misleading. make build-backend is also required after Go changes. make test-e2e and pnpm e2e:run handle both automatically.
- Read helpers/api-client.ts and pages/ to discover available seed methods and page objects.
- Import from ../../fixtures/test-base, which provides testPage, apiClient, and seedData (pre-created workspace with default workflow). Pull backend from the fixture too when you need the backend URL: it's worker-scoped and dynamic, and process.env.KANDEV_API_BASE_URL is not set in the Playwright runner (only in the frontend SSR child process). Use backend.baseUrl.
- Use data-testid attributes for selectors, and add them to components as needed.
- Use the apiClient.mockGitHub*() methods to seed mock data.
- apiClient.createTaskWithAgent(...) returns CreateTaskResponse, which is Task & { session_id?: string; agent_execution_id?: string }. Read created.session_id directly; don't call listTaskSessions(taskId) just to fetch the session that was auto-started by the same call.
- The /t/:id URL contains the TASK ID, not the session ID. Backend routes like /port-proxy/:sessionId/:port/*path expect the session ID. Don't extract IDs from window.location.pathname when you need a session ID; pull it from the API response.
- page.request shares cookies/storage with the page context. Fine for the current no-auth local backend; if auth ever lands, this is where you'd plug it in.
- For server actions, page.waitForResponse("**/api/v1/...") on the backend URL will never fire, because the browser only sees the POST to the server-action endpoint. Assert the user-visible outcome (redirect, toast, store change) and/or re-query state via apiClient instead. Client-side fetches (most lib/api/domains/* calls) are visible to waitForResponse; server actions in app/actions/* are not.
- The seeded repository has no dev_script configured, so the preview panel renders a placeholder ("Configure a dev script…") and the URL input never appears; tests that try to drive it hang on the locator timeout. To use the preview iframe in a test, set one first: await apiClient.updateRepository(seedData.repositoryId, { dev_script: "echo dev" }). Then click the Preview dockview tab (await session.clickTab("Preview")); the toolbar will mount and the URL input becomes targetable.

Example:
import { test, expect } from "../../fixtures/test-base";
import { KanbanPage } from "../../pages/kanban-page";

test.describe("my feature", () => {
  test("does something", async ({ testPage, seedData, apiClient }) => {
    const task = await apiClient.createTask(seedData.workspaceId, "Test Task", "Description");
    const kanban = new KanbanPage(testPage);
    await kanban.goto(seedData.workspaceId);
    await expect(kanban.taskCardByTitle("Test Task")).toBeVisible();
  });
});
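The session-ID guidance above can be sketched in plain TypeScript. The response type restates CreateTaskResponse as described in this section, with Task reduced to a single id field (an assumption). requireSessionId and portProxyPath are hypothetical helpers for illustration, not methods of the project's apiClient.

```typescript
// Reduced stand-in for the project's Task type (assumption: only `id` is modelled).
type Task = { id: string };

// Shape described above: Task & { session_id?: string; agent_execution_id?: string }.
type CreateTaskResponse = Task & { session_id?: string; agent_execution_id?: string };

// Hypothetical helper: read the auto-started session ID straight from the response
// instead of listing the task's sessions afterwards.
function requireSessionId(created: CreateTaskResponse): string {
  if (!created.session_id) {
    throw new Error("task " + created.id + " was created without an auto-started session");
  }
  return created.session_id;
}

// Hypothetical helper: build a /port-proxy/:sessionId/:port/*path URL.
// It takes the session ID, never the task ID found in /t/:id.
function portProxyPath(sessionId: string, port: number, path: string = ""): string {
  return "/port-proxy/" + sessionId + "/" + port + "/" + path.replace(/^\/+/, "");
}

const created: CreateTaskResponse = { id: "task-1", session_id: "sess-9" };
const proxyUrl = portProxyPath(requireSessionId(created), 3000, "/index.html");
console.log(proxyUrl);
```

Failing loudly when session_id is absent keeps a test from silently falling back to the task ID from the URL.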
Before writing an E2E test, validate the feature works interactively using playwright-cli against a dev server. This gives a fast feedback loop — code changes are picked up by hot reload in ~1-2 seconds, no production rebuild needed. Once confirmed working, translate the interactions into a proper E2E test.
Multiple agents may run in parallel, so use random ports to avoid collisions. Fixture ports auto-offset from 18080 (backend) and 13000 (frontend) using E2E_PORT_OFFSET (derived from PID % 30 by default) — stay outside those ranges. Parallel E2E test runs are safe by default.
OFFSET=$((RANDOM % 100))
BACKEND_PORT=$((19000 + OFFSET))
FRONTEND_PORT=$((14000 + OFFSET))
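The arithmetic behind "stay outside those ranges" can be written out. This is a sketch under the numbers stated in this section (default offset = PID % 30, added to 18080 and 13000). The function names are hypothetical, and the real derivation lives in the fixtures and may differ in detail.

```typescript
// Bases and modulus as stated in this section (assumption: applied as base + offset).
const FIXTURE_BACKEND_BASE = 18080;
const FIXTURE_FRONTEND_BASE = 13000;
const FIXTURE_OFFSET_MODULUS = 30;

// Hypothetical: default fixture ports for a given process ID.
function fixturePorts(pid: number): { backend: number; frontend: number } {
  const offset = pid % FIXTURE_OFFSET_MODULUS;
  return {
    backend: FIXTURE_BACKEND_BASE + offset,
    frontend: FIXTURE_FRONTEND_BASE + offset,
  };
}

// Manual dev-instance ports, mirroring the shell snippet: bases 19000/14000, offset 0..99.
function manualPorts(offset: number): { backend: number; frontend: number } {
  return { backend: 19000 + offset, frontend: 14000 + offset };
}

// The largest default offset is 29, so fixtures stay within 18080-18109 and 13000-13029.
const maxFixture = fixturePorts(FIXTURE_OFFSET_MODULUS - 1);
// The smallest manual ports are 19000 and 14000, above both fixture ranges.
const minManual = manualPorts(0);
console.log(maxFixture, minManual);
```

An explicitly set E2E_PORT_OFFSET larger than 29 would widen the fixture ranges, so the gap shown here only holds for the default derivation.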
Start the backend:
E2E_TMP=$(mktemp -d) && mkdir -p "$E2E_TMP/.kandev" && \
printf '[user]\n name = E2E Test\n email = e2e@test.local\n[commit]\n gpgsign = false\n' > "$E2E_TMP/.gitconfig" && \
HOME="$E2E_TMP" KANDEV_HOME_DIR="$E2E_TMP/.kandev" KANDEV_SERVER_PORT=$BACKEND_PORT \
KANDEV_DATABASE_PATH="$E2E_TMP/kandev.db" KANDEV_MOCK_AGENT=only \
KANDEV_MOCK_GITHUB=true KANDEV_DOCKER_ENABLED=false KANDEV_WORKTREE_ENABLED=false \
KANDEV_LOG_LEVEL=warn apps/backend/bin/kandev &
Start the dev frontend:
KANDEV_API_BASE_URL=http://localhost:$BACKEND_PORT NEXT_PUBLIC_KANDEV_API_PORT=$BACKEND_PORT \
pnpm --filter @kandev/web dev --port $FRONTEND_PORT &
playwright-cli open http://localhost:$FRONTEND_PORT
playwright-cli snapshot # see page structure and element refs
playwright-cli click e5 # interact using refs from snapshot
playwright-cli fill e3 "test input"
playwright-cli snapshot # verify result
After changing code under apps/web/, run playwright-cli snapshot or playwright-cli screenshot to verify.
Once validated, write the Playwright test using project fixtures and page objects. The playwright-cli interactions map directly to Playwright API calls:
| playwright-cli | Playwright API |
|---|---|
| playwright-cli click e5 | page.getByTestId('...').click() |
| playwright-cli fill e3 "text" | page.getByTestId('...').fill('text') |
| playwright-cli snapshot (verify element visible) | expect(page.getByTestId('...')).toBeVisible() |
Use data-testid selectors in the test (not snapshot refs), and wrap common flows in page objects.
After confirming the feature works, capture screenshots or a video as proof for the PR:
# Screenshots of key states
playwright-cli screenshot --filename=apps/web/.pr-assets/feature-before.png
# ... interact to show the feature ...
playwright-cli screenshot --filename=apps/web/.pr-assets/feature-after.png
# Or record a video walkthrough
playwright-cli video-start apps/web/.pr-assets/feature-demo.webm
# ... perform the user flow ...
playwright-cli video-stop
Create apps/web/.pr-assets/manifest.json so the /pr skill picks them up:
{
"assets": [
{"name": "feature-demo", "file": "feature-demo.webm", "format": "gif", "caption": "Feature demo"},
{"name": "feature-after", "file": "feature-after.png", "format": "png", "caption": "Result"}
]
}
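A typed sketch of the manifest can catch typos before the /pr skill reads it. The field names are taken from the example above. The PrAsset type and the buildManifest helper are assumptions for illustration, since the actual schema is defined by the /pr skill and may accept more fields.

```typescript
// Fields as they appear in the example manifest (assumption: all four are required).
type PrAsset = { name: string; file: string; format: string; caption: string };
type PrManifest = { assets: PrAsset[] };

// Hypothetical helper: reject incomplete entries and duplicate names, then serialize.
function buildManifest(assets: PrAsset[]): string {
  const seenNames: string[] = [];
  for (const asset of assets) {
    if (!asset.name || !asset.file || !asset.format) {
      throw new Error("incomplete asset entry: " + JSON.stringify(asset));
    }
    if (seenNames.indexOf(asset.name) !== -1) {
      throw new Error("duplicate asset name: " + asset.name);
    }
    seenNames.push(asset.name);
  }
  const manifest: PrManifest = { assets: assets };
  return JSON.stringify(manifest, null, 2);
}

// Same two entries as the JSON example above.
const manifestJson = buildManifest([
  { name: "feature-demo", file: "feature-demo.webm", format: "gif", caption: "Feature demo" },
  { name: "feature-after", file: "feature-after.png", format: "png", caption: "Result" },
]);
console.log(manifestJson);
```

The returned string is what would be written to apps/web/.pr-assets/manifest.json.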
Always verify against the production build before finishing — dev mode can hide SSR/hydration issues:
playwright-cli close
# Kill dev server and backend
make build-web
cd apps && pnpm --filter @kandev/web e2e -- tests/path/to/test.spec.ts
Tests are grouped by feature area in subdirectories under tests/. When creating a new test:
- Place the spec in the matching feature directory: a PR test goes in pr/, a test for session resume goes in session/, etc.
- Group related scenarios in one file using test.describe blocks. Don't create a new file for each narrow scenario.
- Import paths use ../../ (e.g., from "../../fixtures/test-base").
- Reload the page (testPage.reload()) and assert the state is still correct. This catches hydration bugs and SSR/client mismatches.
- Use apiClient to set up preconditions quickly, but always verify the result by opening the page and checking the DOM.

When a test fails:
- Read error-context.md from test-results/<test-name>/. It contains a YAML DOM snapshot showing exactly what was rendered. Search for expected elements, and check if the page is in the right state (e.g., simple mode vs advanced mode). These files persist across runs, so always confirm timestamps (portable: ls -la e2e/test-results/.../error-context.md; or stat -c %y on Linux / stat -f %Sm on macOS) or rebuild + rerun the spec fresh before trusting the snapshot. A stale context from a previous failure mode will send you debugging the wrong bug.
- Check e2e/test-results/ to see what the page actually rendered.
- Debug interactively with playwright-cli:
cd apps && PLAYWRIGHT_HTML_OPEN=never pnpm --filter @kandev/web e2e -- tests/path.spec.ts --debug=cli &
# Wait for "Debugging Instructions" with session name
playwright-cli attach tw-<session>
playwright-cli snapshot # inspect page state at failure point
playwright-cli console # check for JS errors
playwright-cli network # check API responses
| Category | Signals | Fast loop |
|---|---|---|
| Test logic | Wrong selector, wrong expected text, missing page object method | Fix test files, re-run immediately (no rebuild -- Playwright transpiles TS at runtime) |
| Frontend-only | Screenshot shows wrong UI, missing element, client error. API calls succeed. | Start dev server, fix with hot reload, verify with playwright-cli, then make build-web + re-run test |
| Backend | 500 errors, wrong API response, "Backend did not become healthy" | Fix Go code, make build-backend, re-run test |
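The rebuild rule implied by the table can be sketched as a lookup. The category names and the rebuildBeforeRerun helper are hypothetical; only the commands themselves come from this document.

```typescript
// Hypothetical labels for the three rows of the table above.
type FailureCategory = "test-logic" | "frontend" | "backend";

// Commands to run before re-running a spec, per the "Fast loop" column.
function rebuildBeforeRerun(category: FailureCategory): string[] {
  switch (category) {
    case "test-logic":
      // Playwright transpiles TS at runtime, so test-only fixes need no rebuild.
      return [];
    case "frontend":
      // Tests run against the production build, so the frontend must be rebuilt.
      return ["make build-web"];
    case "backend":
      // Go changes require a fresh backend binary.
      return ["make build-backend"];
  }
}

console.log(rebuildBeforeRerun("frontend"));
```

For a frontend failure the dev-server loop with hot reload still comes first; the rebuild applies only to the final re-run of the spec.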
- If the backend does not become healthy, run make build-backend build-web and check with E2E_DEBUG=1.
- If dependencies are missing, run cd apps && pnpm install.
- Ports are offset by E2E_PORT_OFFSET (derived from PID). Set E2E_PORT_OFFSET=0 for deterministic ports.
- The timeouts in fixtures/backend.ts and the overall test timeout (60s in playwright.config.ts) are separate and should not be modified either.

CI splits tests across 10 shards. To reproduce a specific shard locally:
# List which tests are in a shard
npx playwright test --config e2e/playwright.config.ts --shard=2/10 --list
# Run that shard locally (requires production build)
make build-backend build-web
cd apps/web && npx playwright test --config e2e/playwright.config.ts --shard=2/10
E2E tests run against the production build (next build), not dev mode. Always rebuild with make build-web (or pnpm --filter @kandev/web build) after code changes before running E2E tests locally.
A test that flakes under parallel/sharded load is one of two things — decide which before touching it:
Run the spec on its own with --retries=0, a few reps:
pnpm e2e:docker --no-build -- --repeat-each=4 --workers=1 --retries=0 tests/path.spec.ts:LINE
# or raw: pnpm exec playwright test --config e2e/playwright.config.ts --project=chromium --repeat-each=4 --workers=1 --retries=0 tests/path.spec.ts:LINE
- A waitForRequest that times out the full window means the request never fired (a click swallowed during hydration). Retry the action with await expect(async () => { ... }).toPass(); don't extend the timeout.
- Don't run --repeat-each across many heavy specs in one long-lived worker. It exhausts per-worker resources (agentctl port range, memory) over a long run and manufactures false failures unrelated to the test. Use one fresh container per spec instead.
- Prefer data-testid selectors over text-based locators. Text content can change when UI is updated (e.g., hiding a badge), breaking tests that match by text. Use getByTestId() or locator("[data-testid='...']") for stable targeting.
- Use clickSessionChatTab() (stable data-testid) instead of sessionTabByText("1") (fragile text match) for session tabs.
- The openSidebarMenuAndClick() helper in session-page.ts retries the full open-click sequence on detachment. Use this pattern for similar interactions.

Follow /tdd when writing E2E tests:
1. Write the test and watch it fail (missing data-testid, feature not implemented, etc.)
2. Implement the feature, add data-testid attributes, run the test until green
3. Run /verify when done