blazediff
// Run, author, or update BlazeDiff visual regression tests. Trigger on "visual test", "screenshot regression", "blazediff", "/blazediff".
| name | blazediff |
| description | Run, author, or update BlazeDiff visual regression tests. Trigger on "visual test", "screenshot regression", "blazediff", "/blazediff". |
CLI binary is blazediff-agent (the name blazediff belongs to the cargo image-diff binary).
Sibling files in this skill directory — read on demand:
- JUDGING.md: judging ambiguous diffs (pendingJudgments > 0), plus zsh-safe shell loops for writing verdicts.
- MASKING.md: picking selectors, mass-masking shared noise across routes, applying masks.

- Pass --json on every blazediff-agent call; parse fields. Do not echo CLI output.
- check --json returns a slim payload: { summaryPath, createdAt, totalEntries, passed, failed, pendingJudgments, results }. results lists non-pass entries only, each as { id, url, status, verdict?: { label, headline, action } }. The full per-entry detail (regions, paths, rationale) lives in <TARGET>/.blazediff/summary.md and <TARGET>/.blazediff/judgments/<id>/request.json.
- Use one capture --stdin call with a JSON list of routes piped in, never a per-route loop.
- Do not run ls, cat, or find for paths the CLI already returns.
- Final summary line: N captured | M skipped (reasons) | K auth-gated; for check: P/T passed (F failed) plus failure ids.

TARGET="$(cd /path/to/repo/apps/website && pwd -P)"
blazediff-agent --cwd "$TARGET" ...
Always pass an absolute path to --cwd. Never cd into the target. The CLI catches the common double-nest case (apps/website/apps/website), but absolute paths avoid it entirely.

If .blazediff/manifest.json (in --cwd) exists → check.

1. Run blazediff-agent --cwd "$TARGET" check --judge host --json (the CLI starts the dev server if devServer is configured; otherwise it hits the configured baseUrl directly). Capture, diff, verdict, and judge run through a LangGraph state graph so per-entry stages overlap.
   - page.goto timeout: the first route in a fresh dev session sometimes times out (page.goto: Timeout 30000ms exceeded). If that happens, rerun the same check command; the dev server is now warm and the next pass usually completes. Don't change waitFor or restart the dev server; the issue is one-time compilation, not a routing or wait-condition bug.
   - Browser launch failure (browserType.launch: Target page, context or browser has been closed): rerun with the sandbox/escape escalation the host agent provides (in Codex: approve the command for "always run outside sandbox"). Not a blazediff bug.
2. All entries pass: report P/T passed. Stop.
3. Pending judgments (pendingJudgments > 0): the heuristic couldn't classify some diffs. You are the judge; read JUDGING.md in this skill directory for the full workflow. After judging, re-run check --apply-judgments --json, then re-evaluate as if from step 2/4.
4. Failures: read <TARGET>/.blazediff/summary.md (5-column id | baseline | actual | diff | verdict table with inline image previews; the --json stdout has the same data as CheckReport). Each failing entry has a verdict: { label, headline, action, rationale[] }. Emit one line per failure: <id>: <verdict.label> — <verdict.headline>. Then act per verdict.label:
   - regression-likely → point the user at <TARGET>/.blazediff/actual/<id>.diff.png and ask them to investigate. Do not rewrite.
   - intentional-likely → ask the user to confirm; if yes, blazediff-agent --cwd "$TARGET" rewrite <id> --json.
   - noise-likely → ask the user once: ignore, mask, or rewrite. Prefer masking over rewriting when the source is inherently non-deterministic (carousel, iframe, clock, randomized avatar); rewriting only delays the next flake. See MASKING.md. If rewriting, group with other rewrites in one call (rewrite <id1> <id2> ...).
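The reporting part of the check workflow can be sketched as a small pure function. This is an illustration only: summarizeCheck is a hypothetical helper, the sample status and action values are invented, and treating a missing verdict as "still pending judgment" is an assumption rather than documented CLI behavior.

```javascript
// Hypothetical helper: turn the slim `check --json` payload into report lines.
// Field names follow the payload shape described above; nothing here calls the CLI.
const SEPARATOR = " \u2014 "; // the dash used in the one-line-per-failure format

function summarizeCheck(report) {
  const failureLines = report.results
    // Assumption: entries without a verdict are still awaiting judgment.
    .filter((entry) => entry.verdict)
    .map((entry) => `${entry.id}: ${entry.verdict.label}${SEPARATOR}${entry.verdict.headline}`);

  const summary = `${report.passed}/${report.totalEntries} passed (${report.failed} failed)`;
  return { summary, failureLines, needsJudging: report.pendingJudgments > 0 };
}

// Invented sample payload, for illustration only.
const checkExample = summarizeCheck({
  totalEntries: 3,
  passed: 1,
  failed: 1,
  pendingJudgments: 1,
  results: [
    {
      id: "pricing",
      url: "/pricing",
      status: "fail",
      verdict: { label: "noise-likely", headline: "carousel frame changed", action: "mask" },
    },
    { id: "home", url: "/", status: "pending" },
  ],
});

console.log(checkExample.summary);
checkExample.failureLines.forEach((line) => console.log(line));
```

When needsJudging is true, the judging workflow in JUDGING.md still has to run before the summary is final.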
Never rewrite or mask without explicit user confirmation.

Require verdict.action === "rewrite-if-intended" (or explicit user confirmation) before calling rewrite. When the user confirms a failing entry's new state is correct:
- All failing entries: blazediff-agent --cwd "$TARGET" rewrite --failed --json
- Specific entries: blazediff-agent --cwd "$TARGET" rewrite <id> [<id>...] --json
- Every entry: blazediff-agent --cwd "$TARGET" rewrite --all --json

rewrite preserves the existing manifest entry's mask, viewport, waitFor, and fullPage settings; only the PNG is regenerated. After it returns, suggest the user re-run check to confirm and then git add .blazediff/baselines/ && git commit.
When the user asks to wipe blazediff's state and start over (manifest stale beyond repair, switching frameworks, etc.):
blazediff-agent --cwd "$TARGET" reset --yes --json: deletes the entire .blazediff/ directory (config, manifest, baselines, actual, judgments, summary, pid/log). The tracked dev server is stopped first.
Never run reset without an explicit user request; it discards committed baselines.

Setup (config + Chromium). One onboard --json call writes .blazediff/config.json and ensures bundled Chromium (no prompts under --json; no capture step):
- Existing URL: blazediff-agent --cwd "$TARGET" onboard --url <url> --json.
- Explicit dev command: onboard --dev-command "<cmd>" --port <n> --json.
- Auto-detected dev script: onboard --json. When multiple dev scripts exist, non-interactive picks the highest-priority one (dev); override with --dev-script <name>.
Chromium uses the bundled playwright: no sudo, no npx playwright install --with-deps. (On Linux, OS-level deps may still need npx playwright install-deps chromium if a run fails on missing libs; tell the user.) Skip either step with --no-browsers.
Dev server. If config.devServer is non-null, run blazediff-agent --cwd "$TARGET" serve-status --detach --json. Expect this to wait up to 60s for the port to open before returning. Do not background or poll it.
Discover routes. Prefer reading the router source directly:
- app/**/page.{tsx,jsx,mdx} + pages/**/*.{tsx,jsx} (skip api/, _app, _document, _error).
- <Route path=...> in router.{ts,tsx}.
- app/routes or src/routes.

If the framework is unknown or the router source is opaque, call blazediff-agent --cwd "$TARGET" discover --json. That command does a BFS crawl from the configured baseUrl (depth 2, up to 50 routes), reads .next/routes-manifest.json if present, and reads /sitemap.xml. It's a fallback for when source-walking fails.
Filter. Drop /api/*, dynamic segments without sample data, redirects/404s. For routes that need login or any pre-screenshot interaction, attach a harness via harnesses: [...] on the entry. See harnesses below.
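Source-walking plus filtering can be sketched for the app/ and pages/ globs above, which look like Next.js conventions. routeFromFile and candidateRoutes are hypothetical helper names, the file list is invented, and the handling of route groups such as (marketing) assumes Next.js behavior; redirects and 404s still have to be checked against the running site.

```javascript
// Hypothetical helper: map a Next.js-style source file to a candidate route,
// or null when the file should be skipped.
function routeFromFile(file) {
  const app = file.match(/^app\/(?:(.+)\/)?page\.(?:tsx|jsx|mdx)$/);
  const pages = file.match(/^pages\/(.+)\.(?:tsx|jsx)$/);
  let segments;
  if (app) {
    segments = (app[1] ?? "").split("/");
  } else if (pages) {
    segments = pages[1].split("/");
    if (segments[segments.length - 1] === "index") segments.pop();
  } else {
    return null;
  }
  // Assumption: route groups like (marketing) do not appear in the URL.
  segments = segments.filter((s) => s !== "" && !/^\(.*\)$/.test(s));
  // Skip API routes and framework-internal pages.
  if (["api", "_app", "_document", "_error"].includes(segments[0])) return null;
  // Dynamic segments need sample data, so they are dropped here.
  if (segments.some((s) => s.startsWith("["))) return null;
  return "/" + segments.join("/");
}

function candidateRoutes(files) {
  const routes = files.map(routeFromFile).filter((r) => r !== null);
  return [...new Set(routes)].sort();
}

// Invented file list, for illustration only.
const discoveredRoutes = candidateRoutes([
  "app/page.tsx",
  "app/pricing/page.tsx",
  "app/(marketing)/about/page.mdx",
  "app/docs/[slug]/page.tsx",
  "pages/index.tsx",
  "pages/api/hello.tsx",
  "pages/_app.tsx",
]);
console.log(discoveredRoutes);
```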
Capture in one call. Build a JSON array of route entries and pipe it through stdin:
cat <<'EOF' | blazediff-agent --cwd "$TARGET" capture --stdin --mode baseline --json
[
{"id":"home","url":"/","mask":[".timestamp"]},
{"id":"pricing","url":"/pricing"},
{"id":"dashboard","url":"/dashboard","harnesses":[{"name":"auth","params":{"persona":"default"}}]}
]
EOF
Entries: { id, url, mask?, viewport?, waitFor?, fullPage?, harnesses?, mode? }. Only id and url required. harnesses is an ordered list of { name, params? } (or bare "name" strings) resolving to .blazediff/harnesses/<name>.js. Manifest entries are written automatically (pass --no-manifest to skip).
- id: semantic kebab-case (home, pricing, docs-getting-started), not URL slug.
- mask: CSS selectors for unstable regions (timestamps, randomized IDs, avatars, "X ago" times, carousels, third-party iframes). Omit if none. The agent always masks [data-blazediff-agent-mask] automatically, so prefer tagging the source element when you can edit it. See MASKING.md for full guidance.

Teardown (ALWAYS run, even on error). If config.devServer is non-null, run blazediff-agent --cwd "$TARGET" serve-status --kill --json as the very last step regardless of capture success/failure. The CLI kills by tracked PID first, then falls back to whatever process is listening on the configured port, so it cleans up stale dev servers from prior crashed runs too. If the kill returns stopped: false, no server was running; that's fine. Wrap your capture call so this step runs even if capture failed mid-list (shell trap, try/finally in the host agent's flow, etc.).
Final summary line. Suggest git add .blazediff/ && git commit.
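The "teardown always runs" rule maps naturally onto try/finally. The sketch below assumes a host flow written in JavaScript; captureWithTeardown is a hypothetical wrapper, and run(args, input) is an injected function expected to invoke blazediff-agent synchronously (for example through Node's execFileSync). The demo uses a fake runner, so nothing is actually executed.

```javascript
// Hypothetical wrapper: start the dev server, capture once, always tear down.
function captureWithTeardown(run, target, entries, hasDevServer) {
  const base = ["--cwd", target];
  try {
    if (hasDevServer) run([...base, "serve-status", "--detach", "--json"]);
    // One capture call for the whole list, fed through stdin.
    return run(
      [...base, "capture", "--stdin", "--mode", "baseline", "--json"],
      JSON.stringify(entries),
    );
  } finally {
    // Runs whether capture returned or threw.
    if (hasDevServer) run([...base, "serve-status", "--kill", "--json"]);
  }
}

// Demo with a fake runner that fails during capture.
const teardownCalls = [];
const fakeRun = (args) => {
  teardownCalls.push(args.join(" "));
  if (args.includes("capture")) throw new Error("capture failed mid-list");
  return "{}";
};

try {
  captureWithTeardown(fakeRun, "/abs/path/to/target", [{ id: "home", url: "/" }], true);
} catch (err) {
  console.log(err.message);
}
console.log(teardownCalls[teardownCalls.length - 1]);
```

Injecting run keeps the ordering logic testable without a dev server; a real runner would pass input to the child process's stdin.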
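A harness is a pluggable script in .blazediff/harnesses/<name>.js attached to an entry via harnesses: [{ name, params? }]. There are two phases:
- setup: runs before navigation and must not call screenshot().
- interact (the default): runs after the base screenshot.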
Each harness is an ESM module that default-exports a Harness. TypeScript is not auto-transpiled — write .js/.mjs and use the JSDoc @type annotation for intellisense:
/** @type {import("@blazediff/agent").Harness} */
export default {
  // phase defaults to "interact": runs AFTER the base screenshot.
  async run({ page, screenshot }) {
    await page.getByRole("button", { name: "More options" }).click();
    await screenshot("menu"); // -> a sub-baseline with id "<entry>__menu"
  },
};
Attach it: {"id":"weather","url":"/weather","harnesses":["weather-menu"]}. The base shot weather fires automatically; every screenshot("menu") becomes its own manifest/baseline/diff entry weather__menu.
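The naming convention for sub-shots can be sketched as a tiny helper. subBaselineId is a hypothetical name, and the regular expression is one assumed reading of "alphanumeric/kebab"; the CLI's own validation may be looser or stricter.

```javascript
// Hypothetical helper mirroring the "<entry>__<name>" convention.
function subBaselineId(entryId, shotName) {
  // Assumed rule: letter/digit runs joined by single hyphens, nothing else.
  if (!/^[a-z0-9]+(?:-[a-z0-9]+)*$/i.test(shotName)) {
    throw new Error(`invalid screenshot name: ${JSON.stringify(shotName)}`);
  }
  return `${entryId}__${shotName}`;
}

console.log(subBaselineId("weather", "menu"));
```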
Rules:
- Avoid nth-child selectors. Stability + masks are re-applied before each sub-shot automatically. Screenshot names must be alphanumeric/kebab (no __, spaces, or slashes).
- To refresh sub-baselines, run rewrite <parent-id>; it re-runs the harness and regenerates every child.

Login is just a phase: "setup" harness. Credentials live only in env vars, never in the LLM, manifest, or harness file (the harness references process.env.BLAZEDIFF_AUTH_*, nothing else).
1. If .blazediff/harnesses/auth.js exists, skip to step 4. (Legacy .blazediff/auth.js from before the harness change must be regenerated.)
2. If credentials are already in .blazediff/.env (or the user can supply them), write the harness yourself; do not make the user run harness record. Identify the login form's email / password / submit selectors by reading the /login route's source component (prefer name= / type= / id= / role-based selectors; avoid nth-child), or by snapshotting the live page. Write .blazediff/harnesses/auth.js:
/** @type {import("@blazediff/agent").Harness<{ persona?: string }>} */
export default {
  phase: "setup", // runs before navigation; must NOT call screenshot()
  async run({ page, params }) {
    // params may be absent when the harness is attached as a bare name.
    const upper = (params?.persona ?? "default").toUpperCase().replace(/[^A-Z0-9]/g, "_");
    const email = process.env[`BLAZEDIFF_AUTH_${upper}_EMAIL`];
    const password = process.env[`BLAZEDIFF_AUTH_${upper}_PASSWORD`];
    if (!email || !password) throw new Error(`missing BLAZEDIFF_AUTH_${upper}_EMAIL / _PASSWORD`);
    await page.goto("<LOGIN URL>");
    await page.locator('input[name="email"]').fill(email);
    await page.locator('input[name="password"]').fill(password);
    // Start waiting for the redirect before clicking so the navigation is not missed.
    await Promise.all([
      page.waitForURL((u) => !u.pathname.startsWith("/login")),
      page.getByRole("button", { name: /sign in|log in/i }).click(),
    ]);
    // Resolve against the current page so a relative <LOGIN URL> also parses.
    if (new URL(page.url()).pathname === new URL("<LOGIN URL>", page.url()).pathname) {
      throw new Error("login did not leave /login; check selectors/redirect");
    }
  },
};
   Never inline real credentials; use only process.env.BLAZEDIFF_AUTH_* references.
3. Otherwise, ask the user to run blazediff-agent --cwd "$TARGET" harness record auth --login --persona default --url <login URL> (opens Playwright codegen; they log in once; typed creds are rewritten to env-var refs and written to .blazediff/harnesses/auth.js). You can't drive the recorder yourself.
4. Add { "name": "auth", "params": { "persona": "<persona>" } } to the entry's harnesses (use "default" unless the user specifies otherwise).
5. The harness reads BLAZEDIFF_AUTH_<PERSONA>_EMAIL / ..._PASSWORD at capture time. The CLI auto-loads env files from --cwd: .blazediff/.env[.local] (blazediff-scoped, auto-gitignored), then the project-root .env[.local]; real exported env wins, and .blazediff/ files beat the app root. Drop creds in .blazediff/.env. If the user hasn't supplied any, ask for them; don't invent placeholders.

- Never re-capture with --mode baseline an existing manifest entry without explicit user request.
- Never edit .blazediff/manifest.json directly.
- Author harnesses in .blazediff/harnesses/ yourself when you can (login: simple form, creds via env). Use harness record only for flows you can't author (OAuth/SSO/MFA/captcha). Never write real credentials into a harness file; env refs only.
- In non-interactive environments (CI=1 or no TTY), only check is allowed.
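The credential lookup from the auth steps can be sketched in isolation. authEnvNames mirrors the persona-to-variable mapping used in the auth harness; resolveEnv is a hypothetical model of the stated precedence (exported env, then .blazediff/ files, then the project root) and is not the CLI's actual loader. All sample values are invented.

```javascript
// Derive the env var names for a persona, as the auth harness does.
function authEnvNames(persona = "default") {
  const upper = persona.toUpperCase().replace(/[^A-Z0-9]/g, "_");
  return {
    email: `BLAZEDIFF_AUTH_${upper}_EMAIL`,
    password: `BLAZEDIFF_AUTH_${upper}_PASSWORD`,
  };
}

// Hypothetical precedence model: later spreads win, so exported env beats
// .blazediff/.env, which beats the project-root .env.
function resolveEnv(exported, blazediffEnv, rootEnv) {
  return { ...rootEnv, ...blazediffEnv, ...exported };
}

const personaVars = authEnvNames("qa-admin");
const mergedEnv = resolveEnv(
  { BLAZEDIFF_AUTH_QA_ADMIN_EMAIL: "exported@example.test" },
  {
    BLAZEDIFF_AUTH_QA_ADMIN_EMAIL: "scoped@example.test",
    BLAZEDIFF_AUTH_QA_ADMIN_PASSWORD: "scoped-secret",
  },
  { BLAZEDIFF_AUTH_QA_ADMIN_PASSWORD: "root-secret" },
);

console.log(personaVars.email, mergedEnv[personaVars.email]);
```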