| name | capture |
| description | Capture HTTP traffic from web apps using playwright-cli. Includes site fingerprinting (framework detection, protection checks, iframe detection, auth detection, API discovery) and full traffic recording with tracing and optional HAR output. TRIGGER when: "record traffic from", "capture API calls from", "start Phase 1 for", "analyze traffic from URL", "assess site", "site fingerprint", "start capture for", "open browser for", or any URL is given as the first step of CLI generation. DO NOT trigger for: Phase 2 implementation, test writing, or quality validation. |
| version | 0.4.0 |
Traffic Capture (Phase 1)
Assess the site, then capture comprehensive HTTP traffic. This skill combines
site assessment with full traffic recording in a single browser session.
CRITICAL EXECUTION RULES
NEVER use run_in_background: true for ANY playwright-cli command.
All playwright-cli commands must run in the foreground with appropriate timeouts.
Background execution causes task ID tracking failures: the command completes
before you can read the output. See references/playwright-cli-commands.md
for the timeout table.
NEVER use eval for complex expressions. eval fails silently on ternaries,
comma operators, and multi-branch logic with "not well-serializable" errors.
Use run-code instead. See references/framework-detection.md for details.
ESM context: no require(). run-code uses ESM. Use await import('fs')
instead of require('fs'). See references/playwright-cli-commands.md.
Prerequisites (Hard Gate)
Do NOT start unless:
Default capture method: playwright-cli tracing (standard workflow below).
Optional --mitmproxy mode
Use this when the --mitmproxy flag was passed to /cli-anything-web,
or when you need no body truncation, real-time noise filtering, and enhanced
metadata (timestamps, cookies, body sizes). Requires pip install mitmproxy
(Python 3.12+).
python ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py start-proxy --port 8080
npx @playwright/cli@latest -s=<app> open <url> \
--config=.playwright/cli.proxy.config.json --headed
npx @playwright/cli@latest -s=<app> close
python ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py stop-proxy \
--port 8080 -o <app>/traffic-capture/raw-traffic.json
The start-proxy command creates .playwright/cli.proxy.config.json as part
of startup, so no manual config file is needed. When the default playwright-cli path
fails entirely (e.g., Node not available), fall back to chrome-devtools-mcp via
launch-chrome-debug.sh; see HARNESS.md Tool Hierarchy.
Public API Shortcut
If the target site has a documented public REST/JSON API (e.g., Hacker News Firebase API, Dev.to API, Reddit API, Wikipedia API), browser capture is optional:
- Probe the API endpoints directly with httpx or curl
- Save responses as <app>/traffic-capture/raw-traffic.json
- Skip to Phase 2 (methodology)
This applies when:
- API docs exist (OpenAPI/Swagger, developer docs page, /api/ prefix)
- The API is publicly accessible without browser-specific auth
- Endpoints return JSON (not HTML)
If unsure whether a public API exists, proceed with browser capture as normal.
Resume from Checkpoint
Before starting, check if a previous capture session exists:
python ${CLAUDE_PLUGIN_ROOT}/scripts/capture-checkpoint.py restore <app>
If a checkpoint exists, read the guidance field and resume from the last
completed step instead of starting over. This prevents duplicate work when
sessions are interrupted.
Step 1: Setup
mkdir -p <app>/traffic-capture
npx @playwright/cli@latest kill-all 2>/dev/null || true
npx @playwright/cli@latest -s=<app> open <url> --headed --persistent
If --mitmproxy mode: Replace the open command above with:
python ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py start-proxy --port 8080
npx @playwright/cli@latest -s=<app> open <url> --config=.playwright/cli.proxy.config.json --headed
This starts the proxy first, then opens the browser routed through it.
All subsequent snapshot, click, fill, goto commands work exactly the same.
Do NOT ask the user to log in yet. Step 2 will determine if auth is needed.
Step 2: Site Fingerprint (Single Command)
Run the all-in-one site fingerprint command instead of individual eval calls.
This is faster, more reliable, and detects framework + protection + iframes +
auth requirements in one shot.
Use the script file: multi-line JS with arrow functions and optional chaining
fails in playwright-cli's single-line command parser. The script file approach
has been tested and works reliably:
npx @playwright/cli@latest -s=<app> run-code "$(grep -v '^\s*//' ${CLAUDE_PLUGIN_ROOT}/scripts/site-fingerprint.js | tr '\n' ' ')"
IMPORTANT: The site-fingerprint.js script must be loaded via the command
above. Do NOT copy-paste the JS inline; it will fail with SyntaxError.
The grep -v strips comments and tr joins lines for single-line execution.
Interpret fingerprint results
The fingerprint returns four groups: framework, protection, auth, iframes.
Map each true flag to the next action:
| Group | Action |
|---|---|
| framework | See references/framework-detection.md for the full protocol table (googleBatch / nextPages / nextApp / nuxt / spaRoot). |
| protection | See references/protection-detection.md; always start at the escalation ladder at the top (plain httpx → curl_cffi → curl_cffi + cookies → camoufox → hybrid). |
| auth | Table below (Auth detection section). |
| iframes | If iframeCount > 0, see references/playwright-cli-advanced.md for the in-iframe re-run snippet. |
Claude-facing shortcuts:
- googleBatch: true → generate rpc/ subpackage (batchexecute protocol).
- cloudflareManagedChallenge: true → tier 4 (camoufox) is required; curl_cffi alone will fail.
- awsWaf: true → capture aws-waf-token cookie; use curl_cffi for GraphQL, cookie-only for SSR.
- akamai: true or datadome: true → 1-2 s delays between requests are mandatory.
- serviceWorker: true → note in assessment.md; generated CLI uses service_workers="block".
- iframeCount > 0 → re-run the fingerprint inside the iframe. Google Labs apps (Stitch / MusicFX / ImageFX) follow this pattern: the parent has WIZ_global_data, the iframe has the real app.
Note: snapshot and click <ref> auto-resolve iframes. Only drop down to
run-code for iframe interaction when built-in commands fail.
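The flag-to-action shortcuts above can be read as a simple lookup. The sketch below is illustrative only: the flat dict shape of the fingerprint result and the helper name `next_actions` are assumptions, and the real output of site-fingerprint.js may nest these flags under its framework / protection / iframes groups.

```python
# Hypothetical helper: map fingerprint flags to follow-up actions.
# The flat dict shape is an assumption; site-fingerprint.js may nest
# these flags under framework / protection / iframes.
ACTIONS = {
    "googleBatch": "generate rpc/ subpackage (batchexecute protocol)",
    "cloudflareManagedChallenge": "use tier 4 (camoufox); curl_cffi alone will fail",
    "awsWaf": "capture aws-waf-token cookie",
    "akamai": "leave 1-2 s between requests",
    "datadome": "leave 1-2 s between requests",
    "serviceWorker": "note in assessment.md",
}


def next_actions(fingerprint: dict) -> list[str]:
    """Return the de-duplicated actions for every flag that is true."""
    actions = []
    for flag, action in ACTIONS.items():
        if fingerprint.get(flag) is True and action not in actions:
            actions.append(action)
    # iframeCount is a number, not a boolean flag, so it is checked separately.
    if fingerprint.get("iframeCount", 0) > 0:
        actions.append("re-run the fingerprint inside the iframe")
    return actions
```

Keeping the mapping in one table makes it easy to spot when two flags (here akamai and datadome) lead to the same action.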
Auth detection (BEFORE exploration)
Check the fingerprint auth fields:
| Condition | Meaning | Action |
|---|---|---|
| hasLoginButton && !hasUserMenu | Login required, not logged in | Ask user to log in NOW |
| hasUserMenu | Already logged in | Proceed to capture |
| !hasLoginButton && !hasUserMenu | No auth needed (public site) | Skip auth, proceed |
If auth is needed:
- Tell the user: "This site requires login. Please log in in the browser window."
- Wait for user confirmation
- Save auth state and tighten permissions (CLAUDE.md mandates chmod 600):
npx @playwright/cli@latest -s=<app> state-save <app>/traffic-capture/<app>-auth.json
chmod 600 <app>/traffic-capture/<app>-auth.json
If NO auth is needed: Skip directly to Step 2b.
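The auth detection table can be expressed as a three-way decision. This is a minimal sketch: the function name `auth_action` and the returned labels are hypothetical, while the two inputs correspond to the hasLoginButton and hasUserMenu fingerprint fields.

```python
def auth_action(has_login_button: bool, has_user_menu: bool) -> str:
    """Map the two auth fields of the fingerprint to the next action."""
    if has_user_menu:
        return "proceed"      # already logged in
    if has_login_button:
        return "ask_login"    # login required, not logged in
    return "skip_auth"        # public site, no auth needed
```

Checking hasUserMenu first means a page that shows both a user menu and a login link is treated as logged in, which matches the second row of the table.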
2b. Classify Site Profile
Based on fingerprint results AND what you see in the UI, classify the site:
| Profile | Auth? | Operations | Exploration Focus |
|---|---|---|---|
| Auth + CRUD | Yes | Create, Read, Update, Delete | Full CRUD per resource |
| Auth + Generation | Yes | Generate, Poll, Download | Generation lifecycle + projects |
| Auth + Read-only | Yes | Read, Search, Export | Read operations + auth flow |
| No-auth + CRUD | No/Optional | Full CRUD | Skip auth, full CRUD |
| No-auth + Read-only | No | Read, Search | Minimal capture |
2c. Quick API Probe (Force SPA Navigation Trick)
Start a SHORT trace, click 3-4 internal links, stop. This reveals hidden API
endpoints that SSR hides on initial page load.
npx @playwright/cli@latest -s=<app> tracing-start
npx @playwright/cli@latest -s=<app> click <internal-link-1>
npx @playwright/cli@latest -s=<app> click <internal-link-2>
npx @playwright/cli@latest -s=<app> click <internal-link-3>
npx @playwright/cli@latest -s=<app> tracing-stop
python ${CLAUDE_PLUGIN_ROOT}/scripts/parse-trace.py .playwright-cli/traces/ --latest \
--output <app>/traffic-capture/probe-traffic.json
This probe trace is separate from the full capture in Step 3. Step 3 will
start a fresh trace that overwrites the .network file in .playwright-cli/traces/.
The parsed probe-traffic.json is kept in traffic-capture/ so it stays available
for cross-referencing during Step 4.
Check the probe results: what API patterns did you find?
See references/api-discovery.md for the priority chain and decision tree.
2d. Write Assessment Summary
Create <app>/traffic-capture/assessment.md to consolidate all findings:
# Site Assessment: <app>
- **URL**: <url>
- **Framework**: <detected framework or "none/custom">
- **Protocol**: <REST / GraphQL / batchexecute / HTML scraping / hybrid>
- **Protection**: <none / cloudflare / captcha / aws-waf / etc.>
- **Auth required**: <yes (type: Google SSO / cookie / JWT / API key) / no>
- **Iframes**: <yes (N frames, app in frame N at <url>) / no>
- **Site profile**: <Auth+CRUD / Auth+Generation / Auth+Read-only / No-auth+CRUD / No-auth+Read-only>
- **Capture strategy**: <API-first / SSR+API hybrid / batchexecute / HTML scraping / protected-manual>
- **Key observations**: <any quirks, localized UI, rate limits, special patterns>
Step 3: Full Traffic Capture
Now do the comprehensive capture based on what Step 2 revealed.
npx @playwright/cli@latest -s=<app> run-code "async page => {
await page.context().routeFromHAR('<app>/traffic-capture/capture.har', {
update: true,
updateContent: 'embed',
updateMode: 'full'
});
return 'HAR recording started';
}"
npx @playwright/cli@latest -s=<app> tracing-start
If --mitmproxy mode: Skip tracing-start and HAR recording above.
mitmproxy has been capturing all traffic since Step 1, so just proceed
to the exploration below. Every click, navigation, and form submission
is automatically recorded by the proxy.
HAR recording is optional but recommended. It produces a standard HAR file
alongside the trace. This enables mitmproxy2swagger to auto-generate an
OpenAPI spec: pip install mitmproxy2swagger && mitmproxy2swagger -i capture.har -o api-spec.yaml -p <base-url>
The HAR file is saved when the browser context is closed (Step 5).
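Once the HAR file exists (after the context is closed in Step 5), a quick way to see what it contains is to list the unique method and path pairs. The sketch below relies only on the standard HAR layout (`log.entries[].request.method` and `.url`); the helper name `har_endpoints` is hypothetical and not part of the plugin's scripts.

```python
from urllib.parse import urlsplit


def har_endpoints(har: dict) -> list[tuple[str, str]]:
    """Return sorted unique (method, path) pairs from a parsed HAR document."""
    seen = set()
    for entry in har.get("log", {}).get("entries", []):
        request = entry["request"]
        # Drop the query string so paginated calls collapse into one endpoint.
        seen.add((request["method"], urlsplit(request["url"]).path))
    return sorted(seen)
```

To use it, load the file with `json.load` and pass the resulting dict, for example `har_endpoints(json.load(open("capture.har", encoding="utf-8")))`.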
Exploration by site profile
Use the concrete targets in references/exploration-checklists.md for the
profile identified in Step 2b. Each profile has an explicit entry count,
distinct-path count, and WRITE-op target that validate-capture.py (Step 4)
will enforce. Minimum bar across all profiles:
- ≥ 15 entries, ≥ 3 distinct URL paths, protocol ≠ unknown
- ≥ 1 WRITE op (unless the site is genuinely read-only; pass --read-only to the validator)
- < 50% error rate (dominant 4xx/5xx means auth or rate-limit failure)
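As a rough illustration of the minimum bar, the check below applies the entry, path, WRITE-op, and error-rate thresholds to a list of captured requests. It is a sketch, not the logic of validate-capture.py: the entry schema (`method`, `url`, `status` keys) is an assumption, treating POST/PUT/PATCH/DELETE as WRITE ops is an assumption, and the protocol check is omitted because it depends on protocol detection.

```python
from urllib.parse import urlsplit

# Assumption: any non-read HTTP method counts as a WRITE op.
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def meets_minimum_bar(entries: list[dict], read_only: bool = False) -> list[str]:
    """Return the failed checks; an empty list means the bar is met."""
    failures = []
    if len(entries) < 15:
        failures.append("fewer than 15 entries")
    paths = {urlsplit(e["url"]).path for e in entries}
    if len(paths) < 3:
        failures.append("fewer than 3 distinct URL paths")
    has_write = any(e["method"].upper() in WRITE_METHODS for e in entries)
    if not read_only and not has_write:
        failures.append("no WRITE op")
    errors = sum(1 for e in entries if e["status"] >= 400)
    if entries and errors / len(entries) >= 0.5:
        failures.append("error rate is 50% or higher")
    return failures
```

Returning the list of failures rather than a boolean shows which part of the exploration still needs work before re-running the real validator.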
Pacing for protected sites
If any of cloudflare, cloudflareManagedChallenge, akamai, datadome,
awsWaf, or rateLimit fired in the fingerprint, leave 1-2 s between
clicks / form submits. Faster exploration triggers per-IP challenges within
~30 requests and corrupts the trace.
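If the exploration is driven from a script, the pacing rule can be applied with a small helper. This is a sketch under assumptions: the fingerprint is taken to be a flat dict of the flags listed above, the helper name `pacing_delay` is hypothetical, and picking a random value inside the 1-2 s window is an illustrative choice rather than something the workflow requires.

```python
import random

PROTECTION_FLAGS = (
    "cloudflare", "cloudflareManagedChallenge", "akamai",
    "datadome", "awsWaf", "rateLimit",
)


def pacing_delay(fingerprint: dict) -> float:
    """Return the seconds to wait before the next click or form submit."""
    if any(fingerprint.get(flag) for flag in PROTECTION_FLAGS):
        return random.uniform(1.0, 2.0)  # protected site: stay in the 1-2 s window
    return 0.0  # no protection detected: no extra delay needed
```

A caller would run `time.sleep(pacing_delay(fingerprint))` between consecutive playwright-cli click or fill commands.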
General interaction rules
Step 4: Stop, Save, Parse
npx @playwright/cli@latest -s=<app> tracing-stop
If tracing-stop fails:
- Retry once with 15s timeout
- If it fails again, the trace is lost. Start a new trace (Step 3).
- NEVER retry more than twice. See references/playwright-cli-tracing.md for recovery.
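The retry rule above can be sketched as a bounded loop. The helper name `stop_trace` is hypothetical, and the 30 s timeout for the first attempt is a placeholder assumption (the real value belongs to the timeout table in references/playwright-cli-commands.md); only the 15 s retry timeout comes from the list above.

```python
import subprocess


def stop_trace(app: str, first_timeout: float = 30, run=subprocess.run) -> bool:
    """Run tracing-stop, retrying once with a 15 s timeout. True on success."""
    cmd = ["npx", "@playwright/cli@latest", f"-s={app}", "tracing-stop"]
    # One initial attempt plus a single retry; never loop beyond that.
    for timeout in (first_timeout, 15):
        try:
            if run(cmd, timeout=timeout).returncode == 0:
                return True
        except subprocess.TimeoutExpired:
            pass  # treat a timeout like a failed attempt
    return False  # the trace is lost; start a new trace (Step 3)
```

The `run` parameter exists only so the loop can be exercised without a browser; in normal use the default `subprocess.run` executes the command in the foreground.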
python ${CLAUDE_PLUGIN_ROOT}/scripts/parse-trace.py \
.playwright-cli/traces/ --latest \
--output <app>/traffic-capture/raw-traffic.json
python ${CLAUDE_PLUGIN_ROOT}/scripts/validate-capture.py <app>
If validate-capture.py returns a non-zero exit code, do not proceed to Step 5.
Re-open the browser (Step 1), continue exploration to fill the gaps the validator
flagged, then re-run Step 4. Only mark the capture complete after the validator
passes (or warns, with your explicit sign-off on each warning).
For deeper inspection:
python ${CLAUDE_PLUGIN_ROOT}/scripts/analyze-traffic.py \
<app>/traffic-capture/raw-traffic.json --summary
If --mitmproxy mode: Replace the parse/analyze block above with:
python ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py stop-proxy \
--port 8080 -o <app>/traffic-capture/raw-traffic.json
python ${CLAUDE_PLUGIN_ROOT}/scripts/validate-capture.py <app>
python ${CLAUDE_PLUGIN_ROOT}/scripts/analyze-traffic.py \
<app>/traffic-capture/raw-traffic.json --summary
No tracing-stop or parse-trace.py needed; mitmproxy already has the data.
The analysis will include enhanced fields (request_sequence, session_lifecycle,
endpoint_sizes) that are only available with mitmproxy capture.
Step 5: Close
npx @playwright/cli@latest -s=<app> close
python ${CLAUDE_PLUGIN_ROOT}/scripts/capture-checkpoint.py update <app> --step complete
If an endpoint is missing: USE THE FEATURE
Don't grep JS bundles. Start a new trace → screenshot → click the button → fill
→ submit → stop → parse. The browser IS the API documentation.
Fallback
Fallback: If playwright-cli is not available, see HARNESS.md Tool Hierarchy for chrome-devtools-mcp fallback instructions.
Next Step
When capture is complete (raw-traffic.json has WRITE operations, or the site is
read-only with only GET requests), invoke methodology to analyze the traffic
and build the CLI.
References
See references/ for:
- playwright-cli-commands.md: command syntax, timeouts, ESM rules
- playwright-cli-tracing.md: trace file format, recovery protocol
- playwright-cli-sessions.md: named sessions, auth persistence
- playwright-cli-advanced.md: waits, iframes, localized UIs, downloads
- framework-detection.md: framework → protocol table
- protection-detection.md: anti-bot escalation ladder (curl_cffi → camoufox → hybrid)
- api-discovery.md: protocol priority chain, decision tree
- exploration-checklists.md: per-profile capture targets with concrete numbers