| name | nodriver-browser |
| description | Persistent Chrome/Chromium browser automation skill built on nodriver. Use when a page needs JavaScript rendering, authorized login/session continuity, clicking or typing, DOM snapshots with stable refs, screenshots, or multi-step look-think-act flows that ordinary WebFetch/search cannot complete. Auto-starts a headless or headed Chrome daemon, can use an isolated skill profile or the user's Chrome profile, and preserves one tab across calls; not for static pages, simple searches, JSON APIs, or one-off scrapes. |
nodriver-browser
A persistent Chrome/Chromium browser that stays alive between Claude's turns. Built on nodriver (CDP-direct, no Selenium, no navigator.webdriver flag). Every script attaches to the same long-running browser, performs one action, and exits; the browser and its tab keep running.
Core invariant: ONE daemon, ONE persistent tab (tabs[0]). Every script reports tabs_open in its output. If you ever see tabs_open > 1, treat it as a real signal that something opened a stray tab — read the warning and act on it.
When to use this skill
- The page needs JavaScript to render (SPA, infinite scroll, lazy load)
- WebFetch is blocked by anti-bot systems (Cloudflare, DataDome, Imperva, hCaptcha)
- The task requires interaction: clicking buttons, filling forms, multi-step flows, dropdown selection
- You need session state across multiple actions (logged-in scraping, multi-page checkout, OAuth flows)
- You need visual proof of a page (screenshot) for debugging or reporting
When NOT to use this skill
- Static HTML that loads fully on first GET → use WebFetch
- One-shot search query → use WebSearch (or grok-search skill if it exists)
- A JSON API endpoint → use curl via Bash, you don't need a browser at all
- A single quick scrape with no interactivity → consider a one-off Python script, not this skill
This skill spawns a Chromium process (~150-200 MB RAM) that stays alive until you explicitly stop it. Worth it for interactive flows; overkill for one URL.
Quick start
Scripts live in scripts/ next to this SKILL.md — resolve paths relative to the skill root.
Run them directly as executables; the #!/usr/bin/env -S uv run --script
shebang invokes uv and reads the PEP 723 metadata for you. Do not run these
scripts with python, python3, or python -m; that bypasses the shebang,
dependency metadata, and Python version pin. If executable dispatch is
unavailable, use uv run --script scripts/nav.py ....
scripts/nav.py https://news.ycombinator.com
scripts/nav.py --headed --user-profile https://example.com
scripts/snapshot.py
scripts/click.py r17
scripts/state.py
scripts/stop_daemon.py
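When the scripts are driven from another Python program, the dispatch rule above can be sketched as follows. This is a minimal sketch and not part of the skill: `build_command` is a hypothetical helper that prefers direct execution and falls back to `uv run --script` when the file is not executable.

```python
import os
from pathlib import Path


def build_command(script: Path, *args: str) -> list[str]:
    """Return an argv list for a skill script (hypothetical helper).

    Direct execution lets the shebang pick up the PEP 723 metadata.
    If the file is not executable, fall back to `uv run --script`.
    The script is never dispatched through python or python3.
    """
    if os.access(script, os.X_OK):
        return [str(script), *args]
    return ["uv", "run", "--script", str(script), *args]
```

The returned list can be passed straight to `subprocess.run`, which avoids shell quoting problems with URLs and typed text.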
Script reference
All runnable scripts use uv run --script with PEP 723 metadata and return JSON to stdout. Invoke them directly, not through python or python3. Every script (except daemon control) appends tabs_open: N and emits a warning field if N > 1.
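Because every script returns JSON on stdout, a caller can wrap invocation and parsing in one place. The sketch below assumes each script prints exactly one JSON object to stdout and sends any log noise to stderr; `run_json` is a hypothetical name, not something the skill ships.

```python
import json
import subprocess


def run_json(argv: list[str], timeout: float = 60.0) -> dict:
    """Run a command and parse its stdout as a single JSON object."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError as exc:
        # Surface the exit code and a short stderr excerpt for debugging.
        raise RuntimeError(
            f"{argv[0]} exited {proc.returncode} without JSON on stdout: "
            f"{proc.stderr.strip()[:200]}"
        ) from exc
```

For instance, `run_json(["scripts/state.py"])` would return the state of the persistent tab as a dict, provided the daemon can start.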
Leading browser options work on start_daemon.py and every script that auto-starts/attaches:
| Option | Purpose |
|---|---|
| --headless | Start a headless daemon. This is the default when no daemon is running. |
| --headed | Start a visible Chrome/Chromium window. If a daemon is already running in headless mode, stop it first. |
| --skill-profile | Use the isolated profile at ~/.cache/nodriver-skill/profile/. This is the default. |
| --user-profile | Use the user's Chrome profile root. Useful for existing logged-in state; requires that regular Chrome is not already locking the same profile. |
| --profile-directory NAME | Use a Chrome profile directory such as Default or Profile 1; implies --user-profile. |
| --user-data-dir PATH | Override the Chrome user-data root; implies --user-profile. |
| --no-sandbox | Disable Chrome's OS sandbox. Use only when Chrome cannot start in constrained environments such as PRoot/container/root setups. Do not use for normal system Chrome or the user's Chrome profile. |
Environment equivalents: NODRIVER_SKILL_MODE=headed|headless, NODRIVER_SKILL_PROFILE=skill|user, NODRIVER_CHROME_PROFILE_DIRECTORY="Profile 1", NODRIVER_CHROME_USER_DATA_DIR="/path/to/User Data", NODRIVER_CHROME_NO_SANDBOX=1.
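The environment equivalents can be assembled programmatically before launching a script. The following is a sketch under stated assumptions: `daemon_env` is a hypothetical helper, the variable names are the ones listed above, and whether the environment variables imply `--user-profile` the way the flags do is not stated here, so the caller sets the profile explicitly.

```python
import os

VALID_MODES = {"headed", "headless"}
VALID_PROFILES = {"skill", "user"}


def daemon_env(mode="headless", profile="skill", profile_directory=None,
               user_data_dir=None, no_sandbox=False, base=None):
    """Build an environment dict carrying the NODRIVER_* settings."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if profile not in VALID_PROFILES:
        raise ValueError(f"profile must be one of {sorted(VALID_PROFILES)}")
    # Copy so the caller's environment is never mutated.
    env = dict(os.environ if base is None else base)
    env["NODRIVER_SKILL_MODE"] = mode
    env["NODRIVER_SKILL_PROFILE"] = profile
    if profile_directory is not None:
        env["NODRIVER_CHROME_PROFILE_DIRECTORY"] = profile_directory
    if user_data_dir is not None:
        env["NODRIVER_CHROME_USER_DATA_DIR"] = user_data_dir
    if no_sandbox:
        # Only for PRoot/container/root setups where sandboxed Chrome cannot start.
        env["NODRIVER_CHROME_NO_SANDBOX"] = "1"
    return env
```

Passing the result as `env=` to `subprocess.run` keeps values with spaces, such as "Profile 1", intact without any shell quoting.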
Daemon control
| Script | Purpose | Output |
|---|---|---|
| start_daemon.py | Idempotent start. No-op if already running. Supports leading browser options. | {ok, pid, port, mode, profile, no_sandbox, already_running} |
| stop_daemon.py | Kill daemon, clean PID file + stale singleton locks. Fails safely if a live CDP browser exists but no safe PID can be resolved. | {ok, stopped} or {ok: false, error} |
| status.py | Daemon health + tab list. | {alive, pid, process: {uptime_s, rss_kb}, tabs: [...]} |
Navigation & state
| Script | Args | Purpose |
|---|---|---|
| nav.py | URL | Navigate the persistent tab. Returns the new URL/title/scroll. |
| state.py | (none) | Cheap status read of the current tab. No DOM mutation. |
| back.py | (none) | history.back() |
| forward.py | (none) | history.forward() |
| reload.py | (none) | location.reload() |
Interaction (the look-think-act primitives)
| Script | Args | Purpose |
|---|---|---|
| snapshot.py | (none) | Full page text + numbered interactive refs. Writes refs to /tmp/nodriver-skill/refs.json so click/type/press can resolve them. |
| click.py | REF | Click element by ref id from latest snapshot. |
| type.py | REF TEXT | Clear field and type. Dispatches input+change so React/Vue notice. |
| hover.py | REF | Move mouse to element center via CDP, triggering CSS :hover and mouseover/mouseenter events. |
| press.py | KEY or REF KEY | Send keyboard event (Enter, Tab, Escape, ArrowDown, single chars, ...). |
| select.py | REF VALUE or REF --index N | Select an option from a <select> dropdown by value, visible text, or index. Dispatches change. |
| scroll.py | up\|down\|top\|bottom\|N | Scroll viewport (N is pixels). |
| upload.py | REF FILE [FILE...] | Set files on an <input type="file"> by ref. Validates element type, uses CDP DOM.setFileInputFiles, dispatches change. |
| wait.py | SELECTOR [--text] [--timeout N] | Block until selector exists (or text appears with --text). Default 30s. |
| eval.py | JS_EXPR | Escape hatch: arbitrary JS expression. Multi-statement → wrap in IIFE. |
Tab visibility & hygiene
| Script | Args | Purpose |
|---|---|---|
| tabs.py | (none) | List ALL open tabs (index, url, title, target_id). Use this to see what's actually open. |
| close_tab.py | INDEX | Close one tab by 0-indexed position. Refuses to close index 0. |
| cleanup.py | (none) | Close every tab except tabs[0]. The "reset stray tabs" button. |
Misc
| Script | Args | Purpose |
|---|---|---|
| screenshot.py | [PATH] [--full] | PNG of viewport (or full scrollable page with --full). Default path /tmp/nodriver-skill/last.png. |
The snapshot/refs model
snapshot.py is the single most important script. It does three things:
- Walks the DOM for every interactive element (a[href], button, input, select, textarea, [role=button], [contenteditable], [onclick], ...)
- Assigns each a stable ref id r1, r2, ... and mutates the DOM by setting data-nd-ref="rN" on each. This gives a stable CSS selector ([data-nd-ref="r17"]) that survives subsequent queries.
- Writes the {ref: selector} map to /tmp/nodriver-skill/refs.json so click.py / type.py / press.py can look refs up.
Example output:
{
  "url": "https://example.com/login",
  "title": "Sign in",
  "text": "Sign in to your account...",
  "refs": [
    { "ref": "r1", "tag": "input", "type": "email", "name": "Email address", "visible": true, "bbox": [120, 200, 400, 40] },
    { "ref": "r2", "tag": "input", "type": "password", "name": "Password", ... },
    { "ref": "r3", "tag": "button", "type": "submit", "name": "Sign in", ... }
  ],
  "tabs_open": 1
}
To act on it:
scripts/type.py r1 "user@example.com"
scripts/type.py r2 "hunter2"
scripts/click.py r3
Refs go stale on navigation or significant SPA re-render. If click.py returns "ref no longer in DOM", just re-run snapshot.py and try again.
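Since ref ids change between snapshots, it is usually safer to look a ref up by its accessible name than to hard-code r3. The sketch below works on an already-parsed snapshot dict shaped like the example output above; `find_ref` and `SAMPLE_SNAPSHOT` are hypothetical names, and the sample is a trimmed copy of that example with the elided fields simply left out.

```python
SAMPLE_SNAPSHOT = {
    "url": "https://example.com/login",
    "title": "Sign in",
    "refs": [
        {"ref": "r1", "tag": "input", "type": "email", "name": "Email address", "visible": True},
        {"ref": "r2", "tag": "input", "type": "password", "name": "Password"},
        {"ref": "r3", "tag": "button", "type": "submit", "name": "Sign in"},
    ],
    "tabs_open": 1,
}


def find_ref(snapshot, name, tag=None):
    """Return the single ref whose name contains `name` (case-insensitive)."""
    needle = name.casefold()
    matches = [
        entry["ref"]
        for entry in snapshot.get("refs", [])
        if needle in str(entry.get("name", "")).casefold()
        and (tag is None or entry.get("tag") == tag)
        and entry.get("visible", True)  # a missing flag is treated as visible
    ]
    if len(matches) != 1:
        # Refuse to guess: zero or several candidates both need a human look.
        raise LookupError(f"expected one match for {name!r}, got {matches}")
    return matches[0]
```

Raising on ambiguous matches is a deliberate choice here: clicking the wrong one of two similar buttons is harder to recover from than re-reading the snapshot.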
Daemon lifecycle
The daemon is singleton-enforced via fcntl.flock on /tmp/nodriver-skill/start.lock. Five concurrent script invocations from a cold start will only ever spawn one Chromium.
- Auto-start: First call to any interaction script (nav, state, snapshot, ...) auto-starts the daemon if it's not running. You don't need to call start_daemon.py first unless you want to verify it manually or choose options like --headed --user-profile.
- Persists: The daemon runs with start_new_session=True so it survives the parent script exit. It will stay alive across all your turn boundaries until explicit shutdown.
- Explicit stop: stop_daemon.py resolves the daemon PID from the pid file or the CDP debug port, SIGTERMs it, SIGKILLs it after 2s if needed, and cleans the PID file and stale singleton locks. Run this at the end of any session that started the daemon.
- Port: 9222 by default. Override with NODRIVER_SKILL_PORT=9223 if something else holds 9222.
- Mode: headless by default. Use --headed for a visible window. You cannot change a running daemon from headless to headed; stop it first.
- Profile: isolated skill profile by default: ~/.cache/nodriver-skill/profile/ (cookies, localStorage, IndexedDB, etc.). Use --user-profile for the user's Chrome profile.
- Chrome binary: search order is CHROMIUM_PATH / CHROME_PATH env vars, then PATH binaries, then standard OS install paths, then the Playwright Chromium cache. Set CHROMIUM_PATH=/path/to/chrome to override.
- Sandbox: Chrome's sandbox is enabled by default. Only pass --no-sandbox for constrained environments where Chrome cannot start with the OS sandbox, such as PRoot/container/root setups.
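The singleton guarantee mentioned at the top of this section rests on an advisory file lock. The skill's actual locking code is not shown in this document, so the following is only a sketch of the general technique on a POSIX system: `try_start_lock` is a hypothetical helper, and the non-blocking mode is an assumption made so the behavior is easy to observe.

```python
import fcntl
import os
from typing import Optional


def try_start_lock(path: str) -> Optional[int]:
    """Try to take an exclusive flock on `path`.

    Returns the open file descriptor on success, or None if another
    holder already has the lock. Closing the descriptor releases it.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

Only the caller that gets a descriptor back would go on to spawn Chromium; the others would wait and then attach to the daemon that the winner started.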
Tab hygiene (READ THIS)
In default headless mode there is no visible window. In headed mode you can see the browser, but the tab contract is still enforced by script output. Some sites open new tabs you didn't ask for: target="_blank" links, window.open() calls, popup ads, OAuth redirects.
The contract is one tab. If tabs_open > 1 in any script's output (and the warning field is set):
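scripts/tabs.py            # list what is actually open
scripts/cleanup.py         # close everything except tabs[0]
scripts/close_tab.py 2     # or close a single stray tab by index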
Don't ignore the warning. Tabs accumulate. 30 stale tabs = ~2 GB of RAM and a confused state machine.
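A caller that parses script output can enforce the one-tab contract mechanically. This is a minimal sketch assuming the result has already been parsed into a dict; `tab_followup` is a hypothetical helper, and results without a tabs_open field (daemon control scripts) are treated as clean.

```python
def tab_followup(result: dict) -> list[list[str]]:
    """Suggest follow-up commands when a script result reports stray tabs."""
    tabs_open = int(result.get("tabs_open", 1))
    if tabs_open <= 1:
        return []
    # Inspect first so the stray tabs can be reported, then reset to tabs[0].
    return [["scripts/tabs.py"], ["scripts/cleanup.py"]]
```

Running the check after every call keeps stray tabs from accumulating between turns.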
Footguns
- Refs go stale on re-render. SPAs that re-mount components on route change will lose data-nd-ref attributes. Re-run snapshot.py after every navigation or significant action.
- Concurrent navigations on the same tab race. Multiple processes can attach simultaneously, but two nav.py calls to different URLs at the same time will fight. Serialize them.
- Daemon outlives the session. If you forget stop_daemon.py, Chromium keeps running, silently eating ~190 MB of RAM until you explicitly stop it or reboot. Stop it when you're done.
- User Chrome profile can be locked. --user-profile uses the real Chrome profile root, so close normal Chrome first if startup fails or if Chrome attaches to the existing app instead of opening the CDP daemon.
- --no-sandbox is not normal. It weakens browser isolation and shows Chrome's unsupported-flag banner in headed mode. Use it only for PRoot/container/root environments where normal sandboxed Chrome cannot start.
- wait.py polls every 250ms. Don't use it for sub-second timing-sensitive stuff.
- type.py clears the field first. If you need to append, read the existing value with eval.py first, then type the combined text.
- eval.py takes ONE expression, not statements. Multi-statement code: eval.py '(() => { let x = 1; x++; return x; })()'.
- Do not delete the profile without explicit user approval. The isolated profile at ~/.cache/nodriver-skill/profile/ stores cookies, login sessions, and other persistent data. Never clear, reset, or remove it unless the user explicitly asks.
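The eval.py footgun in the list above is easy to avoid by wrapping statements before passing them on. A minimal sketch, with `as_expression` and `eval_argv` as hypothetical helper names; the statements must end with an explicit return for the IIFE to produce a value.

```python
def as_expression(js: str) -> str:
    """Wrap multi-statement JS in an arrow-function IIFE so it is one expression."""
    body = js.strip()
    return f"(() => {{ {body} }})()"


def eval_argv(js: str) -> list[str]:
    """Build the argv for scripts/eval.py with the wrapped expression."""
    # Passing a list to subprocess avoids shell quoting of the JS source.
    return ["scripts/eval.py", as_expression(js)]
```

For example, `as_expression("let x = 1; x++; return x;")` yields the same string as the IIFE shown in the eval.py bullet.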
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| daemon won't start | Chrome binary missing | Set CHROMIUM_PATH=/path/to/chrome or apt install chromium |
| port 9222 in use by a non-CDP process | Another tool holds 9222 | NODRIVER_SKILL_PORT=9223 ./scripts/start_daemon.py |
| PID is a chromium process but isn't responding | Crashed daemon left a zombie | ./scripts/stop_daemon.py to clean up, then retry |
| daemon is already running in headless/headed mode | A running daemon cannot change visibility mode | ./scripts/stop_daemon.py, then restart with --headed or --headless |
| daemon is already running with ... profile | A running daemon cannot change profile root | ./scripts/stop_daemon.py, then restart with --user-profile or --skill-profile |
| daemon is already running with Chrome sandbox ... | A running daemon cannot change sandbox flags | ./scripts/stop_daemon.py, then restart with or without --no-sandbox |
| Chrome says unsupported command-line flag: --no-sandbox | You started headed Chrome with sandbox disabled | Stop the daemon and restart without --no-sandbox unless you are in PRoot/container/root |
| Chrome user data dir does not exist | The detected user profile root is missing | Use --user-data-dir PATH or fall back to --skill-profile |
| no snapshot yet — run snapshot.py first | click.py called without prior snapshot | Run snapshot.py first |
| ref no longer in DOM | Page navigated/re-rendered | Re-run snapshot.py, get the new ref |
| tabs_open: 5, warning: ... | Site opened popups/new tabs | cleanup.py closes everything except tabs[0] |
| Installed N packages log noise on first run | uv resolving deps for the inline script | Normal; only happens once per skill version |
| Hardlink errors during install | PRoot/container without hardlink support | Already mitigated: UV_LINK_MODE=copy is set automatically |
| Chrome CDP daemon is alive ... no safe PID could be resolved | Stale/missing PID file and PID discovery failed | Use lsof -nP -iTCP:9222 -sTCP:LISTEN, inspect the process, then stop only that Chrome process |
Verifying it works
scripts/stop_daemon.py
scripts/nav.py https://example.com
scripts/state.py
scripts/snapshot.py | head -20
scripts/eval.py "document.title"
scripts/screenshot.py /tmp/test.png && ls -la /tmp/test.png
scripts/status.py
scripts/stop_daemon.py
If state.py, the call that follows nav.py, reports the same URL that nav.py set, the persistent-tab invariant is working and every other script can rely on it.