| name | gsd-browser |
| description | Native Rust browser automation CLI for AI agents. Use when the user needs to interact with websites: navigating pages, filling forms, clicking buttons, taking screenshots, sharing a live browser view, narrating browser actions, extracting structured data, running assertions, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "show me the browser", "share the screen", "pause the browser", "step through this", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", "visual regression test", "check for prompt injection", or any task requiring programmatic web interaction. |
| allowed-tools | Bash(gsd-browser:*), Bash(gsd-browser *) |
Browser Automation with gsd-browser
Critical Rules
- The daemon auto-starts on browser commands. daemon health only reports state; it does not start a session. Use daemon start only when you want to pre-warm the daemon or verify its lifecycle explicitly.
- Always re-snapshot after page changes. Refs are versioned (@v1:e1). After navigation, form submission, or dynamic content loading, old refs are stale. Run gsd-browser snapshot to get fresh refs.
- Use --json when parsing output. Use text mode when reading output yourself, and --json when you need to extract values programmatically (e.g., checking assertion results, parsing snapshot refs).
- Positional args have no flag prefix. Commands like click, type, and hover take positional args; do NOT add --selector. See the exact syntax in the command reference below.
- Use batch for atomic multi-step flows. Batch reduces round trips and keeps pass/fail checks in one call. Use separate commands when you need intermediate output (e.g., snapshot to discover refs).
- Use view when the user wants to watch or direct the browser. The live viewer is an authenticated local workbench with Control, Annotate, Record, and Sensitive modes. Keep CLI commands on the same named session.
Core Workflow
Every browser automation follows this pattern:
- Navigate: gsd-browser navigate <url>
- Snapshot: gsd-browser snapshot (get versioned refs like @v1:e1, @v1:e2)
- Interact: Use refs to click, fill, hover
- Re-snapshot: After navigation or DOM changes, get fresh refs
gsd-browser navigate https://example.com/form
gsd-browser snapshot
gsd-browser fill-ref @v1:e1 "user@example.com"
gsd-browser fill-ref @v1:e2 "password123"
gsd-browser click-ref @v1:e3
gsd-browser wait-for --condition network_idle
gsd-browser snapshot
Command Chaining
Commands can be chained with && in a single shell invocation. Browser state also persists across separate invocations through the background daemon when you stay on the same session.
gsd-browser navigate https://example.com && gsd-browser wait-for --condition network_idle && gsd-browser snapshot
gsd-browser fill-ref @v1:e1 "user@example.com" && gsd-browser fill-ref @v1:e2 "password123" && gsd-browser click-ref @v1:e3
When to chain: Use && when you don't need intermediate output. Run commands separately when you need to parse output first (e.g., snapshot to discover refs, then interact).
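When no intermediate output is needed, the chain can be wrapped in a small shell function. The following is a minimal sketch, not part of gsd-browser itself: open_and_snapshot is a hypothetical helper name, and the sketch assumes gsd-browser exits non-zero on failure so that && stops the chain.

```shell
# Hypothetical helper: navigate, wait for the network to settle, then snapshot.
# Assumption: each gsd-browser command exits non-zero on failure,
# so `&&` stops the chain at the first failing step.
open_and_snapshot() {
  gsd-browser navigate "$1" &&
    gsd-browser wait-for --condition network_idle &&
    gsd-browser snapshot
}

# Usage:
#   open_and_snapshot https://example.com
```

The snapshot output is the only thing printed last, so it can be read for fresh refs before the next interaction.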
Command Reference
Argument syntax: <arg> = required positional, [arg] = optional positional, --flag = named option. Do NOT add -- prefix to positional args.
Navigation
gsd-browser navigate <url>
gsd-browser back
gsd-browser forward
gsd-browser reload
Interaction
All selectors are positional — do NOT use --selector.
gsd-browser click <selector>
gsd-browser click --x 100 --y 200
gsd-browser type <selector> <text>
gsd-browser type <selector> <text> --slowly
gsd-browser type <selector> <text> --clear-first
gsd-browser type <selector> <text> --submit
gsd-browser press <key>
gsd-browser hover <selector>
gsd-browser scroll --direction down
gsd-browser scroll --direction up --amount 500
gsd-browser select-option <selector> <option>
gsd-browser set-checked <selector> --checked
gsd-browser drag <source-selector> <target-selector>
gsd-browser upload-file <selector> <file>...
gsd-browser set-viewport --preset mobile
gsd-browser set-viewport --width 1920 --height 1080
Snapshot & Refs
Refs are versioned (@v1:e1, @v2:e3). The version increments with each snapshot. Old refs become stale after page changes, so always re-snapshot.
gsd-browser snapshot
gsd-browser snapshot --selector "form"
gsd-browser snapshot --mode <mode>
gsd-browser snapshot --limit 80
gsd-browser get-ref <ref>
gsd-browser click-ref <ref>
gsd-browser hover-ref <ref>
gsd-browser fill-ref <ref> <text>
Snapshot modes (--mode):
| Mode | What it captures |
|---|---|
| interactive | Buttons, inputs, links, selects (default) |
| form | Form fields with labels and current values |
| dialog | Elements inside open dialogs/modals |
| navigation | Links and nav elements |
| errors | Error messages, validation warnings |
| headings | Heading elements (h1-h6) for page structure |
| visible_only | All visible elements regardless of interactivity |
Inspection
gsd-browser accessibility-tree
gsd-browser find --text "Sign In"
gsd-browser find --role button
gsd-browser find --selector ".my-class"
gsd-browser find --role link --limit 50
gsd-browser page-source
gsd-browser page-source --selector "main"
gsd-browser eval '<js-expression>'
Assertions
Run explicit pass/fail checks against the current page state. Prefer this over inferring success from output.
gsd-browser assert --checks '[
{"kind": "url_contains", "text": "/dashboard"},
{"kind": "text_visible", "text": "Welcome"},
{"kind": "selector_visible", "selector": "#user-menu"},
{"kind": "value_equals", "selector": "input[name=email]", "value": "user@test.com"},
{"kind": "no_console_errors"},
{"kind": "no_failed_requests"}
]'
Assertion kinds (17): url_contains, text_visible, text_hidden, selector_visible, selector_hidden, value_equals, checked, no_console_errors, no_failed_requests, request_url_seen, response_status, console_message_matches, network_count, console_count, element_count, no_console_errors_since, no_failed_requests_since.
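When the expected values live in shell variables, the --checks payload can be assembled with printf rather than hand-edited quotes. This is a rough sketch under stated assumptions: login_checks is a hypothetical helper, it uses only the url_contains and text_visible kinds listed above, and it performs no JSON escaping.

```shell
# Hypothetical helper: build a --checks payload from two values.
# $1 = substring expected in the URL, $2 = text expected to be visible.
# Limitation: values must not contain double quotes or backslashes,
# because this sketch does no JSON escaping.
login_checks() {
  printf '[{"kind": "url_contains", "text": "%s"}, {"kind": "text_visible", "text": "%s"}]' "$1" "$2"
}

# Usage:
#   gsd-browser assert --checks "$(login_checks /dashboard Welcome)"
```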
Batch Execution
Execute multiple steps in one call to reduce round trips. Stops on first failure by default.
gsd-browser batch --steps '[
{"action": "navigate", "url": "https://example.com"},
{"action": "wait_for", "condition": "network_idle"},
{"action": "click", "selector": "#login-btn"},
{"action": "type", "selector": "input[name=email]", "text": "user@test.com"},
{"action": "type", "selector": "input[name=password]", "text": "secret", "submit": true},
{"action": "assert", "checks": [{"kind": "url_contains", "text": "/dashboard"}]}
]'
gsd-browser batch --steps '[...]' --summary-only
Batch actions: navigate, click, type, key_press, wait_for, assert, click_ref, fill_ref.
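Longer step lists are easier to maintain when generated from a here-document. The sketch below is illustrative only: login_steps is a hypothetical helper, the /login path and the field selectors are assumptions about the target site, and no JSON escaping is applied to the arguments.

```shell
# Hypothetical helper: emit batch steps for a login flow.
# $1 = base URL, $2 = email, $3 = password.
# Assumptions: the site serves its form at <base>/login and uses the
# input[name=...] selectors below; arguments contain no quotes or backslashes.
login_steps() {
  cat <<EOF
[
  {"action": "navigate", "url": "$1/login"},
  {"action": "type", "selector": "input[name=email]", "text": "$2"},
  {"action": "type", "selector": "input[name=password]", "text": "$3", "submit": true},
  {"action": "assert", "checks": [{"kind": "url_contains", "text": "/dashboard"}]}
]
EOF
}

# Usage:
#   gsd-browser batch --steps "$(login_steps https://example.com "$EMAIL" "$PASSWORD")"
```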
Wait Conditions
gsd-browser wait-for --condition selector_visible --value "#content"
gsd-browser wait-for --condition selector_hidden --value ".spinner"
gsd-browser wait-for --condition url_contains --value "/dashboard"
gsd-browser wait-for --condition network_idle
gsd-browser wait-for --condition delay --value 2000
gsd-browser wait-for --condition text_visible --value "Success"
gsd-browser wait-for --condition text_hidden --value "Loading"
gsd-browser wait-for --condition request_completed --value "/api/data"
gsd-browser wait-for --condition console_message --value "ready"
gsd-browser wait-for --condition element_count --value ".item" --threshold ">=5"
gsd-browser wait-for --condition region_stable --value "#content"
gsd-browser wait-for --condition selector_visible --value "#slow" --timeout 30000
Forms (Smart Fill)
Analyze forms and fill them by field label, name, placeholder, or aria-label — no selectors needed.
gsd-browser analyze-form
gsd-browser analyze-form --selector "#signup-form"
gsd-browser fill-form --values '{"Email": "a@b.com", "Password": "secret", "Country": "US"}'
gsd-browser fill-form --values '{"Email": "a@b.com"}' --submit
gsd-browser fill-form --values '{"Email": "a@b.com"}' --selector "#login-form"
Intent-Based Interaction
Find and act on elements by semantic intent — no selectors or refs needed. Intents are predefined categories, not free-form text.
gsd-browser find-best --intent submit_form
gsd-browser find-best --intent accept_cookies
gsd-browser find-best --intent primary_cta --scope "#modal"
gsd-browser act --intent submit_form
gsd-browser act --intent accept_cookies
gsd-browser act --intent fill_email
Intents (15):
| Intent | Action | Description |
|---|---|---|
| submit_form | click | Submit buttons, form actions |
| close_dialog | click | Modal/dialog close buttons |
| primary_cta | click | Primary call-to-action elements |
| search_field | focus | Search inputs and searchboxes |
| next_step | click | Next/continue/proceed buttons |
| dismiss | click | Dismiss overlays, banners, toasts |
| auth_action | click | Login/signup/register buttons |
| back_navigation | click | Back/previous navigation links |
| fill_email | focus | Email input fields |
| fill_password | focus | Password input fields |
| fill_username | focus | Username/login input fields |
| accept_cookies | click | Cookie consent accept buttons |
| main_content | click | Main content area (<main>, <article>, semantic markup required) |
| pagination_next | click | Next page in pagination |
| pagination_prev | click | Previous page in pagination |
Pages & Frames
Page IDs are positional; do NOT use --id. Frames are selected with --name, --url-pattern, or --index.
gsd-browser list-pages
gsd-browser switch-page <id>
gsd-browser close-page <id>
gsd-browser list-frames
gsd-browser select-frame --name "iframe-name"
gsd-browser select-frame --url-pattern "embed"
gsd-browser select-frame --index 0
gsd-browser select-frame --name main
Diagnostics
gsd-browser console
gsd-browser console --no-clear
gsd-browser network
gsd-browser dialog
gsd-browser timeline
gsd-browser session-summary
gsd-browser debug-bundle
Live Viewer & Workbench
The live viewer is a localhost screen-sharing and control surface for the active browser session. It prints or opens a tokenized URL bound to the session, viewer id, local origin, expiry, and capabilities. The viewer displays live frames, narrated action history, ref overlays, target rings, click ripples, failure markers, and page-following across navigation or tab changes.
gsd-browser view
gsd-browser view --print-only
gsd-browser view --interactive
gsd-browser view --history
gsd-browser view --history --print-only
gsd-browser goal "Find the checkout button"
gsd-browser goal --clear
gsd-browser control-state
gsd-browser takeover
gsd-browser release-control
gsd-browser pause
gsd-browser resume
gsd-browser step
gsd-browser abort
gsd-browser sensitive-on
gsd-browser sensitive-off
Use one named session for the whole shared-screen flow:
gsd-browser --session demo navigate https://example.com
gsd-browser --session demo view --print-only
gsd-browser --session demo click "h1"
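To avoid forgetting --session on a follow-up command, the session name can be pinned in a tiny wrapper. This is a sketch, assuming a session called demo as in the example above; the function name demo is arbitrary.

```shell
# Hypothetical wrapper: every call goes to the "demo" session.
demo() {
  gsd-browser --session demo "$@"
}

# Usage:
#   demo navigate https://example.com
#   demo view --print-only
#   demo click "h1"
```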
Viewer controls:
| Control | Effect |
|---|---|
| Control | Forwards pointer, wheel, keyboard, text, and paste input to Chrome |
| Annotate | Creates point annotations without forwarding page input |
| Record | Starts or stops a local recording bundle |
| Sensitive | Keeps local viewer control active and applies redaction policy |
| Pause | Blocks agent page input |
| Resume | Allows actions to continue |
| Step | Allows one action, then returns to paused mode |
| Abort | Aborts the next gated action |
| Refs overlay | Shows or hides target boxes/labels |
Keyboard shortcuts: Space pauses/resumes, Right Arrow steps, Escape aborts, R toggles refs.
Risky visible targets such as destructive labels, payment actions, OAuth grants, credential entry, file transfer, production/admin surfaces, and cross-origin navigation produce an approval banner. Approval dispatches the exact pending command stored by the daemon.
Annotations:
gsd-browser annotations
gsd-browser annotation-get <id>
gsd-browser annotation-clear <id>
gsd-browser annotation-clear --all
gsd-browser annotation-resolve <id>
gsd-browser annotation-export --output annotations.json
gsd-browser annotation-request "Select the button to restyle"
Recording bundles:
gsd-browser record-start --name checkout-bug
gsd-browser record-stop
gsd-browser record-pause
gsd-browser record-resume
gsd-browser recordings
gsd-browser recording-get <id>
gsd-browser recording-export <id> --output <path>
gsd-browser recording-discard <id>
gsd-browser recording-validate <id-or-path> --json
Viewer annotations and recording bundles are local daemon artifacts. Recording manifests use BrowserArtifactBundleV1, ordered JSONL events, redaction metadata, and local artifact directories under the browser state path.
Use --no-narration-delay for fast agent-only runs that keep narration events/history without lead-time sleeps:
gsd-browser --session demo --no-narration-delay click "h1"
Visual
gsd-browser screenshot
gsd-browser screenshot --output page.png
gsd-browser screenshot --format png
gsd-browser screenshot --full-page
gsd-browser screenshot --selector "#hero"
gsd-browser screenshot --quality 50
gsd-browser zoom-region --x 100 --y 200 --width 400 --height 300
gsd-browser zoom-region --x 0 --y 0 --width 200 --height 200 --scale 3
gsd-browser save-pdf
gsd-browser save-pdf --output report.pdf
gsd-browser save-pdf --format Letter
Visual Regression
gsd-browser visual-diff --name "homepage"
gsd-browser visual-diff --name "homepage"
gsd-browser visual-diff --name "homepage" --threshold 0.05
gsd-browser visual-diff --selector "#hero" --name "hero"
gsd-browser visual-diff --name "homepage" --update-baseline
Structured Data Extraction
gsd-browser extract --schema '{
"type": "object",
"properties": {
"title": {"_selector": "h1", "_attribute": "textContent"},
"price": {"_selector": ".price", "_attribute": "textContent"},
"image": {"_selector": "img.product", "_attribute": "src"}
}
}'
gsd-browser extract --selector ".product-card" --multiple --schema '{
"type": "object",
"properties": {
"name": {"_selector": "h3", "_attribute": "textContent"},
"price": {"_selector": ".price", "_attribute": "textContent"}
}
}'
Network Mocking
gsd-browser mock-route --url "**/api/users*" --body '[{"name":"Alice"}]' --status 200
gsd-browser mock-route --url "**/api/data" --body '{"ok":true}' --delay 3000
gsd-browser block-urls "**/analytics*" "**/ads*"
gsd-browser clear-routes
Device Emulation
gsd-browser emulate-device <device-name>
gsd-browser emulate-device "iPhone 15"
gsd-browser emulate-device "Pixel 7"
gsd-browser emulate-device "iPad Pro 11"
gsd-browser emulate-device list
Warning: Device emulation recreates the browser context — current page state and cookies are lost.
State & Auth
gsd-browser save-state --name "logged-in"
gsd-browser restore-state --name "logged-in"
gsd-browser vault-save --profile github --url https://github.com/login \
--username user --password "secret"
gsd-browser vault-login --profile github
gsd-browser vault-list
Vault encryption requires the GSD_BROWSER_VAULT_KEY environment variable to be set before the daemon starts. If the daemon is already running, stop it first, set the variable, then run your vault command.
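The stop, set, run order described above might be scripted as follows. This is a sketch under assumptions: with_vault_key is a hypothetical helper, and it relies on the daemon auto-starting on the next command and inheriting the exported variable.

```shell
# Hypothetical helper: make sure the daemon starts with the vault key set.
# $1 = encryption key; remaining arguments = the vault command to run.
# Assumption: the next gsd-browser command auto-starts a fresh daemon
# that inherits GSD_BROWSER_VAULT_KEY from this shell.
with_vault_key() {
  key=$1
  shift
  gsd-browser daemon stop          # stop any daemon started without the key
  GSD_BROWSER_VAULT_KEY=$key
  export GSD_BROWSER_VAULT_KEY     # must be exported before the daemon starts
  gsd-browser "$@"
}

# Usage:
#   with_vault_key "$MY_KEY" vault-login --profile github
```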
Tracing & Recording
gsd-browser trace-start
gsd-browser trace-start --name "checkout-flow"
gsd-browser trace-stop
gsd-browser trace-stop --name "checkout.json"
gsd-browser har-export
gsd-browser har-export --filename "session.har"
gsd-browser generate-test
gsd-browser generate-test --name "login-flow" --output tests/login.spec.ts
Security
gsd-browser check-injection
gsd-browser check-injection --include-hidden
Action Cache
Reduce repeated element lookups by caching intent-to-selector mappings.
gsd-browser action-cache --action stats
gsd-browser action-cache --action get --intent submit_form
gsd-browser action-cache --action put --intent submit_form --selector "#submit-btn" --score 0.95
gsd-browser action-cache --action clear
Daemon Management
The daemon auto-starts on browser commands. These are for explicit lifecycle control.
gsd-browser daemon stop
gsd-browser daemon health
gsd-browser daemon start
gsd-browser update
Global Options
Available on all commands:
| Flag | Description |
|---|---|
| --json | Output as JSON (use when parsing output programmatically) |
| --browser-path <path> | Path to Chrome/Chromium binary |
| --cdp-url <url> | Attach to an already-running Chrome (e.g. http://localhost:9222) |
| --session <name> | Named session for parallel browser instances |
| --no-narration-delay | Skip narration lead-time sleeps while keeping history/events |
Error Recovery
Stale refs
Error: resolve_ref: JS evaluation failed: ref @v1:e3 not found
Refs become stale after page changes. Fix: re-snapshot and use the new version.
gsd-browser snapshot
gsd-browser click-ref @v2:e1
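A script that stores refs can compare their version against the latest snapshot before reusing them. The helper below is a minimal sketch that assumes refs always have the @v<version>:e<index> shape shown in this document; ref_version is a hypothetical name.

```shell
# Hypothetical helper: print the snapshot version encoded in a ref.
# Assumes the ref looks like @v<version>:e<index>, e.g. @v2:e1.
ref_version() {
  v=${1#@v}                   # strip the leading "@v"
  printf '%s\n' "${v%%:*}"    # keep everything before the first ":"
}

# Usage:
#   ref_version @v2:e1    # prints 2
```

If the printed version is lower than the version of the most recent snapshot, the ref should be treated as stale.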
Click/type timeouts
Error: click timed out after 10s for: #submit-btn
The element may not be visible, may be behind an overlay, or may not exist. Try:
gsd-browser find --selector "#submit-btn"
gsd-browser scroll --direction down
gsd-browser wait-for --condition selector_visible --value "#submit-btn"
gsd-browser click "#submit-btn"
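The wait-then-click part of this recovery can be folded into one helper. This is a sketch only: click_when_visible is a hypothetical name, the default of 30000 mirrors the --timeout value used in the Wait Conditions examples, and it assumes wait-for exits non-zero when the condition is not met in time.

```shell
# Hypothetical helper: wait until a selector is visible, then click it.
# $1 = selector, $2 = optional --timeout value (defaults to 30000,
# the value used in the wait-for examples).
# Assumption: wait-for exits non-zero on timeout, so the click is skipped.
click_when_visible() {
  gsd-browser wait-for --condition selector_visible --value "$1" --timeout "${2:-30000}" &&
    gsd-browser click "$1"
}

# Usage:
#   click_when_visible "#submit-btn"
```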
Empty console/network logs
Console and network buffers are reset on each navigation. If you need logs from a specific action, read them before navigating away:
gsd-browser navigate https://example.com
gsd-browser eval "fetch('/api/data')"
gsd-browser network
Cookie banners / overlays blocking interaction
Many sites show consent banners that block clicks. Dismiss them first:
gsd-browser act --intent accept_cookies
gsd-browser act --intent dismiss
Session is stopped, unhealthy, or opens a fresh blank page
If daemon health reports stopped or unhealthy, or a named session no longer has the page you expected, that session does not currently map to a live daemon/browser pair.
gsd-browser --session site1 daemon health
gsd-browser --session site1 daemon stop
gsd-browser --session site1 navigate https://example.com
Use the same --session value on every follow-up command. batch is still useful for atomic flows, but separate invocations are supported when the session is healthy.
Daemon won't start
Error: daemon did not start within 10s
Usually the session is unhealthy, startup exited early, or browser launch state is stale. Fix:
gsd-browser daemon health
gsd-browser daemon stop
gsd-browser daemon start
Common Patterns
Form Submission
gsd-browser navigate https://example.com/signup
gsd-browser analyze-form
gsd-browser fill-form --values '{"Full Name": "Jane Doe", "Email": "jane@example.com", "State": "California"}' --submit
gsd-browser wait-for --condition network_idle
gsd-browser assert --checks '[{"kind": "text_visible", "text": "Welcome"}]'
Login Flow (Refs)
gsd-browser navigate https://app.example.com/login
gsd-browser act --intent accept_cookies
gsd-browser snapshot
gsd-browser fill-ref @v1:e1 "$USERNAME"
gsd-browser fill-ref @v1:e2 "$PASSWORD"
gsd-browser click-ref @v1:e3
gsd-browser wait-for --condition url_contains --value "/dashboard"
gsd-browser save-state --name "myapp-auth"
Login Flow (Vault)
gsd-browser vault-save --profile myapp \
--url https://app.example.com/login \
--username user@example.com \
--password "$PASSWORD"
gsd-browser vault-login --profile myapp
gsd-browser wait-for --condition url_contains --value "/dashboard"
Reuse Saved Auth
gsd-browser restore-state --name "myapp-auth"
gsd-browser navigate https://app.example.com/dashboard
Data Scraping
gsd-browser navigate https://example.com/products
gsd-browser extract --selector ".product" --multiple --schema '{
"type": "object",
"properties": {
"name": {"_selector": ".title", "_attribute": "textContent"},
"price": {"_selector": ".price", "_attribute": "textContent"},
"link": {"_selector": "a", "_attribute": "href"}
}
}'
Visual Regression Testing
gsd-browser navigate https://example.com
gsd-browser visual-diff --name "home-page"
gsd-browser navigate https://example.com
gsd-browser visual-diff --name "home-page"
Network Mocking for Testing
gsd-browser mock-route --url "**/api/users" --body '{"error":"server error"}' --status 500
gsd-browser navigate https://app.example.com
gsd-browser assert --checks '[{"kind": "text_visible", "text": "Something went wrong"}]'
gsd-browser clear-routes
Parallel Sessions
gsd-browser --session site1 navigate https://site-a.com
gsd-browser --session site2 navigate https://site-b.com
gsd-browser --session site1 snapshot
gsd-browser --session site2 snapshot
gsd-browser --session site1 daemon stop
gsd-browser --session site2 daemon stop
Performance Audit
gsd-browser navigate https://example.com
gsd-browser trace-start --name "perf-audit"
gsd-browser trace-stop --name "perf-audit.json"
gsd-browser har-export --filename "network.har"
Prompt Injection Scanning
gsd-browser navigate https://untrusted-page.com
gsd-browser check-injection
Configuration
Config Files (TOML)
gsd-browser merges config from five layers, listed here from lowest to highest priority:
- Compiled defaults
- User config: ~/.gsd-browser/config.toml
- Project config: ./gsd-browser.toml
- Environment variables: GSD_BROWSER_*
- CLI flags (highest priority)
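The precedence listed above can be pictured as "the last layer that sets a value wins". The function below only illustrates that rule; it is not how gsd-browser is implemented, and resolve_setting is a hypothetical name.

```shell
# Illustration only: choose a setting from layered sources.
# Arguments are ordered lowest to highest priority:
#   default, user config, project config, environment, CLI flag.
# An empty string means "this layer did not set the value".
resolve_setting() {
  result=
  for layer in "$@"; do
    if [ -n "$layer" ]; then
      result=$layer    # a later, higher-priority layer overrides earlier ones
    fi
  done
  printf '%s\n' "$result"
}

# Usage:
#   resolve_setting 80 "" 90 "" ""    # project config wins over the default
```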
Example gsd-browser.toml:
[browser]
path = "/usr/bin/chromium"
[daemon]
port = 9222
host = "127.0.0.1"
[screenshot]
quality = 90
format = "png"
full_page = false
[settle]
timeout_ms = 500
poll_ms = 40
quiet_window_ms = 100
[logs]
max_buffer_size = 1000
[artifacts]
dir = "./browser-artifacts"
[timeline]
max_entries = 500
Environment Variables
Supported config overrides use GSD_BROWSER_<SECTION>_<FIELD> naming:
GSD_BROWSER_BROWSER_PATH=/usr/bin/chromium
GSD_BROWSER_DAEMON_PORT=9223
GSD_BROWSER_SCREENSHOT_QUALITY=90
GSD_BROWSER_SETTLE_TIMEOUT_MS=1000
GSD_BROWSER_VAULT_KEY=your-encryption-key
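The naming rule can be applied mechanically to a TOML section and field. This is a small sketch of that mapping; config_env_name is a hypothetical helper, and whether a given field is actually overridable this way is not confirmed here.

```shell
# Hypothetical helper: derive the override variable name for a config field.
# $1 = TOML section, $2 = field name.
config_env_name() {
  printf 'GSD_BROWSER_%s_%s\n' "$1" "$2" | tr '[:lower:]' '[:upper:]'
}

# Usage:
#   config_env_name screenshot quality    # prints GSD_BROWSER_SCREENSHOT_QUALITY
```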
Session Cleanup
Always stop the daemon when done to avoid leaked Chrome processes:
gsd-browser daemon stop
For parallel sessions:
gsd-browser --session agent1 daemon stop
gsd-browser --session agent2 daemon stop
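In scripts, cleanup is easy to skip when an earlier step fails. One possible approach is a shell EXIT trap; the sketch below uses a hypothetical with_session_cleanup wrapper and only the daemon stop command shown above.

```shell
# Hypothetical wrapper: run a command, then stop the session's daemon
# even if the command fails.
# $1 = session name; remaining arguments = the command to run.
# The body is a subshell so the EXIT trap fires when the wrapper finishes.
with_session_cleanup() (
  session=$1
  shift
  trap 'gsd-browser --session "$session" daemon stop' EXIT
  "$@"
)

# Usage:
#   with_session_cleanup agent1 gsd-browser --session agent1 navigate https://example.com
```

The wrapper returns the exit status of the wrapped command, so callers can still detect failures.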