| name | agent-browser |
| description | Headless browser automation CLI optimized for AI agents. Use when automating multi-step browser workflows, web scraping, form filling, or testing SPAs. Uses accessibility tree snapshots with ref-based element selection for deterministic actions. Requires `agent-browser` CLI (npm install -g agent-browser). |
Agent Browser Skill
Fast browser automation using accessibility tree snapshots with refs for deterministic element selection.
Core Workflow
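agent-browser open https://example.com      # Navigate to the page
agent-browser snapshot -i --json            # List interactive elements and their refs
agent-browser click @e2                     # Act on elements by ref
agent-browser fill @e3 "text"
agent-browser snapshot -i --json            # Re-snapshot after the page changes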
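The same loop can be driven from a script. The sketch below is one possible wrapper, not part of agent-browser itself: the helper names (`agent_browser`, `snapshot_refs`, `open_and_snapshot`, `act_and_resnapshot`) and the injectable `runner` parameter are assumptions made for illustration, and it relies only on the commands and the snapshot JSON shape shown in this document.

```python
import json
import subprocess


def agent_browser(*args, runner=subprocess.run):
    """Run one agent-browser command and return its stdout as text."""
    # check=True raises CalledProcessError on a non-zero exit status.
    done = runner(["agent-browser", *args], capture_output=True, text=True, check=True)
    return done.stdout


def snapshot_refs(runner=subprocess.run):
    """Take an interactive JSON snapshot and return its refs mapping."""
    payload = json.loads(agent_browser("snapshot", "-i", "--json", runner=runner))
    if not payload.get("success"):
        raise RuntimeError("snapshot did not report success")
    return payload["data"]["refs"]


def open_and_snapshot(url, runner=subprocess.run):
    """Open the page, then read the refs it exposes."""
    agent_browser("open", url, runner=runner)
    return snapshot_refs(runner=runner)


def act_and_resnapshot(*action, runner=subprocess.run):
    """Run one interaction, then refresh the refs in case the page changed."""
    agent_browser(*action, runner=runner)
    return snapshot_refs(runner=runner)
```

A typical call sequence would be `refs = open_and_snapshot("https://example.com")` followed by `refs = act_and_resnapshot("click", "@e2")`. Passing `runner` explicitly makes the helpers easy to exercise with a stub when the CLI is not installed.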
Key Commands
Navigation
agent-browser open <url>
agent-browser back      # also: forward, reload, close (one subcommand per call)
Snapshot (Always use -i --json)
agent-browser snapshot -i --json              # interactive elements only
agent-browser snapshot -i -c -d 5 --json      # compact output, tree depth limited to 5
agent-browser snapshot -s "#main" -i          # scoped to a CSS selector
Interactions (Ref-based)
agent-browser click @e2
agent-browser fill @e3 "text"     # clear the field, then enter text
agent-browser type @e3 "text"     # type into the field without clearing it
agent-browser hover @e4
agent-browser check @e5
agent-browser uncheck @e5
agent-browser select @e6 "value"
agent-browser press "Enter"
agent-browser scroll down 500
Get Information
agent-browser get text @e1 --json
agent-browser get html @e2 --json
agent-browser get value @e3 --json
agent-browser get attr @e4 "href" --json
agent-browser get title --json
agent-browser get url --json
Check State
agent-browser is visible @e2 --json
agent-browser is enabled @e3 --json
agent-browser is checked @e4 --json
Wait
agent-browser wait @e2
agent-browser wait 1000     # milliseconds
agent-browser wait --text "Welcome"
agent-browser wait --load networkidle
Sessions (Isolated Browsers)
agent-browser --session admin open site.com
agent-browser --session user open site.com
agent-browser session list
State Persistence
agent-browser state save auth.json
agent-browser state load auth.json
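One way to use saved state is to log in only when no state file exists yet. The sketch below only decides which commands to issue; it does not run them. The function name `auth_commands`, the `exists` parameter, and the login steps in the usage note are hypothetical, and the order in which `state load` should be combined with `open` is not specified in this document.

```python
import os


def auth_commands(state_file, login_steps, exists=os.path.exists):
    """Return agent-browser argument lists that end in an authenticated session.

    login_steps is a list of argument lists, e.g. [["open", url], ["click", "@e2"]].
    """
    if exists(state_file):
        # A saved state is available: reuse it and skip the login flow.
        return [["state", "load", state_file]]
    # No saved state yet: run the login flow once, then persist the result.
    return [*login_steps, ["state", "save", state_file]]
```

For example, `auth_commands("auth.json", [["open", "https://example.com/login"], ["click", "@e2"]])` returns the login steps followed by `["state", "save", "auth.json"]` on the first run, and only `["state", "load", "auth.json"]` once the file exists. Each returned list can be appended to `["agent-browser"]` and executed.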
Screenshots
agent-browser screenshot page.png
agent-browser screenshot --full page.png
Snapshot Output Format
{
"success": true,
"data": {
"snapshot": "...",
"refs": {
"e1": {"role": "heading", "name": "Example Domain"},
"e2": {"role": "button", "name": "Submit"},
"e3": {"role": "textbox", "name": "Email"}
}
}
}
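Because refs are keyed by ID and carry a role and name, a script can look elements up by what they are instead of hardcoding `@e2`. The following is a minimal sketch that parses the sample output above; `parse_snapshot` and `find_ref` are illustrative names, not agent-browser APIs, and since the failure shape is not shown in this document the check only inspects the `success` field.

```python
import json

# The sample snapshot output from this section.
SAMPLE = """
{
  "success": true,
  "data": {
    "snapshot": "...",
    "refs": {
      "e1": {"role": "heading", "name": "Example Domain"},
      "e2": {"role": "button", "name": "Submit"},
      "e3": {"role": "textbox", "name": "Email"}
    }
  }
}
"""


def parse_snapshot(raw):
    """Parse snapshot JSON and return the refs mapping."""
    payload = json.loads(raw)
    if not payload.get("success"):
        raise ValueError("snapshot reported failure")
    return payload["data"]["refs"]


def find_ref(refs, role, name):
    """Return the '@eN' selector for the first matching element, or None."""
    for ref_id, info in refs.items():
        if info.get("role") == role and info.get("name") == name:
            return "@" + ref_id
    return None


refs = parse_snapshot(SAMPLE)
print(find_ref(refs, "button", "Submit"))  # prints @e2
```

The returned string can be passed directly to commands such as `click` or `fill`.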
Best Practices
- Always use the `-i` flag - Focus on interactive elements
- Always use `--json` - Easier to parse
- Wait for stability - `agent-browser wait --load networkidle`
- Save auth state - Skip login flows with `state save/load`
- Use sessions - Isolate different browser contexts
Example: Form Automation
agent-browser open https://example.com/form
agent-browser snapshot -i --json
agent-browser fill @e2 "John Doe"
agent-browser fill @e3 "john@example.com"
agent-browser click @e4
agent-browser wait --text "Success"
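The refs in the example above (`@e2`, `@e3`, `@e4`) come from the snapshot and may differ between pages. A script can derive them from the snapshot's `refs` mapping by accessible name, as in this sketch. The function `plan_form` and the sample refs are hypothetical; a real form's roles and names will differ.

```python
def plan_form(refs, values, submit_name):
    """Build agent-browser argument lists that fill fields by accessible name."""
    # Map each accessible name to its '@eN' selector.
    # If two elements share a name, the later one wins.
    by_name = {info.get("name"): "@" + ref_id for ref_id, info in refs.items()}
    commands = []
    for field_name, text in values.items():
        if field_name not in by_name:
            raise KeyError(f"no element named {field_name!r} in snapshot")
        commands.append(["fill", by_name[field_name], text])
    if submit_name not in by_name:
        raise KeyError(f"no element named {submit_name!r} in snapshot")
    commands.append(["click", by_name[submit_name]])
    return commands


# Hypothetical refs for the form page, in the documented snapshot format.
form_refs = {
    "e2": {"role": "textbox", "name": "Name"},
    "e3": {"role": "textbox", "name": "Email"},
    "e4": {"role": "button", "name": "Submit"},
}
plan = plan_form(
    form_refs,
    {"Name": "John Doe", "Email": "john@example.com"},
    submit_name="Submit",
)
for args in plan:
    print("agent-browser", *args)
```

Raising on a missing name surfaces a changed page layout early instead of filling the wrong field.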
Installation
npm install -g agent-browser
agent-browser install                # download the Chromium browser
agent-browser install --with-deps    # Linux: also install system dependencies
Credits
Skill created by Yossi Elkrief (@MaTriXy)
agent-browser CLI by Vercel Labs