| name | agent-browser |
| description | Browser automation using Vercel's agent-browser CLI. Use when you need to interact with web pages, fill forms, take screenshots, or scrape data. Alternative to Playwright MCP - uses Bash commands with ref-based element selection. Triggers on "browse website", "fill form", "click button", "take screenshot", "scrape page", "web automation". |
| model | sonnet |
agent-browser: CLI Browser Automation
Vercel's headless browser automation CLI designed for AI agents. Uses ref-based selection (@e1, @e2) from accessibility snapshots.
Setup Check
command -v agent-browser >/dev/null 2>&1 && echo "Installed" || echo "NOT INSTALLED - run: npm install -g agent-browser && agent-browser install"
Install if needed
npm install -g agent-browser
agent-browser install
Core Workflow
The snapshot + ref pattern suits LLMs because each snapshot labels elements with short refs that later commands can target directly:
- Navigate to URL
- Snapshot to get interactive elements with refs
- Interact using refs (@e1, @e2, etc.)
- Re-snapshot after navigation or DOM changes
agent-browser open https://example.com
agent-browser snapshot -i --json
agent-browser click @e1
agent-browser fill @e2 "search query"
agent-browser snapshot -i
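Step 4 is the easiest one to skip: refs belong to the snapshot that produced them, so after a click that navigates or changes the DOM they may no longer point at the same elements. A minimal sketch of a wrapper that pairs every action with a fresh snapshot, assuming only the commands shown above (`act_and_snapshot` is a hypothetical helper name, not part of the CLI):

```shell
# Hypothetical helper: run one agent-browser action, then print a fresh
# interactive snapshot so the next step works from current refs.
# The snapshot is skipped if the action itself fails.
act_and_snapshot() {
  agent-browser "$@" && agent-browser snapshot -i
}
```

For example, `act_and_snapshot click @e1` clicks the element and immediately prints the refs that are valid afterwards.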
Key Commands
Navigation
agent-browser open <url>
agent-browser back
agent-browser forward
agent-browser reload
agent-browser close
Snapshots (Essential for AI)
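agent-browser snapshot               # full accessibility tree
agent-browser snapshot -i            # interactive elements only
agent-browser snapshot -i --json     # interactive elements, JSON output
agent-browser snapshot -c            # compact: drop empty structural elements
agent-browser snapshot -d 3          # limit tree depth to 3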
Interactions
agent-browser click @e1
agent-browser dblclick @e1
agent-browser fill @e1 "text"        # clear the field, then enter text
agent-browser type @e1 "text"        # type without clearing first
agent-browser press Enter
agent-browser hover @e1
agent-browser check @e1
agent-browser uncheck @e1
agent-browser select @e1 "option"
agent-browser scroll down 500
agent-browser scrollintoview @e1
Get Information
agent-browser get text @e1
agent-browser get html @e1
agent-browser get value @e1
agent-browser get attr @e1 href
agent-browser get title
agent-browser get url
agent-browser get count "button"
Screenshots & PDFs
agent-browser screenshot
agent-browser screenshot --full
agent-browser screenshot output.png
agent-browser screenshot --full output.png
agent-browser pdf output.pdf
Wait
agent-browser wait @e1
agent-browser wait 2000
agent-browser wait --text "text"
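A fixed delay is either too short or wasted time. As a rough sketch, the `get url` and `wait <ms>` commands from this reference can be combined into a polling loop that stops as soon as the page has moved on. The helper name `wait_for_url`, the 500 ms interval, and the default of 10 attempts are all illustrative assumptions:

```shell
# Hypothetical helper: poll the current URL until it contains a substring.
# $1 = substring to look for, $2 = maximum number of polls (default 10).
wait_for_url() {
  want=$1
  tries=${2:-10}
  i=0
  while [ "$i" -lt "$tries" ]; do
    case "$(agent-browser get url)" in
      *"$want"*) return 0 ;;    # URL matched: stop waiting
    esac
    agent-browser wait 500       # pause 500 ms between polls
    i=$((i + 1))
  done
  return 1                       # gave up after $tries polls
}
```

A call such as `wait_for_url /dashboard 20 && agent-browser snapshot -i` would then re-snapshot only once the redirect has happened.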
Semantic Locators (Alternative to Refs)
agent-browser find role button click --name "Submit"
agent-browser find text "Sign up" click
agent-browser find label "Email" fill "user@example.com"
agent-browser find placeholder "Search..." fill "query"
Sessions (Parallel Browsers)
agent-browser --session browser1 open https://site1.com
agent-browser --session browser2 open https://site2.com
agent-browser session list
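Because each session is a separate browser, the `open` calls can run as background jobs instead of one after another. A minimal sketch, assuming session names of the form `browser1`, `browser2`, ... as in the commands above (`open_in_sessions` is a hypothetical helper, not a CLI command):

```shell
# Hypothetical helper: open each URL in its own named session, in parallel.
open_in_sessions() {
  n=1
  for url in "$@"; do
    # One background job per session: browser1, browser2, ...
    agent-browser --session "browser$n" open "$url" &
    n=$((n + 1))
  done
  wait   # shell builtin: block until all background jobs have finished
}
```

After `open_in_sessions https://site1.com https://site2.com`, later commands would address a specific browser by passing the same `--session` name. Note that a bare `wait` also waits for any other background jobs the calling shell has started.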
Examples
Login Flow
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait 2000
agent-browser snapshot -i
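The flow above hard-codes both the credentials and the refs. One possible variant, sketched here under stated assumptions, reads credentials from the environment and uses the semantic locators from earlier in this reference so it does not depend on ref numbering. The variable names `APP_USER` and `APP_PASS`, the label texts "Email" and "Password", and the button name "Sign in" are all hypothetical and would need to match the real page:

```shell
# Hypothetical helper: log in with credentials taken from the environment.
# $1 = login URL; APP_USER and APP_PASS are assumed variable names.
login() {
  if [ -z "${APP_USER:-}" ] || [ -z "${APP_PASS:-}" ]; then
    echo "login: set APP_USER and APP_PASS first" >&2
    return 1
  fi
  # Each step runs only if the previous one succeeded.
  # Label texts and the button name below are assumptions about the page.
  agent-browser open "$1" &&
    agent-browser find label "Email" fill "$APP_USER" &&
    agent-browser find label "Password" fill "$APP_PASS" &&
    agent-browser find role button click --name "Sign in"
}
```

Called as `login https://app.example.com/login`, this keeps the password out of the script and out of shell history.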
Search and Extract
agent-browser open https://news.ycombinator.com
agent-browser snapshot -i --json
agent-browser get text @e12
agent-browser click @e12
Form Filling
agent-browser open https://forms.example.com
agent-browser snapshot -i
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser select @e3 "United States"
agent-browser check @e4
agent-browser click @e5
agent-browser screenshot confirmation.png
Debug Mode
agent-browser --headed open https://example.com
agent-browser --headed snapshot -i
agent-browser --headed click @e1
JSON Output
Add --json for structured output:
agent-browser snapshot -i --json
Returns:
{
"success": true,
"data": {
"refs": {
"e1": {"name": "Submit", "role": "button"},
"e2": {"name": "Email", "role": "textbox"}
},
"snapshot": "- button \"Submit\" [ref=e1]\n- textbox \"Email\" [ref=e2]"
}
}
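The `snapshot` string above has one line per element in the form `- role "name" [ref=eN]`, which makes it possible to look a ref up by role and name instead of reading it off by hand. A rough sketch, assuming the plain-text output of `snapshot -i` uses that same line format (`ref_for` is a hypothetical helper, and the name must not contain regex characters or slashes):

```shell
# Hypothetical helper: print the @ref of the first element with this role and name.
# $1 = role (e.g. button), $2 = accessible name (plain text only).
ref_for() {
  agent-browser snapshot -i |
    sed -n "s/^[[:space:]]*- $1 \"$2\" \[ref=\(e[0-9][0-9]*\)].*/@\1/p" |
    head -n 1
}
```

With it, `agent-browser click "$(ref_for button Submit)"` would click the Submit button without relying on its ref number. When nothing matches, the function prints nothing, so callers should check for an empty result before acting.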
vs Playwright MCP
| Feature | agent-browser (CLI) | Playwright MCP |
|---|---|---|
| Interface | Bash commands | MCP tools |
| Selection | Refs (@e1) | Refs (e1) |
| Output | Text/JSON | Tool responses |
| Parallel | Sessions | Tabs |
| Best for | Quick automation | Tool integration |
Use agent-browser when:
- You prefer Bash-based workflows
- You want simpler CLI commands
- You need quick one-off automation
Use Playwright MCP when:
- You need deep MCP tool integration
- You want tool-based responses
- You're building complex automation