| name | agent-browser |
| description | Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages. |
Browser Automation with agent-browser
Comprehensive browser automation skill using the agent-browser CLI for web testing, form filling, screenshots, data extraction, and interactive web automation.
Quick Start Workflow
agent-browser open <url>
agent-browser snapshot -i
agent-browser click @e1
agent-browser fill @e2 "text"
agent-browser close
Core Commands
Navigation
agent-browser open <url>
agent-browser back
agent-browser forward
agent-browser reload
agent-browser close
Page Analysis
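agent-browser snapshot              # full accessibility tree
agent-browser snapshot -i           # interactive elements only
agent-browser snapshot -c           # compact output
agent-browser snapshot -d 3         # limit tree depth to 3
agent-browser snapshot -s "#main"   # scope to a CSS selector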
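The snapshot command lists page elements with reference IDs such as @e1 and @e2; pass these refs to the interaction commands below. Refs describe the page as it was when the snapshot was taken, so re-run the snapshot after the page changes.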
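When a script needs a ref rather than a person reading the output, the lookup can be automated. The sketch below assumes each snapshot line tags its element as `[ref=eN]`; that output format and the helper name `ref_for` are assumptions for illustration, so check them against what your version of agent-browser actually prints.

```shell
# ref_for LABEL: read snapshot text on stdin and print the first matching ref as @eN.
# Assumption: element lines look like:  - button "Submit" [ref=e2]
ref_for() {
  grep -F -- "$1" | sed -n 's/.*\[ref=\(e[0-9][0-9]*\)].*/@\1/p' | head -n 1
}

# Possible usage (not run here):
#   ref=$(agent-browser snapshot -i | ref_for 'Submit')
#   [ -n "$ref" ] && agent-browser click "$ref"
```

The helper prints nothing when no line matches, so test for an empty result before using the ref.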
Element Interactions
agent-browser click @e1
agent-browser dblclick @e1
agent-browser fill @e2 "text"   # clear the field, then enter text
agent-browser type @e2 "text"   # type into the field without clearing it
agent-browser hover @e1
agent-browser focus @e1
agent-browser check @e1
agent-browser uncheck @e1
agent-browser select @e1 "value"
agent-browser upload @e1 file.pdf
agent-browser scroll down 500
agent-browser scrollintoview @e1
Keyboard Input
agent-browser press Enter
agent-browser press Control+a
agent-browser keydown Shift
agent-browser keyup Shift
Data Extraction
agent-browser get text @e1
agent-browser get html @e1
agent-browser get value @e1
agent-browser get attr @e1 href
agent-browser get title
agent-browser get url
agent-browser get count ".item"
State Checking
agent-browser is visible @e1
agent-browser is enabled @e1
agent-browser is checked @e1
Screenshots & Documentation
agent-browser screenshot
agent-browser screenshot path.png
agent-browser screenshot --full
agent-browser pdf output.pdf
Waiting & Timing
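agent-browser wait @e1                    # until the element appears
agent-browser wait 2000                   # fixed delay in milliseconds
agent-browser wait --text "Success"       # until the text appears
agent-browser wait --url "**/dashboard"   # until the URL matches the pattern
agent-browser wait --load networkidle     # until network activity settles
agent-browser wait --fn "window.ready"    # until the JavaScript expression is truthy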
Semantic Finding (Alternative to refs)
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click --exact
agent-browser find label "Email" fill "user@example.com"
agent-browser find placeholder "Search" type "query"
agent-browser find testid "submit-btn" click
Common Patterns
Form Automation
agent-browser open https://example.com/form
agent-browser snapshot -i
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i
Authentication & State Management
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# In a later session, restore the saved state instead of logging in again
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
Data Extraction Workflow
agent-browser open https://example.com/data
agent-browser snapshot -i
agent-browser get text @e1 > title.txt
agent-browser get attr @e2 href > link.txt
agent-browser screenshot --full page.png
agent-browser find text "Next" click
agent-browser wait --load networkidle
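The extract, click "Next", wait cycle above can be wrapped in a loop when the data spans several pages. This is a minimal sketch: `collect_pages` and the `AB` override variable are hypothetical names introduced here (`AB` exists only so the loop can be dry-run against a stub), and it assumes the "Next" link keeps the same text on every page.

```shell
# AB is a hypothetical override; it defaults to the real CLI.
AB=${AB:-agent-browser}

# collect_pages COUNT SELECTOR: print the text of SELECTOR on COUNT consecutive pages.
collect_pages() {
  pages=$1
  selector=$2
  i=1
  while [ "$i" -le "$pages" ]; do
    "$AB" get text "$selector"
    # Advance only when another page is still wanted.
    if [ "$i" -lt "$pages" ]; then
      "$AB" find text "Next" click >/dev/null
      "$AB" wait --load networkidle >/dev/null
    fi
    i=$((i + 1))
  done
}

# Possible usage (not run here):
#   collect_pages 3 ".item" > items.txt
```

Output from the click and wait steps is discarded so that only the extracted text reaches the output file.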
Testing & Validation
agent-browser open http://localhost:3000
agent-browser wait --load networkidle
agent-browser find role button click --name "Get Started"
agent-browser wait --text "Welcome"
agent-browser screenshot test-step-1.png
agent-browser is visible @e1 && echo "✓ Element visible"
agent-browser get text @e2 | grep -q "Expected" && echo "✓ Text correct"
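For more than a couple of checks, a small helper keeps the pass/fail reporting consistent and lets the script exit non-zero when anything failed. The sketch below is one way to do it; `check` and `failures` are hypothetical names, and it assumes, as the one-liners above do, that the exit status of `agent-browser is visible` reflects the result.

```shell
failures=0

# check DESCRIPTION COMMAND...: run COMMAND, report the result, count failures.
check() {
  desc=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $desc"
  else
    echo "FAIL: $desc"
    failures=$((failures + 1))
  fi
}

# Possible usage (not run here):
#   check "hero banner visible" agent-browser is visible @e1
#   check "submit enabled"      agent-browser is enabled @e2
#   [ "$failures" -eq 0 ] || exit 1
```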
Advanced Features
Sessions (Parallel Browsers)
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list
Video Recording
agent-browser record start ./demo.webm
agent-browser click @e1
agent-browser record stop
Network Control
agent-browser network route <url> --abort
agent-browser network route <url> --body '{}'
agent-browser network requests
Browser Configuration
agent-browser set viewport 1920 1080
agent-browser set device "iPhone 14"
agent-browser set offline on
agent-browser set credentials user pass
Debugging Options
agent-browser --headed open example.com
agent-browser --json snapshot -i
agent-browser highlight @e1
agent-browser console
agent-browser errors
Error Handling
agent-browser open https://localhost:8443 --ignore-https-errors
if agent-browser is visible @e1; then
  agent-browser click @e1
else
  echo "Element not found, taking screenshot for debugging"
  agent-browser screenshot error-state.png
fi
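Slow pages often fail a single visibility check and pass a moment later, so retrying before falling back to the debug screenshot can reduce false alarms. The helper below is a generic sketch, not part of agent-browser; `retry` is a hypothetical name, and it relies on the wrapped command signalling failure through its exit status.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...: rerun COMMAND until it succeeds
# or ATTEMPTS runs have failed. Returns 0 on success, 1 otherwise.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Possible usage (not run here):
#   retry 3 1 agent-browser is visible @e1 || agent-browser screenshot error-state.png
```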
Environment Configuration
Set these environment variables for persistent configuration:
export AGENT_BROWSER_SESSION="mysession"
export AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome"
export AGENT_BROWSER_PROVIDER="browserbase"
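When a run behaves unexpectedly, it helps to confirm which of these variables the current shell actually exports. The function below is a small sketch for that purpose; `ab_env_summary` is a hypothetical name and it only reads the three variables listed above.

```shell
# Print each documented variable with its current value, or "(unset)".
ab_env_summary() {
  for name in AGENT_BROWSER_SESSION AGENT_BROWSER_EXECUTABLE_PATH AGENT_BROWSER_PROVIDER; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    echo "$name=${value:-(unset)}"
  done
}

# Possible usage (not run here):
#   ab_env_summary
```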
Integration with Pi Workflow
With Testing
agent-browser open http://localhost:3000
agent-browser wait --load networkidle
agent-browser find testid "login-form" fill "user@test.com"
agent-browser screenshot test-login-form.png
With Documentation
agent-browser open https://myapp.com/dashboard
agent-browser screenshot --full docs/images/dashboard.png
agent-browser click @e1
agent-browser screenshot docs/images/feature-view.png
With Data Collection
agent-browser open https://competitor.com
agent-browser get text ".pricing" > competitor-pricing.txt
agent-browser get count ".feature-list li" > feature-count.txt
Troubleshooting
Common Issues
- Elements not found: Re-run `snapshot -i` after page changes
- Timeouts: Use `wait` commands before interactions
- Authentication: Save and reuse session state
- Dynamic content: Wait for specific conditions before interacting
Debug Commands
agent-browser snapshot -i --json | jq '.elements[] | select(.text | contains("Submit"))'
agent-browser highlight @e1
agent-browser console
Related Skills
- Use with `/skill:feature` for testing new features
- Use with `/skill:plan` for documenting user flows and validating requirements
References
For detailed documentation on specific topics:
Templates
Ready-to-use scripts for common workflows:
Usage:
.pi/skills/agent-browser/templates/form-automation.sh https://example.com/form
.pi/skills/agent-browser/templates/authenticated-session.sh https://app.com/login
.pi/skills/agent-browser/templates/capture-workflow.sh https://site.com ./output