Internal support skill for agent-browser CLI workflows used by rust-learner, docs-researcher, and crate-researcher. Use only when browser automation is explicitly required.
Installation
Install with Codex or Claude Copy this prompt, paste it into Codex, Claude, or another assistant, and let it review the skill page and install it for you.
Internal support skill for agent-browser CLI workflows used by rust-learner, docs-researcher, and crate-researcher. Use only when browser automation is explicitly required.
user-invocable
false
disable-model-invocation
true
Browser Automation with agent-browser
Priority Note
For fetching Rust/crate information, use this priority order:
actionbook MCP - Pre-computed selectors for known sites
agent-browser CLI - Direct browser automation (last resort)
Use agent-browser directly only when:
actionbook has no pre-computed selectors for the target site
You need interactive browser testing/automation
You need screenshots or form filling
Quick start
agent-browser open <url> # Navigate to page
agent-browser snapshot -i # Get interactive elements with refs
agent-browser click @e1 # Click element by ref
agent-browser fill @e2 "text"# Fill input by ref
agent-browser close # Close browser
Core workflow
Navigate: agent-browser open <url>
Snapshot: agent-browser snapshot -i (returns elements with refs like @e1, @e2)
Interact using refs from the snapshot
Re-snapshot after navigation or significant DOM changes
Commands
Navigation
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser
Snapshot (page analysis)
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
Interactions (use @refs from snapshot)
agent-browser click @e1 # Click
agent-browser dblclick @e1 # Double-click
agent-browser fill @e2 "text"# Clear and type
agent-browser type @e2 "text"# Type without clearing
agent-browser press Enter # Press key
agent-browser press Control+a # Key combination
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value"# Select dropdown
agent-browser scroll down 500 # Scroll page
agent-browser scrollintoview @e1 # Scroll element into view
Get information
agent-browser get text @e1 # Get element text
agent-browser get value @e1 # Get input value
agent-browser get title # Get page title
agent-browser get url # Get current URL
Screenshots
agent-browser screenshot # Screenshot to stdout
agent-browser screenshot path.png # Save to file
agent-browser screenshot --full # Full page
Wait
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success"# Wait for text
agent-browser wait --load networkidle # Wait for network idle
Semantic locators (alternative to refs)
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
Example: Form submission
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result