| name | agent-browser |
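| description | Automates browser interaction for web testing, form filling, screenshots, and data extraction. Use it when the user needs to browse a website, interact with a page, fill in a form, take a screenshot, test a web application, or extract information from a page. |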
| allowed-tools | Bash(agent-browser:*) |
Browser automation with agent-browser
Quick start
agent-browser open <url>
agent-browser snapshot -i
agent-browser click @e1
agent-browser fill @e2 "text"
agent-browser close
Core workflow
- Navigate:
agent-browser open <url>
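- Snapshot:
agent-browser snapshot -i (returns elements with refs such as @e1 and @e2)
- Interact using the refs from the snapshot
- Take a new snapshot after navigation or any significant DOM change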
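The first two steps of the workflow above can be sketched as a small shell helper. This is a minimal sketch, not part of agent-browser: the `ab` wrapper, the `AB_BIN` variable, and the `open_and_snapshot` name are hypothetical, introduced only so the binary can be swapped out for a dry run. It also assumes agent-browser exits with a non-zero status on failure.

```shell
# Hypothetical wrapper: AB_BIN is an assumption of this sketch (not an
# agent-browser setting) that lets the binary be replaced, e.g. by echo.
ab() {
  "${AB_BIN:-agent-browser}" "$@"
}

# Navigate to $1, then take an interactive snapshot so that fresh @refs
# are available for the interaction commands that follow.
open_and_snapshot() {
  ab open "$1" && ab snapshot -i
}
```

Calling `open_and_snapshot https://example.com` would print a snapshot whose refs can then be passed to commands such as click or fill. With `AB_BIN=echo` the helper prints the commands instead of running them.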
Commands
Navigation
agent-browser open <url>
agent-browser back
agent-browser forward
agent-browser reload
agent-browser close
agent-browser connect 9222
Snapshot (page analysis)
agent-browser snapshot
agent-browser snapshot -i
agent-browser snapshot -c
agent-browser snapshot -d 3
agent-browser snapshot -s "#main"
Interaction (using @refs from the snapshot)
agent-browser click @e1
agent-browser dblclick @e1
agent-browser focus @e1
agent-browser fill @e2 "text"
agent-browser type @e2 "text"
agent-browser press Enter
agent-browser press Control+a
agent-browser keydown Shift
agent-browser keyup Shift
agent-browser hover @e1
agent-browser check @e1
agent-browser uncheck @e1
agent-browser select @e1 "value"
agent-browser select @e1 "a" "b"
agent-browser scroll down 500
agent-browser scrollintoview @e1
agent-browser drag @e1 @e2
agent-browser upload @e1 file.pdf
Getting information
agent-browser get text @e1
agent-browser get html @e1
agent-browser get value @e1
agent-browser get attr @e1 href
agent-browser get title
agent-browser get url
agent-browser get count ".item"
agent-browser get box @e1
agent-browser get styles @e1
Checking state
agent-browser is visible @e1
agent-browser is enabled @e1
agent-browser is checked @e1
Screenshots and PDF
agent-browser screenshot
agent-browser screenshot path.png
agent-browser screenshot --full
agent-browser pdf output.pdf
Video recording
agent-browser record start ./demo.webm
agent-browser click @e1
agent-browser record stop
agent-browser record restart ./take2.webm
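Recording creates a new context but keeps the session's cookies and storage. If no URL is provided, it automatically returns to the current page. For a smooth demo, explore the page first and only then start recording.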
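One way to keep a recording from being left open when a step fails is to wrap the step in a helper. This is a minimal sketch using only the record and click commands listed above. The `ab` wrapper, `AB_BIN`, and `record_click` are hypothetical names, and the sketch assumes agent-browser reports failure through its exit status.

```shell
# Hypothetical wrapper: AB_BIN is an assumption of this sketch, used to
# swap the binary for a dry run.
ab() {
  "${AB_BIN:-agent-browser}" "$@"
}

# Record a single click. $1 = output video file, $2 = ref from a snapshot.
# The recording is stopped even when the click itself fails.
record_click() {
  ab record start "$1" || return 1
  ab click "$2"
  status=$?
  ab record stop
  return "$status"
}
```

For example, `record_click ./demo.webm @e1` mirrors the three-command sequence above while still returning the click's exit status to the caller.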
Waiting
agent-browser wait @e1
agent-browser wait 2000
agent-browser wait --text "Success"
agent-browser wait --url "**/dashboard"
agent-browser wait --load networkidle
agent-browser wait --fn "window.ready"
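The wait commands above are typically paired with an action that triggers a change. The following is a minimal sketch of that pairing. `ab`, `AB_BIN`, and `click_and_wait_for_text` are hypothetical names that are not part of agent-browser, and the sketch assumes a failed command returns a non-zero status.

```shell
# Hypothetical wrapper: AB_BIN is an assumption of this sketch, used to
# swap the binary for a dry run.
ab() {
  "${AB_BIN:-agent-browser}" "$@"
}

# Click the element at ref $1, then block until the text $2 appears.
# The wait is skipped if the click fails.
click_and_wait_for_text() {
  ab click "$1" && ab wait --text "$2"
}
```

A call such as `click_and_wait_for_text @e3 "Success"` keeps later commands from running against a page that has not finished updating.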
Mouse control
agent-browser mouse move 100 200
agent-browser mouse down left
agent-browser mouse up left
agent-browser mouse wheel 100
Semantic locators (an alternative to refs)
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find text "Sign In" click --exact
agent-browser find label "Email" fill "user@test.com"
agent-browser find placeholder "Search" type "query"
agent-browser find alt "Logo" click
agent-browser find title "Close" click
agent-browser find testid "submit-btn" click
agent-browser find first ".item" click
agent-browser find last ".item" click
agent-browser find nth 2 "a" hover
Browser settings
agent-browser set viewport 1920 1080
agent-browser set device "iPhone 14"
agent-browser set geo 37.7749 -122.4194
agent-browser set offline on
agent-browser set headers '{"X-Key":"v"}'
agent-browser set credentials user pass
agent-browser set media dark
agent-browser set media light reduced-motion
Cookies and storage
agent-browser cookies
agent-browser cookies set name value
agent-browser cookies clear
agent-browser storage local
agent-browser storage local key
agent-browser storage local set k v
agent-browser storage local clear
Network
agent-browser network route <url>
agent-browser network route <url> --abort
agent-browser network route <url> --body '{}'
agent-browser network unroute [url]
agent-browser network requests
agent-browser network requests --filter api
Tabs and windows
agent-browser tab
agent-browser tab new [url]
agent-browser tab 2
agent-browser tab close
agent-browser tab close 2
agent-browser window new
Frames
agent-browser frame "#iframe"
agent-browser frame main
Dialogs
agent-browser dialog accept [text]
agent-browser dialog dismiss
JavaScript
agent-browser eval "document.title"
Global options
agent-browser --session <name> ...
agent-browser --json ...
agent-browser --headed ...
agent-browser --full ...
agent-browser --cdp <port> ...
agent-browser --proxy <url> ...
agent-browser --headers <json> ...
agent-browser --executable-path <p>
agent-browser --extension <path> ...
agent-browser --help
agent-browser --version
agent-browser <command> --help
Proxy support
agent-browser --proxy http://proxy.com:8080 open example.com
agent-browser --proxy http://user:pass@proxy.com:8080 open example.com
agent-browser --proxy socks5://proxy.com:1080 open example.com
Environment variables
AGENT_BROWSER_SESSION="mysession"
AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome"
AGENT_BROWSER_EXTENSIONS="/ext1,/ext2"
AGENT_BROWSER_STREAM_PORT="9223"
AGENT_BROWSER_HOME="/path/to/agent-browser"
Example: form submission
agent-browser open https://example.com/form
agent-browser snapshot -i
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i
Example: authentication with saved state
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
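The save and load steps above can be combined so that login runs only when no saved state exists. This is a minimal sketch under several assumptions: `ab`, `AB_BIN`, `restore_or_login`, `LOGIN_USER`, and `LOGIN_PASS` are hypothetical names, the refs @e1 to @e3 are taken from the example and would need to be confirmed against a real snapshot, and agent-browser is assumed to exit non-zero on failure.

```shell
# Hypothetical wrapper: AB_BIN is an assumption of this sketch, used to
# swap the binary for a dry run.
ab() {
  "${AB_BIN:-agent-browser}" "$@"
}

# $1 = state file. Load it when it exists; otherwise log in and save it.
# LOGIN_USER and LOGIN_PASS are hypothetical variables set by the caller.
restore_or_login() {
  state_file="$1"
  if [ -f "$state_file" ]; then
    ab state load "$state_file"
    return
  fi
  # Refs below follow the example; verify them against a real snapshot.
  ab open https://app.example.com/login &&
    ab snapshot -i &&
    ab fill @e1 "$LOGIN_USER" &&
    ab fill @e2 "$LOGIN_PASS" &&
    ab click @e3 &&
    ab wait --url "**/dashboard" &&
    ab state save "$state_file"
}
```

After `restore_or_login auth.json` succeeds, the dashboard can be opened as in the example. Deleting the state file forces a fresh login on the next run.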
Sessions (parallel browsers)
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list
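Because each session is a separate browser, the open commands above could be started in the background and awaited together. This is a minimal sketch, assuming that concurrent invocations with different session names do not interfere with each other. `ab`, `AB_BIN`, and `open_in_sessions` are hypothetical names, not part of agent-browser.

```shell
# Hypothetical wrapper: AB_BIN is an assumption of this sketch, used to
# swap the binary for a dry run.
ab() {
  "${AB_BIN:-agent-browser}" "$@"
}

# Each argument has the form name=url. Every URL is opened in its own
# named session as a background job; the function returns once all
# background jobs have finished.
open_in_sessions() {
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first '='
    url="${pair#*=}"     # text after the first '='
    ab --session "$name" open "$url" &
  done
  wait
}
```

For example, `open_in_sessions test1=site-a.com test2=site-b.com` starts both sessions shown above, after which `agent-browser session list` can be used to inspect them.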
JSON output (for parsing)
Add --json to get machine-readable output:
agent-browser snapshot -i --json
agent-browser get text @e1 --json
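For later parsing, the JSON output can be written to a file first. This is a minimal sketch that makes no assumption about the JSON schema and only checks that something was written. `ab`, `AB_BIN`, and `save_snapshot_json` are hypothetical names, not part of agent-browser.

```shell
# Hypothetical wrapper: AB_BIN is an assumption of this sketch, used to
# swap the binary for a dry run.
ab() {
  "${AB_BIN:-agent-browser}" "$@"
}

# Write the interactive snapshot as JSON to the file $1.
# Fails if the command fails or the resulting file is empty.
save_snapshot_json() {
  ab snapshot -i --json > "$1" && [ -s "$1" ]
}
```

A call such as `save_snapshot_json snapshot.json` leaves a file that any JSON tool can read, which avoids re-running the snapshot while experimenting with a parser.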
Debugging
agent-browser --headed open example.com
agent-browser --cdp 9222 snapshot
agent-browser connect 9222
agent-browser console
agent-browser console --clear
agent-browser errors
agent-browser errors --clear
agent-browser highlight @e1
agent-browser trace start
agent-browser trace stop trace.zip
agent-browser record start ./debug.webm
agent-browser record stop
In-depth documentation
For detailed patterns and best practices, see:
Ready-to-use templates
Executable workflow scripts for common patterns:
Usage:
./templates/form-automation.sh https://example.com/form
./templates/authenticated-session.sh https://app.example.com/login
./templates/capture-workflow.sh https://example.com ./output