manual-ui-testing
Run manual UI test cases using agent-browser against a running stack. Use when the user asks to run UI tests, test the UI, run manual tests, or verify UI behavior.
| name | manual-ui-testing |
| description | Run manual UI test cases using agent-browser against a running stack. Use when the user asks to run UI tests, test the UI, run manual tests, or verify UI behavior. |
| metadata | {"internal":true} |
| user-invocable | true |
| allowed-tools | Bash(npx agent-browser:*), Bash(agent-browser:*), Bash(just:*), Bash(doppler:*) |
Goal: execute UI test cases from test_cases/ui/ using agent-browser, record results, and file issues for failures.
Full auth mode requires the full stack (not DEV_MODE). Start with a unique PORT_PREFIX:
PORT_PREFIX=<prefix> doppler run -- just start-all
Wait for all services (PostgreSQL, Valkey, API, Worker, UI, Caddy) to be healthy. Verify:
curl -s http://localhost:<prefix>00/healthz
If the stack is already running, confirm the PORT_PREFIX and auth mode before proceeding.
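The health check above can be wrapped in a polling loop so that later steps start only once the stack responds. This is a minimal sketch, assuming the healthz endpoint returns a success status when the stack is ready; the function names, the retry count, and the delay are illustrative and not part of the project.

```shell
# Build the health URL for a given PORT_PREFIX (for example 31 -> port 3100).
health_url() {
  printf 'http://localhost:%s00/healthz\n' "$1"
}

# Poll the health endpoint until it answers or the attempts run out.
# $1 = PORT_PREFIX, $2 = max attempts (default 30), $3 = delay in seconds (default 2)
wait_for_stack() {
  attempts_left="${2:-30}"
  while [ "$attempts_left" -gt 0 ]; do
    # -f makes curl return non-zero on HTTP errors, -s keeps it quiet
    if curl -sf "$(health_url "$1")" >/dev/null; then
      return 0
    fi
    attempts_left=$((attempts_left - 1))
    sleep "${3:-2}"
  done
  return 1
}
```

For example, `wait_for_stack 31 && echo "stack healthy"` would block until the stack started with PORT_PREFIX=31 answers, where 31 is only a placeholder prefix.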
agent-browser runs headless Chromium. No special install needed — it's available via npx. The browser daemon persists between commands within a session.
Ask the user or determine from context which test categories to run:
| Category | Path | Requires |
|---|---|---|
| admin_login | test_cases/ui/admin_login/ | AUTH_MODE=admin |
| full_auth | test_cases/ui/full_auth/ | AUTH_MODE=full |
| org_creation | test_cases/ui/org_creation/ | Authenticated user |
| mcp_servers | test_cases/ui/mcp_servers/ | Authenticated + org |
| global_chat | test_cases/ui/global_chat/ | Authenticated + org |
| global_search | test_cases/ui/global_search/ | Authenticated + org |
| scheduled_tasks | test_cases/ui/scheduled_tasks/ | Authenticated + org + agent |
If no specific scope is requested, run all categories. Prioritize by dependency order: auth → org → features.
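The dependency order can be made explicit with a small helper. This is a sketch under two assumptions: the running stack uses a single auth mode, so only the matching auth category is selected, and scheduled_tasks goes last because it also needs an agent. The function name `plan_categories` is hypothetical.

```shell
# Print the categories to run, one per line, in dependency order.
# $1 = auth mode of the running stack: "admin" or "full"
plan_categories() {
  # Auth category first; it must match AUTH_MODE.
  case "$1" in
    admin) echo admin_login ;;
    full)  echo full_auth ;;
    *)     echo "unknown auth mode: $1" >&2; return 1 ;;
  esac
  # Org creation next: every feature category needs an org.
  echo org_creation
  # Feature categories; scheduled_tasks last because it also needs an agent.
  echo mcp_servers
  echo global_chat
  echo global_search
  echo scheduled_tasks
}
```

Running `plan_categories full` lists full_auth first, then org_creation, then the four feature categories.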
Read each .md file in the target category. Each test case has:
Core pattern for each test:
# Navigate
agent-browser open http://localhost:<prefix>00/<path>
agent-browser wait --load networkidle
# Discover elements
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [button] "Submit", etc.
# Interact using refs
agent-browser fill @e1 "value"
agent-browser click @e2
agent-browser wait --load networkidle
# Verify result
agent-browser snapshot -i
agent-browser screenshot /tmp/test_<category>_<tc>.png
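The navigate, wait, snapshot, and screenshot steps of the core pattern can be bundled into one helper. This is a sketch that uses only the agent-browser commands shown above; the function names and the `AGENT_BROWSER` override variable are hypothetical, added so the CLI can be swapped (for example for `npx agent-browser`, or for `echo` in a dry run).

```shell
# Build the screenshot path used by the core pattern.
# $1 = category, $2 = test case id
screenshot_path() {
  printf '/tmp/test_%s_%s.png\n' "$1" "$2"
}

# Open a page, wait for it to settle, list interactive elements, and capture it.
# $1 = PORT_PREFIX, $2 = path under the UI root, $3 = category, $4 = test case id
open_and_capture() {
  # AGENT_BROWSER is an assumed override; it is left unquoted below so that
  # a value such as "npx agent-browser" splits into command and argument.
  ab="${AGENT_BROWSER:-agent-browser}"
  $ab open "http://localhost:${1}00/${2}" &&
    $ab wait --load networkidle &&
    $ab snapshot -i &&
    $ab screenshot "$(screenshot_path "$3" "$4")"
}
```

A dry run such as `AGENT_BROWSER=echo open_and_capture 31 login admin_login TC001` prints the four commands without touching a browser; the prefix, path, and test case id here are placeholders. Interactions (fill, click) still belong between the snapshot and the final capture, since refs must come from a fresh snapshot.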
Key patterns:
Hints from experience:
- Always run wait --load networkidle after navigation and form submissions.
- Re-snapshot after any DOM change; earlier refs (@e1, etc.) are invalidated.
- Chain commands with && for efficiency: agent-browser fill @e1 "x" && agent-browser fill @e2 "y"

Create or update test_cases/ui/MANUAL_TEST_RESULTS_<date>.md with:
# Manual UI Test Results - <YYYY-MM-DD>
## Environment
- **Auth Mode**: <admin|full>
- **Stack**: <components running>
- **PORT_PREFIX**: <value>
- **Browser**: Chromium (headless, via agent-browser)
## Test Summary
| Category | Tests | Pass | Fail/Partial | Issues |
|----------|-------|------|-------------|--------|
| ... | ... | ... | ... | ... |
| **Total** | **N** | **N** | **N** | **N** |
## Detailed Results
### <Category> (N/M PASS)
- **TC001 <Name>**: PASS|FAIL|PARTIAL - <one-line description of what happened>
## Issues Found
### Issue #N (<Severity>): <Title>
- **Severity**: Low|Medium|High|Info
- **Steps**: How to reproduce
- **Expected**: What should happen
- **Actual**: What happened
- **Impact**: User-facing consequence
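The numbers in the Test Summary table can be derived from the detailed result lines instead of being counted by hand. This is a sketch, assuming every result line follows the `- **TC001 <Name>**: PASS|FAIL|PARTIAL - ...` shape from the template; the function names and the output format are illustrative.

```shell
# Count result lines with a given status.
# $1 = status (PASS, FAIL, or PARTIAL), $2 = results file
count_status() {
  # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
  # keeps the function usable in scripts that run with "set -e".
  grep -c -E "^- \*\*TC[0-9]+ .*\*\*: $1( |\$)" "$2" || true
}

# Print the totals needed for one row of the Test Summary table.
# $1 = results file
summary_counts() {
  pass=$(count_status PASS "$1")
  fail=$(count_status FAIL "$1")
  partial=$(count_status PARTIAL "$1")
  printf 'total=%d pass=%d fail_partial=%d\n' \
    "$((pass + fail + partial))" "$pass" "$((fail + partial))"
}
```

Calling `summary_counts` on a results file gives the Tests, Pass, and Fail/Partial values; per-category rows would need the file split by category first, which this sketch does not attempt.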
If the user asks to file issues for failures, use Linear MCP tools:
If the user asks to test a single feature or re-test a specific case:
| Problem | Solution |
|---|---|
| agent-browser not found | Run via npx agent-browser |
| Stale refs after click | Always re-snapshot after DOM changes |
| Page doesn't load | Check stack health: curl localhost:<prefix>00/healthz |
| Login redirect loop | Verify AUTH_MODE env var matches test category |
| Screenshots blank | Add wait --load networkidle before screenshot |
| Element not visible | Try agent-browser scroll down before snapshot |
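The first row of the table can be handled up front by choosing the invocation once. This is a sketch; `resolve_cli` is a hypothetical helper, and it assumes npx is available whenever the CLI is not on PATH.

```shell
# Print the command to use for a CLI: the bare name if it is on PATH,
# otherwise the npx form.
# $1 = CLI name, e.g. agent-browser
resolve_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s\n' "$1"
  else
    printf 'npx %s\n' "$1"
  fi
}
```

For example, `AB=$(resolve_cli agent-browser)` followed by `$AB snapshot -i` (unquoted, so the npx form splits into two words) works in either case.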