com um clique
browser-testing-with-screenshots
// Use when testing web applications with visual verification - automates Chrome browser interactions, element selection, and screenshot capture for confirming UI functionality
// Use when testing web applications with visual verification - automates Chrome browser interactions, element selection, and screenshot capture for confirming UI functionality
Use when you need Codex to coordinate multiple agents through Relaycast for peer-to-peer messaging, lead/worker handoffs, or shared status tracking across sub-agents and terminals.
Use when creating Agent Skills packages (SKILL.md format) for Codex CLI, GitHub Copilot, or Amp - provides the agentskills.io specification with frontmatter constraints, directory structure, and validation rules
Use when creating or improving Claude Code agents. Expert guidance on agent file structure, frontmatter, persona definition, tool access, model selection, and validation against schema.
Use when creating or publishing Claude Code hooks - covers executable format, event types, JSON I/O, exit codes, security requirements, and PRPM package structure
Use when creating or fixing .claude/rules/ files - provides correct paths frontmatter (not globs), glob patterns, and avoids Cursor-specific fields like alwaysApply
Use when creating new Claude Code skills or improving existing ones - ensures skills are discoverable, scannable, and effective through proper structure, CSO optimization, and real examples
| name | browser-testing-with-screenshots |
| description | Use when testing web applications with visual verification - automates Chrome browser interactions, element selection, and screenshot capture for confirming UI functionality |
Automate Chrome browser testing with visual verification using browser-tools. Connect to Chrome DevTools Protocol for navigation, interaction, and screenshot capture to confirm application functionality.
REQUIRED: Install agent-tools from https://github.com/badlogic/agent-tools
# Clone and install agent-tools
git clone https://github.com/badlogic/agent-tools.git
cd agent-tools
# Follow installation instructions in the repository
# Ensure all executables (browser-start.js, browser-nav.js, etc.) are in your PATH
Verify installation:
# Check that browser tools are available
which browser-start.js
which browser-nav.js
which browser-screenshot.js
All browser-* commands referenced in this skill come from the agent-tools repository and must be properly installed and accessible in your system PATH.
Use this skill when:
Don't use for:
| Task | Command | Purpose |
|---|---|---|
| Start browser | browser-start.js | Launch Chrome with debugging |
| Navigate | browser-nav.js http://localhost:5172/dashboard | Go to specific URL |
| Take screenshot | browser-screenshot.js | Capture current viewport |
| Pick elements | browser-pick.js "Select the login button" | Interactive element selection |
| Run JavaScript | browser-eval.js 'document.title' | Execute code in page context |
| Extract content | browser-content.js | Get readable page content |
| View cookies | browser-cookies.js | List session cookies |
# Launch Chrome with debugging enabled (preserves user profile)
browser-start.js
# Or start fresh (no cookies, clean state)
browser-start.js --fresh
Expected Result: Chrome opens on port 9222 with DevTools Protocol enabled
# Go to your application starting point
browser-nav.js http://localhost:5172/dashboard
Verify: Browser navigates to dashboard page
# Take initial screenshot to confirm page loaded
browser-screenshot.js
Output: Returns path to screenshot file (e.g., screenshot_20231203_141532.png)
#!/bin/bash
# Test login and dashboard functionality
echo "🚀 Starting browser test..."
# 1. Launch browser
browser-start.js --fresh
# 2. Navigate to login page
browser-nav.js http://localhost:5172/login
sleep 2
# 3. Take screenshot of login page
LOGIN_SHOT=$(browser-screenshot.js)
echo "📸 Login page: $LOGIN_SHOT"
# 4. Fill login form (interactive element picking)
browser-pick.js "Click the username field"
browser-eval.js 'document.activeElement.value = "testuser"'
browser-pick.js "Click the password field"
browser-eval.js 'document.activeElement.value = "password123"'
# 5. Screenshot filled form
FORM_SHOT=$(browser-screenshot.js)
echo "📸 Filled form: $FORM_SHOT"
# 6. Submit form
browser-pick.js "Click the login button"
sleep 3
# 7. Verify dashboard loaded
browser-nav.js http://localhost:5172/dashboard
DASHBOARD_SHOT=$(browser-screenshot.js)
echo "📸 Dashboard: $DASHBOARD_SHOT"
# 8. Verify specific dashboard elements
browser-pick.js "Select the navigation menu"
browser-eval.js 'console.log("Navigation found:", !!document.querySelector(".nav"))'
echo "✅ Test complete. Screenshots saved."
# Interactive element selection (best for dynamic content)
browser-pick.js "Select the submit button"
# User clicks element in browser → returns CSS selector
# Use returned selector for automation
SELECTOR=$(browser-pick.js "Select the submit button" | grep "selector:")
browser-eval.js "document.querySelector('$SELECTOR').click()"
# Take screenshot to verify action
browser-screenshot.js
# Check if element exists before interaction
browser-eval.js 'document.querySelector("#login-form") !== null'
# Wait for dynamic content
browser-eval.js '
new Promise(resolve => {
const check = () => {
if (document.querySelector(".loaded")) resolve(true);
else setTimeout(check, 100);
};
check();
})
'
# Extract form data
browser-eval.js 'JSON.stringify(Object.fromEntries(new FormData(document.querySelector("form"))))'
# Navigate and wait before screenshot
browser-nav.js http://localhost:5172/slow-page
sleep 5 # Wait for animations/loading
browser-screenshot.js
# Get page title
PAGE_TITLE=$(browser-eval.js 'document.title')
echo "Current page: $PAGE_TITLE"
# Extract readable content
browser-content.js > page_content.md
# Check for specific text
browser-eval.js 'document.body.textContent.includes("Welcome to Dashboard")'
| Mistake | Problem | Solution |
|---|---|---|
| No sleep after navigation | Screenshots of loading page | Add sleep 2-5 after nav |
| Hardcoded selectors | Breaks when UI changes | Use browser-pick.js for selection |
| Missing Chrome setup | "Connection refused" errors | Run browser-start.js first |
| Wrong localhost port | Navigation fails | Verify application is running on correct port |
| Screenshot timing | Captures before content loads | Wait for page load or specific elements |
| Not preserving state | Login lost between commands | Use default profile, not --fresh |
# Check if Chrome is running
if ! browser-eval.js 'true' 2>/dev/null; then
echo "❌ Chrome not connected. Running browser-start.js..."
browser-start.js
fi
# Verify navigation succeeded
if browser-eval.js 'location.href.includes("dashboard")'; then
echo "✅ Navigation successful"
else
echo "❌ Navigation failed"
exit 1
fi
screenshot_YYYYMMDD_HHMMSS.png in current directorybrowser-content.jsbrowser-pick.js interactionbrowser-eval.js# Create test evidence directory
mkdir -p test-results/$(date +%Y%m%d_%H%M%S)
cd test-results/$(date +%Y%m%d_%H%M%S)
# Run tests with organized screenshots
browser-start.js
for page in login dashboard profile; do
browser-nav.js "http://localhost:5172/$page"
sleep 2
screenshot=$(browser-screenshot.js)
mv "$screenshot" "${page}_page.png"
echo "✅ $page page tested"
done
Benefits:
Results:
Key principle: Combine automated navigation with human element selection for robust, maintainable browser testing.