browser-testing
// Full browser automation via Agent Browser Protocol (ABP). Navigate, click, type, scroll, drag, screenshot, extract text, handle dialogs/downloads/file pickers, manage tabs, control JS execution. Single CLI tool.
| name | browser-testing |
| description | Full browser automation via Agent Browser Protocol (ABP). Navigate, click, type, scroll, drag, screenshot, extract text, handle dialogs/downloads/file pickers, manage tabs, control JS execution. Single CLI tool. |
Single tool: {baseDir}/browser.js <command> [args] [--flags]
ABP is a Chromium fork with a REST API baked into the engine. Every action is deterministic — JS freezes between steps, no race conditions, no manual waits.
{baseDir}/browser.js start # Launch ABP on :8222
B={baseDir}/browser.js
# Navigate
$B nav https://example.com # Navigate active tab
$B nav https://other.com --new # New tab
$B back # History back
$B forward # History forward
$B reload # Reload
# Mouse
$B click 450 320 # Left click
$B click 450 320 --right # Right click
$B click 450 320 --double # Double click
$B click 450 320 --mod CTRL # Ctrl+click
$B hover 300 200 # Mouse move (trigger tooltips/menus)
$B scroll 640 400 --dy 500 # Scroll down 500px
$B scroll 640 400 --dy -300 # Scroll up
$B scroll 640 400 --dx 200 # Scroll right
$B drag 100 200 500 200 # Drag from→to
$B drag 100 200 500 200 --steps 20 # Smooth drag
# Keyboard
$B type hello world # Type text
$B key ENTER # Press key
$B key TAB # Tab
$B key ESCAPE # Escape
$B key a --mod CTRL # Ctrl+A (select all)
$B key c --mod CTRL # Ctrl+C (copy)
$B key ARROWDOWN # Arrow keys
$B key BACKSPACE
$B key a --mod CTRL --action down # Key down only (hold)
$B key a --action up # Key up (release)
# Input helpers
$B slider 400 300 75 # Set range input to 75
$B clear 400 300 # Clear text field (click + select all + delete)
$B pick "Select the login button" # Interactive: user clicks element in browser
# Screenshot
$B screenshot # Viewport with interactive markup
$B screenshot --markup clickable # Only clickable elements
$B screenshot --markup typeable # Only input fields
$B screenshot --markup clickable,typeable,scrollable,grid
$B screenshot --markup none # Clean, no overlays
$B screenshot --format png # PNG instead of WebP
# Extract content
$B text # All visible text (fast, API-native)
$B text "h1.title" # Text within CSS selector
$B eval 'document.title' # Execute JavaScript
$B eval '({links: document.querySelectorAll("a").length})'
$B content # Current page as Markdown (Readability)
$B content https://example.com # Navigate + extract as Markdown
$B cookies # Non-HttpOnly cookies
$B tabs # List all tabs
$B tabs new https://google.com # New tab with URL
$B tabs activate <id> # Switch to tab
$B tabs close <id> # Close tab
$B tabs info <id> # Tab details
$B tabs stop <id> # Stop loading
ABP surfaces events that normally require polling — dialogs, file pickers, downloads, select dropdowns, permission prompts. They appear in the output of any action.
# Dialogs (alert, confirm, prompt)
$B dialog # Check for pending dialog
$B dialog accept # Accept
$B dialog accept "response text" # Accept prompt with text
$B dialog dismiss # Dismiss/cancel
# Downloads
$B download # List all
$B download status <id> # Check progress
$B download cancel <id> # Cancel
$B download get <id> # Get content (base64)
# File chooser (triggered by file input click)
$B file <chooser_id> /path/to/file.pdf # Upload file
$B file <chooser_id> file1.jpg file2.jpg # Multiple files
$B file <chooser_id> --cancel # Cancel picker
$B file <chooser_id> --save /path/out.pdf # Save dialog
# Native <select> dropdown
$B select <select_id> 2 # Choose option at index
# Permissions (geolocation, camera, etc.)
$B permission # List pending
$B permission grant <id> # Grant
$B permission grant <id> --lat 42.36 --lng -71.06 # Grant geo with coords
$B permission deny <id> # Deny
ABP freezes JS between actions by default. You can control this:
$B execution # Current state
$B execution pause # Freeze JS & virtual time
$B execution resume # Unfreeze
# Batch: multiple actions, one screenshot
$B batch '[{"type":"mouse_click","x":350,"y":200},{"type":"keyboard_type","text":"hello"},{"type":"keyboard_press","key":"ENTER"}]'
# Session history (SQLite-backed, for training data)
$B history # List sessions
$B history current # Current session
$B history actions # Action log
$B history clear # Delete all
# Lifecycle
$B status # Browser readiness
$B shutdown # Graceful shutdown
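Hand-writing the JSON array for `batch` is easy to get wrong once quoting is involved. The sketch below assembles the same payload as the batch example above from small helper functions. The helper names (`click_action`, `type_action`, `press_action`, `batch_payload`) are hypothetical and not part of browser.js, and only the three action types shown above are assumed to exist.

```shell
#!/bin/sh
# Hypothetical helpers (not part of browser.js): each prints one batch action
# as JSON, using only the action types that appear in the batch example.
# No JSON escaping is done: text and key must not contain quotes or backslashes.
click_action() { printf '{"type":"mouse_click","x":%d,"y":%d}' "$1" "$2"; }
type_action()  { printf '{"type":"keyboard_type","text":"%s"}' "$1"; }
press_action() { printf '{"type":"keyboard_press","key":"%s"}' "$1"; }

# Join every argument (one JSON object each) into a JSON array.
batch_payload() {
  joined=""
  for action in "$@"; do
    joined="${joined:+$joined,}$action"
  done
  printf '[%s]\n' "$joined"
}

payload=$(batch_payload \
  "$(click_action 350 200)" \
  "$(type_action hello)" \
  "$(press_action ENTER)")

printf '%s\n' "$payload"
# With ABP running, the payload would be passed as: $B batch "$payload"
```

The printed payload is byte-for-byte the array used in the batch example, so it can be passed to `$B batch` unchanged.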
| Flag | Description |
|---|---|
| `--tab <id>` | Target specific tab (default: active) |
| `--shot` | Save screenshot after action (prints path) |
| `--markup <types>` | Screenshot markup: `interactive`, `clickable,typeable,scrollable,grid,selected`, or `none` |
| `--format <fmt>` | Screenshot format: `webp` (default), `png`, `jpeg` |
| `--json` | Output raw API response as JSON |
When events occur during any action, they're printed automatically:
→ https://new-page.com # Navigation happened
⚠ dialog (confirm): Delete item? # Dialog appeared
📁 file chooser id=fc_1 # File picker opened
⬇ download: report.pdf # Download started
▾ select id=s_1 (5 options) # Native select opened
🔐 permission id=p_1 geolocation # Permission requested
↗ popup: https://popup.com # Popup window
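Because the follow-up commands (`file`, `select`, `permission`) need the ID printed in these event lines, a script has to pull it out of the action output. The following is a minimal sketch that assumes the output looks exactly like the lines above; `event_id` is a hypothetical helper, not part of browser.js, and the sample text stands in for real output.

```shell
#!/bin/sh
# Sample action output, copied from the event lines above (an assumption
# about the exact format; a real run may differ).
output='→ https://new-page.com
⚠ dialog (confirm): Delete item?
📁 file chooser id=fc_1
▾ select id=s_1 (5 options)
🔐 permission id=p_1 geolocation'

# Hypothetical helper: print the id of the first line containing "<name> id=".
# $1 = event name, $2 = captured action output.
event_id() {
  printf '%s\n' "$2" | while IFS= read -r line; do
    case $line in
      *"$1 id="*)
        rest=${line#*"$1 id="}       # drop everything through "id="
        printf '%s\n' "${rest%% *}"  # keep the text before the first space
        break ;;
    esac
  done
}

chooser=$(event_id 'file chooser' "$output")
select_id=$(event_id 'select' "$output")
perm=$(event_id 'permission' "$output")
printf 'chooser=%s select=%s permission=%s\n' "$chooser" "$select_id" "$perm"

# With ABP running, the IDs would feed the handlers documented earlier:
#   $B file "$chooser" /path/to/file.pdf
#   $B select "$select_id" 2
#   $B permission grant "$perm"
```

The helper uses only shell parameter expansion, so it does not depend on how `sed` or `grep` treat the emoji prefixes in a given locale.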
The fast pattern: navigate → eval to extract. Skip screenshots unless you're lost.
- Run `browser.js start` first.
- Skip `--shot` during form-filling. Only screenshot when you need to see layout.
- Once you know the URL a form leads to, `nav` directly next time and skip the form entirely.
- `eval`, not vision: one JS query extracts 10 results faster than scrolling + screenshotting.
- One `batch` call instead of three.
- `text` for simple data: `text` is faster than `eval` for plain text extraction.

Anti-pattern: click → screenshot → read image → decide → click → screenshot → ... (each step: ~3s for screenshot + LLM vision round-trip)
Fast pattern: nav → click click click (no shots) → eval to extract all data → screenshot once to verify
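As a rough illustration of the fast pattern, the sketch below runs the whole sequence as a dry run. `B` is set to an `echo` stand-in so the script works without ABP and only prints the commands it would issue; the second click coordinate is a placeholder, not a value from a real page.

```shell
#!/bin/sh
# Dry-run stand-in (assumption for this sketch): print each command instead
# of running it. For a real session use: B={baseDir}/browser.js
B="echo browser.js"

$B nav https://example.com                                   # navigate once
$B click 450 320                                             # act without --shot
$B click 450 380                                             # placeholder coordinates
$B eval '({links: document.querySelectorAll("a").length})'   # one query for the data
$B screenshot --markup none                                  # single screenshot to verify
```

Only the final step takes a screenshot, which is the point of the pattern: the vision round-trip happens once instead of after every click.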