원클릭으로
browser-edge-cases
// SOP for debugging browser automation failures on complex websites. Use when browser tools fail on specific sites like LinkedIn, Twitter/X, SPAs, or sites with Shadow DOM.
// SOP for debugging browser automation failures on complex websites. Use when browser tools fail on specific sites like LinkedIn, Twitter/X, SPAs, or sites with Shadow DOM.
Claim tasks, record step progress, and verify SOP gates in the colony SQLite queue. Applies when your spawn message includes a db_path field.
Required reading whenever any chart_* tool is available. Teaches the one-tool embedding contract (call chart_render → live chart appears in chat AND a downloadable PNG lands in the queen session dir), the ECharts (data viz) vs Mermaid (structural diagrams) decision, the BI/financial-grade aesthetic baseline (no chartjunk, restrained palette, proper typography, single message per chart), and the canonical spec patterns for the 12 most-common chart types. Skipping this leads to 1990s-Excel charts, missing downloads, and the agent writing markdown image links by hand instead of letting chart_render drive the UI.
Required before any browser_* tool call. Teaches the screenshot + browser_click_coordinate workflow that reaches shadow-DOM inputs selectors can't see, the CSS-pixel coordinate rule (not physical px), rich-text editor quirks ("send button stays disabled" failures), and CSP gotchas. Covers Chrome via CDP through the GCU Beeline extension. Skipping this causes repeated failures on LinkedIn / Reddit / X. Verified against real production sites 2026-04-11.
Required reading whenever any shell_* tool is available. Teaches the foreground/background dichotomy (terminal_exec auto-promotes past 30s, returns a job_id you poll with terminal_job_logs), the standard envelope shape (exit_code, stdout, stdout_truncated_bytes, output_handle, semantic_status, warning, auto_backgrounded, job_id), output handle pagination via terminal_output_get, when to read semantic_status instead of raw exit_code (grep/rg/find/diff/test exit 1 is NOT an error), the destructive-warning surface (rm -rf, git push --force, DROP TABLE), tool preference (use files-tools / gcu-tools / hive_tools before raw shell), and the bash-only-on-macOS policy. Skipping this leads to "tool returned no output" surprises, orphaned jobs, and panic over benign grep exit codes.
Use terminal_rg / terminal_find when you need raw filesystem search outside the project tree — system configs, /var/log, /etc, archive contents — or when files-tools.search_files is too project-scoped. Teaches the rg vs find vs terminal_exec("ls/du/tree") split, common rg flag combos for code/logs/configs, find predicates for mtime/size/type queries, and the rule that for tree views or single-file stat info you should just use terminal_exec instead of inventing a tool. Read before reaching for raw shell to grep or find anything.
Use when launching anything that runs longer than a minute, anything that streams logs, anything you want to keep running while doing other work — or when terminal_exec auto-backgrounded on you and returned a job_id. Teaches the start→poll→wait pattern with terminal_job_logs offset bookkeeping, the `wait_until_exit=True` blocking-poll idiom, the truncated_bytes_dropped resumption signal, the merge_stderr decision, the SIGINT→SIGTERM→SIGKILL escalation ladder via terminal_job_manage, and the hard rule that jobs die when the terminal-tools server restarts. Read before calling terminal_job_start, or right after terminal_exec auto-backgrounded.
| name | browser-edge-cases |
| description | SOP for debugging browser automation failures on complex websites. Use when browser tools fail on specific sites like LinkedIn, Twitter/X, SPAs, or sites with Shadow DOM. |
| license | MIT |
Standard Operating Procedure for debugging and fixing browser automation failures on complex websites.
browser_scroll succeeds but page doesn't movebrowser_click succeeds but no action triggeredbrowser_type text disappears or doesn't workbrowser_snapshot hangs or returns stale contentbrowser_navigate loads wrong content1. Create minimal test case demonstrating failure
2. Test against simple site (example.com) to verify tool works
3. Test against problematic site to confirm issue
Quick isolation test:
# Test 1: Does the tool work at all?
await browser_navigate(tab_id, "https://example.com")
result = await browser_scroll(tab_id, "down", 100)
# Should work on simple sites
# Test 2: Does it fail on the problematic site?
await browser_navigate(tab_id, "https://linkedin.com/feed")
result = await browser_scroll(tab_id, "down", 100)
# If this fails but example.com works → site-specific edge case
Step 2a: Check console for errors
console = await browser_console(tab_id)
# Look for: CSP violations, React errors, JavaScript exceptions
Step 2b: Inspect DOM structure
html = await browser_html(tab_id)
snapshot = await browser_snapshot(tab_id)
# Look for:
# - Nested scrollable divs (overflow: scroll/auto)
# - Shadow DOM roots
# - iframes
# - Custom widgets
Step 2c: Identify the pattern
| Symptom | Likely Cause | Check |
|---|---|---|
| Scroll doesn't move | Nested scroll container | Look for overflow: scroll divs |
| Click no effect | Element covered | Check getBoundingClientRect vs viewport |
| Type clears | Autocomplete/React | Check for event listeners on input; try browser_type_focused |
| Snapshot hangs | Huge DOM | Check node count in snapshot |
| Snapshot stale | SPA hydration | Wait after navigation |
Pattern: Always have fallbacks
async def robust_operation(tab_id):
# Method 1: Primary approach
try:
result = await primary_method(tab_id)
if verify_success(result):
return result
except Exception:
pass
# Method 2: CDP fallback
try:
result = await cdp_fallback(tab_id)
if verify_success(result):
return result
except Exception:
pass
# Method 3: JavaScript fallback
return await javascript_fallback(tab_id)
Pattern: Always add timeouts
# Bad - can hang forever
result = await browser_snapshot(tab_id)
# Good - fails fast with useful error
try:
result = await browser_snapshot(tab_id, timeout_s=10.0)
except asyncio.TimeoutError:
# Handle timeout gracefully
result = await fallback_snapshot(tab_id)
1. Run against problematic site → should work
2. Run against simple site → should still work (regression check)
3. Document in registry.md
Sites: LinkedIn, Twitter/X, any SPA with scrollable feeds
Detection:
// Find largest scrollable container
const candidates = [];
document.querySelectorAll('*').forEach(el => {
const style = getComputedStyle(el);
if (style.overflow.includes('scroll') || style.overflow.includes('auto')) {
const rect = el.getBoundingClientRect();
if (rect.width > 100 && rect.height > 100) {
candidates.push({el, area: rect.width * rect.height});
}
}
});
candidates.sort((a, b) => b.area - a.area);
return candidates[0]?.el;
Fix: Dispatch scroll events at container's center, not viewport center.
Sites: Modals, tooltips, SPAs with loading overlays
Detection:
const rect = element.getBoundingClientRect();
const centerX = rect.left + rect.width / 2;
const centerY = rect.top + rect.height / 2;
const topElement = document.elementFromPoint(centerX, centerY);
return topElement === element || element.contains(topElement);
Fix: Wait for overlay to disappear, or use JavaScript click.
Sites: React SPAs, modern web apps
Detection: If CDP click doesn't trigger handler but manual click works.
Fix: Use JavaScript click as primary:
element.click();
Sites: LinkedIn, Facebook, Twitter (feeds with 1000s of nodes)
Detection:
document.querySelectorAll('*').length > 5000
Fix:
Sites: React, Vue, Angular SPAs after navigation
Detection:
// Check if React app has hydrated
document.querySelector('[data-reactroot]') ||
document.querySelector('[data-reactid]')
Fix: Wait for specific selector after navigation:
await browser_navigate(tab_id, url, wait_until="load")
await browser_wait(tab_id, selector='[data-testid="content"]', timeout_ms=5000)
Sites: Components using Shadow DOM, Lit elements
Detection:
document.querySelectorAll('*').some(el => el.shadowRoot)
Fix: Pierce shadow root:
function queryShadow(selector) {
const parts = selector.split('>>>');
let node = document;
for (const part of parts) {
if (node.shadowRoot) {
node = node.shadowRoot.querySelector(part.trim());
} else {
node = node.querySelector(part.trim());
}
}
return node;
}
| Issue | Primary Fix | Fallback |
|---|---|---|
| Scroll not working | Find scrollable container | Mouse wheel at container center |
| Click no effect | JavaScript click() | CDP mouse events |
| Type clears | Add delay_ms | Use browser_type_focused (Input.insertText) |
| Snapshot hangs | Add timeout_s | DOM snapshot fallback |
| Stale content | Wait for selector | Increase wait_until timeout |
| Shadow DOM | Pierce selector | JavaScript traversal |