| name | chrome |
| description | This skill should be used when the user wants to inspect or control their own Chrome browser session over CDP. Triggers include: "grab my chrome tab", "show me my tabs", "screenshot my [site] tab", "what's open in my chrome", "eval this in my browser", "cookies from my chrome", "navigate my chrome to", "what's the page console showing", "check my chrome network requests". Does not apply for generic browser automation (use webwright) or the agent-os Chrome endpoint (use agent-os). |
chrome
Talk to a real, running Chrome instance on a remote machine via CDP (Chrome DevTools Protocol). The remote Chrome must be launched with --remote-debugging-port=<PORT>. Every command in this skill runs one SSH hop away on the user's machine; nothing leaves that machine except the response payload you fetch back.
Defaults (override via env vars)
SSH_TARGET="${CHROME_HOST:-steamy-wsl}"
CHROME_PORT="${CHROME_PORT:-9222}"
POWERSHELL="${CHROME_POWERSHELL:-/mnt/c/Windows/System32/WindowsPowerShell/v1.0/powershell.exe}"
REMOTE_DIR="${CHROME_REMOTE_DIR:-/mnt/c/screens}"
NATIVE_DIR="${CHROME_NATIVE_DIR:-C:\\screens}"
CHROME_SKILL_DIR="${CHROME_SKILL_DIR:?set to this installed skill directory}"
The host needs a Chrome started like:
chrome.exe --remote-debugging-port=9222 --user-data-dir=C:\chrome-debug
On the default host there's a "Chrome (debug)" desktop shortcut wired to this. If curl -s http://127.0.0.1:9222/json from the remote returns nothing, ask the user to launch the debug Chrome and open the target page in that window — a normal Chrome session won't expose CDP.
Sanity check (do this first)
ssh "$SSH_TARGET" "$POWERSHELL -NoProfile -Command \"try { Invoke-RestMethod -Uri http://127.0.0.1:$CHROME_PORT/json/version -TimeoutSec 3 | Select-Object Browser,'User-Agent' } catch { 'CDP_DOWN' }\""
If you see CDP_DOWN, keep the scope narrow:
- For explicit "my Chrome", "my tabs", cookies, or steamy personal-session tasks, ask the user to start the debug Chrome and open the target page in that window.
- For generic web automation or sandbox browser work, switch to the appropriate skill instead of stretching this one.
Everything below assumes the selected CDP endpoint is live.
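To make the CDP_DOWN branch scriptable, one option is to capture the sanity-check output and classify it before running anything else. The helper below is a hypothetical sketch (cdp_status is not part of this skill); it only inspects a string, and it assumes you hand it whatever the sanity-check command printed.

```shell
# Hypothetical helper: classify the text printed by the sanity check.
# Prints "up" and returns 0 when the output looks usable,
# prints "down" and returns 1 otherwise.
cdp_status() {
  local out="$1"
  case "$out" in
    # Empty output (ssh failure, timeout) and the CDP_DOWN sentinel both mean "down".
    ''|*CDP_DOWN*) echo "down"; return 1 ;;
    *) echo "up"; return 0 ;;
  esac
}
```

Call it as cdp_status "$(…sanity check command…)" and stop early on a non-zero exit instead of issuing CDP calls that would fail.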
List tabs
ssh "$SSH_TARGET" "$POWERSHELL -NoProfile -Command \"(Invoke-RestMethod -Uri http://127.0.0.1:$CHROME_PORT/json -TimeoutSec 3) | Where-Object { \\\$_.type -eq 'page' } | ForEach-Object { '{0} :: {1}' -f \\\$_.title, \\\$_.url }\""
Pick a tab by title or URL substring — every helper below takes a -Pattern that does a case-insensitive substring match against both fields.
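The matching rule can be sketched locally to predict which tabs a pattern will hit. This is an illustration of the rule as described above, not the code inside the helper scripts; tab_matches is a hypothetical name, and treating an empty pattern as "match every tab" is an assumption.

```shell
# Sketch: tab_matches PATTERN TITLE URL
# Returns 0 if PATTERN occurs (ignoring case) in TITLE or in URL.
tab_matches() {
  local pat title url
  # Lower-case all three inputs so the comparison ignores case.
  pat=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  title=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  url=$(printf '%s' "$3" | tr '[:upper:]' '[:lower:]')
  # Quoting "$pat" inside the pattern makes it a literal substring, not a glob.
  case "$title" in *"$pat"*) return 0 ;; esac
  case "$url" in *"$pat"*) return 0 ;; esac
  return 1
}
```

A short pattern such as github can match several tabs at once; a longer substring of the URL narrows the selection.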
Workhorse — generic CDP call
scripts/cdp-call.ps1 opens a WebSocket to a tab (or to the browser endpoint), sends one JSON-RPC call, and prints the raw response. Stage it and cdp-shot.ps1 (used for screenshots) once per session:
scp -q "$CHROME_SKILL_DIR/scripts/cdp-call.ps1" "$SSH_TARGET:$REMOTE_DIR/cdp-call.ps1"
scp -q "$CHROME_SKILL_DIR/scripts/cdp-shot.ps1" "$SSH_TARGET:$REMOTE_DIR/cdp-shot.ps1"
cdp() {
local pat="$1" method="$2" params="${3:-{\}}"
ssh "$SSH_TARGET" "$POWERSHELL -NoProfile -ExecutionPolicy Bypass -File '$NATIVE_DIR\\cdp-call.ps1' -Pattern '$pat' -Port $CHROME_PORT -Method '$method' -Params '$params'" 2>/dev/null
}
Now any CDP method works:
cdp github 'Page.navigate' '{"url":"https://example.com"}' | jq
cdp 'github.com' 'Runtime.evaluate' '{"expression":"document.title","returnByValue":true}' | jq .result.result.value
cdp '' 'Network.getCookies' '{}' | jq '.result.cookies | length'
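The -Browser switch on cdp-call.ps1 selects the browser-wide endpoint (for things like Target.getTargets, Browser.getVersion). The cdp function above does not pass it, so add -Browser to the ssh command line when you need those methods.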
Tab screenshot — works even if minimized
CDP renders off-screen; window state doesn't matter.
chrome_shot() {
local pat="$1"
local name=$(ssh "$SSH_TARGET" "$POWERSHELL -NoProfile -ExecutionPolicy Bypass -File '$NATIVE_DIR\\cdp-shot.ps1' -Pattern '$pat' -Port $CHROME_PORT -OutDir '$NATIVE_DIR'" 2>/dev/null | tr -d '\r\n')
[ -z "$name" ] && { echo "no tab matched '$pat'" >&2; return 1; }
local dest="${CLAUDE_JOB_DIR:-/tmp}/$name"
ssh "$SSH_TARGET" "cat \"$REMOTE_DIR/$name\"" > "$dest"
echo "$dest"
}
chrome_shot 'github.com'
For full-page (beyond-viewport) screenshots, call cdp with Page.captureScreenshot and {"captureBeyondViewport":true}.
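Page.captureScreenshot returns the image as base64 text in the data field of the result, so the full-page call needs a decode step before you have a file. A minimal sketch, assuming the cdp function and jq from the sections above; save_b64 is a hypothetical helper name.

```shell
# Hypothetical helper: decode base64 image data from stdin into a file
# and print the destination path on success.
# Intended use (requires the cdp function and jq):
#   cdp github 'Page.captureScreenshot' '{"captureBeyondViewport":true}' \
#     | jq -r '.result.data' | save_b64 full.png
save_b64() {
  local dest="$1"
  base64 --decode > "$dest" && echo "$dest"
}
```

Very tall pages produce large payloads, so expect this call to be slower than chrome_shot.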
Evaluate JS in a tab
chrome_eval() {
local pat="$1" expr="$2"
local params=$(printf '%s' "$expr" | python3 -c 'import sys,json;print(json.dumps({"expression":sys.stdin.read(),"returnByValue":True,"awaitPromise":True}))')
printf '%s' "$params" | ssh "$SSH_TARGET" "$POWERSHELL -NoProfile -ExecutionPolicy Bypass -File '$NATIVE_DIR\\cdp-call.ps1' -Pattern '$pat' -Port $CHROME_PORT -Method Runtime.evaluate -ParamsStdin" \
| jq '.result.result.value // .result.exceptionDetails'
}
chrome_eval github $'document.querySelectorAll("a[href*=\'foo\']").length' # $'...' quoting lets \' embed apostrophes
chrome_eval github 'fetch("/api/foo").then(r=>r.json())' # awaitPromise unwraps it
returnByValue:true is important — without it CDP returns an objectId reference, not the actual value. The -ParamsStdin switch on cdp-call.ps1 keeps the JS expression out of every shell quoting hazard between bash → ssh → PowerShell.
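-ParamsStdin protects the expression after it leaves your local shell, but the local shell still parses the argument you type. For longer snippets that mix both quote styles, template literals, or $, a quoted heredoc is one way to sidestep local quoting entirely. This is a sketch; js_expr is just an illustrative variable name.

```shell
# A quoted heredoc delimiter ('JS') disables all shell expansion, so quotes,
# backticks and ${...} reach the variable exactly as written.
js_expr=$(cat <<'JS'
[...document.querySelectorAll("a[href*='foo']")]
  .map(a => `${a.textContent.trim()} -> ${a.href}`)
JS
)
# Then pass it as a single argument:
#   chrome_eval github "$js_expr"
```

Keep the double quotes around "$js_expr" so the expression arrives as one argument.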
Console messages
CDP streams console events; to collect them, enable Runtime, then keep the socket open. cdp-call.ps1 discards event frames, so a single call cannot return console output. For a one-shot "show me what's already in the console", the only option is to read a log the page keeps itself. console.history is not a built-in Chrome API, so the call below returns data only if the page (or a hook installed earlier) maintains such an array, and null otherwise:
chrome_eval github 'console.history?.slice(-50)'
For live capture, use a long-lived listener script along the lines of cdp-listen.ps1 (not bundled; add it if sessions keep needing it), or ask the user to open DevTools and copy what's there.
Network requests
Same caveat: CDP only sees requests after Network.enable is sent. To inspect prior traffic, ask the user to open DevTools → Network → export HAR, then pull the HAR file via ssh.
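For ongoing traffic capture (e.g. "show me what this page is fetching when I click X"), enable Network and stream events over one long-lived connection. Domain enablement is per connection and cdp opens a fresh socket for every call, so Network.enable from the first call does not carry over to the second, and the requestId values arrive in events that cdp-call.ps1 discards. The two lines below show the method sequence only: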
cdp github 'Network.enable' '{}'
cdp github 'Network.getResponseBody' '{"requestId":"..."}'
Cookies
cdp '' 'Network.getCookies' '{}' | jq '.result.cookies[] | {name, domain, path, expires, size, httpOnly, secure, sameSite}'
cdp '' 'Storage.getCookies' '{}' | jq '.result.cookies | length'
Do not print cookie values by default. Fetch cookie values only when the user explicitly asks and the security risk is understood.
Navigate / reload / close
cdp github 'Page.navigate' '{"url":"https://example.com"}'
cdp github 'Page.reload' '{}'
cdp '' 'Target.closeTarget' '{"targetId":"<id from getTargets>"}'
DOM snapshot
For "what's on this page" without screenshotting:
chrome_eval github 'document.body.innerText.slice(0,2000)'
cdp github 'DOMSnapshot.captureSnapshot' '{"computedStyles":[]}' | jq '.result | keys'
Adapting to another machine
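Persist via ~/.claude/settings.json's env block (reloaded per session). Two caveats about the values: the defaults above use ${VAR:-default}, so an empty string counts as unset and falls back to the default rather than clearing the setting; and ~ is not expanded where a snippet wraps the remote path in double quotes (as chrome_shot does), so prefer an absolute path for CHROME_REMOTE_DIR.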
{
"env": {
"CHROME_HOST": "workbox",
"CHROME_PORT": "9223",
"CHROME_REMOTE_DIR": "~/Downloads",
"CHROME_NATIVE_DIR": "",
"CHROME_POWERSHELL": ""
}
}
One-shot override inline:
CHROME_HOST=workbox CHROME_PORT=9223 <paste any snippet>
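How the defaults resolve under the ${VAR:-default} form can be checked locally. The function below is a hypothetical demo that mirrors the CHROME_PORT default; it is not part of the skill.

```shell
# Hypothetical demo mirroring the CHROME_PORT default above.
resolve_port() {
  # ${VAR:-default} falls back when VAR is unset OR set to the empty string.
  echo "${CHROME_PORT:-9222}"
}
```

Running CHROME_PORT=9223 resolve_port prints 9223, while both an unset and an empty CHROME_PORT print 9222.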
If the target is non-Windows (macOS/Linux), drop $POWERSHELL and use curl/websocat directly:
ssh "$SSH_TARGET" "curl -s http://127.0.0.1:$CHROME_PORT/json"
How cdp-call.ps1 works
The script opens one WebSocket, sends {"id":1, method, params}, then loops past any unsolicited events (frames without an id field) until it receives the response with id:1. That means it's safe to call against tabs where another tool has already done Runtime.enable / Page.enable — the events get discarded silently rather than confusing the reply.
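The skip-until-reply logic can be illustrated in shell. The real script is PowerShell and is not reproduced here; await_reply is a hypothetical name, and the sketch assumes one JSON frame per line with "id" serialized as the first key of a reply, so a prefix test stands in for real JSON parsing.

```shell
# Sketch: read frames from stdin, discard everything that is not the reply
# to the given id, print the first matching frame.
await_reply() {
  local want="$1" frame
  while IFS= read -r frame; do
    case "$frame" in
      # Reply to our call: the frame starts with {"id":<want>
      "{\"id\":$want,"*|"{\"id\":$want}")
        printf '%s\n' "$frame"
        return 0 ;;
      # Unsolicited event, or a reply to some other id: skip it.
      *) ;;
    esac
  done
  return 1  # stream ended without a matching reply
}
```

Because events are dropped rather than buffered, this pattern suits single request/response calls and not event streaming.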
Notes
- CDP target scopes: tab (page) vs browser. cdp-call.ps1 -Browser switches. Some methods (Browser.*, Target.*) only work on the browser endpoint.
- One call per ws connection in cdp-call.ps1. That's fine for ad-hoc work; for streams (Network/Page events), you need a long-lived connection, so extend the script if needed.
- Headless Chrome on the same port: doesn't conflict, but you'll get both sets of tabs in /json. Filter by -Pattern.
- agent-browser fallback: agent-browser spawns its own Chromium locally for automation. Prefer it when the task does not need an existing real Chrome profile/session, especially after CDP is unavailable.
- Sister skill screenshots handles full-desktop captures (which CDP can't see). Use chrome_shot when you want a specific tab; use screenshots Mode 2 when you want the whole monitor.