agent-browser

Name: Agent Browser
Author: alchemiststudiosDOTai

Browser automation for AI agents (Linux/macOS/Windows). Use when the user needs to interact with websites, navigate pages, fill forms, click buttons, take screenshots, extract data, test web apps, or automate browser tasks. Triggers include "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data", "test this web app", "login to a site", or any task requiring programmatic web interaction.

stars:1

forks:0

updated: March 25, 2026 00:31

File explorer

5 files

SKILL.md

readonly

related-skills.json

Same repository

harness-map.md

from "alchemiststudiosDOTai/pi-harness-skills"

Map a repository's mechanical harness layers: canonical check command, local and CI gates, architecture boundaries, structural rules, behavioral verification, docs ratchets, evidence workflows, and operator-facing surfaces. Use when you need to understand how a repo keeps change safe.

2026-03-251

plan-phase.md

from "alchemiststudiosDOTai/pi-harness-skills"

Generate execution-ready implementation plans (PLUS per-task ticket files) from research docs — planning ONLY, no fixing or verifying. North Star is whether a JR developer can execute the work with zero additional context.

2026-03-251

docs-frontmatter-ontology.md

from "alchemiststudiosDOTai/pi-harness-skills"

Enforce repository Markdown docs with required YAML frontmatter and ontology_relations, and set up a pre-push hook that blocks pushes when any .md file except AGENTS.md is missing required frontmatter keys.

2026-03-181

smart-git.md

from "alchemiststudiosDOTai/pi-harness-skills"

Stage all local changes, generate a commit message with branch context, diff stats, and a truncated inline diff versus origin/<branch>, then commit and push with automatic rebase-and-retry when remote is ahead.

2026-03-181

package.json

"author": "alchemiststudiosDOTai"

"repository": "alchemiststudiosDOTai/pi-harness-skills"

name: agent-browser
description: Browser automation for AI agents (Linux/macOS/Windows). Use when the user needs to interact with websites, navigate pages, fill forms, click buttons, take screenshots, extract data, test web apps, or automate browser tasks. Triggers include "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data", "test this web app", "login to a site", or any task requiring programmatic web interaction.
allowed-tools: ["Bash"]
writes-to: .artifacts/browser/
hard-guards: ["Always re-snapshot after navigation or DOM changes", "Close browser session when done", "Use content boundaries for untrusted pages"]

Browser Automation with agent-browser

The agent-browser CLI automates Chrome/Chromium via CDP (Chrome DevTools Protocol). Install via npm i -g agent-browser, brew install agent-browser, or cargo install agent-browser.

Installation

Linux (most common)

# Option 1: npm (recommended)
npm install -g agent-browser

# Option 2: cargo (if you have Rust)
cargo install agent-browser

# Then install Chromium
agent-browser install

macOS

# npm
npm install -g agent-browser

# or Homebrew
brew install agent-browser

# Install Chromium
agent-browser install

Windows (WSL2 or native)

# npm (WSL2 recommended for best experience)
npm install -g agent-browser
agent-browser install

Verify Installation

agent-browser --version
agent-browser install  # Downloads Chromium if not present
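
In a script, it can help to confirm that the CLI is on PATH before running any browser commands. The following is a minimal sketch; `require_cmd` is a hypothetical helper name chosen for illustration, not part of agent-browser.

```shell
# require_cmd NAME: succeed if NAME is on PATH; otherwise print a hint to stderr and fail.
# (require_cmd is an illustrative helper, not an agent-browser command.)
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing command: $1" >&2
  return 1
}
```

A typical use would be `require_cmd agent-browser && agent-browser --version`.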

Dependencies (Linux)

If Chromium fails to launch, you may need system libraries:

# Debian/Ubuntu
sudo apt install libnss3 libnspr4 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 libxkbcommon0 libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libasound2

# Fedora/RHEL
sudo dnf install nss nspr cups-libs libXcomposite libXdamage libXrandr libXScrnSaver alsa-lib

Quick Start

# Navigate to a URL
agent-browser open https://example.com

# Get interactive elements
agent-browser snapshot -i

# Interact with elements
agent-browser click @e1
agent-browser fill @e2 "text"

# Take screenshot
agent-browser screenshot output.png

# Close when done
agent-browser close

Core Workflow

Every browser automation follows this pattern:

1. Navigate: agent-browser open <url>
2. Snapshot: agent-browser snapshot -i (get element refs like @e1, @e2)
3. Interact: Use refs to click, fill, select
4. Re-snapshot: After navigation or DOM changes, get fresh refs

agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result
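
When a script needs the refs from a snapshot rather than the full text, they can be filtered out of the output. This sketch assumes refs appear literally as `@e1`, `@e2`, ... in the snapshot text, as in the sample output comment above; the real output format may differ, and `list_refs` is a hypothetical helper name.

```shell
# list_refs: read snapshot text on stdin and print each distinct @eN ref once,
# in first-seen order. Assumes refs are written literally as "@e<number>".
list_refs() {
  grep -o '@e[0-9][0-9]*' | awk '!seen[$0]++'
}
```

For example, `agent-browser snapshot -i | list_refs` would print one ref per line under that assumption.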

Command Chaining

Chain commands with && for efficiency when you don't need intermediate output:

# Chain navigation + wait + screenshot in one call
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png

# Chain multiple interactions
agent-browser fill @e1 "text" && agent-browser click @e2
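
With `&&`, the shell stops at the first command that exits non-zero, so later steps never act on a page that failed to load. When it matters which step failed, a small wrapper can report it. This is a sketch only; `run_chain` is a hypothetical helper, and each step is passed as a single quoted command line.

```shell
# run_chain STEP...: run each STEP (a quoted command line) in order.
# Stops at the first failing step, names it on stderr, and returns non-zero.
run_chain() {
  for step in "$@"; do
    if ! sh -c "$step"; then
      echo "failed: $step" >&2
      return 1
    fi
  done
}
```

For example: `run_chain 'agent-browser open https://example.com' 'agent-browser wait --load networkidle' 'agent-browser screenshot page.png'`.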

Essential Commands

Navigation

agent-browser open <url>              # Navigate (aliases: goto, navigate)
agent-browser close                   # Close browser
agent-browser back                    # Go back
agent-browser forward                 # Go forward

Snapshot

agent-browser snapshot -i             # Interactive elements with refs (recommended)
agent-browser snapshot -s "#selector"  # Scope to CSS selector
agent-browser snapshot --json          # JSON output for parsing
agent-browser snapshot --json > out.json

Interaction

agent-browser click @e1               # Click element
agent-browser click @e1 --new-tab     # Click and open in new tab
agent-browser fill @e2 "text"         # Clear and type text
agent-browser type @e2 "text"         # Type without clearing
agent-browser select @e3 "option"     # Select dropdown option
agent-browser check @e4               # Check checkbox
agent-browser press Enter             # Press key
agent-browser keyboard type "text"    # Type at current focus
agent-browser scroll down 500         # Scroll page

Get Information

agent-browser get text @e1           # Get element text
agent-browser get url                 # Get current URL
agent-browser get title               # Get page title
agent-browser get text body > page.txt  # Get all page text

Wait

agent-browser wait @e1                # Wait for element
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --url "**/page"    # Wait for URL pattern
agent-browser wait 2000               # Wait milliseconds
agent-browser wait --text "Welcome"   # Wait for text to appear

Capture

agent-browser screenshot              # Screenshot to temp dir
agent-browser screenshot page.png     # Screenshot to file
agent-browser screenshot --full       # Full page screenshot
agent-browser screenshot --annotate   # Annotated with element labels
agent-browser pdf output.pdf          # Save as PDF

Network

agent-browser network requests                 # Inspect tracked requests
agent-browser network requests --type xhr,fetch  # Filter by type
agent-browser network route "**/api/*" --abort  # Block matching requests
agent-browser network har start                # Start HAR recording
agent-browser network har stop ./capture.har   # Stop and save

Device & Viewport

agent-browser set viewport 1920 1080    # Set viewport size
agent-browser set viewport 1920 1080 2 # With 2x retina scale
agent-browser set device "iPhone 14"    # Emulate device
agent-browser set media dark            # Dark mode

State Persistence

agent-browser state save ./auth.json    # Save session state
agent-browser state load ./auth.json    # Load session state
agent-browser --session-name myapp open ...  # Named session (state auto-saved and restored)

Authentication Patterns

Option 1: Import from Running Browser

agent-browser --auto-connect state save ./auth.json
agent-browser --state ./auth.json open https://app.example.com
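
A script can fall back to a plain `open` when no saved state file exists yet. The sketch below uses only the `--state` and `open` forms shown above; `open_with_state` and the `AGENT_BROWSER_BIN` variable are hypothetical names introduced here so the function can be dry-run, and are not agent-browser features.

```shell
# open_with_state FILE URL: reuse FILE via --state when it exists,
# otherwise open URL without saved state.
# AGENT_BROWSER_BIN is a hypothetical override for dry runs
# (for example AGENT_BROWSER_BIN=echo); it is not an agent-browser setting.
open_with_state() {
  if [ -f "$1" ]; then
    "${AGENT_BROWSER_BIN:-agent-browser}" --state "$1" open "$2"
  else
    "${AGENT_BROWSER_BIN:-agent-browser}" open "$2"
  fi
}
```

For example: `open_with_state ./auth.json https://app.example.com`.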

Option 2: Auth Vault (Recommended)

# Save credentials (encrypted)
echo "$PASSWORD" | agent-browser auth save myapp --url https://app.example.com --username user --password-stdin

# Login later
agent-browser auth login myapp
agent-browser auth list

Option 3: Session Persistence

agent-browser --session-name myapp open https://app.example.com/login
# ... login flow ...
agent-browser close  # State auto-saved

# Next time: auto-restored
agent-browser --session-name myapp open https://app.example.com/dashboard

Common Patterns

Form Submission

agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser check @e4
agent-browser click @e5
agent-browser wait --load networkidle

Data Extraction

agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5           # Get specific element
agent-browser get text body > page.txt  # Get all text
agent-browser snapshot --json > data.json  # Structured data

Visual Verification (Diff)

agent-browser snapshot -i          # Take baseline
agent-browser click @e2             # Perform action
agent-browser diff snapshot         # Compare current vs last

# Compare two pages
agent-browser diff url https://staging.example.com https://prod.example.com --screenshot

Parallel Sessions

agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com
agent-browser --session site1 snapshot -i
agent-browser --session site2 snapshot -i
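
For more than a couple of sites, the same pattern can be driven from a list. This is a sketch assuming `NAME=URL` pairs as input; `open_sessions` and `AGENT_BROWSER_BIN` are hypothetical names used for illustration, and the loop opens the sessions one after another rather than concurrently.

```shell
# open_sessions NAME=URL...: open each URL in its own named session.
# AGENT_BROWSER_BIN is a hypothetical override for dry runs (e.g. AGENT_BROWSER_BIN=echo).
open_sessions() {
  for pair in "$@"; do
    # Text before the first "=" is the session name; everything after it is the URL.
    "${AGENT_BROWSER_BIN:-agent-browser}" --session "${pair%%=*}" open "${pair#*=}" || return 1
  done
}
```

For example: `open_sessions site1=https://site-a.com site2=https://site-b.com`.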

Connect to Existing Chrome

agent-browser --auto-connect open https://example.com
agent-browser --cdp 9222 snapshot

Security

Content Boundaries (Recommended)

Wrap page output in markers so that content from an untrusted page can be distinguished from the tool's own output:

export AGENT_BROWSER_CONTENT_BOUNDARIES=1
agent-browser snapshot

Domain Allowlist

export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"

Output Limits

export AGENT_BROWSER_MAX_OUTPUT=50000
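
The three settings above can be applied together before working with untrusted pages. A minimal sketch: `harden_agent_browser` is a hypothetical helper name, and the defaults are simply the example values from this section.

```shell
# harden_agent_browser [DOMAINS] [MAX_OUTPUT]: export the three security settings at once.
# Defaults are the example values shown above; pass arguments to override them.
harden_agent_browser() {
  export AGENT_BROWSER_CONTENT_BOUNDARIES=1
  export AGENT_BROWSER_ALLOWED_DOMAINS="${1:-example.com,*.example.com}"
  export AGENT_BROWSER_MAX_OUTPUT="${2:-50000}"
}
```

Calling `harden_agent_browser "docs.example.com" 20000` would restrict navigation to that domain and lower the output limit for the rest of the shell session.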

Batch Execution

Run several commands in a single invocation by piping a JSON array of argument lists to batch:

echo '[
  ["open", "https://example.com"],
  ["snapshot", "-i"],
  ["click", "@e1"],
  ["screenshot", "result.png"]
]' | agent-browser batch --json
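
When arguments come from variables, hand-writing the JSON invites quoting mistakes. The sketch below builds one batch entry from shell arguments; `json_cmd` is a hypothetical helper, and it escapes only backslashes and double quotes, so arguments containing newlines or other control characters are out of scope.

```shell
# json_cmd ARG...: print one batch entry as a JSON array of strings,
# e.g. json_cmd click @e1 prints ["click", "@e1"].
# Escapes backslashes and double quotes only (no newline/control-character handling).
json_cmd() {
  out=""
  for arg in "$@"; do
    esc=$(printf '%s' "$arg" | sed 's/\\/\\\\/g; s/"/\\"/g')
    out="${out:+$out, }\"$esc\""
  done
  printf '[%s]\n' "$out"
}
```

Entries can then be joined into the outer array, for example `printf '[%s,%s]' "$(json_cmd open https://example.com)" "$(json_cmd snapshot -i)" | agent-browser batch --json`.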

Advanced: JavaScript Evaluation

# Simple expressions
agent-browser eval 'document.title'
agent-browser eval 'document.querySelectorAll("img").length'

# Complex JS: use --stdin
agent-browser eval --stdin <<'EVALEOF'
JSON.stringify(Array.from(document.querySelectorAll("a")).map(a => a.href))
EVALEOF

iOS Simulator (macOS)

agent-browser device list
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
agent-browser -p ios snapshot -i
agent-browser -p ios tap @e1
agent-browser -p ios screenshot mobile.png

Requirements: macOS with Xcode and Appium (npm install -g appium && appium driver install xcuitest)

Important: Ref Lifecycle

Refs (@e1, @e2) are invalidated when the page changes. Always re-snapshot after:

- Clicking links or buttons that navigate
- Form submissions
- Dynamic content loading

agent-browser click @e5              # Navigates to new page
agent-browser snapshot -i            # MUST re-snapshot
agent-browser click @e1              # Use new refs
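
To make the re-snapshot hard to forget, the click, wait, and snapshot steps can be bundled. This is a sketch built from commands shown earlier in this document; `click_then_snapshot` and `AGENT_BROWSER_BIN` are hypothetical names introduced for illustration.

```shell
# click_then_snapshot REF: click REF, wait for the network to go idle,
# then print a fresh interactive snapshot so new refs are available.
# AGENT_BROWSER_BIN is a hypothetical override for dry runs (e.g. AGENT_BROWSER_BIN=echo).
click_then_snapshot() {
  ab="${AGENT_BROWSER_BIN:-agent-browser}"
  "$ab" click "$1" &&
    "$ab" wait --load networkidle &&
    "$ab" snapshot -i
}
```

After `click_then_snapshot @e5`, refs should be read from the snapshot it prints, not from any earlier one.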

Session Cleanup

Always close when done:

agent-browser close

For ephemeral environments, auto-shutdown after inactivity:

AGENT_BROWSER_IDLE_TIMEOUT_MS=60000 agent-browser open example.com
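
In scripts, a shell `trap` can guarantee the close step even when an earlier command fails. A minimal sketch: `with_browser` and `AGENT_BROWSER_BIN` are hypothetical names, and the function body runs in a subshell so the trap does not leak into the calling shell.

```shell
# with_browser CMD [ARG...]: run CMD, then always run "agent-browser close",
# whether CMD succeeded or failed. The parentheses make the body a subshell,
# so the EXIT trap fires when the function finishes.
# AGENT_BROWSER_BIN is a hypothetical override for dry runs (e.g. AGENT_BROWSER_BIN=echo).
with_browser() (
  trap '"${AGENT_BROWSER_BIN:-agent-browser}" close' EXIT
  "$@"
)
```

For example: `with_browser sh -c 'agent-browser open https://example.com && agent-browser screenshot page.png'`.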

Reference Commands

# Check console errors
agent-browser console
agent-browser errors

# Open DevTools
agent-browser inspect

# Record session
agent-browser record start demo.webm
agent-browser record stop

Output Artifacts

Save screenshots and snapshots to .artifacts/browser/ for organized output:

mkdir -p .artifacts/browser/screenshots
agent-browser screenshot .artifacts/browser/screenshots/page.png
agent-browser snapshot --json > .artifacts/browser/snapshot.json
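
To avoid overwriting earlier captures, file names can carry a timestamp. The helper below is a sketch; `artifact_path` and its naming scheme are assumptions made for illustration, not a convention defined by agent-browser.

```shell
# artifact_path KIND NAME EXT: create .artifacts/browser/KIND/ if needed and print
# a UTC-timestamped path inside it, such as
# .artifacts/browser/screenshots/page-20260325T003100Z.png
artifact_path() {
  dir=".artifacts/browser/$1"
  mkdir -p "$dir" || return 1
  printf '%s/%s-%s.%s\n' "$dir" "$2" "$(date -u +%Y%m%dT%H%M%SZ)" "$3"
}
```

For example: `agent-browser screenshot "$(artifact_path screenshots page png)"`.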
