Name: Open Browser
Author: softpudding

Drive a real Chrome browser through the local OpenBrowser service for interactive website tasks that require rendered-page inspection, clicking, typing, scrolling, dialog handling, or multi-step navigation. Use when Codex needs to open websites, fill forms, scrape JS-rendered content, reproduce browser-only issues, or complete end-to-end UI workflows. Prefer direct HTTP/API tools for simple fetches, downloads, or non-visual integrations.

Stars: 182
Forks: 18
Updated: March 31, 2026 at 11:36

SKILL.md

name: open-browser
description: Drive a real Chrome browser through the local OpenBrowser service for interactive website tasks that require rendered-page inspection, clicking, typing, scrolling, dialog handling, or multi-step navigation. Use when Codex needs to open websites, fill forms, scrape JS-rendered content, reproduce browser-only issues, or complete end-to-end UI workflows. Prefer direct HTTP/API tools for simple fetches, downloads, or non-visual integrations.

OpenBrowser

Use OpenBrowser as a dedicated browser-operation agent when the task depends on a live Chrome session and visual page state.

Decide Fast

Use this skill for:

Multi-step UI navigation
Form filling and browser interactions
JS-rendered pages that need a real browser
Browser bug reproduction or manual flow verification

Do not use this skill for:

Simple API calls
Static downloads
Local file transformations

Preconditions

Before sending a browser task, confirm all of the following:

The OpenBrowser server is reachable at http://127.0.0.1:8765
The Chrome extension is connected
The OpenBrowser UI already has a valid LLM configuration
A browser UUID is available through OPENBROWSER_CHROME_UUID or --chrome-uuid

Run this first:

python3 skill/codex/open-browser/scripts/check_status.py --chrome-uuid "$OPENBROWSER_CHROME_UUID"

If readiness fails, read references/setup.md or references/troubleshooting.md.
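
Before invoking check_status.py, two of the preconditions above can be pre-checked locally. The following is a minimal sketch, not part of the skill: the helper names server_reachable and chrome_uuid_from_env are hypothetical, and it only tests that something accepts TCP connections on the documented address. It does not verify the Chrome extension or the LLM configuration, which remain the job of check_status.py.

```python
import os
import socket
from typing import Mapping, Optional


def server_reachable(host: str = "127.0.0.1", port: int = 8765,
                     timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only shows that a listener exists on the port; it says nothing
    about whether the listener is actually the OpenBrowser server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def chrome_uuid_from_env(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the trimmed OPENBROWSER_CHROME_UUID value, or None if unset or blank."""
    env = os.environ if env is None else env
    value = env.get("OPENBROWSER_CHROME_UUID", "").strip()
    return value or None


if __name__ == "__main__":
    print("server reachable:", server_reachable())
    print("chrome uuid set:", chrome_uuid_from_env() is not None)
```

If either check fails, the setup and troubleshooting references are still the place to look; this sketch only narrows down which precondition is missing.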

Standard Workflow

1. Run check_status.py
2. If the server is down, start it from the repo root with uv run local-chrome-server serve
3. If the extension, UUID, or API key is missing, pause and ask the user to complete the manual Chrome or UI steps from references/setup.md
4. Submit the task with send_task.py in foreground mode by default
5. Read the live SSE output in the terminal and use it to monitor progress in real time
6. Use background mode only when the task is expected to run long enough that a detached log is safer than an attached stream
7. Summarize the final browser outcome, any failures, and the conversation ID when useful
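
The branching in the workflow above can be written down as a small decision function. This is an illustrative sketch only: the Readiness dataclass and next_step are hypothetical names, and the booleans are assumed to come from whatever check_status.py reports, whose output format is not described here.

```python
from dataclasses import dataclass


@dataclass
class Readiness:
    """Hypothetical summary of the readiness checks (not an OpenBrowser type)."""
    server_up: bool
    extension_connected: bool
    uuid_available: bool
    llm_configured: bool


def next_step(r: Readiness, long_running: bool = False) -> str:
    """Map readiness flags to the next action in the standard workflow."""
    if not r.server_up:
        # The server can be started by the agent itself.
        return "start server: uv run local-chrome-server serve"
    if not (r.extension_connected and r.uuid_available and r.llm_configured):
        # These require manual Chrome or UI steps, so hand back to the user.
        return "pause: ask the user to follow references/setup.md"
    if long_running:
        return "send_task.py --background --output <log file>"
    return "send_task.py in foreground and read the SSE stream"
```

The ordering mirrors the list: server first, then the manual prerequisites, and only then the choice between foreground and background submission.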

Run Tasks

Foreground mode is the default and preferred mode for Codex usage because it exposes live SSE events in the current terminal:

python3 skill/codex/open-browser/scripts/send_task.py \
  "Open https://example.com and report the page title" \
  --chrome-uuid "$OPENBROWSER_CHROME_UUID"

Foreground mode suits most tasks, including normal multi-step browser work, because you can inspect actions, observations, and usage metrics as they happen.

Use background mode only as a fallback for long-running tasks or when you explicitly need detached execution:

python3 skill/codex/open-browser/scripts/send_task.py \
  "Open https://example.com and click the sign in button" \
  --chrome-uuid "$OPENBROWSER_CHROME_UUID" \
  --background \
  --output /tmp/openbrowser.log
sleep 120
tail -n 80 /tmp/openbrowser.log

If you choose background mode, scale the wait before reading the log to the length of the flow:

60-90 seconds for a single navigation
120-180 seconds for a short multi-step task
300+ seconds for long workflows or debugging sessions
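
The wait guidance above, together with the sleep and tail steps of the background example, can be sketched in Python. The task-kind labels, WAIT_SECONDS, suggested_wait, and tail_lines are hypothetical names chosen for illustration; the numbers come from the list above, and taking the upper end of each range is an assumption on the cautious side.

```python
import time
from collections import deque
from typing import Dict, List, Optional, Tuple

# (low, high) wait in seconds per task kind; high is None for the open-ended "300+".
WAIT_SECONDS: Dict[str, Tuple[int, Optional[int]]] = {
    "single_navigation": (60, 90),
    "short_multi_step": (120, 180),
    "long_workflow": (300, None),
}


def suggested_wait(kind: str) -> int:
    """Return the upper bound of the range, or the lower bound if open-ended."""
    low, high = WAIT_SECONDS[kind]
    return high if high is not None else low


def tail_lines(path: str, n: int = 80) -> List[str]:
    """Return the last n lines of a text file, like `tail -n`."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


if __name__ == "__main__":
    # Mirrors `sleep 120; tail -n 80 /tmp/openbrowser.log` from the example.
    time.sleep(suggested_wait("short_multi_step"))
    print("\n".join(tail_lines("/tmp/openbrowser.log")))
```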

Working Directory

Run commands from the OpenBrowser repo root so the relative script paths resolve cleanly.

Use --cwd when the browser task should operate with context from another workspace:

python3 skill/codex/open-browser/scripts/send_task.py \
  "Open the local app and verify the login flow" \
  --cwd /absolute/path/to/project \
  --chrome-uuid "$OPENBROWSER_CHROME_UUID"
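
When the command is assembled from a script rather than typed, building it as an argument list avoids shell-quoting problems in the task text. A minimal sketch follows, using only the flags shown in this document. The function name build_send_task_argv is hypothetical, and two behaviors are assumptions of the sketch rather than documented rules: it rejects a relative --cwd, and it passes --output only together with --background.

```python
import os
from typing import List, Optional

# Relative to the OpenBrowser repo root, as in the commands above.
SEND_TASK = "skill/codex/open-browser/scripts/send_task.py"


def build_send_task_argv(task: str, chrome_uuid: str,
                         cwd: Optional[str] = None,
                         background: bool = False,
                         output: Optional[str] = None) -> List[str]:
    """Build the argv list for send_task.py from the documented flags."""
    argv = ["python3", SEND_TASK, task]
    if cwd is not None:
        # Assumption: follow the example and insist on an absolute path.
        if not os.path.isabs(cwd):
            raise ValueError("--cwd should be an absolute path")
        argv += ["--cwd", cwd]
    argv += ["--chrome-uuid", chrome_uuid]
    if background:
        argv.append("--background")
        if output:
            # Assumption: --output is only meaningful for detached runs.
            argv += ["--output", output]
    return argv
```

The resulting list can be handed to subprocess.run(argv, check=True) with the OpenBrowser repo root as the process working directory, so the relative script path resolves.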

Failure Handling

If the task does not start or fails immediately:

Re-run check_status.py
Verify that the browser UUID is still valid
Inspect the live foreground stream first; if you used background mode, inspect /tmp/openbrowser.log or your chosen log file
Read references/troubleshooting.md

If you need lower-level control or want to inspect conversations directly, read references/api_reference.md.

References

references/setup.md: Read when OpenBrowser is not ready yet
references/troubleshooting.md: Read when connectivity, UUID, or task execution fails
references/api_reference.md: Read when scripting against the HTTP API or inspecting conversation state

package.json

"author": "softpudding"

"repository": "softpudding/OpenBrowser"
