
gsd-browser

Name: Gsd Browser
Author: gsd-build


stars: 234

forks: 14

updated: May 1, 2026 at 19:26


SKILL.md


name: gsd-browser
description: Native Rust browser automation CLI for AI agents. Use when the user needs to interact with websites, navigate pages, fill forms, click buttons, take screenshots, share a live browser view, narrate browser actions, extract structured data, run assertions, test web apps, mock network requests, or automate any browser task. Triggers include "open a website", "fill out a form", "take a screenshot", "show me the browser", "share the screen", "pause the browser", "step through this", "scrape data", "test this web app", "login to a site", "visual regression test", or any task requiring programmatic web interaction.
allowed-tools: Bash(gsd-browser:), Bash(gsd-browser )

<essential_principles>

The daemon auto-starts on browser commands. gsd-browser daemon health reports state and does not start a session. Use gsd-browser daemon start only when you want to pre-warm or verify daemon lifecycle explicitly.

Always re-snapshot after page changes. Refs are versioned (@v1:e1, @v2:e3). After navigation, form submission, or dynamic content loading, old refs are stale. Run gsd-browser snapshot to get fresh refs before interacting.
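The versioned-ref rule can be checked mechanically before a ref is reused. The following is a minimal sketch, assuming refs always have the form `@v<version>:e<element>` as in the examples above. The helper names `ref_version` and `ref_is_fresh` are hypothetical and not part of gsd-browser, and the current snapshot version is passed in by the caller.

```shell
# Hypothetical helpers, not part of gsd-browser.
# Assumes refs look like @v<version>:e<element>, e.g. @v2:e3.

# Print the snapshot version embedded in a ref; print nothing if malformed.
ref_version() {
  printf '%s\n' "$1" | sed -n 's/^@v\([0-9][0-9]*\):e[0-9][0-9]*$/\1/p'
}

# Succeed only if the ref is well-formed and belongs to the given
# current snapshot version (second argument).
ref_is_fresh() {
  ref_v="$(ref_version "$1")"
  [ -n "$ref_v" ] && [ "$ref_v" = "$2" ]
}
```

For example, `ref_is_fresh @v1:e1 2` fails, which signals that `gsd-browser snapshot` should be run again before interacting.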

Use --json when parsing output. Pass --json whenever you need to extract values programmatically; use text mode when reading output yourself.

Positional args have no flag prefix. Commands like click, type, hover take positional args. Do NOT add --selector:

gsd-browser click "button.submit" (correct)
gsd-browser click --selector "button.submit" (WRONG)

Core workflow pattern: Every browser automation follows: navigate -> snapshot -> interact -> re-snapshot (after DOM changes).

gsd-browser navigate https://example.com
gsd-browser snapshot
# Read snapshot output: @v1:e1 [input type="email"], @v1:e2 [button] "Submit"
gsd-browser fill-ref @v1:e1 "user@example.com"
gsd-browser click-ref @v1:e2
gsd-browser wait-for --condition network_idle
gsd-browser snapshot  # REQUIRED - old refs are now stale

Command chaining: Use && when you don't need intermediate output. Run separately when you need to parse output first (e.g., snapshot to discover refs, then interact).
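The chaining rule above can be sketched as two small shell functions. This is an illustration only: the function names and the `GSD_BROWSER` override variable are hypothetical (the variable exists so the sketch can be dry-run with a stand-in binary), while the subcommands are the ones shown in the core workflow.

```shell
# GSD_BROWSER is a hypothetical override for dry runs; it defaults to
# the real gsd-browser binary.
GSD_BROWSER="${GSD_BROWSER:-gsd-browser}"

# Intermediate output is not needed here, so the steps are chained.
# && stops the chain at the first command that fails.
open_and_wait() {
  "$GSD_BROWSER" navigate "$1" &&
    "$GSD_BROWSER" wait-for --condition network_idle
}

# snapshot runs on its own so its output can be read to discover refs
# before any click-ref or fill-ref call is issued.
discover_refs() {
  "$GSD_BROWSER" snapshot
}
```

A typical sequence is `open_and_wait https://example.com`, then `discover_refs`, then ref-based interaction once the refs have been read.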

Use the live viewer when the user wants to watch or direct the browser. gsd-browser view opens a localhost viewer with live frames, narrated action history, ref overlays, and pause/step/resume/abort controls. Keep using CLI commands for actions; the viewer is the shared screen and control surface.

Global options available on all commands:

| Flag | Purpose |
| --- | --- |
| `--json` | Structured JSON output |
| `--browser-path <path>` | Path to Chrome/Chromium |
| `--cdp-url <url>` | Attach to an already-running Chrome instance |
| `--session <name>` | Named session for parallel instances |
| `--no-narration-delay` | Skip narration lead-time sleeps while keeping history/events |
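As a rough sketch of how the global options combine, the wrapper below runs any command in a named session with JSON output. The function name `in_session` and the `GSD_BROWSER` override variable are hypothetical, and placing the global flags after the subcommand and its arguments is an assumption; check the command reference for the accepted position.

```shell
# GSD_BROWSER is a hypothetical override for dry runs; it defaults to
# the real gsd-browser binary.
GSD_BROWSER="${GSD_BROWSER:-gsd-browser}"

# Run a gsd-browser command in a named session with structured output.
# Assumption: global flags are accepted after the subcommand's arguments.
in_session() {
  session="$1"
  shift
  "$GSD_BROWSER" "$@" --session "$session" --json
}
```

For instance, `in_session checkout navigate https://example.com` and `in_session admin navigate https://example.com` would drive two parallel instances, where `checkout` and `admin` are made-up session names.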

</essential_principles>

Based on what the user needs, read the appropriate workflow:

| User intent | Workflow |
| --- | --- |
| Navigate, click, type, fill forms, interact with pages | `workflows/navigate-and-interact.md` |
| Share the browser screen, narrate actions, pause/step/resume/abort | `workflows/live-viewer-and-narration.md` |
| Scrape data, extract content, read page structure | `workflows/scrape-and-extract.md` |
| Test pages, run assertions, visual regression, mock network | `workflows/test-and-assert.md` |
| Debug issues, check logs, diagnose problems | `workflows/debug-and-diagnose.md` |
| Install, configure, set up sessions | `workflows/setup-and-configure.md` |

After reading the workflow, follow it. Load references only when the workflow directs you to.

<reference_index>

All domain knowledge in references/:

Commands: command-reference.md (complete command syntax)
Snapshots: snapshot-and-refs.md (versioned refs, snapshot modes)
Intents: semantic-intents.md (15 predefined intents for find-best/act)
Errors: error-recovery.md (common errors and fixes)
Config: configuration.md (TOML config, env vars, 5-layer merge)

</reference_index>

<workflows_index>

| Workflow | Purpose |
| --- | --- |
| navigate-and-interact.md | Page navigation, clicking, typing, forms, intents |
| live-viewer-and-narration.md | Live shared viewer, narrated history, refs overlay, controls |
| scrape-and-extract.md | Data extraction, accessibility tree, page source |
| test-and-assert.md | Assertions, visual regression, network mocking, test generation |
| debug-and-diagnose.md | Console/network logs, timeline, debug bundles |
| setup-and-configure.md | Installation, configuration, sessions, daemon management |

</workflows_index>

<success_criteria>

Browser automation task is complete when:

Target page state is achieved and verified (via assertions or visual confirmation)
Daemon is stopped if no further browser work is needed (gsd-browser daemon stop)
Extracted data is returned in the expected format
Any saved state (auth, cookies) is persisted for reuse if appropriate

</success_criteria>
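The daemon-cleanup criterion can be sketched as a wrapper that stops the daemon once a task finishes, whether or not the task succeeded. The function name `run_then_stop_daemon` and the `GSD_BROWSER` override variable are hypothetical; only `gsd-browser daemon stop` comes from the criteria above.

```shell
# GSD_BROWSER is a hypothetical override for dry runs; it defaults to
# the real gsd-browser binary.
GSD_BROWSER="${GSD_BROWSER:-gsd-browser}"

# Run the given task, stop the daemon afterwards regardless of the
# outcome, and return the task's own exit status to the caller.
run_then_stop_daemon() {
  "$@"
  task_status=$?
  "$GSD_BROWSER" daemon stop
  return "$task_status"
}
```

This fits the last task of a session only. If more browser work follows, skip the wrapper so the daemon stays warm.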


package.json

"author": "gsd-build"

"repository": "gsd-build/gsd-browser"
