$pwd:

browser-automation

Name: Browser Automation
Author: agent0ai

// Use for complex Agent Zero browser automation, including multi-tab browsing, screenshots, forms, uploads, raw pointer/keyboard actions, host-vs-container browser mode, and visual verification workflows.

$ git log --oneline --stat

stars:17,775

forks:3,628

updated: 2026-05-22 07:50

SKILL.md

name	browser-automation
description	Use for complex Agent Zero browser automation, including multi-tab browsing, screenshots, forms, uploads, raw pointer/keyboard actions, host-vs-container browser mode, and visual verification workflows.

Browser Tool

Use the browser tool for rendered pages, forms, logins, downloads, JavaScript-heavy sites, screenshots, and visual inspection. Prefer search_engine or document_query for plain text research.

Core Workflow

open creates a browser tab and returns a browser_id.
content returns readable markdown plus typed refs like [link 3], [button 6], [input text 8].
Interact with refs using click, type, submit, scroll, etc.
Use navigate on an existing browser_id for serial browsing.
Keep only a small working tab set; close pages when finished.
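The workflow above can be sketched as a sequence of tool calls. This is a minimal illustration, not the tool's documented schema: it reuses the `tool_name`/`tool_args` envelope from the multi example in this skill, while the `url`, `ref`, and `text` argument names, the `close` action name, and the example URLs are assumptions. The `browser_id` value 1 stands in for whatever `open` actually returns.

```python
import json


def browser_call(action, **args):
    """Build one browser tool call in the tool_name/tool_args envelope."""
    return {"tool_name": "browser", "tool_args": {"action": action, **args}}


# Hypothetical session. Argument names other than "action" and "browser_id"
# are assumptions, and the refs mirror the [input text 8] / [button 6] examples.
workflow = [
    browser_call("open", url="https://example.com/login"),
    browser_call("content", browser_id=1),
    browser_call("type", browser_id=1, ref=8, text="alice"),
    browser_call("click", browser_id=1, ref=6),
    browser_call("navigate", browser_id=1, url="https://example.com/account"),
    browser_call("close", browser_id=1),
]

for call in workflow:
    print(json.dumps(call))
```

In a real session, the refs passed to `type` and `click` would come from the most recent `content` result rather than being hard-coded.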

Modes

The same tool may run in Docker container mode or A0 CLI host-browser mode, depending on project/plugin settings.

Container mode: browser and upload paths resolve inside the Agent Zero container.
Host mode: browser and upload paths resolve on the connected A0 CLI host machine.

In host mode, page content and screenshots may be blocked by host-content policy when remote models are active.

Screenshots And Vision

Screenshots are taken only on explicit request; the browser never loads images into model context automatically.

Call browser with action: "screenshot".
Call vision_load with the returned vision_load.tool_args.paths value.
Reason from the latest loaded screenshot.

Screenshot args include quality, full_page, and an optional path. Without path, the screenshot is an ephemeral ref consumed by vision_load; with path, the format is PNG when path ends with .png and JPEG otherwise.
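The screenshot-then-load sequence and the path rule can be sketched as follows. The helper names are hypothetical, the shape of the screenshot result (`{"vision_load": {"tool_args": {"paths": [...]}}}`) is inferred from the instruction above rather than confirmed, and the case-sensitive `.png` match is an assumption.

```python
def image_format(path=None):
    """Format implied by the optional path argument; None means an ephemeral ref."""
    if path is None:
        return None
    # Case-sensitive match is an assumption; the skill only says "ends with .png".
    return "PNG" if path.endswith(".png") else "JPEG"


def screenshot_call(browser_id, path=None, **options):
    """Build a screenshot call; options may carry quality and full_page."""
    args = {"action": "screenshot", "browser_id": browser_id, **options}
    if path is not None:
        args["path"] = path
    return {"tool_name": "browser", "tool_args": args}


def vision_load_call(screenshot_result):
    """Forward the paths handed back by the screenshot result, unchanged."""
    # Assumed result shape: {"vision_load": {"tool_args": {"paths": [...]}}}
    paths = screenshot_result["vision_load"]["tool_args"]["paths"]
    return {"tool_name": "vision_load", "tool_args": {"paths": paths}}


print(image_format("shots/home.png"))  # PNG
print(image_format("shots/home.jpg"))  # JPEG
print(image_format())                  # None
```

Passing the returned paths through untouched keeps the ephemeral-ref case and the explicit-path case on the same code path.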

Forms And Files

select_option works for native selects and detectable ARIA listbox/combobox controls.
set_checked works for checkbox, radio, switch, and toggle-like refs.
upload_file works for file input refs or associated labels; verify the file exists in the active browser environment.
For fragile forms, load the browser-form-workflows skill.
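One way to honor the "verify the file exists" advice is to check the path before building the upload call. This is a sketch under stated assumptions: the `ref` and `path` argument names are not confirmed by this skill, and the local check only means something when the code runs in the same environment as the browser (inside the container in container mode, on the host in host mode).

```python
import os


def upload_call(browser_id, ref, path):
    """Build an upload_file call, refusing paths that do not exist locally.

    "ref" and "path" are assumed argument names, not a documented schema.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(f"upload source not found: {path}")
    return {
        "tool_name": "browser",
        "tool_args": {
            "action": "upload_file",
            "browser_id": browser_id,
            "ref": ref,
            "path": path,
        },
    }
```

Failing early here surfaces a wrong-environment path (for example, a host path used in container mode) before the browser action is attempted.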

Pointer And Keyboard

hover, double_click, right_click, and drag accept refs or viewport coordinates.
Coordinates are Chromium viewport CSS pixels and match screenshots.
key_chord presses keys in order and releases in reverse.
clipboard actions are copy, cut, or paste.
set_viewport resizes the page viewport.
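The press-and-release ordering described for key_chord can be illustrated with a small helper. The function and the `"down"`/`"up"` event labels are hypothetical and only model the ordering; they are not part of the browser tool.

```python
def chord_events(keys):
    """Expand a chord into (event, key) pairs: press in order, release in reverse."""
    presses = [("down", key) for key in keys]
    releases = [("up", key) for key in reversed(keys)]
    return presses + releases


for event in chord_events(["Control", "Shift", "T"]):
    print(event)
# ('down', 'Control')
# ('down', 'Shift')
# ('down', 'T')
# ('up', 'T')
# ('up', 'Shift')
# ('up', 'Control')
```

Releasing in reverse order means modifiers stay held until the final key has been released, which is how a chord such as Control+Shift+T is normally typed by hand.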

Tabs And Popups

Popups and target-blank tabs are auto-registered.
list shows open tabs; pass include_content: true sparingly.
set_active deliberately changes focus.
Operations on a non-active tab do not steal focus unless browser rules require it.

Browser Action Multi

multi is only a browser action, never a top-level tool. Use:

{
  "tool_name": "browser",
  "tool_args": {
    "action": "multi",
    "calls": [
      {"action": "content", "browser_id": 1},
      {"action": "screenshot", "browser_id": 2}
    ]
  }
}

Use browser action multi for parallel reads across tabs. Avoid mutating the same tab twice in one batch unless serial order is intended.
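A batch can be checked against this guidance before it is sent. The sketch below is illustrative only: the set of read-only action names is an assumption (every other action is treated as mutating), and both helper names are hypothetical.

```python
from collections import Counter

# Assumption: these actions only read page state; all others are treated as mutating.
READ_ONLY_ACTIONS = {"content", "screenshot", "list"}


def repeated_mutations(calls):
    """Return the browser_ids that receive more than one mutating call in a batch."""
    counts = Counter(
        call["browser_id"]
        for call in calls
        if call["action"] not in READ_ONLY_ACTIONS and "browser_id" in call
    )
    return sorted(bid for bid, n in counts.items() if n > 1)


def multi_call(calls):
    """Wrap individual calls in the multi envelope shown above."""
    return {"tool_name": "browser", "tool_args": {"action": "multi", "calls": list(calls)}}


reads = [
    {"action": "content", "browser_id": 1},
    {"action": "screenshot", "browser_id": 2},
]
print(repeated_mutations(reads))  # []

risky = reads + [
    {"action": "click", "browser_id": 1, "ref": 6},
    {"action": "type", "browser_id": 1, "ref": 8, "text": "hi"},
]
print(repeated_mutations(risky))  # [1]
```

A non-empty result is not necessarily an error; it flags batches where the caller should confirm that serial order on that tab is intended.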

related-skills.json

Same repository

host-computer-use.md

from "agent0ai/agent-zero"

Beta desktop control through the connected A0 CLI host. Use for screenshots, screen inspection, menus, native app UI, OS-level clicking, scrolling, typing, or checking computer_use_remote status. Do not use for ordinary browser navigation; host browser requests should use the browser tool.

2026-05-22 · 17.8k stars

linux-desktop.md

from "agent0ai/agent-zero"

Use when the user asks Agent Zero to operate the built-in Linux Desktop, XFCE apps, LibreOffice GUI apps, file manager, terminal, or visual desktop workflows.

2026-05-22 · 17.8k stars

calc-spreadsheets.md

from "agent0ai/agent-zero"

Use when creating, opening, or editing LibreOffice Calc ODS spreadsheets, or XLSX workbooks only when Excel compatibility is explicitly required.

2026-05-22 · 17.8k stars

impress-presentations.md

from "agent0ai/agent-zero"

Use when creating, opening, or editing LibreOffice Impress ODP presentations, or PPTX decks only when PowerPoint compatibility is explicitly required.

2026-05-22 · 17.8k stars

markdown-documents.md

from "agent0ai/agent-zero"

Use when creating or editing Markdown documents, notes, reports, briefs, drafts, or other editable writing where Markdown should be the primary artifact format.

2026-05-22 · 17.8k stars

office-artifacts.md

from "agent0ai/agent-zero"

Use when creating, opening, reading, or editing Office artifacts such as LibreOffice-native ODT/ODS/ODP files and compatibility DOCX/XLSX/PPTX files with the office_artifact tool.

2026-05-22 · 17.8k stars

package.json

"author": "agent0ai"

"repository": "agent0ai/agent-zero"

$ useful --forSOC

Software Developers · Computer and Mathematical Occupations · 15-1252 · L4
