Execute qualquer Skill no Manus
com um clique

Execute qualquer Skill no Manus com um clique

$pwd:

kimi-webbridge

Name: Kimi Webbridge
Author: mxyhi

// Kimi WebBridge lets AI control the user's real browser — navigate, click, type, read, screenshot, and interact with any website using the user's actual login sessions. Use this skill whenever the user wants to interact with websites, automate browser tasks, scrape web content, or perform any action requiring a real browser. Also use when the user mentions "browser", "webpage", "open URL", "screenshot", or asks to read/interact with any website. Use even for simple-sounding browser requests — the daemon handles all complexity.

Executar no Manus

$ git log --oneline --stat

stars:389

forks:36

updated:17 de maio de 2026 às 08:37

Explorador de arquivos

3 arquivos

SKILL.md

readonly

related-skills.json

mesmo repositório

grill-with-docs.md

from "mxyhi/ok-skills"

Grilling session that challenges your plan against the existing domain model, sharpens terminology, and updates documentation (CONTEXT.md, ADRs) inline as decisions crystallise. Use when user wants to stress-test a plan against their project's language and documented decisions.

2026-05-31389

planning-with-files.md

from "mxyhi/ok-skills"

Implements Manus-style file-based planning to organize and track progress on complex tasks. Creates task_plan.md, findings.md, and progress.md. Use when asked to plan out, break down, or organize a multi-step project, research task, or any work requiring 5+ tool calls. Supports automatic session recovery after /clear.

2026-05-28389

autoresearch.md

from "mxyhi/ok-skills"

Autonomous iteration loop: modify, verify, keep/discard against any metric

2026-05-24389

improve-codebase-architecture.md

from "mxyhi/ok-skills"

Find deepening opportunities in a codebase, informed by the domain language in CONTEXT.md and the decisions in docs/adr/. Use when the user wants to improve architecture, find refactoring opportunities, consolidate tightly-coupled modules, or make a codebase more testable and AI-navigable.

2026-05-21389

browser-trace.md

from "mxyhi/ok-skills"

Capture a full DevTools-protocol trace of any browser automation — CDP firehose, screenshots, and DOM dumps — then bisect the stream into per-page searchable buckets. Use when the user wants to debug a failed run, audit network/console/DOM activity, attach a trace to an in-progress session, or feed structured per-page summaries back into an agent loop so its next iteration learns from the last one.

2026-05-19389

caveman.md

from "mxyhi/ok-skills"

Ultra-compressed communication mode. Cuts token usage ~75% by speaking like caveman while keeping full technical accuracy. Supports intensity levels: lite, full (default), ultra, wenyan-lite, wenyan-full, wenyan-ultra. Use when user says "caveman mode", "talk like caveman", "use caveman", "less tokens", "be brief", or invokes /caveman. Also auto-triggers when token efficiency is requested.

2026-05-13389

package.json

"author": "mxyhi"

"repository": "mxyhi/ok-skills"

Abrir repositório GitHub Ver repositórios do creator

$ install --global

$ download --local

Executar no Manus

$ useful --forSOC

Desenvolvedores de softwareInformática e Matemática15-1252L4

name

kimi-webbridge

description

Kimi WebBridge lets AI control the user's real browser — navigate, click, type, read, screenshot, and interact with any website using the user's actual login sessions. Use this skill whenever the user wants to interact with websites, automate browser tasks, scrape web content, or perform any action requiring a real browser. Also use when the user mentions "browser", "webpage", "open URL", "screenshot", or asks to read/interact with any website. Use even for simple-sounding browser requests — the daemon handles all complexity.

Kimi WebBridge

Control the user's real browser (with their login sessions) via a local daemon at http://127.0.0.1:10086.

Health check (always do this first)

~/.kimi-webbridge/bin/kimi-webbridge status

Then act on the result:

running: true and extension_connected: true — healthy. Proceed with the tool calls below.
Anything else (command not found, running: false, extension_connected: false, errors) — Read references/operations.md in this skill directory. It has the install / start / diagnose routing table.

Don't guess fixes here — every non-healthy state is handled in references/operations.md.

Tools

Tool	Args	Returns	Note
`navigate`	`url`, `newTab`(bool), `group_title`	`{success, url, tabId}`	Always use `newTab:true` on first call. `group_title` sets the tab group's visible label
`find_tab`	`url`, `active`(bool)	`{success, url, tabId}`	Reuse an already-open tab — pass any URL or domain; `active:true` picks the tab the user is currently viewing, default picks the leftmost match
`snapshot`	—	`{url, title, tree}` with `@e` refs	Accessibility tree (text) — use this to read page content and locate elements
`click`	`selector` (@e ref or CSS)	`{success, tag, text}`	Synthetic `el.click()`
`fill`	`selector`, `value`	`{success, tag, mode}`	Works on `<input>`/`<textarea>` AND `[contenteditable]` (ProseMirror/Lexical/Slate). `mode` is `"value"` or `"contenteditable"`
`evaluate`	`code` (supports async/await)	`{type, value}`
`screenshot`	`format`(png\|jpeg), `quality`(0-100), optional `selector` (@e/CSS)	`{format, dataLength, data}` (base64)	Full page floods context — use helper script (saves to disk, returns path). With `selector` only that element is captured (small, OK to call API directly)
`network`	`cmd`(start\|stop\|list\|detail), `filter`, `requestId`	request/response data
`upload`	`selector`, `files`(string[])	`{success, fileCount}`
`save_as_pdf`	`paper_format`, `landscape`, `scale`, `print_background`, `file_name`	`{path, sizeBytes, mimeType, pageTitle}`	Render current page → PDF, saved under `/tmp/kimi-webbridge-pdfs/`
`list_tabs`	—	`{success, tabs:[{tabId, url, title, active, groupTitle}]}`	Inspect tabs in the current session
`close_tab`	—	`{success, closed: bool}`	Close the current tab in the session
`close_session`	—	`{success, closed: int}`	Close all tabs — `closed` is the count. Always call at task end

Using find_tab

Use find_tab when the user explicitly asks to operate on an already-open tab. It matches by domain, so any URL on the same site works. Without active:true, returns the leftmost matching tab; with active:true, returns the tab the user is currently viewing — pass it when the user says "用我打开的 X" / "在我当前的 X 页面上".

curl -s -X POST http://127.0.0.1:10086/command \
  -d '{"action":"find_tab","args":{"url":"https://www.kimi.com","active":true},"session":"kimi"}'

If find_tab returns "no open tab found", the page is not open — fall back to navigate with newTab:true.

Call Format

curl -s -X POST http://127.0.0.1:10086/command \
  -H 'Content-Type: application/json' \
  -d '{"action":"navigate","args":{"url":"https://example.com","newTab":true}}'

Sessions

Each session maps to a separate browser tab group. Use different session names for different sites to keep operations isolated.

Add "session":"name" to the request body:

curl -s -X POST http://127.0.0.1:10086/command \
  -d '{"action":"navigate","args":{"url":"https://example.com","newTab":true},"session":"my-task"}'

Always assign distinct session names when working with multiple sites in parallel.

Screenshots: Use the Helper Script

Never call the screenshot API directly — it returns base64-encoded image data (hundreds of KB of text) that will flood the context window.

Use scripts/screenshot.sh instead. It decodes and saves the image to disk, returning only the file path:

# Default — saves to /tmp/kimi-webbridge-screenshots/{timestamp}.png
bash "$(dirname "$SKILL_PATH")/scripts/screenshot.sh"

# With a session
bash "$(dirname "$SKILL_PATH")/scripts/screenshot.sh" -s my-task

# Custom output path
bash "$(dirname "$SKILL_PATH")/scripts/screenshot.sh" -o /tmp/page.png

# JPEG format, quality 60
bash "$(dirname "$SKILL_PATH")/scripts/screenshot.sh" -f jpeg -q 60

After getting the file path, use the Read tool to view the image.

If $SKILL_PATH is unavailable, call the script by its absolute path.

Prefer snapshot over CSS/JS selectors

snapshot returns interactive elements with @e refs based on semantic role/name. Use them directly with click/fill — they survive CSS class hash changes that break manually-written selectors.

Fall back to evaluate (JS) only when:

The target has no @e ref in the snapshot
You need attributes not in the snapshot (e.g., href)
You need to dispatch complex event sequences, or scroll

Evaluate Tips

Always use compact JSON.stringify(data) — never add null, 2 formatting. Indentation and newlines can inflate the response several times over, causing truncation during transmission.
evaluate calls share the page's JS realm — re-declaring the same const/let across two calls throws SyntaxError. Wrap in an IIFE for a fresh scope: (() => { const x = ...; return x; })().

Text input — use `fill`

fill handles all three text input shapes. Pass selector (CSS or @e ref) + value:

Target	What `fill` does	Returned `mode`
`<input>` / `<textarea>`	Sets `.value` via native setter, fires `input`/`change`.	`"value"`
`[contenteditable]` (ProseMirror / TipTap / Lexical / Slate / Quill etc.)	Focuses, selects all existing content, calls `document.execCommand('insertText', ...)` which fires `beforeinput`/`input` with `inputType:'insertText'` and `data:value`.	`"contenteditable"`
Other element	Best-effort `.value` + events.	`"value"`

fill is clear-and-insert: existing content is replaced. For "append to existing text", read the current value via evaluate, concatenate, then fill with the result.

Form submit / special keys

There's no separate "press Enter" tool. To submit a form, click the submit button directly (click on the @e ref or selector). To dispatch a key event programmatically (e.g. Escape to close a modal):

{"action":"evaluate","args":{"code":"document.activeElement.dispatchEvent(new KeyboardEvent('keydown',{key:'Escape',bubbles:true}))"}}

Save the current page as PDF

save_as_pdf renders the current page to PDF, writes it to /tmp/kimi-webbridge-pdfs/, and returns the file path (the daemon strips the base64 — agent never sees raw PDF bytes).

All args optional:

paper_format: letter (default) | a4 | legal | a3 | tabloid
landscape: false (default)
scale: 1.0 (default), range [0.1, 2.0]
print_background: true (default) — keep background colors
file_name: caller-supplied name; if absent, derived from page title

Decoded PDF cap is 100 MB. Above that the daemon refuses; reduce scale or split the page.

Known limitations

Sites that strictly check event.isTrusted (some banking portals, captcha challenges) reject fill and click because both go through DOM-level synthetic events (isTrusted=false). This is a product boundary, not a bug — no automation primitive that runs on the user's machine without stealing OS focus can produce trusted events on these sites.
Cross-origin iframes: fill, click, evaluate, and snapshot operate on the top frame. If a target element lives in a same-page iframe from a different origin (e.g. embedded sandbox demos), navigate to the iframe's URL directly instead.

Versions

Daemon, extension, and this skill share a 1:1 version string. Read both via:

~/.kimi-webbridge/bin/kimi-webbridge status
# {"version":"<daemon>", "extension_version":"<extension>"}

If a tool returns an error containing "Please update the Kimi WebBridge extension", the user's extension is older than this skill. Tell the user:

请更新 Kimi WebBridge 浏览器扩展后重试：https://kimi.com/features/webbridge

Don't retry the failed tool. Don't auto-switch skill versions based on extension_version — the pairing protocol isn't finalized.

kimi-webbridge

Mais deste repositório

Mais deste repositório

Kimi WebBridge

Health check (always do this first)

Tools

Using find_tab

Call Format

Sessions

Screenshots: Use the Helper Script

Prefer snapshot over CSS/JS selectors

Evaluate Tips

Text input — use fill

Form submit / special keys

Save the current page as PDF

Known limitations

Versions

Kimi WebBridge

Health check (always do this first)

Tools

Using find_tab

Call Format

Sessions

Screenshots: Use the Helper Script

Prefer snapshot over CSS/JS selectors

Evaluate Tips

Text input — use fill

Form submit / special keys

Save the current page as PDF

Known limitations

Versions

Text input — use `fill`

Text input — use `fill`