Jeden Skill in Manus ausführen
mit einem Klick

Jeden Skill in Manus mit einem Klick ausführen

computer-use

Operate the user's connected desktop apps through Zero Computer Use. Use when a task explicitly asks for zero computer-use, desktop computer use, opening or inspecting a local app, reading a GUI, clicking UI elements, using Slack/Safari/Chrome/Arc/Notion/Finder or another desktop app, taking an app screenshot, or performing a UI workflow that cannot be completed through normal APIs alone.

In Manus ausführen

Sterne66

Forks16

Aktualisiert11. Juni 2026 um 06:29

Quelle

vm0-ai

vm0-ai/vm0-skills

GitHub-Repository öffnen Creator-Repositorys ansehen

Installationsbefehl

Download

In Manus ausführen

Verwandte BerufeSOC

Basierend auf der SOC-Berufsklassifikation

SoftwareentwicklerInformatik- und Mathematikberufe·SOC 15-1252

SKILL.md

readonly

Mehr aus diesem Repository

gleiches Repository

google-maps

vm0-ai/vm0-skills

Google Maps Platform API for geocoding, places, routes, and distance matrices. Use when user mentions "Google Maps", "geocode", "directions", "places API", route matrix, or asks to look up an address, route, or place metadata.

2026-06-1766

fashion-editorial

vm0-ai/vm0-skills

A high-fashion editorial video style - cold desaturated grade, dramatic high-contrast light, monumental clean architecture or backdrop, deliberate model pose, strong silhouette, and luxury material texture. Applies to fashion, beauty, luxury, and personal-brand subjects. Trigger on /fashion-editorial, "Fashion Editorial", "luxury fashion campaign", "high fashion video", or "editorial model film".

2026-06-1666

hand-drawn-fantasy-anime

vm0-ai/vm0-skills

A hand-drawn fantasy animation video style - painterly 2D backgrounds, expressive simple characters, lush nature, soft diffused light, warm storybook color, and gentle magical wonder. Applies to any fantasy, character, nature, or adventure subject. Trigger on /hand-drawn-fantasy-anime, "hand-drawn fantasy animation", "painterly 2D fantasy", "storybook anime", or "warm forest animation".

2026-06-1666

luxury-product

vm0-ai/vm0-skills

A dark luxury product macro video style - premium materials in extreme close-up, black studio, pinpoint specular highlights, ultra-shallow focus, engraved or mechanical detail, and refined reveal pacing. Applies to watches, jewelry, pens, cameras, fragrance caps, and other high-end objects. Trigger on /luxury-product, "Luxury Product", "premium product macro", "dark luxury product video", or "metal detail reveal".

2026-06-1666

sports-performance-ad

vm0-ai/vm0-skills

A sports performance advertising video style - athlete effort, gear and body close-ups, impact rhythm, motion blur, Dutch or low angles, high-contrast desaturated grade, and dramatic rim light. Applies to any sport, athlete, training action, or performance product. Trigger on /sports-performance-ad, "sports performance ad", "athletic commercial", "training commercial", or "sports brand film".

2026-06-1666

playful-editorial

vm0-ai/vm0-skills

Playful-editorial presentation look — saturated colour fields, scalloped sunburst badges and pill chips, oversized stacked headlines. Carnival palette · Archivo + Manrope · pill radius. The design system (visual language).

2026-06-1666

name	computer-use
description	Operate the user's connected desktop apps through Zero Computer Use. Use when a task explicitly asks for zero computer-use, desktop computer use, opening or inspecting a local app, reading a GUI, clicking UI elements, using Slack/Safari/Chrome/Arc/Notion/Finder or another desktop app, taking an app screenshot, or performing a UI workflow that cannot be completed through normal APIs alone.

Computer Use

Overview

Use npx -p @vm0/cli zero computer-use <command> to inspect and operate apps on the connected Zero Desktop host. Treat it as a GUI control surface: read the app state, inspect screenshots, then act through accessibility elements whenever possible.

--app accepts an app bundle id only, such as com.apple.Safari or com.google.Chrome. App names like Safari, Google Chrome, Slack, or WeChat are display labels only and must not be passed to --app.

Core Workflow

List apps and choose the target app's bundleId:

npx -p @vm0/cli zero computer-use list-apps

Use the name field only to identify the app for yourself, then copy the bundleId field. Apps listed without a bundleId cannot be targeted.

Set the bundle id you found:

app_bundle_id="com.apple.Safari"

If the app is not running, open it by bundle id:

npx -p @vm0/cli zero computer-use open-app --app "$app_bundle_id" --timeout 10

Inspect the target app by bundle id:

npx -p @vm0/cli zero computer-use get-app-state --app "$app_bundle_id" --timeout 10

Read the JSON result:

snapshotId is required for follow-up actions against the same state.
appState is expected to be a local file path containing the full accessibility tree. Read or filter that file to find element indexes, roles, labels, URLs, and the focused element.
screenshot is a local image path under /tmp/vm0/computer-use/...; inspect it with the image viewer when visual judgment matters.
During rollout, older CLI versions may still return appState as an inline string. If it is not a path, save the JSON output to a temp file and extract/filter the inline string.

Act on elements first, using the same bundle id and snapshot:

npx -p @vm0/cli zero computer-use click --app "$app_bundle_id" --snapshot-id <snapshotId> --element-index <index>
npx -p @vm0/cli zero computer-use set-value --app "$app_bundle_id" --snapshot-id <snapshotId> --element-index <index> --value "text"
npx -p @vm0/cli zero computer-use type-text --app "$app_bundle_id" --snapshot-id <snapshotId> --text "literal text"
npx -p @vm0/cli zero computer-use press-key --app "$app_bundle_id" --snapshot-id <snapshotId> --key Command+L
npx -p @vm0/cli zero computer-use scroll --app "$app_bundle_id" --snapshot-id <snapshotId> --element-index <index> --direction down --pages 1

Re-read state after every meaningful UI change. Element indexes and coordinates can become stale after navigation, scrolling, or opening a new window.

If a post-action command returns a new snapshotId and appState, use that new snapshot. Otherwise, explicitly run get-app-state again before choosing the next element index or coordinate.

App Identity Rules

Always run list-apps before targeting an app unless you already know the exact bundle id from the same host.
Always pass the bundleId value to --app.
Never pass display names such as Slack, Safari, Google Chrome, or WeChat to --app.
If a command fails with app_not_found and the --app value looks like a name, re-run list-apps, find the app record, and retry with its bundleId.
If the target app has no bundleId in list-apps, report that it is not targetable through Computer Use.

To search the list-apps output for a likely target:

apps_json=/tmp/computer-use-apps.json
npx -p @vm0/cli zero computer-use list-apps > "$apps_json"
node -e "const fs=require('fs'); const q=(process.argv[2]||'').toLowerCase(); const apps=JSON.parse(fs.readFileSync(process.argv[1],'utf8')).apps || []; for (const a of apps) { if (!q || String(a.name || '').toLowerCase().includes(q)) console.log([a.name || '', a.bundleId || 'NO_BUNDLE_ID', a.running ? 'running' : 'not running'].join('\t')); }" "$apps_json" "Slack"

Copy the exact bundleId from the matching row into app_bundle_id.

Reading App State

Accessibility trees can be very large. Prefer the appState file returned by the command:

app_bundle_id="com.apple.Safari"
state_json=/tmp/computer-use-state.json
npx -p @vm0/cli zero computer-use get-app-state --app "$app_bundle_id" --timeout 10 > "$state_json"
app_state_path=$(node -e "const fs=require('fs'); const r=JSON.parse(fs.readFileSync(process.argv[1],'utf8')); console.log(r.appState)" "$state_json")
rg -n "Ethan|showcase|sites\\.vm0|Send now" "$app_state_path"

If the CLI still returns inline app state instead of a path, use the transition workaround:

state_json=/tmp/computer-use-state.json
npx -p @vm0/cli zero computer-use get-app-state --app "$app_bundle_id" --timeout 10 > "$state_json"
node -e "const fs=require('fs'); const s=JSON.parse(fs.readFileSync(process.argv[1],'utf8')).appState; s.split('\\n').forEach((l,i)=>{ if(/Ethan|showcase|sites\\.vm0|Send now/.test(l)) console.log((i+1)+': '+l) })" "$state_json"

Use filtered output to locate clickable link, button, text field, checkbox, or outline row entries and their element indexes.

Action Rules

Use the same bundle id for get-app-state and follow-up actions against that snapshot.
Prefer --element-index or --element over coordinate clicks.
Use --x/--y only when the screenshot clearly shows the target but the accessibility tree has no useful element.
Always pass the matching --snapshot-id from the state used to pick the element or coordinate.
Use open-app --app <bundleId> when the app is not visible or not running.
Use press-key for reliable shortcuts such as Command+L, Command+T, Command+W, Enter, and arrow keys.
Use type-text only after confirming focus is in the right field.
If type-text fails with element_not_editable, click into a text field first or use set-value on a settable text field.
Avoid destructive UI actions unless the user explicitly requested them and the current state confirms the target.

Slack Pattern

For Slack reading tasks:

Run list-apps and find Slack's bundleId from the app record. Slack is often com.tinyspeck.slackmacgap, but use the value returned by the host.
Run get-app-state --app <slackBundleId>.
Inspect the screenshot to understand channel/thread layout.
Filter the appState file for sender names, link domains, message text, composer labels, and Send now.
Click Slack links by element index when the user specifically asked to use the GUI.
If the task is simply to send a message and no GUI interaction is required, prefer zero slack message send for reliability; note that this is a Slack API shortcut, not desktop computer use.

For Slack thread messages, watch for composer options like Also send to #channel. If the user asked to post in the channel rather than only reply in a thread, either select that option in Slack or use zero slack message send -c <channel-id>.

Browser Pattern

After clicking a link from another app:

Use list-apps to find the browser bundle id, such as com.apple.Safari, com.google.Chrome, or the Arc bundle id returned by the host.
Inspect the address field, page title, and screenshot.
Use press-key --app <browserBundleId> --snapshot-id <snapshotId> --key Command+L plus type-text and Enter when direct navigation is faster than finding a link.
For visual review, always inspect the screenshot path. Browser accessibility text alone is not enough for layout, design, or image-heavy pages.

Failure Handling

app_not_found: check whether --app was a display name. If so, retry with the app's bundleId from list-apps. If it was already a bundle id, the app may not be installed, not running, or not targetable on the host.
window_unavailable: try open-app --app <bundleId>, then run get-app-state again. If it still fails, report that the app has no controllable window.
element_not_editable: click into the target text field, re-read state, then use type-text; or use set-value when a settable text field is available.
permission_denied, accessibility_unavailable, or screen_recording_unavailable: report the missing Desktop permission instead of retrying the same command.

Reporting Back

State what was actually done through Computer Use: apps inspected by bundle id, key screenshots reviewed, and whether actions succeeded. If a desktop action times out or the host is unavailable, say that directly and use a stable fallback only when it still satisfies the user request.