cua-control
Mouse, keyboard, and scroll primitives for cua_click / cua_type / cua_key / cua_scroll. Reference for action shapes, modifier keys, key chord syntax, and screenshot-driven targeting workflows.
| name | cua-control |
| description | Mouse, keyboard, and scroll primitives for cua_click / cua_type / cua_key / cua_scroll. Reference for action shapes, modifier keys, key chord syntax, and screenshot-driven targeting workflows. |
These tools target the active Cua sandbox (or localhost when in that mode).
Always take a cua_screenshot first when you need coordinates; never
guess pixel positions blindly.
Coordinates are in pixels, measured from the top-left corner of the target display. The default sandbox display is 1024x768 (XGA), which matches Anthropic's recommended computer-use resolution. Cua handles Retina/HiDPI scaling internally on macOS.
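Because coordinates are absolute pixels on a 1024x768 display by default, it can help to check a candidate point before clicking. The following is a minimal sketch, not part of Cua: `Display`, `Point`, `isOnDisplay`, and `clampToDisplay` are hypothetical helper names, and the integer-pixel requirement is an assumption.

```typescript
// Hypothetical helper types for illustration; not part of Cua's API.
interface Display { width: number; height: number }
interface Point { x: number; y: number }

// Default sandbox display size stated in this document.
const DEFAULT_DISPLAY: Display = { width: 1024, height: 768 };

// True when the point is an integer pixel inside the display.
// Valid x runs from 0 to width - 1, valid y from 0 to height - 1.
// (Requiring integers is an assumption made for this sketch.)
function isOnDisplay(p: Point, d: Display = DEFAULT_DISPLAY): boolean {
  return (
    Number.isInteger(p.x) && Number.isInteger(p.y) &&
    p.x >= 0 && p.x < d.width &&
    p.y >= 0 && p.y < d.height
  );
}

// Round and clamp a candidate point so it always lands on the display.
function clampToDisplay(p: Point, d: Display = DEFAULT_DISPLAY): Point {
  const clamp = (v: number, size: number): number =>
    Math.min(Math.max(Math.round(v), 0), size - 1);
  return { x: clamp(p.x, d.width), y: clamp(p.y, d.height) };
}
```

Rejecting an off-screen point is usually safer than clamping it, since a clamped click may land on an unrelated element at the screen edge.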
cua_click

cua_click({ x: 320, y: 180 }) // left click
cua_click({ x: 320, y: 180, button: "right" }) // right click
cua_click({ x: 320, y: 180, clicks: 2 }) // double click
Supported buttons: left, right, middle.
cua_type

cua_type({ text: "hello world" })
Types one character at a time on the underlying surface. Newlines in the text are typed as Return.
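To make the newline rule concrete, the sketch below models how a multi-line string could be viewed as alternating text and Return steps. It only illustrates the described behavior and is not how Cua is implemented; `InputStep` and `expandText` are hypothetical names, and only `\n` is treated as a newline.

```typescript
// Hypothetical model of what a cua_type call amounts to on the target surface.
type InputStep =
  | { kind: "text"; text: string }
  | { kind: "key"; keys: "Return" };

// Split text on "\n" and put a Return step between consecutive lines.
// Empty lines produce no text step, only the Return that separates them.
function expandText(text: string): InputStep[] {
  const steps: InputStep[] = [];
  const lines = text.split("\n");
  lines.forEach((line, i) => {
    if (line !== "") steps.push({ kind: "text", text: line });
    if (i < lines.length - 1) steps.push({ kind: "key", keys: "Return" });
  });
  return steps;
}
```

One practical consequence: typing multi-line text into a form or chat box may submit it at the first newline, so check the target field before sending text that contains `\n`.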
cua_key

Press a single chord or a sequence of chords:
cua_key({ keys: "Return" })
cua_key({ keys: "ctrl+s" })
cua_key({ keys: ["cmd+space", "Return"] }) // Spotlight + Enter
Common chord names: Return, Escape, Tab, BackSpace, Delete,
Up, Down, Left, Right, Page_Up, Page_Down, Home, End,
F1 .. F12, ctrl+a, ctrl+c, ctrl+v, cmd+s (macOS), super+l
(Linux).
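The chord syntax above (modifiers joined to a final key with `+`) can be checked before a call is made. This is a minimal sketch under assumptions: `Chord`, `parseChord`, and `parseKeys` are hypothetical helpers, and the modifier set includes `alt` and `shift`, which this document does not list explicitly.

```typescript
// Hypothetical parsed form of a chord; cua_key itself takes the raw string.
interface Chord { modifiers: string[]; key: string }

// ctrl, cmd, and super appear in this document; alt and shift are assumed.
const MODIFIERS = new Set(["ctrl", "cmd", "super", "alt", "shift"]);

// Split "ctrl+s" into { modifiers: ["ctrl"], key: "s" }.
// The last "+"-separated part is the key; everything before it is a modifier.
function parseChord(chord: string): Chord {
  const parts = chord.split("+");
  const key = parts.pop() ?? "";
  if (key === "") throw new Error(`chord has no key: "${chord}"`);
  for (const m of parts) {
    if (!MODIFIERS.has(m)) throw new Error(`unknown modifier "${m}" in "${chord}"`);
  }
  return { modifiers: parts, key };
}

// Accept a single chord or a sequence, mirroring the two shapes of `keys`.
function parseKeys(keys: string | string[]): Chord[] {
  return (Array.isArray(keys) ? keys : [keys]).map(parseChord);
}
```

Key names are left case-sensitive here because the names listed above (`Return`, `BackSpace`, `Page_Up`) use mixed case.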
cua_scroll

cua_scroll({ x: 400, y: 300, dy: -5 }) // scroll down
cua_scroll({ x: 400, y: 300, dy: 5 }) // scroll up
cua_scroll({ x: 400, y: 300, dx: 3 }) // scroll right
Use dx / dy wheel deltas. Positive dx means right; negative dy
means down. scrollX / scrollY are still accepted aliases.
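The sign conventions and the legacy aliases can be summarized in a small normalizer. This is a sketch, not Cua's own code: `ScrollArgs`, `normalizeScroll`, and `describeScroll` are hypothetical names, letting `dx` / `dy` win over the aliases when both are given is an assumption, and "left" for negative `dx` is inferred from the stated rule for positive `dx`.

```typescript
// Hypothetical argument shape covering both the current names and the aliases.
interface ScrollArgs {
  x: number;
  y: number;
  dx?: number;
  dy?: number;
  scrollX?: number; // alias for dx
  scrollY?: number; // alias for dy
}

// Resolve aliases into dx/dy; missing deltas default to 0.
// Assumption: dx/dy take precedence over scrollX/scrollY when both are present.
function normalizeScroll(a: ScrollArgs): { x: number; y: number; dx: number; dy: number } {
  return { x: a.x, y: a.y, dx: a.dx ?? a.scrollX ?? 0, dy: a.dy ?? a.scrollY ?? 0 };
}

// Translate wheel deltas into directions: negative dy is down, positive dx is right.
function describeScroll(dx: number, dy: number): string[] {
  const out: string[] = [];
  if (dy < 0) out.push("down");
  else if (dy > 0) out.push("up");
  if (dx > 0) out.push("right");
  else if (dx < 0) out.push("left");
  return out;
}
```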
The canonical pattern:
1. cua_screenshot() // see the current state
2. <find target visually> // identify x, y of the element
3. cua_click({ x, y }) / cua_type
4. cua_screenshot() // verify the change
5. repeat
Take a fresh screenshot after every state-changing action. The model's internal memory of the screen is unreliable after typing or clicking.
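The screenshot, act, verify cycle above can be sketched as a small loop. Everything here is an assumption for illustration: `DesktopTools` stands in for whatever wrappers expose cua_screenshot and cua_click in your environment, and `locate` / `isDone` stand in for the visual judgment the model makes from each screenshot.

```typescript
interface Point { x: number; y: number }

// Hypothetical wrappers around the cua_screenshot and cua_click tools.
interface DesktopTools {
  screenshot: () => Promise<string>; // some image handle, e.g. base64 data
  click: (p: Point) => Promise<void>;
}

// Click the located target until isDone reports success or attempts run out.
// A fresh screenshot is taken after every click, never reused across actions.
async function clickUntil(
  tools: DesktopTools,
  locate: (shot: string) => Point | null,
  isDone: (shot: string) => boolean,
  maxAttempts = 3,
): Promise<boolean> {
  let shot = await tools.screenshot();        // 1. see the current state
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (isDone(shot)) return true;
    const target = locate(shot);              // 2. find the target visually
    if (target === null) return false;        // nothing to click; give up
    await tools.click(target);                // 3. act
    shot = await tools.screenshot();          // 4. verify with a fresh screenshot
  }
  return isDone(shot);                        // 5. final check after the last action
}
```

Bounding the loop with `maxAttempts` keeps a mis-located target from producing an unbounded series of clicks on the wrong element.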