serve-sim
Control and stream a running iOS, iPad, or Apple Watch Simulator with npx serve-sim. Use for simulator preview, taps, gestures, hardware buttons, rotation, camera injection, permissions, accessibility, and CoreAnimation debug.
| name | serve-sim |
| description | Control and stream a running iOS, iPad, or Apple Watch Simulator with npx serve-sim. Use for simulator preview, taps, gestures, hardware buttons, rotation, camera injection, permissions, accessibility, and CoreAnimation debug. |
| license | Apache-2.0 |
Drive an Apple Simulator (iOS, iPad, Apple Watch) from an agent using the serve-sim CLI. serve-sim spawns a Swift helper that captures the simulator framebuffer via simctl io, exposes it as an MJPEG stream plus a binary WebSocket input channel, and serves a React preview UI on top. This skill teaches an agent the exact CLI surface, the gesture JSON shape, the gotchas, and the recommended workflows.
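For Android emulators, use adb shell tooling. For building or installing apps, use xcodebuild or xcrun simctl install. For physical devices, use xcrun devicectl or Xcode.

Before any other action, verify that the host satisfies the requirements below. If something is missing, tell the user exactly what to install and do not proceed.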
| Requirement | Check command | Why |
|---|---|---|
| macOS host | uname -s returns Darwin | serve-sim only runs on macOS |
| Xcode CLI tools | xcrun --version exits 0 | simctl is the underlying simulator driver |
| Node.js ≥18 | node --version ≥18 | serve-sim is an npm package run via npx |
| macOS 14+ (optional) | sw_vers -productVersion ≥14 | Required only for the camera subcommand |
A bundled helper script is available: scripts/check-prereqs.sh. Run it; if it exits non-zero, surface the message to the user.
A booted simulator is required for most subcommands. Check with xcrun simctl list devices booted. If none are booted, tell the user to open Xcode → Simulator or to run xcrun simctl boot <UDID>.
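
The checks in the table above can be sketched as a small Python helper. This is an illustration, not the bundled scripts/check-prereqs.sh: the function names are hypothetical, and it approximates "xcrun --version exits 0" by checking that xcrun is on the PATH.

```python
import platform
import re
import shutil
import subprocess
from typing import List


def parse_major(version_text: str) -> int:
    """Extract the leading major number from strings like 'v18.17.0' or '14.5'."""
    match = re.search(r"(\d+)", version_text)
    if match is None:
        raise ValueError(f"no version number in {version_text!r}")
    return int(match.group(1))


def evaluate_prereqs(system: str, has_xcrun: bool, node_version: str) -> List[str]:
    """Return human-readable problems; an empty list means the host looks OK."""
    problems = []
    if system != "Darwin":
        problems.append("serve-sim only runs on macOS (uname -s must be Darwin)")
    if not has_xcrun:
        problems.append("Xcode CLI tools missing: run xcode-select --install")
    if not node_version or parse_major(node_version) < 18:
        problems.append("Node.js 18 or newer is required")
    return problems


def probe_host() -> List[str]:
    """Collect the real values from this machine and evaluate them."""
    node = shutil.which("node")
    node_version = ""
    if node:
        node_version = subprocess.run(
            [node, "--version"], capture_output=True, text=True
        ).stdout.strip()
    # Assumption: presence of xcrun on PATH stands in for `xcrun --version` exiting 0.
    return evaluate_prereqs(platform.system(), shutil.which("xcrun") is not None, node_version)
```

Keeping the evaluation separate from the probing means the rules can be exercised with fixed inputs, while probe_host() is what an agent would call on the actual host before surfacing any problems to the user.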
┌──────────────┐  simctl io  ┌─────────────────┐  MJPEG / WS  ┌─────────┐
│ iOS Simulator│ ──────────► │  serve-sim-bin  │ ───────────► │ Browser │
└──────────────┘   (Swift)   │  (per-device)   │              └─────────┘
                             └─────────────────┘
                                      ▲
                                state file in
                              $TMPDIR/serve-sim/
                                      ▲
                             ┌──────────────────┐
                             │  serve-sim CLI   │
                             └──────────────────┘
Key invariants the agent must respect:
- Coordinates are normalized: (0, 0) at top-left and (1, 1) at bottom-right of the display. Never pass pixel coordinates.
- State lives in $TMPDIR/serve-sim/server-{udid}.json. Use serve-sim --list to query it; do not read the JSON directly unless you know what you are doing.
- The orientation set by rotate is remembered by the helper, and subsequent gestures are rotated client-side. An agent that sends raw coords after a rotation does not need to compensate manually.

| Goal | Command | Notes |
|---|---|---|
| Start preview server | npx serve-sim [device] | Default preview at http://localhost:3200, stream at :3100. Foreground process. |
| Start headless / daemon | npx serve-sim --detach [device] | Returns JSON with pid, port, url. Use for agent loops. |
| Show stream in host's preview | npx serve-sim --detach -q → hand off url to host preview tool | See "Showing the stream in your agent's preview" section. |
| List running streams | npx serve-sim --list | Add -q for JSON-only output. |
| Stop all helpers | npx serve-sim --kill | Pass [device] to stop a specific one. |
| Single tap | npx serve-sim tap <x> <y> | <x> <y> in 0..1. Use this, not gesture, for plain taps. See "Critical gotcha" below. |
| Multi-step gesture | npx serve-sim gesture '<json>' | See references/gestures.md. |
| Hardware button | npx serve-sim button <name> | Names: home, swipe_home, app_switcher, lock, siri, side_button. See references/buttons-rotation.md. |
| Rotate device | npx serve-sim rotate <orientation> | portrait, portrait_upside_down, landscape_left, landscape_right. |
| Simulate memory warning | npx serve-sim memory-warning | Equivalent to Debug → Simulate Memory Warning. |
| CoreAnimation debug | npx serve-sim ca-debug <option> <on\|off> | Options: blended, copies, misaligned, offscreen, slow-animations. See references/ca-debug.md. |
| Inject camera feed | npx serve-sim camera <bundle-id> [--file <path>\|--webcam [name]] | (Re)launches the app with the camera dylib attached. macOS 14+ only. See references/camera.md. |
| Hot-swap camera source | npx serve-sim camera switch <placeholder\|webcam\|file> [arg] | No app relaunch. |
| Manage app permissions | npx serve-sim permissions <grant\|revoke\|reset\|list> <permission> <bundle-id> | Camera, photos, location, push notifications, contacts, etc. See references/permissions.md. |
| Read accessibility tree | curl http://localhost:3100/ax | Returns axe-style JSON. See references/endpoints.md for all endpoints. |
Most subcommands accept -d <udid|name> to target a specific device when several are booted.
Critical gotcha: prefer tap over gesture for taps

Each serve-sim gesture call opens its own WebSocket. If you issue two back-to-back gesture calls (one with {"type":"begin",...} and one with {"type":"end",...}), the simulator receives them with enough latency between them that the touch is interpreted as a long-press, not a tap. This is a deliberate constraint of the protocol, not a bug to work around.
Rule: for any single-shot tap, use serve-sim tap <x> <y>. Only use gesture for drags, swipes, or multi-step interactions where you must thread the same socket across begin → move × N → end.
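
One way to enforce the rule above is a wrapper that only ever emits the tap subcommand for single-point touches and rejects anything outside the normalized range. This is a minimal sketch; tap_command is a hypothetical helper name, and only the `tap <x> <y>` and `-d` forms shown in this document are used.

```python
from typing import List, Optional


def tap_command(x: float, y: float, device: Optional[str] = None) -> List[str]:
    """Build the argv for a single tap at normalized coordinates."""
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        # Pixel values such as (200, 400) land here instead of reaching the simulator.
        raise ValueError(f"coordinates must be in 0..1, got ({x}, {y})")
    argv = ["npx", "serve-sim", "tap", str(x), str(y)]
    if device:
        argv += ["-d", device]
    return argv
```

The resulting list can be passed to subprocess.run as is, which avoids shell quoting problems with device names that contain spaces.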
When multiple simulators are booted, every subcommand accepts -d <udid|name>. The name match is case-insensitive against the device name returned by xcrun simctl list devices booted. Examples:
npx serve-sim tap 0.5 0.5 -d "iPhone 16 Pro"
npx serve-sim button home -d ABC12345-...
npx serve-sim --list # show all running streams
If the user has only one booted simulator, omit -d entirely. The skill should prefer auto-detection over hard-coding device names.
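
The auto-detection preference can be sketched as follows. The helper names are hypothetical, and the line format matched by the regular expression (indented name, UDID in parentheses, then "(Booted)") is an assumption about what xcrun simctl list devices booted prints; verify it against real output before relying on it.

```python
import re
from typing import List, Optional, Tuple

# Assumed line shape: "    iPhone 16 Pro (UDID) (Booted)".
BOOTED_LINE = re.compile(
    r"^\s+(?P<name>.+?) \((?P<udid>[0-9A-Fa-f-]{36})\) \(Booted\)"
)


def parse_booted(simctl_output: str) -> List[Tuple[str, str]]:
    """Return (name, udid) pairs for every booted device in the text."""
    devices = []
    for line in simctl_output.splitlines():
        match = BOOTED_LINE.match(line)
        if match:
            devices.append((match.group("name"), match.group("udid")))
    return devices


def device_args(devices: List[Tuple[str, str]], wanted: Optional[str] = None) -> List[str]:
    """Return [] when -d can be omitted, else ['-d', udid] for the wanted device."""
    if wanted is None:
        if len(devices) == 1:
            return []  # single booted simulator: let serve-sim auto-detect
        raise LookupError(f"{len(devices)} booted simulators; a device must be chosen")
    for name, udid in devices:
        # Case-insensitive match on the name, or an exact UDID ignoring case.
        if wanted.lower() in (name.lower(), udid.lower()):
            return ["-d", udid]
    raise LookupError(f"no booted simulator matches {wanted!r}")
```

Returning the UDID rather than the name keeps the follow-up command unambiguous when two booted simulators share a similar name.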
By default, serve-sim prints human-readable status to stdout. For agent loops, prefer JSON output:
npx serve-sim --list -q # JSON array of running streams
npx serve-sim --detach -q # JSON with pid/port/url after spawn
npx serve-sim camera status -q # JSON with {alive, source, mirror, ...}
Parse -q output programmatically. Never parse the human-readable output produced without -q; it can change between versions.
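
A minimal sketch of consuming -q output from Python is shown here. run_json is a hypothetical helper, and the demo at the bottom uses a stand-in command so the sketch runs on any machine; on a real host the argv would be something like ["npx", "serve-sim", "--list", "-q"].

```python
import json
import subprocess
import sys
from typing import Any, List


def run_json(argv: List[str], timeout: float = 30.0) -> Any:
    """Run a command and decode its stdout as JSON, failing loudly otherwise."""
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        raise RuntimeError(
            f"{argv!r} exited {result.returncode}: {result.stderr.strip()}"
        )
    try:
        return json.loads(result.stdout)
    except json.JSONDecodeError as exc:
        raise RuntimeError(
            f"{argv!r} did not print JSON: {result.stdout[:200]!r}"
        ) from exc


if __name__ == "__main__":
    # Stand-in that prints an empty JSON array, mimicking "no running streams".
    stub = [sys.executable, "-c", "print('[]')"]
    print(run_json(stub))
```

Raising on a non-zero exit or on non-JSON output keeps an agent loop from silently acting on a partial or human-formatted response.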
When the user asks to "see the simulator here", "view it in preview", "open it in this tool", or similar, the goal is to stream the simulator into the same surface the user is chatting with. serve-sim returns a regular HTTP URL — the agent's job is to surface that URL and, if the host exposes a preview tool, hand it off.
Steps:
1. Start serve-sim and capture the URL:
npx serve-sim --detach -q
This returns JSON like {"pid":..., "port":3200, "url":"http://localhost:3200", "streamUrl":"http://localhost:3100", ...}. The url field is the human-facing preview UI; streamUrl is the raw MJPEG endpoint.
2. Always surface the URL plainly in your response so the user can fall back to opening it manually in any browser.
3. Probe your host's preview tool and hand off the URL if one exists. Examples of tool names you may see in your toolset:
   - preview_start (Claude Code): call it with { url: "<url>" }.
   - mcp__Claude_Preview__preview_start (some MCP setups).
   - browser_open, open_url, or a similar URL-opening tool: pass the URL.

Do not assume any specific preview tool exists. Inspect the tools available to you in the current session. If one matches the description above, use it. If none does, fall back to step 2 (print the URL prominently).
The stream stays alive until npx serve-sim --kill. Multiple clients (the host's preview + the user's browser + a tunnel) can read the same URL simultaneously.
See references/workflows.md workflow "Show the simulator stream in the host's preview" for the full recipe.
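
The handoff decision described in this section can be sketched as two small functions. The function names are hypothetical; the candidate tool names are the examples listed above, and the JSON fields are the ones shown for npx serve-sim --detach -q.

```python
import json
from typing import Iterable, Optional

# Example tool names from this document, in the order they are listed.
KNOWN_PREVIEW_TOOLS = (
    "preview_start",
    "mcp__Claude_Preview__preview_start",
    "browser_open",
    "open_url",
)


def preview_url(detach_output: str) -> str:
    """Pull the human-facing preview URL out of the --detach -q JSON."""
    info = json.loads(detach_output)
    url = info.get("url")
    if not url:
        raise ValueError("serve-sim output has no 'url' field")
    return url


def pick_preview_tool(available: Iterable[str]) -> Optional[str]:
    """Return the first known preview tool the host exposes, or None."""
    names = set(available)
    for candidate in KNOWN_PREVIEW_TOOLS:
        if candidate in names:
            return candidate
    return None  # caller falls back to printing the URL prominently
```

Whatever pick_preview_tool returns, the URL should still be printed in the response, since the manual fallback is meant to be available in every case.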
For complete end-to-end recipes (UI automation, camera testing, accessibility-driven taps, deep-link flows, preview handoff), see references/workflows.md. The reference covers the patterns documented in serve-sim's own AGENTS.md.
Always stop helpers when finished, unless the user explicitly wants them to keep running:
npx serve-sim --kill # stop all
npx serve-sim --kill "iPhone 16 Pro" # stop one
Orphan helpers occupy ports 3200/3100 and prevent fresh starts.
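
To make cleanup hard to forget, the kill step can be wrapped in a context manager so it runs even when the automation raises. This is a sketch under stated assumptions: serve_sim_session and the injectable run parameter are hypothetical, and it only handles teardown, not starting the server.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, List, Optional


def _run(argv: List[str]) -> None:
    """Default runner: execute the command and ignore its exit status."""
    subprocess.run(argv, check=False)


@contextmanager
def serve_sim_session(
    device: Optional[str] = None,
    run: Callable[[List[str]], object] = _run,
) -> Iterator[None]:
    """Guarantee `npx serve-sim --kill [device]` runs when the block exits."""
    try:
        yield
    finally:
        argv = ["npx", "serve-sim", "--kill"]
        if device:
            argv.append(device)  # stop only this device's helper
        run(argv)
```

A caller would wrap its taps and gestures in `with serve_sim_session("iPhone 16 Pro"):`. If the user explicitly wants the helpers to keep running, skip the wrapper.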
- Coordinates must be in 0..1. If the user gives pixel values, divide by the screen dimensions reported by GET /config.
- Do not use gesture for plain taps. Use tap. See "Critical gotcha" above.
- Do not assume npx serve-sim is already running. Verify with --list or by checking $TMPDIR/serve-sim/server-{udid}.json. If absent, start it explicitly.
- Valid button names are home, swipe_home, app_switcher, lock, siri, side_button. See references/buttons-rotation.md for the source-of-truth list.
- Do not parse human-readable output; use -q for JSON.
- Run npx serve-sim camera --stop-webcam when done.
- If you used the accessibility tree (GET /ax) to find a target element and the query returned no result, fail loudly: tapping a guessed spot is almost always worse than reporting "target not found" back to the user. See references/workflows.md workflow 1 for the guard pattern.
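
The pixel-to-normalized conversion mentioned above is a plain division, sketched here. normalize_point is a hypothetical helper; the width and height arguments are assumed to come from GET /config, whose exact field names are not documented in this section.

```python
from typing import Tuple


def normalize_point(px: float, py: float, width: float, height: float) -> Tuple[float, float]:
    """Convert pixel coordinates to the 0..1 range serve-sim expects."""
    if width <= 0 or height <= 0:
        raise ValueError("screen dimensions must be positive")
    x, y = px / width, py / height
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError(f"point ({px}, {py}) is outside a {width}x{height} screen")
    return x, y
```

For example, on a screen reported as 393 by 852, the pixel point (196.5, 426) maps to the center of the display, (0.5, 0.5).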