ios-sim

Name: Ios Sim
Author: ryanthedev

// Control the iOS Simulator — tap, type, swipe, screenshot, record video, install and launch apps. Use when interacting with the iOS simulator during React Native development. Triggers on "simulator", "tap", "swipe", "screenshot of simulator", "install app", "launch app", "what's on screen", "record video", "accessibility tree", "tap by label", "tap by id", "list elements", "go back", "scroll to top", "scroll to bottom".


stars:7

forks:0

updated:March 11, 2026 at 01:49

SKILL.md


name	ios-sim
description	Control the iOS Simulator — tap, type, swipe, screenshot, record video, install and launch apps. Use when interacting with the iOS simulator during React Native development. Triggers on "simulator", "tap", "swipe", "screenshot of simulator", "install app", "launch app", "what's on screen", "record video", "accessibility tree", "tap by label", "tap by id", "list elements", "go back", "scroll to top", "scroll to bottom".
allowed-tools	Bash, Read, Agent

Skill: ios-sim

On load: Read ../../.claude-plugin/plugin.json from this skill's base directory. Display ios-sim v{version} before proceeding.
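
The on-load step can be approximated in shell as follows. This is a minimal sketch, assuming plugin.json carries a top-level "version" string; the helper name plugin_version is hypothetical and not part of the skill, and a real JSON parser would be more robust than sed.

```shell
# plugin_version FILE
# Prints "ios-sim v<version>" using the first "version" string found in FILE.
# Assumption: FILE is a small JSON document with a "version": "x.y.z" entry.
plugin_version() {
  local version
  version=$(sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1)
  echo "ios-sim v${version}"
}

# Example (path taken from the on-load instruction above):
#   plugin_version "${CLAUDE_SKILL_DIR}/../../.claude-plugin/plugin.json"
```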

Control the iOS Simulator through shell scripts wrapping xcrun simctl and AXe.

IMPORTANT: Never load screenshots or accessibility trees in the main context.
Always dispatch a subagent for visual/inspection tasks.

Used By

Skill	What it uses
`a11y-audit`	`ui.sh describe-all` for accessibility tree capture
`layout-check`	`capture.sh view` + `ui.sh describe-all` for screenshot and element positions
`diagnose`	`capture.sh view` to read error text from the simulator screen
`deeplink-test`	Screenshot capture to verify the screen after firing a deep link

Prerequisites

macOS with Xcode installed
A booted iOS simulator (xcrun simctl list devices to check)
AXe (brew install cameroncooke/axe/axe) — required for all ui.sh commands (tap, tap-label, tap-id, type, swipe, describe-all, describe-point, list, back, scroll). capture.sh view works without AXe.
See ${CLAUDE_SKILL_DIR}/references/troubleshooting.md if anything is missing
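
A quick preflight check for the tools listed above might look like this. It is a sketch only: the helper name missing_tools is hypothetical, and it checks nothing beyond whether each command is on PATH.

```shell
# missing_tools TOOL...
# Prints the name of every TOOL that is not found on PATH, one per line.
# Prints nothing when all tools are available.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example (on a Mac):
#   missing_tools xcrun axe            # lists whichever of the two is absent
#   xcrun simctl list devices booted   # shows the booted simulators, if any
```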

Scripts

All scripts live at ${CLAUDE_SKILL_DIR}/scripts/. Run them with Bash.

Routing Table

Intent	Workflow	Why
See what's on screen	view	Image stays in subagent
Find UI elements/coordinates	inspect	JSON tree stays in subagent
Multi-step UI interaction	interact	Entire loop stays in subagent
Simple one-shot command	direct	No image/tree involved

Direct Commands (safe for main context)

Intent	Script	Example
Get booted simulator ID	`device.sh booted`	`device.sh booted`
Open Simulator app	`device.sh open`	`device.sh open`
Save screenshot to file	`capture.sh screenshot <path>`	`capture.sh screenshot /tmp/shot.png`
Start video recording	`capture.sh record`	`capture.sh record`
Stop video recording	`capture.sh stop`	`capture.sh stop`
Install app bundle	`app.sh install <path>`	`app.sh install /path/to/App.app`
Launch app by bundle ID	`app.sh launch <id>`	`app.sh launch com.example.app`
Tap element by accessibility label	`ui.sh tap-label <label>`	`ui.sh tap-label "Login"`
Tap element by accessibility ID	`ui.sh tap-id <id>`	`ui.sh tap-id "submit-button"`
List on-screen elements (Controls/Content)	`ui.sh list`	`ui.sh list`
Tap the back/navigation button	`ui.sh back`	`ui.sh back`
Scroll to top or bottom of list	`ui.sh scroll top|bottom`	`ui.sh scroll top`
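
Several direct commands can be chained so that a failure stops the sequence early. The sketch below uses only commands from the table; the function name install_launch_shot and its argument order are hypothetical.

```shell
# install_launch_shot SCRIPTS_DIR APP_PATH BUNDLE_ID SHOT_PATH
# Installs the app, launches it, then saves a screenshot.
# Each step runs only if the previous one succeeded.
install_launch_shot() {
  local scripts="$1" app="$2" bundle="$3" shot="$4"
  "$scripts/app.sh" install "$app" &&
    "$scripts/app.sh" launch "$bundle" &&
    "$scripts/capture.sh" screenshot "$shot"
}

# Example:
#   install_launch_shot "${CLAUDE_SKILL_DIR}/scripts" /path/to/App.app com.example.app /tmp/shot.png
```

None of these steps loads an image or an accessibility tree, so the whole sequence stays safe for the main context.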

Workflows

1. View (screenshot analysis)

When: "What's on the simulator screen?", "How does it look?", "Is there an error?"

The main agent never loads the image; a Haiku subagent does the analysis and returns text only.

Dispatch Agent:
  subagent_type: general-purpose
  model: haiku
  description: "ios-sim: analyze screenshot"
  prompt: |
    1. Run: ${CLAUDE_SKILL_DIR}/scripts/capture.sh view
       This outputs a file path to a compressed JPEG.
    2. Read that file path with the Read tool to see the image.
    3. Analyze and return:
       - Overview: What app/screen is visible (1-2 sentences)
       - Key elements: Buttons, text, inputs, navigation items
       - State: Errors, loading, forms filled, current tab
       - Coordinates: Notable interactive elements with approximate point positions
    4. If the user asked something specific, answer that directly.
    Return text only. Be concise.

    USER QUESTION: [insert user's question here]

2. Inspect (accessibility tree)

When: "What elements are on screen?", "Find the login button", "Where should I tap?"

The accessibility tree JSON can be massive. Parse it in a subagent.

Dispatch Agent:
  subagent_type: general-purpose
  model: haiku
  description: "ios-sim: inspect UI elements"
  prompt: |
    1. Run: ${CLAUDE_SKILL_DIR}/scripts/ui.sh describe-all
       This outputs the full accessibility tree as JSON.
    2. Parse the JSON and return a structured summary:
       - Screen dimensions (from root frame)
       - Interactive elements: buttons, text fields, switches, links
         Format each as: "Label" [type] at (x, y) — size WxH
       - Current focus/selection state
       - Navigation structure (tabs, headers, back buttons)
    3. If looking for a specific element, report its exact coordinates.
    Return text only. Be concise.

    LOOKING FOR: [insert what the user needs to find]
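
Once the subagent reports an element as (x, y) with size WxH, a tap target can be derived from it. The sketch below assumes that (x, y) is the frame's top-left corner in points, which the skill does not state explicitly; center_of is a hypothetical helper.

```shell
# center_of X Y W H
# Prints the center "CX CY" of a frame, truncated to whole points.
# Assumption: (X, Y) is the top-left corner; fractional inputs are accepted.
center_of() {
  awk -v x="$1" -v y="$2" -v w="$3" -v h="$4" \
    'BEGIN { printf "%d %d\n", x + w / 2, y + h / 2 }'
}

# Example:
#   center_of 20 100 200 44   # prints "120 122"
```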

3. Interact (multi-step UI automation)

When: "Tap the login button", "Fill in the form", "Navigate to settings"

Combines view + inspect + actions in a subagent loop. The entire interaction stays isolated — main context only gets the final result.

Dispatch Agent:
  subagent_type: general-purpose
  description: "ios-sim: UI interaction"
  prompt: |
    You are automating the iOS Simulator. Scripts are at:
    ${CLAUDE_SKILL_DIR}/scripts/

    Available commands:
    - capture.sh view               → compressed screenshot (read the output path to see it)
    - ui.sh describe-all            → full accessibility tree JSON
    - ui.sh describe-point X Y      → element at coordinates
    - ui.sh tap X Y                 → tap at point coordinates
    - ui.sh tap X Y --duration S    → long press
    - ui.sh type "text"             → type ASCII text (max 500 chars)
    - ui.sh swipe X1 Y1 X2 Y2       → swipe gesture
    - ui.sh tap-label "label"       → tap element by accessibility label (no coordinate lookup needed)
    - ui.sh tap-id "id"             → tap element by accessibility ID (no coordinate lookup needed)
    - ui.sh list                    → compact table of on-screen elements grouped by Controls/Content
    - ui.sh back                    → heuristic back-button finder and tap (scores by label/position)
    - ui.sh scroll top|bottom       → repeated swipes with stabilization detection (max 10 swipes)

    TASK: [insert what the user wants to do]

    WORKFLOW:
    1. First capture.sh view to see current state
    2. Use ui.sh describe-all if you need exact coordinates
    3. Perform the requested actions
    4. capture.sh view again to verify the result
    5. Return a text summary of what you did and the final state

    RULES:
    - Use POINT coordinates from the accessibility tree, not pixel coordinates
    - After each action, verify the result before proceeding
    - If something fails, try describe-all to re-orient
    - Return text summary only — do not include base64 image data
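
The "repeated swipes with stabilization detection" behavior of ui.sh scroll can be pictured with the loop below. This is an illustrative sketch, not the skill's actual implementation: scroll_until_stable is a hypothetical name, and both the swipe command and the snapshot command are passed in by the caller.

```shell
# scroll_until_stable SWIPE_CMD SNAPSHOT_CMD [MAX_SWIPES]
# Runs SWIPE_CMD until SNAPSHOT_CMD prints the same output twice in a row,
# or until MAX_SWIPES (default 10) swipes have been made.
# Prints the number of swipes performed.
scroll_until_stable() {
  local swipe_cmd="$1" snapshot_cmd="$2" max="${3:-10}"
  local before after count=0
  before="$($snapshot_cmd)"
  while [ "$count" -lt "$max" ]; do
    $swipe_cmd >/dev/null
    count=$((count + 1))
    after="$($snapshot_cmd)"
    # An unchanged snapshot means the last swipe moved nothing: the edge was reached.
    if [ "$after" = "$before" ]; then
      break
    fi
    before="$after"
  done
  echo "$count"
}
```

The swipe cap keeps the loop bounded even if the screen never settles, for example when content keeps loading as the list scrolls.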

Tips

All scripts auto-detect the booted simulator. Pass --udid <UUID> to target a specific device.
Screenshots are captured at 3x pixel resolution, while the accessibility tree reports point coordinates, so a position measured in screenshot pixels must be divided by 3 to get points. Always use point coordinates for tap/swipe.
ui.sh type only accepts ASCII printable characters (max 500 chars).
Video recording runs in the background. Use capture.sh stop to finish.
For the interact workflow, omit model to use the user's current model (better reasoning for complex multi-step tasks).
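
Two of the tips above can be sketched as small helpers. Both names (px_to_pt, is_typeable) are hypothetical and not part of the skill. The conversion assumes a full-resolution 3x screenshot, and the validator only mirrors the stated ASCII and 500-character limits; ui.sh type may apply further checks.

```shell
# px_to_pt PX_X PX_Y [SCALE]
# Converts a pixel position in a screenshot to point coordinates.
# SCALE defaults to 3, matching the 3x screenshots described above.
px_to_pt() {
  local scale="${3:-3}"
  echo "$(( $1 / scale )) $(( $2 / scale ))"
}

# is_typeable TEXT
# Succeeds if TEXT is at most 500 bytes and contains only printable ASCII
# (space through tilde). Fails otherwise.
is_typeable() {
  local len bad
  len=$(( $(printf '%s' "$1" | wc -c) ))
  # Delete every printable ASCII byte; anything left over is not typeable.
  bad=$(( $(printf '%s' "$1" | LC_ALL=C tr -d ' -~' | wc -c) ))
  [ "$len" -le 500 ] && [ "$bad" -eq 0 ]
}

# Example:
#   px_to_pt 600 1200                       # prints "200 400"
#   is_typeable "hello" && ui.sh type "hello"
```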

Context Efficiency

Item	Size	In Main Context?
Screenshot JPEG	~100-300 KB	NEVER — haiku subagent only
Accessibility tree JSON	~10-100 KB	NEVER — subagent only
Subagent text summary	~200-800 chars	YES
Direct commands (device, app)	~50-200 chars	YES

related-skills.json

same repository

coding.md

from "ryanthedev/react-native-foundations.skill"

React Native coding guidance — consults official docs before writing code and suggests verification after. Use when writing components, implementing features, building screens, fixing queries, or adding views. Triggers on "write a component", "implement this feature", "build this screen", "add a view", "React Native code", "fix this query", "use best practices".

2026-03-11, 7 stars

a11y-audit.md

from "ryanthedev/react-native-foundations.skill"

Audit accessibility of the running React Native app. Captures the accessibility tree from the iOS Simulator and checks it against React Native best practices. Triggers on "audit accessibility", "check a11y", "VoiceOver check", "accessibility issues", "screen reader test", "are labels correct".

2026-03-11, 7 stars

clean.md

from "ryanthedev/react-native-foundations.skill"

Intelligent environment reset for React Native projects — diagnoses stale caches before cleaning instead of blindly nuking everything. Triggers on "clean build", "reset caches", "fresh start", "metro won't start", "build is broken", "nuclear clean", "clear derived data".

2026-03-11, 7 stars

debug.md

from "ryanthedev/react-native-foundations.skill"

Debug React Native apps — view console logs, JS errors, Metro status, evaluate expressions, inspect React component tree, monitor network requests. Use for runtime debugging of JavaScript and React layers. Triggers on "console logs", "JS errors", "Metro status", "evaluate expression", "React component tree", "network requests", "debug".

2026-03-11, 7 stars

deeplink-test.md

from "ryanthedev/react-native-foundations.skill"

Test deep links and URL schemes in the iOS Simulator. Reads navigation/linking config, constructs test URLs, fires them via xcrun simctl, and verifies the resulting screen. Triggers on "test deep link", "test this URL scheme", "verify navigation to", "does deeplink work", "open URL in simulator", "test universal link".

2026-03-11, 7 stars

diagnose.md

from "ryanthedev/react-native-foundations.skill"

Diagnose React Native errors by cross-referencing error text against docs, error pattern database, and project config. Use for build failures, red screens, runtime crashes, and dependency issues. Triggers on "diagnose this error", "why is my build failing", "what does this red screen mean", "Unable to resolve module", "null is not an object", "Invariant Violation", "pod install failed", "gradlew failed", "EACCES", "metro error".

2026-03-11, 7 stars

package.json

"author": "ryanthedev"

"repository": "ryanthedev/react-native-foundations.skill"


