| name | desktop-app-flows |
| description | Understand and explore the Omi desktop macOS app's UI flows, navigation patterns, and SwiftUI architecture. Use when developing features, fixing bugs, or verifying changes in desktop/ Swift files. Provides agent-swift commands to explore the live app, understand how screens connect, and verify your work. |
| allowed-tools | Bash, Read, Glob, Grep |
Omi Desktop App: Flows & Exploration
This skill teaches you the Omi desktop macOS app's navigation structure, screen architecture, and SwiftUI patterns. Use it when developing features (to understand how the app works), fixing bugs (to navigate to the affected screen), or verifying changes (to confirm your code works in the live app).
Fast-Path for Local Iteration (start here)
Two things make iterating on the desktop app slow: signing in (web OAuth) and clicking through the UI to reach a screen. Both are solved; use these before reaching for agent-swift.
1. Skip the web login (seed auth once, reuse forever)
Dev/named bundles store auth in UserDefaults (not Keychain), so a signed-in session can be cloned between bundles. Sign in once in "Omi Dev", then replay it into any test bundle:
cd desktop
./scripts/omi-auth-dump.sh
./scripts/omi-auth-seed.sh com.omi.omi-myfeature
The seeded bundle boots already signed in and past onboarding, with no browser involved. The captured Firebase idToken expires after about an hour; if backend calls start returning 401, sign in again and re-run omi-auth-dump.sh. Scope: this is for dev iteration only. When validating the onboarding or auth flows themselves (or running flow-walker E2E), use the real flow per Guard Conditions below.
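Because the captured idToken lasts only about an hour, it can help to note when you last ran omi-auth-dump.sh and check its age before a long session. A minimal sketch, assuming you record that time yourself as a Unix timestamp; `auth_dump_is_stale` and the exact 3600-second threshold are illustrative and not part of the scripts:

```shell
# Hypothetical helper: succeeds (exit 0) when a dump is older than the
# roughly one-hour idToken lifetime mentioned above.
TOKEN_TTL_SECONDS=3600   # assumption: "~1h" treated as exactly 3600 seconds

auth_dump_is_stale() {
  # $1 = Unix time of the last omi-auth-dump.sh run
  # $2 = current Unix time (defaults to now)
  dumped_at=$1
  now=${2:-$(date +%s)}
  [ $((now - dumped_at)) -ge "$TOKEN_TTL_SECONDS" ]
}
```

If `auth_dump_is_stale "$LAST_DUMP"` succeeds, sign in to "Omi Dev" again and re-run the dump before seeding another bundle.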
2. Jump straight to any screen (automation bridge)
The app runs a local HTTP control bridge (DesktopAutomationBridge.swift) that auto-enables on every non-production bundle (it stays off on production). scripts/omi-ctl drives it, so you can jump to a screen in ~150ms instead of clicking through the sidebar:
./scripts/omi-ctl wait-ready
./scripts/omi-ctl navigate rewind
./scripts/omi-ctl navigate settings rewind
./scripts/omi-ctl state
./scripts/omi-ctl screens
Disable with OMI_DISABLE_LOCAL_AUTOMATION=1 to run a dev build "clean". Running several named bundles at once? Give each its own OMI_AUTOMATION_PORT (default 47777).
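When several named bundles run side by side, one simple scheme is to count up from the default port. This is a sketch only: `port_for_slot` is a hypothetical helper, and the launch line in the comment assumes run.sh passes OMI_AUTOMATION_PORT through to the app.

```shell
# Default automation port, as stated above.
BASE_PORT=47777

port_for_slot() {
  # $1 = zero-based slot number for a bundle; slot 0 keeps the default port
  echo $((BASE_PORT + $1))
}

# Example launch (not executed here; assumes run.sh forwards the variable):
#   OMI_APP_NAME="omi-myfeature" OMI_AUTOMATION_PORT="$(port_for_slot 1)" ./run.sh &
```

Keeping slot 0 on the default means a single-bundle setup needs no extra configuration.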
2b. Run semantic actions (cursor-free, in-process)
Beyond navigation, the bridge exposes named actions that invoke the app's real
code paths directly. They send no synthetic mouse events, so they never grab the
cursor (the deterministic equivalent of the Flutter app's Marionette driver).
Prefer these over agent-swift click/coordinate clicking for anything they cover.
./scripts/omi-ctl actions
./scripts/omi-ctl action refresh_all_data
./scripts/omi-ctl action toggle_transcription enabled=false
Add new actions in DesktopAutomationActionRegistry (registerBuiltins() for global
ones, or register(name:summary:params:handler:) from a view model for screen-scoped
ones). GET /actions lists them; POST /action {name, params} runs one and returns
the resulting state snapshot.
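If you want to call the bridge without omi-ctl, the request body for POST /action can be assembled from the same key=value arguments. A rough sketch: `action_body` is a hypothetical helper, it sends every parameter as a JSON string (the bridge may expect typed values), and the 127.0.0.1 host in the comment is an assumption.

```shell
# Build a {name, params} JSON body from "name key=value ..." arguments.
# Values are emitted as JSON strings and are not escaped, so keep them
# free of quotes and backslashes.
action_body() {
  name=$1
  shift
  params=""
  for kv in "$@"; do
    key=${kv%%=*}   # text before the first "="
    val=${kv#*=}    # text after the first "="
    params="${params:+$params,}\"$key\":\"$val\""
  done
  printf '{"name":"%s","params":{%s}}\n' "$name" "$params"
}

# Example call (not executed here; host is an assumption, port is the default):
#   curl -s -X POST -H 'Content-Type: application/json' \
#     -d "$(action_body toggle_transcription enabled=false)" \
#     http://127.0.0.1:47777/action
```

For everyday use, `./scripts/omi-ctl action` remains the simpler route; this only shows what travels over the wire under the stated assumptions.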
The full loop
cd desktop
OMI_APP_NAME="omi-myfeature" ./run.sh &
./scripts/omi-auth-seed.sh com.omi.omi-myfeature
./scripts/omi-ctl wait-ready
./scripts/omi-ctl navigate memories
agent-swift connect --bundle-id com.omi.omi-myfeature
agent-swift snapshot -i --json
After a code change, an incremental xcrun swift build + relaunch is fast, because the slow parts (login, navigation) are gone. For pure visual checks without launching at all, SwiftUI snapshot tests are an option, but most pages are entangled with AppState.shared/Firebase singletons, so the live-app bridge loop above is usually the better path.
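The loop above can be wrapped in a small function so that each iteration is one command. This is a sketch under two assumptions: the bundle ID is `com.omi.` followed by OMI_APP_NAME (inferred from the single omi-myfeature example), and `bundle_id_for` and `dev_loop` are hypothetical names.

```shell
# Assumption: bundle ID = "com.omi." + OMI_APP_NAME, as in the example above.
bundle_id_for() {
  printf 'com.omi.%s\n' "$1"
}

# Hypothetical wrapper; run it from the desktop/ directory.
dev_loop() {
  # $1 = app name (e.g. omi-myfeature), $2 = screen to open (e.g. memories)
  app_name=$1
  screen=$2
  bundle_id=$(bundle_id_for "$app_name")
  # Same order as the loop above: launch, seed auth, wait, navigate, inspect.
  OMI_APP_NAME="$app_name" ./run.sh &
  ./scripts/omi-auth-seed.sh "$bundle_id" &&
    ./scripts/omi-ctl wait-ready &&
    ./scripts/omi-ctl navigate "$screen" &&
    agent-swift connect --bundle-id "$bundle_id" &&
    agent-swift snapshot -i --json
}
```

Calling `dev_loop omi-myfeature memories` would then reproduce the sequence shown above.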
How to Explore the App
You can interact with the running app via agent-swift, a CLI that clicks elements, reads the accessibility tree, and captures screenshots through the macOS Accessibility API. It works with any macOS app; no app-side instrumentation is needed.
Setup
agent-swift doctor
agent-swift connect --bundle-id com.omi.desktop-dev
agent-swift snapshot -i --json
Commands
| Command | Purpose | Example |
|---|---|---|
| `snapshot -i --json` | See all interactive elements with refs, types, labels | `agent-swift snapshot -i --json` |
| `click @ref` | CGEvent click, for SwiftUI elements (NavigationLink, gestures) | `agent-swift click @e3` |
| `press @ref` | AXPress, for AppKit buttons and Settings sidebar items | `agent-swift press @e5` |
| `find role/text/key VALUE` | Find element and chain action | `agent-swift find text "Settings" click` |
| `fill @ref "text"` | Type into text field | `agent-swift fill @e7 "search"` |
| `scroll down/up` | Scroll current view | `agent-swift scroll down` |
| `wait text "X"` | Wait for element to appear | `agent-swift wait text "Loading" --timeout 5000` |
| `is exists @ref` | Assert element exists (exit 0/1) | `agent-swift is exists @e3` |
| `get PROP @ref` | Read property value | `agent-swift get value @e5 --json` |
| `screenshot PATH` | Capture app window | `agent-swift screenshot /tmp/screen.png` |
Key rules:
- `click` = CGEvent mouse click (SwiftUI). Use for main sidebar icons, NavigationLink.
- `press` = AXPress action (AppKit). Use for Settings sidebar sections.
- Refs go stale after any mutation, so always re-snapshot before the next interaction.
- `find` with a chained action is more stable than hardcoded @ref numbers.
- The `--json` flag on any command gives structured output for parsing.
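Two of the rules above (prefer `find` with a chained action, and re-snapshot after every mutation) combine naturally into one helper. A minimal sketch: `go_to` and the `AGENT_SWIFT` override variable are hypothetical; the agent-swift subcommands are the ones listed in the table.

```shell
# Allow the binary to be overridden (for example, for a dry run with echo).
AGENT_SWIFT=${AGENT_SWIFT:-agent-swift}

go_to() {
  # $1 = accessibility identifier, e.g. sidebar_memories
  # Click by identifier rather than by @ref, then take a fresh snapshot,
  # because any refs captured before the click are now stale.
  "$AGENT_SWIFT" find key "$1" click &&
    "$AGENT_SWIFT" snapshot -i --json
}
```

Setting `AGENT_SWIFT=echo` before calling `go_to` prints the two commands instead of running them, which is a cheap way to check what would be sent.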
App Navigation Architecture
Screen Map
Main Window
├── Sidebar (SidebarView.swift) - use `click`
│   ├── Home (DesktopHomeView.swift)
│   ├── Conversation (ChatSessionsSidebar.swift)
│   ├── brain → Memories
│   ├── checklist → Action Items
│   ├── puzzlepiece.fill → Integrations
│   └── gearshape.fill → Settings
│
└── Settings (SettingsPage.swift) - use `press` for sidebar sections
    ├── General - app preferences
    ├── Rewind - screenshot/timeline settings
    ├── Transcription - Language Mode (Auto-Detect / Single Language)
    │   └── Language picker (popupbutton or button)
    ├── Notifications - alert preferences
    ├── Privacy - data settings
    ├── Account - user info
    ├── AI Chat - chat model settings
    ├── Advanced - developer options
    └── About - version info
System Tray Menu
├── openOmi → Open Omi
├── checkFor → Check for Updates
├── resetOnb → Reset Onboarding
├── reportIs → Report Issue
├── signOut → Sign Out
└── quitApp → Quit
Interaction Patterns
Main sidebar navigation:
- Icons are `image` type elements with accessibility identifiers: `sidebar_dashboard`, `sidebar_chat`, `sidebar_memories`, `sidebar_tasks`, `sidebar_rewind`, `sidebar_apps`, `sidebar_settings`
- Use `find key sidebar_dashboard click` for reliable navigation (survives UI changes)
- Keyboard shortcuts: Cmd+1 (Dashboard), Cmd+2 (Chat), Cmd+3 (Memories), Cmd+4 (Tasks), Cmd+5 (Rewind), Cmd+6 (Apps), Cmd+, (Settings)
- Use `click`: these are SwiftUI views with onTapGesture
Settings sidebar navigation:
- Sections are `button` type elements with section name labels
- Use `press`: these are SwiftUI Button views that respond to AXPress
Transcription language mode:
- Two radio-button-style options: "Auto-Detect Multi-Language" and "Single Language Better Accuracy"
- `click` on the text to switch modes
- Single Language mode shows a language picker (`popupbutton`)
- Click `popupbutton` → menu items appear as `menuitem` elements
System tray menu:
- Menu items have `identifier` prefixes for detection
- Access via `snapshot --json` (includes menu bar items)
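For the main sidebar, the accessibility identifiers listed above follow a `sidebar_<screen>` pattern, so a lookup helper can catch typos before they reach agent-swift. A small sketch; `sidebar_key` is a hypothetical name and the accepted screens are exactly the seven identifiers given above.

```shell
# Map a screen name to its sidebar accessibility identifier.
sidebar_key() {
  case $1 in
    dashboard|chat|memories|tasks|rewind|apps|settings)
      printf 'sidebar_%s\n' "$1"
      ;;
    *)
      # Unknown names fail loudly instead of producing a bogus identifier.
      echo "unknown sidebar screen: $1" >&2
      return 1
      ;;
  esac
}
```

It pairs with the chained form, for example `agent-swift find key "$(sidebar_key rewind)" click`.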
Known Flows
Reference flows in desktop/e2e/flows/*.yaml describe the app's key user journeys. Read these to understand navigation paths, expected elements, and UI state at each step.
| Flow | Covers | Steps | Report |
|---|---|---|---|
| flows/navigation.yaml | SidebarView, DesktopHomeView | 6/6 PASS | report |
| flows/dashboard.yaml | DashboardPage, GoalsWidget, TasksWidget | 3/6 (3 skipped) | report |
| flows/chat.yaml | ChatPage, ChatProvider | 5/5 PASS | report |
| flows/memories.yaml | MemoriesPage, MemoryGraphPage | 5/6 (1 skipped) | report |
| flows/tasks.yaml | TasksPage, TasksStore | 4/5 (1 skipped) | report |
| flows/settings.yaml | SettingsPage, SettingsSidebar | 9/9 PASS | report |
| flows/language.yaml | SettingsPage, SettingsSidebar | 5 steps | - |
| flows/rewind.yaml | RewindPage | 4/4 PASS | report |
| flows/apps.yaml | IntegrationsPage | 6/6 PASS | report |
| flows/refer.yaml | ReferPage | 3/3 PASS | report |
| flows/screen-recording-permission.yaml | RewindPage, ScreenCaptureService, PermissionsPage | 7/7 PASS | report |
| flows/audio-recording.yaml | ConversationsPage, AudioCaptureService, AppState | 7/7 PASS | report |
When you modify a Swift file, check if any flow's covers: includes it. That flow describes the user journey your change affects.
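That check is easy to script. The sketch below lists flow files that mention a changed Swift file by its base name; `flows_covering` is a hypothetical helper, and it searches the whole file rather than parsing the `covers:` list, so treat a hit as a hint rather than proof.

```shell
flows_covering() {
  # $1 = changed Swift file (path or bare name)
  # $2 = flows directory (defaults to desktop/e2e/flows, relative to the repo root)
  file_name=$(basename "$1")
  flows_dir=${2:-desktop/e2e/flows}
  # -l prints only the names of matching files; -F treats the name literally.
  grep -l -F -- "$file_name" "$flows_dir"/*.yaml
}
```

An empty result with a non-zero exit status means no flow mentions the file.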
Adding a New Flow
Create desktop/e2e/flows/<name>.yaml in v2 format:
version: 2
name: my-flow
description: What this flow covers
app: com.omi.computer-macos
covers:
  - desktop/Desktop/Sources/path/to/YourView.swift
preconditions:
  - auth_ready
steps:
  - id: S1
    name: Step description
    do: "Click the element (identifier: my_element). Verify the page loads."
    expect:
      interactive_count: { min: 5 }
      text_visible:
        - Expected Text
Important: Always use quoted strings for do: fields (not YAML > or |).
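A quick lint can catch a `do:` field written as a block scalar before the flow is run. This is only a sketch: `do_fields_quoted` is a hypothetical helper, it uses a line-based pattern rather than a YAML parser, and it accepts either single or double quotes.

```shell
do_fields_quoted() {
  # $1 = flow file. Succeeds only if every "do:" value starts with a quote;
  # "do: >" and "do: |" (block scalars) or bare text make it fail.
  ! grep -qE "^[[:space:]-]*do:[[:space:]]*[^\"'[:space:]]" "$1"
}
```

Running it over a new flow, for example `do_fields_quoted desktop/e2e/flows/my-flow.yaml`, gives a pass/fail exit status suitable for a pre-commit check.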
Verification & Evidence
After making changes, verify them in the live app:
- Navigate to the affected screen using the commands above
- Check that your changes appear (snapshot, screenshot)
- Test interactions (click buttons, fill fields, scroll)
- Capture evidence: `agent-swift screenshot /tmp/evidence.png`
- Generate video: `ffmpeg -framerate 1 -pattern_type glob -i '/tmp/e2e-*.png' -vf "scale=1280:720:force_original_aspect_ratio=decrease,pad=1280:720:-1:-1" -c:v libx264 -pix_fmt yuv420p /tmp/report.mp4`
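The ffmpeg command picks up every file matching `/tmp/e2e-*.png`, so per-step screenshots need names that sort in step order. One possible naming helper, assuming zero-padded step numbers; `evidence_path` is a hypothetical name:

```shell
evidence_path() {
  # $1 = step number (plain integer). Zero-padding keeps lexical order equal
  # to step order, so step 10 does not sort before step 2.
  printf '/tmp/e2e-%03d.png\n' "$1"
}
```

Each step can then capture with `agent-swift screenshot "$(evidence_path 1)"`, `agent-swift screenshot "$(evidence_path 2)"`, and so on.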
Decision Tree
| Problem | Solution |
|---|---|
| Element not found | Re-snapshot, try scrolling, check if on wrong screen |
| Click doesn't navigate | Try press instead (Settings sidebar = press, main sidebar = click) |
| Picker not responding | SwiftUI Picker .menu style may not expose as popupbutton; look for button with value label |
| App seems frozen | Check agent-swift status --json, re-connect, check /private/tmp/omi-dev.log |
Guard Conditions
NEVER:
- Kill or restart the production Omi app
- Enable the automation bridge or seed auth on the production bundle (`com.omi.computer-macos`); both are gated to non-production builds, so keep it that way
- Modify source code to make tests pass; report the failure instead
When validating auth or onboarding themselves, or running flow-walker E2E: drive the real flows. Do NOT use the seeded-auth / hasCompletedOnboarding fast-path, which exists only for iterating on other screens. The beta app (com.omi.computer-macos) is the standard target for flow-walker E2E testing; the dev app (com.omi.desktop-dev) and named omi-* bundles are for local development only.
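Wrapper scripts can enforce the production-bundle rule mechanically. A minimal sketch: `refuse_production` is a hypothetical guard, and the bundle ID is the one named in the NEVER list above.

```shell
# Bundle ID that must never be seeded or automated, per the rules above.
PRODUCTION_BUNDLE_ID="com.omi.computer-macos"

refuse_production() {
  # $1 = bundle ID about to be seeded or driven
  if [ "$1" = "$PRODUCTION_BUNDLE_ID" ]; then
    echo "refusing to touch production bundle: $1" >&2
    return 1
  fi
}
```

Chaining it in front of a command, as in `refuse_production "$BUNDLE" && ./scripts/omi-auth-seed.sh "$BUNDLE"`, makes the seed step a no-op for the protected bundle.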