| name | tauri-agent-dev |
| description | Spawn, probe, and stop Mini Diarium's live Windows Tauri dev app with WebView2 CDP enabled, then hand control to agent-browser for real UI inspection. Use this whenever the user wants to manually test the real desktop UI, drive the dev app, verify a bug or preference in the actual window, inspect localStorage, take a real screenshot, or "actually try it in the app" instead of relying only on unit tests or WDIO. Triggers: manually test the UI, drive the dev app, verify in the real UI, agent dev mode, spawn the dev app, open the running app and check, inspect the live Tauri window.
|
Tauri Agent Dev
Platform Support
- Use this skill on Windows only.
- Do not use it on macOS or Linux. WebView2 CDP is the mechanism here; the other Tauri webviews do not match this flow.
Start A Session
Run everything from the repo root with the Windows toolchain:
cmd.exe /c bun run agent:dev:start
cmd.exe /c bun run agent:dev:probe
Useful flags:
cmd.exe /c bun run agent:dev:start -- --port 9223
cmd.exe /c bun run agent:dev:start -- --timeout 180
cmd.exe /c bun run agent:dev:start -- --use-real-config
What start does:
- launches tauri dev
- enables WebView2 remote debugging
- defaults to a sandbox under .agent-dev/sandbox/
- seeds .agent-dev/sandbox/app/config.json on first run so the frontend auto-selects the sandbox journal
- isolates WebView storage under .agent-dev/sandbox/webview/ so localStorage does not leak across runs
- writes runtime state to .agent-dev/state.json
- writes logs to .agent-dev/dev.log
After start succeeds, connect the separate browser-driving layer:
agent-browser connect 9222
If you changed the port, connect to that port instead.
Drive The UI
After agent-browser connect, use the normal browser-driving loop:
- agent-browser snapshot
- click or fill controls
- re-snapshot after meaningful UI transitions
- use eval for DOM or localStorage reads
- take screenshots when the user needs proof
PowerShell note:
- Quote @eNNN refs. Use agent-browser click '@e5', not agent-browser click @e5.
Stale ref warning: @eNNN refs are assigned at snapshot time. Any DOM mutation (tab switch,
scroll, dialog open/close) can reassign refs so an old ref silently targets a different element.
For controls inside scrollable panels or dialogs, prefer CSS selectors or JS eval with label-text
matching over bare @eNNN refs. Always re-snapshot after a meaningful transition before clicking
a ref from a previous snapshot.
Stable Selectors
Prefer the app's documented data-testid hooks where they exist. Do not invent new ones.
- password-create-input
- password-repeat-input
- create-journal-button
- password-unlock-input
- unlock-journal-button
- toggle-sidebar-button
- lock-journal-button
- title-input
- calendar-day-YYYY-MM-DD
- entry-nav-bar
- entry-prev-button
- entry-number-button-{N}
- entry-next-button
- entry-add-button
- entry-delete-button
For Preferences and tab navigation, use visible text and current DOM state. There is no documented data-testid contract for those controls in src/CLAUDE.md.
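The hooks above can be turned into CSS selectors for eval calls or selector-based clicks. This is a minimal sketch, assuming the hooks are exposed as data-testid attributes as this section states. The helper names byTestId, calendarDaySelector, and entryNumberSelector are hypothetical and are not part of the app or of agent-browser.

```javascript
// Hypothetical helpers: build CSS selectors for the documented data-testid hooks.
function byTestId(id) {
  return `[data-testid="${id}"]`;
}

// Builds the selector for a calendar day. The date must already be a
// YYYY-MM-DD string; anything else is rejected instead of being guessed at.
function calendarDaySelector(isoDate) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(isoDate)) {
    throw new Error(`expected YYYY-MM-DD, got: ${isoDate}`);
  }
  return byTestId(`calendar-day-${isoDate}`);
}

// Builds the selector for entry-number-button-{N}. Whether N starts at 0 or 1
// is not stated in this document, so only "non-negative integer" is enforced.
function entryNumberSelector(n) {
  if (!Number.isInteger(n) || n < 0) {
    throw new Error(`expected a non-negative integer, got: ${n}`);
  }
  return byTestId(`entry-number-button-${n}`);
}
```

Inside the WebView, document.querySelector(byTestId('title-input')) would then return the title field if the hook is present.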
Common Recipes
Create Or Unlock A Journal
For a fresh sandbox:
- fill password-create-input
- fill password-repeat-input
- click create-journal-button
For an existing sandbox journal:
- fill password-unlock-input
- click unlock-journal-button
Read Or Verify Preferences
Open the Preferences UI through the visible app controls, then navigate to the needed tab by text.
To inspect saved preferences directly:
JSON.parse(localStorage.getItem('preferences') ?? '{}');
Typical checks:
- autoLockEnabled
- autoLockTimeout
- language
- editorFontFamily
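To report only the keys listed above, the parsed object can be narrowed before it is returned from eval. This is a minimal sketch. summarizePreferences is a hypothetical helper, the key names come from the list above, and nothing is assumed about the value types.

```javascript
const CHECKED_KEYS = ['autoLockEnabled', 'autoLockTimeout', 'language', 'editorFontFamily'];

// Hypothetical helper: takes the raw string from localStorage (or null)
// and returns only the keys of interest.
function summarizePreferences(raw) {
  const prefs = JSON.parse(raw ?? '{}') ?? {};
  const summary = {};
  for (const key of CHECKED_KEYS) {
    // Missing keys are reported as null so that "absent" and "false" stay distinguishable.
    summary[key] = key in prefs ? prefs[key] : null;
  }
  return summary;
}

// Inside the WebView:
// summarizePreferences(localStorage.getItem('preferences'));
```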
Auto-lock timer interference: If autoLockTimeout is short (< 30 s), the journal can lock
between CDP roundtrips, because eval calls do not dispatch DOM activity events and therefore
do not reset the idle timer. Patch the timeout in localStorage before starting a multi-step test:
(function() {
  const p = JSON.parse(localStorage.getItem('preferences') ?? '{}');
  p.autoLockTimeout = 600;
  localStorage.setItem('preferences', JSON.stringify(p));
})();
Run this eval immediately after unlocking, before taking the first snapshot. Restore the original
value when done if needed.
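The patch-and-restore cycle can be wrapped so the original value is not lost. This is a minimal sketch, assuming preferences are stored as a JSON object under the 'preferences' key as shown above. patchAutoLockTimeout is a hypothetical helper, and the storage parameter exists only so the logic can be exercised outside the WebView.

```javascript
// Hypothetical helper: sets autoLockTimeout and returns a function that
// puts the previous state back. `storage` is anything with getItem/setItem,
// for example window.localStorage.
function patchAutoLockTimeout(storage, timeout) {
  const prefs = JSON.parse(storage.getItem('preferences') ?? '{}');
  const hadKey = Object.prototype.hasOwnProperty.call(prefs, 'autoLockTimeout');
  const original = prefs.autoLockTimeout;

  prefs.autoLockTimeout = timeout;
  storage.setItem('preferences', JSON.stringify(prefs));

  return function restore() {
    // Re-read first so other preference changes made in the meantime survive.
    const current = JSON.parse(storage.getItem('preferences') ?? '{}');
    if (hadKey) current.autoLockTimeout = original;
    else delete current.autoLockTimeout;
    storage.setItem('preferences', JSON.stringify(current));
  };
}
```

Each eval is a separate call, so the returned restore function does not survive between them on its own. One option is to keep it on window. Another is to return the original value from the first eval and write it back in a later one.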
Navigate A Scrollable Preferences Dialog
The Preferences dialog clips its content with a scrollable container. That container is not the
[role="tabpanel"] element itself (which has overflow: visible); the actual scroller matches the
selector .flex-1.overflow-y-auto.
Correct approach: scroll a specific element into view.
// Option 1: target a known element by selector.
document.querySelector('#some-element-id').scrollIntoView({behavior: 'instant', block: 'center'});
// Option 2: find a checkbox by its label text, then scroll it into view.
Array.from(document.querySelectorAll('input[type="checkbox"]'))
  .find(el => el.labels?.[0]?.textContent?.trim() === 'Lock after inactivity')
  ?.scrollIntoView({behavior: 'instant', block: 'center'});
Wrong approach: setting scrollTop on the tabpanel, which is not the scroll container, so the assignment does not scroll the dialog:
document.querySelector('[role="tabpanel"]').scrollTop = 400;
After scrollIntoView, take a screenshot to confirm the element is visible before clicking.
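If the scroller's class names change, the scroll container can be located from the target element instead of being hard-coded. This is a minimal sketch. findScrollContainer is a hypothetical helper, and its getStyle parameter exists only so the ancestor walk can be tested without a browser.

```javascript
// Hypothetical helper: walks up from `el` and returns the first ancestor that
// both allows vertical scrolling and actually has overflowing content.
function findScrollContainer(el, getStyle = (node) => getComputedStyle(node)) {
  for (let node = el.parentElement; node; node = node.parentElement) {
    const overflowY = getStyle(node).overflowY;
    const scrollable = overflowY === 'auto' || overflowY === 'scroll';
    if (scrollable && node.scrollHeight > node.clientHeight) return node;
  }
  return null; // no scrollable ancestor found
}

// Inside the WebView:
// const cb = document.querySelector('input[type="checkbox"]');
// findScrollContainer(cb)?.className;
```

Reading className from the result is a quick way to confirm which element is the real scroller before relying on a selector for it.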
Compound Eval for Atomic UI Chains
When multiple UI interactions must happen without a roundtrip gap (e.g., to avoid the auto-lock
timer or avoid stale refs), chain them in a single eval call:
(function() {
  const openBtn = Array.from(document.querySelectorAll('button'))
    .find(b => b.textContent.includes('Open Preferences'));
  if (!openBtn) return {error: 'button not found'};
  openBtn.click();
  const tab = Array.from(document.querySelectorAll('[role="tab"]'))
    .find(t => t.textContent.trim() === 'Security');
  if (!tab) return {error: 'tab not found'};
  tab.click();
  const cb = Array.from(document.querySelectorAll('input[type="checkbox"]'))
    .find(el => el.labels?.[0]?.textContent?.trim() === 'Lock after inactivity');
  if (!cb) return {error: 'checkbox not found'};
  cb.scrollIntoView({block: 'center'});
  return {checkboxChecked: cb.checked};
})()
Use this pattern when:
- The idle timer is short and would fire between steps
- You need to read state immediately after opening a dialog (before refs can go stale)
- You're verifying that a setting persisted after Save + re-open in one shot
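The repeated find-or-bail steps in the compound eval can be factored into a small runner. This is a minimal sketch. runSteps and findByText are hypothetical helpers, and like the example above it assumes the UI updates synchronously after each click.

```javascript
// Hypothetical runner: executes named steps in order and stops at the first
// step that returns null or undefined.
function runSteps(steps) {
  const ctx = {}; // results of earlier steps, keyed by step name
  let last;
  for (const [name, step] of steps) {
    last = step(ctx);
    if (last == null) return { error: `${name} not found` };
    ctx[name] = last;
  }
  return last; // value of the final step
}

// Finds the first element matching `selector` whose trimmed text passes `test`.
// Only usable inside the WebView, where `document` exists.
function findByText(selector, test) {
  return Array.from(document.querySelectorAll(selector))
    .find((el) => test(el.textContent.trim()));
}
```

Each step can find an element, click it, and return it. The final step should return a plain object such as {checkboxChecked: cb.checked} so the eval result stays serializable.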
Capture A Screenshot
Use agent-browser's screenshot flow after the app is in the exact state the user cares about. Prefer this over describing the UI from memory.
End The Session
Always stop the dev session before finishing the task:
cmd.exe /c bun run agent:dev:stop
Optional:
cmd.exe /c bun run agent:dev:stop -- --keep-sandbox
Stopping is not optional cleanup. It kills both long-lived Windows process roots and removes sandbox state unless --keep-sandbox is passed.
Sandbox Semantics
- Default mode is sandboxed.
- Sandbox paths live under .agent-dev/sandbox/.
- Start seeds config.json with a single sandbox journal on first run, so the app does not fall back to the journal picker.
- MINI_DIARIUM_DATA_DIR points directly at the sandbox diary dir, while the seeded app config gives the frontend an active journal selection.
- WebView storage is isolated under the sandbox too, so preferences and other localStorage state start clean on a fresh sandbox.
- First launch against an empty sandbox lands on journal creation.
- Reusing the same sandbox lands on password unlock.
- Use --use-real-config only when the bug depends on the user's actual app state.
Troubleshooting
- Start can take 30-90 seconds on a cold build. Use --timeout 180 if Rust rebuilds are expected.
- If port 9222 is already taken, restart with --port 9223 and connect agent-browser to that port.
- If start succeeds but the page target is not immediately recorded, run cmd.exe /c bun run agent:dev:probe. Probe resolves the current page target from the live /json list.
- If probe says cdp unreachable, inspect .agent-dev/dev.log.
- If probe says the managed PIDs are not alive, the dev session is gone. Start a new one.
- If a stop attempt fails, do not delete .agent-dev/state.json by hand until you understand which root is still alive.
- Journal locks repeatedly during session: If the app is configured with a short autoLockTimeout (< 30 s), the idle timer fires between CDP roundtrips. eval calls do not count as user activity. Patch autoLockTimeout to 600 in localStorage immediately after the first unlock (see "Read Or Verify Preferences" recipe above).
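The probe step resolves the page target from the CDP /json list. This is a minimal sketch of that kind of selection, assuming the standard Chromium target fields (type, url, webSocketDebuggerUrl). pickPageTarget is a hypothetical helper and is not necessarily how agent:dev:probe is implemented. Preferring http(s) URLs is also an assumption about how the dev app is served.

```javascript
// Hypothetical helper: chooses the most likely app page from a CDP target list.
function pickPageTarget(targets) {
  // Keep only page targets that expose a debugger socket.
  const pages = targets.filter((t) => t.type === 'page' && t.webSocketDebuggerUrl);
  // Prefer an http(s) page over internal ones such as about:blank.
  const app = pages.find((t) => /^https?:\/\//.test(t.url));
  return app ?? pages[0] ?? null;
}

// Against a live session (9222 is the default port used in this document):
// const targets = await (await fetch('http://127.0.0.1:9222/json/list')).json();
// console.log(pickPageTarget(targets));
```

A null result with a reachable endpoint suggests the WebView has not created its page yet, which matches the "page target is not immediately recorded" case above.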
What This Skill Does Not Do
- It does not replace WDIO or CI E2E coverage.
- It does not support production builds.
- It does not support macOS or Linux.
- It does not bundle browser automation itself; it relies on the separate agent-browser capability after startup.