daytona-flow-validator
Daytona UI flow validation loop. Use when validating real app behavior, checking a Daytona flow, proving a bug is fixed, or deciding pass/fail from CDP snapshots, screenshots, and assertions.
| name | daytona-flow-validator |
| description | Daytona UI flow validation loop. Use when validating real app behavior, checking a Daytona flow, proving a bug is fixed, or deciding pass/fail from CDP snapshots, screenshots, and assertions. |
Use this skill to decide whether a Daytona Electron or browser flow actually works. Launching the sandbox is separate. This skill owns the feedback loop.
Never report success from a click, script return value, or recording alone. Every meaningful action must follow this loop:
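- Observe the current state with browser_snapshot or browser_eval.
- Act with browser_click, browser_fill, or browser_eval.
- Observe the result with browser_snapshot or browser_eval.
- Assert that the observed change matches the expected outcome.

If any assertion is missing, the flow is not validated yet.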
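The loop above can be sketched as a small wrapper. This is a minimal illustration, not part of the skill's tooling: `validateStep`, `observe`, `act`, and `check` are hypothetical names, with `observe` and `act` standing in for whichever browser_snapshot, browser_eval, browser_click, or browser_fill call the step uses.

```javascript
// Hypothetical helper: runs one observe -> act -> observe -> assert cycle.
// `observe` and `act` are caller-supplied async functions that wrap the real tool calls.
// `check` compares the two observations and returns { pass, reason }.
async function validateStep({ name, observe, act, check }) {
  const before = await observe();
  await act();
  const after = await observe();

  // A step without an assertion is never reported as passed.
  if (typeof check !== 'function') {
    return { name, before, after, status: 'incomplete', reason: 'no assertion supplied' };
  }

  const result = check(before, after);
  return {
    name,
    before,
    after,
    status: result.pass ? 'pass' : 'fail',
    reason: result.reason,
  };
}
```

Returning `incomplete` when no check is supplied mirrors the rule that a click or script return value alone does not count as success.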
For a UI flow, collect all of these when feasible:
- browser_list shows the intended target.
- navigator.userAgent contains Electron/ for desktop flows, or does not for standalone Chrome flows.
- daytona exec process/log/health check for sidecars, Den, worker proxy, or mock servers.
- browser_screenshot or .devcontainer/capture-daytona-screenshot.sh at important checkpoints.

Use this structure for each step in the final report:
Step: <what was attempted>
Before: <snapshot/eval showed X>
Action: <tool/selector used>
After: <snapshot/eval showed Y>
Assertion: pass/fail because <observable signal>
Evidence: <screenshot path or artifact URL if captured>
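A report entry in this structure could be generated from a plain object, which keeps every step in the same shape. The sketch below is illustrative only: `formatStepReport` and its field names are assumptions, not part of the skill.

```javascript
// Hypothetical helper: renders one validated step in the six-line report structure.
// Field names (step, before, action, after, pass, signal, evidence) are assumptions.
function formatStepReport(step) {
  const verdict = step.pass ? 'pass' : 'fail';
  return [
    `Step: ${step.step}`,
    `Before: ${step.before}`,
    `Action: ${step.action}`,
    `After: ${step.after}`,
    `Assertion: ${verdict} because ${step.signal}`,
    // Evidence is optional; say so explicitly rather than leaving the line blank.
    `Evidence: ${step.evidence || 'none captured'}`,
  ].join('\n');
}
```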
Start with browser_snapshot for normal UI controls because it gives stable
UIDs for browser_click and browser_fill. Use browser_eval when:
Prefer synthetic paste for the OpenWork composer:
(function pasteComposer(text) {
  // Locate the Lexical composer; bail out with a reason if it is missing.
  var editor = document.querySelector('[contenteditable="true"][data-lexical-editor="true"]');
  if (!editor) return { ok: false, reason: 'composer not found' };
  editor.focus();
  // Dispatch a synthetic paste event carrying the text as plain-text clipboard data.
  var data = new DataTransfer();
  data.setData('text/plain', text);
  editor.dispatchEvent(new ClipboardEvent('paste', { bubbles: true, cancelable: true, clipboardData: data }));
  // Return the composer text so the caller can assert the paste landed.
  return { ok: true, text: editor.innerText };
})('Reply with exactly: Daytona validation OK')
Then assert the Run button is enabled before clicking it.
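One way to make that assertion is a small check evaluated through browser_eval. The sketch below is an assumption-heavy illustration: the selector `button[aria-label="Run"]` is hypothetical and should be replaced with whatever browser_snapshot shows for the real Run button, and `checkRunButton` is not an existing helper.

```javascript
// Hypothetical check: reports whether the Run button exists and is enabled.
// The selector is an assumption; confirm the real one from a browser_snapshot first.
function checkRunButton(doc) {
  var button = doc.querySelector('button[aria-label="Run"]');
  if (!button) return { ok: false, reason: 'run button not found' };

  // Treat both the native disabled property and aria-disabled="true" as disabled.
  var disabled = button.disabled === true || button.getAttribute('aria-disabled') === 'true';
  return {
    ok: !disabled,
    reason: disabled ? 'run button is disabled' : 'run button is enabled',
  };
}

// Inside browser_eval, call it with the page document: checkRunButton(document)
```

Click only when `ok` is true, and record the returned `reason` as the observable signal for the step.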
Use CDP for renderer UI first. When the flow opens native Linux UI that CDP cannot control, such as GTK file pickers, OS permission dialogs, XFCE windows, or Electron-native dialogs, switch to desktop automation inside the sandbox.
Check/install the tools:
daytona exec "$SANDBOX" -- "bash -lc 'if ! command -v xdotool >/dev/null 2>&1; then sudo apt-get update && sudo apt-get install -y xdotool wmctrl; fi'"
Inspect the real desktop window state before acting:
daytona exec "$SANDBOX" -- "bash -lc 'DISPLAY=:99 wmctrl -l; DISPLAY=:99 xdotool getactivewindow getwindowname 2>/dev/null || true'"
Native file picker pattern:
daytona exec "$SANDBOX" -- "bash -lc 'DISPLAY=:99 xdotool search --name \"Authorize folder\" windowactivate; DISPLAY=:99 xdotool mousemove 760 151 click 1 key ctrl+a type --delay 1 -- \"/workspace/hello\" key Return; sleep 1; DISPLAY=:99 xdotool mousemove 1465 927 click 1'"
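To turn the window listing into an assertion, the `wmctrl -l` output captured before and after the action can be parsed and compared. This is a minimal sketch: `parseWindowList` and `hasWindow` are hypothetical helpers, and the parser assumes the usual `wmctrl -l` columns (window id, desktop number, client host, title) with a host name that contains no spaces.

```javascript
// Hypothetical parser for `wmctrl -l` output.
// Assumed columns per line: window id, desktop number, client host, window title.
function parseWindowList(output) {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean)
    .map((line) => {
      const match = line.match(/^(0x[0-9a-fA-F]+)\s+(-?\d+)\s+(\S+)\s*(.*)$/);
      if (!match) return null; // skip lines that do not look like window rows
      return { id: match[1], desktop: Number(match[2]), host: match[3], title: match[4] };
    })
    .filter(Boolean);
}

// True when any listed window title contains the expected text.
function hasWindow(output, title) {
  return parseWindowList(output).some((w) => w.title.includes(title));
}
```

A picker step would then assert `hasWindow(before, 'Authorize folder') === false` before the trigger, `true` after it opens, and `false` again once it has been confirmed or dismissed.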
Rules for native desktop automation:
- Run wmctrl -l before and after to prove the expected native window opened or closed.
- Prefer matching by window name, such as Authorize folder, over coordinates when possible.
- Dismiss any leftover native dialog with Escape before capturing evidence.

Use browser screenshots for renderer state:
browser_screenshot({ browser_url: CDP_URL, target_id: TARGET_ID })
Use Daytona display screenshots for noVNC/window state:
daytona exec "$SANDBOX" -- 'bash .devcontainer/capture-daytona-screenshot.sh'
Do not treat screenshots as the only assertion. Inspect text/state with CDP too.
Before publishing, commenting, or reporting a screenshot URL, open the saved
image and visually verify it matches the claim. Use webfetch on the artifact
URL, Read on the local PNG path, or another image-capable viewer. A screenshot
is not valid evidence until the image itself has been inspected.
For every screenshot, assert these visual checks:
If any check fails, mark the evidence as failed, fix the visible state, capture a new screenshot, inspect the new image, and only then share it. If bad evidence was already posted, post a superseding correction that clearly says the earlier screenshot was invalid.
When a step fails:
- Capture the current UI state with browser_snapshot or document.body.innerText.

Common logs:
daytona exec "$SANDBOX" -- 'tail -120 /tmp/electron.log'
daytona exec "$SANDBOX" -- 'tail -120 /tmp/vite.log'
daytona exec "$SERVER_SANDBOX" -- 'tail -120 /tmp/den-api.log'
Use one of these verdicts:
- Passed: every expected outcome has an observable assertion.
- Failed: at least one assertion disproves the expected outcome.
- Incomplete: the sandbox/tooling failed, evidence is missing, or only a recording/screenshot was collected.
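The verdict rules can be sketched as a function over per-step results. This is an illustration under assumptions: `decideVerdict` and the `status` values `pass`, `fail`, and `incomplete` are hypothetical, and letting a failed assertion outrank missing evidence is one reading of the rules, not something the skill states.

```javascript
// Hypothetical verdict helper. Each step is { status: 'pass' | 'fail' | 'incomplete' }.
// Assumption: a disproving assertion outranks missing evidence.
function decideVerdict(steps) {
  if (steps.some((s) => s.status === 'fail')) return 'Failed';
  // No steps at all, or any step without a passing assertion, means nothing is proven.
  if (steps.length === 0 || steps.some((s) => s.status !== 'pass')) return 'Incomplete';
  return 'Passed';
}
```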