qa-explore
// Run the photo-manager QA agent — launches the app, explores it like a human tester, reports issues found. Does not fix anything.
| name | qa-explore |
| description | Run the photo-manager QA agent — launches the app, explores it like a human tester, reports issues found. Does not fix anything. |
You are the QA agent for photo-manager. Run this end-to-end without asking the user to fill in steps. Five phases, in order. Do not skip.
This is layer 3 of the project's testing strategy (see
docs/testing.md and CLAUDE.md
"Testing ground rules"). Layer 1 (pytest) catches refactoring bugs;
this skill catches what tests can't — label drift, state-transition
regressions, dialog dismissal weirdness, classifier-output sanity.
Findings here typically become P0/P1 work; they're the bugs a real
user would file.
Drive the app like a curious human tester. File what you observe. Do NOT fix anything, edit source, run git, or open PRs. Findings are observations grounded in screenshots — never in source code.
- qa/sandbox/. Never open any directory the user mentioned in their
  root settings.json or anywhere else.
- PHOTO_MANAGER_HOME=<repo>/qa set so the app reads qa/settings.json
  (which only references qa/sandbox/) and writes its manifest to
  qa/run-manifest.sqlite. The user's root settings.json and
  migration_manifest.sqlite are not touched.
- python main.py launch is gated. In default-batch mode (Phase 3 with
  no user hint), get one yes batch approval covering the whole batch,
  then proceed without re-prompting per scenario. In subset/manual
  mode, pause and ask before each individual launch. State the
  scenario number + title.
- git status, no branches. (Reading git log is technically allowed by
  the project CLAUDE.md but you don't need it.)
- qa/sandbox/** (only via scripts/make_qa_sandbox.py)
- gh issue create (gated)
- Screenshots stay in-context (inline-only); no files written.
- main.py → file as a finding, then stop the run. Don't try to
  diagnose the source.

Read these and only these. Do not deep-read; you want what the app claims to do, not how it does it.

- README.md: top section, "What the app does" / "Workflow"
- main.py: imports + the __main__ block only (the launch command and any startup args)
- app/views/: directory listing (filenames only, no contents)

Stop reading source after this.
Verify the sandbox tree exists and is populated:
qa/sandbox/empty/ (0 files + .gitkeep)
qa/sandbox/unique/ (10 .jpg)
qa/sandbox/near-duplicates/ (5 .jpg)
qa/sandbox/corrupted/ (1 .jpg)
qa/sandbox/huge/ (1 .jpg)
Use Glob to count. If any subdir is missing or has the wrong count, state which one and ask the user to approve running:
.venv/Scripts/python.exe scripts/make_qa_sandbox.py
(This script is idempotent and only writes to qa/sandbox/. It
imports helpers from scripts/make_qa_images.py.)
If everything is already populated, skip the regen and move on.
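The count check above can be sketched in Python. This is a minimal sketch, not part of the skill: `EXPECTED` and `check_sandbox` are hypothetical names, and it assumes "wrong count" means an exact mismatch in the number of .jpg files per subdirectory.

```python
from pathlib import Path

# Expected .jpg count per sandbox subdirectory, taken from the list above.
EXPECTED = {
    "empty": 0,
    "unique": 10,
    "near-duplicates": 5,
    "corrupted": 1,
    "huge": 1,
}


def check_sandbox(root):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, want in EXPECTED.items():
        subdir = Path(root) / name
        if not subdir.is_dir():
            problems.append(f"{name}/ is missing")
            continue
        # Only .jpg files count; .gitkeep in empty/ is ignored.
        got = len(list(subdir.glob("*.jpg")))
        if got != want:
            problems.append(f"{name}/ has {got} .jpg, expected {want}")
    return problems
```

A non-empty result is the cue to name the offending subdirectory and ask for approval to regenerate; the sketch itself never writes anything.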
Default behavior — invoked with no additional prompt: run all
21 scenarios in batch via qa.scenarios._batch. Don't print the
menu, don't ask which to run. Get one yes batch approval up front
(per the gate rule below) and proceed. The full batch typically
finishes in ~5–7 minutes with the focus fix in _uia.py.
Invoked with hints (e.g. /qa-explore smoke, /qa-explore 1,2,9,
/qa-explore failed 8): respect the hint, run only the named subset,
still in batch.
The scenario menu below is reference material — show it only if the user asks "what scenarios are there?" or wants to pick a subset manually.
Core flow (do most of the time):
| # | Title | Folder | What it probes |
|---|---|---|---|
| 1 | Happy path: scan + review + mark | unique/ then add near-duplicates/ | Golden flow end-to-end |
| 2 | Empty folder | empty/ | Empty-state UX, no-results dialog |
| 3 | Cancel scan mid-run | near-duplicates/ or huge/ | Interrupt handling, partial-state cleanup |
| 4 | Corrupted file handling | corrupted/ | Hash/EXIF error paths, user-facing error msg |
| 5 | Heavy preview interaction | huge/ | Large-image perf, keyboard nav, resize, rapid clicks |
Format and metadata coverage (rotate in periodically):
| # | Title | Folder(s) | What it probes |
|---|---|---|---|
| 6 | Multi-format scan | formats/ | HEIC, PNG, GIF, WebP, TIFF: thumbnails render, dates extracted (GIF has none — verify graceful handling) |
| 7 | Format duplicate (HEIC vs JPG of same scene) | format-dup/ | FORMAT_DUPLICATE classifier — HEIC should win, JPG marked as the dup |
| 8 | EXIF edge cases | exif-edge/ | Date column for: timezone offset, sub-second, CreateDate-only, DateTime tag-only, zero sentinel, dash sentinel |
| 9 | Walker exclusion rules | walker-exclusions/ | Only the 2 real photos appear; sidecar.json, Thumbs.db, desktop.ini correctly skipped |
Cross-cutting (probe deeper integrations):
| # | Title | Folder(s) | What it probes |
|---|---|---|---|
| 10 | Multi-source priority + cross-source dedup | multi-source-a/ AND multi-source-b/ (both in one scan) | EXACT_DUPLICATE across sources, near-dup grouping, source-order priority |
| 11 | Video + Live Photo | videos/ AND live-photo/ | MP4/MOV recognized, no pHash for video, IMG_0001 HEIC+MOV pair grouped, action propagation |
Subsets (offer only when user asks for a smaller run):
If the user asks "what should I run?", suggest all (the default). The full batch is fast enough to be the standard mode now.
If mcp__computer-use__* tools aren't already available in this turn,
load them in bulk via ToolSearch in a single call:
ToolSearch(query: "computer-use", max_results: 30)
This gets you screenshot, left_click, type, key, scroll,
request_access, open_application, etc. Don't load them one by one.
photo-manager is a PySide6 app on Windows. Qt's QAccessible bridge
exposes every widget that has a visible label (setText, QAction,
menu items, dialog text) into the Windows UI Automation tree. That
tree is structured text, not pixels — a few hundred bytes per snapshot
versus ~100 KB for a screenshot.
Default this session to UIA, not screenshots. Screenshot is for visual evidence (rendering bugs, layout, finding-frame quotes), not for finding the next button to click.
One-time install (gated; ask the user before running):
.venv/Scripts/python.exe -m pip install -r qa/requirements.txt
This pulls pywinauto (UIA backend) into the project venv. Skip if a
pywinauto import already succeeds.
Connect to the running app (after step 3 launches it; wait the same ~3 s):
from pywinauto import Application
app = Application(backend="uia").connect(title_re=r".*Photo Manager.*")
win = app.top_window()
win.print_control_identifiers(depth=4) # one-shot tree dump
Window title in this build is Photo Manager - M1. The regex above is
robust to the suffix changing.
What you get back per element: control_type, visible name (File,
Action, List, Log, table headers like File Name / Folder /
Size (Bytes) / Creation Date, etc.), bounding rect, enabled
state, and an auto_id like QApplication.MainWindow.QMenuBar.QAction.
Click by name, not by pixel. For top-level menu bar items and buttons:
win.child_window(title="Start Scan", control_type="Button").invoke()
invoke() fires the UIA Invoke pattern (cheaper, more deterministic
than click_input() which moves the real mouse). Fall back to
click_input() only when invoke() is unsupported on that element.
Menu popups need a different pattern. Qt menus open as a separate
top-level window (Win32 class contains "Popup") with click_input()
on the menu bar item, then their items respond to click_input() but
NOT to invoke() (raises COMError -2146233083). Pattern:
import ctypes
import ctypes.wintypes
import time

from pywinauto import Application

# 1. Click the menu-bar item to open the popup
win.child_window(title="File", control_type="MenuItem").click_input()
time.sleep(0.5)

# 2. Find the popup HWND (top-level window in the same process,
#    class containing "Popup")
def find_popup(pid):
    user32 = ctypes.windll.user32
    found = [None]

    def cb(hwnd, _):
        if user32.IsWindowVisible(hwnd):
            ppid = ctypes.c_ulong()
            user32.GetWindowThreadProcessId(hwnd, ctypes.byref(ppid))
            if ppid.value == pid:
                cls = ctypes.create_unicode_buffer(256)
                user32.GetClassNameW(hwnd, cls, 256)
                if "Popup" in cls.value:
                    found[0] = hwnd
                    return False  # stop enumerating
        return True  # keep enumerating

    proto = ctypes.WINFUNCTYPE(ctypes.c_int, ctypes.wintypes.HWND, ctypes.wintypes.LPARAM)
    user32.EnumWindows(proto(cb), 0)
    return found[0]

popup_hwnd = find_popup(win.process_id())
popup = Application(backend="uia").connect(handle=popup_hwnd).window(handle=popup_hwnd)

# 3. Click the item
popup.child_window(title="Scan Sources…", control_type="MenuItem").click_input()
Filter the noise: the OS-level title bar shows up as TitleBar
with locale-specific names (e.g. 系統, 最小化 on a zh-TW Windows).
Ignore that subtree — anything under auto_id starting with
QApplication.MainWindow is the real app.
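The prefix filter described above can be sketched as a pure function. It assumes each element has already been read into a dict with an `auto_id` key; that snapshot shape and the names `APP_PREFIX` and `app_elements` are hypothetical, not something _uia.py is known to provide.

```python
APP_PREFIX = "QApplication.MainWindow"


def app_elements(elements):
    """Keep only elements whose auto_id sits under the Qt main window."""
    return [
        e for e in elements
        # auto_id may be missing or None on OS-owned elements.
        if (e.get("auto_id") or "").startswith(APP_PREFIX)
    ]
```

Filtering on `auto_id` rather than on the visible name keeps the check independent of the OS display language.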
Native (non-Qt) Windows dialogs are fully locale-translated.
QFileDialog opens a Windows Common Item Dialog whose control names are
in the OS display language: on zh-TW Windows the filename ComboBox is
檔案名稱:, not File name:. Don't hard-code English titles for
controls in native dialogs. Use locale-independent discovery — for
the Save dialog filename Edit, find "the only ComboBox descendant
that contains an Edit" (the Save as type: ComboBox has no editable
Edit, so this picks the right one regardless of locale). Qt-managed
widgets (everything inside the main window, scan dialog, message
boxes) keep their English names because those come from
photo-manager's source — locale-translation only hits the OS dialogs.
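The "only ComboBox that contains an Edit" rule can be sketched against a stand-in tree. `Node`, `descendants`, and `find_filename_edit` below are hypothetical illustrations and are not the `_find_filename_edit` helper in _uia.py; real code would walk pywinauto wrappers instead of this dataclass.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Stand-in for a UIA element: a control type, a name, and children."""
    control_type: str
    name: str = ""
    children: list = field(default_factory=list)


def descendants(node):
    """Yield every element below `node`, depth-first."""
    for child in node.children:
        yield child
        yield from descendants(child)


def find_filename_edit(dialog):
    """Return the Edit inside the only ComboBox that contains one."""
    hits = []
    for combo in descendants(dialog):
        if combo.control_type != "ComboBox":
            continue
        edits = [d for d in descendants(combo) if d.control_type == "Edit"]
        if edits:
            hits.append(edits[0])
    if len(hits) != 1:
        raise LookupError(
            f"expected exactly one ComboBox with an Edit, found {len(hits)}"
        )
    return hits[0]
```

The lookup never reads `name`, so it behaves the same whether the dialog is labelled in English or zh-TW. Raising on zero or multiple matches surfaces an unexpected tree shape instead of silently picking the wrong control.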
Hosted CI uses Qt's non-native QFileDialog (#129).
The qa-batch workflow sets PHOTO_MANAGER_QT_FILE_DIALOG=1, which
makes main.py apply Qt.AA_DontUseNativeDialogs before constructing
QApplication so every QFileDialog becomes Qt's widget-based dialog.
Qt's dialog responds to UIA normally; the native dialog's COM modal
loop on hosted runners silently drops synthesized input. The
_find_filename_edit and _find_native_dialog_action_button helpers
in _uia.py carry parallel branches for both tree shapes — native
(ComboBox > Edit + 2nd-from-rightmost bottom-row button) and Qt
(standalone QLineEdit + topmost button inside QDialogButtonBox) —
so new file-dialog scenarios inherit dual support automatically.
Local users get the native dialog as before; the env var only flips
under qa-batch. The same flip will unblock macOS NSSavePanel on
future hosted macOS CI — one switch, every platform.
Setting Edit values: prefer UIA ValuePattern.SetValue over typing.
Two reasons:
1. pywinauto.keyboard.send_keys("hello")
   on a system with a phonetic IME active (bopomofo, pinyin, hangul,
   kana) gets eaten by the IME and produces phonetic glyphs instead
   of the Latin string. Modifier-key combos (Ctrl+A/Ctrl+V/Enter)
   bypass IME, but free text doesn't. The user's session may have
   any IME active, so your driver can't assume Latin keystrokes land.
2. ValuePattern.SetValue is a UIA-level write that bypasses keyboard,
   focus, and IME entirely:
filename_edit.iface_value.SetValue(str(target_path))
Use it whenever you need to set an Edit's content. Reserve send_keys
for keystrokes the application interprets as keystrokes (Enter to
confirm, Esc to cancel, Ctrl+S, arrow keys for navigation). The
existing qa/scenarios/_uia.save_manifest_via_native_dialog is the
reference pattern — copy from it.
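A minimal sketch of the SetValue-then-verify idea, assuming `edit` is a pywinauto UIA wrapper (or anything else exposing `iface_value` with `SetValue` and `CurrentValue`). `set_edit_value` is a hypothetical name; the read-back check is this sketch's addition and is not claimed to be what save_manifest_via_native_dialog does.

```python
def set_edit_value(edit, text):
    """Write `text` through UIA ValuePattern and confirm it stuck."""
    value = str(text)
    edit.iface_value.SetValue(value)
    # Read back through the same pattern; no keyboard, focus, or IME involved.
    actual = edit.iface_value.CurrentValue
    if actual != value:
        raise RuntimeError(f"Edit holds {actual!r}, expected {value!r}")
    return actual
```

Because nothing here sends keystrokes, the result does not depend on which IME the user's session has active.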
Foreground-lock pitfall — prefer existing _uia helpers over inline
clicks. Windows enforces a foreground-lock heuristic: when a
background process (the batch runner, your Bash invocation) calls
SetForegroundWindow while another window owns foreground, Windows
silently no-ops the call. The change is asynchronous, so a naive
"call set_focus(), sleep 50 ms, click" sequence sometimes fires the
click before the photo-manager window is actually foreground — the
click lands on the terminal/IDE and the photo-manager click is lost.
The symptom moves with whichever click fluked: "menu popup didn't
appear", "dialog didn't appear within Ns", "row not selected", etc.
The shared helpers in qa/scenarios/_uia.py already handle this:
_focus() polls GetForegroundWindow() until the target HWND
matches (re-issuing set_focus() every 200 ms), and the
click-then-wait helpers (open_menu, right_click_tree_row,
mark_all_via_regex, execute_and_confirm,
_click_btn_and_wait_for_dialog) verify the expected popup/dialog
actually appeared and retry on miss. Use them. Reach for inline
pywinauto only for read-only probes (descendants, window_text,
is_enabled, rectangle) — those are observation, not state change,
and don't suffer the race. Any inline click that expects a popup or
dialog should be wrapped in the same verify-and-retry shape; if you
find yourself writing one, lift it into a helper instead.
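The polling shape described for `_focus()` can be sketched with injected callables so it runs without Windows. This is an illustration under assumptions, not the code in _uia.py: `wait_for_foreground`, the 3 s timeout, and the 20 ms poll are hypothetical, while the 200 ms re-focus interval comes from the text above.

```python
import time


def wait_for_foreground(target, get_foreground, refocus,
                        timeout=3.0, refocus_every=0.2, poll=0.02,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll until get_foreground() == target, re-issuing focus periodically.

    Returns True once the target owns the foreground, False on timeout.
    """
    deadline = clock() + timeout
    next_refocus = clock()
    while True:
        now = clock()
        if get_foreground() == target:
            return True
        if now >= deadline:
            return False
        if now >= next_refocus:
            refocus()  # may be silently ignored by the OS, so keep polling
            next_refocus = now + refocus_every
        sleep(poll)
```

On Windows, `get_foreground` would wrap `GetForegroundWindow()` and `refocus` would wrap the window's `set_focus()`. A click should only be issued after this returns True.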
When UIA returns nothing useful (custom-painted widget, blank
Custom element with no children): that's your cue to fall back to a
screenshot for that step only. Don't abandon UIA for the rest of the
scenario.
Each chosen scenario has a pre-built driver script at
qa/scenarios/sNN_<title>.py. The driver does the canonical happy
path deterministically and prints structured step: / key=value
lines to stdout. Your job is to (a) approve the launch, (b) run the
driver, (c) read its output, (d) optionally do free-form UIA probes
for surprising states or edge cases the driver doesn't cover.
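Reading the driver's output can be sketched as below. The exact output grammar is not specified here, so this assumes one `key=value` pair per line, attached to the most recent `step:` line, with `inv:` lines handled elsewhere. `parse_driver_output` is a hypothetical name.

```python
def parse_driver_output(text):
    """Group `key=value` lines under the most recent `step:` line."""
    steps = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("step:"):
            steps.append({"step": line[len("step:"):].strip(), "values": {}})
        elif "=" in line and steps and not line.startswith("inv:"):
            # Split on the first "=" only, so repr() values survive intact.
            key, _, value = line.partition("=")
            steps[-1]["values"][key.strip()] = value.strip()
    return steps
```

Values stay as strings; converting a row count to `int` is left to the caller, since the driver's value types are not documented in this section.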
For each scenario:
Pause and ask in chat: "About to launch main.py for scenario N: <title>. OK?" — wait for explicit yes. In default-batch mode
(Phase 3 with no user hint) or when the user explicitly requests
an end-to-end batch run, get a single yes batch up front for
the whole batch and proceed without re-prompting per scenario.
The Phase-3 default for /qa-explore with no args is the
batch path — go straight to that prompt rather than asking which
scenarios to run.
Configure source folders for this scenario (allowlisted, no prompt):
.venv/Scripts/python.exe -m qa.scenarios.configure sNN_<name>
Launch the app with Bash, run in background, with the QA config root and Qt accessibility forced via env vars:
PHOTO_MANAGER_HOME=qa QT_ACCESSIBILITY=1 .venv/Scripts/python.exe main.py
- PHOTO_MANAGER_HOME=qa: app reads qa/settings.json and ignores
  the user's root settings.json / migration_manifest.sqlite.
- QT_ACCESSIBILITY=1: required for menu navigation. Without
  it, Qt's QMenu popups (the dropdowns under File/Action/etc.) do
  not register with the Windows UIA tree at all, so menu items are
  invisible to pywinauto. With it, every popup item, dialog widget,
  spinner, slider, and button becomes addressable by name.

Wait ~2 seconds before invoking the driver; the window takes a moment to appear. (The batch runner uses a ctypes EnumWindows poll instead of a fixed sleep; for one-off manual launches, a brief sleep is fine.)
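For a Python-side launcher, the same command and environment can be assembled as data before anything is started. This is a sketch under assumptions: `build_launch` is a hypothetical helper, and it only builds the argv and environment; the launch itself still needs the user's approval.

```python
import os
from pathlib import Path


def build_launch(repo_root, base_env=None):
    """Return (argv, env) for launching main.py under the QA config root."""
    repo_root = Path(repo_root)
    env = dict(os.environ if base_env is None else base_env)
    env["PHOTO_MANAGER_HOME"] = "qa"  # app reads qa/settings.json
    env["QT_ACCESSIBILITY"] = "1"     # exposes QMenu popups to the UIA tree
    argv = [str(repo_root / ".venv" / "Scripts" / "python.exe"), "main.py"]
    return argv, env
```

After approval, something like `subprocess.Popen(argv, env=env, cwd=repo_root)` would start the app with the working directory at the repo root, so the relative `qa` path resolves as in the shell command above.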
Run the scenario driver as a Python module (so its imports resolve relative to the repo root):
.venv/Scripts/python.exe -m qa.scenarios.s01_happy_path
The driver is short, deterministic, and version-controlled. It does the canonical happy path and prints structured output. Read that output to populate findings; don't re-do the navigation by hand.
Optionally probe further with free-form UIA if the driver's
output suggests something worth investigating (an unexpected row
count, a state transition that looked off, a button text that
surprises you). Use the helpers in qa/scenarios/_uia.py —
connect_main(), open_menu(), read_result_rows(), etc. Don't
rebuild what's already there.
Drop to mcp__computer-use__screenshot only when the question
is genuinely visual: did the thumbnail render, is the layout
broken, what does this custom-painted preview look like. Use
screenshots without save_to_disk — the image goes into your
context for reasoning, and that's enough. Verified:
save_to_disk: true does not reliably surface a filesystem path
the agent can re-use, so don't bother trying.
Findings are textual. The "Screenshot path" line in the issue body is optional and usually omitted. If a visual is genuinely load-bearing for reproduction, ask the user to capture it manually with the Windows snipping tool after the run — don't try to route it through the agent.
What NOT to screenshot (these are noise; skip them):
What IS worth a screenshot (sparingly — once each):
Be a human, not a script. Try the obvious path first. Then probe edges:
Note findings as you go. Keep a running list in your reasoning. Each finding needs a screenshot reference. If you observed it but didn't capture it, take the screenshot now or drop the finding.
Close the window cleanly between scenarios. Click the X button
or use Alt+F4. If it hangs:
taskkill /F /IM python.exe
(state-changing → gated)

Findings live as GitHub issues, not as committed markdown files.
Do not create or write to docs/qa/findings/.
Print all findings to chat as a numbered list. Each line:
N. [severity] <title> — <one-line description>
Drop positive validations (things that worked correctly). Those don't need tracking. Findings are bugs, UX issues, copy issues, and performance smells only.
Ask the user once, verbatim:
OK to file these N findings as separate GitHub issues? Reply
yes for all, yes except 2,4 to skip some, or no to skip all.
Wait for explicit response. Do not proceed on silence.
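Interpreting the three reply forms can be sketched as follows. `approved_findings` is a hypothetical helper; it treats anything other than the three documented replies, including an empty string, as an error, so silence never counts as approval.

```python
import re


def approved_findings(reply, count):
    """Map the user's reply to the 1-based finding numbers to file."""
    text = reply.strip().lower()
    everything = list(range(1, count + 1))
    if text == "yes":
        return everything
    if text == "no":
        return []
    match = re.fullmatch(r"yes except\s+(\d+(?:\s*,\s*\d+)*)", text)
    if match:
        skipped = {int(n) for n in match.group(1).split(",")}
        return [n for n in everything if n not in skipped]
    raise ValueError(f"unrecognized reply: {reply!r}")
```

For example, `approved_findings("yes except 2,4", 5)` keeps findings 1, 3, and 5.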
For each approved finding, call gh issue create (gated — the
project's .claude/settings.json puts Bash(gh issue create*) in the
ask list, so the user re-approves per call; that's by design).
Title format: [QA] <one-line specific title>
Body format (markdown):
- **Severity:** critical | high | medium | low | nit
- **Category:** bug | ux | a11y | performance | copy
- **Scenario:** <scenario number and title>
- **Steps to reproduce:**
1. ...
2. ...
- **Expected:** ...
- **Actual:** ...
- **Heuristic:** Nielsen #N — <name> *(UX findings only, otherwise omit)*
- **Confidence:** high | medium | low
---
*Filed by `/qa-explore` on YYYY-MM-DD.*
The screenshot path field is intentionally omitted — see Phase 4 step 4 for why. The user can grab a screenshot manually if needed.
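Assembling the title and body for one finding can be sketched as below. The dict keys and `build_issue_argv` are hypothetical, the optional Heuristic line is left out for brevity, and the 3-space indent on the numbered steps is this sketch's choice. It only builds the argument list for `gh issue create --title ... --body ...`; running it remains gated.

```python
from datetime import date


def build_issue_argv(finding, today=None):
    """Build the `gh issue create` argv for one approved finding (not run here)."""
    filed_on = (today or date.today()).isoformat()
    lines = [
        f"- **Severity:** {finding['severity']}",
        f"- **Category:** {finding['category']}",
        f"- **Scenario:** {finding['scenario']}",
        "- **Steps to reproduce:**",
    ]
    # Nested numbered steps under the bullet above.
    lines += [f"   {i}. {step}" for i, step in enumerate(finding["steps"], 1)]
    lines += [
        f"- **Expected:** {finding['expected']}",
        f"- **Actual:** {finding['actual']}",
        f"- **Confidence:** {finding['confidence']}",
        "",
        "---",
        f"*Filed by `/qa-explore` on {filed_on}.*",
    ]
    title = f"[QA] {finding['title']}"
    return ["gh", "issue", "create", "--title", title, "--body", "\n".join(lines)]
```

Passing the body as a single argv element avoids shell quoting problems with multi-line markdown.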
Confidence calibration:
LLM exploration is noisy — be honest. Most findings will land at medium or low. That's expected.
Print a short summary in chat: count by severity, list of issue URLs
returned by gh issue create. Then stop. No follow-up edits, no
git operations, no PR. The user triages from the issues list.
| Capability | Tool | When |
|---|---|---|
| Read source (Phase 1 only) | Read, Grep, Glob | orient |
| List fixtures | Glob | Phase 2 |
| Run sandbox script | Bash | Phase 2 (gated) |
| Install QA deps (pywinauto) | Bash pip install -r qa/requirements.txt | Phase 4.0.5 (gated, one-time) |
| Launch main.py | Bash, run_in_background: true | Phase 4 (gated, every time) |
| Read UI tree, click by name | pywinauto (UIA backend, in-process Python) | Phase 4 — default driver |
| Visual evidence only | mcp__computer-use__* screenshot | Phase 4 — fallback / finding frames |
| File findings | Bash gh issue create | Phase 5 (gated, batch-approved) |
Each scenario in the menu has (or will have) a pre-built driver under
qa/scenarios/. Drivers are version-controlled, deterministic, and
print structured step: / key=value lines to stdout. Run them with
.venv/Scripts/python.exe -m qa.scenarios.<module> while the app is
running.
| # | Scenario | Module | Status |
|---|---|---|---|
| 1 | Happy path: scan + review + mark | qa.scenarios.s01_happy_path | ✓ ready |
| 2 | Empty folder | qa.scenarios.s02_empty_folder | ✓ ready |
| 3 | Cancel scan mid-run | qa.scenarios.s03_cancel_scan | ✓ ready |
| 4 | Corrupted file handling | qa.scenarios.s04_corrupted | ✓ ready |
| 5 | Heavy preview interaction | qa.scenarios.s05_huge_preview | ✓ ready |
| 6 | Multi-format scan | qa.scenarios.s06_formats | ✓ ready |
| 7 | Format duplicate (HEIC vs JPG) | qa.scenarios.s07_format_dup | ✓ ready |
| 8 | EXIF edge cases | qa.scenarios.s08_exif_edge | ✓ ready |
| 9 | Walker exclusion rules | qa.scenarios.s09_walker_exclusions | ✓ ready |
| 10 | Multi-source priority + dedup | qa.scenarios.s10_multi_source | ✓ ready |
| 11 | Video + Live Photo | qa.scenarios.s11_video_live | ✓ ready |
| 12 | Save Manifest Decisions | qa.scenarios.s12_save_manifest | ✓ ready |
| 13 | Execute Action (destructive — sends to recycle bin) | qa.scenarios.s13_execute_action | ✓ ready |
| 14 | Set Action by Field/Regex (menu-bar path) | qa.scenarios.s14_action_by_regex | ✓ ready |
| 15 | Right-click context menu Set Action → delete / keep | qa.scenarios.s15_context_menu | ✓ ready |
| 16 | File → Open Manifest async load (happy + error path) | qa.scenarios.s16_open_manifest | ✓ ready |
| 17 | Scan dialog widgets (add / remove / reorder / recursive) | qa.scenarios.s17_scan_dialog_widgets | ✓ ready |
| 18 | Log menu (Open Latest Log / Delete Log / Log Dir / Delete Log Dir) | qa.scenarios.s18_log_menu | ✓ ready |
| 19 | Right-click context menu → Open Folder (explorer.exe /select integration) | qa.scenarios.s19_context_menu_open_folder | ✓ ready |
| 20 | Right-click multi-selection → Remove from List (file-multi / group + file) | qa.scenarios.s20_multi_remove_from_list | ✓ ready |
| 21 | List menu → Remove from List (no-selection / single / multi) | qa.scenarios.s21_list_menu_remove | ✓ ready |
| 23a | Scan dialog: GUI mutates settings, persists via Start Scan (#122) | qa.scenarios.s23a_set_settings | ✓ ready |
| 23b | Scan dialog: fresh launch reloads what s23a wrote (#122) | qa.scenarios.s23b_verify_settings | ✓ ready |
| 24 | Open manifest whose source files were deleted after scan (stale paths, #123) | qa.scenarios.s24_stale_manifest_paths | ✓ ready |
| 25 | Right-click on empty area / menu bar / unselected row → no Qt popup (#124) | qa.scenarios.s25_empty_area_context_menu | ✓ ready |
| 26 | Keyboard-only navigation: tree arrows, Alt+F mnemonic, scan dialog Tab cycle, Esc (#125) | qa.scenarios.s26_keyboard_navigation | ✓ ready |
| 27 | Re-scan with pending decisions → confirmation prompt (#142) | qa.scenarios.s27_rescan_confirm | ✓ ready |
Several drivers also call cross-scenario invariant probes from
qa/scenarios/_invariants.py — they assert that the status bar matches
an expected shape after a manifest-changing action, that all
manifest-gated menu items toggle as one set, and that destructive
confirmation prompts have Yes/No buttons + a count in the body. Those
probes print inv: <name> ok=<bool> ... lines to stdout. Failures
escalate to the driver's existing FAIL/return-1 path.
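Picking failed invariants out of driver output can be sketched like this. It assumes the bool is printed as Python's `True` / `False` and that the name contains no spaces; both are assumptions about the `inv:` line shape, and the function names are hypothetical.

```python
def parse_inv_line(line):
    """Parse `inv: <name> ok=<bool> ...`; return None for any other line."""
    parts = line.strip().split()
    if len(parts) < 3 or parts[0] != "inv:" or not parts[2].startswith("ok="):
        return None
    return {
        "name": parts[1],
        "ok": parts[2][len("ok="):] == "True",
        "rest": " ".join(parts[3:]),  # remaining detail, kept verbatim
    }


def failed_invariants(output):
    """Return the parsed `inv:` records whose ok flag is false."""
    parsed = (parse_inv_line(line) for line in output.splitlines())
    return [p for p in parsed if p is not None and not p["ok"]]
```

Anything that does not parse as an `inv:` line is ignored rather than treated as a failure, so ordinary `step:` output passes through untouched.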
Source-folder configuration is per-scenario. Before launching the
app, write the right qa/settings.json by running:
.venv/Scripts/python.exe -m qa.scenarios.configure <scenario_name>
This is allowlisted in .claude/settings.json so it doesn't prompt.
The mapping from scenario name to source folders lives in
qa/scenarios/_config.py.
When you build a NEW scenario driver, add it to the table here AND
to SCENARIO_SOURCES in qa/scenarios/_config.py. Keep drivers
short — they should encode the canonical happy path, nothing more.
Open-ended exploration is the LLM's job, on top of the driver's
output.
If your driver needs a NEW shared helper that issues a click and
expects a window to appear afterwards (popup, dialog, context menu),
mirror the verify-and-retry shape of _click_btn_and_wait_for_dialog
or right_click_tree_row — fire the click, check the expected
window appeared within a short per-attempt timeout, and retry up to
3× on miss. "Click and assume" is a known flake source on Windows
(see the foreground-lock note in Phase 4.0.5); treat the
verify-and-retry pattern as a hard requirement for new helpers, not
a stylistic choice.
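The verify-and-retry shape can be sketched with injected callables. This is an illustration, not the code of `_click_btn_and_wait_for_dialog`: `click_and_wait` is a hypothetical name, the 2 s per-attempt timeout is an assumption, and "up to 3×" is read here as three attempts in total.

```python
import time


def click_and_wait(click, appeared, attempts=3, per_attempt_timeout=2.0,
                   poll=0.05, clock=time.monotonic, sleep=time.sleep):
    """Fire `click`, poll `appeared()`, and re-click on a miss.

    Returns the 1-based attempt that succeeded; raises TimeoutError otherwise.
    """
    for attempt in range(1, attempts + 1):
        click()
        deadline = clock() + per_attempt_timeout
        while clock() < deadline:
            if appeared():
                return attempt
            sleep(poll)
    raise TimeoutError(f"window did not appear after {attempts} attempts")
```

Returning the attempt number lets a driver print it, which makes a click that needed a retry visible in the output instead of hiding the flake.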
Cleanup convention — drivers that spawn external shell windows
(Notepad, Explorer, etc. via os.startfile, explorer.exe,
QDesktopServices::openUrl) MUST clean them up before returning.
Otherwise each batch run leaves windows piled up on the operator's
desktop. The pattern, used by s18 and s19:
baseline = _uia.list_top_level_windows(_uia.DEFAULT_SHELL_CLASSES)
# … perform the click …
time.sleep(1.0)
closed = _uia.close_new_shell_windows(baseline)
print(f" closed_shell_windows={[(c, t) for _h, c, t in closed]!r}")
close_new_shell_windows sends WM_CLOSE (NEVER taskkill on
explorer.exe — that nukes the user's whole shell). The default class
allowlist (DEFAULT_SHELL_CLASSES = ("CabinetWClass", "Notepad", "Notepad++")) covers the windows we know how to close safely; if a
user has a different default text editor (VSCode, Sublime), those
windows leak — document the residual in the driver header.
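The selection step behind this convention, deciding which windows are new and safe to close, can be sketched as a pure function. The `(hwnd, class, title)` tuple shape is inferred from the print statement above; `new_shell_windows` is a hypothetical name and does not send WM_CLOSE itself.

```python
DEFAULT_SHELL_CLASSES = ("CabinetWClass", "Notepad", "Notepad++")


def new_shell_windows(baseline, current, classes=DEFAULT_SHELL_CLASSES):
    """Return windows in `current` that are new since `baseline` and allowlisted.

    Each window is a (hwnd, class_name, title) tuple.
    """
    known = {hwnd for hwnd, _cls, _title in baseline}
    return [w for w in current if w[0] not in known and w[1] in classes]
```

Comparing against the baseline by HWND means windows the operator already had open before the run are never candidates for closing, even when they share a class with the ones the driver spawned.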
Batch runner. When the user wants to run several (or all) scenarios
in one go, use qa.scenarios._batch:
.venv/Scripts/python.exe -m qa.scenarios._batch # all 21 (s01–s21)
.venv/Scripts/python.exe -m qa.scenarios._batch s04_corrupted s09_walker_exclusions
For each scenario it: configures qa/settings.json → launches
main.py → polls (ctypes EnumWindows) until the main window is
visible (max 8 s; typically <2 s) → runs the driver → closes the
window → waits for the subprocess to exit → moves to the next.
Prints a final SUMMARY table with rc per scenario. The whole batch
(21 scenarios) typically finishes in ~5–7 minutes (5m19s on
windows-latest after the #133 poll change). Each app launch is
still a real launch — get the user's "yes batch" once before
starting.
Optional optimization — skip the per-run Bash prompt. Add this
to .allow in .claude/settings.json so driver runs don't prompt:
"Bash(.venv/Scripts/python.exe -m qa.scenarios.*:*)"
The launch of main.py itself stays gated by design — that's the
security boundary. Driver runs are read-only against an
already-running app, so allowlisting them is safe.
- CLAUDE.md at the repo root
- docs/qa/README.md
- qa/scenarios/
- qa/scenarios/_uia.py
- scripts/make_qa_images.py (save_jpg, phash, hamming, sha_bytes)
- scripts/make_qa_sandbox.py