orch-qa

Stars: 1

Forks: 0

Updated: 23 May 2026, 18:39

Senior QA/QC engineer that evaluates existing codebases for test quality. Runs existing tests, diagnoses failures, identifies missing test coverage, and writes missing tests. Framework-agnostic — works with any language and test runner. Use when the user asks to run tests, diagnose failures, do test gap analysis, QA, quality assurance, test coverage analysis, find missing tests, test audit, quality check, quality report, fix failing tests, evaluate test quality, or assess test health of a codebase. Trigger phrases include "run tests", "QA", "quality assurance", "test audit", "test coverage", "missing tests", "gap analysis", "quality report", "fix failing tests", "diagnose test failures", "quality check", "evaluate test quality", "test health".

Source: hideki5123/agent-skill-set


name	orch-qa
description	Senior QA/QC engineer that evaluates existing codebases for test quality. Runs existing tests, diagnoses failures, identifies missing test coverage, and writes missing tests. Framework-agnostic — works with any language and test runner. Use when the user asks to run tests, diagnose failures, do test gap analysis, QA, quality assurance, test coverage analysis, find missing tests, test audit, quality check, quality report, fix failing tests, evaluate test quality, or assess test health of a codebase. Trigger phrases include "run tests", "QA", "quality assurance", "test audit", "test coverage", "missing tests", "gap analysis", "quality report", "fix failing tests", "diagnose test failures", "quality check", "evaluate test quality", "test health".
version	1.1.0

QA Engineer (orchestrator)

A framework-agnostic senior QA/QC engineer that evaluates existing codebases from multiple quality perspectives. Runs existing tests, triages failures, identifies missing test coverage, and optionally writes missing tests.

The primary deliverable is an evidence package — a timestamped directory containing proof artifacts (test output, failure details, gap proof files, screenshots) with REPORT.md as the index/navigator.

Role definition lives in the subagent

The QA engineer judgment — lens definitions, failure triage rubric, severity calibration, test-writing rules, evidence shape — lives in the @agent-orch-qa:orch-qa subagent. This skill is the orchestrator: it runs the long-lived shell-heavy phases (recon, monorepo selection, stack confirmation, preflight safety, test execution with recording, parallel team spawn for gap analysis) and delegates judgment calls to the subagent.

Use this skill when the workflow needs all of: project recon + execution + gap analysis + report. Use @agent-orch-qa:orch-qa directly when another agent has the project context and only needs the judgment call (failure triage on a specific failure, gap classification under a single lens, test-writing for a specific gap).

Arguments

Parse these from the user's invocation. All are optional with defaults:

Argument	Default	Description
`--scope`	`all`	`all`, `changed` (git diff vs base branch), or a path/glob
`--lens`	`all`	Comma-separated subset: functional, security, infra, network, frontend, journey, resilience, idempotence, performance, observability
`--fix`	`false`	Write missing test cases (test-only file edits, deterministic assertions, rerun to verify)
`--run`	`true`	Run existing tests before analysis
`--severity`	`all`	Minimum severity to report: `critical`, `high`, `medium`, `low`, or `all`
`--test-cmd`	(auto)	Override auto-detected test runner command
`--exclude`	(none)	Glob pattern to exclude from analysis
`--timeout`	`300`	Max seconds for the entire test suite execution
`--dry-run`	`false`	Show what would be done without running tests or writing files
`--max-findings`	`50`	Cap the number of findings in the report
`--evidence`	`on-failure`	Recording evidence capture mode: `off`, `on-failure`, `all`. Basic file evidence (stdout, code snippets, gap proofs) is always captured regardless of this setting.
`--evidence-dir`	`./qa-evidence`	Base directory for evidence packages
`--base-url`	(auto)	Dev server URL for UI screenshot capture. Auto-detected from Playwright/Cypress config.
`--app-type`	(auto)	Override auto-detected app type: `terminal`, `browser`, `native`. Affects screenshot capture strategy.
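
The argument table above can be sketched as a parser. This is only an illustration under assumptions: the skill receives its arguments inside a natural-language invocation, so `build_parser`, `parse_bool`, and `parse_lenses` are hypothetical helpers, not part of the skill. Flag names, defaults, and allowed values are taken from the table.

```python
import argparse

LENSES = ["functional", "security", "infra", "network", "frontend",
          "journey", "resilience", "idempotence", "performance", "observability"]


def parse_bool(value: str) -> bool:
    """Accept the --flag=true / --flag=false spelling used in the examples."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def parse_lenses(value: str) -> list:
    """Expand 'all' or validate a comma-separated subset of lenses."""
    if value == "all":
        return list(LENSES)
    chosen = [item.strip() for item in value.split(",") if item.strip()]
    unknown = [item for item in chosen if item not in LENSES]
    if unknown:
        raise argparse.ArgumentTypeError(f"unknown lens: {', '.join(unknown)}")
    return chosen


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(prog="orch-qa")
    parser.add_argument("--scope", default="all")
    parser.add_argument("--lens", type=parse_lenses, default=list(LENSES))
    # nargs="?" with const=True lets a bare --fix mean --fix=true.
    parser.add_argument("--fix", type=parse_bool, nargs="?", const=True, default=False)
    parser.add_argument("--run", type=parse_bool, nargs="?", const=True, default=True)
    parser.add_argument("--severity", default="all",
                        choices=["critical", "high", "medium", "low", "all"])
    parser.add_argument("--test-cmd", default=None)
    parser.add_argument("--exclude", default=None)
    parser.add_argument("--timeout", type=int, default=300)
    parser.add_argument("--dry-run", type=parse_bool, nargs="?", const=True, default=False)
    parser.add_argument("--max-findings", type=int, default=50)
    parser.add_argument("--evidence", default="on-failure",
                        choices=["off", "on-failure", "all"])
    parser.add_argument("--evidence-dir", default="./qa-evidence")
    parser.add_argument("--base-url", default=None)
    parser.add_argument("--app-type", default=None,
                        choices=["terminal", "browser", "native"])
    return parser
```

For example, `build_parser().parse_args(["--run=false", "--lens=security"])` yields `run=False` and a one-element lens list, with every other argument at its documented default.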

Workflow Phases

Feedback Check

If feedback/log.md exists and has 5 or more entries, read the last 10 entries. If a pattern is apparent (same issue in 3+ entries, or average rating below 3):

Tell the user: "Recurring feedback detected: [brief pattern]. Consider running /skill-improve --skill orch-qa."
Continue with normal execution.

Phase 1: Reconnaissance

Understand the project before touching anything.

Read project root: README.md, package.json, pyproject.toml, go.mod, Cargo.toml, *.sln, etc.
Identify source directories vs test directories
Count source files vs test files — compute a rough test-to-source ratio
Map the directory tree (top 3 levels)
If --scope=changed, run git diff --name-only against the base branch to scope files
If --scope is a path/glob, restrict all subsequent phases to matching files
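
A minimal sketch of the file count and test-to-source ratio from the steps above. The suffix list, the skipped directories, and the test-file naming rules are assumptions chosen for illustration; a real run would take them from the detected stack.

```python
from pathlib import Path

# Assumed for illustration: which suffixes count as code, which dirs to skip.
SOURCE_SUFFIXES = {".py", ".ts", ".tsx", ".js", ".jsx", ".go", ".rs"}
SKIP_DIRS = {"node_modules", ".git", "dist", "build", "target", "__pycache__", ".venv"}
TEST_DIRS = {"test", "tests", "__tests__"}


def is_test_file(path: Path) -> bool:
    """Classify by common conventions: test dirs, test_*, *_test, *.test, *.spec."""
    stem = path.stem  # "foo.test" for foo.test.ts, "test_foo" for test_foo.py
    in_test_dir = any(part in TEST_DIRS for part in path.parts[:-1])
    return (in_test_dir
            or stem.startswith("test_")
            or stem.endswith(("_test", ".test", ".spec")))


def count_files(root: Path) -> dict:
    """Count source and test files under root and compute a rough ratio."""
    source = tests = 0
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        rel = path.relative_to(root)
        if any(part in SKIP_DIRS for part in rel.parts):
            continue
        if is_test_file(rel):
            tests += 1
        else:
            source += 1
    ratio = round(tests / source, 2) if source else None
    return {"source_files": source, "test_files": tests, "test_to_source_ratio": ratio}
```

The ratio is only a rough signal for the project summary; it says nothing about what the tests actually assert.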

Monorepo handling:

Detect workspace definitions (workspaces in package.json, pnpm-workspace.yaml, Cargo workspace, Go workspace)
If monorepo detected, ask user which package(s) to target
Treat each targeted package as an independent analysis unit
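
The workspace checks above could look roughly like this. `detect_workspaces` is a hypothetical helper; the sketch assumes a Go workspace is signalled by a `go.work` file and a Cargo workspace by a `[workspace]` table in the root `Cargo.toml`.

```python
import json
from pathlib import Path


def detect_workspaces(root: Path) -> list:
    """Return the monorepo signals found in the project root (empty list if none)."""
    signals = []

    package_json = root / "package.json"
    if package_json.is_file():
        try:
            manifest = json.loads(package_json.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            manifest = {}  # unreadable manifest: treat as no signal
        if manifest.get("workspaces"):
            signals.append("package.json workspaces")

    if (root / "pnpm-workspace.yaml").is_file():
        signals.append("pnpm-workspace.yaml")

    cargo_toml = root / "Cargo.toml"
    if cargo_toml.is_file() and "[workspace]" in cargo_toml.read_text(encoding="utf-8"):
        signals.append("Cargo workspace")

    if (root / "go.work").is_file():
        signals.append("Go workspace")

    return signals
```

A non-empty result is the cue to ask the user which package(s) to target.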

Output: Project summary table — language, framework, source dirs, test dirs, file counts, monorepo status.

Phase 2: Stack Detection

Auto-detect the test stack using signals from references/framework-detection.md.

Scan for config files and dependency declarations
Determine: language, test framework, runner command
If multiple test frameworks detected (e.g., Jest for unit + Playwright for E2E), identify each separately
If --test-cmd is provided, use it as override
Present detection results to user and ask for confirmation before proceeding
Detect optional recording tools (when --evidence != off):
- Check vhs --version — terminal test recording. If missing: warn + show install command per OS
- Check ffmpeg -version — native screen recording + GIF conversion. If missing: warn
- For Playwright projects: note that video can be enabled via config (video: 'retain-on-failure')
- These are optional enhancements — baseline file evidence is always captured without tools
Determine app type (or use --app-type override):
- terminal — Jest, Vitest, Mocha, Pytest, Go test, Cargo test, etc. (default for most)
- browser — Playwright or Cypress detected as test framework
- native — WinAppDriver, XCUITest, Appium, Robot Framework detected, or --app-type=native
- Present tool availability and app type in the detection summary
Detect coverage tooling — check for existing coverage infrastructure:
- Pre-commit hooks: .husky/pre-commit containing coverage commands, lint-staged with coverage
- CI pipelines: .github/workflows/*.yml, .gitlab-ci.yml, Jenkinsfile — look for coverage gates/thresholds
- Runner config: Jest coverageThreshold, pytest --cov-fail-under, Go -coverprofile in scripts
- Third-party: Codecov config (codecov.yml), Coveralls, SonarQube
- If none detected, flag for recommendation in the report's "Next Steps"

Output: Detection summary — runner command, detected frameworks, app type, recording tools, coverage tooling status.
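
As a rough illustration of the detection step, the sketch below scans for config files and dependency declarations and derives the app type. The signal table is an assumption made for this example; the actual signals live in references/framework-detection.md, which is not reproduced here.

```python
import json
from pathlib import Path

# Hypothetical signal table: marker file -> (language, framework, runner command).
CONFIG_SIGNALS = {
    "jest.config.js": ("javascript", "jest", "npx jest"),
    "vitest.config.ts": ("typescript", "vitest", "npx vitest run"),
    "playwright.config.ts": ("typescript", "playwright", "npx playwright test"),
    "pytest.ini": ("python", "pytest", "pytest"),
    "go.mod": ("go", "go test", "go test ./..."),
    "Cargo.toml": ("rust", "cargo test", "cargo test"),
}

# Hypothetical dependency signals read from package.json.
DEPENDENCY_SIGNALS = [
    ("jest", "javascript", "jest", "npx jest"),
    ("vitest", "javascript", "vitest", "npx vitest run"),
    ("@playwright/test", "javascript", "playwright", "npx playwright test"),
    ("cypress", "javascript", "cypress", "npx cypress run"),
]


def detect_frameworks(root: Path) -> list:
    """List every detected framework separately, with the signal that matched."""
    found = []
    for marker, (language, framework, command) in CONFIG_SIGNALS.items():
        if (root / marker).is_file():
            found.append({"language": language, "framework": framework,
                          "command": command, "signal": marker})
    package_json = root / "package.json"
    if package_json.is_file():
        manifest = json.loads(package_json.read_text(encoding="utf-8"))
        deps = {**manifest.get("dependencies", {}), **manifest.get("devDependencies", {})}
        known = {entry["framework"] for entry in found}
        for dep, language, framework, command in DEPENDENCY_SIGNALS:
            if dep in deps and framework not in known:
                found.append({"language": language, "framework": framework,
                              "command": command, "signal": f"package.json: {dep}"})
    return found


def app_type(frameworks: list, override=None) -> str:
    """browser when Playwright or Cypress is present, else terminal; override wins."""
    if override:
        return override
    names = {entry["framework"] for entry in frameworks}
    return "browser" if names & {"playwright", "cypress"} else "terminal"
```

Native detection (WinAppDriver, XCUITest, Appium, Robot Framework) is left out of the sketch; the result would still be presented to the user for confirmation before any command runs.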

Phase 3: Preflight Safety

Before running any tests, check for potential side effects.

Scan test configuration files for:
- Database connection strings (check for prod-like URLs)
- Environment variable requirements (process.env, os.environ, etc.)
- API keys / secrets references
- Network calls to external services
- File system write operations outside temp dirs
Scan test files in scope for:
- Direct HTTP calls to non-localhost URLs
- Database mutation operations (INSERT, UPDATE, DELETE, DROP)
- Payment/billing API references
- Email/SMS sending functions
Classify risks:
- Safe: Mocked/stubbed, localhost only, in-memory DB
- Warn: External API calls, real DB mutations, file writes
- Block: Production URLs, payment APIs, destructive operations

If any Warn/Block risks found:

List each risk with file:line evidence
Ask user for explicit confirmation to proceed
Suggest mitigations (e.g., "set DATABASE_URL to test DB", "mock the payment client")
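
One way to picture the scan and the Safe/Warn/Block classification is a small rule table applied line by line, producing file:line evidence. The patterns below are assumptions for illustration (including the choice to treat DROP and TRUNCATE as destructive, hence Block); they are not the skill's actual rules.

```python
import re

# Hypothetical rules: (risk level, label, pattern). Order does not matter.
RULES = [
    ("block", "payment API reference",
     re.compile(r"\b(stripe|paypal|braintree)\b", re.IGNORECASE)),
    ("block", "destructive SQL",
     re.compile(r"\b(DROP\s+TABLE|TRUNCATE)\b", re.IGNORECASE)),
    ("warn", "database mutation",
     re.compile(r"\b(INSERT\s+INTO|UPDATE\s+\w+\s+SET|DELETE\s+FROM)\b", re.IGNORECASE)),
    ("warn", "non-localhost HTTP call",
     re.compile(r"https?://(?!localhost|127\.0\.0\.1)[\w.-]+", re.IGNORECASE)),
]


def scan_text(path: str, text: str) -> list:
    """Return one finding per matching rule and line, with file:line evidence."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for level, label, pattern in RULES:
            if pattern.search(line):
                findings.append({"level": level, "risk": label,
                                 "evidence": f"{path}:{line_no}"})
    return findings


def overall_risk(findings: list) -> str:
    """Block outranks Warn; no findings means Safe."""
    levels = {finding["level"] for finding in findings}
    if "block" in levels:
        return "block"
    return "warn" if "warn" in levels else "safe"
```

A pattern match is only a prompt for review: a mocked payment client would still match, so the findings feed the confirmation question rather than an automatic verdict.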

If --run=false: Skip this phase entirely.

If --dry-run: Show the safety report but do not prompt for confirmation.

Phase 4: Test Execution

Run the existing test suite and capture results into the evidence package.

If --run=false, skip to Phase 6. If --dry-run, show the commands that would run and skip to Phase 6.

Create evidence directory structure:

<evidence-dir>/<YYYY-MM-DD-HHmm>/
  execution/
  failures/
  gaps/
  remediation/
  recordings/

Run the test command detected in Phase 2 (or --test-cmd)
Always capture stdout+stderr to execution/test-output.txt (use 2>&1 | tee)
Always generate execution/test-summary.json by parsing the test output — see references/evidence-tools.md for the format spec
Apply --timeout — if execution exceeds the limit, kill the process tree and report partial results
Parse from output:
- Total tests, passed, failed, skipped, duration
- Full failure output for each failing test
If tests fail to start (missing deps, config error), diagnose the setup issue and report it. Do not proceed to Phase 5 failure triage — instead report the setup issue and ask user how to proceed.
Recording capture (when --evidence != off). Strategy depends on app type detected in Phase 2. See references/evidence-tools.md Tier 2 section for tool details.

Terminal tests:
- Generate a VHS tape file wrapping the test command
- Run vhs <tape-file> instead of the raw command — outputs .gif to recordings/
- If VHS unavailable: fall back to raw command (stdout/stderr still captured in execution/)
Browser UI tests:
- Playwright: temporarily set video: 'retain-on-failure' (or video: 'on' if --evidence=all)
- Cypress: enable video: true in config
- After test run: collect .webm files into recordings/
- For failed tests with a known URL: npx playwright screenshot <base-url><route> <output>.png --full-page
Native desktop app tests:
- Before test: start ffmpeg screen recording as background process (platform-specific: gdigrab/avfoundation/x11grab)
- Run the test command normally
- After test: stop ffmpeg (SIGINT / taskkill on Windows), save to recordings/
- If ffmpeg unavailable: fall back to platform-native screenshot

Output: Test execution summary table. Evidence artifacts listed if captured.
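
The directory creation, output capture, and timeout handling above can be sketched as follows. This is an illustration under assumptions: `create_evidence_dir` and `run_tests` are hypothetical names, output is redirected straight into the log file instead of piped through `tee`, and the process-tree kill relies on POSIX process groups.

```python
import os
import signal
import subprocess
from datetime import datetime
from pathlib import Path
from typing import Optional

SUBDIRS = ("execution", "failures", "gaps", "remediation", "recordings")


def create_evidence_dir(base: Path, now: Optional[datetime] = None) -> Path:
    """Create <base>/<YYYY-MM-DD-HHmm>/ with the five evidence subdirectories."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d-%H%M")
    package = base / stamp
    for name in SUBDIRS:
        (package / name).mkdir(parents=True, exist_ok=True)
    return package


def run_tests(command: str, package: Path, timeout: int = 300) -> dict:
    """Run the test command, capture stdout+stderr, and enforce the timeout."""
    log_path = package / "execution" / "test-output.txt"
    posix = os.name == "posix"
    with log_path.open("w", encoding="utf-8") as log:
        # start_new_session gives the runner its own process group (POSIX only),
        # so the whole tree can be killed on timeout.
        proc = subprocess.Popen(command, shell=True, stdout=log,
                                stderr=subprocess.STDOUT, start_new_session=posix)
        try:
            exit_code = proc.wait(timeout=timeout)
            timed_out = False
        except subprocess.TimeoutExpired:
            if posix:
                os.killpg(proc.pid, signal.SIGKILL)
            else:
                proc.kill()  # kills only the shell; child processes may survive
            exit_code = proc.wait()
            timed_out = True
    return {"exit_code": exit_code, "timed_out": timed_out, "log": str(log_path)}
```

On a timeout the log file still holds whatever the runner printed before it was killed, which is what the partial-results report would be built from.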

Phase 5: Failure Triage

Classify every failure from Phase 4. For each failing test:

Create evidence folder failures/NNN-<test-name>/ containing:
- error-output.txt — full error message + stack trace extracted from test output
- source-context.txt — source code around the failing line (+/- 10 lines)
- test-code.txt — the complete failing test function/block
- rerun-output.txt — output from rerunning the single test (flaky check)
- screenshot.png — for browser tests, capture the page under test
Classify the failure:

Category	Signal	Action
env/infra	Missing deps, DB connection refused, file not found, permission denied, port in use	Report the infra issue. Do not rerun.
flaky	Passes on 2nd or 3rd rerun of the same test (use `--retry` flag if runner supports it, else rerun manually up to 2 times)	Mark as flaky. Report but deprioritize.
real defect	Fails consistently across reruns. Assertion mismatch indicates the code under test is wrong.	Report as product bug. Include expected vs actual.
test bug	Assertion logic is wrong, mock is stale/incorrect, test uses removed API, test has race condition	Report as test bug. If `--fix` enabled, queue for remediation.

For each failure, produce:
- Category label
- Confidence (high/medium/low)
- Evidence link: [evidence](failures/NNN-<test-name>/)
- Suggested fix (one sentence)

Output: Failure triage table with evidence links.
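
The mechanical part of the triage table (env/infra signals and the flaky rerun check) might be sketched like this. The error patterns are assumptions for illustration, and `rerun` is a hypothetical callback that reruns the single test and returns whether it passed. Telling a real defect from a test bug needs judgment, so the sketch deliberately leaves that case undecided.

```python
import re
from typing import Callable

# Assumed text signals for env/infra failures (missing deps, refused
# connections, missing files, permissions, ports in use).
ENV_SIGNALS = re.compile(
    r"ECONNREFUSED|connection refused|permission denied|EADDRINUSE|"
    r"address already in use|No such file or directory|"
    r"ModuleNotFoundError|Cannot find module",
    re.IGNORECASE,
)


def triage_failure(error_output: str, rerun: Callable[[], bool],
                   max_reruns: int = 2) -> dict:
    """Classify a failure as env/infra, flaky, or consistently failing."""
    if ENV_SIGNALS.search(error_output):
        # Infra problems are reported as-is; the test is not rerun.
        return {"category": "env/infra", "reruns": 0}
    for attempt in range(1, max_reruns + 1):
        if rerun():
            return {"category": "flaky", "reruns": attempt}
    # Consistent failure: real defect or test bug, decided by reading the code.
    return {"category": "real defect or test bug", "reruns": max_reruns}
```

With `max_reruns=2` a test that passes on either rerun is marked flaky, matching the "2nd or 3rd" run wording in the table.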

Phase 6: Gap Analysis

The core value of this skill. Identify untested or under-tested code. Uses an agent team for parallel analysis when multiple lens groups are active.

Read references/qa-team-roles.md for teammate definitions, output formats, and the synthesis protocol.

Phase 6a: Prepare (Lead)

Build the source-to-test mapping that all teammates will use:

Build a source-file-to-test-file mapping:
- Convention-based: src/foo.ts -> test/foo.test.ts, src/foo.py -> tests/test_foo.py
- Config-based: check testMatch, testPathPattern, pytest testpaths
Find source files with NO corresponding test file
Determine which teammate groups are needed based on --lens:
- Code Quality group: functional, resilience, idempotence
- Security & Infra group: security, infra, network
- User & System group: frontend, journey, performance, observability
Map each active --lens value to its group. Only groups with at least one active lens will be spawned.
Assign non-overlapping gap NNN ranges per group (001-199, 200-399, 400-599)
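
A small sketch of the preparation step: convention-based mapping for the two examples given above, detection of source files without a test file, and selection of the active lens groups with their gap ranges. Only the two naming conventions from the text are covered; the helper names are hypothetical, and config-based mapping (testMatch, testpaths) is not modelled.

```python
from pathlib import PurePosixPath

# Groups, lenses, and gap ranges as listed in the steps above.
LENS_GROUPS = {
    "Code Quality": {"lenses": ["functional", "resilience", "idempotence"],
                     "range": (1, 199)},
    "Security & Infra": {"lenses": ["security", "infra", "network"],
                         "range": (200, 399)},
    "User & System": {"lenses": ["frontend", "journey", "performance", "observability"],
                      "range": (400, 599)},
}


def candidate_test(source: str) -> str:
    """src/foo.ts -> test/foo.test.ts, src/foo.py -> tests/test_foo.py."""
    path = PurePosixPath(source)
    rel_parent = PurePosixPath(*path.parts[1:]).parent  # drop the leading source dir
    if path.suffix == ".py":
        return str(PurePosixPath("tests") / rel_parent / f"test_{path.name}")
    return str(PurePosixPath("test") / rel_parent / f"{path.stem}.test{path.suffix}")


def untested_sources(sources: list, tests: list) -> list:
    """Source files whose conventional test file does not exist."""
    known = set(tests)
    return [source for source in sources if candidate_test(source) not in known]


def active_groups(lenses: list) -> dict:
    """Keep only groups with at least one active lens, with their gap range."""
    active = {}
    for name, group in LENS_GROUPS.items():
        chosen = [lens for lens in group["lenses"] if lens in lenses]
        if chosen:
            active[name] = {"lenses": chosen, "range": group["range"]}
    return active
```

With `--lens=functional,security` two groups come back, so a team would be spawned; with `--lens=security` alone only one does, which is the inline case described in Phase 6b.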

Phase 6b: Parallel Lens Analysis (Team)

If only 1 group has active lenses (e.g., --lens=security), skip team creation and run the analysis inline — apply the QA lenses per references/qa-perspectives.md sequentially to avoid orchestration overhead.

If 2 or more groups are active, create an agent team to analyze test gaps from different quality perspectives:

Create the team and spawn each needed teammate with their context package (see references/qa-team-roles.md for the full spawn prompt spec):
- Source-to-test mapping from Phase 6a
- Project metadata (language, framework, test runner, source/test dirs)
- Their assigned lenses and lens definitions from references/qa-perspectives.md
- Gap NNN range, evidence directory path, filter settings (--severity, --exclude, --app-type, --base-url)
Each teammate independently:
- Greps source files for their lens-specific patterns
- Checks if matching code has corresponding test coverage
- Generates proof files gaps/gap-NNN-<lens>-<severity>.md containing: full source snippet, explanation, pattern matched, suggested test description (see references/evidence-tools.md Tier 1 for the proof template)
- For browser apps (--app-type=browser): the User & System teammate captures screenshots as gaps/gap-NNN-screenshot.png
- Returns a structured findings table to the lead
The lead waits for all teammates to complete and report back.

Phase 6c: Synthesis (Lead)

Aggregate all teammate findings into a unified gap analysis:

Collect all findings tables from each teammate
Deduplicate — if two lenses found the same file:line, keep both findings but note the overlap
Apply --severity filter — drop findings below the minimum severity
Apply --max-findings cap — keep highest severity findings first
Apply --exclude pattern — skip matching files
Renumber gap files to a clean unified sequence (001, 002, ...) by renaming files from teammate ranges
Clean up the team
For each finding, produce a summary row:
- Lens category
- Severity: critical / high / medium / low
- Confidence: high / medium / low
- Source location: file:line
- Pattern matched (what was detected)
- Proof link: [proof](gaps/gap-NNN-<lens>-<severity>.md)
- Suggested test description (what a test should verify)

Output: Gap analysis findings list, grouped by lens, with proof links.
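
The synthesis steps can be sketched as one function over the collected findings. The finding fields (`lens`, `severity`, `file`, `line`) are assumed names for this example. One deliberate difference from the listed order: the sketch applies the --exclude and --severity filters before the --max-findings cap, on the assumption that dropped findings should not use up slots.

```python
import fnmatch

SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def synthesize(findings: list, min_severity: str = "all",
               exclude=None, max_findings: int = 50) -> list:
    """Filter, note overlaps, cap by severity, and renumber proof files."""
    kept = []
    for finding in findings:
        if exclude and fnmatch.fnmatch(finding["file"], exclude):
            continue
        if (min_severity != "all"
                and SEVERITY_RANK[finding["severity"]] > SEVERITY_RANK[min_severity]):
            continue
        kept.append(dict(finding))  # copy so the caller's data is untouched

    # Keep both findings at the same file:line, but record the overlap.
    by_location = {}
    for finding in kept:
        by_location.setdefault((finding["file"], finding["line"]), []).append(finding)
    for group in by_location.values():
        for finding in group:
            finding["overlaps"] = sorted(
                other["lens"] for other in group if other is not finding)

    # Stable sort: highest severity first, original order within a severity.
    kept.sort(key=lambda finding: SEVERITY_RANK[finding["severity"]])
    kept = kept[:max_findings]

    # Renumber to a clean unified sequence: 001, 002, ...
    for number, finding in enumerate(kept, start=1):
        finding["proof"] = (
            f"gaps/gap-{number:03d}-{finding['lens']}-{finding['severity']}.md")
    return kept
```

Renaming the proof files on disk from the teammate ranges to the new numbers is left out; the sketch only computes the target names.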

Phase 7: Remediation & Report

7a: Write Missing Tests (only if `--fix` is enabled)

For each gap finding queued for remediation:

Determine the target test file:
- Use the source-to-test mapping from Phase 6
- If no test file exists, create one following the project's test naming convention
Write the test:
- Follow existing test patterns in the project (imports, describe/it structure, fixtures)
- Use deterministic assertions only — no random data, no timing-dependent checks
- Mock external dependencies (network, DB, file system) following existing mock patterns
- Include a comment: // QA-ENGINEER: covers <lens> gap in <source-file>:<line>
Run the impacted test file to verify the new test passes
Capture evidence to remediation/NNN-<test-file>/:
- added-tests.diff — diff of the test file changes
- test-output.txt — output from running the new tests
If the test fails:
- If it's a test bug (assertion wrong), fix it and rerun once
- If it's a real defect (code is actually broken), keep the test but mark it with // TODO: real defect — <description>
- If it's flaky after 2 reruns, delete it and report the issue
Constraints:
- Only edit test files — NEVER edit source/production code
- Only add new test cases — never modify or delete existing tests
- Maximum 10 new test files per run (ask user to continue if more needed)

7b: Generate Quality Report

Generate a markdown report following references/report-template.md.

Save the report as REPORT.md inside the evidence package directory (<evidence-dir>/<timestamp>/REPORT.md)
The evidence directory was created in Phase 4 (or create it now if --run=false)
Add the evidence directory (qa-evidence/ by default, or the --evidence-dir value) to .gitignore if not already present (ask user first)

Report contents:

Evidence Package manifest table (directory -> contents -> count)
Executive summary with health verdict
Test execution results from Phase 4 (linked to execution/ files)
Failure triage from Phase 5 (linked to failures/NNN/ folders)
Gap analysis findings from Phase 6, organized by lens and by severity (linked to gaps/ proof files)
Remediation summary from Phase 7a (linked to remediation/NNN/ folders, if --fix was used)
Coverage Tooling section — detected tools or recommendation to set up coverage gates
Recommended next steps (include coverage tooling setup if none detected)

Present the executive summary to the user in chat — include:
- Health verdict and key metrics
- Evidence package path and file counts: "Evidence package: ./qa-evidence/<timestamp>/ — N failure folders, N gap proofs, N recordings"
- Don't make them open the file for the headline result

7c: Recording Conversion (only when recording artifacts exist in `recordings/`)

Present recording summary: list all captured artifacts with type, size, duration
Ask user: "Which recording artifacts should be converted to GIF for PR?"
For user-approved items:
- .webm / .mp4 -> GIF via ffmpeg palette method (see references/evidence-tools.md)
- VHS .gif output — copy as-is (already GIF)
- .png screenshots — keep as-is
Save GIFs to recordings/gif/
Output: "Ready for PR: N GIFs saved to recordings/gif/"
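
For orientation, a two-stage palette conversion is commonly written as a single ffmpeg filter chain using `palettegen` and `paletteuse`. The sketch below only builds the command and a conversion plan; the frame rate, width, and helper names are assumptions, and the exact recipe in references/evidence-tools.md may differ.

```python
from pathlib import Path


def gif_command(source: Path, out_dir: Path, fps: int = 10, width: int = 640) -> list:
    """Build an ffmpeg command that converts a video to GIF via a generated palette."""
    target = out_dir / f"{source.stem}.gif"
    filters = (
        f"fps={fps},scale={width}:-1:flags=lanczos,"
        "split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse"
    )
    return ["ffmpeg", "-y", "-i", str(source), "-vf", filters, str(target)]


def plan_conversion(artifacts: list, out_dir: Path) -> list:
    """Decide per artifact: convert video, copy VHS GIFs, keep screenshots as-is."""
    plan = []
    for artifact in artifacts:
        if artifact.suffix in (".webm", ".mp4"):
            plan.append(("convert", gif_command(artifact, out_dir)))
        elif artifact.suffix == ".gif":
            plan.append(("copy", artifact))
        else:
            plan.append(("keep", artifact))
    return plan
```

Each "convert" entry can be passed to `subprocess.run` once the user has approved the item and ffmpeg is known to be installed.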

Retrospective

After completing the workflow, reflect on the entire execution session:

Consider: Were there mid-session corrections? Rejected outputs? Plan changes? Errors?
Ask the user: "Quick feedback on this run? (1-5 rating, note any issues, or press enter to skip)"
If the user provides feedback OR if corrections/issues occurred during this session:
- Create feedback/ directory if it does not exist
- Read feedback/log.md (create with # Feedback Log header if it does not exist)
- Prepend a new entry after the header using the log format from my-skill-factory/references/skill-improvement-guide.md
- Fill in: current timestamp, skill version from frontmatter, task description, outcome assessment, corrections that occurred during the session, issues encountered, user's note
If the user skips AND no corrections or issues occurred, end without recording.

Monorepo Behavior

When a monorepo is detected:

Ask user which packages to target (or accept --scope=packages/foo)
Run Phases 2-7 independently per package
Generate one evidence package per package, plus a combined executive summary
Cross-package integration gaps are noted but not deeply analyzed (suggest using --lens=network for API boundaries)

Error Handling

Situation	Behavior
No test files found	Skip Phases 4-5. Run Phase 6 gap analysis. Report "no test suite detected".
Test runner not detected	Ask user for `--test-cmd`. If still unknown, skip Phases 4-5.
Tests time out	Kill after `--timeout` seconds. Report partial results. Suggest increasing timeout or scoping.
Permission denied on test dir	Report the issue. Ask user to fix permissions.
Git not available	`--scope=changed` falls back to `--scope=all` with a warning.
Evidence dir not writable	Fall back to report-only mode — generate REPORT.md in the current directory without evidence subdirectories. Warn user.

Integration with Other Skills

Skill	When to use instead / together
`tdd-team-workflow`	Use TDD to build new features with tests. Use `orch-qa` to audit existing code.
`e2e-test`	Use for interactive Playwright browser testing. `orch-qa` evaluates E2E test coverage but uses the project's own test runner.
`review-pr`	Use for line-by-line PR review. `orch-qa` focuses on test quality across the codebase, not PR diff review.
`playwright-cli` / `playwright-codegen`	Use to create/run Playwright tests. `orch-qa` may detect E2E gaps and recommend using these skills.

Example Invocations

Full QA audit

Run a full QA analysis on this project

Security-focused audit, no test execution

Analyze test gaps --run=false --lens=security

Fix failing tests and fill gaps

QA this codebase --fix

Audit only changed files

Run QA on changed files --scope=changed --lens=functional,security

Dry run to preview

QA audit --dry-run

Custom test command with timeout

Run tests and analyze gaps --test-cmd="npm run test:unit" --timeout=120

QA with recording evidence for all tests

QA this project --evidence=all

Native desktop app testing

Run QA --app-type=native --evidence=on-failure

Browse evidence after a run

# Evidence package structure:
./qa-evidence/2026-03-07-1430/
  REPORT.md                          # Start here — index into all evidence
  execution/test-output.txt          # Full test output
  execution/test-summary.json        # Parsed results
  failures/001-auth-login/           # Failure evidence folder
    error-output.txt
    source-context.txt
    test-code.txt
    rerun-output.txt
  gaps/gap-001-security-critical.md  # Gap proof with source snippet
  gaps/gap-001-screenshot.png        # UI screenshot (browser apps)
  remediation/001-auth-test/         # Added test diff + output
    added-tests.diff
    test-output.txt
