qa-review
// Use when reviewing or planning QA strategy for a feature, PR, or release so test coverage, test quality, reliability, and defect reporting are handled as a coherent engineering discipline instead of ad hoc checks.
| name | qa-review |
| description | Use when reviewing or planning QA strategy for a feature, PR, or release so test coverage, test quality, reliability, and defect reporting are handled as a coherent engineering discipline instead of ad hoc checks. |
| metadata | {"category":"testing","agent_type":"general-purpose","origin":"adapted from github/awesome-copilot qa-engineering-best-practices.instructions.md (MIT)"} |
Review quality strategy the way a strong QA engineer would: choose the right test layers, verify the tests are readable and deterministic, and make failures easy to act on.
| Instead of qa-review | Use |
|---|---|
| Writing one failing test before implementation | tdd-workflow |
| Building an LLM or agent evaluation suite | eval-harness |
| Writing or debugging browser automation for a specific flow | e2e-testing or browser-devtools |
| General UX or usability critique | ux-audit |
Balance the test pyramid first:
| Layer | Goal | Typical share |
|---|---|---|
| Unit | business logic, edge cases, fast feedback | 60-70% |
| Integration | module boundaries, DB, API, contracts | 20-30% |
| End-to-end | critical user journeys and smoke flows | 5-10% |
If a change leans too heavily on slow end-to-end tests or skips contract-level checks entirely, treat that as a QA design issue, not just a missing test.
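As a rough illustration, the share check above could be sketched as follows. The layer keys (`unit`, `integration`, `e2e`) and the `counts` input are hypothetical; the ranges come from the table, and how tests get counted per layer is left to the project.

```python
from typing import Dict, List

# Typical share ranges from the pyramid table, as fractions of all tests.
TARGET_SHARES = {
    "unit": (0.60, 0.70),
    "integration": (0.20, 0.30),
    "e2e": (0.05, 0.10),
}


def pyramid_findings(counts: Dict[str, int]) -> List[str]:
    """Return one finding per layer whose share falls outside its typical range."""
    total = sum(counts.get(layer, 0) for layer in TARGET_SHARES)
    if total == 0:
        return ["no tests found in any layer"]

    findings = []
    for layer, (low, high) in TARGET_SHARES.items():
        share = counts.get(layer, 0) / total
        if share < low:
            findings.append(f"{layer}: {share:.0%} is below the typical {low:.0%}-{high:.0%}")
        elif share > high:
            findings.append(f"{layer}: {share:.0%} is above the typical {low:.0%}-{high:.0%}")
    return findings


if __name__ == "__main__":
    # Hypothetical counts for a suite that leans on end-to-end tests.
    for finding in pyramid_findings({"unit": 10, "integration": 0, "e2e": 40}):
        print(finding)
```

The ranges are guidance rather than hard limits, so a finding is a prompt to look at the suite's design, not a failure by itself.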
Before reading tests, ask:
Use that to decide whether you primarily need unit, integration, E2E, or performance evidence.
rg -n "describe\\(|test\\(|it\\(" . -g "*.{test,spec}.{js,jsx,ts,tsx}"
rg -n "def test_|class Test" . -g "*_test.py" -g "test_*.py"
rg -n "@playwright/test|cypress|selenium" .
Check whether the tests already map to the right layer or whether important paths are covered only indirectly.
Prefer names that read as standalone behavior statements:
- should return 404 when product id does not exist
- given an expired token, when the user calls /me, then it returns 401

Flag names like test1, works, or implementation-detail phrasing that makes failures harder to interpret.
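One way to triage names in bulk is a small heuristic filter. This is only a sketch: the `VAGUE_NAME` pattern and the three-word minimum are assumptions for illustration, not rules from this skill, and a flagged name still needs human judgment.

```python
import re
from typing import List

# Hypothetical heuristic: generic names that say nothing about behavior.
VAGUE_NAME = re.compile(r"^(test\s*\d*|works|it works|should work|basic|misc)$", re.IGNORECASE)


def vague_test_names(names: List[str]) -> List[str]:
    """Return the names that are generic or too short to describe a behavior."""
    flagged = []
    for name in names:
        normalized = name.strip()
        # Split on whitespace and underscores so snake_case names count as several words.
        words = [w for w in re.split(r"[\s_]+", normalized) if w]
        if VAGUE_NAME.match(normalized) or len(words) < 3:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    names = [
        "test1",
        "works",
        "should return 404 when product id does not exist",
        "given an expired token, when the user calls /me, then it returns 401",
    ]
    print(vague_test_names(names))
```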
For assertions:
Look for:
Treat random sleeps, global state reuse, and order-dependent tests as reliability bugs.
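A common replacement for a fixed sleep is to poll for the condition the test actually cares about, with an explicit timeout. The sketch below assumes a hypothetical `FakeJob` standing in for asynchronous work; `wait_until` is an illustrative helper, not part of any test framework.

```python
import threading
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # short, bounded poll interval rather than one arbitrary wait


class FakeJob:
    """Hypothetical stand-in for an asynchronous operation under test."""

    def __init__(self, delay: float) -> None:
        self.done = threading.Event()
        # Mark the job as done after `delay` seconds on a background timer thread.
        threading.Timer(delay, self.done.set).start()


def test_job_completes() -> None:
    job = FakeJob(delay=0.1)
    # The test waits only as long as needed and fails with a clear message on timeout.
    assert wait_until(job.done.is_set, timeout=2.0), "job did not finish within 2s"
```

Compared with `time.sleep(2)`, the test finishes as soon as the job does, and a timeout produces a failure that names the condition that was never met.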
For APIs, verify:
For UI or E2E coverage, verify:
role, label, then test-id)
For performance-sensitive work, verify:
A strong QA pass should leave behind:
Bug reports should include:
## QA Review
### Coverage Shape
- ...
### Test Quality
- ...
### Reliability Risks
- ...
### Missing Cases
- ...
### CI / Reporting Gaps
- ...
### Recommended Next Step
1. ...
2. ...
3. ...
| Rationalization | Reality |
|---|---|
| "The E2E test covers everything already." | Slow end-to-end coverage does not replace unit or contract coverage. |
| "This assertion is good enough." | Vague assertions create vague failures. |
| "A little sleep makes the test stable." | Arbitrary waits hide race conditions instead of fixing them. |
| "Coverage is high, so QA is done." | Coverage percentages do not prove critical paths or edge cases were tested well. |
- tdd-workflow - write the first failing tests before implementation
- eval-harness - evaluate LLM or agent workflows with tracked test cases
- e2e-testing - build and run critical user-flow automation
- test-coverage - measure and close structural coverage gaps