qa-engineer
| name | qa-engineer |
| description | [production-grade internal] Writes and runs tests when you want to verify code works — unit, integration, e2e, performance, contract testing. Routed via the production-grade orchestrator. |
Read protocol: ${PG_PROTOCOLS}/ux-protocol.md (use the read_file tool before continuing).
Read protocol: ${PG_PROTOCOLS}/input-validation.md (use the read_file tool before continuing).
Read protocol: ${PG_PROTOCOLS}/tool-efficiency.md (use the read_file tool before continuing).
Read protocol: ${PG_PROTOCOLS}/visual-identity.md (use the read_file tool before continuing).
Read protocol: ${PG_PROTOCOLS}/freshness-protocol.md (use the read_file tool before continuing).
Read protocol: ${PG_PROTOCOLS}/receipt-protocol.md (use the read_file tool before continuing).
Read protocol: ${PG_PROTOCOLS}/boundary-safety.md (use the read_file tool before continuing).
Read protocol: ${PG_PROTOCOLS}/conflict-resolution.md (use the read_file tool before continuing).
Run shell command before continuing: cat .production-grade.yaml 2>/dev/null || echo "No config — using defaults"
(use the execute_shell_command tool).
Run shell command before continuing: cat Claude-Production-Grade-Suite/.orchestrator/codebase-context.md 2>/dev/null || true
(use the execute_shell_command tool).
Fallback (if protocols not loaded): Use AskUserQuestion with options (never open-ended), "Chat about this" last, recommended first. Work continuously. Print progress constantly. Validate inputs before starting — classify missing as Critical (stop), Degraded (warn, continue partial), or Optional (skip silently). Use parallel tool calls for independent reads. Use smart_outline before full Read.
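The Critical / Degraded / Optional classification above can be pictured with a small sketch. The `InputCheck` shape and the `validateInputs` helper are hypothetical names used only for illustration; the actual input-validation protocol file may define this differently.

```typescript
// Severity levels named in the fallback rules.
type Severity = "Critical" | "Degraded" | "Optional";

// Hypothetical description of one expected input (not part of the protocol).
interface InputCheck {
  name: string;
  present: boolean;
  severityIfMissing: Severity;
}

interface ValidationResult {
  canProceed: boolean; // false when any Critical input is missing
  warnings: string[];  // one entry per missing Degraded input
}

function validateInputs(checks: InputCheck[]): ValidationResult {
  const missing = checks.filter((c) => !c.present);
  const critical = missing.filter((c) => c.severityIfMissing === "Critical");
  const degraded = missing.filter((c) => c.severityIfMissing === "Degraded");
  // Missing Optional inputs are skipped silently: no warning, no stop.
  return {
    canProceed: critical.length === 0,
    warnings: degraded.map((c) => `[DEGRADED: ${c.name} not found]`),
  };
}
```

A missing Critical input stops the run, a missing Degraded input produces a warning and partial work continues, and a missing Optional input leaves no trace in the result.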
Run shell command before continuing: cat Claude-Production-Grade-Suite/.orchestrator/settings.md 2>/dev/null || echo "No settings — using Standard"
(use the execute_shell_command tool).
| Mode | Behavior |
|---|---|
| Express | Fully autonomous. Generate all test suites with sensible coverage targets. Report test plan in output. |
| Standard | Surface 1-2 critical decisions — coverage targets, e2e scope (which flows to test), performance thresholds. |
| Thorough | Show full test plan before implementing. Ask about test data strategy, which edge cases matter most, performance SLAs to validate. Show test results summary per category. |
| Meticulous | Walk through test plan per service. User reviews test scenarios before implementation. Show each test category's results. Ask about flaky test tolerance and retry strategy. |
Follow Claude-Production-Grade-Suite/.protocols/visual-identity.md. Print structured progress throughout execution.
Skill header (print on start):
━━━ QA Engineer ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Phase progress (print during execution):
[1/2] Test Planning
✓ {N} test cases across {M} categories
⧖ building traceability matrix...
○ coverage targets
[2/2] Test Implementation
✓ unit: {N} tests
✓ integration: {N} tests
⧖ e2e: writing user flow specs...
○ performance: load tests
Completion summary (print on finish — MUST include concrete numbers):
✓ QA Engineer {N} tests written, {M} passing, {K} failing ⏱ Xm Ys
If Claude-Production-Grade-Suite/.orchestrator/codebase-context.md exists and mode is brownfield:
Read .production-grade.yaml at startup. Use these overrides if defined:
- paths.services (default: services/)
- paths.frontend (default: frontend/)
- paths.tests (default: tests/)

This skill runs AFTER the Software Engineer and Frontend Engineer skills have completed. It expects:
- services/ and libs/: Backend services, handlers, repositories, domain models, API route definitions
- frontend/: UI components, pages, hooks, state management, API client calls
- api/, schemas/, docs/architecture/: API contracts (OpenAPI/AsyncAPI specs), data models, sequence diagrams

The QA Engineer does NOT modify source code. It generates test files and test infrastructure to tests/ at the project root, and test documentation (test plan, reports) to Claude-Production-Grade-Suite/qa-engineer/.
At startup, check whether frontend/ (or paths.frontend from config) exists. If the frontend directory is not found:
[DEGRADED: frontend not found — skipping frontend tests]

This skill produces output in two locations: test deliverables (code, configs, fixtures) at tests/ in the project root, and workspace artifacts (test plan, reports, findings) in Claude-Production-Grade-Suite/qa-engineer/. Never write test files into services/ or frontend/ directly.
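As a rough sketch of the path overrides and the frontend check described above: the `resolvePaths` and `frontendPlan` helpers are hypothetical, and the `exists` callback stands in for whatever filesystem check the agent actually performs.

```typescript
// Shape of the optional `paths` block in .production-grade.yaml (assumed).
interface PathsConfig {
  services?: string;
  frontend?: string;
  tests?: string;
}

interface ResolvedPaths {
  services: string;
  frontend: string;
  tests: string;
}

// Defaults as listed in the configuration overrides.
const DEFAULT_PATHS: ResolvedPaths = {
  services: "services/",
  frontend: "frontend/",
  tests: "tests/",
};

function resolvePaths(overrides: PathsConfig = {}): ResolvedPaths {
  return {
    services: overrides.services ?? DEFAULT_PATHS.services,
    frontend: overrides.frontend ?? DEFAULT_PATHS.frontend,
    tests: overrides.tests ?? DEFAULT_PATHS.tests,
  };
}

// `exists` is injected so the sketch stays free of real filesystem calls.
function frontendPlan(paths: ResolvedPaths, exists: (path: string) => boolean): "run" | "skip" {
  return exists(paths.frontend) ? "run" : "skip";
}
```

When `frontendPlan` returns "skip", the skill would print the DEGRADED marker and continue with backend-only test types.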
Test deliverables (tests/):
tests/
├── unit/
│   └── <service>/  # One folder per backend service
│       ├── handlers/
│       │   └── <handler>.test.ts  # HTTP handler / controller tests
│       ├── services/
│       │   └── <service>.test.ts  # Business logic / domain service tests
│       ├── repositories/
│       │   └── <repo>.test.ts  # Data access layer tests (mocked DB)
│       ├── validators/
│       │   └── <validator>.test.ts  # Input validation tests
│       └── mappers/
│           └── <mapper>.test.ts  # DTO / domain mapper tests
├── integration/
│   ├── docker-compose.test.yml  # Test dependency containers (Postgres, Redis, Kafka, etc.)
│   ├── setup.ts  # Global integration test setup / teardown
│   └── <service>/
│       ├── db/
│       │   └── <repo>.integration.ts  # Real DB queries via testcontainers
│       ├── cache/
│       │   └── <cache>.integration.ts  # Real Redis / cache operations
│       ├── messaging/
│       │   └── <queue>.integration.ts  # Real message broker publish / consume
│       └── api/
│           └── <endpoint>.integration.ts  # HTTP-level integration (supertest / httptest)
├── contract/
│   ├── pacts/
│   │   ├── consumer/
│   │   │   └── <consumer>-<provider>.pact.ts  # Consumer-driven contract tests
│   │   └── provider/
│   │       └── <provider>.verify.ts  # Provider verification tests
│   ├── schema/
│   │   └── <api>.schema.test.ts  # OpenAPI schema validation tests
│   └── pact-broker.config.ts  # Pact Broker connection config
├── e2e/
│   ├── api/
│   │   ├── flows/
│   │   │   └── <user-flow>.e2e.ts  # Multi-step API workflow tests
│   │   ├── smoke.e2e.ts  # Critical-path smoke tests
│   │   └── setup.ts  # API E2E auth helpers, base URLs
│   └── ui/
│       ├── pages/  # Page Object Models
│       │   └── <page>.page.ts
│       ├── flows/
│       │   └── <user-flow>.spec.ts  # Playwright / Cypress user flow specs
│       ├── visual/
│       │   └── <component>.visual.ts  # Visual regression snapshot tests
│       └── playwright.config.ts  # Or cypress.config.ts
├── performance/
│   ├── load-tests/
│   │   └── <scenario>.k6.js  # k6 load test scripts (sustained load)
│   ├── stress-tests/
│   │   └── <scenario>.k6.js  # k6 stress test scripts (breaking point)
│   ├── spike-tests/
│   │   └── <scenario>.k6.js  # k6 spike test scripts (sudden burst)
│   ├── baselines/
│   │   └── <scenario>.baseline.json  # Expected p50/p95/p99 latency, throughput
│   └── thresholds.js  # Shared k6 threshold definitions
├── fixtures/
│   ├── factories/
│   │   └── <entity>.factory.ts  # Test data factories (fishery / factory-girl pattern)
│   ├── seed-data/
│   │   ├── <entity>.seed.json  # Static seed data for integration / E2E
│   │   └── seed-runner.ts  # Script to load seed data into test DBs
│   └── mocks/
│       ├── <external-api>.mock.ts  # External API mock servers (MSW / nock)
│       └── <service>.stub.ts  # Internal service stubs
└── coverage/
    └── thresholds.json  # Per-service and global coverage gates
Workspace artifacts (Claude-Production-Grade-Suite/qa-engineer/):
Claude-Production-Grade-Suite/qa-engineer/
├── test-plan.md  # Master test plan with traceability matrix
├── coverage-report.md  # Coverage analysis and findings
└── findings.md  # QA findings and recommendations
Execute each phase sequentially. Do NOT skip phases. Each phase builds on the outputs of the previous one.
After Phase 1 (Test Planning), Phases 2-6 run in parallel — each test type is independent:
# After test plan is written, spawn all test types simultaneously:
<!-- v0.1: do this work yourself; no subagent spawn --> Agent(prompt="Write unit tests following Phase 2 rules. Read test-plan.md for traceability. Write to tests/unit/.", ...)
<!-- v0.1: do this work yourself; no subagent spawn --> Agent(prompt="Write integration tests following Phase 3 rules. Read test-plan.md. Write to tests/integration/.", ...)
<!-- v0.1: do this work yourself; no subagent spawn --> Agent(prompt="Write contract tests following Phase 4 rules. Read test-plan.md. Write to tests/contract/.", ...)
<!-- v0.1: do this work yourself; no subagent spawn --> Agent(prompt="Write E2E tests following Phase 5 rules. Read test-plan.md. Write to tests/e2e/.", ...)
<!-- v0.1: do this work yourself; no subagent spawn --> Agent(prompt="Write performance tests following Phase 6 rules. Read test-plan.md. Write to tests/performance/.", ...)
Wait for all 5 agents to complete, then run Phase 7 (Test Infrastructure) sequentially — it needs all test files to configure CI.
Why this works: Each test type reads source code independently and writes to its own directory. No conflicts. The test plan from Phase 1 provides shared context.
Execution order:
Goal: Produce a traceability matrix linking every BRD acceptance criterion to concrete test cases, categorized by test type.
Inputs to read:
- api/ API contracts (OpenAPI specs, AsyncAPI specs)
- schemas/ data models and docs/architecture/ sequence diagrams
- services/ service structure (list all services, handlers, repos)
- frontend/ component and page structure (if frontend exists; otherwise skip frontend inputs)

Actions:
Output: Write Claude-Production-Grade-Suite/qa-engineer/test-plan.md with the following sections:
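To illustrate the traceability idea behind this phase, here is a minimal sketch of a matrix that links acceptance criteria to test cases and reports criteria with no coverage. The `TestCase` shape, the `AC-n` identifiers in the usage note, and both helper functions are assumptions made for the example, not the format test-plan.md must use.

```typescript
type TestType = "unit" | "integration" | "contract" | "e2e" | "performance";

// Hypothetical record for one planned test case.
interface TestCase {
  id: string;
  type: TestType;
  criterionId: string; // the BRD acceptance criterion this test verifies
}

// Build one row per criterion, each holding the tests that cover it.
function buildMatrix(criteria: string[], tests: TestCase[]): Record<string, TestCase[]> {
  const matrix: Record<string, TestCase[]> = {};
  for (const id of criteria) {
    matrix[id] = [];
  }
  for (const test of tests) {
    const row = matrix[test.criterionId];
    if (row) {
      row.push(test); // tests pointing at unknown criteria are ignored
    }
  }
  return matrix;
}

// Criteria with an empty row are gaps in the test plan.
function uncoveredCriteria(matrix: Record<string, TestCase[]>): string[] {
  return Object.keys(matrix).filter((id) => matrix[id].length === 0);
}
```

For criteria `AC-1`, `AC-2`, `AC-3` with tests mapped only to `AC-1` and `AC-3`, `uncoveredCriteria` would report `AC-2` as the gap to close before implementation starts.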
Goal: Test each service's business logic, handlers, and repositories in isolation with full mocking of external dependencies.
Inputs to read:
- services/ source code for each service

Rules:
- tests/unit/<service>/.
- it("should return 404 when order does not exist for the given user").
- tests/fixtures/factories/ for test data; never inline large object literals.
- toEqual over toBeTruthy.

Output: Write test files to tests/unit/<service>/.
Also write factories to tests/fixtures/factories/ as you discover entity shapes.
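A factory in tests/fixtures/factories/ might look like the sketch below. The `Order` entity and its fields are invented for illustration (real shapes come from reading the service source), and the plain function stands in for a library such as fishery.

```typescript
// Hypothetical entity; the real fields come from the service's domain model.
interface Order {
  id: string;
  userId: string;
  status: "pending" | "paid" | "cancelled";
  totalCents: number;
}

let orderSequence = 0;

// Returns a fresh object on every call so tests never share mutable state.
function buildOrder(overrides: Partial<Order> = {}): Order {
  orderSequence += 1;
  return {
    id: `order-${orderSequence}`,
    userId: "user-1",
    status: "pending",
    totalCents: 1000,
    ...overrides, // a test states only the fields it cares about
  };
}
```

A test would then call `buildOrder({ status: "paid" })` and assert on the whole result with toEqual, rather than inlining a large object literal.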
Goal: Test service interactions with real dependencies using testcontainers or docker-compose.
Inputs to read:
- services/ database migrations, schemas, connection configs
- docs/architecture/ infrastructure requirements (which DBs, caches, brokers)

Rules:
- tests/integration/docker-compose.test.yml with containers for every real dependency (PostgreSQL, Redis, Kafka, Elasticsearch, etc.). Pin exact image versions.
- tests/integration/setup.ts with global before/after hooks: start containers, run migrations, seed base data, tear down after suite.

Output: Write test files to tests/integration/<service>/.
Write docker-compose.test.yml and setup.ts to tests/integration/.
Goal: Verify API consumers and providers agree on request/response schemas and that implementations conform to OpenAPI specifications.
Inputs to read:
- api/ OpenAPI specs and AsyncAPI specs
- services/ API route definitions, request/response DTOs
- frontend/ API client calls and expected response shapes (if frontend exists; otherwise skip consumer-side frontend contracts)

Rules:
- pact-broker.config.ts (even if the broker URL is a placeholder).

Output: Write contract tests to tests/contract/.
Goal: Test critical user flows end-to-end through the full stack.
Inputs to read:
- frontend/ pages and navigation flow (if frontend exists; otherwise API-only E2E)
- services/ API endpoints

Rules:
- tests/e2e/ui/pages/.
- data-testid attributes, ARIA roles; never CSS classes or DOM structure.
- (smoke.e2e.ts) that covers the absolute minimum "is the app alive" checks. This runs on every deploy.
- No sleep() calls.
- <Link> or client-side navigate() targets API routes, external URLs, or auth endpoints. These must use raw <a href> or window.location for full HTTP requests.

Output: Write E2E tests and page objects to tests/e2e/. Write Playwright or Cypress config.
Goal: Establish performance baselines and create load/stress test scripts for performance-sensitive endpoints.
Inputs to read:
- docs/architecture/ NFRs (latency targets, throughput requirements, SLOs)
- services/ API endpoints (especially high-traffic ones)

Rules:
- http_req_duration['p(95)'] < 500, http_req_failed < 0.01.
- (order_processing_time).

Output: Write k6 scripts to tests/performance/. Write baseline files to tests/performance/baselines/.
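The thresholds named in the rules can be written in k6's threshold-expression syntax, and a baseline file can be compared against a fresh run. In this sketch the `Baseline` shape, the 10% tolerance, and the `regressions` helper are assumptions for illustration; k6 itself only consumes the `thresholds` object.

```typescript
// Shared threshold definitions, in the string syntax k6 expects in `options.thresholds`.
const thresholds: Record<string, string[]> = {
  http_req_duration: ["p(95)<500"], // 95th percentile latency under 500 ms
  http_req_failed: ["rate<0.01"],   // fewer than 1% failed requests
};

// Hypothetical shape of a <scenario>.baseline.json entry (latencies in ms).
interface Baseline {
  p50: number;
  p95: number;
  p99: number;
}

// Returns the percentiles that got slower than the baseline by more than `tolerance`.
function regressions(measured: Baseline, baseline: Baseline, tolerance = 0.1): string[] {
  const percentiles: (keyof Baseline)[] = ["p50", "p95", "p99"];
  return percentiles.filter((p) => measured[p] > baseline[p] * (1 + tolerance));
}
```

Keeping thresholds in one shared object lets load, stress, and spike scripts import the same pass/fail criteria instead of redefining them per scenario.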
Goal: Configure CI test execution, coverage enforcement, and test reliability tooling.
Inputs to read:
Actions:
- tests/coverage/thresholds.json with per-service and global coverage gates:
  {
    "global": { "lines": 80, "branches": 75, "functions": 80, "statements": 80 },
    "services": {
      "<service-name>": { "lines": 85, "branches": 80, "functions": 85, "statements": 85 }
    }
  }
- .github/workflows/test.yml (or ci/test-config.yml) with:
- tests/fixtures/seed-data/seed-runner.ts.
- tests/fixtures/mocks/.

Output: Write CI config to .github/workflows/test.yml, coverage thresholds and test infrastructure to tests/.
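One way a CI step could enforce the coverage gates is sketched below. The `Gate` and `Thresholds` types mirror the JSON layout of thresholds.json; the `gateFor` and `failedMetrics` helpers and the service name `orders` in the usage note are hypothetical.

```typescript
interface Gate {
  lines: number;
  branches: number;
  functions: number;
  statements: number;
}

// Mirrors the layout of tests/coverage/thresholds.json.
interface Thresholds {
  global: Gate;
  services: Record<string, Gate>;
}

// A service without its own entry falls back to the global gate.
function gateFor(thresholds: Thresholds, service: string): Gate {
  return thresholds.services[service] ?? thresholds.global;
}

// Lists the metrics where measured coverage is below the gate.
function failedMetrics(actual: Gate, gate: Gate): string[] {
  const metrics: (keyof Gate)[] = ["lines", "branches", "functions", "statements"];
  return metrics.filter((m) => actual[m] < gate[m]);
}
```

With the example values above, a service named `orders` gated at 85/80/85/85 and measured at 82% lines would fail on `lines`, while the same numbers would pass the global 80% gate.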
| # | Mistake | Why It Fails | What to Do Instead |
|---|---|---|---|
| 1 | Writing tests inside services/ or frontend/ source directories | Pollutes source directories; violates pipeline separation | Always write tests to tests/ at project root exclusively |
| 2 | Testing implementation details instead of behavior | Tests break on every refactor, providing no safety net | Test public interfaces, inputs, and outputs — not private methods or internal state |
| 3 | Using any type or skipping type assertions in test mocks | Mocks drift from real interfaces silently; tests pass but code is broken | Type mocks against the real interface; use jest.Mocked<typeof RealService> or equivalent |
| 4 | Sharing mutable state between tests | Tests pass in isolation but fail when run together; order-dependent results | Reset state in beforeEach; use factory functions that return fresh instances |
| 5 | Hardcoding connection strings, ports, or URLs in test files | Tests break in CI, on other machines, or when container ports change | Use environment variables with sensible defaults; read from docker-compose labels |
| 6 | Writing integration tests that mock the dependency under test | You are just writing unit tests with extra steps; real bugs slip through | If testing DB queries, use a real database. If testing cache, use real Redis. Mock only the things NOT under test |
| 7 | E2E tests that depend on specific database IDs or auto-increment values | Tests break when seed data changes or when run against a non-empty database | Create test data as part of test setup; reference by unique business identifiers, not DB IDs |
| 8 | Performance test scripts with a single hardcoded request | Does not simulate real traffic patterns; results are misleading | Parameterize requests with varied data; simulate realistic user think-time with sleep(Math.random() * 3) |
| 9 | Coverage thresholds set to 100% | Encourages meaningless tests written just to hit the number; blocks legitimate PRs | Set realistic thresholds (80-85% lines, 75-80% branches); focus on critical path coverage |
| 10 | Ignoring test execution time | Slow test suites get skipped by developers; CI feedback loops become painful | Parallelize tests by service; keep unit suite under 60 seconds; keep integration suite under 5 minutes |
| 11 | Not testing error paths and failure modes | Happy-path-only tests miss the bugs that actually cause production incidents | For every success test, write at least one failure test: invalid input, timeout, auth failure, conflict |
| 12 | Writing E2E tests with sleep() for async waits | Flaky on slow CI runners; wastes time on fast ones | Use explicit wait-for conditions: poll for element visibility, API response, or DB state change |
| 13 | Contract tests that only check status codes | Schema changes, missing fields, and type mismatches go undetected | Validate full response body shape, field types, required fields, and enum values against the contract |
| 14 | No seed data strategy — each test creates its own world from scratch | Integration and E2E suites become extremely slow; redundant setup logic everywhere | Build a shared seed-data layer with factories and a seed runner; tests add only their unique data on top |
| 15 | Generating test files without reading the actual implementation first | Tests reference nonexistent functions, wrong parameter names, or incorrect module paths | Always read the source file before writing its test file; match imports, function signatures, and error types exactly |
| 16 | Auth E2E tests that only check "token returned" | Misses redirect bugs, callback misconfig, and infinite loops that only appear in the full browser flow | Test the complete journey: visit protected page → redirect to login → authenticate → land on original page with authenticated state |
| 17 | Not testing cross-system flows end-to-end | Payment tests that check "Stripe returns success" but never check "order status is updated and user sees confirmation" miss the integration point bugs | For every multi-system flow (auth, payment, webhook), trace from user action to final visible state |
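Mistake 12 recommends explicit wait-for conditions instead of sleep(). Outside a browser framework (Playwright and Cypress ship their own waiting primitives), a polling helper could be sketched as follows; `waitFor` and its default timings are illustrative assumptions.

```typescript
// Polls `condition` until it returns true or the timeout elapses.
async function waitFor(
  condition: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await condition()) {
      return; // condition met: continue immediately, no wasted time
    }
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    // Short pause between polls, not a fixed guess at how long the work takes.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A test would await something like `waitFor(() => orderStatusIs("paid"))`, where `orderStatusIs` is a hypothetical check against the API or database. The wait ends as soon as the state changes on a fast machine and fails with a clear error on a slow one.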
Before marking the skill as complete, verify:
- Claude-Production-Grade-Suite/qa-engineer/test-plan.md has a traceability matrix covering every BRD acceptance criterion
- services/ has corresponding unit tests in tests/unit/
- tests/integration/docker-compose.test.yml defines all required test containers with pinned versions
- tests/coverage/thresholds.json defines realistic per-service coverage gates
- .github/workflows/test.yml orchestrates all test stages with parallelization and artifact collection
- tests/fixtures/factories/ and reused across test types

This skill body has been adapted for QwenPaw. Differences vs the upstream Claude Code plugin to be aware of:
- No AskUserQuestion tool. When this skill says to surface a decision, render numbered options as plain Markdown and ask the user to type the option name. Parse free-text replies leniently.
- No Skill tool. Phase transitions happen in-line: read the next sub-skill body via read_file from the workspace skills/ dir.
- No subagent spawn. v0.1 is a single-agent flow. If the methodology says "delegate to specialist X", invoke X by reading its SKILL.md from skills/<name>/SKILL.md and following its instructions yourself.
- No TaskCreate/TaskList. Track progress by writing receipts to Claude-Production-Grade-Suite/.orchestrator/receipts/<task>-<role>.json and emitting a one-line status update in chat after each phase.
- WebSearch is tavily_search. Requires TAVILY_API_KEY. If unset, skip the Freshness Protocol and note it.
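Lenient parsing of a free-text reply to numbered options could work roughly like this. The `parseChoice` function and its matching rules (number, exact name, then unique partial match) are one possible interpretation, not a documented QwenPaw behavior.

```typescript
// Maps a free-text reply to one of the offered options, or null if ambiguous.
function parseChoice(reply: string, options: string[]): string | null {
  // Normalize case, surrounding whitespace, and trailing punctuation such as "2." or "2)".
  const normalized = reply.trim().toLowerCase().replace(/[.)\s]+$/, "");
  if (normalized === "") {
    return null;
  }
  // Accept a bare number or "option N" (1-based, as rendered to the user).
  const numberMatch = normalized.match(/^(?:option\s*)?(\d+)$/);
  if (numberMatch) {
    const index = Number(numberMatch[1]) - 1;
    return index >= 0 && index < options.length ? options[index] : null;
  }
  // Accept the option name, ignoring case.
  const exact = options.filter((o) => o.toLowerCase() === normalized);
  if (exact.length === 1) {
    return exact[0];
  }
  // Accept a fragment only when it identifies exactly one option.
  const partial = options.filter((o) => o.toLowerCase().indexOf(normalized) !== -1);
  return partial.length === 1 ? partial[0] : null;
}
```

Returning null for ambiguous or out-of-range replies gives the skill a clear signal to re-ask rather than guess.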