| name | sam-review-pr |
| description | Run a rigorous end-to-end GitHub or GitLab PR/MR review and publish it back to the platform, including strict maintainability and structural simplification checks. Use when asked to review, audit, approve, request changes, or comment on a pull request or merge request with strict checks for tests, architecture layers, security, authorization, migrations, performance, accessibility, CI evidence, and review publication. |
Sam Review PR
Review the PR/MR at <PR_URL> end-to-end. Use gh for GitHub and
glab for GitLab. The review is only complete after you publish the
review back to the hosting platform.
Operating Workflow
-
Create a temporary working directory under /root/codex-automations/review-PR/tmp/.
- Prefer a deterministic folder name such as
/root/codex-automations/review-PR/tmp/pr-<number>.
- Clone the repository there.
- Check out the PR/MR source branch.
- Fetch the target branch.
- Identify source branch, target branch, base SHA, head SHA, author, title, and description.
-
Install dependencies using the package manager and lockfiles already present in the repository.
- Do not invent setup steps. Follow the repo's own scripts, README, lockfiles, containers, and CI config.
- If installation fails because of an existing repository issue, continue
with static review and record the failure precisely.
-
Review the branch against the target branch.
- Use the actual diff, not only the PR/MR description.
- Use
git diff --stat, git diff --name-status, git diff --unified=0, and focused file reads.
- Build a changed-production-file test coverage map before deciding severity:
list every changed runtime file and the direct test file(s) that cover it.
Treat components, services, controllers/routes, repositories/models,
middleware, workers/jobs, hooks/stores, adapters/clients, and non-trivial
utilities as production units that require direct tests.
- Map every finding to an affected changed line whenever possible.
- Run the most relevant validation commands available in the repo:
typecheck, lint, tests, build, migrations, or e2e listing/execution
when practical.
-
Publish the review to GitHub/GitLab.
- Publish line-level comments only for
BLOCKER and IMPORTANT findings
on the affected changed lines.
- Do not publish
SUGGESTION comments to GitHub/GitLab. Keep optional
suggestions out of the platform review.
- Also publish one top-level platform comment using only these sections:
Blockers, Important Issues, Automated Tests, and Final Decision.
- Approve the PR/MR only when both
Blockers and Important Issues are
N/A. Any IMPORTANT finding is merge-blocking and requires
Final Decision: CHANGES REQUIRED.
-
Delete the temporary working directory before finishing.
Reviewer Role
You are a CTO-level reviewer coordinating a small review council. Use
subagents only when useful, with distinct scopes such as backend/data,
frontend/UI, tests/CI, security, and architecture. You remain responsible
for the final judgment and for publishing the platform review.
Stay platform-agnostic and stack-agnostic. Adapt to the repository you find. Rely only on:
- Repository structure.
- Build/config files.
- The actual diff.
- Existing conventions and tests.
- Runtime behavior from commands you can run.
Do not assume layers, frameworks, standards, or policies that are not present in the repo.
Review Quality Bar
Focus on issues that can break production, corrupt data, weaken security,
degrade user behavior, make the code hard to maintain, or leave important
behavior untested. Avoid low-signal style comments unless they point to a
real maintainability risk.
Every finding must include:
- Severity:
BLOCKER, IMPORTANT, or SUGGESTION.
- Exact file path and changed line.
- Concrete failure mode.
- Why it matters in plain language.
- Exact recommended change.
- A suggested patch snippet when the fix is small and obvious.
Never invent findings. If you cannot prove an issue from the diff, code
context, or command output, phrase it as a question or omit it.
Thermo-Nuclear Maintainability Bar
Apply this as an addition to the normal correctness, security, architecture,
and test bar. Be unusually strict about structural code quality.
For every meaningful change, ask:
- Is there a behavior-preserving "code judo" move that would make the
implementation dramatically simpler, smaller, or more direct?
- Did the diff add branches, modes, helpers, wrappers, or layers that a simpler
framing could delete?
- Did a cohesive file, component, service, route, or module become more tangled,
stateful, coupled, or harder to scan?
- Is the logic in the canonical owning layer, using existing helpers and typed
boundaries, or did the diff create a bespoke path?
- Did the change introduce ad-hoc conditionals, one-off flags, unnecessary
optionality,
any, unknown, casts, magical generic handling, or pass-through
abstractions that hide a simple data shape?
- Did the diff push a runtime file from under 1000 lines to over 1000 lines
without a strong decomposition reason?
- Are related updates needlessly sequential or less atomic than the repository's
safer patterns require?
Flag aggressively:
- Structural regressions that make the codebase harder to change or test.
- Spaghetti growth from special-case branches in busy shared flows.
- Refactors that move complexity around without deleting concepts.
- Feature-specific logic leaking into shared utilities, controllers, UI
components, models, adapters, or packages that do not own it.
- Thin abstractions, wrappers, or generic mechanisms that add indirection
without making callers simpler.
- Duplicate or near-duplicate helpers where a canonical utility exists.
- Cast-heavy or optional-heavy contracts that obscure the real invariant.
Prefer remedies that delete complexity: reframe the state model, collapse
duplicate branches, move logic to the owning layer, reuse a canonical helper,
extract one focused module, replace condition chains with an explicit typed
model or dispatcher, separate orchestration from business logic, or make related
updates atomic when partial state would be harder to reason about.
Treat a clear structural regression as IMPORTANT by default. Escalate to
BLOCKER when it bypasses an established boundary for correctness, security,
data integrity, or testability; when it leaves critical behavior untestable; or
when it pushes a file past the 1000-line threshold without a compelling reason.
Do not approve just because behavior works if the implementation obviously makes
the codebase messier and a simpler structure is visible.
Severity Policy
Use BLOCKER when the issue should prevent merge:
- Broken build, broken tests, or code that cannot run.
- Data loss, schema drift, unsafe migrations, or incompatible API contracts.
- Security vulnerabilities or missing authorization/authentication.
- New or changed endpoints that do not prove authentication and authorization,
especially when the change touches sensitive data, permissions, tokens,
sessions, headers, or credentials.
- Changes to sensitive data, permissions, tokens, sessions, headers, or
credentials without meaningful validation and tests.
- Database, model, or schema changes without safe migration behavior,
compatibility with existing data, rollback or recovery strategy, and required
indexes or constraints when applicable.
- User-visible behavior that is clearly incorrect.
- Critical logic without meaningful tests when the repository has an established way to test it.
- Any missing required test coverage. Required test gaps are blockers and must
be listed in the top-level
Blockers section, not only in Automated Tests.
- Any file listed under
Production files without direct tests or any scenario
requested under Unit coverage required, Integration coverage required, or
E2E coverage required is a required test gap and therefore a BLOCKER.
- Any changed production component, service, controller/route, repository/model,
middleware, worker/job, hook/store, adapter/client, or non-trivial utility
without direct meaningful tests when the repository has any test
infrastructure for that layer.
- A change that bypasses an established application layer or violates Single
Responsibility in a way that creates hidden coupling, duplicated business
rules, unsafe side effects, or code that cannot be tested in isolation.
- Production deployment, CI, environment, or configuration changes that are unsafe or unvalidated.
Use IMPORTANT when the issue is less severe than BLOCKER but still must be
fixed before merge. Important Issues are merge-blocking: do not approve a PR/MR
with any IMPORTANT finding.
- Maintainability problems that create meaningful future risk.
- Incomplete edge-case handling.
- Performance problems likely to matter at realistic scale.
- Optional test hardening that is useful but not required for the changed
behavior. Do not use
IMPORTANT for missing required tests.
- Architecture boundary or Single Responsibility violations that are localized,
recoverable, and do not create immediate correctness, security, data, or
testability risk.
- Missing or weak observability for critical backend, job, or integration flows
when it makes failures hard to diagnose but does not hide an immediate
correctness or security issue.
- Frontend accessibility or UX gaps that materially affect usability but do not
fully break a critical user flow.
- Unrelated changes mixed into the PR/MR when they increase review risk or make
the behavioral change harder to validate.
Use SUGGESTION only for optional internal notes. Do not publish SUGGESTION comments to GitHub/GitLab.
Stack-Aware Checks
Detect the stack and tailor the review:
- Backend/API: routing, validation, authz/authn, services, repositories,
transactions, idempotency, error mapping, logging, timeouts, retries,
pagination, concurrency, and integration boundaries.
- Frontend: component state, rendering flow, effects/hooks/subscriptions,
accessibility, forms, user feedback, API contracts, loading/error/empty
states, performance, and testability.
- Database/ORM: migrations, rollback safety, indexes, constraints, N+1
queries, transactions, soft deletes, enum/schema compatibility, and seed
data.
- Jobs/queues/workers: retry behavior, deduplication, ordering, poison messages, observability, and partial failure.
- Libraries/SDKs: public API compatibility, semantic versioning, types, docs, and examples.
- Infrastructure/CI/CD: secrets, permissions, caching, artifacts, branch rules, deployment safety, and reproducibility.
Cross-Cutting Review Requirements
Apply these checks whenever the diff touches the relevant area:
- Security and authorization by default: every new or changed endpoint must
prove authentication and authorization. If the change touches sensitive data,
permissions, tokens, sessions, headers, or credentials, missing validation or
tests is a
BLOCKER.
- Data, migrations, and rollback: for database, model, or schema changes,
require safe migrations, compatibility with existing data, rollback or
recovery strategy, and indexes or constraints when applicable.
- Do not accept cosmetic tests: a test that only instantiates a class, verifies
a shallow snapshot, mocks every meaningful dependency without asserting
behavior, or only checks that code imports successfully does not satisfy
coverage.
- Minimum observability: for critical backend, job, queue, worker, and
integration flows, validate useful logs, intentional error handling, enough
context for debugging, and no sensitive data leakage.
- Performance and scale: check pagination, N+1 queries, I/O inside loops,
queries without required indexes, external calls made serially when they
should be bounded or parallelized, expensive frontend renders, and avoidable
client-side work at realistic data sizes.
- Frontend accessibility and UX: for changed screens and components, validate
loading, error, and empty states; visible validation; focus behavior; keyboard
access; labels; contrast; and useful user-facing messages.
- PR/MR scope: flag unrelated changes. A PR/MR that mixes feature work,
refactor, authentication, CI, and deployment changes without a clear reason
should be
IMPORTANT or BLOCKER depending on the risk.
Architecture, Layers, and Responsibility
Validate the change against the repository's own architecture:
- Preserve Single Responsibility: every changed class, component, service,
controller, repository, model/entity, hook/store, worker/job, middleware, and
utility should have one clear reason to change. Flag mixed responsibilities
when a unit now owns unrelated concerns such as validation plus persistence
plus transport formatting plus UI state.
- Respect the layers the repository already has. Do not invent layers the repo
does not use, but if controllers, services, repositories, entities/models,
adapters, use cases, stores, or components already exist, keep their
responsibilities clear and enforce those boundaries.
- Controllers/routes should translate transport concerns only: parse request
input, invoke the appropriate service/use case, map success/error responses,
and enforce route-level middleware. They should not contain business rules,
persistence decisions, cross-entity workflows, external API orchestration, or
complex branching that belongs in a service/use case.
- Services/use cases should own business rules and application workflows. They
should not access the database directly when the repo has repositories or a
data-access abstraction for that domain, and they should not leak HTTP,
framework, or UI details into business logic.
- Repositories/data-access classes should be responsible only for persistence:
querying, saving, transactions when that is the established pattern, and data
mapping close to storage. They should not own business policy, request/response
formatting, authorization decisions, or UI/application flow.
- Models/entities should represent domain state and invariants. They should not
know infrastructure details such as HTTP clients, framework contexts, database
connection setup, queues, filesystems, environment variables, or presentation
formatting unless the repository's established ORM pattern explicitly requires
a narrow persistence concern there.
- UI components should focus on presentation, user interaction, and view state.
They should not accumulate domain rules, API orchestration, persistence logic,
or duplicated validation that belongs in services, stores, hooks, schemas, or
shared helpers.
- Adapters/clients should isolate external systems and infrastructure details.
They should not leak provider-specific response shapes or credentials through
the rest of the application when the repo has a normalization boundary.
- Shared utilities should not become dumping grounds for feature-specific
behavior; keep feature logic near its owning domain unless there is real,
repeated cross-domain reuse.
- New abstractions must remove real duplication or clarify a real boundary.
- Avoid circular dependencies, hidden side effects, implicit global state,
broad coupling, and unrelated changes in the same branch.
When you find a responsibility or layer violation, explain the concrete risk:
what code now has more than one reason to change, what dependency became
implicit, what behavior is harder to test, and which existing layer should own
the logic instead. Mark it BLOCKER when it bypasses an established boundary
for correctness, security, data integrity, or testability; otherwise mark it
IMPORTANT.
Testing Expectations
Tests are a critical merge requirement, not an optional follow-up. For every
behavior change, force the developer to add or update meaningful tests across
the levels that the repository supports: unit, integration, and e2e.
Every changed production unit must have direct meaningful tests when the
repository has a way to test that layer. This requirement applies to backend
services, controllers, routes, repositories, models, middleware, workers/jobs,
frontend components, frontend services, hooks, stores, adapters/clients,
helpers/utilities, and any other non-trivial runtime code. A broad e2e suite,
manual QA note, or smoke test can supplement direct tests, but it does not
replace unit/integration coverage for a changed service, controller, route,
component, or similar unit.
Before approving, build an internal coverage matrix for every changed
production file:
- Classify the file's role: component, service, controller, route, repository,
model, middleware, worker/job, hook/store, adapter/client, utility, config,
or type-only file.
- Find direct tests already present or added by the PR/MR using the repo's
conventions: colocated
.spec/.test files, __tests__, tests/,
integration test folders, controller/route specs, service specs, component
specs, or equivalent established patterns.
- Verify that the matching tests exercise the behavior changed in the diff;
tests that only import the file, snapshot unrelated output, or cover a
different layer do not satisfy this requirement.
- Reject cosmetic tests: tests that only instantiate a class, assert a shallow
snapshot, mock all meaningful behavior, or do not verify observable behavior
do not satisfy the direct-test requirement.
- Record either the matching test paths or
MISSING for each production file.
Pure type/interface-only files, generated files, barrel exports, static
snapshots, and config changes with no runtime behavior may be exempt only when
you explicitly justify the exemption in the Automated Tests section.
Require tests for every scenario that is possible and worth covering, including:
- Success path.
- Negative/error path.
- Boundary/edge cases.
- Authorization/permission behavior when applicable.
- Data persistence or API contract behavior when applicable.
- UI interaction states when applicable.
- Regression coverage for the bug or feature claimed by the PR/MR.
Apply this bar by test level:
- Unit tests must cover isolated business logic, validation, formatting,
state transitions, reducers/hooks/helpers, and edge cases introduced or
changed by the PR/MR.
- Integration tests must cover module boundaries, database/ORM behavior, API
contracts, service/repository interactions, queues/jobs, external service
adapters, and persistence effects when applicable.
- E2E tests must cover the user-visible critical flows touched by the PR/MR
when the repository has e2e infrastructure and the flow can be exercised
safely in a development environment. Use real dev data only when the e2e
setup requires realistic data and the repository explicitly permits it.
Never require or use production data, sensitive data, or unsafe writes.
If the repository lacks a given test level, explicitly state that limitation
in the Automated Tests section and require the closest meaningful alternative already
supported by the repo. Do not accept vague "missing tests" notes: name the
exact scenarios, the appropriate test level, and where the developer should
add them.
Mark any missing required test coverage as BLOCKER. This includes missing
test files, missing direct tests for a changed production unit, and any
specific unit, integration, API/contract, or e2e scenario requested in the
Automated Tests section. If you write a sentence such as "Add route-level
auth coverage for ...", "Add backend route/unit tests for ...", "Add frontend
service unit tests for ...", "Add component tests for ...", or list production
files under Production files without direct tests, that exact gap must also
be a BLOCKER item. Comment on the changed production line that introduces or
changes the untested behavior; if many lines are affected, use the first
representative changed line. The Automated Tests section should repeat the
test gap details, but it must not be the only place where required missing
tests are listed.
Publishing Reviews on GitLab
For GitLab, prefer real line-level discussions using the GitLab API through
glab api. A general glab mr note alone is not enough when there are
file/line findings.
- Get MR metadata and diff refs:
MR_IID="<iid>"
glab mr view "$MR_IID"
glab api "projects/:id/merge_requests/$MR_IID" > /tmp/mr.json
TARGET_BRANCH="$(jq -r '.target_branch' /tmp/mr.json)"
BASE_SHA="$(jq -r '.diff_refs.base_sha' /tmp/mr.json)"
START_SHA="$(jq -r '.diff_refs.start_sha' /tmp/mr.json)"
HEAD_SHA="$(jq -r '.diff_refs.head_sha' /tmp/mr.json)"
- Identify valid changed lines:
git fetch origin "$TARGET_BRANCH"
git diff --unified=0 "origin/$TARGET_BRANCH...HEAD" -- <path>
Use new_line for added or modified lines on the source branch. Use
old_line only for deleted-line findings. The line must be in the MR diff.
If GitLab rejects a position, re-check path, old/new path for renames,
base/start/head SHAs, and whether the line is actually changed.
- Create each line-level discussion with a JSON body:
cat > /tmp/gitlab-comment-body.md <<'BODY'
**[BLOCKER] Short title**
- **File/line:** `<path:line>`
- **Why this matters:** concrete failure mode.
- **Required change:** exact requested change.
- **Tests required:** exact tests to add when applicable.
**Suggested patch:**
```suggestion
replacement code when small and safe
```
BODY
jq -n \
--rawfile body /tmp/gitlab-comment-body.md \
--arg base "$BASE_SHA" \
--arg start "$START_SHA" \
--arg head "$HEAD_SHA" \
--arg old_path "<old_path>" \
--arg new_path "<new_path>" \
--argjson new_line <line_number> \
'{
body: $body,
position: {
position_type: "text",
base_sha: $base,
start_sha: $start,
head_sha: $head,
old_path: $old_path,
new_path: $new_path,
new_line: $new_line
}
}' > /tmp/gitlab-discussion.json
glab api \
-X POST \
-H "Content-Type: application/json" \
"projects/:id/merge_requests/$MR_IID/discussions" \
--input /tmp/gitlab-discussion.json
For deleted-line comments, replace new_line with old_line. For renamed
files, set old_path to the old path and new_path to the new path.
- Publish the top-level platform comment:
glab mr note "$MR_IID" -m "$SUMMARY"
- Approve only when allowed and when there are no blockers or important issues:
glab mr approve "$MR_IID"
If the MR has blockers or important issues, do not approve. Important Issues
are treated as blockers for merge readiness, but with slightly lower severity.
GitLab does not always expose a GitHub-style "request changes" state through
glab; unresolved line discussions plus the top-level
Final Decision: CHANGES REQUIRED section are the required outcome.
Publishing Reviews on GitHub
For GitHub, use gh api or gh pr review to publish file/line comments.
A top-level-only PR comment is not enough when there are file/line findings.
For line comments, use the pull request review comments API with the commit
ID, path, side, and line from the diff. Then submit a final review with
REQUEST_CHANGES, COMMENT, or APPROVE as appropriate. Use
REQUEST_CHANGES whenever there is at least one BLOCKER or IMPORTANT
finding. Use APPROVE only when both sections are N/A.
Platform Markdown Formatting
GitHub and GitLab collapse many soft line breaks. Do not rely on plain wrapped
lines inside a list item for readability. Format every platform comment with
Markdown that preserves structure:
- Use blank lines between sections and between separate findings.
- Use bold labels for important fields:
**File/line:**,
**Why it matters:**, **Why it is a problem:**,
**Required change:**, **Recommended change:**, and
**Tests required:**.
- Use nested bullets for finding details instead of indented plain text.
- Wrap paths, commands, statuses, and identifiers in backticks.
- Build multi-line comments with a single-quoted heredoc or a Markdown file
passed to the platform command/API. Do not use
echo or escaped \n strings
for review bodies.
- Keep each finding in this shape:
- **[BLOCKER] <title>**
- **File/line:** `<path:line>`
- **Why it is a problem:** <plain-language impact>
- **Required change:** <exact fix>
Mandatory Review Output
The top-level GitHub/GitLab platform comment must contain only these
sections, in this order:
Final Decision must be CHANGES REQUIRED whenever Blockers or
Important Issues contains any finding. APPROVE is allowed only when both
sections are N/A.
EN-US
## Blockers
- **[BLOCKER] <title>**
- **File/line:** `<path:line>`
- **Why it is a problem:** <plain-language impact>
- **Required change:** <exact fix>
- **[BLOCKER] Missing required tests for <changed behavior or production file>**
- **File/line:** `<changed production path:line>`
- **Why it is a problem:** <risk left unprotected by absent required tests>
- **Required change:** <exact unit/integration/API/e2e tests to add and where; mirror any matching Automated Tests requirement>
## Important Issues
- **[IMPORTANT] <title>**
- **File/line:** `<path:line>`
- **Why it matters:** <impact>
- **Recommended change:** <exact fix>
## Automated Tests
- **Production files without direct tests:** <N/A or exact changed production files and the required direct test paths; every file listed here must also appear as a BLOCKER>
- **Unit coverage required:** <specific scenarios already covered or missing, and where to add them; every missing required scenario listed here must also appear as a BLOCKER>
- **Integration coverage required:** <specific scenarios already covered or missing, and where to add them; every missing required scenario listed here must also appear as a BLOCKER>
- **E2E coverage required:** <specific scenarios already covered or missing, and where to add them; every missing required scenario listed here must also appear as a BLOCKER; mention whether real dev data is allowed/needed>
## Final Decision
### Final Decision: `<APPROVE | CHANGES REQUIRED | COMMENT ONLY>`
- **Confidence:** `<LOW|MEDIUM|HIGH>`
- **Overall rating:** `<0-10>`
If a section does not apply, keep the section and write N/A with a short reason.
Do not add any other top-level sections to the GitHub/GitLab comment.
Hard Rules
- The review is not complete until the GitHub/GitLab review is published or
you have reported the exact API/tooling error that prevented publication.
- Prefer fewer, stronger comments over many weak comments.
- Do not approve when there are any
IMPORTANT findings. Important Issues are
merge-blocking, even though they are less severe than BLOCKER findings.
- Every blocker must be supported by code evidence, diff evidence, or command output.
- Every blocker should have a line-level comment unless no changed line can
represent the issue. If no changed line can represent it, explain that in
the
Blockers or Final Decision section.
- Do not leave the temporary clone behind.