| name | review |
| description | Structured PR review for pygraphistry. Input: PR number/branch (default current branch PR).
Output: findings and convergence artifacts under plans/<task>/.
Method: multi-wave, evidence-first review across spec, correctness, tests, security,
code quality, DRY, concurrency, performance, architecture, operability, and conventions.
|
PR Review (pygraphistry)
Invocation
/review [<PR-number-or-branch>] [mode=findings|pr-comments|both] [fixes=deferred|inline]
Defaults:
- Target: current branch PR (
gh pr view --json number,headRefName,baseRefName,url)
mode=findings
fixes=deferred
mode:
findings: local artifacts only under plans/<task>/
pr-comments: draft locally, post only after explicit user confirmation in the same session
both: run findings then comment flow
fixes:
deferred: read-only review
inline: after each converged wave, apply confirmed BLOCKER/IMPORTANT fixes in separate commits
Runtime Assumptions
- Run from repo root.
gh is authenticated (gh auth status).
- Local branch reflects PR head (
origin/<base>...HEAD matches intended review scope).
plans/ is local working memory and normally gitignored.
Plan-First Requirement
Always use the plan skill flow:
- If
plans/<task>/plan.md exists, reuse it and append a review section.
- Else reload
.agents/skills/plan/SKILL.md and create plans/<task>/plan.md.
- Record PR metadata,
mode, fixes, and timestamp.
- Reload plan before every step; update plan immediately after every step.
<task>: prefer review-pr-<N> or <branch>-review.
Phase 0: Resolve Scope + Stack Context
- Resolve PR context (number/title/url/head/base).
- Record stack context:
gh pr view <PR> --json baseRefName,headRefName,title,body
gh pr list --base <headRefName>
- If stacked, explicitly mark out-of-scope upstream/downstream work in
plan.md.
- Set diff range reference:
origin/<base>...HEAD.
Phase 1: Research Criteria Before Findings
Create plans/<task>/research/ with:
context.md
policies.md
credentials.md
canvas-<dimension>.md (only for dimensions that apply)
1a) Collect context + changed files
gh pr view <PR> --json number,title,headRefName,baseRefName,url,body
git fetch origin <baseRefName> <headRefName>
git diff --name-only origin/<base>...HEAD
git log --oneline origin/<base>..HEAD
Record linked issue/spec refs from PR body and summarize PR intent.
1b) Discover applicable repo rules
Always inspect:
AGENTS.md, DEVELOP.md, ARCHITECTURE.md, CONTRIBUTING.md, README.md, CHANGELOG.md
docs/source/** relevant to changed areas
- CI/workflow context under
.github/workflows/ when checks/publish behavior are touched
- Tooling configs (
pyproject.toml, mypy.ini, pytest.ini) and helper scripts in bin/
Walk up from each changed file to repo root and include nearby .md guidance.
1c) Credentials gate (always, early)
Run before wave analysis:
git diff origin/<base>...HEAD -- '*.env*' 'custom.env*' 'docker-compose*.y*ml' '*.conf' '*.config.*' | \
grep -iE '(api_key|secret|token|password|bearer|authorization|aws_|azure_|openai_|anthropic_).{0,4}=' || \
echo "[clean] no obvious credential strings"
git diff origin/<base>...HEAD | \
grep -oE '(sk-[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16}|Bearer\s+[A-Za-z0-9._-]{20,})' | head
If any likely secret is found, raise BLOCKER and stop until surfaced.
Phase 2: Multi-Wave Review Loop
Run waves until 2 consecutive waves show no significant advance.
Severity:
BLOCKER: merge must not proceed
IMPORTANT: should fix before merge
SUGGESTION: non-blocking improvement
Suggested folder layout:
plans/<task>/waves/wave-<N>/
<dimension>/findings-<file-slug>.md
<dimension>/report.md
adversarial/<finding-id>.md
adversarial/report.md
wave-report.md
Wave gate (start of each wave)
- Reload
.agents/skills/plan/SKILL.md
- Reload
plans/<task>/plan.md
- Confirm current diff SHA and wave number
- If prior wave had substantive findings, perform both:
- targeted amplification pass on prior findings/fixes
- clean-slate full pass on current diff
Dimensions
Apply only relevant dimensions per PR:
- Spec conformance
- Correctness
- Testing
- Security
- Code quality
- DRY / reuse
- Concurrency / parallelism
- Performance
- Architecture
- Operability
- Repo conventions
Guidance:
- Keep dimensions independent (avoid blended "general review" prompts).
- For file-heavy diffs, parallelize by
(dimension, file) and aggregate.
- Verify pre-existing patterns are not misreported as regressions:
git show origin/<base>:<file>
Pygraphistry-specific review checks
- Test coverage should mirror changed areas in
graphistry/tests/**.
- Prefer behavioral tests over implementation-detail assertions.
- Treat broad exception catches (
except Exception, bare except) as a bad dynamic/over-defensive typing pattern by default. Require narrowed, documented exception types at local helpers; allow broad catches only at explicit isolation boundaries where the PR explains why programmer errors must be contained.
- Control-flow checks (runtime code, tests, and prompt-routing logic) must use structured signals (for example
code, context keys like field/value, AST/symbol kinds, and stable symbol-binding metadata/files) and never hardcoded message-substring matching.
- Run focused validation before escalating severity when feasible:
python -m pytest -q [targeted_test]
./bin/ruff.sh [changed_path_or_pkg]
./bin/typecheck.sh [changed_path_or_pkg]
- For GPU-affecting PRs, require GPU-path validation evidence:
- Local GPU path:
cd docker && ./test-gpu-local.sh [targeted_test_or_path]
- No local GPU available: run equivalent GPU validation on
dgx-spark and record exact command + output artifact path in wave evidence.
- For RAPIDS/cuDF changes, prefer dual-version validation (
RAPIDS_VERSION=25.02 and 26.02) and include at least one amplified pass beyond early-stop defaults (for example, avoid relying only on --maxfail=1 harness behavior when triaging regression surface).
- When shared GPU pressure blocks full-matrix execution, require explicit evidence of the constrained condition (for example
nvidia-smi + failing stack site), then run targeted amplified subsets and document exactly which tests were excluded and why.
- If startup/runtime claims are made, verify entrypoints/scripts in
bin/ and workflow behavior.
- For docs-only PRs, prioritize spec/documentation accuracy and navigability (toctree links, anchors, cross-refs).
Vectorization & engine compatibility (GFQL / row pipeline / compute)
GFQL is vectorization-first and pure-functional. The row pipeline runs on pandas + cuDF; per-row Python loops are a pandas perf cliff and a cuDF break, and in-place mutation creates aliasing surprises. Audit edits in graphistry/compute/** (esp. gfql/**, plotter/**) for the patterns below.
Engine-polymorphic helpers (use these instead of pandas-only APIs in compute paths):
graphistry.Engine.df_cons(engine), df_concat(engine), df_to_engine(df, engine), safe_merge, resolve_engine(arg, df)
graphistry.compute.dataframe_utils.template_df_cons(template_df, data)
- Type DataFrame params as
DataFrameT (graphistry.compute.typing), not pd.DataFrame.
Vectorization — flag (BLOCKER on hot rows / IMPORTANT elsewhere unless noted):
| Pattern | Fix |
|---|
df.apply(fn, axis=1), iterrows(), itertuples() | Vectorized ops on Series |
for x in df[col] building a Series | Series.map / numpy / mask |
sum(s) / max(s) / etc. on a Series (SUGGESTION) | s.sum() / s.max() |
df[col][i] scalar access in a loop | Same vectorization fix; cuDF rejects this |
Mutation — pygraphistry is pure-functional; flag IMPORTANT (BLOCKER if cell-wise in a loop):
| Pattern | Pure alternative |
|---|
df[col] = v / df.col = v | df.assign(col=v) |
df.loc[mask, col] = v | df.assign(col=df[col].where(~mask, v)) |
df.loc[i,c]= / df.iloc[]= / df.at[]= in a loop | Build column vectorized + df.assign(...) |
inplace=True (drop/rename/sort/fillna/...) | Drop kwarg; assign return |
del df[col] | df.drop(columns=[col]) |
df.append(...) (deprecated; not in cuDF) | df_concat(engine)([df, other]) |
| Mutating then returning the input df | Return a new df; no observable side effects |
Vectorized mutation (df.loc[mask, col] = v) is still mutation — prefer df.assign(...).
cuDF compatibility — flag IMPORTANT unless noted:
| Pattern | Fix |
|---|
df.apply(fn, axis=1) (BLOCKER on cuDF lanes) | Vectorize |
.to_pandas() → pandas op → back | Round-trip = missing vectorization |
is pd.NA / == pd.NA | s.isna() |
dtype == "<pandas-name>" (SUGGESTION) | pd.api.types.is_*_dtype or engine-aware helper |
Param typed pd.DataFrame instead of DataFrameT | Use DataFrameT |
Hot row path = row pipeline executor, edge/node materialization, anything called per-query in _execute_* / _compile_* / _lower_* / row-pipeline ops. Control plane = one-shot config builders, error formatters, parser glue (lower bar).
Paired cuDF coverage required for changes in compute/gfql/row/, compute/gfql/cypher/, compute/gfql_unified.py, compute/chain.py, compute/hop.py, compute/materialize_nodes.py. Sibling pattern: pytest.importorskip("cudf") + engine-parametrized fixture. New DataFrame-touching helpers also need cuDF smoke if on a hot path.
Flag wording:
[BLOCKER] <file>:<line> — <pattern> on hot row path. Use <engine-polymorphic alt>. See agents/skills/review/SKILL.md#vectorization--engine-compatibility-gfql--row-pipeline--compute.
Don't flag: apply(axis=0) (column-wise = vectorized); iterrows() in CLI/test fixtures; mutation of a df the function itself just constructed (no aliasing — SUGGESTION at most); inplace=True in throwaway setup code.
Pre-existing patterns: verify novelty via git show origin/<base>:<file> before raising. Don't re-flag repo debt.
Per-file findings format
## <file>
### Findings
- [BLOCKER] <description> — <file>:<line>
- [IMPORTANT] <description> — <file>:<line>
- [SUGGESTION] <description> — <file>:<line>
### Evidence
- <diff/code/test evidence with file:line references>
Adversarial pass (required)
For each finding, attempt disproof and write verdict:
CONFIRMED
DOWNGRADED
REJECTED
Only CONFIRMED and DOWNGRADED findings carry forward.
Each adversarial file should include:
- original claim
- disprove attempt
- verdict
- concrete proof (
file:line, diff excerpt, or test output)
Wave summary
Each wave writes wave-report.md with:
- inputs (diff SHA/range, dimensions, files)
- post-adversarial counts by severity
- fixes applied (if
fixes=inline)
- finding resolution ledger (
FIXED, UNFIXED, DEFERRED)
- convergence signal
Phase 3: Convergence Report
Write plans/<task>/final-report.md with:
- summary (waves run, convergence status)
- per-wave table (new findings + advance signal)
- resolution matrix (status of non-rejected findings)
- sections:
Blockers, Important, Suggestions, Rejected/False Positives, Methodology
Phase 4: Output Behavior
findings mode:
- Keep full artifacts in
plans/<task>/
- Provide concise terminal summary with:
- target + mode + fixes policy
- waves + convergence status
- per-wave counts
FIXED vs UNFIXED/DEFERRED
- path to
final-report.md
pr-comments mode:
- Draft local comments in
plans/<task>/comment-drafts/
- Ask for explicit confirmation
- Post inline + top-level comments with
gh
both mode:
- Complete findings artifacts first, then comment flow.
Guardrails
fixes=deferred: read-only; do not edit source files.
fixes=inline: edit only files in PR diff and only for confirmed BLOCKER/IMPORTANT findings.
- Never skip lint/tests when applying inline fixes.
- Never force-push from review flow.
- Never post PR comments without explicit user confirmation.
- Always run credentials scan in Phase 1 before waves.
- Prefer exact file/line evidence over broad statements.
- If uncertain whether issue is new, compare against
origin/<base> before filing.