| name | code-review |
| description | Guides AI-assisted code review of SCT pull requests. Use when reviewing a PR, checking a diff for correctness, evaluating method signature changes across class hierarchies, verifying override compatibility, checking import conventions, error handling patterns, backend impact, test coverage, or provision label requirements. Covers inheritance safety, polymorphic method audits, and SCT-specific review criteria. |
Code Review for SCT
Review SCT pull requests for correctness, safety, and convention compliance.
Essential Principles
Override Safety Is the Highest-Priority Check
When a method signature changes on a base class, every subclass override MUST be updated.
SCT uses deep class hierarchies (e.g., BaseCluster -> AWSCluster -> MonitorSetEKS). A signature change in a parent class that isn't propagated to all overrides causes a runtime TypeError — invisible to linting, invisible in the diff, caught only in production. This is the single most dangerous class of bug in SCT because it passes all static checks and all existing tests.
Real incident: PR #13445 added ami_id to AWSCluster._create_instances() but missed MonitorSetEKS._create_instances() in cluster_k8s/eks.py. The bug survived multiple review rounds and broke EKS monitor provisioning at runtime (PR #14113 fixed it).
Why this must be check #1: LLMs process diffs, not the full class hierarchy. Without explicit instruction to search for overrides outside the diff, no reviewer — human or AI — will find them.
Look Beyond the Diff
The diff shows what changed, not what broke.
Many SCT bugs come from code that was NOT modified but should have been. When reviewing:
- A method signature change -> search for ALL overrides
- A new parameter threaded through a call chain -> verify every link in the chain
- A configuration option added -> check
defaults/ for missing default values
- A backend-specific change -> check if other backends need the same change
Convention Violations Are Bugs
SCT enforces strict conventions because inconsistency causes real failures in a multi-backend test framework.
Import ordering, error handling patterns, test style — these aren't style preferences. Inline imports cause circular dependency failures in production. unittest.TestCase breaks pytest fixture injection. Wrong provision labels mean backend regressions go untested.
Review What's Missing, Not Just What's Present
The most valuable review comment is about code that should exist but doesn't.
Ask: Is there a test for this change? Is there a default value for this config option? Are there other backends that need the same fix? Is the docstring updated? Missing code is harder to spot than wrong code, and it's where AI reviewers add the most value.
When to Use
- Reviewing a PR diff for SCT repository
- Checking whether a method signature change breaks subclass overrides
- Verifying import conventions in new or modified Python files
- Assessing backend impact of infrastructure code changes
- Evaluating whether a PR needs provision test labels
- Checking for missing tests, missing defaults, or missing docstrings
- Reviewing nemesis operations or cluster configuration changes
When NOT to Use
- Writing new code from scratch (use relevant implementation skills)
- Writing unit tests (use
writing-unit-tests skill)
- Writing integration tests (use
writing-integration-tests skill)
- Creating implementation plans (use
writing-plans skill)
- Fixing backport conflicts (use
fix-backport-conflicts skill)
Review Checks (Quick Reference)
Detailed checklist with examples in review-checklist.md.
Real incident catalog in common-issues.md.
Check 1: Override & Inheritance Safety
Trigger: Any PR that adds, removes, or changes parameters on a method definition.
- Search for ALL overrides:
grep -rn "def <method_name>" sdcm/ unit_tests/
- For each override NOT in the diff -> FLAG as potential breakage
- Verify new params are accepted AND forwarded to
super()
- Check test stubs in
unit_tests/ that mock or override the same method
High-risk polymorphic methods (sorted by override count):
| Method | Overrides | Key files to check |
|---|
add_nodes | 18+ | All cluster_*.py, cluster_k8s/, unit_tests/ stubs |
_create_instances | 5+ | cluster_aws.py, cluster_gce.py, cluster_azure.py, cluster_oci.py, cluster_k8s/eks.py |
_create_on_demand_instances | 2+ | cluster_aws.py subclasses |
_create_spot_instances | 2+ | cluster_aws.py subclasses |
wait_for_init | 5+ | All backend cluster files |
destroy | 5+ | All backend cluster and node files |
Check 2: Import Conventions
Trigger: Any new or modified import statement.
- All imports at top of file, never inside functions
- Three groups separated by blank lines: stdlib, third-party, internal
- Alphabetically sorted within each group
- Only exception: cyclic dependency with explanatory comment
Check 3: Circular Import Prevention
Trigger: Any PR that adds new import statements between sdcm/ subpackages, or modifies __init__.py files.
Circular imports in SCT cause hard-to-diagnose ImportError or AttributeError at module load time. They are invisible to linting and only surface at runtime.
Detection patterns:
- Module A imports from module B's
__init__.py, and B (transitively) imports from A
__init__.py files that re-export symbols from submodules — importing the package triggers all submodule imports
- Inline/deferred imports with comments like
# avoid circular import — these are workarounds, not fixes
What to flag:
- New imports added to
__init__.py files (especially in packages like sdcm/monitorstack/, sdcm/sct_events/, sdcm/utils/)
- New cross-package imports between modules that are known to have tight coupling (e.g.,
logcollector ↔ monitorstack, cluster ↔ provision)
- Deferred imports introduced as a "fix" — the real fix is restructuring the dependency graph
Resolution approach:
- Extract shared constants/types into a leaf module (e.g.,
constants.py)
- Move functions that create the cycle into a separate submodule
- Keep
__init__.py empty — callers import directly from submodules
- Verify with:
python -c "from sdcm.<module> import <symbol>" for both sides of the cycle
Real incident: sdcm/monitorstack/__init__.py imported logcollector for verify functions, while logcollector imported sdcm.monitorstack.ui — triggering __init__.py and creating a cycle. Fixed by splitting into constants.py, verify.py, restore.py with empty __init__.py.
Check 4: Error Handling
Trigger: Any try/except, raise, or error handling code.
- No empty
catch(e) {} or bare except: pass
- Use
silence() context manager over bare try/except where appropriate
- Custom exceptions should subclass appropriate SCT base exceptions
- Error messages should include enough context for debugging
Check 5: Test Coverage
Trigger: Any non-trivial code change.
- New public functions/methods should have corresponding tests in
unit_tests/
- Tests must use pytest style, not
unittest.TestCase
- Tests must use fixtures, not
setUp/tearDown
- Parametrize when testing multiple input variations
Check 6: Configuration Changes
Trigger: Any change to sdcm/sct_config.py or config-related files.
- New config options MUST have defaults in
defaults/test_default.yaml or backend-specific files
- Config parameter must have proper type, description, and example
- Pre-commit hook updates
docs/configuration_options.md automatically
Check 7: Backend Impact
Trigger: Any change to files in sdcm/cluster_*.py, sdcm/provision/, or sdcm/utils/*_utils.py.
- Verify the correct provision test labels are requested
- Check if the same change is needed for other backends
- Backend file -> label mapping in review-checklist.md
Check 8: Commit Message Format
Trigger: Every PR.
- Conventional Commits format:
type(scope): subject
- Valid types:
ci, docs, feature, fix, improvement, perf, refactor, revert, style, test, unit-test, build, chore
- Scope minimum 3 chars, subject 10-120 chars, body minimum 30 chars
Check 9: HTTP Resilience & Retry Patterns
Trigger: PR touches files with curl, requests.get, requests.post, or remoter.run("curl.
- All
remoter.run("curl ...") calls use curl_with_retry() from sdcm/utils/curl.py (exception: document with # no-retry: <reason>)
- Inline bash scripts via
shell_script_cmd() must also use curl_with_retry() — interpolate the helper into the f-string (e.g. f"{curl_with_retry(url, output='file', follow_redirects=True)}")
- Watch for curl calls hidden inside multi-line
shell_script_cmd(f"""...""") blocks in sct_config.py, cluster.py, sct_runner.py, and similar files — these are easy to miss
- All
requests.get/post/put/delete calls go through a requests.Session with HTTPAdapter(max_retries=Retry(...)) — follow sdcm/rest/rest_client.py pattern
- No bare
requests.get() / requests.post() without session+retry
- Localhost/metadata calls may use
retry=0 but must still use the utility for consistent --connect-timeout
Reference Index
| Workflow | Purpose |
|---|
| review-a-pr.md | Step-by-step process for reviewing an SCT pull request |
Success Criteria
A complete code review: