testing
Testing guide and pre-commit testing strategy for PTO Runtime. Use when running tests, adding tests, or deciding what to test before committing.
| name | testing |
| description | Testing guide and pre-commit testing strategy for PTO Runtime. Use when running tests, adding tests, or deciding what to test before committing. |
- Python unit tests (`tests/ut/`): Standard pytest tests for the Python compilation pipeline and nanobind bindings. Run with `pytest tests/ut`. Tests declaring `@pytest.mark.requires_hardware[("<platform>")]` auto-skip unless `--platform` points to a matching device.
- C++ unit tests (`tests/ut/cpp/`): GoogleTest-based tests for pure C++ modules. Run with `cmake -B tests/ut/cpp/build -S tests/ut/cpp && cmake --build tests/ut/cpp/build && ctest --test-dir tests/ut/cpp/build -LE requires_hardware --output-on-failure`. Hardware-required tests carry a `requires_hardware` or `requires_hardware_<platform>` ctest label and are filtered via `-LE`.
- Scene tests (`examples/{arch}/*/`, `tests/st/{arch}/*/`): End-to-end `@scene_test` classes declared inside `test_*.py`. Sim variants run cross-platform (Linux/macOS); hardware variants require the CANN toolkit and an Ascend device. Discovery is by `pytest` (batch) or `python test_*.py` (standalone); #591's parallel orchestrator handles device bin-packing and ChipWorker reuse automatically.

Important: Always read `.github/workflows/ci.yml` first to extract the current `--pto-isa-commit` and `--pto-session-timeout` values. These ensure reproducible builds by pinning the PTO-ISA dependency to a known-good commit.
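Extracting the pinned values from `ci.yml` can be scripted. The following is a minimal sketch, assuming the flags appear in the workflow as `--flag value` or `--flag=value`. The helper name `extract_flag` and the sample excerpt (including `abc1234` and `600`) are hypothetical and not taken from the real workflow file.

```python
import re
from typing import Optional


def extract_flag(ci_text: str, flag: str) -> Optional[str]:
    """Return the first value that follows `flag` in the workflow text, or None."""
    # Accept both `--flag value` and `--flag=value`.
    match = re.search(rf"{re.escape(flag)}[ =]+(\S+)", ci_text)
    return match.group(1) if match else None


# Hypothetical excerpt; the real .github/workflows/ci.yml may look different.
SAMPLE = """
      - run: |
          pytest examples tests/st --platform a2a3sim \\
            --clone-protocol https --pto-isa-commit abc1234 --pto-session-timeout 600
"""

if __name__ == "__main__":
    print(extract_flag(SAMPLE, "--pto-isa-commit"))
    print(extract_flag(SAMPLE, "--pto-session-timeout"))
```

In practice the text would come from reading `.github/workflows/ci.yml` rather than from an inline string.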
Before running tests, determine whether runtime binaries need recompilation:
| What changed | Rebuild needed? | How |
|---|---|---|
| Runtime/platform C++ (`src/{arch}/runtime/`, `src/{arch}/platform/`) | Yes | Re-run `pip install --no-build-isolation -e .` (incremental via `build/cache/`) |
| Nanobind bindings (`python/bindings/`) | Yes | Re-run `pip install -e .` |
| Python-only code, examples, kernels | No | Just re-run the test |
In CI, pip install . pre-builds all runtimes before tests run.
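The rebuild table can be read as a small lookup over the changed paths. This is a sketch only: the function name `rebuild_command` is hypothetical, and `{arch}` is assumed to be a single path segment.

```python
import re
from typing import Iterable, Optional

# Patterns mirror the table rows; {arch} is matched as any single path segment.
_RUNTIME_CPP = re.compile(r"^src/[^/]+/(runtime|platform)/")
_BINDINGS = re.compile(r"^python/bindings/")


def rebuild_command(changed_paths: Iterable[str]) -> Optional[str]:
    """Return the reinstall command the table suggests, or None if no rebuild is needed."""
    paths = list(changed_paths)
    if any(_RUNTIME_CPP.match(p) for p in paths):
        return "pip install --no-build-isolation -e ."
    if any(_BINDINGS.match(p) for p in paths):
        return "pip install -e ."
    # Python-only code, examples, kernels: just re-run the test.
    return None


if __name__ == "__main__":
    print(rebuild_command(["src/a2a3/runtime/host_build_graph/runtime.cpp"]))
```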
```bash
# Python unit tests (no hardware)
pytest tests/ut

# Python unit tests (a2a3 hardware)
pytest tests/ut --platform a2a3

# C++ unit tests (no hardware)
cmake -B tests/ut/cpp/build -S tests/ut/cpp && cmake --build tests/ut/cpp/build
ctest --test-dir tests/ut/cpp/build -LE requires_hardware --output-on-failure

# C++ unit tests (a2a3 hardware)
ctest --test-dir tests/ut/cpp/build -L "^requires_hardware(_a2a3)?$" --output-on-failure

# All simulation scene tests (extract --pto-isa-commit, --pto-session-timeout from ci.yml)
pytest examples tests/st --platform a2a3sim \
    --clone-protocol https --pto-isa-commit <commit> --pto-session-timeout <timeout>

# All hardware scene tests (extract --pto-isa-commit, --pto-session-timeout from ci.yml, auto-detect idle devices)
pytest examples tests/st --platform a2a3 --device <range> \
    --clone-protocol https --pto-isa-commit <commit> --pto-session-timeout <timeout>

# Single runtime
pytest examples tests/st --platform a2a3sim --runtime host_build_graph \
    --clone-protocol https --pto-isa-commit <commit>

# Single example (pytest, uses pre-built binaries)
pytest examples/a2a3/host_build_graph/vector_example --platform a2a3sim \
    --clone-protocol https --pto-isa-commit <commit>

# Single example (standalone; re-run `pip install --no-build-isolation -e .` first if runtime C++ changed)
python examples/a2a3/host_build_graph/vector_example/test_vector_example.py \
    -p a2a3sim --clone-protocol https --pto-isa-commit <commit>
```
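When these invocations are driven from a script, the scene-test command line can be assembled from its parts. The sketch below uses only flags that appear in the commands above. The helper name `scene_test_command` and the placeholder commit and timeout values are hypothetical.

```python
import shlex
from typing import List, Optional, Sequence


def scene_test_command(
    platform: str,
    isa_commit: str,
    session_timeout: Optional[str] = None,
    device: Optional[str] = None,
    runtime: Optional[str] = None,
    targets: Sequence[str] = ("examples", "tests/st"),
) -> List[str]:
    """Assemble a pytest scene-test command as an argv list."""
    cmd = ["pytest", *targets, "--platform", platform]
    if device is not None:
        cmd += ["--device", device]
    if runtime is not None:
        cmd += ["--runtime", runtime]
    cmd += ["--clone-protocol", "https", "--pto-isa-commit", isa_commit]
    if session_timeout is not None:
        cmd += ["--pto-session-timeout", str(session_timeout)]
    return cmd


if __name__ == "__main__":
    # Placeholder values; take the real ones from .github/workflows/ci.yml.
    print(shlex.join(scene_test_command("a2a3sim", "abc1234", session_timeout="600")))
```

Returning an argv list rather than a string keeps the result safe to pass to `subprocess.run` without shell quoting.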
When changed files require testing (C++, Python, or CMake), follow these steps to decide what to test and how.
```bash
command -v npu-smi &>/dev/null
```
| Result | Platforms to test |
|---|---|
| Found | <arch>sim (simulation) and <arch> (hardware) |
| Not found | Simulation only (default a2a3sim) |
When npu-smi is found, detect the platform by parsing chip name from npu-smi info output:
| Chip name contains | Platform |
|---|---|
| `910B` or `910C` | `a2a3` (sim: `a2a3sim`) |
| `950` | `a5` (sim: `a5sim`) |
Use the detected platform for all subsequent --platform flags. If the chip name is unrecognized, warn and default to a2a3.
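The detection rules above could be sketched as follows. The exact layout of `npu-smi info` output is not shown here, so this sketch simply searches the whole output for the chip-name substrings. A real implementation would likely parse the chip-name column to avoid false matches. The function names are hypothetical.

```python
import shutil
import subprocess
import warnings
from typing import List


def platform_from_chip_info(npu_smi_output: str) -> str:
    """Map chip-name substrings to a hardware platform name."""
    if "910B" in npu_smi_output or "910C" in npu_smi_output:
        return "a2a3"
    if "950" in npu_smi_output:
        return "a5"
    # Unrecognized chip name: warn and fall back to a2a3.
    warnings.warn("Unrecognized chip name; defaulting to a2a3")
    return "a2a3"


def detect_platforms() -> List[str]:
    """Return the platforms to test: sim only, or sim plus hardware."""
    if shutil.which("npu-smi") is None:
        return ["a2a3sim"]  # simulation only (default)
    result = subprocess.run(
        ["npu-smi", "info"], capture_output=True, text=True, check=False
    )
    hardware = platform_from_chip_info(result.stdout)
    return [hardware + "sim", hardware]


if __name__ == "__main__":
    print(detect_platforms())
```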
Run git diff --name-only (or git diff --cached --name-only for staged changes) and match the first applicable rule:
| Changed paths | Scope | Command pattern |
|---|---|---|
| `src/{arch}/platform/*` | Full (all runtimes) | `pytest examples tests/st --platform <platform>` |
| `src/{arch}/runtime/<rt>/*` | Single runtime | `pytest examples tests/st --platform <platform> --runtime <rt>` |
| `examples/{arch}/<rt>/<ex>/*` | Single example | `python <ex>/test_*.py -p <platform>` (or `pytest <ex> --platform <platform>`) |
| `tests/ut/*` (Python) | Python UT only | `pytest tests/ut` (add `--platform <platform>` on a device runner) |
| `tests/ut/cpp/*` | C++ UT only | `cmake -B tests/ut/cpp/build -S tests/ut/cpp && cmake --build tests/ut/cpp/build && ctest --test-dir tests/ut/cpp/build -LE requires_hardware` |
| Mixed (spans multiple categories) | Escalate to the widest matching scope | — |
Note on runtime C++ changes: When changed paths include `src/{arch}/runtime/` or `src/{arch}/platform/`, re-run `pip install --no-build-isolation -e .` before testing to rebuild the runtime binaries in `build/lib/` (incremental via `build/cache/`). There is no rebuild-on-import: `editable.rebuild = false`.
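The scope table can be expressed as a path classifier plus an escalation step. This is a minimal sketch with two assumptions: the table does not say how the two unit-test scopes rank against the single-example scope, nor what happens when several different runtimes change, so the ordering in `WIDEST_FIRST` is a guess. All names are hypothetical.

```python
import re
from typing import Iterable, Optional

# (scope, pattern) pairs; tests/ut/cpp must be checked before tests/ut.
_RULES = [
    ("full", re.compile(r"^src/[^/]+/platform/")),
    ("runtime", re.compile(r"^src/[^/]+/runtime/[^/]+/")),
    ("example", re.compile(r"^examples/[^/]+/[^/]+/[^/]+/")),
    ("cpp_ut", re.compile(r"^tests/ut/cpp/")),
    ("python_ut", re.compile(r"^tests/ut/")),
]

# Assumed ordering from widest to narrowest (only the first three come from the table).
WIDEST_FIRST = ("full", "runtime", "example", "cpp_ut", "python_ut")


def classify(path: str) -> Optional[str]:
    """Return the scope for one changed path, or None if no rule matches."""
    for scope, pattern in _RULES:
        if pattern.match(path):
            return scope
    return None


def select_scope(changed_paths: Iterable[str]) -> Optional[str]:
    """Escalate mixed changes to the widest matching scope."""
    scopes = {classify(p) for p in changed_paths}
    for scope in WIDEST_FIRST:
        if scope in scopes:
            return scope
    return None


if __name__ == "__main__":
    print(select_scope(["src/a2a3/runtime/host_build_graph/runtime.cpp", "tests/ut/test_x.py"]))
```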
Parallelism is handled by the #591 scheduler (simpler_setup/parallel_scheduler.py) based on --device and --max-parallel:
- Simulation (`a2a3sim`): `--max-parallel auto` = `min(nproc, len(--device))`. Pass `--device 0-15` for a big virtual pool; `auto` caps in-flight at the CPU count. Override with `--max-parallel N` on CPU-constrained runners.
- Hardware (`a2a3`): `--max-parallel auto` = `len(--device)`. One in-flight subprocess per physical device; each device runs a dedicated ChipWorker (see `docs/ci.md`).
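The two `auto` rules amount to a few lines of arithmetic. The sketch below illustrates the formula only and is not the code in `simpler_setup/parallel_scheduler.py`. The function names are hypothetical, and only the `<start>-<end>` and single-id device forms are handled.

```python
import os
from typing import List, Optional


def parse_devices(spec: str) -> List[int]:
    """Expand '0-15' to [0, ..., 15] and '3' to [3]."""
    if "-" in spec:
        start, end = spec.split("-", 1)
        return list(range(int(start), int(end) + 1))
    return [int(spec)]


def auto_max_parallel(platform: str, device_spec: str, nproc: Optional[int] = None) -> int:
    """Compute the `--max-parallel auto` value as described above."""
    n_devices = len(parse_devices(device_spec))
    if platform.endswith("sim"):
        # Simulation: cap in-flight work at the CPU count.
        cpus = nproc if nproc is not None else (os.cpu_count() or 1)
        return min(cpus, n_devices)
    # Hardware: one in-flight subprocess per physical device.
    return n_devices


if __name__ == "__main__":
    print(auto_max_parallel("a2a3sim", "0-15", nproc=8))
    print(auto_max_parallel("a2a3", "0-3"))
```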
When testing on a2a3, detect idle devices:
```bash
npu-smi info
```
Pick devices whose HBM-Usage is 0 and find the longest consecutive sub-range (at most 4). Pass as --device <start>-<end> (or --device <id> if only one idle device). If no idle device is found, skip hardware testing and warn.
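Choosing the device range is a longest-consecutive-run problem with a cap of 4. The sketch below assumes the idle device ids (those with HBM-Usage of 0) have already been extracted from the `npu-smi info` output; parsing that output is left out because its format is not shown here. The function name `pick_device_range` is hypothetical.

```python
from typing import List, Optional


def pick_device_range(idle_ids: List[int], max_len: int = 4) -> Optional[str]:
    """Return a --device value for the longest consecutive idle run, capped at max_len."""
    ids = sorted(set(idle_ids))
    if not ids:
        return None  # no idle device: skip hardware testing and warn
    best_start = best_end = run_start = prev = ids[0]
    for dev in ids[1:]:
        if dev != prev + 1:
            run_start = dev  # gap found: a new run starts here
        if dev - run_start > best_end - best_start:
            best_start, best_end = run_start, dev
        prev = dev
    # Keep at most max_len devices from the start of the best run.
    best_end = min(best_end, best_start + max_len - 1)
    if best_start == best_end:
        return str(best_start)
    return f"{best_start}-{best_end}"


if __name__ == "__main__":
    print(pick_device_range([0, 1, 2, 5, 6, 7, 8, 9]))
```

When two runs have the same length, this sketch keeps the earlier one.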
```text
git diff --name-only
  │
  ├─ Only docs/config? ──→ SKIP tests
  │
  └─ Code changed?
       │
       ├─ Determine SCOPE (Step 2)
       │    ├─ platform → full (pytest --platform ...)
       │    ├─ runtime  → single runtime (--runtime ...)
       │    └─ example  → single example (standalone test_*.py or pytest <ex>)
       │
       ├─ Runtime C++ changed (src/{arch}/)? ──→ pip install --no-build-isolation -e . first
       │
       └─ npu-smi found?
            ├─ Yes → sim + hardware (idle devs, max 4)
            └─ No  → sim only
```
- Under `examples/{arch}/<runtime>/<name>/` or `tests/st/{arch}/<runtime>/<name>/`, add `test_<name>.py` with a `@scene_test`-decorated class (see `docs/testing.md` for the full template: `CALLABLE`, `CASES`, `generate_args`, `compute_golden`). End with `if __name__ == "__main__": SceneTestCase.run_module(__name__)` so the file runs standalone.
- Place kernel sources under `kernels/aic/`, `kernels/aiv/`, and/or `kernels/orchestration/`, referenced by `CALLABLE["orchestration"]["source"]` / `CALLABLE["incores"][*]["source"]` as paths relative to the test file.
- pytest discovers `test_*.py` under `examples/` and `tests/st/`; no registration needed.

Related skill: `git-commit`, the complete commit workflow (runs testing as a prerequisite).