dfx-analyze

Stars: 24

Forks: 62

Updated: June 26, 2026 at 02:11

Analyze an onboard run's performance/scheduling/dependency/dump data using simpler's BUILT-IN DFX tools (simpler_setup.tools.*) instead of hand-rolling instrumentation. Use AFTER an onboard run when you need per-run device timing (Total/Orch/Sched), AICPU scheduler-overhead / Tail-OH breakdown, the task dependency graph, scope ring-fill peaks, or to inspect args dumps. These are simpler's own tools (shipped in the wheel), distinct from any cross-repo workload. Reach for this before writing custom timing/logging into the runtime.


Source: hw-native-sys/simpler


SKILL.md

name

dfx-analyze

description

Analyze DFX data (simpler's own tools)

simpler already ships end-user analysis CLIs under simpler_setup.tools — use them; do not re-invent timing/instrumentation in the runtime. Canonical reference (tool flags, examples, output paths): simpler_setup/tools/README.md. Per-DFX docs: docs/dfx/ (l2-timing.md, sched-overhead-model.md, l2-swimlane-profiling.md, scope-stats.md, dep_gen.md, args-dump.md).

Pick the tool by question

| You want… | Tool | Needs |
| --- | --- | --- |
| Per-run Total / Orch / Sched device timing | `device_log_timing` | nothing extra: `PTO2_PROFILING` markers are in every device log (compile-time default on, NOT gated by swimlane) |
| AICPU scheduler overhead / Tail-OH / critical-path breakdown | `sched_overhead_analysis` | a `--enable-l2-swimlane` (level≥3) run + `--enable-dep-gen` run |
| Swimlane → Perfetto Chrome trace | `swimlane_converter` | `--enable-l2-swimlane` run (`--overhead` track needs `deps.json` too) |
| Task dependency graph (text / HTML) | `deps_viewer` | `--enable-dep-gen` run → `deps.json` |
| Per-scope ring-fill peaks (task_window / heap / tensormap) | `scope_stats_plot` | `--enable-scope-stats` run → `scope_stats.jsonl` |
| Inspect / export args dumps | `dump_viewer` | `--enable-dump-tensor` run → `args_dump/` |
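
When several tools are planned for one investigation, it can help to work out up front which run flags their inputs depend on. The sketch below only restates the table as a lookup. `REQUIRED_RUN_FLAGS` and `run_flags_for` are hypothetical helpers, not part of simpler, and whether all the flags can be combined in a single run is an assumption to check against simpler_setup/tools/README.md.

```python
# Hypothetical lookup transcribed from the table above; not shipped with simpler.
# Note: the swimlane rows also need level>=3 for sched_overhead_analysis; how the
# level is passed is not covered here.
REQUIRED_RUN_FLAGS = {
    "device_log_timing": [],  # PTO2_PROFILING markers are always in the device log
    "sched_overhead_analysis": ["--enable-l2-swimlane", "--enable-dep-gen"],
    "swimlane_converter": ["--enable-l2-swimlane"],
    "deps_viewer": ["--enable-dep-gen"],
    "scope_stats_plot": ["--enable-scope-stats"],
    "dump_viewer": ["--enable-dump-tensor"],
}


def run_flags_for(tools):
    """Return the de-duplicated run flags needed by every tool in `tools`,
    in first-seen order. Raises KeyError for a tool not in the table."""
    flags = []
    for tool in tools:
        for flag in REQUIRED_RUN_FLAGS[tool]:
            if flag not in flags:
                flags.append(flag)
    return flags


if __name__ == "__main__":
    print(run_flags_for(["deps_viewer", "sched_overhead_analysis"]))
```

An empty result, as for `device_log_timing`, means a plain run is enough.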

First reflex: Total/Orch/Sched needs nothing extra

To answer "where did the time go / is this AICPU-orchestration bound", you do not need swimlane or custom logging — just run, then:

python -m simpler_setup.tools.device_log_timing -d <device_id>   # latest log for that die
# or: --device-log <path/to/device-*.log>
# prints per-round Total / Orch / Sched (us); Orch≈Sched≈Total ⇒ AICPU-bound.
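
The Orch≈Sched≈Total rule of thumb can be applied mechanically once the three per-round numbers are in hand. This is a minimal sketch: `looks_aicpu_bound` is a hypothetical helper, the 10% tolerance is an arbitrary assumption, and the sample rounds are made-up illustrative values rather than measurements. It does not parse the tool's output; it only shows the comparison.

```python
def looks_aicpu_bound(total_us, orch_us, sched_us, rel_tol=0.10):
    """Return True when Orch and Sched are both within rel_tol of Total.

    rel_tol is an assumed threshold for reading "approximately equal";
    tune it for your workload.
    """
    if total_us <= 0:
        raise ValueError("total_us must be positive")
    budget = rel_tol * total_us
    return abs(total_us - orch_us) <= budget and abs(total_us - sched_us) <= budget


if __name__ == "__main__":
    # Illustrative (Total, Orch, Sched) rounds in microseconds; not real data.
    rounds = [(1000.0, 980.0, 990.0), (1000.0, 300.0, 450.0)]
    for total, orch, sched in rounds:
        print(total, orch, sched, looks_aicpu_bound(total, orch, sched))
```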

Make the device log easy to find: redirect it under the run's output dir via ASCEND_PROCESS_LOG_PATH (see .claude/rules/running-onboard.md → "Device logs").
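
One way to follow that advice from a Python launcher is sketched below. Both helpers are hypothetical, the `device_logs` subdirectory name is an assumption, and the `device-*.log` pattern is taken from the `--device-log` hint above. The directory layout that the driver creates under `ASCEND_PROCESS_LOG_PATH` is not specified here, so the search is recursive.

```python
import os
from pathlib import Path


def env_with_device_logs(run_out_dir):
    """Return a copy of the current environment with ASCEND_PROCESS_LOG_PATH
    pointing at a directory under run_out_dir (created if missing)."""
    log_dir = Path(run_out_dir) / "device_logs"  # subdirectory name is an assumption
    log_dir.mkdir(parents=True, exist_ok=True)
    env = dict(os.environ)
    env["ASCEND_PROCESS_LOG_PATH"] = str(log_dir)
    return env


def latest_device_log(log_root):
    """Return the most recently modified device-*.log under log_root, or None."""
    logs = list(Path(log_root).rglob("device-*.log"))
    return max(logs, key=lambda p: p.stat().st_mtime, default=None)
```

The returned environment can be passed as `env=` to `subprocess.run` when launching the onboard run, and the path from `latest_device_log` can then be handed to `--device-log`.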

Where the inputs are written

DFX artifacts land in the run's output dir with fixed filenames:

- simpler scene tests (tests/st): outputs/<case>_<ts>/ (the tools auto-pick the latest by mtime when run from the dir holding outputs/).
- JIT examples / pypto-lib: build_output/_jit_*/dfx_outputs/.
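
For scripts that need the same "latest run" directory the tools pick, the selection by mtime can be approximated as follows. This is a sketch of the described behavior, not the tools' actual implementation. `latest_run_dir` is a hypothetical helper, and matching on a `<case>_` prefix is an assumption based on the `outputs/<case>_<ts>/` naming above.

```python
from pathlib import Path


def latest_run_dir(base=".", case=None):
    """Return the most recently modified directory under <base>/outputs,
    optionally restricted to names starting with '<case>_'. None if absent."""
    outputs = Path(base) / "outputs"
    pattern = f"{case}_*" if case else "*"
    runs = [p for p in outputs.glob(pattern) if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    # Run from the directory that holds outputs/.
    print(latest_run_dir())
```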

Don't

❌ Hand-roll per-stage / submit-drain / per-scope timing in the runtime to get numbers these tools already produce. If a tool is missing a metric, extend the tool, not the hot path (and never log on AICPU hot paths — see codestyle.md rule 7).
