hw-native-sys
GitHub creator profile

hw-native-sys

Per-repository view of 58 skills collected from 5 GitHub repositories.

Skills collected: 58
Repositories: 5
Updated: 2026-06-26
Repository explorer

Repositories and representative skills

incore-profiling
Uncategorized

Profile PyPTO kernels in-core with the Ascend msprof op-simulator — cycle-accurate per-kernel traces. Use when the user wants to profile a built case, inspect kernel timing or instruction streams, or generate MindStudio Insight traces.

2026-06-24
cube-tile-tuning
Uncategorized

Tune cube/matmul tile sizes (row tile, N fragment, K fragment) for a PyPTO kernel — analytic hints, an on-chip buffer constraint model, and an empirical device sweep. Use when optimizing a matmul/cube's throughput, sizing the row / N / K tiles, resolving Mat (L1) / L0C / UB buffer overflows, or trading one tile dim for another.

2026-06-23
bisect-precision
Software quality assurance analysts and testers

Locate which pypto commit introduced a precision regression. Only pypto and its corresponding simpler (submodule) are tracked — ptoas and pto-isa versions are not part of the bisect. If the culprit is a simpler submodule bump, performs a second-level bisect within simpler.

2026-05-21
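The two-level search described for bisect-precision can be sketched as a plain binary search applied twice. This is a minimal illustration, not the skill's implementation: `first_bad`, `two_level_bisect`, and the `inner_commits_for` hook are hypothetical names, and the sketch assumes history is ordered oldest to newest, the last commit is known bad, and the regression is monotonic (once bad, always bad).

```python
from typing import Callable, Optional, Sequence, Tuple


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in an oldest-to-newest list.

    Assumes the list is non-empty, its last entry is bad, and every
    commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # culprit is mid or earlier
        else:
            lo = mid + 1    # culprit is strictly after mid
    return commits[lo]


def two_level_bisect(
    outer: Sequence[str],
    outer_is_bad: Callable[[str], bool],
    inner_commits_for: Callable[[str], Sequence[str]],
    inner_is_bad: Callable[[str], bool],
) -> Tuple[str, Optional[str]]:
    """Bisect the outer repo, then the submodule range if the culprit bumped it.

    inner_commits_for(culprit) is a hypothetical hook: it returns the
    submodule commits pulled in by the culprit, or an empty sequence
    when the culprit did not touch the submodule.
    """
    culprit = first_bad(outer, outer_is_bad)
    inner = inner_commits_for(culprit)
    if not inner:
        return culprit, None
    return culprit, first_bad(inner, inner_is_bad)


# Toy history: outer commit "p3" bumps the submodule across s1..s4,
# and "s3" is the submodule commit that actually broke precision.
outer_history = ["p1", "p2", "p3", "p4"]
bumps = {"p3": ["s1", "s2", "s3", "s4"]}
result = two_level_bisect(
    outer_history,
    outer_is_bad=lambda c: c in {"p3", "p4"},
    inner_commits_for=lambda c: bumps.get(c, []),
    inner_is_bad=lambda c: c in {"s3", "s4"},
)
print(result)  # ('p3', 's3')
```

In a real run the `is_bad` callbacks would build and execute the precision test at each commit; here they are stubbed with set lookups so the control flow is easy to follow.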
create-issue
Software developers

Reproduce a reported problem, collect dependency versions, and create a GitHub issue. Use when the user wants to file a bug, request a feature, or create any GitHub issue.

2026-05-21
ascendc-docs-search
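Software developers

Ascend C development resource index (local + online). Provides: (1) a local API documentation index and example-code mapping, (2) online documentation search, (3) resource lookup priorities, and (4) a usage guide for the Explore Agent. Prefer local resources; use online search only when a local lookup finds nothing.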

2026-05-15
git-commit
Software developers

Complete git commit workflow including pre-commit checks, staging, message generation, and verification. Use when creating commits or preparing changes for commit.

2026-05-06
github-pr
Software developers

Create or update a GitHub pull request after committing and pushing changes. Use when the user asks to create a PR, submit changes for review, or open a pull request.

2026-04-02
ascendc-api-best-practices
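Software developers

Best practices for using the Ascend C API. Covers correct usage and limitations of APIs for arithmetic, reduction, data movement, buffer management, precision conversion, and more. Trigger: when the user asks how to use a specific API (e.g. "how do I use DataCopy"), hits an API parameter error or a limitation error (e.g. repeatTimes or alignment issues), or needs API best practices or a pitfalls guide.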

2026-03-31
Showing the top 8 of 20 skills collected in this repository.
dfx-analyze
Uncategorized

Analyze an onboard run's performance/scheduling/dependency/dump data using simpler's BUILT-IN DFX tools (simpler_setup.tools.*) instead of hand-rolling instrumentation. Use AFTER an onboard run when you need per-run device timing (Total/Orch/Sched), AICPU scheduler-overhead / Tail-OH breakdown, the task dependency graph, scope ring-fill peaks, or to inspect args dumps. These are simpler's own tools (shipped in the wheel), distinct from any cross-repo workload. Reach for this before writing custom timing/logging into the runtime.

2026-06-26
multi-repo-setup
Uncategorized

Set up a cross-repo investigation when a workload from another repo (pypto, pypto-lib, etc.) needs to be run, especially when you want to swap in simpler-main HEAD or the current worktree's simpler instead of the version that repo pins. Clones-or-updates each external repo every invocation so stale local clones don't lie about CI parity. MUST invoke before chasing "X doesn't work on simpler" reports where X lives outside this repo.

2026-06-26
weekly-changelog
Software developers

Summarize user-facing changes merged in the current Friday-anchored week (most recent Friday up through yesterday) in the simpler repo into a markdown changelog with before/after code examples. Also emits a full all-PR inventory (WEEKLY_ALL_PRS) and Chinese (_zh) translations of both docs. Use when the user asks for a weekly changelog, weekly summary, or weekly external changes report.

2026-06-12
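The "Friday-anchored week" window in the weekly-changelog description (most recent Friday up through yesterday) can be computed with the standard library alone. The sketch below is an assumption about how that window is derived, not code from the skill; `friday_anchored_week` is a hypothetical name, and it assumes that when today is itself a Friday the window starts on the previous Friday, so that it still ends yesterday.

```python
from datetime import date, timedelta
from typing import Tuple

FRIDAY = 4  # date.weekday() counts Monday as 0, so Friday is 4


def friday_anchored_week(today: date) -> Tuple[date, date]:
    """Return (start, end): the latest Friday on or before yesterday, and yesterday."""
    end = today - timedelta(days=1)
    days_since_friday = (end.weekday() - FRIDAY) % 7
    start = end - timedelta(days=days_since_friday)
    return start, end


# 2026-06-12 is a Friday, so the window is the preceding Friday through Thursday.
start, end = friday_anchored_week(date(2026, 6, 12))
print(start.isoformat(), end.isoformat())  # 2026-06-05 2026-06-11
```

The resulting pair could then be passed to a date-bounded query such as `git log --since` / `--until`; the exact query the skill runs is not stated in the listing.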
onboard-arch-precheck
Software quality assurance analysts and testers

Detect the host's actual Ascend silicon and refuse mismatched `--platform` onboard hardware test invocations BEFORE any device is locked. MUST invoke this skill before running pytest or task-submit commands that use `--platform a2a3` or `--platform a5` (onboard only — sim variants pass through). Use when invoking onboard hardware tests, repro'ing flaky-test reports, or wrapping pytest in task-submit. Skip for `--platform a2a3sim` / `--platform a5sim` (silicon-agnostic).

2026-06-05
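The gating rule in the onboard-arch-precheck description reduces to a small decision function. The following is a minimal sketch under stated assumptions: `precheck` is a hypothetical name, how the host's silicon is detected is not described in the listing, and `detected` is assumed to arrive already normalized to the same strings used by `--platform`.

```python
ONBOARD_PLATFORMS = {"a2a3", "a5"}      # onboard values named in the description
SIM_PLATFORMS = {"a2a3sim", "a5sim"}    # silicon-agnostic, always pass through


def precheck(requested: str, detected: str) -> None:
    """Raise if an onboard --platform value does not match the detected silicon.

    `detected` is assumed to be produced elsewhere (detection is out of
    scope for this sketch) and to use the same naming as --platform.
    """
    if requested in SIM_PLATFORMS:
        return  # simulator runs do not depend on the host's silicon
    if requested not in ONBOARD_PLATFORMS:
        raise ValueError(f"unknown platform: {requested!r}")
    if requested != detected:
        raise RuntimeError(
            f"--platform {requested} requested but host silicon is {detected}; "
            "refusing before any device is locked"
        )


precheck("a5sim", detected="a2a3")   # passes: sim variant
precheck("a2a3", detected="a2a3")    # passes: matches the host
try:
    precheck("a5", detected="a2a3")
except RuntimeError as exc:
    print(exc)
```

Failing with an exception before any test command runs mirrors the "refuse BEFORE any device is locked" requirement in the description.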
review-pr
Software quality assurance analysts and testers

Review a GitHub PR by analyzing the correct diff (merge-base to HEAD), reconciling stated vs. real goal, and applying type-specific scrutiny. Optionally folds in independent reviews from local `codex` / `gemini` CLIs when the invocation explicitly opts in (`codex`, `gemini`, or `all` in the arguments). Use when the user asks to review a PR, analyze PR changes, or give feedback on a pull request.

2026-06-03
testing
Software quality assurance analysts and testers

Testing guide and pre-commit testing strategy for PTO Runtime. Use when running tests, adding tests, or deciding what to test before committing.

2026-05-30
insight-trace
Software developers

Generate a MindStudio Insight trace for any `kernel_entry(args)` style kernel in this repo — SPMD mix, AIC-only single-task (e.g. `aic_pv_matmul`), or AIV-only single-task (e.g. `aiv_softmax_prepare`). Use when the user asks to "produce/generate/run an Insight trace", "trace this kernel under msprof op simulator", or troubleshoot Insight trace collection. AICore-only replay path — bypasses AICPU orchestration. For PTOAS-style kernels, use [PTOAS msprof_op_simulator_usage_zh.md](https://github.com/hw-native-sys/PTOAS/blob/main/.claude/skill/msprof_op_sim_insight_skill.md) instead.

2026-05-25
benchmark
Software quality assurance analysts and testers

Benchmark runtime performance on hardware. If the current branch has commits ahead of upstream/main or uncommitted changes, compares against the fork point (merge-base). Otherwise benchmarks current state only. Use when the user asks to benchmark, measure performance, or compare latency.

2026-05-21
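The baseline choice in the benchmark description (compare against the merge-base when the branch is ahead of upstream/main or has uncommitted changes, otherwise measure the current state only) can be sketched with standard git plumbing. This is an illustrative sketch, not the skill's script: `choose_baseline` and `baseline_for_current_branch` are hypothetical names, and the git subcommands shown are one common way to gather the three inputs.

```python
import subprocess
from typing import Optional


def git(*args: str) -> str:
    """Run a git command in the current directory and return stripped stdout."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout.strip()


def choose_baseline(commits_ahead: int, dirty: bool, merge_base: str) -> Optional[str]:
    """Return the commit to compare against, or None to benchmark current state only."""
    if commits_ahead > 0 or dirty:
        return merge_base
    return None


def baseline_for_current_branch(upstream: str = "upstream/main") -> Optional[str]:
    """Gather branch state from git and apply choose_baseline."""
    commits_ahead = int(git("rev-list", "--count", f"{upstream}..HEAD"))
    dirty = bool(git("status", "--porcelain"))  # non-empty output means local changes
    merge_base = git("merge-base", upstream, "HEAD")
    return choose_baseline(commits_ahead, dirty, merge_base)


# The decision itself is pure, so it can be checked without a repository.
print(choose_baseline(0, False, "abc123"))  # None
print(choose_baseline(2, False, "abc123"))  # abc123
```

Keeping the decision separate from the git calls makes the rule testable on its own; `baseline_for_current_branch` only works inside a clone that has an `upstream` remote.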
Showing the top 8 of 14 skills collected in this repository.
incore-profiling
Software developers

Profile PyPTO kernels in-core with the Ascend msprof op-simulator — cycle-accurate per-kernel traces. Use when the user wants to profile a built case, inspect kernel timing or instruction streams, or generate MindStudio Insight traces.

2026-06-15
weekly-changelog
Software developers

Generate a weekly changelog markdown file summarizing external API and feature changes from git commits in a date range. Extracts before/after Python examples per commit, groups by theme (DSL / distributed / runtime / IR deprecations), and attributes each change to its author. Use when the user asks for a weekly report, changelog, commit summary, or interface-change digest.

2026-06-09
compare-codegen
Software developers

Compare codegen output (.pto files and pass dumps) between origin/main and the current branch for a given test case. Runs the test with --save-kernels and --dump-passes on both branches via git worktree, then diffs the results. Use when the user asks to compare codegen output, diff .pto files between branches, or check what changed in generated code.

2026-05-30
add-op
Software developers

Add new operator definitions to PyPTO across all layers (C++, Python IR, Python DSL, tests, codegen, docs). Covers tile ops, tensor ops, tensor-to-tile conversion, and codegen registration. Use when the user asks to add a new op, define a new operator, implement a new tile/tensor operation, or extend the operator system.

2026-05-25
auto-pr
Software developers

Create a GitHub PR then autonomously loop on CI failures and review comments until the PR is fully green. Combines branch prep, PR creation, and a hands-off fix loop. Use when the user wants to ship a PR end-to-end, auto-fix a PR until green, or create-and-fix a PR in one go.

2026-05-23
fix-pr
Executive secretaries and executive administrative assistants

Fix GitHub PR issues — address review comments and resolve CI failures in a loop until the PR is fully clean. Fetches CI errors online and triages review feedback. Use when fixing PR problems, addressing review comments, or resolving CI failures.

2026-05-02
fix-issue
Software developers

Fix a GitHub issue by fetching content, creating a branch, planning the fix, and implementing it. Use when the user asks to fix a specific issue number or work on a GitHub issue.

2026-04-10
create-issue
Software developers

Create a GitHub issue following the project's issue templates. Classifies the issue type, fills required fields per template, creates it via gh CLI, and sets project board fields (Status, Priority, Effort, Sprint). Use when the user wants to file a bug, request a feature, report a pass bug, or create any GitHub issue.

2026-04-09
Showing the top 8 of 13 skills collected in this repository.
pto-isa-cpu-sim-kernel-test
Software quality assurance analysts and testers

Use when Codex needs to validate a `.pto` program or `ptoas`-generated C++ kernel with the local `pto-isa` CPU simulator in this repository. Covers generating `.pto` files from PTOAS samples, running `ptoas` to emit C++, grafting the emitted kernel into a testcase under `.downloads/pto-isa/tests/cpu/st/testcase/`, updating `main.cpp`, `gen_data.py`, and CMake wiring as needed, then running `tests/run_cpu.py` for functional testing and debug on Windows, WSL, or Linux.

2026-04-25
ptoas-publish-pr
Software developers

Publish PTOAS changes to GitHub as a pull request. Use when Codex needs to turn intended local PTOAS edits into a branch, commit, push, and PR, especially when the worktree contains unrelated files, the repo uses `origin` as a personal fork and `upstream` as the canonical repository, or GitHub authentication may need to be checked with `gh auth status` and `gh auth login`.

2026-04-25
camodel-isa-verification
Software quality assurance analysts and testers

Create, run, and analyze PTO-ISA ST tests on the CANN CA model simulator. Use when Codex needs to verify A5/Ascend PTO-ISA instruction behavior, inspect simulator instruction logs, measure vector instruction latency, or compare UB dump hex output against expected values.

2026-04-25
msprof-op-simulator-insight
Software quality assurance analysts and testers

Compile and profile PTOAS-generated kernel sources with `msprof op simulator`, then export MindStudio Insight files. Use when Codex needs to build a host runner, run A3 `dav_2201` op simulator collection, resolve mangled kernel symbols, export `trace.json` or `visualize_data.bin`, or troubleshoot simulator dump/export paths.

2026-04-25
ptoas-project-development
Software developers

Project development guidance for PTOAS. Use when Codex modifies PTOAS source, MLIR ODS dialect definitions, C++ verifiers or transforms, CLI behavior, Python bindings, docs, tests, examples, or any user-visible PTOAS behavior; keeps cross-layer updates, license headers, regression tests, and examples synchronized.

2026-04-25
build-ptoas-wsl
Software developers

Build PTOAS from source inside WSL using the repository README workflow. Use when Codex is asked to build, configure, install, test, or troubleshoot ptoas/PTOAS in WSL or Ubuntu, including LLVM/MLIR llvmorg-19.1.7 setup, CMake/Ninja out-of-tree builds, pybind11 Python bindings, runtime environment variables, CLI smoke tests, or Python dialect import validation.

2026-04-25
pto-isa
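Software developers

A complete workflow guide for implementing a specified operator with PTO-ISA, covering ISA instruction selection, data-flow analysis, explanation of instruction behavior, and kernel code generation.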

2026-06-04
pto-isa-flash-atten-a3-pipeline
Software developers

PTO-DSL Flash Attention four-stage cross-core software pipeline for Ascend A3: compute_qk (Cube) -> compute_p (Vec) -> compute_pv (Cube) -> compute_gu (Vec), staged through a GM software FIFO. Captures the steady-state rhythm (cube-side per-tile emit_qk_pv interleaving, vec-side "drain GU then produce P"), the QK_PRELOAD / EXP_RING / S1_TILE knobs and their invariants, the UB 192 KiB budget with the row_slice working-tile shrink, the empirical S1 >= 16384 -> S1_TILE = 512 recommendation, and the op-pattern PIPE_V barrier removal recipe. Use when tuning the in-tree DSL Flash Attention, porting the four-stage pipeline to a new persistent-block kernel that mixes cube + vec stages through a GM FIFO, choosing QK_PRELOAD / S1_TILE for a new shape mix, or deciding when a PIPE_V barrier in generated C++ is safe to drop. Scoped to A3 non-causal prefill with HEAD=128, S0=128, CUBE_S1=128 -- other Flash Attention flavors (causal mask, GQA/MQA, KV-cache decode, A5 NZ/NZ+1 layout) belong in sibling skills.

2026-05-25
pto-isa-matmul-l2-schedule
Software developers

PTO-DSL matmul L2-reuse scheduler for Ascend A2/A3: persistent-block GEMM with N-group swizzle along the inner M walk and M-direction zigzag at N-group boundaries. Captures the tile-id math, the CANN platform_config-driven swizzleCountN budget (with the 32 MiB safety-ratio cliff), the DN-B layout note, the runtime wiring, and the verification path against torch_npu. Use when tuning a matmul-shaped kernel that profiles as L2-bound, porting the swizzle/zigzag schedule to a new persistent-block kernel, choosing swizzleCountN for a new SoC, or deciding between the manual SPMD-static baseline and this persistent + swizzle schedule. Scoped to one schedule recipe; add a separate skill for other PTO-ISA performance patterns (vector reduce, flash-attention scheduling, etc.).

2026-05-21
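To make the "N-group swizzle with M-direction zigzag" wording in the pto-isa-matmul-l2-schedule description concrete, here is one plausible reading of such a tile-id mapping. It is a sketch under assumptions, not the skill's actual tile-id math: `tile_coords` and its parameters are hypothetical, the output tile grid is assumed to be `num_m` by `num_n`, N columns are grouped `swizzle_count_n` at a time, each group is walked row by row along M, and the M direction flips on every other group.

```python
from typing import Tuple


def tile_coords(tile_id: int, num_m: int, num_n: int, swizzle_count_n: int) -> Tuple[int, int]:
    """Map a linear tile id to (m, n) under an N-group swizzle with M zigzag."""
    if swizzle_count_n < 1:
        raise ValueError("swizzle_count_n must be at least 1")
    if not 0 <= tile_id < num_m * num_n:
        raise ValueError("tile_id out of range")
    tiles_per_full_group = num_m * swizzle_count_n
    group = tile_id // tiles_per_full_group
    start_n = group * swizzle_count_n
    width = min(swizzle_count_n, num_n - start_n)   # the last group may be narrower
    offset = tile_id - group * tiles_per_full_group
    m_step, n_in_group = divmod(offset, width)
    # Even groups walk M upward, odd groups walk it downward (zigzag), so the
    # first tile of a new group shares its M row with the last tile of the old one.
    m = m_step if group % 2 == 0 else num_m - 1 - m_step
    return m, start_n + n_in_group


# 3 x 5 tile grid with N-groups two columns wide.
order = [tile_coords(t, num_m=3, num_n=5, swizzle_count_n=2) for t in range(15)]
print(order[:7])  # [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (2, 2)]
```

Under this reading, consecutive tiles stay inside a narrow band of N columns and never jump in M at a group boundary, which is the kind of locality an L2-reuse schedule aims for; the real schedule's group width comes from the swizzleCountN budget mentioned in the description.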
pto-comm
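Software developers

A complete guide to developing communication operators on the PTO-COMM ISA. Covers the host-device architecture, file structure, communication modes (P2P, collective communication, and fused communication-compute), synchronization strategies, signal matrix design, multi-block scheduling, remote address management, build-system configuration, and more. Trigger: when you need to develop a communication operator with PTO-COMM, design a communication kernel, write host-side code, or configure CMakeLists.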

2026-04-27
pto-isa-dev
Software developers

Work effectively in PTO-ISA: choose the right backend, run CPU/SIM/NPU flows, trace instruction constraints, understand A2/A3 vs A5 differences, align with PTO-AS, debug failures, and apply review-derived guardrails from recent PRs.

2026-04-27
Showing 5 of 5 repositories