ascendc-npu-arch
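Ascend NPU architecture knowledge base. Covers architecture generations (NpuArch/SocVersion), chip model mapping, and arch35-specific optimizations (Regbase/SIMT/FP8). Triggered when you need to look up NPU architecture information, chip features, or architecture-conditional compilation.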
| name | ascendc-npu-arch |
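| description | Ascend NPU architecture knowledge base. Covers architecture generations (NpuArch/SocVersion), chip model mapping, and arch35-specific optimizations (Regbase/SIMT/FP8). Triggered when you need to look up NPU architecture information, chip features, or architecture-conditional compilation. |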
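
| Concept | Description |
|---|---|
| NpuArch | Chip architecture; defines the instruction set and microarchitecture |
| SocVersion | System-on-chip version; the identifier used in software naming |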

| NpuArch | NPU_ARCH | Directory | Chip models |
|---|---|---|---|
| DAV_1001 | 1001 | arch32 | Ascend910 |
| DAV_2002 | 2002 | arch32 | Ascend310P |
| DAV_2201 | 2201 | arch32 | Ascend910B, Ascend910_93 |
| DAV_3002 | 3002 | arch32 | Ascend310B |
| DAV_3510 | 3510 | arch35 | Ascend950DT, Ascend950PR |
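
The mapping above can be captured as a compile-time lookup, which may be useful when a build script or host-side helper needs to pick the source directory for a given architecture. The sketch below is plain C++17 and is not part of any Ascend C API: `ArchInfo`, `kArchTable`, and `ArchDirectory` are hypothetical names, and the rows are simply copied from the table.

```cpp
#include <array>
#include <string_view>

// Hypothetical record type; fields mirror columns of the table above.
struct ArchInfo {
    int npu_arch;                // numeric NPU_ARCH value
    std::string_view name;       // NpuArch identifier
    std::string_view directory;  // source directory for this generation
};

// Rows copied from the table; not read from any SDK header.
inline constexpr std::array<ArchInfo, 5> kArchTable{{
    {1001, "DAV_1001", "arch32"},
    {2002, "DAV_2002", "arch32"},
    {2201, "DAV_2201", "arch32"},
    {3002, "DAV_3002", "arch32"},
    {3510, "DAV_3510", "arch35"},
}};

// Returns the directory for a numeric NPU_ARCH value,
// or an empty view if the value is not in the table.
constexpr std::string_view ArchDirectory(int npu_arch) {
    for (const ArchInfo& row : kArchTable) {
        if (row.npu_arch == npu_arch) {
            return row.directory;
        }
    }
    return {};
}

// Compile-time checks against the table rows.
static_assert(ArchDirectory(3510) == "arch35");
static_assert(ArchDirectory(2201) == "arch32");
```

Keeping the table `constexpr` means a lookup with a literal argument costs nothing at run time, and an unknown value yields an empty view rather than a guess.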
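
| Feature | Description | Typical operators |
|---|---|---|
| Regbase programming | Direct register manipulation | Quantization operators |
| SIMT programming | Thread-level parallelism | Random number generation, sorting |
| FP8 format | 8-bit floating point | Quantization, dynamic quantization |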
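
For architecture-conditional compilation, one possible pattern is to gate arch35-only code paths on the numeric architecture value. The sketch below makes two assumptions that the source does not confirm: the build passes a macro named `NPU_ARCH_VALUE` (a hypothetical name, not the toolchain's actual macro), and Regbase, SIMT, and FP8 are available only on DAV_3510, as the skill description suggests. `IsArch35`, `QuantPath`, and `SelectQuantPath` are likewise illustrative names.

```cpp
// Hypothetical macro: assumed to be supplied by the build,
// for example -DNPU_ARCH_VALUE=3510. The default of 2201 exists
// only so this sketch compiles on its own.
#ifndef NPU_ARCH_VALUE
#define NPU_ARCH_VALUE 2201
#endif

// Assumption: the arch35 features exist only on DAV_3510.
constexpr bool IsArch35(int npu_arch) { return npu_arch == 3510; }

// Illustrative code paths for a quantization operator.
enum class QuantPath { kGeneric, kFp8 };

// Selects the code path at compile time from NPU_ARCH_VALUE.
constexpr QuantPath SelectQuantPath() {
#if NPU_ARCH_VALUE == 3510
    return QuantPath::kFp8;      // arch35: FP8 path
#else
    return QuantPath::kGeneric;  // other architectures: fallback path
#endif
}
```

With the default value, `SelectQuantPath()` resolves to `QuantPath::kGeneric`; building with the macro set to 3510 would select the FP8 branch instead.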