dist-op-dev
// Execution-oriented workflow for HyperParallel distributed operator development. Analyzes the operator, implements or updates code and tests.
| name | dist-op-dev |
| description | Execution-oriented workflow for HyperParallel distributed operator development. Analyzes the operator, implements or updates code and tests. |
✅ 【Unified Entry】 To develop a HyperParallel distributed operator, call this SKILL; it automatically handles the entire process, including operator analysis, implementation, and testing.
Use this workflow when developers need to add distributed operator support for the HyperParallel framework or optimize sharding strategy inference for existing operators.
Call this SKILL directly, providing the MindSpore mint interface name or PyTorch operator name, along with source code paths:
# Develop distributed support for MindSpore mint interface
/dist-op-dev I want to develop distributed support for MindSpore mint interface mint.matmul. MindSpore source code is at /root/workspace/mindspore, PyTorch source code is at /root/workspace/pytorch.
# Develop distributed support for PyTorch operator
/dist-op-dev I want to develop distributed support for PyTorch operator torch.nn.functional.linear. MindSpore source code is at /root/workspace/mindspore, PyTorch source code is at /root/workspace/pytorch.
Source code paths are required — the dist-op-analysis SKILL needs them to locate interface definitions, Primitive mappings, and distributed strategy references.
Distributed operator development follows the 5-step process shown below, from operator analysis to integration testing; a sixth workflow then commits and pushes the code:
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ 1. Operator │ ──▶ │ 2. Python │ ──▶ │ 3. YAML │
│ Analysis │ │ Implement │ │ Registration│
│ Call SKILL │ │ Inherit/Custom │ │ Configure map │
│ 🔴Output report │ │ infer_layout │ │ Select suffix │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│
┌───────────────────────────────────────────────┘
▼
┌─────────────────┐ ┌─────────────────┐
│ 4. Unit Test │ ──▶ │ 5. Integration │
│ (UT) │ │ Test (ST) │
│ Verify inference│ │ 8-card verify │
│ Cover DP/MP │ │ Compare output │
└─────────────────┘ └─────────────────┘
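Step 5 in the diagram compares distributed output against a reference. As a minimal, framework-free sketch of that idea, the following splits the contracted dimension of a matrix multiply across 8 simulated cards, sums the partial results the way an all-reduce would, and checks the result against the unsharded product. It uses plain Python lists rather than HyperParallel, MindSpore, or PyTorch tensors, and every function name here is hypothetical.

```python
def matmul(a, b):
    """Plain matrix multiply on nested lists: (m x k) @ (k x n)."""
    k = len(b)
    n = len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)] for row in a]


def split_k(a, b, cards):
    """Split the contracted dimension k evenly across `cards` simulated devices."""
    k = len(b)
    assert k % cards == 0, "k must divide evenly across cards"
    step = k // cards
    shards = []
    for c in range(cards):
        lo, hi = c * step, (c + 1) * step
        a_shard = [row[lo:hi] for row in a]  # column slice of A
        b_shard = b[lo:hi]                   # matching row slice of B
        shards.append((a_shard, b_shard))
    return shards


def all_reduce_sum(partials):
    """Element-wise sum of per-card partial results (stand-in for all-reduce)."""
    m, n = len(partials[0]), len(partials[0][0])
    return [[sum(p[i][j] for p in partials) for j in range(n)] for i in range(m)]


def sharded_matmul(a, b, cards=8):
    """Compute A @ B as the sum of per-card partial products."""
    partials = [matmul(a_s, b_s) for a_s, b_s in split_k(a, b, cards)]
    return all_reduce_sum(partials)


# Integer inputs keep the comparison exact; a real ST would compare
# floating-point tensors with a tolerance instead.
A = [[i + j for j in range(16)] for i in range(4)]      # 4 x 16
B = [[i * j + 1 for j in range(3)] for i in range(16)]  # 16 x 3
assert sharded_matmul(A, B, cards=8) == matmul(A, B)
```

Sharding the contracted dimension is chosen here because it is the case where each card holds only a partial sum, so a missing reduction shows up immediately as a mismatch.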
When using this SKILL to develop distributed operators, create a TODOLIST, then execute the following workflows in order:
Step 1: Operator Analysis
.claude/skills/dist-op-dev/analysis-results/{OpName}-analysis.md (🔴required)

Step 2: Python Implementation
hyper_parallel/core/shard/ops/parallel_*.py file

Step 3: YAML Registration
hyper_parallel/core/shard/ops/yaml/*.yaml entry

Step 4: Unit Testing (UT)
tests/ut/core/shard/ops/test_parallel_*.py

Step 5: Integration Testing (ST)
tests/mindspore/st/shard/ops/test_ops_*.py + *_shard_in_python.py or tests/torch/shard/ops/test_parallel_op_*.py + parallel_op_*.py

Step 6: Git Commit and PR Creation
feat/{OpName}-distributed-support, commit pushed, PR created (if needed)

| Decision Point | Criteria | Options | Impact |
|---|---|---|---|
| Operator Category | Semantic matching | ElementWise/MatMul/Reduce/Reshape/Gather | Determines base class and YAML file |
| Implementation Method | Need custom logic | Scenario 0/Scenario 1/Scenario 2 | Code volume and UT coverage |
| Broadcast Support | Support broadcasting | No suffix/WithShape | YAML config and test scenarios |
| Partial Support | Handle partial state | _allow_partial_inputs=True/False | get_expand_impl implementation |

Detailed decision reference: See Implementation Decisions

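To make the "Partial Support" decision concrete, here is a standalone sketch of layout inference for a 2-D matrix multiply. It is an illustration only: this document does not show the real `infer_layout` signature, so the `Layout` encoding (one mesh-axis name or `None` per tensor dimension), the `InferredLayout` type, and the axis names `"dp"` and `"mp"` are all assumptions made for the example.

```python
from typing import NamedTuple, Optional, Tuple

# Hypothetical encoding: one mesh-axis name per tensor dimension,
# or None when that dimension is replicated.
Layout = Tuple[Optional[str], ...]


class InferredLayout(NamedTuple):
    layout: Layout
    partial_on: Optional[str]  # mesh axis that still holds unreduced partial sums


def infer_matmul_layout(x: Layout, w: Layout) -> InferredLayout:
    """Infer the output layout of x[m, k] @ w[k, n] from the input layouts."""
    if len(x) != 2 or len(w) != 2:
        raise ValueError("this sketch handles 2-D operands only")
    m_axis, k_x = x
    k_w, n_axis = w
    if k_x != k_w:
        # Mismatched sharding of the contracted dimension would need a
        # redistribution before the local matmul can run.
        raise ValueError("contracted dimension must be sharded identically")
    used = [axis for axis in (m_axis, n_axis, k_x) if axis is not None]
    if len(used) != len(set(used)):
        raise ValueError("a mesh axis may shard only one dimension")
    # Sharding k leaves every device with a partial sum of the output.
    return InferredLayout(layout=(m_axis, n_axis), partial_on=k_x)


# Data parallel: the batch-like dimension m is split, weights are replicated.
assert infer_matmul_layout(("dp", None), (None, None)) == (("dp", None), None)
# Model parallel, column split: output columns are sharded, nothing is partial.
assert infer_matmul_layout((None, None), (None, "mp")) == ((None, "mp"), None)
# Model parallel, row split: output is replicated in shape but partial on "mp".
assert infer_matmul_layout((None, "mp"), ("mp", None)) == ((None, None), "mp")
```

The third case is the one the Partial Support row is about: an operator that accepts or produces partial state has to track which mesh axis still needs a reduction.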
| Task | File Location | Key Notes |
|---|---|---|
| Python Implementation | hyper_parallel/core/shard/ops/parallel_*.py | Inherit DistributedOp or its subclass |
| YAML Registration | hyper_parallel/core/shard/ops/yaml/*.yaml | Map each operator to its distributed implementation class |
| Unit Test (UT) | tests/ut/core/shard/ops/ | Platform-agnostic; verifies infer_layout and get_expand_impl logic |
| Integration Test (ST) | tests/mindspore/st/shard/ops/ (MindSpore), tests/torch/shard/ops/ (PyTorch) | Verifies distributed execution in an 8-card environment |
Detailed quick reference: See references/quick-reference.md
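As a rough way to check which deliverables already exist for a checkout, the sketch below globs for the file patterns listed in this document. The glob patterns are copied from the text; the dictionary keys and the helper names `find_artifacts` and `missing_tasks` are hypothetical and not part of HyperParallel.

```python
from pathlib import Path

# Glob patterns taken from this document; the keys are labels for this sketch.
ARTIFACT_PATTERNS = {
    "python_impl": ["hyper_parallel/core/shard/ops/parallel_*.py"],
    "yaml_registration": ["hyper_parallel/core/shard/ops/yaml/*.yaml"],
    "unit_test": ["tests/ut/core/shard/ops/test_parallel_*.py"],
    "integration_test": [
        "tests/mindspore/st/shard/ops/test_ops_*.py",
        "tests/torch/shard/ops/test_parallel_op_*.py",
    ],
}


def find_artifacts(repo_root):
    """Return {task: sorted repo-relative paths} for every pattern that matches."""
    root = Path(repo_root)
    return {
        task: sorted(
            p.relative_to(root).as_posix()
            for pattern in patterns
            for p in root.glob(pattern)
        )
        for task, patterns in ARTIFACT_PATTERNS.items()
    }


def missing_tasks(repo_root):
    """List the tasks for which no matching file exists yet."""
    return [task for task, found in find_artifacts(repo_root).items() if not found]
```

Calling `missing_tasks(".")` from the repository root would return the tasks that have no matching file at all. Because the patterns are wildcards, this only shows that some file of each kind exists, not that one exists for a particular operator.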
| Item | MindSpore | PyTorch |
|---|---|---|
| Interface Name Style | mint.matmul, mint.nn.functional.relu | torch.matmul, torch.nn.functional.linear |
| YAML Files | element_wise_ops.yaml, matmul_ops.yaml, etc. | torch_*.yaml |
| UT Test Directory | tests/ut/core/shard/ops/ (shared) | tests/ut/core/shard/ops/ (shared) |
| ST Test Directories | tests/mindspore/st/shard/ops/ | tests/torch/shard/ops/ |
Important Note: If a MindSpore operator and a PyTorch operator have the same semantics, they can share the same distributed operator implementation class.
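The reuse described in the note can be pictured as two operator names resolving to one implementation object. The sketch below uses an in-memory dictionary as a stand-in for the YAML registration, whose actual schema is not shown in this document; the `DistributedOp` class here is a placeholder rather than the real base class, and `ElementWiseSketch`, `REGISTRY`, and `register` are hypothetical names.

```python
class DistributedOp:
    """Placeholder base class for this sketch; not the real HyperParallel class."""

    def infer_layout(self, *input_layouts):
        raise NotImplementedError


class ElementWiseSketch(DistributedOp):
    """Element-wise ops keep the input layout, provided all inputs agree."""

    def infer_layout(self, *input_layouts):
        first = input_layouts[0]
        if any(layout != first for layout in input_layouts):
            raise ValueError("element-wise inputs must share one layout")
        return first


# Stand-in for the YAML mapping: operator name -> implementation instance.
REGISTRY = {}


def register(op_names, impl_cls):
    """Bind every name in op_names to a single shared instance of impl_cls."""
    impl = impl_cls()
    for name in op_names:
        REGISTRY[name] = impl


# Same semantics on both backends, so both names share one implementation.
register(["mint.nn.functional.relu", "torch.nn.functional.relu"], ElementWiseSketch)

ms_impl = REGISTRY["mint.nn.functional.relu"]
pt_impl = REGISTRY["torch.nn.functional.relu"]
assert ms_impl is pt_impl
assert ms_impl.infer_layout(("dp", None)) == ("dp", None)
```

Sharing one class means the UT for layout inference only needs to be written once, which matches the shared UT directory in the table above.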
| SKILL | Purpose | When Called |
|---|---|---|
| autogit | Git workflow automation (commit, PR, status, etc.) | Workflow 6: completes the code commit and PR creation |
| dist-op-analysis | Internal operator analysis (read-only) | Workflow 1: provides interface specs, distributed strategies, and HyperParallel implementation guidance |
Workflow detailed steps: workflows/ directory
Knowledge reference documents: references/ directory
Template files: templates/operator-analysis-template.md