| name | platform-dev |
| description | HyperParallel platform abstraction layer development. Use when adding new platform APIs, implementing cross-platform features (FSDP/HSDP/Pipeline/Activation Checkpoint), creating DTensorBase extensions, or modifying collective operations. Covers both PyTorch and MindSpore backends. |
HyperParallel Platform Development Skill
Guides development of cross-platform features in the platform/ abstraction layer: adding new Platform APIs, implementing backend-specific features (FSDP, HSDP, Pipeline Parallelism, Activation Checkpoint), extending DTensorBase, and managing collective operations across PyTorch and MindSpore backends.
When to Use This Skill
- Adding a new method to the Platform abstraction (platform/platform.py)
- Implementing a new feature in platform/torch/ or platform/mindspore/
- Modifying FSDP, HSDP, Pipeline Parallelism, or Activation Checkpoint platform code
- Extending DTensorBase (torch or mindspore)
- Adding or modifying collective operations (all_gather, all_reduce, reduce_scatter, etc.)
- Implementing stream synchronization or memory lifecycle patterns
- Working on process group management or device/RNG management
Architecture Overview
platform/
├── platform.py                 # Platform base class (~100+ abstract methods)
├── torch/                      # PyTorch backend
│   ├── platform.py             # TorchPlatform(Platform)
│   ├── dtensor.py              # DTensorBase (torch.Tensor subclass)
│   ├── function_override.py    # DTensor backward hooks
│   ├── init_weights.py         # init_on_device context manager
│   ├── group_utils.py          # Process group creation
│   ├── clip_grad.py            # Distributed gradient clipping
│   ├── activation_checkpoint/  # SAC + Activation Swap
│   ├── fully_shard/            # FSDP + HSDP (state, param, scheduler, hooks; core hsdp_*.py)
│   └── pipeline_parallel/      # Pipeline stages + micro-batch
└── mindspore/                  # MindSpore backend
    ├── platform.py             # MindSporePlatform(Platform)
    ├── dtensor.py              # DTensorBase (ms.Tensor subclass)
    ├── init_weights.py         # init_on_device context manager
    ├── parameter_init.py       # Parameter initialization with slice_index
    ├── platform_graph.py       # Graph construction utilities
    ├── custom_pass/            # Custom graph passes
    ├── fully_shard/            # FSDP + HSDP (state, param, scheduler, hooks; core hsdp_*.py)
    └── pipeline_parallel/      # Pipeline stages + micro-batch
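The layout above implies one base class with a concrete subclass per backend, chosen at runtime. A minimal sketch of that wiring follows. It is an illustration only: the registry, the `backend` argument, and the `device_name` method are hypothetical, and the real `get_platform()` signature and selection mechanism are not shown in this document.

```python
from abc import ABC, abstractmethod


class Platform(ABC):
    """Stand-in for the base class in platform/platform.py."""

    @abstractmethod
    def device_name(self, index: int) -> str:
        """Hypothetical API: return a printable device identifier."""


class TorchPlatform(Platform):
    def device_name(self, index: int) -> str:
        return f"torch:{index}"


class MindSporePlatform(Platform):
    def device_name(self, index: int) -> str:
        return f"mindspore:{index}"


# Hypothetical registry; the real project may select the backend differently.
_BACKENDS = {"torch": TorchPlatform, "mindspore": MindSporePlatform}
_instances = {}


def get_platform(backend: str = "torch") -> Platform:
    """Return a cached Platform instance for the named backend."""
    if backend not in _BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    if backend not in _instances:
        _instances[backend] = _BACKENDS[backend]()
    return _instances[backend]


def describe_device(index: int, backend: str = "torch") -> str:
    """Platform-agnostic caller: no framework import, only get_platform()."""
    return get_platform(backend).device_name(index)
```

Callers such as `describe_device` depend only on the base-class contract, so the same code path works whichever backend is active.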
How to Use
Call this skill with your task description:
/platform-dev Add a new `scatter()` collective operation to the Platform abstraction
/platform-dev Implement activation swap support for MindSpore backend
/platform-dev Fix the unshard scheduling in torch FSDP to support prefetch
/platform-dev Add a new property to DTensorBase for tracking communication state
Execution Flow
┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│ 1. Scope         │ ──▶ │ 2. Base Class    │ ──▶ │ 3. Backend       │
│    Analysis      │     │    API Design    │     │    Implementation│
│ Identify what    │     │ platform.py      │     │ torch/ + ms/     │
│ needs to change  │     │ abstract method  │     │ concrete impl    │
└──────────────────┘     └──────────────────┘     └─────────┬────────┘
                                                            │
          ┌─────────────────────────────────────────────────┘
          ▼
┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│ 4. Cross-Platform│ ──▶ │ 5. Testing       │ ──▶ │ 6. Git Commit    │
│    Verification  │     │    UT + ST       │     │    & PR Creation │
│ Parity check     │     │ Both backends    │     │ Call autogit     │
└──────────────────┘     └──────────────────┘     └──────────────────┘
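Steps 2 and 3 of the flow can be sketched for the `scatter()` task listed under How to Use. This is a hedged illustration, not the project's actual code: the method signature is an assumption modelled on `torch.distributed.scatter`, the classes are stand-ins for the real ones, and the MindSpore side is deliberately left unimplemented to show the NotImplementedError path.

```python
from abc import ABC, abstractmethod


class Platform(ABC):
    """Step 2: declare the API on the base class first."""

    @abstractmethod
    def scatter(self, output, scatter_list=None, src=0, group=None):
        """Scatter scatter_list from rank src; each rank receives into output."""


class TorchPlatform(Platform):
    """Step 3: concrete torch implementation."""

    def scatter(self, output, scatter_list=None, src=0, group=None):
        # Lazy import keeps the framework out of module import time.
        import torch.distributed as dist  # pylint: disable=C0415

        dist.scatter(output, scatter_list, src=src, group=group)
        return output


class MindSporePlatform(Platform):
    """Step 3: backend without an implementation yet fails loudly."""

    def scatter(self, output, scatter_list=None, src=0, group=None):
        raise NotImplementedError("scatter is not implemented for MindSpore")
```

Because the torch import sits inside the method, constructing `TorchPlatform()` does not require torch to be importable; only calling `scatter()` does.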
Workflow Execution Checklist
Key Decision Points
| Decision | Criteria | Options | Impact |
|---|---|---|---|
| Change Scope | New API vs modifying existing | New abstract method / Modify existing / Internal only | Files affected, backward compat |
| Backend Priority | Which backend first | Torch first / MindSpore first / Both together | Development order |
| Feature Parity | Both backends needed? | Full parity / One backend + NotImplementedError | Test coverage |
| Stream Sync | Async operations involved? | Sync / Async with handle / Event-based | Correctness risk |
| Memory Pattern | Buffer management needed? | resize_(0) / Reuse / Allocate new | Memory efficiency |
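The Feature Parity decision can be backed by a small automated check. The sketch below assumes the base class is built on Python's `abc` module (this document calls its methods abstract but does not confirm the mechanism). The helper name and the demo classes are hypothetical.

```python
from abc import ABC, abstractmethod


def missing_overrides(backend_cls):
    """Names of abstract methods that backend_cls has not overridden."""
    return sorted(getattr(backend_cls, "__abstractmethods__", frozenset()))


class Platform(ABC):
    @abstractmethod
    def all_reduce(self, tensor, group):
        """Reduce tensor across group."""

    @abstractmethod
    def scatter(self, tensor, group):
        """Scatter tensor across group."""


class CompleteBackend(Platform):
    def all_reduce(self, tensor, group):
        return tensor

    def scatter(self, tensor, group):
        # Declared but unsupported: still counts as "considered".
        raise NotImplementedError("scatter is not supported on this backend")


class PartialBackend(Platform):
    def all_reduce(self, tensor, group):
        return tensor


print(missing_overrides(CompleteBackend))
print(missing_overrides(PartialBackend))
```

The check only reports methods a backend never overrode. An override that raises NotImplementedError passes, which matches the "one backend + NotImplementedError" option in the table.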
Quick Reference
See references/quick-reference.md for:
- File location guide
- Platform API categories
- Cross-platform type mapping
- Common patterns and anti-patterns
See references/architecture.md for:
- Platform abstraction design
- DTensorBase dispatch mechanism
- FSDP/HSDP state lifecycle
- Stream synchronization patterns
- Memory management patterns
Hard Rules
- Never import torch/mindspore directly in platform-agnostic code; use get_platform() instead
- New Platform APIs must be added to the base class first (platform/platform.py)
- Both backends must be considered: implement the API or raise NotImplementedError
- Cross-platform type differences: torch uses torch.device where mindspore uses str; torch uses ProcessGroup where mindspore uses str group names
- Lazy backend imports: in platform/torch/ and platform/mindspore/, import framework modules lazily inside methods and add # pylint: disable=C0415. Non-platform code uses module-top imports (see code-style.md)
- Call handle.wait() before reading the output of an async collective
- Use event.record(src), then event.wait(dst), for cross-stream dependencies
- Use resize_(0) to free device memory; never access freed storage
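The last three rules are about ordering, sketched below with duck-typed helpers so the sequence is visible without a GPU. The helper names are hypothetical. On the torch backend, `work` would be the handle a `torch.distributed` collective returns when called with `async_op=True`, `event` a `torch.cuda.Event`, and `tensor` a `torch.Tensor`; the MindSpore equivalents are not shown here.

```python
def read_after_wait(work, output):
    """Wait on an async collective's handle before touching its output."""
    work.wait()
    return output


def wait_on_stream(event, src_stream, dst_stream):
    """Make dst_stream wait for work already queued on src_stream."""
    event.record(src_stream)  # mark the current point in src_stream
    event.wait(dst_stream)    # later work on dst_stream waits for that point


def free_storage(tensor):
    """Release the tensor's device memory; the tensor object stays alive."""
    tensor.untyped_storage().resize_(0)
    # The data must not be read again until the storage is resized back.
```

Keeping each rule in one small helper gives call sites a single place where the ordering is enforced, rather than repeating the wait/record sequence inline.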
Related Skills
| Skill | When to Use |
|---|---|
| code-review | After implementation, review for distributed correctness |
| autogit | Commit, push, create PR |
| dist-op-dev | When implementing distributed operator support (not platform layer) |