vllm-ascend-release
// End-to-end release management skill for vLLM Ascend. Creates release checklist issues, identifies critical bugs, runs functional tests, invokes release note generation, and guides through the complete release process.
| name | vllm-ascend-release |
| description | End-to-end release management skill for vLLM Ascend. Creates release checklist issues, identifies critical bugs, runs functional tests, invokes release note generation, and guides through the complete release process. |
This skill manages the complete end-to-end release process for vLLM Ascend, from creating the release checklist issue to final release announcement. It automates repetitive tasks while ensuring human oversight at critical decision points.
Use this skill when:
- GitHub CLI (`gh`) authenticated with write access to vllm-project/vllm-ascend
- `uv` for running scripts

Before starting the release process, verify that the gh CLI is installed and authenticated:
# Check if gh is installed
gh --version
# If not installed, install gh CLI:
# Ubuntu/Debian
apt install gh -y
# macOS
brew install gh
# OpenEuler
yum install gh -y
# Check authentication status
gh auth status
# If not authenticated, login with:
gh auth login
Expected output for gh auth status:
github.com
✓ Logged in to github.com account <username> (keyring)
- Active account: true
- Git operations protocol: https
- Token: gho_****
- Token scopes: 'gist', 'read:org', 'repo', 'workflow'
Required scopes: repo (for creating issues, PRs, releases) and workflow (for triggering CI workflows).
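The scope check above can also be scripted. The following is a minimal sketch, assuming the `gh auth status` output contains a `Token scopes:` line in the format shown above; the helper name `missing_scopes` is hypothetical and not part of this skill's scripts.

```python
import re

# Scopes this skill needs, as stated above.
REQUIRED_SCOPES = {"repo", "workflow"}


def missing_scopes(status_text: str, required=REQUIRED_SCOPES) -> set[str]:
    """Return the required scopes that are absent from the 'Token scopes' line."""
    match = re.search(r"Token scopes:\s*(.+)", status_text)
    if match is None:
        # No scope line found: treat every required scope as missing.
        return set(required)
    granted = set(re.findall(r"'([^']+)'", match.group(1)))
    return set(required) - granted


sample = """github.com
  ✓ Logged in to github.com account <username> (keyring)
  - Token scopes: 'gist', 'read:org', 'repo', 'workflow'
"""
print(missing_scopes(sample))  # prints set()
```

When capturing real output with `subprocess.run(["gh", "auth", "status"], capture_output=True, text=True)`, it may be safest to inspect both `stdout` and `stderr`, since the stream used can differ between gh versions.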
┌─────────────────────────────────────────────────────────────────────────────┐
│ vLLM Ascend Release Process │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Phase 1: Initialization │
│ ├── Determine version & branch │
│ ├── Create feedback issue │
│ └── Create release checklist issue │
│ │
│ Phase 2: Bug Triage │
│ ├── Scan open bugs │
│ ├── Identify release-blocking bugs │
│ └── Update checklist with bug list │
│ │
│ Phase 3: PR Management │
│ ├── Identify must-merge PRs │
│ └── Update checklist with PR list │
│ │
│ Phase 4: Test Coverage Analysis │
│ ├── Scan PRs for features/models without tests │
│ ├── Check previous feedback issue status │
│ └── Update checklist with items needing manual testing │
│ │
│ Phase 5: Nightly Status │
│ ├── Get latest Nightly-A3 and Nightly-A2 runs │
│ ├── Analyze failures with extract_and_analyze.py │
│ └── Update checklist with nightly status table │
│ │
│ Phase 6: Release Notes (invoke existing skill) │
│ ├── Generate release notes via vllm-ascend-release-note-writer │
│ └── Create release notes PR │
│ │
│ Phase 7: Documentation & Artifacts │
│ └── Update version references (Docker/wheel built by CI automatically) │
│ │
│ Phase 8: Release Execution (requires human review) │
│ ├── Human review & approval │
│ ├── Merge release notes PR │
│ ├── Create GitHub release │
│ └── Verify automated pipelines (PyPI, Docker, ReadTheDocs) │
│ │
│ Phase 9: WeChat Article (微信公众号推文) │
│ ├── Collect release statistics (commits, contributors) │
│ ├── Generate WeChat article from template │
│ └── Review and publish to WeChat official account │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Prompt the user for:
- Release version (e.g., v0.15.0rc1, v0.15.0)
- Release branch (e.g., main)
- Target release date (e.g., 2026.03.15)

# Get the latest release tag
gh release list --repo vllm-project/vllm-ascend --limit 5
# Or check existing tags
git tag --sort=-creatordate | head -10
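If the tag list needs to be ordered by version rather than by creation date, a small helper can do it. This is a sketch under the assumption that tags follow the `vX.Y.Z` / `vX.Y.ZrcN` pattern seen in the examples above; `version_key` is a hypothetical name, and other tag shapes are rejected rather than guessed.

```python
import re

# Assumed tag shape: vMAJOR.MINOR.PATCH with an optional rcN suffix.
TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?$")


def version_key(tag: str):
    """Sort key that places a final release after its release candidates."""
    m = TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"unrecognized tag: {tag}")
    major, minor, patch, rc = m.groups()
    rc_rank = float("inf") if rc is None else int(rc)
    return (int(major), int(minor), int(patch), rc_rank)


tags = ["v0.14.0", "v0.15.0rc1", "v0.14.0rc1", "v0.15.0rc2"]
print(max(tags, key=version_key))  # prints v0.15.0rc2
```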
Create a community feedback issue for the release:
gh issue create --repo vllm-project/vllm-ascend \
--title "[Feedback]: v${VERSION} Release Feedback" \
--body "$(cat templates/feedback-issue-template.md)" \
--label "feedback"
Use the template in templates/release-checklist-template.md:
# Generate the checklist from template
python scripts/generate_checklist.py \
--version ${VERSION} \
--branch ${BRANCH} \
--date ${DATE} \
--manager ${MANAGER} \
--feedback-issue ${FEEDBACK_ISSUE_NUMBER} \
--output release-checklist.md
# Create the issue
gh issue create --repo vllm-project/vllm-ascend \
--title "[Release]: Release checklist for ${VERSION}" \
--body-file release-checklist.md \
--label "release"
Run the issue scanning script to browse all issues since the last release:
python scripts/scan_release_bugs.py \
--repo vllm-project/vllm-ascend \
--since-tag ${LAST_VERSION} \
--output issue-scan.md
The script:
The output is designed for quick human review:
Issues are automatically flagged when they have:
- bug, regression, blocker, priority:high, critical

After manual review, add important bugs to the release checklist:
python scripts/update_checklist_section.py \
--issue-number ${CHECKLIST_ISSUE} \
--section "Bug need Solve" \
--content-file bug-list.md
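The automatic flagging described above could look roughly like the sketch below. It assumes the listed terms are issue labels and that issues arrive in the shape produced by `gh issue list --json number,title,labels`; the actual logic in `scan_release_bugs.py` may differ, and the sample issues are made up for illustration.

```python
# Terms from the list above, treated here as label names (an assumption).
FLAG_LABELS = {"bug", "regression", "blocker", "priority:high", "critical"}


def flagged(issues):
    """Yield (issue, matched_labels) for issues carrying any flag label."""
    for issue in issues:
        names = {label["name"].lower() for label in issue.get("labels", [])}
        hits = names & FLAG_LABELS
        if hits:
            yield issue, sorted(hits)


# Hypothetical sample data, not real issues.
issues = [
    {"number": 1, "title": "Crash on startup",
     "labels": [{"name": "bug"}, {"name": "critical"}]},
    {"number": 2, "title": "Docs typo",
     "labels": [{"name": "documentation"}]},
]
for issue, hits in flagged(issues):
    print(f"#{issue['number']} {issue['title']} [{', '.join(hits)}]")
# prints: #1 Crash on startup [bug, critical]
```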
Scan for PRs that should be included in the release:
# 1. [Priority] List open PRs/issues in the release milestone
gh pr list --repo vllm-project/vllm-ascend \
--state open \
--search "milestone:${VERSION}" \
--json number,title,url,labels
gh issue list --repo vllm-project/vllm-ascend \
--state open \
--search "milestone:${VERSION}" \
--json number,title,url,labels
# 2. List open PRs with release-related labels
gh pr list --repo vllm-project/vllm-ascend \
--state open \
--label "release-blocker" \
--json number,title,url
# 3. List PRs merged since last release
gh pr list --repo vllm-project/vllm-ascend \
--state merged \
--search "merged:>${LAST_RELEASE_DATE}" \
--json number,title,mergedAt
Priority Order:
- release-blocker label - critical items that must be merged

Update the checklist with PRs that need to be merged:
python scripts/update_checklist_section.py \
--issue-number ${CHECKLIST_ISSUE} \
--section "PR need Merge" \
--content-file pr-list.md
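One possible way to produce `pr-list.md` from the JSON returned by the `gh pr list --json number,title,url` commands above is sketched here. The checklist layout and the helper name `render_pr_checklist` are assumptions, and the sample PRs are placeholders.

```python
import json


def render_pr_checklist(pr_json: str) -> str:
    """Turn `gh pr list --json number,title,url` output into checklist lines."""
    prs = sorted(json.loads(pr_json), key=lambda pr: pr["number"])
    return "".join(
        f"- [ ] #{pr['number']} {pr['title']} ({pr['url']})\n" for pr in prs
    )


# Placeholder data standing in for real gh output.
sample = json.dumps([
    {"number": 202, "title": "Fix hypothetical crash",
     "url": "https://example.invalid/pull/202"},
    {"number": 101, "title": "Add hypothetical feature",
     "url": "https://example.invalid/pull/101"},
])
print(render_pr_checklist(sample), end="")
```

The result can be written to `pr-list.md` and passed to `update_checklist_section.py` via `--content-file`.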
CI already covers most test cases. Manual testing is only needed for:
Run the test coverage scanner:
python scripts/scan_test_coverage.py \
--repo vllm-project/vllm-ascend \
--since-tag ${LAST_VERSION} \
--feedback-issue ${PREVIOUS_FEEDBACK_ISSUE} \
--output test-coverage-analysis.md
This script:
The output categorizes items:
Features/Models Needing Manual Testing:
Previous Feedback Status:
For items identified above, perform manual testing:
#### Manual Testing Required
- [ ] Model: Kimi K2.5 - Basic inference works
- [ ] Model: GLM-5 - Multimodal features work
- [ ] Feature: Expert parallel with 8 NPUs
- [ ] Feedback: User reported slow startup (verify fixed)
python scripts/update_checklist_section.py \
--issue-number ${CHECKLIST_ISSUE} \
--section "Functional Test" \
--content-file test-results.md
Get the latest Nightly-A3 and Nightly-A2 CI runs and analyze failures:
python scripts/scan_nightly_status.py \
--repo vllm-project/vllm-ascend \
--output nightly-status.md
This script:
- extract_and_analyze.py (from main2main-error-analysis skill) for failed runs

The output includes:
| Workflow | Status | Failed Jobs | Code Bugs | Env Flakes | Run |
|---|---|---|---|---|---|
| Nightly-A3 | ✅ success | 0/15 | 0 | 0 | #123 |
| Nightly-A2 | ❌ failure | 3/12 | 2 | 1 | #124 |
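For reference, a row of the table above could be rendered as in the following sketch. The function name and parameters are hypothetical; only the column layout is taken from the table.

```python
def status_row(workflow, conclusion, failed, total, code_bugs, env_flakes, run_number):
    """Format one nightly-status table row in the column order shown above."""
    icon = "✅" if conclusion == "success" else "❌"
    return (
        f"| {workflow} | {icon} {conclusion} | {failed}/{total} "
        f"| {code_bugs} | {env_flakes} | #{run_number} |"
    )


# Values copied from the example table.
print(status_row("Nightly-A2", "failure", 3, 12, 2, 1, 124))
# prints: | Nightly-A2 | ❌ failure | 3/12 | 2 | 1 | #124 |
```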
For failed runs, it also shows:
python scripts/update_checklist_section.py \
--issue-number ${CHECKLIST_ISSUE} \
--section "Nightly Status" \
--content-file nightly-status.md
This phase handles the complete release notes writing process, from fetching commits to producing the final release notes.
Fetch all commits between the previous and current version:
# Create output directory
mkdir -p output/${VERSION}
# Fetch commits with contributor statistics
uv run python scripts/fetch_commits.py \
--owner vllm-project \
--repo vllm-ascend \
--base-tag ${LAST_VERSION} \
--head-tag ${NEW_VERSION} \
--stats \
--output output/${VERSION}/0-current-raw-commits.md \
--stats-output output/${VERSION}/0-contributor-stats.md
The script outputs:
- 0-current-raw-commits.md: Raw commit list for analysis
- 0-contributor-stats.md: Contributor statistics including new contributors

Create a CSV file to analyze each commit:
# Create analysis workspace
touch output/${VERSION}/1-commit-analysis-draft.csv
The CSV should have headers:
| Column | Description |
|---|---|
| title | Commit title |
| pr number | PR number |
| user facing impact/summary | What users should know |
| category | Highlights/Features/Performance/etc. |
| decision | include/exclude/merge |
| reason | Why this decision |
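Since `touch` only creates an empty file, the header row still has to be written. A minimal sketch using Python's standard `csv` module follows; the helper name `new_analysis_csv` is hypothetical, and the column names are taken from the table above.

```python
import csv
import io

# Column names from the table above.
HEADERS = [
    "title",
    "pr number",
    "user facing impact/summary",
    "category",
    "decision",
    "reason",
]


def new_analysis_csv(rows=()):
    """Return CSV text with the header row followed by any given rows (dicts)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=HEADERS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


print(new_analysis_csv().strip())
# prints: title,pr number,user facing impact/summary,category,decision,reason
```

The returned text can be written to `output/${VERSION}/1-commit-analysis-draft.csv`. Using `csv.DictWriter` keeps commit titles that contain commas correctly quoted.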
Create the initial draft following the category order:
## v${VERSION} - ${DATE}
This is the first release candidate of v${VERSION} for vLLM Ascend.
Please follow the [official doc](https://docs.vllm.ai/projects/ascend/en/latest) to get started.
### Highlights
(Top 3-5 most impactful changes)
### Features
(New functionality)
### Hardware and Operator Support
(New hardware/operators)
### Performance
(Performance improvements)
### Dependencies
(Version upgrades)
### Deprecation & Breaking Changes
(Breaking changes)
### Documentation
(Doc updates)
### Others
(Bug fixes, misc)
### Known Issue
(Known limitations)
Save drafts to:
- output/${VERSION}/2-highlights-note-draft.md - Initial draft
- output/${VERSION}/3-highlights-note-edit.md - Reviewed/edited version

Inclusion Criteria:
Writing Tips:
- `gh pr view <number> --repo vllm-project/vllm-ascend`
- [#12345](https://github.com/vllm-project/vllm-ascend/pull/12345)

Reference:
- references/ref-past-release-notes-highlight.md for style examples

After release notes are finalized:
# Create branch
git checkout -b release/${VERSION}
# Make changes (see Phase 7 for the full list of files)
# ...
# Create PR
gh pr create --repo vllm-project/vllm-ascend \
--title "Release ${VERSION}" \
--body "Release notes and version updates for ${VERSION}" \
--label "release"
| File | Update Required |
|---|---|
| README.md | Getting Started version, Branch section |
| README.zh.md | Same as above (Chinese) |
| docs/source/faqs.md | Feedback issue link |
| docs/source/user_guide/release_notes.md | Add new release notes |
| docs/source/community/versioning_policy.md | Compatibility matrix, release window |
| docs/source/community/contributors.md | New contributors |
| docs/conf.py | Package version |
| .github/workflows/schedule_image_build_and_push.yaml | Config |
| .github/workflows/schedule_update_estimated_time.yaml | Config |
python scripts/update_version_references.py \
--version ${VERSION} \
--vllm-version ${VLLM_VERSION} \
--feedback-issue ${FEEDBACK_ISSUE_URL}
Before executing the release, verify:
⚠️ Human Review Required: Before executing the release, ensure all previous phases have been reviewed and approved by the release manager. This step requires explicit human confirmation.
Current Approach (Manual): For now, execute release steps manually through GitHub UI or CLI after human review:
Future Approach (Automated): Once the release process is mature and well-tested, consider:
workflow_dispatch)

Manual Execution Commands (for reference):
# 1. Merge release notes PR (after human review)
gh pr merge ${RELEASE_PR_NUMBER} --repo vllm-project/vllm-ascend --squash
# 2. Create GitHub release
gh release create ${VERSION} \
--repo vllm-project/vllm-ascend \
--title "vLLM Ascend ${VERSION}" \
--notes-file release-notes.md \
--target main
# 3. Verify automated pipelines (no action needed - CI handles these)
# - Docker image: quay.io/ascend/vllm-ascend:${VERSION}
# - PyPI package: https://pypi.org/project/vllm-ascend/${VERSION}
# - ReadTheDocs: https://app.readthedocs.org/dashboard/
# 4. Upload 310P wheel if applicable
gh release upload ${VERSION} \
--repo vllm-project/vllm-ascend \
vllm_ascend-${VERSION}-310p-*.whl
# 1. Broadcast release (prepare announcement)
python scripts/generate_announcement.py \
--version ${VERSION} \
--release-notes release-notes.md \
--output announcement.md
# 2. Close release checklist issue
gh issue close ${CHECKLIST_ISSUE} \
--repo vllm-project/vllm-ascend \
--comment "Release ${VERSION} completed successfully!"
After release notes are finalized and the release is completed, generate a WeChat article for community broadcast.
The WeChat article follows a structured format with emojis for visual appeal:
| Section | Emoji | Description | Recommended Items |
|---|---|---|---|
| Opening Paragraph | 🎉 | Version announcement + positioning + core highlights summary | 1 paragraph |
| Statistics | 🥳 | Number of commits, new contributors | 1 line |
| Core Highlights | 💥 | Top 2-3 most important features/optimizations | 2-3 items |
| New Features | 🆕 | New functionality, models, operators | 3-5 items |
| Performance | 🚀 | Performance improvements (include metrics when available) | 2-4 items |
| Refactoring | 🔨 | Code refactoring, dependency upgrades | 1-3 items |
| Bug Fixes | 🐞 | Important bug fixes | 3-5 items |
| Quality/Testing | 🛡️ | Test coverage, CI/CD improvements | 0-2 items |
| Documentation | 📄 | Documentation updates (can combine into 1 item) | 1 item |
| Links | ➡️ | Source code, quick start, installation guide | 3 links |
vLLM Ascend ${VERSION}版本发布🎉 此版本是针对vLLM v${VLLM_VERSION}系列版本首个RC版本,[1-2句核心亮点描述]。
🥳 本版本共计${COMMITS_COUNT}个commits,新增${NEW_CONTRIBUTORS_COUNT}位新开发者!
💥 [核心亮点1]
💥 [核心亮点2]
🆕 [新特性1]
🆕 [新特性2]
🆕 [新特性3]
🚀 [性能优化1,最好包含具体数据如"提升X%"]
🚀 [性能优化2]
🔨 [重构/依赖升级1]
🔨 [重构/依赖升级2]
🐞 修复 [重要bug1]
🐞 修复 [重要bug2]
🐞 修复 [重要bug3]
🛡️ [质量/测试改进]
📄 [文档更新汇总]
➡️ 源码地址:https://github.com/vllm-project/vllm-ascend/releases/tag/${VERSION}
➡️ 快速体验:https://vllm-ascend.readthedocs.io/en/latest/quick_start.html
➡️ 安装指南:https://vllm-ascend.readthedocs.io/en/latest/installation.html
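The `${...}` placeholders in the template above happen to match the syntax of Python's `string.Template`, so filling them can be sketched as below. Only the statistics line is shown; the commit count of 120 is a placeholder value, not a real statistic.

```python
from string import Template

# One line of the article template above, kept in its original language.
STATS_LINE = Template(
    "🥳 本版本共计${COMMITS_COUNT}个commits,新增${NEW_CONTRIBUTORS_COUNT}位新开发者!"
)

# COMMITS_COUNT=120 is a placeholder for illustration only.
text = STATS_LINE.substitute(COMMITS_COUNT=120, NEW_CONTRIBUTORS_COUNT=9)
print(text)
```

`substitute` raises `KeyError` when a placeholder has no value, which helps catch a forgotten statistic before the article goes out. The bracketed items such as `[核心亮点1]` are not placeholders in this sense and still need to be written by hand.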
Important: WeChat articles are typically published after the release is complete. Always fetch the release note directly from the release tag, as it contains the most accurate and up-to-date information including the precise new contributor count.
# Fetch release note from release tag (recommended - most accurate source)
gh release view ${VERSION} --repo vllm-project/vllm-ascend --json body,name,tagName
# The release body contains:
# - Highlights, Features, Performance, Documentation sections
# - Bug fixes (Others section)
# - Dependencies and Known Issues
# - New Contributors list with exact count
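Counting new contributors from the fetched release body could be sketched as follows. This assumes the body contains a `New Contributors` heading followed by one `* @user ...` bullet per contributor, as in GitHub's auto-generated release notes; if the release body is formatted differently, the pattern needs adjusting. The helper name and sample body are hypothetical.

```python
import json
import re


def count_new_contributors(release_json: str) -> int:
    """Count '@user' bullets under the 'New Contributors' heading of a release body."""
    body = json.loads(release_json)["body"]
    section = re.search(
        r"^#+\s*New Contributors\s*$(.*?)(?=^#+\s|\Z)",
        body,
        flags=re.M | re.S,
    )
    if section is None:
        return 0
    return len(re.findall(r"^\s*[*-]\s+@\S+", section.group(1), flags=re.M))


# Made-up body standing in for `gh release view --json body` output.
sample = json.dumps({"body": (
    "## What's Changed\n"
    "* Fix foo by @alice in https://example.invalid/1\n\n"
    "## New Contributors\n"
    "* @bob made their first contribution in https://example.invalid/2\n"
    "* @carol made their first contribution in https://example.invalid/3\n\n"
    "**Full Changelog**: v1...v2\n"
)})
print(count_new_contributors(sample))  # prints 2
```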
Why use release tag instead of other sources:
Opening Paragraph:
Content Selection:
Language Style:
Statistics from Release Tag:
git rev-list --count ${LAST_VERSION}..${VERSION}

vLLM Ascend v0.18.0rc1版本发布🎉 此版本是针对vLLM v0.18.0系列版本首个RC版本,重点完成了C8(INT8 KV cache)对GQA attention模型的支持,以及性能优化、问题修复等。
🥳 本版本新增9位新开发者,感谢社区开发者的持续贡献!
💥 C8(INT8 KV cache)支持GQA attention模型,同时适配DeepSeek-V3.1 PD分离场景
💥 DeepSeek模型通过新MLA算子支持A5硬件
🆕 Flash Comm V1支持VL模型的MLA,解除多模态服务限制
🆕 支持speculative decoding中target和draft模型使用不同attention backend
🆕 VL MoE模型支持SP,`sp_threshold`替换为vLLM原生`sp_min_token_num`
🆕 Qwen VL模型支持`w8a8_mxfp8`量化
🚀 Triton算子重编译优化,提升算子性能
🚀 Qwen3.5/Qwen3-Next GDN prefill路径优化,预构建chunk metadata减少h2d同步开销
🚀 FIA prefill context merge路径简化,提升运行时效率
🐞 torch-npu 和 triton-ascend 依赖版本更新,请参考官方release note
🐞 修复PD分离场景decode节点因DP节点shape不对齐导致卡住的问题
🐞 修复单卡部署多实例显存 OOM 问题
🐞 修复 DeepSeek v3.1 C8在MTP + full decode + full graph模式下的问题
🐞 修复`AscendModelSlimConfig`中量化配置key映射导致的权重加载报错问题
📄 更新Kimi-K2.5、GLM-4.7、DeepSeek-V3.2、MiniMax-M2.5及PD分离部署文档
➡️ 源码地址:
https://github.com/vllm-project/vllm-ascend/releases/tag/v0.18.0rc1
➡️ 快速体验:
https://vllm-ascend.readthedocs.io/en/v0.18.0/quick_start.html
➡️ 安装指南:
https://docs.vllm.ai/projects/ascend/en/v0.18.0/installation.html
Fetches all commits between two tags and generates contributor statistics.
Arguments:
- `--owner`: Repository owner (default: vllm-project)
- `--repo`: Repository name (default: vllm-ascend)
- `--base-tag`: Base tag (older version, e.g., v0.14.0)
- `--head-tag`: Head tag (newer version, e.g., v0.15.0rc1)
- `--output`: Output file for commits (default: 0-current-raw-commits.md)
- `--stats`: Generate contributor statistics
- `--stats-output`: Output file for statistics (default: 0-contributor-stats.md)
- `--sort`: Sort mode (chronological/alphabetical/reverse)
- `--include-date`: Include commit date in output
- `--token`: GitHub token (or use GITHUB_TOKEN env var)

Output:
Generates the release checklist issue body from template.
Arguments:
- `--version`: Release version (e.g., v0.15.0rc1)
- `--branch`: Release branch (default: main)
- `--date`: Target release date
- `--manager`: Release manager GitHub username
- `--feedback-issue`: Feedback issue number
- `--output`: Output file path

Scans GitHub issues since the last release for human review.
Arguments:
- `--repo`: Repository (default: vllm-project/vllm-ascend)
- `--since-tag`: Previous release tag (including rc versions)
- `--state`: Issue state filter (open, closed, all; default: all)
- `--output`: Output file path

Output: Markdown report with:
Identifies features/models that need manual testing.
Arguments:
- `--repo`: Repository (default: vllm-project/vllm-ascend)
- `--since-tag`: Previous release tag
- `--feedback-issue`: Previous release feedback issue number (optional)
- `--output`: Output file path

Output: Markdown report with:
Scans Nightly CI status for release readiness.
Arguments:
- `--repo`: Repository (default: vllm-project/vllm-ascend)
- `--output`: Output file path

Output: Markdown report with:
Dependencies:
- main2main-error-analysis/scripts/extract_and_analyze.py for detailed analysis

Updates a specific section of the release checklist issue.
Arguments:
- `--issue-number`: Release checklist issue number
- `--section`: Section name to update
- `--content-file`: File containing new content
- `--append`: Append to section instead of replace

Updates version references across documentation files.
Arguments:
- `--version`: New version
- `--vllm-version`: Compatible vLLM version
- `--feedback-issue`: Feedback issue URL

Generates release announcement for broadcasting.
Arguments:
- `--version`: Release version
- `--release-notes`: Release notes file
- `--output`: Output file path

The release checklist issue template (see file for full template).
The feedback collection issue template.
List of files that need version updates and their update patterns.
Past release notes examples for style and category reference. Use this as a guide when writing new release notes to maintain consistency in:
| Issue | Solution |
|---|---|
| GitHub API rate limit | Use authenticated requests, implement backoff |
| Test timeout | Increase timeout, check hardware availability |
| Model not found | Verify model path, check storage |
| CI failure | Check CI logs, retry or fix |
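For the rate-limit row, "implement backoff" could be as simple as the sketch below. The helper `with_backoff` and its parameters are hypothetical, and `RuntimeError` stands in for whatever exception the API client raises on a rate-limit response.

```python
import time


def with_backoff(call, retries=4, base_delay=1.0, retry_on=(RuntimeError,), sleep=time.sleep):
    """Run call(); on a retryable error wait base_delay * 2**attempt, then retry."""
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise  # Out of retries: surface the last error.
            sleep(base_delay * 2 ** attempt)


# Demo: a call that fails twice before succeeding; delays are recorded, not slept.
attempts = {"n": 0}
delays = []


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"


print(with_backoff(flaky, sleep=delays.append), delays)  # prints ok [1.0, 2.0]
```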
If the release process fails midway:
Human Oversight: This skill automates tasks but requires human approval at key decision points (bug prioritization, test results review, release approval).
Idempotency: Most scripts can be re-run safely. Issue updates use section replacement.
Rollback: If a release needs to be rolled back:
Communication: Keep the community informed through the feedback issue and release checklist.
Testing: Always run functional tests before release, even for RC versions.
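The section replacement mentioned under Idempotency can be illustrated with the sketch below. It assumes checklist sections are Markdown headings followed by their content; the real `update_checklist_section.py` may be implemented differently, and `replace_section` is a hypothetical name.

```python
import re


def replace_section(body: str, section: str, content: str) -> str:
    """Replace the text under the heading named `section`, up to the next heading."""
    pattern = re.compile(
        rf"(^#+[ \t]*{re.escape(section)}[ \t]*\n)(.*?)(?=^#+[ \t]|\Z)",
        flags=re.M | re.S,
    )
    # A function replacement avoids backslash-escape surprises in `content`.
    new_body, count = pattern.subn(
        lambda m: m.group(1) + content.rstrip("\n") + "\n\n", body, count=1
    )
    if count == 0:
        raise KeyError(f"section not found: {section}")
    return new_body


body = "### Bug need Solve\nold text\n\n### PR need Merge\n- x\n"
once = replace_section(body, "Bug need Solve", "- [ ] #1 placeholder bug")
twice = replace_section(once, "Bug need Solve", "- [ ] #1 placeholder bug")
print(once == twice)  # prints True
```

Because the whole section is overwritten rather than appended to, re-running the update with the same content leaves the issue body unchanged.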