| name | arc-releasing |
| description | Use this skill whenever the user (an arcforge contributor) says they want to bump arcforge's version, cut a release, "ship vX.Y.Z", "準備發版", "ready to release", or any equivalent intent on the arcforge repo itself — even if they don't use the word "release". Runs the canonical release workflow: pre-flight checks → vault ingest → outdated-doc audit → CHANGELOG → 9-file version bump (incl. website) → commit/push/PR → post-merge tag. Contributor-only; do NOT trigger inside projects that merely install arcforge as a plugin. |
arc-releasing
You are helping an arcforge contributor ship a new version. The goal of this skill is zero silent drift between the canonical version files, the CHANGELOG, shipped documentation, and the Obsidian vault knowledge base. Those four surfaces get out of sync surprisingly easily — a prior release (v1.4.0) discovered that marketplace.json had been stuck two versions behind, and v1.4.1 discovered the observer daemon and the JS side had diverged state roots. The checklist below is designed to catch exactly those classes of drift before they ship.
This is a project-local, contributor-only skill. It lives in .claude/skills/ (not the shipped skills/ directory) because releasing arcforge is a maintainer activity, not a user activity. If you see this skill trigger inside a project that merely installs arcforge, something is wrong — stop and tell the user.
Pre-Flight (before touching anything)
Never start the release workflow on a broken branch.
npm run lint — exit code 0 (warnings acceptable, errors are not)
npm test — all 4 runners green
git status clean of unrelated work-in-progress. Untracked lock files or editor droppings that belong in .gitignore must be addressed separately, never folded into the release commit
git log main..HEAD --oneline — verify the commits listed match the intended release scope
If any pre-flight fails, stop and tell the user. A broken release is worse than a delayed one, because it ships to users through the marketplace cache and is painful to recall.
Semver Decision — Always Ask the User
Do NOT decide semver level unilaterally. After pre-flight, read git log main..HEAD --oneline and any design docs the branch touched, then present the user with the commit list plus a recommendation and ask them to confirm:
Here's what's on this branch since last release:
<commit list>
Recommendation: patch / minor / major
Reasoning: <why>
Confirm the level before I proceed.
Guidelines for your recommendation:
| Change shape | Suggest |
|---|
| Backward-compatible fixes, refactors, doc-only updates | patch (1.x.y → 1.x.y+1) |
| New skill, new CLI command, new backward-compatible feature | minor (1.x.y → 1.x+1.0) |
| Breaking change (removed/renamed shipped API, hook contract change, CLI flag rename) | major (x.y.z → x+1.0.0) |
Branch prefix is a hint, not a rule: fix/* usually patches but can minor-bump if it added something. The final call is always the user's — your job is to make the call easy to judge, not to make it for them.
The Checklist
Do these in order. Each step depends on the previous one being correct.
1. Ingest the release into the Obsidian vault
Invoke /arcforge:arc-maintaining-obsidian in ingest mode. Scope depends on the release shape — always propose scope before bulk-processing, because ingest is the most expensive step in the workflow:
| Release shape | Ingest scope |
|---|
Patch with an architectural decision inside (e.g., v1.4.1's ~/.arcforge/ consolidation) | Decision note + refresh Source notes whose content substantively changed + propagate + index/log + daily note |
| Patch with only small fixes | Skip, or just append a single log.md entry |
| Minor release (new skill or feature) | Full sync: new Source notes for new skills/guides, Decision notes for any architectural shifts, update affected MOCs, propagate cross-refs |
| Major release | Everything above, plus update MOC-ArcForge.md to reflect the new surface |
Ingest before the version bump. Once the version flips, reconstructing the "why this release existed" narrative for the vault becomes harder — git shows what changed, but the reasoning context has moved on.
2. Audit outdated documents in shipped surface
Branches that do migrations often miss a file or two. Grep for stale patterns that the branch was supposed to eliminate. The specific patterns depend on what the release contained — examples from recent releases:
grep -rn "~/\.claude/instincts\|~/\.claude/diaryed\|~/\.claude/observations" skills/ docs/guide/ .claude-plugin/ hooks/
grep -rn "\.arcforge-worktrees/" skills/ docs/guide/ .claude-plugin/
grep -rn "<old-version>" skills/ docs/guide/
Also check for renamed helpers, removed CLI flags, or deprecated config keys that the SKILL.md / rules / guides still mention.
Install docs and platform READMEs get missed every release because they sit off the main code path — "what did this release touch" diffs usually light up scripts/, skills/, docs/guide/, not .codex/ / .gemini/ / .opencode/ / docs/README.*.md. Audit all six every release:
| File | Why it matters |
|---|
.codex/INSTALL.md, .gemini/INSTALL.md, .opencode/INSTALL.md | Fetched and executed by the quick-install one-liner — broken commands ship silently to every new user |
docs/README.codex.md, docs/README.gemini.md, docs/README.opencode.md | Human-facing platform guides |
Things to verify, with reasoning:
- No hardcoded skill/symlink counts. Values like "24 symlinks" drift every time a skill ships. Replace with invariants ("one symlink per skill") — a description that stays true across all releases needs no maintenance.
- No stale path references. If this release moved anything (state dirs, worktree paths, config locations), greps from the examples above apply here too.
- Sibling parity, with judgment. When Gemini/Codex/OpenCode docs diverge, ask whether the asymmetry is intentional before "fixing" it. Windows shell variants (cmd/PowerShell/Git Bash) being uneven is usually real drift; a missing Tool Mapping table in the Codex doc is intentional (Codex's tool vocabulary isn't publicly documented). Don't blindly force parity — verify why they differ first.
- Pre-v1.0.0: no "Migrating from old paths" sections. Before the first public release, migration sections describe fictional users — there is no prior published version they could be migrating from. Keep only the latest setup until after v1.0.0 ships.
Never rewrite past CHANGELOG.md entries. They are history, and downstream users, the vault's Decision notes, and git log vPREV..vCURRENT workflows all depend on them being stable. If a past entry turns out wrong, add a correction inside the new release's entry. Stealth edits break provenance.
Out-of-scope for this audit (do not modify):
docs/plans/* — design history
.claude/rules/* — contributor rules, not shipped surface
- Tests that deliberately blacklist old patterns — they should still reference the old string, that's how they enforce the new convention
3. Benchmark-freshness gate (required for any release that touched eval-backed surface)
A release that changes skill behavior must ship a benchmark that reflects this release, not the last one. A stale benchmark silently asserts that behavior nothing has re-measured is still passing — exactly the failure mode eval exists to prevent.
Regenerating the benchmark is a live-eval step. The grading is done by LLM graders, not by code in this skill or in CI's npm test. You (or a dedicated CI live-eval job) must run the regeneration manually before tagging; the skill cannot run it for you. Treat this section as the gate you must clear, not a command the skill executes.
-
Regenerate the benchmark against the release branch (manual/CI live-eval run — see skills/arc-evaluating/SKILL.md for the regeneration procedure). This refreshes evals/benchmarks/latest.json and evals/benchmarks/raw/latest.json.
Path note: the canonical report location is evals/benchmarks/ (latest.json + raw/latest.json), each carrying a top-level generated ISO-8601 timestamp. Older notes that say evals/reports/latest.json are referring to this same artifact under its prior name.
-
Assert freshness. The generated timestamp in both evals/benchmarks/latest.json and evals/benchmarks/raw/latest.json must be newer than the previous release tag's commit date. If the timestamp predates the last tag, the benchmark was never re-run for this release — stop and regenerate.
PREV_TAG=$(git tag --sort=-version:refname | head -1)
PREV_TAG_DATE=$(git log -1 --format=%cI "$PREV_TAG")
GEN=$(node -p "require('./evals/benchmarks/latest.json').generated")
RAW_GEN=$(node -p "require('./evals/benchmarks/raw/latest.json').generated")
echo "prev tag: $PREV_TAG_DATE | latest: $GEN | raw: $RAW_GEN"
-
No unclassified failing rows on changed scenarios. For every scenario that changed since the last release, confirm the raw report has no failing row left unclassified (every fail must carry a known failure category, not a blank or unknown classification). An unclassified failing row means a regression nobody triaged — that is a release blocker until it is either classified as a known/accepted gap or fixed.
If either freshness or the unclassified-failure check fails, stop and tell the user. Do not write the CHANGELOG or bump the version against a stale or untriaged benchmark — the release would ship a behavior claim that the eval surface contradicts.
4. Update CHANGELOG.md
Insert a new section at the top, under the header block, before the previous release:
## [X.Y.Z] - YYYY-MM-DD
### Fixed
- ...
### Changed
- ...
### Added
- ...
### Removed
- ...
Include only sections that have entries. Order: Fixed → Changed → Added → Removed.
Write narrative, not file lists. The reader of this entry six months from now needs to know: what broke, why it broke, how the fix works, and what they can now do (or stop worrying about) as a result. "Updated session-utils.js" is useless. "Diary enricher had silently failed for 30 days because Claude Code v2.1.78+ blocks nested Writes inside ~/.claude/ — moved state to ~/.arcforge/, 91 stubs now enrich" is reference-grade. The release.yml workflow extracts this exact ## [X.Y.Z] section verbatim into the GitHub Release body when the tag is pushed (it slices from the version header to the next ## [ header), so this is the text users read on the GitHub release page — treat it as a user-facing artifact. The release job fails if no matching CHANGELOG section exists, which enforces the "no bump without CHANGELOG entry" rule below.
5. Bump the version in all 9 canonical locations
| File | Where in the file |
|---|
package.json | top-level "version" field |
.claude-plugin/plugin.json | top-level "version" field (canonical per .claude/rules/plugin.md) |
.claude-plugin/marketplace.json | plugins[0].version |
.opencode/plugins/arcforge.js | version: property inside the default export |
README.md | version badge URL (shields.io, near line 3) |
website/page/hero.jsx | hero version label (vX.Y.Z near top of component) |
website/page/sections.jsx | footer line (MIT · vX.Y.Z · By ...) |
website/page/hero.js | babel build artifact — regenerate, do not hand-edit |
website/page/sections.js | babel build artifact — regenerate, do not hand-edit |
After editing the two .jsx files, regenerate the .js artifacts:
npm run build:website
The babel output is committed to the repo (no separate publish pipeline reads it from a build cache), so .jsx and .js must match in the same commit. Don't hand-edit the .js files — the babel transform also touches surrounding output and an out-of-band edit drifts from what build:website would produce next time.
Verify with a single grep after bumping + building:
grep -rn "X\.Y\.Z" package.json .claude-plugin/ .opencode/plugins/arcforge.js README.md website/page/
Expect exactly 9 hits. Fewer means a split-brain bump (dangerous — different platforms or the website disagree about the current version). More means a stale copy elsewhere that also needs attention.
For an authoritative pass/fail that compares every location against the canonical plugin.json version, run npm run check:versions (zero-dep scripts/check-version-sync.js). It prints a location → version table and exits non-zero on any drift. The same check runs in CI and gates release.yml before the GitHub Release is created, so a drifted bump fails the release rather than shipping silently.
package-lock.json top-level "version" is known-stale at an older value. Leave it unless you're doing a dedicated lockfile refresh; never combine that with a release commit, since mixed diffs make rollback painful.
For releases that change shipped surface area (new skill, removed CLI flag, new marketing claim), also audit the website content — website/page/hero.jsx, sections.jsx, and sdd.jsx carry the project framing. Patch releases usually just need the version label bumped; minor/major releases often need copy adjustments too. Confirm with the user before rewriting hero copy or feature lists.
6. Commit, push, open PR
- Commit message:
chore(release): vX.Y.Z with a brief body summarizing scope
- Stage exactly the 10 release files (9 version locations +
CHANGELOG.md). Avoid git add -A — it tends to pull in lock files, editor droppings, and workspace metadata
git push -u origin <branch>
gh pr create with a test-plan checklist in the body: 4 runners green, lint green, secret scan clean, canonical 9-location grep returned exactly 9 hits
7. After PR merges to main — tag it
Arcforge has tagged every release since v1.0.0. Skipping a tag breaks the git log vPREV..HEAD workflow that the next release relies on to scope its CHANGELOG.
git checkout main && git pull
git tag -a vX.Y.Z -m "Release vX.Y.Z"
git push origin vX.Y.Z
If the user is merging via GitHub UI (squash or merge), run the tag commands against main after the merge completes — the merge commit on main is what represents the release on the main timeline, not the source branch's tip.
Things That Are Easy to Forget
These are the steps that get skipped when a contributor is in a hurry. The skill's job is to surface them even when the user doesn't ask:
- Ingest before bump. Once the version flips, the "why" narrative is harder to reconstruct for the vault. That's why it's step 1, not step 5.
.opencode/plugins/arcforge.js. The only non-JSON, non-README version location. It doesn't pattern-match "config file" in a search, so it gets missed.
- Website version labels + babel rebuild.
website/page/hero.jsx and sections.jsx both carry the version, and the committed .js siblings must be regenerated via npm run build:website to match — easy to miss because the website looks like a "doc only" surface but .jsx ≠ .js in a single commit is a real defect.
- README badge URL. The shields.io badge is image-cached; stale numbers visually persist even after every other file is correct. Worth an extra explicit mention.
- The six install-surface files.
.codex/INSTALL.md, .gemini/INSTALL.md, .opencode/INSTALL.md plus the three docs/README.{codex,gemini,opencode}.md files are sibling-flat and don't light up in normal "what did this release touch" diffs. Step 2 covers the audit details.
- Secret scan. Release commits are large diffs.
git diff --cached | grep -iE "api[_-]?key|token|secret|password" before pushing. The cost of a false positive is low; the cost of a committed secret is very high.
- Daily note append. After the release ships,
obsidian daily:append with a one-line release summary so the release is preserved in the vault's chronological log, not only in log.md.
- The post-merge tag. Merging the PR does not auto-tag. This is the single most commonly skipped step.
Anti-Patterns (from real arcforge release incidents)
- Silent version drift — v1.4.0 discovered
.claude-plugin/marketplace.json had been stuck two versions behind. The 9-location grep is designed to catch exactly this. v3.0.1 expanded the surface to include website/page/{hero,sections}.{jsx,js} after the website was found to have been silently bumped manually during v3.0.0.
- Version bump without CHANGELOG entry — the marketplace release cache is version-keyed. A bump with no CHANGELOG entry ships to users who have no way to tell what changed. The checklist order (CHANGELOG before version bump) enforces pairing them.
- Editing past CHANGELOG entries — downstream users and vault Decision notes depend on past entries being stable. Add corrections to the current entry; never stealth-edit the past.
- Partial bump shipped — bumping a subset of the 9 locations produces a release where Claude Code, OpenCode, the marketplace JSON, or the website disagree about the current version. Always use the 9-location grep as a post-bump gate.
- Mixing release commit with other work —
chore(release): vX.Y.Z should be only the 6 release files. Unrelated fixes bundled in make bisect and rollback painful. Commit work-in-progress separately before the release commit.
- Skipping the post-merge tag — without the tag, the next release can't use
git log vPREV..HEAD to scope its CHANGELOG. Missing tags cause the next release to either drop entries or include already-shipped ones.
After the Release
- If a new skill was added, verify it appears in
docs/guide/skills-reference.md and in MOC-ArcForge-Skills.md in the vault (the ingest step should have caught this, but a final visual check is cheap)
gh issue list --search "X.Y.Z" to find issues the release resolves and close them
- If this was a minor or major release, consider whether any evals need regenerated baselines for the new surface area (see
.claude/rules/eval.md)