skill-review
// Monthly review of skills for staleness, drift, and gaps. Use when skills feel out of sync or on first Friday of month. "skill review", "review skills", "skill audit"
| name | skill-review |
| description | Monthly review of skills for staleness, drift, and gaps. Use when skills feel out of sync or on first Friday of month. "skill review", "review skills", "skill audit" |
| effort | high |
| user_invocable | true |
Periodic audit to catch skill drift, identify gaps, and prune unused skills.
Before forming any drop, demote, or merge hypothesis on a candidate skill, read the full SKILL.md end-to-end, including queues, completed-mines tables, cross-references, and "see also" sections. The catalog is intentionally rich; assume each skill is load-bearing until the SKILL.md proves otherwise. The recurring failure mode (codified 2026-05-06 Slot 54 retrospective, three retracted hypotheses in one session) is shallow description-match generating false-positive drops. The principled consolidation criteria are (a) trigger collisions (deterministic, run python3 ~/germline/membrane/cytoskeleton/skill-trigger-gen.py to surface), and (b) genuine semantic duplication confirmed by reading both SKILL.mds. "Low fire count", "bare-name listing", and "thematic similarity" are advisory-only signals; none produced a clean fix in the codifying instance. See marks/feedback_dont_propose_skill_drops_from_descriptions.md (PROTECTED).
ls -la /home/vivesca/skills/*/SKILL.md | wc -l
ls -la /home/vivesca/.claude/skills/*/SKILL.md | wc -l
Count skills in both locations. Flag any missing symlinks.
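The symlink check can also be scripted rather than eyeballed from two counts. The sketch below is illustrative only: the function name `only_in_first` is hypothetical, and it assumes that a skill "exists" in a location when `<location>/<name>/SKILL.md` resolves there (so a dangling symlink counts as missing).

```python
from pathlib import Path


def only_in_first(first: Path, second: Path) -> list[str]:
    """Return skill names that resolve under `first` but not under `second`.

    Assumption: a skill is a directory (or symlink to one) that contains
    a SKILL.md file. Dangling symlinks do not match and are reported as missing.
    """
    def skill_names(root: Path) -> set[str]:
        if not root.is_dir():
            return set()
        return {p.parent.name for p in root.glob("*/SKILL.md")}

    return sorted(skill_names(first) - skill_names(second))
```

Calling it in both directions, for example `only_in_first(Path("/home/vivesca/skills"), Path("/home/vivesca/.claude/skills"))` and then with the arguments swapped, lists the names behind any difference between the two counts.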
Budget check. Estimate character usage against the configured limit:
# Skills in budget (excludes disable-model-invocation: true)
grep -rL 'disable-model-invocation: true' /home/vivesca/skills/*/SKILL.md | wc -l
# Rough estimate: in-budget count × 309 chars (200 avg desc + 109 overhead)
# Note: SLASH_COMMAND_TOOL_CHAR_BUDGET is a no-op since v2.1.32; the budget auto-scales at 2% of context window
# Sonnet 4.6 Max (200k) = ~4k token budget. Aim to reduce via consolidation.
# Full reference: ~/docs/solutions/ai-agent-skill-tool-count-research.md
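The arithmetic in the comments above can be sketched as a small function. This is a rough estimate under stated assumptions: the 309-character figure and the 2% scaling come from the notes above, while the 4-characters-per-token conversion and the name `estimate_budget` are illustrative assumptions, not measured values.

```python
CHARS_PER_SKILL = 309  # 200 average description + 109 overhead (from the note above)


def estimate_budget(
    in_budget_skills: int,
    context_tokens: int = 200_000,
    budget_percent: int = 2,
    chars_per_token: int = 4,  # ASSUMPTION: rough conversion, not a measured value
) -> dict:
    """Compare estimated skill-description characters with the scaled budget."""
    used_chars = in_budget_skills * CHARS_PER_SKILL
    budget_tokens = context_tokens * budget_percent // 100
    budget_chars = budget_tokens * chars_per_token
    return {
        "used_chars": used_chars,
        "budget_tokens": budget_tokens,
        "budget_chars": budget_chars,
        "over_budget": used_chars > budget_chars,
    }
```

Pass the count printed by the `grep -rL ... | wc -l` command as `in_budget_skills`. Because the conversion factor is a guess, treat a result near the limit as a prompt to consolidate rather than as a hard verdict.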
Verify the skill-trigger matcher is alive (it rotted silently for 25 days in April 2026; see ~/epigenome/marks/finding_skill_trigger_system_silent_failure.md):
~/germline/effectors/skill-trigger-stats 30 # last 30 days fire stats
~/germline/effectors/skill-trigger-stats 30 --dead # list dead triggers
test "$(find ~/.claude/skill-triggers.json -mtime +7)" && echo "STALE: run skill-trigger-gen.py"
cd ~/germline && python3 -m pytest assays/test_skill_trigger_freshness.py
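The `find -mtime +7` freshness test above can be approximated in Python when it needs to live inside a larger script. This is a sketch, not the contents of the pytest assay: the helper name `is_stale` is hypothetical, and it deliberately treats a missing file as stale, whereas the shell one-liner prints nothing in that case.

```python
import time
from pathlib import Path
from typing import Optional

SECONDS_PER_DAY = 86_400


def is_stale(path: Path, max_age_days: float = 7, now: Optional[float] = None) -> bool:
    """Return True if `path` is missing or was modified more than `max_age_days` ago."""
    if not path.exists():
        # ASSUMPTION: a missing trigger file is at least as bad as an old one.
        return True
    current = time.time() if now is None else now
    age_days = (current - path.stat().st_mtime) / SECONDS_PER_DAY
    return age_days > max_age_days
```

For example, `is_stale(Path.home() / ".claude" / "skill-triggers.json")` would signal that skill-trigger-gen.py needs a re-run. Note that `find` rounds ages down to whole days, so the two checks can disagree for files between seven and eight days old.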
Triage:
skill-trigger-gen.py reports the gap; the baseline on 28 Apr 2026 was 20 such skills. Add triggers (a YAML triggers: list or a ## Triggers markdown section) when reviewing each.
Three signals must be merged. Single-signal counts mislead: slash-only skills look dormant in trigger logs, and trigger-only skills look dormant in slash logs. Borrowed from Hermes Curator's use_count design (2026-05-01 review).
# Merged-signal skill usage scan
import json, re
from collections import Counter
from datetime import datetime, timedelta
from pathlib import Path

WINDOW_DAYS = 90
cutoff = datetime.now() - timedelta(days=WINDOW_DAYS)
counts = Counter()
last_used = {}

# Signal 1: slash invocations from anam.jsonl
ANAM = Path.home() / ".claude" / "anam.jsonl"
if ANAM.exists():
    for line in ANAM.read_text().splitlines():
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue
        msg = data.get("display", "").lower()
        for m in re.findall(r"(?:^|\s)/([a-z][a-z0-9-]+)", msg):
            counts[m] += 1

# Signal 2: trigger fires from skill-suggest-log.tsv
LOG = Path.home() / ".claude" / "skill-suggest-log.tsv"
if LOG.exists():
    for line in LOG.read_text().splitlines():
        parts = line.split("\t")
        if len(parts) < 4:
            continue
        try:
            ts = datetime.fromisoformat(parts[0])
        except ValueError:
            continue
        if ts < cutoff:
            continue
        skill = parts[2]
        counts[skill] += 1
        if skill not in last_used or ts > last_used[skill]:
            last_used[skill] = ts

# Signal 3: keyword references in conversation history
# (catches skills invoked by natural-language phrases that aren't registered triggers)
keywords = {
    "quorate": r"ask.llms|consilium|multi.llm|council",
    "endocrine": r"check.*email|inbox|gmail",
    "keryx": r"whatsapp|check.*messages",
    "todo": r"add.*todo|check.*todo",
    "sopor": r"oura|sleep.*score|how.*sleep",
    "message": r"draft.*reply|draft.*message",
}
# (extend keywords as drift in §4 surfaces new patterns)

# Output: top-20 used + bottom-20 used (with triggers registered but zero count)
TRIGGERS = Path.home() / ".claude" / "skill-triggers.json"
registered = set(json.loads(TRIGGERS.read_text()).keys()) if TRIGGERS.exists() else set()
zero_signal = sorted(registered - set(counts.keys()))
print("Top 20:", counts.most_common(20))
print(f"Zero-signal skills with triggers ({len(zero_signal)}):", zero_signal)
Read both lists. Top-20 = working well, leave alone. Zero-signal = candidates for review: a description tweak, deprecation, or (most often) the skill IS used, but through natural-language phrasing not in the keywords dict above; extend the dict and re-run.
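The scan defines the Signal 3 keywords dict but never applies it to any text. One way the matching step could look is sketched below. The function name `keyword_hits` is hypothetical, and feeding it the "display" strings from anam.jsonl is an assumption about where the conversation history lives.

```python
import re
from collections import Counter


def keyword_hits(messages, keywords):
    """Count, per skill, how many messages match that skill's keyword pattern.

    `messages` is any iterable of strings; `keywords` maps a skill name to a
    regular expression, as in the keywords dict above.
    """
    hits = Counter()
    for message in messages:
        text = message.lower()
        for skill, pattern in keywords.items():
            if re.search(pattern, text):
                hits[skill] += 1  # at most one hit per message per skill
    return hits
```

Merging the result into the main tally is then a single `counts.update(keyword_hits(messages, keywords))`, which lets a skill reached only through natural-language phrasing drop out of the zero-signal list.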
Identify:
wc -l ~/.claude/projects/-Users-terry/memory/MEMORY.md ~/CLAUDE.md
| File | Threshold | Action |
|---|---|---|
| MEMORY.md | >150 lines | Audit for tool-specific content that belongs in skills |
| CLAUDE.md | >160 lines | Audit for detailed content that belongs in skills or docs/solutions |
If over threshold: scan each section and ask "is this a behavioral rule (stays) or tool-specific reference (move to skill)?"
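The threshold table can be checked mechanically before the manual scan. A minimal sketch follows, assuming the two limits from the table; the helper name `line_budget` is hypothetical, and `splitlines()` can differ from `wc -l` by one when a file lacks a trailing newline.

```python
from pathlib import Path

# Line limits taken from the table above.
THRESHOLDS = {"MEMORY.md": 150, "CLAUDE.md": 160}


def line_budget(path: Path, limit: int) -> tuple[int, bool]:
    """Return (line_count, over_threshold) for a single file."""
    line_count = len(path.read_text(encoding="utf-8").splitlines())
    return line_count, line_count > limit
```

A caller would look up the limit by file name, for example `line_budget(path, THRESHOLDS[path.name])`, and audit only the files where the second element is true.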
For each active skill, check:
| Check | How |
|---|---|
| Vault references valid? | Do paths in skill still exist? |
| Vocabulary aligned? | Does skill terminology match current vault notes? |
| Workflow still accurate? | Has the process changed since skill was written? |
| Context shifted? | Has a hook, tool, or other skill made parts of this skill redundant? A component can be correct but no longer worth its weight. See ~/docs/solutions/patterns/tightening-pass.md. |
| Description trigger timing? | Does the description fire at the earliest useful moment (when the uncertainty exists) or only after the decision is already made? A skill consulted too late is a skill not consulted. |
Open question (unresolved as of 2026-03-04): MEMORY.md vs skill description: which is more reliable for behavioral nudges?
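The first check in the table, whether referenced paths still exist, lends itself to a quick script. The sketch below is illustrative and makes two assumptions: only home-relative references written as `~/...` are checked, and the function name `broken_refs` is hypothetical. Absolute paths and vault note titles would need their own patterns.

```python
import re
from pathlib import Path

# ASSUMPTION: references of interest are written as ~/some/path in the skill text.
HOME_REF = re.compile(r"~/[\w.\-/]+")


def broken_refs(skill_text: str, home: Path) -> list[str]:
    """Return the ~/ references in `skill_text` that do not exist under `home`."""
    # Strip a trailing period so a path that ends a sentence is still matched.
    refs = {match.rstrip(".") for match in HOME_REF.findall(skill_text)}
    return sorted(ref for ref in refs if not (home / ref[2:]).exists())
```

Running it as `broken_refs(skill_md.read_text(), Path.home())` for each SKILL.md gives a starting list; the remaining checks in the table still need a human read.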
Audit wrap output quality to catch skill drift before it compounds. Run once per monthly review.
Extract a sample:
evolvo # 30-day window, 15-sample spread (default)
evolvo --days 90 # extend to 90 days for thorough monthly review
Evaluate each sampled output against:
| Signal | Healthy | Flag if |
|---|---|---|
| Narrative specificity | Names tools, decisions, outcomes | Generic ("light session", "routine work") |
| Multi-legatum rate | <15% of sessions | >25% (skip gate not firing) |
| Pre-wrap block | Mix of flagged and clear results, dirty repos caught | Always "all clear" (may be skipping) |
| Step 4 boilerplate | "Nothing to implement" or silent skip | "Nothing new here" repeated >3× in sample |
| Decisions captured | Open items surface in NOW.md | Same open items appear across multiple wraps |
Flag any signal for skill edit. If multi-legatum rate is high, the skip gate in the legatum skill needs tightening.
Review recent sessions for patterns:
## Skill Review - [Date]
### Healthy (Keep)
- `/skill-name`: Used X times, working well
### Needs Update
- `/skill-name`: Issue: [what's wrong]
### Candidates for Deprecation
- `/skill-name`: Last used: [date], reason to keep/remove
### Gaps Identified
- [Workflow that should be a skill]
### Actions
- [ ] Update X
- [ ] Create Y
- [ ] Deprecate Z
Quick skim of releases/READMEs for new patterns worth cherry-picking. Don't adopt wholesale; just note anything novel.
| Repo | What to watch for |
|---|---|
| obra/superpowers | Skill methodology, discipline enforcement |
| disler/claude-code-hooks-mastery | Hook patterns, observability |
| trailofbits/skills | Domain-specialized skill design |
| OthmanAdi/planning-with-files | Planning workflows |
| parcadei/Continuous-Claude-v3 | Context management, state persistence |
| mattpocock/skills | Interface/module design vocabulary, DDD, Ousterhout-grounded patterns |
Save review to /home/vivesca/notes/Skill Review - YYYY-MM.md
Lighter version for weekly reset:
- organogenesis: How skills should be structured
- vault-search: Finding content that skills reference