تشغيل أي مهارة في Manus بنقرة واحدة

improve-claude-code

النجوم١٤

التفرعات١

آخر تحديث٢ يوليو ٢٠٢٦ في ٢٣:٣٠

Triage and batch-implement Claude-tagged Things todos as PRs for the claude config repo, discover improvement candidates from session history, or watch open PRs to implement review feedback and close shipped todos. Use when the user wants to work on their Claude Code improvement backlog, process Things todos tagged claude-code, batch-implement configuration changes, mine session history for grounded config-change candidates (Discover mode), or watch this skill's open PRs for review feedback and merges (Watch mode).

التثبيت

التثبيت باستخدام Codex أو Claude انسخ هذا Prompt والصقه في Codex أو Claude أو مساعد آخر ليراجع صفحة Skill ويثبّتها لك.

تشغيل في Manus

المصدر

bendrucker

bendrucker/claude

فتح مستودع GitHub عرض مستودعات المنشئ

تنزيل

تشغيل في Manus

SKILL.md

readonly

المزيد من هذا المستودع

نفس المستودع

claude-code-session

bendrucker/claude

Query Claude Code session history via a DuckDB index over `~/.claude/projects/`. Use when asked about Claude Code activity ("how many tokens today?", "what did I work on this week?") or instead of reading, grepping, or jq-ing session transcripts. Not for codebase search, git log queries, or arbitrary databases.

2026-07-0314

claude-code-skill

bendrucker/claude

Creating and optimizing Claude Code Skills including activation patterns, content structure, and development workflows. Use when creating new skills, converting memory files to skills, debugging skill activation, or understanding skill architecture and best practices.

2026-07-0314

things-jxa

bendrucker/claude

Read and query Things 3 data (lists, todos, projects, tags, logbook). Not for writes. Use things:url to create/update, things:inbox for quick captures.

2026-07-0214

github-pr-comments

bendrucker/claude

Fetch, reply to, and resolve review threads on a GitHub pull request. Use when checking what review feedback needs addressing, whether threads are resolved, replying to review comments, or clearing AI-reviewer threads after acting on them.

2026-07-0214

review-inbox

bendrucker/claude

Live tmux inbox for reviewing inbound pull requests across GitHub and GitLab. Use when reviewing multiple PRs, checking the review queue, batch reviews, or managing your review inbox. Pass --queue to spawn an already-ordered queue.

2026-07-0214

review-peer

bendrucker/claude

Review a pull request when requested by a peer. Use when reviewing PRs, providing code review feedback, or analyzing proposed changes. Supports GitHub and GitLab. Pass --triage to summarize a PR and estimate its review effort without reviewing it.

2026-07-0214

name	improve-claude-code
disable-model-invocation	true
description	Triage and batch-implement Claude-tagged Things todos as PRs for the claude config repo, discover improvement candidates from session history, or watch open PRs to implement review feedback and close shipped todos. Use when the user wants to work on their Claude Code improvement backlog, process Things todos tagged claude-code, batch-implement configuration changes, mine session history for grounded config-change candidates (Discover mode), or watch this skill's open PRs for review feedback and merges (Watch mode).
allowed-tools	["Skill(things:jxa)","Skill(things:url)","Skill(claude-code:session)","Skill(pull-request:create)","Skill(code-review)","Skill(github:actions-monitor)","Skill(github:pr-comments)"]

Improve Claude Code

Work through the claude-code Things backlog: fetch todos, triage with the user, then plan and implement each in parallel as separate PRs.

The backlog has two sources. The user files todos tagged claude-code by hand (and agent-ideas files external-harvest ideas in the same shape). Discover mode adds a second source: it mines this machine's session history for config-change candidates, grounds them against the live config, and files the keepers as claude-code todos. Both sources feed the one implement loop below. Discover is upstream of triage, not a replacement.

Watch mode is the downstream follow-on. Once PRs exist, it tracks them under /loop, implements review feedback as it lands, and closes each backing todo on merge. See Watch.

In every mode, the loop itself is in scope: this skill's own SKILL.md, the claude-code:session skill's queries and views, and the Things scripts the loop depends on. Findings in that class may be dispatched to background worktree agents immediately, even when everything else routes to planning or triage discussion.

All Things interaction goes through the things:jxa and things:url skills (never inline JXA). PRs go through pull-request:create (never gh pr create).

Discover

Mine session history for improvement candidates, ground them against the live config, write a digest, and file the keepers. Discover never auto-files: filing is an explicit user choice. Implementing is a separate run of the loop below, unless the user opts into Direct Implementation for the run. The engine is the claude-code:session skill's fan-out, whose references/discovery.md carries the recipe (dimension cheat sheet, grounding mandate, host safety, Tier-2 catalog). Load that skill to read it.

Refresh

Run the session skill's scripts/refresh.ts --refresh once, alone (it takes an exclusive write lock), and capture the printed $DB path. Hand that path to every agent, and never let a fanned-out agent refresh.

Fan-Out

Launch one Task agent per dimension (hook latency, hook blocks, permissions and sandbox, context tax, tokens, turns and compaction, skill economy), the same mining fan-out agent-ideas uses. Give each agent the $DB path and point it at references/discovery.md. Each agent runs its dimension's named queries (by name, read-only) plus any inline rollups, and returns structured candidate findings plus the exact SQL it ran. Read-only opens take no lock, so agents never contend.

Grounding

Mandatory. Launch one or more grounding agents that re-check every candidate against the live files under /Users/ben/src/bendrucker/claude. Drop anything the config already addresses. Downgrade anything thin or host-skewed. Carry grounded (boolean) and confidence (high/medium/low) per candidate. Raw query findings go stale within a week against a config that changes weekly: a prior run overturned four of its own headline findings. See the grounding rules in references/discovery.md (hooks run in parallel, so never sum durations as wall-clock, and split friction into what a setting can fix and what it cannot).

Dedup

Fingerprint each candidate (see Fingerprint). Then query Things via things:jxa for every claude-code-tagged todo and recently-completed (logbook) todo, and scan their notes for Discovery: <fp>. Mark each candidate:

already-filed: fingerprint found in an open claude-code todo.
already-shipped: fingerprint found in a completed todo (the annotate phase removes the claude-code tag on a shipped todo, so the marker persists in notes or the logbook).
new: fingerprint not found.

Suppress already-filed and already-shipped from the actionable set; still count them in the digest tail. Things is the ledger: no separate dedup store.

Digest

The only guaranteed output. Write tmp/claude-discovery-digest-<YYYY-Www>.md, ranked and grouped high to low confidence. Each entry shows the finding, its grounding note, the SQL that produced it, and its dedup status. Default host=local for config-change candidates; cite imported hosts as corroborating counts only, never pasting raw content/command/stdout from an egress-blocked host (see host safety in references/discovery.md). Never auto-file from this step.

File the Keepers

Present the actionable (new, grounded) candidates and ask the user which to file (numbers, ranges like 1-3, or all), mirroring the triage UX below. For each selected candidate, create one Things todo via things:url, tagged claude-code:

Title: [discovery] <finding title>
Notes: the pitch, then the SQL/evidence, then Discovery: <fingerprint> on its own line.

One todo per candidate, not one blob. Filing lands findings in the same backlog the implement loop drains.

Hand Off

Report how many todos landed. The existing triage, plan, implement, PR, CI, and annotate phases run on them later, unchanged. Filing is the default terminal action of Discover mode. Implementing is a separate, explicit choice: run the loop below when ready, or opt into Direct Implementation at the start of a run.

Direct Implementation

Implement-as-you-go is an opt-in alternative to filing, chosen explicitly by the user per run. When the user asks for direct implementation, dispatch one background general-purpose agent with isolation: "worktree" for each grounded finding as soon as it lands, while the rest of the run continues. Each agent folds grounding in: it verifies the finding against the live config first, and if the config already addresses it, reports "not grounded" and changes nothing. Otherwise it implements, tests, runs /code-review, and opens a PR via pull-request:create. Collect the PR links at the end for the user to review locally or on GitHub.

These PRs have no backing Things todo, so skip the Original Task link. Instead the body carries an Evidence section (local-host evidence only, never content from an egress-blocked host) plus one Discovery: <fingerprint> line per finding. Dedup today scans only Things notes, so these markers stay invisible until that scan is extended to PR bodies. Until then, a finding implemented this way can resurface as new on the next run.

Cadence

On-demand is primary: invoke this skill in Discover mode at your terminal. A weekly run is optional and must run locally (a /loop or /schedule trigger on this machine). The session DB and the duckdb CLI live here, so the agent-ideas headless-then-teleport bridge does not apply (that works only because RSS is public). A documented option, not built infrastructure.

Fingerprint

The dedup identity. Compute sha256(finding_type + '|' + normalized_target) truncated to 12 chars:

printf '%s' "hook-noop|team-workaround.ts" | shasum -a 256 | cut -c1-12

finding_type is a stable slug for the class of finding (hook-noop, permission-allowlist-miss, repeat-read, sandbox-deny).
normalized_target is the config object the finding is about (a hook script basename, a permission pattern, a skill name), never a count or a date, so re-runs of the same underlying finding collapse to one identity.

Filed todos carry Discovery: <fingerprint> in notes. The dedup step extracts those markers from Things and suppresses matches. Suppressing dismissed findings (surfaced but not filed) is deferred: dismissed findings reappear as new until filed.

Fetch and Triage

Use things:jxa to find all open todos tagged claude-code. Display a numbered table:

#	Title	Notes (first line)	List

Ask the user which items to work on (numbers, ranges like 1-3, or all). Cap each batch at 3 to keep parallel agents manageable. Split larger selections automatically.

Session Context

Each todo's notes embed the originating session as Session: <uuid>. For every selected todo, parse that UUID and use the claude-code:session skill to pull the original context: what you were doing, the commands that ran, and the errors that prompted the todo. This is richer than the todo's prose summary and grounds each plan in the real failure.

Refresh the index once (refresh.ts --refresh), then look up each todo's session over the shared file with duckdb -readonly "$DB" (see the session skill's "Parallel queries" section). Read-only opens take no lock, so a batch of lookups runs concurrently without contending; never re-refresh per todo. Query messages / content_items / text_content filtered by WHERE session_id = '<uuid>'. Do not filter by host: many todos come from the work machine, whose corpus is imported as a separate host, and omitting the filter spans every machine. Distill the result to a few lines per todo and pass it to the matching Plan agent with the title and notes.

If the UUID is absent from the index (not yet imported, or the index needs a refresh), proceed with notes only and say so for that todo.

Egress

Session context informs local planning only. Imported hosts may be marked block_egress, so never paste session-derived content into PR bodies or any other output that leaves the machine.

Plan

Launch parallel Plan agents (one per todo). Give each the todo title, full notes, the session context from the previous step, and instruction to explore the repo and produce an implementation plan.

Point agents to relevant domain skills: claude-code:skill for skill changes, claude-code:hook for hooks, bun:bun for scripts.

Present all plans. For each, propose a /code-review effort (typically low; medium for changes touching multiple plugins) and confirm via AskUserQuestion alongside plan approval.

Implement

For each approved plan, launch a background general-purpose agent with isolation: "worktree". Each agent implements the plan, runs bun test, runs /code-review <effort> with the level chosen during planning, commits, and creates the PR via pull-request:create. Pass the same domain skills from the planning step.

PR body

Include an Original Task link so the PR traces back to the Things todo:

Original Task: [<todo-title>](https://things.bendrucker.me/show?id=<todo-id>)

Monitor CI and Fix Failures

Use github:actions-monitor agents (one per PR) to collect pass/fail status. For failures, launch a worktree agent with the logs and branch to fix, test, and push. Re-monitor after fixes.

Annotate Things

Use things:url to update each todo based on its PR outcome:

Passing: Append PR link to notes, add review tag, remove claude-code tag, move to Anytime
Failing: Append PR link and failure summary to notes. Leave claude-code tag so it resurfaces next run.

Summary

Output a bulleted list (one entry per todo): PR link (pass/fail), Things URL (https://things.bendrucker.me/show?id=<todo-id>), title.

Watch

A follow-on mode for the PRs this skill opened. Run it under /loop self-paced so each tick re-checks every open PR from the batch, acts on new review feedback and merges, then sleeps. Watch implements requested changes and pushes them. It never merges for you, and it ends the loop once every tracked PR is merged or closed.

Invoke as /loop /improve-claude-code watch. Recover the PR-to-todo mapping from each PR's Original Task link.

Each Tick

Walk every open tracked PR once and handle its state:

CI red: launch a worktree agent with the failing logs and branch to fix, test, and push, the same as Monitor CI and Fix Failures.
A reviewer thread requests a change: launch a worktree agent to implement it, run bun test and biome check, then commit and push. Reply to the thread naming the change and its commit. Leave the thread unresolved for the reviewer, and do not merge.
Merged: close the backing Things todo. Append the PR link, mark it completed, and remove the claude-code tag via things:url.
Closed without merging: leave the todo tagged claude-code with a note so it resurfaces next run.

Fetch reviewer threads with the github:pr-comments script (--role reviewer --include-resolved), never a hand-authored GraphQL query. Report one status line per PR each tick.

Guardrails

Auto-implement covers review changes and CI fixes only. A reviewer question or design objection pauses for you with the thread quoted, no edit.

Every push re-runs CI, so a PR often carries across ticks rather than resolving in one pass. That is expected under /loop.