Jeden Skill in Manus ausführen
mit einem Klick

Jeden Skill in Manus mit einem Klick ausführen

research-gated-build-plan

Use at the START of anything sizable, before writing code — when Mark says 'want to work out the full plan', 'I want to do some deep research', 'lets start the planning', 'what does complete look like?', 'what tickets/tasks remain', 'how close are we / what are our current gaps', 'do we already have a harness for this', or 'is there a system we can use?'. This is Mark's front-loaded decision discipline — inventory what already exists before building, scope the gap against a concrete target, research and persist the approach as an artifact, sequence prerequisites, and bake quality bars + phase gates with human checkpoints directly into the go-ahead. It decides WHETHER and HOW to enter the execution skills (start-phase-plan, feature-new); run it first.

In Manus ausführen

Überblick

Installationsbefehl

npx skills add https://github.com/artsmc/claude-dev-agents --skill research-gated-build-plan

Kopieren Sie diesen Befehl und fügen Sie ihn in Claude Code ein, um den Skill zu installieren

Quelle

artsmc/claude-dev-agents

Sterne3

Forks0

Aktualisiert31. Mai 2026 um 03:44

Datei-Explorer

2 Dateien

SKILL.md

readonly

Mehr aus diesem Repository

gleiches Repository

fleet-dispatch-and-watch

artsmc/claude-dev-agents

Use when Mark wants large or mechanical work fanned out across his machine fleet, or wants long-running/background/remote agents monitored. Triggers on 'distribute among the fleet', 'delegate with tmux and worktrees', 'kick it off fan them out', 'offload to mac-mini, desktop just orchestrate', 'check on the agents every 5 min', 'is it done its been some hours', 'are we ready to delegate', or any dispatch to background workers. This is Mark's dispatch→snapshot→poll→escalate→checkpoint loop — fan independent work to workers (local box orchestrates), hand each a precise state snapshot, then poll on a fixed cadence with trustworthy liveness proxies, error-signature buckets, and hard stop/escalation thresholds — never blocking or assuming success.

2026-05-313

scope-question-and-delegate

artsmc/claude-dev-agents

Use when work is genuinely ambiguous in a way that changes the plan, or big/multi-part/underspecified enough to balloon context — or when the orchestrator's window is trending toward the ~200k cliff where quality and cost degrade. Fires when Mark says 'this is a big one', 'what do you need from me before you start', 'dont drag the whole context through this', 'should we split this up', or 'keep the main thread lean' — or when one thread is doing both the planning AND the execution. This is Mark's triage-then-delegate discipline — decide fast whether real ambiguity or context cost warrants a stop, ask only the few DECISIVE questions, treat the window as a finite budget, and hand each worker a minimal scoped snapshot so the orchestrator plans and synthesizes without accumulating execution detail. Not a hard gate — small clear tasks just proceed.

2026-05-313

diagnose-from-raw-symptom

artsmc/claude-dev-agents

Use when Mark reports a bug by pasting a raw artifact with little or no prose — a stack trace, an HTTP request/response trace (URL + method + status like 500/403/404/502), a console/React/Prisma error, a ChunkLoadError, a shell error, or a screenshot of a visual defect — or when he says things like 'still failing', 'clicking X does nothing', 'no inline editor on the rendered page', 'do we have local postgres/pgvector?', or 'identify the true error'. This is the front-to-back debugging procedure — extract the trigger and reproduction from terse input, probe whether the underlying plumbing even exists, localize by contrasting known-good vs broken, and drive to a durable root cause rather than a workaround. Reach for this BEFORE proposing any fix.

2026-05-303

enumerated-menu-pick-and-sweep

artsmc/claude-dev-agents

Use whenever you're about to present Mark with consequential choices, questions, or tradeoffs — structure them as a numbered or lettered menu so a bare reply counts as a full selection, and use this to read his terse picks. Triggers on one-token replies that only resolve against a labeled list you just gave ('go with B', 'lets go with 2', 'run option B', 'c one big commit'), picks carrying a rider ('option 1 and update envs when your done', '1. yes, and execute it'), multi-item sweeps answering every numbered point at once ('1. yes 2. thats fine 3. just aws 4. whats the mcp server? 5. minimal'), ranged picks with a because-clause scope-cut ('do 1-3, 4 wont be needed because...'), and Mark authoring his own numbered decision list to apply across a spec. Format choices as a menu — a bare letter is uninterpretable unless you offered the list first.

2026-05-303

prove-it-live-before-done

artsmc/claude-dev-agents

Use whenever an agent (or you) is about to claim work is done/fixed/shipped/passing, or when Mark says 'ready?', 'done?', 'did it work?', 'still failing', 'is the live URL responding?', 'check the deploy status', or wants to confirm a fix before pushing. Treat every completion claim, green CI, and passing test as UNPROVEN until the real artifact is exercised end-to-end — drive the actual running app at its real URL/UI/API, confirm the deployed revision is actually live, verify the mutating side-effect (email/queue/DB row) truly fired, gate every forward step on an explicit green signal, audit coverage across the whole surface, and reconcile against the system of record. Then name the specific residual defect with expected-vs-actual.

2026-05-303

reference-as-executable-spec

artsmc/claude-dev-agents

Use when Mark frames a feature by pointing at a concrete reference instead of describing behavior from scratch — when he names a live product as the target ('an inline editor like editor.js', 'a code editor like monaco-editor dark theme', 'much like openclaw.ai'), points at an internal subsystem as the literal spec to replicate ('the OTP from sentient-monorepo is the pattern we should match', 'use sentient-monorepo as the api source of truth'), or anchors a requirement on an external library as the structural model ('setups similar to CASL', 'building this same CASL boilerplate'). Whenever he says 'build it like THAT', 'same as X', or 'mimic/replicate X', treat the named reference as the executable spec — go observe it, extract its real behavior, and let THAT define correct rather than inventing requirements.

2026-05-303

Quelle

artsmc

artsmc/claude-dev-agents

GitHub-Repository öffnen Creator-Repositorys ansehen

Installationsbefehl

Download

In Manus ausführen

Nützlich fürSOC

SoftwareentwicklerInformatik- und Mathematikberufe15-1252L4

name

research-gated-build-plan

description

Research-Gated Build Plan

Mark refuses to start coding on anything sizable until the approach is settled. He optimizes for the quality of the decision over the speed of shipping (whatever option you think is best not fastest but the best; make the one with long term benefits not short term gain). Run these gates before any build.

Gate 1: Inventory existing capabilities — check before you build

Before committing to build something new, examine the foundation and ask whether it already exists. This counters the default urge to hand-roll.

do we have inngest workflows... first just examining the foundation
do we already have a harness for this in docs?
is there a system we can use? / I don't think raw sql is the best
Separate genuine build work from mere configuration: understanding how much is just for the agent and how much we need to build.

Also probe feasibility as an open question, not a prescribed solution: is it possible to run task in another terminal...?, can you access tailscale?, is that setup per-change or global?

Gate 2: Scope assessment — what's left against the target

Open the session by asking what REMAINS, not by jumping to execution. Anchor on a reference artifact (gap analysis, planning doc, ticket cycle) or a concrete target metric, then derive a sequenced ordering.

what betachat tickets are still remaining?
how close are we to autonomous agents — what is our current gaps?
based on planning/gap-analysis what task remain inside of platform
in the best order id like to execute on tickets in a order that gets us to zero betachat tickets

Gate 3: Research → document → ticket (persist it, no memory leakage)

For comprehensive asks, run the explicit pipeline and persist artifacts so context isn't lost.

this is a really comprehensive ask so im looking for deep research and planning
make a folder in the docs i want to truely vet through everything
Pipeline: phase 1 research, phase 2 markdown, phase 3 ticket building — start from markdowns... then move them into linear so we dont have memory leakage.
Record config decisions into project memory: make note of it in claude.md. Propagate shared-doc changes to every node: ssh-machines.md will need to be updated.

Gate 4: Decompose into developer-sized units

Map the scope by which pieces belong to which subsystem/app, then break coarse work into smaller independently-testable chunks with a spec per unit.

i need to work out which parts belong to which apps
SEN-209 and SEN-204 both look they should be broken down into smaller ticket
Prefer incremental per-item integration: incremental — merge each as it greens, not one big batch merge.
When a unit is sizable/ambiguous or context-heavy, route it through scope-question-and-delegate to ask the decisive questions first, then delegate with minimal context.

Gate 5: Sequence prerequisites; gate downstream on completion

Order dependent work explicitly — clear blockers first, sequence env setup, gate releases behind upstream conditions.

unblock first
lets do this first then start them up
once the database migration and update clear, ready to ship tagged version updates to all repos
first thing first i need to harden what we have now

Gate 6: Bake quality bars + phase gates into the go-ahead

Embed non-negotiable constraints directly in the kickoff so they need not be re-stated. Mark's standing bar:

90%+ unit test coverage, strict TypeScript, linters with zero errors
Files under ~350 lines (especially React)
SRP / DRY, swagger/OpenAPI updated
Stop-gates that validate each stage before advancing

each stage is back by stop gates to validate and test, we should have linters built in with no errors. unit test backed 90% or better, strict typescript. Often ask for an assessment of current violations before refactoring.

Then phase-gate the work with human checkpoints — advance only on command, dry-run before anything destructive: i need gates between each phase, I want to test each part one at a time, yes dry run, ship v1.6.2 once builds pass.

Gate 7: Subordinate the mechanism to the user-visible outcome

If the plan over-engineers, restate the actual goal and pick the cheaper path that achieves the visible result:

the api is not important what is important is that its present on the homepage
it is mostly marketing lets just use the value we have as issues and keep it static
could we kill the Lambda and just use it as our backend? Prune scope when a simpler architecture becomes viable; cut optional features by reasoning about the product's stage (4 wont be needed because this is a small application).

Gate 8: Define hard architectural boundaries up front

Specify the single boundary all cross-component interaction must route through, and bake authorization/permission invariants into the spec as hard constraints:

the only thing unified should be that API layer... should not interact with beta chat directly
I can only link a conversation to a case if the people have permission to view the case
Designate one source of truth and phase out the rest: the public page is our source of truth, redirect and phase out the quiz result page.

After the plan: hand off

Once the plan is agreed, either phase-gate it (Gate 6) OR grant uninterrupted autonomy if Mark trusts it and wants momentum (come up with a plan... you will execute without breaks on your own). See steer-and-correct-the-agent for how he hands off, and fleet-dispatch-and-watch if the work fans out across machines.

Red flags

Anti-pattern	Mark's correction
Jumping to code before research/inventory	`first just examining the foundation`; `do we already have a harness?`
Hand-rolling when a system exists	`is there a system we can use?`
Starting without scoping the gap to a target	`what tickets are still remaining?`
Research that isn't persisted	`move them into linear so we dont have memory leakage`
Omitting quality bars from the kickoff	`unit test backed 90%... strict typescript... linters with no errors`
Over-engineering the plumbing	`the api is not important — it just needs to be present on the homepage`
One big batch merge at the end	`incremental — merge each as it greens`