autonomous-development
| name | autonomous-development |
| description | Autonomously discovers ready tasks from Tickets, executes them with Research → Implement → Verify cycles, continuing with related work while context remains healthy - invoke directly without specifying a task |
Execute development tasks autonomously using a three-stage subagent cycle: Research → Implement → Verify. Continue with related tasks when context is healthy; exit when context is strained or only complex unrelated work remains.
Designed for headless operation with -p flag:
# Single task execution
claude -p "Use autonomous-development skill"
# Infinite loop (external bash script)
while true; do
    claude -p "Use autonomous-development skill" || break
    sleep 2
done
AUTONOMOUS TASK EXECUTION:
You execute tasks autonomously, continuing through multiple tasks when conditions are favorable:
Run `tk ready` to get the list of ready tasks
HEADLESS MODE PRINCIPLES:
Run `tk ready` and select tasks
MULTI-TASK SESSION CAPABILITY:
The external loop handles:
IMPORTANT: Always run tk help first to confirm current command syntax.
You (the coordinator) ONLY:
Run `tk` commands (ready, show, create, status, dep, etc.)
You (the coordinator) NEVER:
Subagents do ALL the actual work.
MANDATORY to use when:
Running with the `-p` flag (headless mode) ← This is you right now if you're reading this
Invoked via `claude -p "Use autonomous-development skill"`
YOU MUST NOT ASK CLARIFYING QUESTIONS IN HEADLESS MODE:
The `-p` flag means run autonomously
NEVER use when:
Goal: Record baseline system state and verify project is stable enough to proceed.
CRITICAL: This runs ONCE per invocation, BEFORE selecting any task.
Before anything else, check git state:
git symbolic-ref HEAD 2>/dev/null || echo "DETACHED"
If DETACHED HEAD detected:
This often happens when agents checkout specific commits for debugging and forget to return to the branch. The work may be orphaned.
Dispatch git recovery subagent:
You are investigating a detached HEAD state in this repository.
Working directory: [path]
**FIRST ACTION:** Read the project CLAUDE.md to understand the PRIME DIRECTIVE.
**INVESTIGATION STEPS:**
1. Run `git status` to confirm detached HEAD state and see current commit
2. Run `git log --oneline -5` to see recent commits from HEAD
3. Run `git reflog -20` to understand how we got here
4. Look for patterns like:
- `checkout: moving from [branch] to [sha]` - agent checked out a commit
- `commit:` entries after the checkout - work was done in detached state
- The branch name that was active before the detachment
5. Identify:
- What branch we were on before detachment
- Whether any commits were made in detached state
- Whether those commits exist on any branch
6. Run `git branch -a --contains HEAD` to see if current commit is on any branch
7. Run `git log --oneline [original-branch]..HEAD 2>/dev/null` to see commits not on that branch
**RECOVERY ACTIONS:**
Based on findings, execute ONE of these:
**Case A: No commits made in detached state**
- Just checkout the original branch: `git checkout [branch]`
**Case B: Commits made in detached state, not on any branch**
- Save the detached commit(s): `git branch recovery-[short-sha] HEAD`
- Checkout the original branch: `git checkout [branch]`
- Merge the recovery branch: `git merge recovery-[short-sha] --no-edit`
- If merge succeeds, delete recovery branch: `git branch -d recovery-[short-sha]`
- If merge conflicts: leave recovery branch, report conflict
**Case C: HEAD is already on a branch (false positive)**
- Nothing to do, report that we're on [branch]
**REPORT FORMAT:**
GIT_STATE_BEFORE: [detached at sha | on branch X]
DETACHMENT_CAUSE: [what reflog showed - e.g., "checkout from master to abc123"]
COMMITS_IN_DETACHED: [count, with short descriptions]
RECOVERY_ACTION: [what was done]
GIT_STATE_AFTER: [on branch X]
MERGE_RESULT: [clean | conflicts | not needed]
If merge conflicts occur, report the conflicting files and EXIT - human intervention needed.
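The reflog pattern in step 4 can be matched mechanically. The following is a minimal sketch, not part of the skill itself: the function name `original_branch_from_reflog` is hypothetical, and it assumes the standard `checkout: moving from X to Y` reflog message with the newest entries listed first.

```shell
# Sketch only: find what was checked out before HEAD was detached.
# Reads `git reflog` output on stdin and prints the source of the most recent
# "checkout: moving from X to Y" entry (reflog lists newest entries first).
# Prints nothing when no checkout entry is present.
original_branch_from_reflog() (
    sed -n 's/.*checkout: moving from \([^ ]*\) to .*/\1/p' | head -n 1
)
```

A recovery subagent might call it as `git reflog -20 | original_branch_from_reflog`. The result can be a commit SHA rather than a branch if the previous checkout was itself detached, so it is worth confirming with `git show-ref --verify refs/heads/<name>` before checking it out.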
If recovery fails or conflicts exist:
STATUS: ERROR
If recovery succeeds or no recovery needed:
Dispatch baseline subagent to:
Baseline output format required:
BASELINE_BUILD: [pass|fail]
BASELINE_BUILD_WARNINGS: [count]
BASELINE_BUILD_ERRORS: [count]
BASELINE_TESTS_PASSING: [count]
BASELINE_TESTS_FAILING: [count]
BASELINE_FAILURES: [list any test names or error messages]
Store baseline in session context for later comparison during verification.
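One way to turn the baseline report into a go/no-go signal is sketched below. This is an illustration under assumptions, not the skill's own mechanism: the helper names `baseline_field` and `baseline_health` are hypothetical, the report is assumed to have one `KEY: value` pair per line, and "broken" is assumed to mean a failed build, any build errors, or any failing tests.

```shell
# Sketch only: classify a baseline report as "healthy" or "broken".

# Print the value of one field from a report read on stdin.
baseline_field() (
    key=$1
    sed -n "s/^${key}:[[:space:]]*//p" | head -n 1
)

# Read a full report on stdin. A missing BASELINE_BUILD line counts as broken.
baseline_health() (
    report=$(cat)
    build=$(printf '%s\n' "$report" | baseline_field BASELINE_BUILD)
    errors=$(printf '%s\n' "$report" | baseline_field BASELINE_BUILD_ERRORS)
    failing=$(printf '%s\n' "$report" | baseline_field BASELINE_TESTS_FAILING)
    if [ "$build" != "pass" ] || [ "${errors:-0}" -gt 0 ] || [ "${failing:-0}" -gt 0 ]; then
        echo "broken"
    else
        echo "healthy"
    fi
)
```

Warnings are deliberately ignored in this sketch; they are recorded for later comparison but do not by themselves block task selection.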
If baseline is broken (tests failing, build errors):
Run `tk ready` and `tk blocked` to check what's already tracked
STATUS: COMPLETED
If baseline is healthy: Continue to task selection.
Goal: Understand requirements, existing code, and implementation approach
Dispatch subagent(s) to:
Research complexity assessment:
Use judgment, not metrics. The question is: "Can I hold the full solution in my head?" not "How many lines will this be?"
If task too complex after research:
STATUS: BLOCKED_SUBTASKS
Research subagent prompt pattern:
Your job is to research [task name].
Working directory: [path]
**CRITICAL FIRST ACTION:** Read the project CLAUDE.md to understand the PRIME DIRECTIVE (maximal simplicity policy), development environment setup, and file editing best practices.
**SECOND ACTION:** Read [repo]/CLAUDE.md to understand the project architecture and vision summary.
**THIRD ACTION:** If docs/VISION.md exists, read it to understand design philosophy, optimization targets, and anti-goals that should guide implementation decisions.
**MANDATORY BASELINE RECORDING:**
Before researching the task, record the current system state:
1. Run the project build/compile command, record all warnings/errors
2. Run the project test suite, record pass/fail counts
3. Document specific error messages for any failures
Return baseline as:
BASELINE_BUILD: [pass|fail]
BASELINE_BUILD_WARNINGS: [count]
BASELINE_BUILD_ERRORS: [count]
BASELINE_TESTS_PASSING: [count]
BASELINE_TESTS_FAILING: [count]
Then perform these research tasks:
1. [Specific research goal]
2. [Specific research goal]
3. ...
Return a summary including:
- Exact requirements from the task
- Current state of the code
- What needs to be implemented
- Clear implementation approach
- Estimated complexity (simple/medium/complex)
Goal: Make the minimal, targeted changes to complete the task
CRITICAL REQUIREMENT: Implementation MUST follow test-driven development (see test-driven-development skill for detailed workflow).
CRITICAL: Concurrent Work Safety
Your implementation subagent prompt MUST include:
**CRITICAL NOTICE:** Another AI may be working concurrently in this repository. When you commit:
- ONLY stage and commit changes related to [task-id]
- Use `git add` for SPECIFIC FILES only (do not use `git add .`)
- Run `git status` before staging to see all changes
- Review `git diff --cached` to verify only [task-id] changes are staged
- If you see changes from other tasks, DO NOT commit them
Implementation subagent prompt pattern:
You are implementing [task name].
Working directory: [path]
**FIRST ACTION:** Read the project CLAUDE.md to understand the PRIME DIRECTIVE (maximal simplicity policy), development environment setup, and file editing best practices.
**SECOND ACTION:** Read [repo]/CLAUDE.md to understand the project architecture and vision summary.
**THIRD ACTION:** If docs/VISION.md exists, read it to understand design philosophy, optimization targets, and anti-goals.
**MANDATORY: Use Test-Driven Development**
Before implementing ANY code changes:
1. Invoke the test-driven-development skill
2. Follow the red-green-refactor cycle:
- RED: Write failing test(s) demonstrating the requirement
- Verify the test fails correctly (not a typo or existing feature)
- GREEN: Write minimal code to pass the test
- Verify all tests pass (new and existing)
- REFACTOR: Clean up while keeping tests green
3. Report results showing tests were written and failed first
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST.
**CRITICAL NOTICE:** [concurrent work safety instructions above]
**TASK SUMMARY:**
[Clear description of what needs to be done]
**REQUIRED CHANGES:**
1. [Specific change with file paths and line numbers if known]
2. [Specific change]
3. ...
**KEY CONSTRAINTS:**
- PRIME DIRECTIVE - simplest implementation that achieves the goal
- Follow existing code patterns
- All tests must pass
- Careful git staging
**BEFORE COMMITTING:**
- Close the Tickets task: `tk close [task-id]`
- Stage the ticket file WITH your code changes: `git add .tickets/[task-id].md`
(The commit should include both implementation AND ticket status update)
**AFTER IMPLEMENTATION:**
Report back with:
- Summary of changes made
- Test results (including test names that were written first and failed, then passed)
- Confirmation that red-green-refactor cycle was completed
- Files changed (verify nothing unrelated)
- Commit hash
- Any issues encountered
If implementation discovers missing requirements or ambiguity:
Mark the current task as blocked (`tk dep <current-task-id> <new-investigation-task-id>`)
STATUS: BLOCKED_INVESTIGATION
Note: Use the REQUIRES-INVESTIGATION: prefix, NOT HUMAN-TASK:. The investigate-blocker skill will determine if human input is truly needed after thorough research.
Goal: Review implementation against PRIME DIRECTIVE, run tests, find issues
Dispatch code-reviewer subagent with this prompt:
Verify implementation for: [task-id]
**FIRST ACTION:** Read the project CLAUDE.md to understand the PRIME DIRECTIVE (maximal simplicity policy).
**SECOND ACTION:** Read [repo]/CLAUDE.md for project-specific guidance.
Context:
- Task: [task description]
- Commit: [hash]
- Files changed: [list from implementation report]
- Baseline: [tests passing, build warnings from health check]
**PRIME DIRECTIVE CHECK (Overriding Concern):**
Review the git diff and evaluate:
1. Is this the simplest implementation that achieves the goal?
2. Are concerns separated, not complected (interleaved)?
3. Were unnecessary abstractions added?
4. Is there over-engineering or premature generalization?
5. Could this be simpler while still meeting requirements?
**Standard Verification:**
- Run full test suite - all tests pass?
- Compare against baseline - forward progress demonstrated (no regressions)?
- Requirements met per task description?
- Security issues introduced?
- Only task-related files committed (check git diff)?
- Commit message quality?
**Report Format:**
PRIME_DIRECTIVE: [PASS | VIOLATIONS_FOUND]
[If violations: specific issues with file:line references]
TESTS: [PASS | FAIL]
[If fail: which tests, error messages]
BASELINE_COMPARISON: [PROGRESS | REGRESSION | MAINTAINED]
[Compare: X tests passing vs baseline Y]
REQUIREMENTS: [MET | GAPS]
[If gaps: what's missing]
SECURITY: [CLEAN | ISSUES]
[If issues: describe]
GIT_HYGIENE: [CLEAN | ISSUES]
[If issues: unrelated files, bad commit message]
Recommendation: [APPROVED | NEEDS_FIXES]
[If NEEDS_FIXES: prioritized list of issues to address]
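The BASELINE_COMPARISON field can be derived from four numbers. The sketch below shows one possible rule and is an assumption rather than the skill's definition: the function name `baseline_comparison` is hypothetical, and it treats any drop in passing tests or rise in failing tests as a regression, which would also flag a legitimate test removal.

```shell
# Sketch only: classify a verification run against the session baseline.
# Arguments: baseline passing, baseline failing, current passing, current failing.
baseline_comparison() (
    base_pass=$1 base_fail=$2 cur_pass=$3 cur_fail=$4
    if [ "$cur_pass" -lt "$base_pass" ] || [ "$cur_fail" -gt "$base_fail" ]; then
        echo "REGRESSION"
    elif [ "$cur_pass" -gt "$base_pass" ] || [ "$cur_fail" -lt "$base_fail" ]; then
        echo "PROGRESS"
    else
        echo "MAINTAINED"
    fi
)
```

For example, `baseline_comparison 40 0 42 0` reports PROGRESS, while `baseline_comparison 40 0 41 1` reports REGRESSION because a new failure appeared even though more tests pass.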
Verification outcomes:
PASS - Task complete AND forward progress demonstrated
`tk close [task-id]`
STATUS: COMPLETED
ISSUES FOUND - Enter fix loop
STATUS: BLOCKED_SUBTASKS
TESTS FAIL - Systematic debugging needed
STATUS: BLOCKED_SUBTASKS
Goal: Decide whether to continue with another task or exit the session.
This assessment occurs ONLY after PASS verification. Any BLOCKED_* or ERROR status exits immediately.
Self-Assessment Questions:
Context Health: Am I struggling to recall details from earlier in this session? Do I feel uncertain about the baseline state or project patterns? If yes → EXIT.
Available Work: Run tk ready - are there tasks remaining? If no → EXIT.
Task Relatedness: Does any ready task relate to work just completed?
Bias toward related tasks - they benefit from warm context and may need abbreviated research.
Apparent Scope: Looking at the ready tasks, do any feel manageable within remaining context? Use judgment, not metrics. A task that "feels like a quick fix" vs "feels like a rabbit hole."
Decision:
CONTINUE if:
EXIT if:
When continuing:
Conservative default: When uncertain, EXIT. Fresh sessions are cheap; context pollution is expensive.
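The order of the checks can be written down as a small pure function. This is a sketch under assumptions: the name `continuation_decision` and the yes/no encoding of the coordinator's judgment are hypothetical, and the real assessment is a judgment call rather than a computation.

```shell
# Sketch only: the continuation assessment as an ordered series of checks.
# Arguments (hypothetical encoding of the coordinator's own judgment):
#   $1  "yes" if context still feels healthy, anything else otherwise
#   $2  number of tasks reported by `tk ready`
#   $3  "yes" if a related or clearly bounded ready task exists
continuation_decision() (
    context_ok=$1 ready_count=$2 related=$3
    if [ "$context_ok" != "yes" ]; then
        echo "EXIT: context strained"
    elif [ "$ready_count" -eq 0 ]; then
        echo "EXIT: no tasks remain"
    elif [ "$related" = "yes" ]; then
        echo "CONTINUE"
    else
        echo "EXIT: fresh session better"
    fi
)
```

Anything other than an explicit "yes" falls through to EXIT, which mirrors the conservative default.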
Maintenance Mode (`tk ready` Returns Zero Tasks)
Trigger: No unblocked tasks exist.
YOU MUST NOT IDLE. Instead, discover new work by using and improving the project.
1. Dispatch usage exploration subagent:
2. Dispatch improvement design subagent:
3. Create Tickets tasks for discovered issues:
tk create "Fix [specific issue discovered]" --priority 2
tk create "Add test for [scenario]" --priority 2
# Create 2-4 improvement tasks from exploration findings
4. EXIT:
Investigation and task creation IS your work for this session. The external loop will re-invoke and pick up the newly-created tasks in priority order. EXIT with STATUS: COMPLETED.
You have full autonomy to create new Tickets tasks when needed.
1. Investigation Required (Ambiguity/Missing Info):
CRITICAL: Prefix title with REQUIRES-INVESTIGATION: for blockers that need deeper research.
tk create "REQUIRES-INVESTIGATION: Clarify authentication method for OAuth flow" \
--priority 0 \
--type task \
--description "Implementation blocked: Unclear whether to use PKCE flow or client credentials. Need research into project patterns, security requirements, and OAuth best practices to determine correct approach."
# Mark current task as blocked
tk dep current-task-id new-task-id
# This marks current-task-id as blocked by new-task-id
The investigate-blocker skill will pick up REQUIRES-INVESTIGATION tasks and either:
2. Complex Work Breakdown:
# Current task is too large, break it down
tk create "Implement database schema for feature X" --priority 1 --type task
tk create "Implement service layer for feature X" --priority 1 --type task
tk create "Implement API routes for feature X" --priority 1 --type task
# Mark dependencies
tk dep service-task schema-task # Service depends on schema
tk dep routes-task service-task # Routes depend on service
# Mark current epic as blocked
tk dep current-epic-id schema-task
3. Discovered Missing Work:
# Found missing test coverage during verification
tk create "Add integration tests for auth middleware" \
--priority 1 \
--type task \
--description "Verification revealed missing test coverage for middleware edge cases"
# Create task (priority: 0=highest (P0), 4=lowest (P4))
tk create "Task title" \
--priority [0-4] \
--type [task|epic|bug] \
--description "Details" \
--assignee [username]
# Add dependency (A blocks B)
tk dep B A # B depends on A (A blocks B)
# Update task status
tk status task-id [open|in_progress|closed]
# Edit task details (description, priority)
tk edit task-id
# Query tasks
tk ready # Show ready work
tk show task-id # Show task details
tk dep tree task-id # Show dependency tree
Pattern 1: Blocker Needing Investigation
# Create P0 blocker with REQUIRES-INVESTIGATION: prefix
NEW_TASK=$(tk create "REQUIRES-INVESTIGATION: [specific question or ambiguity]" \
--priority 0 \
--type task \
--description "[Context: what you were trying to do, what's unclear, what research might help]")
# Mark current task as blocked (NEW_TASK contains the ticket ID printed by tk create)
tk dep [current-task-id] $NEW_TASK
Pattern 2: Break Down Epic
# Create subtasks (tk create prints the ticket ID directly)
TASK1=$(tk create "Subtask 1" --priority 1 --type task)
TASK2=$(tk create "Subtask 2" --priority 1 --type task)
TASK3=$(tk create "Subtask 3" --priority 1 --type task)
# Chain dependencies if needed
tk dep $TASK2 $TASK1 # Task 2 depends on Task 1
tk dep $TASK3 $TASK2 # Task 3 depends on Task 2
# Block current epic on first subtask
tk dep [current-epic-id] $TASK1
Pattern 3: Follow-up Work
# Current task complete but revealed follow-up needed
tk create "Refactor error handling based on new pattern" \
--priority 2 \
--type task \
--description "The auth fix revealed inconsistent error handling that should be standardized"
git symbolic-ref HEAD 2>/dev/null || echo "DETACHED"
If DETACHED:
STATUS: ERROR
If on a branch: Continue to health check.
Dispatch baseline subagent (see Health Check section above).
If build/tests broken:
Run `tk ready` and `tk blocked` to see what's tracked
STATUS: COMPLETED
If build/tests pass: Continue to step 1.
tk ls | grep -A 1 "in_progress"
If in_progress task found:
If no in_progress task: Continue to step 2.
# Get highest priority ready task
tk ready
If tasks exist: Pick ONE task. Continue to step 2.5.
If no tasks: Enter Maintenance Mode (see section above). EXIT after creating improvement tasks.
YOU MUST DO THIS IMMEDIATELY. NON-NEGOTIABLE.
Pick a task based on:
Task Selection for Continuation: When selecting a second (or subsequent) task in a session, relatedness to just-completed work can outweigh strict priority ordering. A P2 task in code you just modified is often better than a P1 task in completely different code, because:
CRITICAL: Do not ask the user which task to work on. Do not wait for input. Apply judgment and proceed.
tk show [task-id]
Parse:
tk status [task-id] in_progress
TodoWrite([
{ content: "Research [task-id]: [description]", status: "in_progress", activeForm: "Researching..." },
{ content: "Implement [task-id]: [description]", status: "pending", activeForm: "Implementing..." },
{ content: "Verify [task-id]: [description]", status: "pending", activeForm: "Verifying..." }
])
Complexity decision:
Update TodoWrite: mark research complete, implement in_progress
If missing requirements or ambiguity discovered:
STATUS: BLOCKED_INVESTIGATION
Update TodoWrite: mark implement complete, verify in_progress
Outcome handling:
PASS:
ISSUES FOUND (< 3 iterations):
ISSUES FOUND (≥ 3 iterations):
STATUS: BLOCKED_SUBTASKS
TESTS FAIL:
STATUS: BLOCKED_SUBTASKS
ALWAYS print structured summary before EXIT:
# AUTONOMOUS DEVELOPMENT SESSION SUMMARY
## Tasks Completed This Session
| Task ID | Title | Status | Commit |
|---------|-------|--------|--------|
| [id-1] | [title] | COMPLETED | [hash] |
| [id-2] | [title] | COMPLETED | [hash] |
(Single task sessions will have one row)
## Session Statistics
- **Tasks attempted:** [count]
- **Tasks completed:** [count]
- **Exit reason:** [context health | no related tasks | blocked | error | no tasks remain]
## Final Task Details
(Details for the last task worked on, or the task that caused exit)
- **ID:** [task-id]
- **Title:** [task title]
- **Priority:** [P0/P1/P2/P3/P4]
- **Status:** [COMPLETED | BLOCKED_INVESTIGATION | BLOCKED_SUBTASKS | ERROR]
## Work Summary
- Research: [summary of key findings across tasks]
- Implementation: [summary of changes made]
- Verification: [summary of test results]
## Changes (Aggregate)
- Files modified: [count]
- Lines added: [count]
- Lines removed: [count]
- Commits: [list commit hashes]
## Test Results
- Total tests: [count]
- Passing: [count]
- Failing: [count]
## Baseline Comparison
- **Starting:** [X tests passing, Y build warnings/errors]
- **Final:** [A tests passing, B build warnings/errors]
- **Progress:** [improved/maintained/REGRESSED]
## Created Tasks
[If any tasks were created, list them with IDs and reasons]
## Blocking Issues
[If BLOCKED_* status, explain what's blocking and what was created]
## Continuation Decision
[Why session ended: context strained / only complex tasks remain / blocked / all done]
## Next Action
[What the external loop should do next]
---
EXIT_STATUS: [COMPLETED|BLOCKED_INVESTIGATION|BLOCKED_SUBTASKS|ERROR]
TASKS_COMPLETED: [count]
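An external monitor can key off the trailing `EXIT_STATUS:` line of this summary. The following is a minimal sketch, assuming the summary reaches the monitor as plain text on stdin; the function name `loop_action` is hypothetical, and treating a missing or unrecognized status like ERROR is an assumption chosen to fail safe.

```shell
# Sketch only: decide what an external loop does with a finished session.
# Reads the session summary on stdin and prints "continue" or "stop".
# Uses the last EXIT_STATUS line in case the summary text quotes an earlier one.
loop_action() (
    status=$(sed -n 's/^EXIT_STATUS:[[:space:]]*//p' | tail -n 1)
    case "$status" in
        COMPLETED|BLOCKED_INVESTIGATION|BLOCKED_SUBTASKS) echo "continue" ;;
        *) echo "stop" ;;
    esac
)
```

Only ERROR (or an unreadable summary) stops the loop here; the two BLOCKED_* statuses continue because new tickets were created for the next iteration to pick up.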
Exit status meanings:
COMPLETED - Task done, loop continues with next task
BLOCKED_INVESTIGATION - Created REQUIRES-INVESTIGATION task, investigate-blocker will handle
BLOCKED_SUBTASKS - Created subtasks, work can continue on those
ERROR - Fatal error, loop should stop
ALWAYS include in subagent prompts:
**CRITICAL NOTICE:** Another AI may be working concurrently in this repository.
When you commit:
1. Run `git status` to see ALL changes
2. Identify which files belong to [task-id]
3. Use `git add [file1] [file2] [file3]` for SPECIFIC FILES ONLY
4. INCLUDE the ticket file: `git add .tickets/[task-id].md`
5. NEVER use `git add .` or `git add -A`
6. Run `git diff --cached` to review staged changes
7. Verify ONLY [task-id] changes are staged (code + ticket file)
8. If you see unrelated changes, DO NOT commit them
9. Ask for help if uncertain
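Steps 6 to 8 amount to comparing the staged paths against an explicit allow-list. A minimal sketch of that check follows; the function name `check_staged` and the example file names are hypothetical, and the staged paths are assumed to come from `git diff --cached --name-only`.

```shell
# Sketch only: verify that every staged path is on an explicit allow-list.
# Reads staged paths on stdin, one per line; the allowed paths are the arguments.
# Prints each unexpected path and returns non-zero if any were found.
check_staged() (
    unexpected=0
    while IFS= read -r file; do
        [ -n "$file" ] || continue
        allowed=no
        for ok in "$@"; do
            if [ "$file" = "$ok" ]; then allowed=yes; break; fi
        done
        if [ "$allowed" = no ]; then
            echo "unexpected staged file: $file"
            unexpected=1
        fi
    done
    [ "$unexpected" -eq 0 ]
)
```

A subagent could run `git diff --cached --name-only | check_staged src/auth.py .tickets/[task-id].md` just before committing and refuse to commit on a non-zero result.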
Every subagent must follow the PRIME DIRECTIVE (maximal simplicity):
Every subagent MUST read repository CLAUDE.md as first action:
**FIRST ACTION:** Read /path/to/repo/CLAUDE.md to understand the project architecture.
Update TodoWrite after each stage:
// After research complete
TodoWrite([
{ content: "Research [task]: ...", status: "completed", activeForm: "Researched..." },
{ content: "Implement [task]: ...", status: "in_progress", activeForm: "Implementing..." },
{ content: "Verify [task]: ...", status: "pending", activeForm: "Verifying..." }
])
// After implement complete
TodoWrite([
{ content: "Research [task]: ...", status: "completed", activeForm: "Researched..." },
{ content: "Implement [task]: ...", status: "completed", activeForm: "Implemented..." },
{ content: "Verify [task]: ...", status: "in_progress", activeForm: "Verifying..." }
])
// After verify complete
TodoWrite([
{ content: "Research [task]: ...", status: "completed", activeForm: "Researched..." },
{ content: "Implement [task]: ...", status: "completed", activeForm: "Implemented..." },
{ content: "Verify [task]: ...", status: "completed", activeForm: "Verified..." }
])
After research, if the implementation feels too complex to hold in your head as a coherent unit:
tk create "Phase 1: Database schema" --priority 1 --type task
tk create "Phase 2: Service layer" --priority 1 --type task
tk create "Phase 3: API routes" --priority 1 --type task
tk create "Phase 4: Integration tests" --priority 1 --type task
# Chain dependencies
tk dep phase2-id phase1-id
tk dep phase3-id phase2-id
tk dep phase4-id phase3-id
# Block current task
tk dep current-task-id phase1-id
# Use tk edit to update the description
tk edit current-task-id
# In the editor, update description to: "Epic broken into 4 subtasks (phase1-id through phase4-id). See individual tasks for implementation details."
STATUS: BLOCKED_SUBTASKS
External loop will pick up phase1 next iteration.
If implementation can't proceed due to ambiguity or missing information:
tk create "REQUIRES-INVESTIGATION: Determine authentication method" \
--priority 0 \
--type task \
--description "Implementation blocked: Unclear whether to use OAuth PKCE flow vs client credentials. Need research into: 1) existing auth patterns in codebase, 2) security requirements, 3) OAuth best practices for this app type. See [file:line] for decision point."
tk dep current-task-id blocker-task-id
STATUS: BLOCKED_INVESTIGATION
The investigate-blocker skill will pick up this task, perform deep research, and either resolve it with an actionable task or escalate to HUMAN-TASK if genuine human judgment is needed.
If fix loop runs 3+ times without success:
tk create "Debug persistent test failures in [feature]" \
--priority 1 \
--type task \
--description "Tests failing after 3 fix attempts. Failures: [test names]. Need systematic debugging. See commit [hash] for latest attempt."
tk dep current-task-id debug-task-id
STATUS: BLOCKED_SUBTASKS
The external loop coordinates two skills:
┌─────────────────────────────────────────────────────────────┐
│ External Bash Loop │
├─────────────────────────────────────────────────────────────┤
│ Check tk ready: │
│ ├─ REQUIRES-INVESTIGATION tasks? → investigate-blocker │
│ ├─ Regular tasks? → autonomous-development │
│ ├─ Only HUMAN-TASK? → Stop, notify human │
│ └─ No tasks? → Stop, all done │
└─────────────────────────────────────────────────────────────┘
Location: autonomous-dev-loop.sh (in repository root)
#!/bin/bash
while true; do
    # Check for investigation tasks first (higher priority)
    INVESTIGATION=$(tk query --ready --format json 2>/dev/null | jq '[.[] | select(.title | startswith("REQUIRES-INVESTIGATION:"))]')
    INVESTIGATION_COUNT=$(echo "$INVESTIGATION" | jq 'length' 2>/dev/null || echo "0")
    # Check for human tasks
    HUMAN=$(tk query --ready --format json 2>/dev/null | jq '[.[] | select(.title | startswith("HUMAN-TASK:"))]')
    HUMAN_COUNT=$(echo "$HUMAN" | jq 'length' 2>/dev/null || echo "0")
    # Check for regular tasks
    REGULAR=$(tk query --ready --format json 2>/dev/null | jq '[.[] | select((.title | startswith("REQUIRES-INVESTIGATION:") | not) and (.title | startswith("HUMAN-TASK:") | not))]')
    REGULAR_COUNT=$(echo "$REGULAR" | jq 'length' 2>/dev/null || echo "0")
    # jq prints nothing (and exits 0) on empty input, so default empty counts to 0
    if [ "${INVESTIGATION_COUNT:-0}" -gt 0 ]; then
        echo "Investigation task found, running investigate-blocker..."
        claude -p "Use investigate-blocker skill" || break
    elif [ "${REGULAR_COUNT:-0}" -gt 0 ]; then
        echo "Regular task found, running autonomous-development..."
        claude -p "Use autonomous-development skill" || break
    elif [ "${HUMAN_COUNT:-0}" -gt 0 ]; then
        echo "Only HUMAN-TASK items remain. Human input required."
        # Send notification here
        break
    else
        echo "All tasks complete!"
        break
    fi
    sleep 2
done
Key Features:
Exit Conditions:
The external system can:
BLOCKED_HUMAN_INPUT
ERROR status
COMPLETED
BLOCKED_SUBTASKS
BLOCKED_INVESTIGATION
BLOCKED_SUBTASKS
For external monitoring:
Track these from session summaries:
❌ Ignoring detached HEAD state - ALWAYS check git state first; recover orphaned work before proceeding
❌ Skipping health check - ALWAYS run baseline at session start
❌ Re-running health check per task - Baseline is session-scoped, run it ONCE
❌ Ignoring broken baseline - Fix build/tests before proceeding to tasks
❌ No baseline comparison - Verify forward progress, not just "tests pass"
❌ Idling when no tasks exist - Use Maintenance Mode to discover work
❌ Not using tk ready - Use the Tickets system to discover ready work
❌ Strict priority ordering on continuation - Bias toward related tasks for warm context
❌ Continuing when context is strained - If uncertain about earlier details, EXIT
❌ Picking complex unrelated tasks late in session - Only continue with bounded/related work
❌ Full research for related tasks - Abbreviate research when context is warm
❌ Not creating subtasks - Break down complex work
❌ Guessing requirements - Create REQUIRES-INVESTIGATION task for deeper research
❌ Using HUMAN-TASK directly - Use REQUIRES-INVESTIGATION first; investigate-blocker decides if human needed
❌ Infinite fix loops - Max 3 iterations, then create debugging subtask
❌ Not printing session summary - Required for external monitoring
❌ Not using tk commands - You have full autonomy to create tickets
❌ Committing unrelated changes - Review git diff carefully
❌ Using git add . - Always specify files explicitly
❌ Continuing after error or block - EXIT immediately, don't attempt more tasks
Git state check (ONCE per session, BEFORE health check)
├─ Detached HEAD?
│ ├─ YES:
│ │ ├─ Dispatch recovery subagent
│ │ ├─ Investigate reflog, find original branch
│ │ ├─ Recovery successful?
│ │ │ ├─ YES → Continue to health check
│ │ │ └─ NO (conflicts/failure) → Create HUMAN-TASK → EXIT: ERROR
│ │
│ └─ NO → Continue to health check
Health check (ONCE per session)
├─ Build/tests broken?
│ ├─ YES:
│ │ ├─ Check tk ready/blocked
│ │ ├─ Failures tracked?
│ │ │ ├─ YES → Pick priority task → Research/Implement/Verify → Continue below
│ │ │ └─ NO → Create investigation tasks → EXIT: COMPLETED
│ │
│ └─ NO → Continue to task selection
│
Task selection
├─ Ready tasks exist?
│ ├─ YES → Select task (priority + relatedness) → Continue to research
│ └─ NO → Maintenance Mode: explore/investigate → Create improvement tasks → EXIT: COMPLETED
│
Research complete
├─ Task feels too complex for current context?
│ ├─ YES → Create subtasks → EXIT: BLOCKED_SUBTASKS
│ └─ NO → Continue to implement
│
Implementation complete
├─ Missing requirements or ambiguity?
│ ├─ YES → Create REQUIRES-INVESTIGATION → EXIT: BLOCKED_INVESTIGATION
│ └─ NO → Continue to verify
│
Verification complete
├─ Tests pass? Forward progress? (vs baseline)
│ ├─ YES → Close task → CONTINUATION ASSESSMENT (below)
│ └─ NO → Fix loop
│ ├─ Iteration < 3? → Dispatch fix → Re-verify
│ └─ Iteration ≥ 3? → Create debugging subtask → EXIT: BLOCKED_SUBTASKS
│
CONTINUATION ASSESSMENT (after successful verification)
├─ Context health good?
│ ├─ NO → EXIT: COMPLETED (context strained)
│ └─ YES → Check tk ready
│ ├─ No tasks? → EXIT: COMPLETED (all done)
│ └─ Tasks exist?
│ ├─ Related/bounded task available? → Loop to "Task selection" (skip health check)
│ └─ Only complex/unrelated tasks? → EXIT: COMPLETED (fresh session better)
│
Error at any stage
└─ EXIT: ERROR
Remember:
The goal: Autonomous development with intelligent session management, leveraging warm context for related work while maintaining clear status reporting. The two-tier escalation model (REQUIRES-INVESTIGATION → investigate-blocker → HUMAN-TASK) maximizes autonomous resolution while ensuring humans are only interrupted when truly necessary.