| name | pentest-orchestrator |
| description | Orchestrate multi-phase penetration test engagements using specialized sub-agents (specter-recon, specter-enum, specter-vuln, specter-exploit, specter-post, specter-report). Use when: running a structured pentest, spawning pentest agents in sequence, managing engagement phases with decision gates, or adapting workflow when a vector is a dead end. NOT for: single-task scans (use enum/recon skills directly), writing final pentest reports without context (use reporting skill), or vulnerability analysis without orchestration context. |
Pentest Orchestrator
Manage end-to-end penetration test engagements across 7 structured phases with decision gates, agent coordination, and adaptation logic.
When to Use
✅ USE this skill when:
- "Run a pentest on [target]"
- "Start a penetration test engagement"
- "Orchestrate the recon → enum → exploit workflow"
- "What should I do next in this engagement?"
- A pentest phase completed and you need the next step
- A phase failed and you need to pivot
When NOT to Use
❌ DON'T use this skill when:
- Running a single nmap scan (use enum skill directly)
- Writing a standalone report without engagement context (use reporting skill)
- User wants a quick vulnerability check, not a full engagement — use the quick-scan path in scripts/quick-scan/DISPATCH.md
- No authorization/scope has been confirmed
Scripts-First Reuse Rule
Treat scripts/ as the default reusable operations layer for full pentests.
Required behavior:
- prefer scripts/orchestration/*.py planning/runners before hand-writing repeated phase logic
- prefer scripts/shared/manifests/ and target-family planning before manual phase planning when the target traits are known or inferable
- prefer existing helpers under scripts/recon/, scripts/enum/, scripts/vuln/, scripts/exploit/, and scripts/post-exploit/ before inventing new one-off command chains
- only fall back to fully manual flows when the script layer does not fit, lacks coverage, or needs troubleshooting
- keep quick-scan and full-pentest separate; do not substitute quick-scan profiles for full-pentest target-family/manifests
When a real engagement reveals a repeatable command sequence, parser need, checklist, or validation pattern that would improve future pentests:
- promote it into scripts/ as a reusable helper, manifest, parser, or docs update
- do not leave reusable logic trapped only inside the engagement folder
- keep evidence and target-specific outputs in engagements/<target>/, but keep reusable operational logic in scripts/
- prefer to capture the reusable upgrade during or immediately after the engagement once the pattern is verified live
OSI-Layer Framing
For full pentest orchestration, use the OSI model as an analytical scaffold for planning, evidence organization, pivots, and remediation mapping.
Required behavior:
- validate lower-layer trust assumptions before over-investing in higher-layer exploitability when the target and ROE allow it
- keep Session and Presentation analysis explicit when those behaviors materially exist, even if the product collapses them into app or transport logic
- do not force every finding into exactly one layer; record the primary layer plus dependent/supporting layers when the attack path is cross-layer
- use OSI framing to avoid misattribution, for example separating routability/segmentation failures from pure application flaws
- use layered reporting to clarify which team owns which fix
Default orchestration flow when relevant:
- Layer 1, physical adjacency, ports, removable media, local console, device trust assumptions
- Layer 2, switching, VLAN, Wi-Fi, local MITM resistance, NAC edge trust
- Layer 3, segmentation, routing, ACLs, reachability, control-plane trust
- Layer 4, TCP/UDP exposure, state handling, timeout/rate-limit behavior
- Layer 5, session establishment, relay resistance, auth fallback, signing/binding integrity
- Layer 6, TLS, certificate trust, encoding, compression, serialization boundaries
- Layer 7, web/API/business logic, authz/authn, uploads, injection, SSRF
Use this as a planning and interpretation layer, not as a reason to rewrite an engagement that is already being tracked in another valid structure such as Pentest01/Pentest02 attack threads.
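The cross-layer bookkeeping described above can be sketched in code. This is a minimal illustration only: the `Finding` shape, layer names, and grouping helper are assumptions for the sketch, not a workspace API.

```python
from dataclasses import dataclass, field

OSI_LAYERS = {
    1: "Physical", 2: "Data Link", 3: "Network", 4: "Transport",
    5: "Session", 6: "Presentation", 7: "Application",
}

@dataclass
class Finding:
    title: str
    primary_layer: int                                     # layer that owns the root cause
    supporting_layers: list = field(default_factory=list)  # cross-layer dependencies

def remediation_owners(findings):
    """Group findings by primary layer so each team sees what it owns."""
    owners = {}
    for f in findings:
        owners.setdefault(OSI_LAYERS[f.primary_layer], []).append(f.title)
    return owners

findings = [
    Finding("SSRF reaches internal metadata service", primary_layer=7,
            supporting_layers=[3]),  # segmentation failure enables the impact
    Finding("Flat VLAN allows ARP spoofing", primary_layer=2),
]
print(remediation_owners(findings))
# {'Application': ['SSRF reaches internal metadata service'], 'Data Link': ['Flat VLAN allows ARP spoofing']}
```

Recording the primary layer plus supporting layers, rather than forcing one label, keeps the SSRF example attributable to the application team while still flagging the Layer 3 segmentation dependency.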
OpenCode Integration
Load skills/opencode-utility/SKILL.md whenever the orchestrator or a phase agent hits a coding or scripting bottleneck.
Use it for:
- building or refactoring parsers
- quick automation helpers
- evidence formatting utilities
- reusable wrappers that reduce repetitive terminal work
- improving phase scripts and manifests when a reusable upgrade is justified
- turning repeatable discoveries from a live pentest into maintainable helpers under scripts/
Default behavior:
- start in plan mode when the utility shape is unclear
- use build mode when the implementation target is already clear
- bias toward scripts/opencode/reusable/ for utilities likely to recur
- use scripts/opencode/session/ for engagement-specific helpers
- use scripts/opencode/throwaway/ only for urgent one-offs
Phase 0: Engagement Setup
Before any agent spawns, load skills/preengagement-essentials/SKILL.md when the engagement is real or authorization/scope/ROE are not already explicit.
When the user says pentest <target> or otherwise asks to start a real pentest engagement, do not jump straight into active testing.
Use this pre-engagement chat flow:
- first ask only for the Assigned Penetration Tester fields:
- Organization name
- Assigned Tester Name
- Email address
- do not spawn the Google Docs pre-engagement form until those three fields are answered
- once those fields are provided, spawn the pre-engagement form and return its reference plus the engagement naming prompt
- do not immediately ask in chat for the full authorization/scope block if the spawned pre-engagement form is intended to collect that intake
- still enforce that active testing must not begin until authorization and scope are explicit
Before any active testing, collect and document this intake in the pre-engagement artifacts:
- engagement title
- target
- test type
- dates
- rules of engagement
- scope in
- scope out
- credentials provided
- constraints
- success criteria
- approval / authorization reference
If any of these are missing, mark them TBD in the documentation and clearly state that active testing should not proceed until authorization and scope are explicit.
Before any agent spawns, confirm:
- Authorization — Written permission exists for the target
- Scope — Specific IPs, domains, and networks are defined
- Rules of engagement — What's off-limits? Time windows?
- Third-party / provider approvals — Hosted or cloud constraints are addressed
- Documentation structure — Initialize the engagement with python3 scripts/orchestration/init_engagement_docs.py <target-name> ...
- Target-family planning baseline — When target traits are known or reasonably inferable, prefer python3 scripts/orchestration/plan_target_family.py before hand-writing phase plans. Use --family <name> when the family is already known, or --hint "<target description>" --target <host-or-url> --engagement <engagement-path> to recommend and expand one automatically. Treat the output as the default reusable baseline for a full pentest, not for quick-scan.
- Central registers — Maintain registers/master-activity-log.md, findings-register.md, evidence-register.md, attack-path-register.md, and asset-register.md
Preferred full-pentest planning flow when the target type is known or inferable:
- optionally preview the recommendation with python3 scripts/orchestration/recommend_target_family.py --hint "<target description>"
- inspect the composed rationale with python3 scripts/orchestration/describe_target_family.py --family <recommended-family> when you need the why
- generate the reusable phase baseline with python3 scripts/orchestration/plan_target_family.py --family <family> --target <host-or-url> --engagement <engagement-path> or the equivalent --hint form
- use that plan to guide which manifests, wrappers, and manual branches should anchor recon, enum, vuln, exploit, and post-exploit
- if the target type truly is not inferable yet, fall back to ordinary phase planning and update the family choice once evidence improves
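A toy sketch of what hint-based family recommendation looks like conceptually. The family names, keywords, and scoring here are illustrative assumptions only, not the logic inside scripts/orchestration/recommend_target_family.py.

```python
# Hypothetical family/keyword table for illustration only.
FAMILY_HINTS = {
    "windows-ad": ("active directory", "domain controller", "smb", "kerberos"),
    "web-app":    ("http", "api", "wordpress", "login page"),
    "iot-device": ("raspberry pi", "firmware", "embedded"),
}

def recommend_family(hint):
    """Return the best keyword-matching family, or None to signal the
    fallback to ordinary phase planning until evidence improves."""
    hint = hint.lower()
    scores = {fam: sum(kw in hint for kw in kws) for fam, kws in FAMILY_HINTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] else None

print(recommend_family("SMB shares exposed by a domain controller"))  # windows-ad
```

The `None` return models the fallback rule above: when the target type truly is not inferable, plan normally and revisit the family choice later.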
Read these references before kickoff:
references/engagement-documentation-protocol.md
references/engagement-doc-templates.md
references/phase-handoff.md
Use engagements/<target-name>/pre-engagement/engagement-charter.md and engagements/<target-name>/pre-engagement/scope-and-roe.md as the authoritative intake artifacts.
For compatibility with existing engagements, you may also create or refresh engagements/<target-name>/SCOPE_<target-name>_<YYYY-MM-DD>.md, but the charter and ROE files are now primary.
File naming convention: all generated handoffs and versioned phase outputs MUST include a datetime stamp: <TOPIC>_SUMMARY_<YYYY-MM-DD_HHMM>.md (see References below for the full format). This lets multiple versions coexist and makes the latest easy to identify.
Spawn the first agent only after the intake is recorded, the engagement is cleared for active testing, and any available target-family baseline has been reviewed.
If the target is a recognizable family, include the relevant planning output in the first spawn and keep it as the preferred default baseline for later phases.
First, plan the family baseline when possible:
python3 scripts/orchestration/plan_target_family.py --hint "<target description>" --target <host-or-url> --engagement engagements/<target-name>
Spawn specter-recon with:
task: "Perform passive recon on <target>. Before ad-hoc planning, read the charter, scope/ROE, documentation protocol, and the target-family baseline from scripts/orchestration/plan_target_family.py when one exists. Save all findings to engagements/<target-name>/recon/ and update the phase docs plus shared registers. Use the family plan as the default reusable baseline for full-pentest work, then branch manually from live evidence."
engagement: <target-name>
Phase 1: Reconnaissance (specter-recon)
Objective: Build a target profile and map the attack surface without touching the target.
Activities:
- OSINT gathering (DNS records, WHOIS, Shodan, certificate transparency)
- Subdomain enumeration
- Technology fingerprinting from public sources
- Employee/organizational reconnaissance (if in scope)
- Physical location mapping (if physical testing authorized)
Web search queries: See references/web-search-queries.md for phase-specific queries.
Completion criteria:
DECISION GATE: What vectors look promising?
Read the recon RECON_SUMMARY_*.md. Based on findings:
| Finding | Next Phase | Agent |
|---|---|---|
| Network services discoverable via public sources | Network Enumeration | specter-enum |
| Physical location identified, physical testing authorized | Physical Enumeration | specter-enum (physical mode) |
| Web applications / APIs discovered | Application Enumeration | specter-enum (app mode) |
| Wireless networks identified | Wireless Enumeration | specter-enum (wireless mode) |
| No promising vectors | Request operator input — provide what was found and ask for guidance | |
Adaptation: If network recon yields nothing useful but physical access is available, pivot to physical. Do not force a dead-end vector.
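The decision gate above is effectively a lookup table. A minimal sketch, where the vector keys and mode strings mirror the table rather than any real orchestrator API:

```python
# Illustrative encoding of the recon decision gate.
RECON_GATE = {
    "network_services":  ("Network Enumeration",     "specter-enum"),
    "physical_location": ("Physical Enumeration",    "specter-enum (physical mode)"),
    "web_applications":  ("Application Enumeration", "specter-enum (app mode)"),
    "wireless_networks": ("Wireless Enumeration",    "specter-enum (wireless mode)"),
}

def next_phase(observed_vectors):
    """Pick the first promising vector in table order; otherwise ask the operator."""
    for vector in RECON_GATE:
        if vector in observed_vectors:
            return RECON_GATE[vector]
    return ("Request operator input", None)

print(next_phase(["wireless_networks", "web_applications"]))
# ('Application Enumeration', 'specter-enum (app mode)')
```

The operator-input fallback encodes the "no promising vectors" row, and swapping the observed set is how the physical-access pivot in the adaptation note plays out.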
Phase 2: Enumeration (specter-enum)
Objective: Actively probe the target to discover open services, versions, and attack surface details.
Input from Recon: Read engagements/<target-name>/recon/RECON_SUMMARY_*.md for the target profile and recommended vector.
Activities (vary by vector):
- Network: If a target-family plan exists, treat its enum manifests and listed steps as the default baseline first. Otherwise prefer reusable wrappers like scripts/orchestration/run_enum_profile.py --profile enum-windows-host --target <target> --engagement <target-name> when they fit, then scripts/enum/ports/scan_ports_fast.sh, scripts/enum/ports/scan_ports_service.sh, and service-specific wrappers like SMB/RDP/WinRM/Web before custom manual scanning
- Physical: Badge cloning attempts, lock assessment, network jack enumeration, dumpster diving assessment, social engineering prep
- Application: Directory busting, parameter discovery, API endpoint mapping, technology version detection; prefer scripts/enum/web/enum_web_basic.sh for baseline web coverage before custom app enumeration
- When repeated parsing, result normalization, or helper scripting appears, load skills/opencode-utility/SKILL.md and prefer reusable upgrades under scripts/opencode/reusable/ when they will improve current or future enum work
- Wireless: SSID discovery, encryption assessment, handshake capture, evil twin detection
- When stronger enum-phase methodology is needed, load skills/enum-phase-essentials/SKILL.md to reinforce fast-then-accurate workflows, validation gates, protocol-triggered deep dives, and clean service inventory handoffs
Completion criteria:
DECISION GATE: What did we find?
| Finding | Next Action |
|---|---|
| Open services with version numbers | Proceed to Vulnerability Analysis |
| Services found but versions unclear | Run additional fingerprinting, then proceed |
| No services found on network vector | Pivot to alternative vector (physical, app-layer) |
| Complete dead end (no services, no physical access, no apps) | Report findings to operator, request scope adjustment |
Phase 3: Vulnerability Analysis (specter-vuln)
Objective: Match discovered services against known vulnerabilities and assess exploitability.
Input from Enum: Read engagements/<target-name>/enum/ENUM_SUMMARY_*.md for the service inventory.
Activities:
- If a target-family plan exists, start from its vuln manifests and sub-surface notes before narrowing into manual validation
- CVE matching against service versions (searchsploit, NVD, vendor advisories)
- Web search for latest exploits and PoCs — see references/web-search-queries.md
- Configuration weakness analysis (default credentials, open shares, misconfigurations)
- Exploitability scoring (CVSS, active exploitation in the wild)
- Chain analysis — can multiple low-risk vulns combine into a high-risk path?
- When repeated CVE triage formatting, evidence conversion, or report-helper scripting is needed, load skills/opencode-utility/SKILL.md and prefer reusable or session utilities instead of hand-writing one-off glue each time
- When stronger vuln-phase methodology is needed, load skills/vuln-phase-essentials/SKILL.md to reinforce validation discipline, CVE/CWE/CVSS handling, KEV/EPSS-aware prioritization, and report-ready evidence standards
Completion criteria:
DECISION GATE: Is there an exploitable path?
| Finding | Next Action |
|---|---|
| Confirmed exploitable vulnerability with PoC | Proceed to Exploitation |
| Potential vulnerability, needs verification | Run additional enumeration on the specific service |
| Vulnerability exists but no known exploit | Research manual exploitation techniques, then proceed or report |
| No exploitable path found | Report findings, recommend remediation, consider engagement complete |
Phase 4: Exploitation (specter-exploit)
Objective: Demonstrate impact by exploiting vulnerabilities in a controlled manner.
Input from Vuln: Read engagements/<target-name>/vuln/VULN_SUMMARY_*.md for the exploit plan.
Activities:
- If a target-family plan exists, use its exploit baseline and notes to keep attempts evidence-first and aligned to the verified attack path
- Execute exploits in the order specified by the exploit plan
- Document success/failure for each attempt with evidence
- Capture screenshots, command outputs, proof-of-concept
- If initial access gained, establish a stable foothold
- If a safe, authorized validation helper or exploit-lab parser is needed, load skills/opencode-utility/SKILL.md for coding support, but do not use it to create unsafe or unauthorized offensive tooling
- Do NOT destroy data or cause denial of service (unless explicitly authorized)
- When stronger exploit-phase methodology is needed, load skills/exploit-phase-essentials/SKILL.md to reinforce precondition checks, validation ladders, candidate selection discipline, and evidence/cleanup standards
Completion criteria:
DECISION GATE: What level of access was gained?
| Access Level | Next Action |
|---|---|
| Root / SYSTEM / full admin | Proceed to Post-Exploitation |
| Limited user access | Attempt privilege escalation, document attempts |
| Service-level access only | Attempt escalation or report what was achieved |
| Exploitation failed | Report what was attempted, what was partially achieved |
Phase 5: Post-Exploitation (specter-post)
Objective: Demonstrate the full business impact of compromised access.
Input from Exploit: Read engagements/<target-name>/exploit/EXPLOIT_SUMMARY_*.md for access level and credentials.
Activities:
- If a target-family plan exists, use its post-exploit baseline and notes to structure impact capture instead of improvising the starting checklist
- Credential harvesting (memory, files, databases, config files)
- Lateral movement (pivot to other systems on the network)
- Persistence mechanisms (scheduled tasks, authorized keys, registry)
- Data exfiltration assessment (what sensitive data is accessible?)
- Impact demonstration (read access to PII, financial data, intellectual property)
- Document everything — every command, every finding
- If note conversion, evidence summarization, or impact-formatting helpers would save time, load skills/opencode-utility/SKILL.md and place phase-specific code in scripts/opencode/session/ unless broader reuse is likely
- When stronger post-exploitation methodology is needed, load skills/post-phase-essentials/SKILL.md to reinforce impact assessment, access-path discipline, telemetry-aware evidence, and cleanup/residual-risk reporting
Completion criteria:
Phase 6: Reporting (specter-report)
Objective: Compile all findings into actionable deliverables.
Input: Read ALL *_SUMMARY_*.md files from phases 0–5.
Activities:
- Compile findings from all phases into a structured report
- Generate executive summary (business risk, not technical jargon)
- Technical findings with evidence (screenshots, command outputs, CVEs)
- Score findings with the workspace CVSS house standard: CVSS v4.0 Base by default, CVSS v3.1 additionally when required for compatibility with public CVEs, NVD, vendor advisories, scanners, or client workflows
- Include CVSS version, vector, numeric score, and short metric rationale for every scored finding; mark incomplete cases as provisional or unscored with explanation
- Remediation guide with prioritized recommendations
- Keep technical severity separate from final remediation priority by also considering exploit evidence, KEV/EPSS, exposure, asset criticality, and attack chaining
- Include explicit cleanup / restoration status, tester-created artifacts, residual risk, and retest guidance
- Generate presentation slides — see references/examples.md for engagement context
- If report assembly, finding normalization, or markdown conversion becomes repetitive, load skills/opencode-utility/SKILL.md and prefer reusable reporting helpers over ad-hoc formatting
- When stronger report-phase methodology is needed, load skills/report-phase-essentials/SKILL.md to reinforce multi-audience structure, QA gates, secure handling, remediation quality, and cleanup/restoration reporting
- For real finalized engagements, automatically tell specter-report to create a native Google Doc from the final report, publish a PDF link, and create styled Google Slides from a generated PPTX, then return all three links
- Optionally keep a raw markdown upload in Drive as an archive copy
- Skip publishing only for dry runs, mock engagements, or explicit no-publish instructions
Deliverables:
engagements/<target-name>/reporting/REPORT_FINAL_<YYYY-MM-DD_HHMM>.md
engagements/<target-name>/reporting/EXECUTIVE_SUMMARY_<YYYY-MM-DD_HHMM>.md
engagements/<target-name>/reporting/REMEDIATION_GUIDE_<YYYY-MM-DD_HHMM>.md
engagements/<target-name>/reporting/PROCESS_OVERVIEW_<YYYY-MM-DD_HHMM>.md
- Presentation slides
- Drive / Slides share links
Reporting spawn instruction template:
Spawn specter-report with:
task: "Read all *_SUMMARY_*.md files under engagements/<target-name>/. Build the structured findings input needed by the production report generator. Generate REPORT_FINAL_<YYYY-MM-DD_HHMM>.md in engagements/<target-name>/reporting/ using the real branded implementation at reporting/scripts/generate_report.py. Also generate PROCESS_OVERVIEW_<YYYY-MM-DD_HHMM>.md as a stakeholder-friendly process narrative that explains what was actually done in each phase, what was observed, and why the next step happened. Include remediation and security enhancement recommendations for every finding. Because the user asked for the report, automatically publish it with reporting/scripts/generate_report.py --create-doc --create-slides --upload-drive --gdrive-account hatlesswhite@gmail.com --slides-title 'Pentest Report — <target-name>'. Return the local output path plus the Google Doc link, PDF link, and Slides link. Use raw gog docs/slides-from-markdown only as fallback if the branded generator path fails."
engagement: <target-name>
Agent Coordination
Sequential Phases (Default)
Phases 0→1→2→3→4→5→6 run in sequence. Each phase's *_SUMMARY_*.md is the handoff to the next.
Parallel Execution
Only use parallel agents when:
- Multiple targets in scope (spawn one agent per target for the same phase)
- Multiple vectors from a decision gate are viable (run them in parallel, pick the best result)
- Time is constrained and phases can be safely overlapped (e.g., web search for CVEs during enum)
Spawning Pattern
Spawn specter-<phase> with:
task: "<Specific instructions based on decision gate output>"
engagement: "<target-name>"
input: "Read engagements/<target-name>/<previous-phase>/<PREV_PHASE>_SUMMARY_*.md"
For specter-report, default to publishing for any real engagement that reaches a final/completed report state:
Spawn specter-report with:
task: "Read all prior *_SUMMARY_*.md files for <target-name>. Generate final reporting deliverables in engagements/<target-name>/reporting/, including PROCESS_OVERVIEW_<YYYY-MM-DD_HHMM>.md as the non-technical process narrative. Then create a Google Doc, publish/export a PDF, create Google Slides, and upload the raw markdown archive using reporting/scripts/generate_report.py --create-doc --create-slides --upload-drive --gdrive-account hatlesswhite@gmail.com. Return the local file path plus the Google Doc link, PDF link, and Slides link in the handoff. Skip publishing only for dry runs or explicit no-publish instructions."
engagement: "<target-name>"
Always include:
- The engagement target name
- The specific path to save output
- The path to the previous phase's handoff document
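The spawning pattern and its three always-include fields can be sketched as a formatter. The path layout follows this document's conventions; the helper itself is illustrative, not an orchestrator API.

```python
def spawn_instruction(phase, target, prev_phase, task):
    """Fill the spawning pattern with the three always-include fields:
    engagement target name, output path, and previous-phase handoff path."""
    handoff = f"engagements/{target}/{prev_phase}/{prev_phase.upper()}_SUMMARY_*.md"
    return (
        f"Spawn specter-{phase} with:\n"
        f'task: "{task} Save output to engagements/{target}/{phase}/."\n'
        f'engagement: "{target}"\n'
        f'input: "Read {handoff}"'
    )

print(spawn_instruction("vuln", "acme-corp", "enum",
                        "Match the enum service inventory against known CVEs."))
```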
Adaptation Logic
When a Phase Fails or Times Out
- Check the agent's output for partial findings
- If partial data exists, proceed with what was found (document the gap)
- If nothing useful, retry once with adjusted parameters
- If retry fails, consult the decision gate for alternative vectors
When a Vector is a Dead End
- Do NOT retry the same approach indefinitely
- Check the decision gate table for pivot options
- If all vectors exhausted, report to operator with a summary of what was tried
- Example: Network recon on a Raspberry Pi with no open services → pivot to physical access or app-layer (web interface)
When Web Search is Unavailable
- Fall back to training knowledge for CVE databases and exploit frameworks
- Use searchsploit and local exploit databases
- Manually research vendor security advisories
- Document that web search was unavailable (affects confidence in "latest" findings)
When Target is Offline
- Retry up to 3 times with increasing intervals (5min, 15min, 30min)
- If still offline, document and report to operator
- Proceed with analysis of any data already gathered
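The offline-target retry schedule above can be sketched as a small helper. The `probe` and injectable `sleep` parameters are illustrative assumptions so the schedule stays testable without real waits.

```python
import time

def wait_for_target(probe, intervals_min=(5, 15, 30), sleep=time.sleep):
    """probe() -> bool. One immediate check, then up to three retries on the
    increasing schedule; False means document and report to the operator."""
    if probe():
        return True
    for minutes in intervals_min:
        sleep(minutes * 60)
        if probe():
            return True
    return False

# Demonstrate with an injected sleep so nothing actually blocks:
slept = []
answers = iter([False, False, True])
print(wait_for_target(lambda: next(answers), sleep=slept.append), slept)
# True [300, 900]
```

On a `False` result, the flow above still applies: document the outage and proceed with analysis of any data already gathered.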
Context Handoff Protocol
Every phase MUST produce *_SUMMARY_<YYYY-MM-DD_HHMM>.md in its engagement directory. See references/phase-handoff.md for the template.
That handoff is required, but it is not sufficient by itself. Every phase must also keep these current:
- phase summary
- activity log
- evidence index
- findings delta
- next actions
- shared registers when new assets, findings, evidence, or attack paths appear
Mandatory documentation on every meaningful run
Do not treat documentation as end-of-phase cleanup.
For every meaningful run, command batch, operator-supplied result set, or sub-agent handoff, update documentation in the same working turn before considering the run complete.
Minimum required updates per run:
- append the action/result to the phase activity log
- register any new evidence IDs in the evidence register and phase evidence index
- update the phase summary and findings delta when the understanding changed
- update next actions when the recommended path changed
- update shared registers when new findings, assets, or attack paths appeared
A phase is incomplete if work happened but the engagement docs still describe an older state.
When delegating to sub-agents, explicitly instruct them to update their phase docs and registers before handing off. If they do not, the orchestrator must normalize the docs immediately after receiving results.
Key fields:
- Found: What was discovered (specific, actionable data)
- Not Found: What was checked but yielded nothing (prevents re-checking)
- Recommended Next: Which phase and vector to pursue
- Key Data: IPs, versions, credentials, CVEs, file paths — anything the next agent needs
- Confidence: High / Medium / Low — how confident are we in these findings?
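The key fields above can be rendered mechanically so that gaps stay visible as TBD. This is a sketch only; references/phase-handoff.md holds the real template.

```python
HANDOFF_FIELDS = ("Found", "Not Found", "Recommended Next", "Key Data", "Confidence")

def render_handoff(topic, stamp, **sections):
    """Render the key fields as sections; anything not supplied becomes TBD,
    which mirrors the 'mark missing intake as TBD' rule elsewhere in this doc."""
    lines = [f"# {topic} Summary ({stamp})", ""]
    for field in HANDOFF_FIELDS:
        key = field.lower().replace(" ", "_")
        lines += [f"## {field}", sections.get(key, "TBD"), ""]
    return "\n".join(lines)

doc = render_handoff(
    "ENUM", "2024-05-01_1430",
    found="22/tcp OpenSSH 8.2, 80/tcp nginx 1.18",
    recommended_next="Vulnerability Analysis (specter-vuln)",
    confidence="High",
)
print(doc)
```

An explicit "Not Found: TBD" in the output is a prompt to record what was checked, since the Not Found field is what prevents the next agent from re-checking dead ends.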
Web Search Integration
See references/web-search-queries.md for curated queries per phase. Key principles:
- Always try web search for CVE lookups and latest exploit research
- Fall back gracefully if search is unavailable (training knowledge + local tools)
- Validate web search results against local databases (searchsploit, NVD)
- Document when web search was used vs. unavailable
Quality Checklist
Before moving to the next phase, and after every meaningful run within a phase, verify: