| name | kali-pentest |
| description | Execute authorized penetration testing via Kali Linux CLI tools over SSH or Docker. Covers: information gathering, vulnerability analysis, sniffing & spoofing, web/API testing, exploitation, password attacks, wireless, cloud-native security, RFID/NFC, VoIP/ICS, reverse engineering, forensics, post-exploitation/C2, and reporting. |
| user-invocable | true |
Kali Pentest Skill
Security Constraints (Read First)
- Authorization is mandatory: never scan or probe any target without confirmed written authorization from the user.
- Scope is binding: only test the hosts, domains, ports, accounts, and techniques that the user has authorized.
- Risk confirmation: High/Critical operations require explicit user approval before execution.
- Prohibited: attacking unauthorized targets, destructive operations on production systems, modifying/deleting/encrypting target files without explicit request, attacking critical infrastructure.
High-risk operations include exploitation, credential spraying, brute forcing, NTLM relay, phishing, wireless attacks, persistence, data exfiltration, DoS-like scans, and intrusive vulnerability checks.
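The risk-confirmation rule can be sketched as a small gate that fails closed. This is only an illustration: `risk_gate` is a hypothetical helper, the `approved` token is an assumed convention, and the Low/Medium level names are assumptions (the text above names only High and Critical).

```shell
# Hypothetical risk gate: returns 0 when the operation may proceed.
# $1 = risk level, $2 = "approved" once the user has explicitly approved.
# Low/Medium are assumed level names; the source names only High/Critical.
risk_gate() {
  case "$1" in
    High|Critical)
      # Proceed only when explicit approval was recorded.
      [ "${2:-}" = "approved" ]
      ;;
    Low|Medium)
      return 0
      ;;
    *)
      # Unknown level: fail closed rather than guess.
      return 1
      ;;
  esac
}
```

A caller might use it as `risk_gate High "$user_answer" || echo "waiting for approval"`, where `user_answer` is whatever the user actually replied.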
Step 1: Determine Execution Environment
Check the user's message for connection details. If not provided, ask the user.
| Available | Mode | Command pattern |
|---|---|---|
| Kali is the local system | Local (direct) | {command} (no wrapper needed) |
| SSH credentials | SSH (preferred for remote) | ssh {CONNECT_CMD} "{command}" |
| Running container | Docker exec | docker exec {CONTAINER} {command} |
| Docker only | Docker persistent container | Create or start kali-pentest, then run docker exec kali-pentest {command} |
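As a rough sketch, the command patterns in the table can be produced by one helper that prints the wrapped command line for a given mode. `wrap_cmd` is a hypothetical name; `CONNECT_CMD` and `CONTAINER` mirror the placeholders in the table, and the default container name `kali-pentest` is taken from the last row. The helper does not escape double quotes inside the command, so it is only suitable for simple commands.

```shell
# Hypothetical helper: print the wrapped command line for a mode.
# $1 = local | ssh | docker, $2 = the command to run inside Kali.
# Note: double quotes inside $2 are not escaped in this sketch.
wrap_cmd() {
  case "$1" in
    local)  printf '%s\n' "$2" ;;
    ssh)    printf 'ssh %s "%s"\n' "$CONNECT_CMD" "$2" ;;
    docker) printf 'docker exec %s %s\n' "${CONTAINER:-kali-pentest}" "$2" ;;
    *)      echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

Printing the command instead of running it keeps the wrapper easy to log and review before execution.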
Environment initialization:
- New task: run `mkdir -p /tmp/kali-pentest-state/<target>/` on the Agent's host. Read references/environment/state-files.md for naming rules and file formats.
- Continuing task: re-read existing state files to recover progress.
- Before starting any testing, update Kali tools: `apt-get update && apt-get upgrade -y`.
Capability Check
Use the environment that matches the task:
| Requirement | Preferred environment | Rule |
|---|---|---|
| Full assessment, internal LAN, raw sockets, service daemons, database-backed tools | SSH to a full Kali VM/server | Prefer SSH when available |
| CLI-only reconnaissance, web testing, basic scanning, reporting | Persistent Docker container | Acceptable if tools and network reachability are verified |
| Wireless monitor mode, USB adapters, hardware-dependent tests | Physical/VM Kali over SSH | Do not use Docker |
| GPU password cracking | Full Kali with GPU access | Verify hashcat -I before planning GPU work |
| GVM/OpenVAS, Neo4j/BloodHound, ZAP daemon, Metasploit database | Full Kali preferred | Docker is acceptable only if the service stack is already configured |
If the selected task requires unsupported capabilities, stop and explain the limitation instead of forcing the tool to run.
Before heavy work, run the readiness checks for the selected environment:
- Local mode: references/environment/local-mode.md
- Full Kali/server mode: references/environment/server-mode.md
- Docker mode: references/environment/docker-mode.md
Missing optional tools are not an environment failure. Install the required package after selecting the playbook, or choose an alternative from the category README.
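One way to treat missing tools as information and not as a failure is a check that reports each tool's status and always returns success. `check_tools` is a hypothetical helper, not part of the referenced readiness checks.

```shell
# Hypothetical helper: report tool availability without failing.
# Prints one "present:" or "missing:" line per tool name given.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "present: $tool"
    else
      echo "missing: $tool"
    fi
  done
  # A missing optional tool is not an environment failure.
  return 0
}
```

The `missing:` lines can then drive the install step or the choice of an alternative from the category README.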
SSH Patterns
ssh {CONNECT_CMD} "whoami && uname -a"
ssh {CONNECT_CMD} "nohup {cmd} </dev/null > /tmp/{task}.log 2>&1 & echo \$!"
scp {USER}@{HOST}:/tmp/{output_file} /tmp/
Docker Mode
Use Docker only as a persistent Kali execution environment. Do not use one-shot temporary containers for assessments.
Read references/environment/docker-mode.md first. Read references/environment/docker-mode-persistent-container.md only when creating, starting, or installing tools into the container. Read references/environment/docker-mode-networking.md only for raw-socket behavior, Docker Desktop limitations, or reachability/routing problems.
Step 2: Plan
2.1 Scope Confirmation
- Clarify target: IP, domain, CIDR range, application URL, wireless SSID, image file, or account set.
- Confirm authorization: explicitly verify with the user. This is mandatory.
- Identify test type: black-box, white-box, gray-box, authenticated, unauthenticated, internal, external, wireless, or forensic.
- Ask about constraints: time limits, excluded hosts/ports, rate limits, lockout policy, maintenance windows, compliance requirements.
- Set risk gates: agree which High/Critical actions require a second confirmation during execution.
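Once scope is confirmed, it can help to record it in a file and check every target against that file before running a tool. The sketch below assumes a hypothetical scope file with one authorized target per line and does exact matching only; it does not expand CIDR ranges or wildcards.

```shell
# Hypothetical scope check: return 0 only when the target appears
# verbatim in the scope file (one authorized target per line).
# $1 = target, $2 = path to the scope file.
in_scope() {
  # -F fixed string, -x whole-line match, -q quiet
  grep -Fxq -- "$1" "$2"
}
```

Exact matching is deliberately strict: a near-miss such as a parent domain is treated as out of scope until the user confirms it.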
2.2 Select Depth
| Depth | Use when | Coverage |
|---|---|---|
| Quick | User needs a fast scan or connectivity check | Low-noise discovery, top ports, basic web fingerprinting, no intrusive checks |
| Standard | Default for authorized assessments | Service enumeration, vulnerability scanning, web crawling, common protocol checks, reportable evidence |
| Deep | User explicitly wants maximum coverage and accepts time/risk | Full ports, selected UDP, authenticated checks, larger wordlists, GVM/OpenVAS, deeper brute-force or exploitation workflows |
Do not run Deep or intrusive checks unless the user explicitly requests them. Otherwise, start from Standard and escalate to Deep only when results justify it and the user consents to the upgrade.
Depth mapping from natural language:
- "full assessment", "comprehensive", "deep", "maximum coverage" → Deep
- "quick scan", "fast check", "connectivity test" → Quick
- No depth qualifier or ambiguous → Standard
- Mixed signals in a compound request → use the highest depth implied; if ambiguous, ask the user
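The phrase mapping above can be approximated with a `case` statement. This is a minimal sketch: `map_depth` is a hypothetical helper, it matches only the literal phrases listed, and it resolves mixed signals by checking the Deep phrases first. It cannot detect genuinely ambiguous requests, which still need a question to the user.

```shell
# Hypothetical mapping from a free-text request to a depth label.
# Deep phrases are checked first, so a mixed request resolves to
# the highest depth implied.
map_depth() (
  # Lowercase the request so matching is case-insensitive.
  req=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$req" in
    *"full assessment"*|*comprehensive*|*deep*|*"maximum coverage"*)
      echo Deep ;;
    *"quick scan"*|*"fast check"*|*"connectivity test"*)
      echo Quick ;;
    *)
      echo Standard ;;
  esac
)
```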
2.3 Select Playbook
See the decision tree in references/playbooks/README.md to select the correct playbook.
If no playbook fits, follow the standard lifecycle: information gathering -> vulnerability analysis -> web or exploitation -> post-exploitation -> reporting.
Step 3: Execute
Reference Reading Order
Follow this 4-layer reading sequence:
1. references/playbooks/README.md: select the correct playbook from the decision tree (at task start, or when switching playbooks mid-task).
2. references/playbooks/<playbook>.md: follow the scenario workflow for the current phase.
3. references/<category>/README.md: use Golden Path and Decision Tree to select suitable tools for the current phase.
4. references/<category>/tools/<toolname>.md: read only for the tool you are about to run.
When a playbook hands off to another playbook, restart this sequence from layer 2 for the new playbook. Do not pre-read materials for phases you have not reached.
Tool Categories
| Phase | Reference | Start here when... |
|---|---|---|
| Information Gathering | references/information-gathering/ | Need to discover hosts, ports, subdomains, or OSINT |
| Vulnerability Analysis | references/vulnerability/ | Need to enumerate services or find vulnerabilities |
| Sniffing & Spoofing | references/sniffing-spoofing/ | Need ARP spoofing, MITM, credential sniffing, DNS spoofing, or packet crafting |
| Web Testing | references/web/ | Target is a web application or API (GraphQL, OpenAPI/REST, gRPC, WebSocket) |
| Exploitation | references/exploitation/ | Vulnerabilities are confirmed and exploitation is authorized |
| Password Attacks | references/password/ | Have hashes to crack or credentials/services to test |
| Wireless | references/wireless/ | Target is a wireless network |
| Cloud-Native | references/cloud-native/ | Target is cloud accounts, Kubernetes, containers, registries, or Docker hosts |
| RFID/NFC | references/rfid-nfc/ | Target is RFID/NFC, Proxmark3, PC/SC, smart cards, or physical credentials |
| VoIP/ICS | references/voip-ics/ | Target is VoIP, SIP/IAX, ICS, OT, PLCs, or Modbus |
| Reverse Engineering | references/reverse-engineering/ | Need binary analysis, disassembly, firmware extraction, or mobile app decompilation |
| Forensics | references/forensics/ | Analyzing disk images, memory dumps, traffic captures, or logs |
| Post-Exploitation | references/post-exploitation/ | Have initial access and need to escalate, pivot, analyze AD, or inspect binaries for privesc |
| Reporting | references/reporting/ | Testing complete and a report is required |
Use multiple complementary tools for critical checks. A clean result from one tool is not proof that the target is clean.
Cross-service interaction testing: When multiple services run on the same host, test for shared resources — shared filesystems (file upload on one service accessible via another), shared databases (credentials from one app accessing another's data), reverse proxy relationships (bypassing WAF by accessing the backend directly), and session/credential sharing between services on different ports.
Execution Standards
- Automated tools first: at each phase, run the automated scanners recommended by the playbook and category README before manual testing. Do not silently replace automated tools with manual scripts — manual testing alone cannot match the coverage of purpose-built scanners.
- Tool before script: when a Kali tool can accomplish the task, use the tool — through a signing proxy or wrapper if needed — rather than writing equivalent custom code. Custom scripts are for target-specific logic that no existing tool covers. A proxy or wrapper that adapts standard tools to custom protocols is part of the workflow, not a reason to skip tools.
- New attack surfaces: after discovering new subdomains, hosts, or services, run the relevant automated scans on each reachable target before proceeding with manual testing.
- Check availability: `which {tool} || apt-get install -y {tool}`. If the tool is not in the apt repository (e.g., katana, httpx, naabu), check the tool's reference file for alternative install methods such as go install or pip install.
- Non-interactive: use `-y`, `--batch`, `--no-interaction`, or equivalent flags to prevent hangs.
- Long tasks: redirect output to a task-specific log, e.g. `nohup {cmd} </dev/null > /tmp/{task}.log 2>&1 & echo \$!`. The `\$!` escape is for use inside a double-quoted ssh command; use a plain `$!` when running locally.
- Do not reuse logs: each long-running tool gets a unique log filename.
- Retrieve files: use `scp` for SSH or `docker cp` for Docker.
- Record commands: preserve command lines, start/end time, target, output file, and important errors for reporting.
- Confirm scan results: high-speed scans can miss ports. Confirm with a slower, higher-retry scan before proceeding.
- Do not skip unidentified services: probe `unknown`, `tcpwrapped`, or `?`-suffixed services before marking them as unidentified.
- Test every service systematically: identifying a service is the beginning of testing, not the end. For each service, complete version identification, CVE assessment, information disclosure checks, access and authentication testing, protocol probing, credential testing, TLS configuration checks, and service configuration auditing. Record negative results explicitly.
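The long-task and unique-log rules above could be combined in a launcher like the following. It is a sketch for the local case (so `$!` is not escaped); `start_bg` and the `LOG_DIR` override are hypothetical, and the log name gets a random `mktemp` suffix, which differs slightly from the plain `/tmp/{task}.log` pattern.

```shell
# Hypothetical launcher: start a command in the background with
# its own log file, then print "<pid> <logfile>" for recording.
# $1 = task name, remaining arguments = the command to run.
start_bg() (
  task="$1"; shift
  # mktemp creates a file whose name no earlier run has used.
  log=$(mktemp "${LOG_DIR:-/tmp}/${task}.log.XXXXXX") || exit 1
  nohup "$@" </dev/null >"$log" 2>&1 &
  printf '%s %s\n' "$!" "$log"
)
```

Recording the printed PID and log path alongside the command line covers most of what the reporting phase needs for that task.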
Artifact and State Management
State files live on the Agent's local host at /tmp/kali-pentest-state/<target>/. Raw tool output stays in the remote Kali environment (/tmp/).
Before starting each new task, create the temporary directory for state files:
mkdir -p /tmp/kali-pentest-state/<target>/
This runs directly on the Agent host — not inside docker exec or ssh.
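Targets such as CIDR ranges or URLs contain characters that do not work well in a directory name, so a small sanitizing step may be useful before `mkdir`. The sketch below is an assumption, not the documented naming scheme: the authoritative rules are in references/environment/state-files.md, and both `sanitize_target` and the `STATE_ROOT` override are hypothetical.

```shell
# Hypothetical sanitizer: replace every character outside
# A-Z a-z 0-9 . _ - with an underscore. The real naming rules in
# references/environment/state-files.md may differ.
sanitize_target() {
  printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_'
}

# Create the state directory on the Agent host and print its path.
# STATE_ROOT is an assumed override of the documented default.
init_state_dir() (
  root="${STATE_ROOT:-/tmp/kali-pentest-state}"
  dir="$root/$(sanitize_target "$1")"
  mkdir -p "$dir" && printf '%s\n' "$dir"
)
```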
After each tool execution, extract key findings to the host state directory:
- Docker mode: `docker cp kali-pentest:/tmp/<file> /tmp/kali-pentest-state/<target>/`
- SSH mode: `scp user@host:/tmp/<file> /tmp/kali-pentest-state/<target>/`
Rules:
- Do not rely on conversation memory — state files are the only data that survives context compression.
- At the start of each new phase, re-read state files to confirm current progress.
- When switching playbooks, write the return point to `todo.txt` under `/tmp/kali-pentest-state/<target>/`. When returning to a playbook, read `todo.txt` to resume at the correct phase and process deferred items.
- Do not store state files only inside the remote environment — they must exist on the Agent host for recovery.
See references/environment/state-files.md for file formats and naming.
Output Management
Large tool outputs (full port scans, vulnerability scanners with thousands of templates) can exceed the agent's processing capacity. Redirect output to files and extract relevant findings rather than reading full output. See individual tool docs for specific output flags and extraction commands.
Error Handling
- Tool not found and install fails: try an alternative install method, or switch to an alternative tool from the category README. Report only if no workable alternative exists.
- Parameter or syntax error: run `{tool} --help` or `{tool} -h` to verify flags and argument format before retrying.
- Command times out or hangs: kill the process, reduce scope, lower concurrency, and retry.
- Empty output: verify reachability (`ping`, `curl`, `nc`, or protocol-specific checks) and confirm the tool supports the target type.
- SSH disconnects: reconnect and check whether background tasks are still running with `ps aux | grep {tool}`.
- Warnings: distinguish informational warnings from scan blockers. For example, template warnings may be recorded while target unreachability requires connectivity troubleshooting.
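For the timeout case, one possible approach is to bound each command with the coreutils `timeout` utility and treat its exit status 124 as the signal to reduce scope before retrying. `run_limited` is a hypothetical wrapper, and the sketch assumes GNU coreutils is installed, as it is on a standard Kali system.

```shell
# Hypothetical wrapper around coreutils `timeout`.
# $1 = limit in seconds, remaining arguments = the command.
# Exit status 124 means the command was stopped at the limit.
run_limited() (
  secs="$1"; shift
  rc=0
  timeout "$secs" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "timed out after ${secs}s: $*" >&2
  fi
  exit "$rc"
)
```

Any other non-zero status is passed through unchanged, so tool errors stay distinguishable from timeouts.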
Step 4: Analyze and Iterate
- Parse output and extract key findings: open ports, versions, vulnerabilities, credentials, misconfigurations, reachable paths.
- Chain findings into the next action: open 445 -> SMB enumeration; web login -> auth testing; SQL injection -> sqlmap; AD signals -> AD playbook.
- Cross-host credential testing: When credentials are discovered on one host (config files, database dumps, path traversal), test them against ALL in-scope hosts and services — not just the current host. Cross-host credential reuse is a common lateral movement vector.
- Cross-validate important findings with a second tool or manual protocol check before reporting them as confirmed. Every confirmed finding must include the complete reproducible command and its actual output as evidence.
- STOP on Critical/High finding: Notify the user immediately and wait for explicit confirmation before further exploitation or escalation.
- If exploitation fails, return to enumeration with a narrower hypothesis instead of repeating the same tool.
- If new hosts, credentials, domains, or pivots are discovered, restart the relevant playbook within the authorized scope.
- Update `/tmp/kali-pentest-state/<target>/` files with each iteration's new findings before planning the next action.
- Self-check before closing a service: Before marking any service as done, verify that every item from the Execution Standards systematic testing requirements has been completed or explicitly recorded as not applicable. A service that "looks secure" has not been tested — it has only been identified.
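As one illustration of extracting key findings instead of reading full output, the filter below pulls open ports out of Nmap grepable output (the format written by `-oG`). `open_ports` is a hypothetical helper, and it assumes the usual `Ports: port/state/protocol/...` field layout of that format.

```shell
# Hypothetical filter: read Nmap grepable (-oG) lines on stdin
# and print "port/protocol" for every port in state "open".
open_ports() {
  awk -F'Ports: ' '/Ports: /{
    n = split($2, entries, ", ")
    for (i = 1; i <= n; i++) {
      # Each entry looks like port/state/protocol/owner/service/...
      split(entries[i], f, "/")
      if (f[2] == "open") print f[1] "/" f[3]
    }
  }'
}
```

It would typically be fed from a saved scan file, for example `open_ports < scan.gnmap`, where `scan.gnmap` stands for whatever grepable output file the scan produced.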
Step 5: Report
Follow references/playbooks/reporting-workflow.md step by step — it is an 8-step workflow, not a single "write report" action. Use references/reporting/tools/report-template.md as the document structure — do not invent a custom structure.
Before starting the report, execute the active playbook's Stop When checklist. Unmet items require returning to the relevant phase — do not proceed to reporting with known coverage gaps undocumented.
Include:
- Scope, authorization statement, dates, environment, and limitations.
- Commands executed and major tool versions.
- Confirmed findings with severity, evidence, impact, reproduction steps, and remediation. Each finding must include the full verification/reproduction process — the exact commands executed and their actual output — so a reader can independently reproduce the result.
- Negative results that matter, such as unreachable hosts or services tested with no finding.
- Artifacts produced and where they were saved.
See references/reporting/README.md for reporting tool selection.
Report File Generation
Reports can exceed 20KB. Do not attempt to write the full report in a single tool call — split into segments (≤8KB each) using cat > for the first segment and cat >> for subsequent ones. Write the report to /tmp/kali-pentest-state/<target>/. Reports are host-side artifacts; do not generate them inside the remote Kali environment.
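The segmented write can be sketched as a helper that reads one segment from stdin, checks its size, and then overwrites or appends. `write_segment` is a hypothetical name, and the sketch assumes that 8KB means 8192 bytes.

```shell
MAX_SEGMENT_BYTES=8192  # assumption: "8KB" means 8192 bytes

# Hypothetical helper: write one report segment read from stdin.
# $1 = report path, $2 = "first" (overwrite) or "next" (append).
write_segment() (
  report="$1"; mode="$2"
  tmp=$(mktemp) || exit 1
  cat > "$tmp"                     # buffer the segment
  size=$(( $(wc -c < "$tmp") ))
  if [ "$size" -gt "$MAX_SEGMENT_BYTES" ]; then
    echo "segment is $size bytes; limit is $MAX_SEGMENT_BYTES" >&2
    rm -f "$tmp"; exit 1
  fi
  case "$mode" in
    first) cat "$tmp" > "$report" ;;
    next)  cat "$tmp" >> "$report" ;;
    *)     echo "mode must be first or next" >&2; rm -f "$tmp"; exit 2 ;;
  esac
  rm -f "$tmp"
)
```

An oversized segment is rejected before anything is written, so the report file never ends up with a partial segment.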
Reference Layout
references/environment/ defines readiness checks for the local, server, and Docker execution environments, plus state-file formats.
references/playbooks/ defines scenario workflows and decision points.
references/<category>/README.md helps select suitable tools in a category.
references/<category>/tools/<name>.md provides concrete command parameters and examples.