| name | tarn-api-testing |
| description | Write, run, debug, and iterate on API tests using Tarn — a CLI-first API testing tool with .tarn.yaml files and structured JSON output. Use when: writing API tests, running tarn commands, debugging test failures, generating tests from OpenAPI specs or curl commands, setting up CI smoke tests, or integrating with tarn-mcp. Triggers: "API test", "tarn", ".tarn.yaml", "test this endpoint", "smoke test", "integration test for API".
|
Tarn API Testing Skill
Tarn is a CLI-first API testing tool. Tests are declarative YAML files (.tarn.yaml). Results are structured JSON designed for AI agent consumption. Single binary, zero runtime dependencies.
Core Workflow — Failures-First Loop
This is the canonical sequence. Default to it for every debugging task. Do not reach for the full JSON report until steps 3–6 have failed to answer the question.
1. tarn validate <path> # syntax/config before running
2. tarn run <path> # produces .tarn/runs/<run_id>/
3. tarn failures # root-cause groups; cascades collapsed
4. tarn inspect last FILE::TEST::STEP # full context for ONE failure
5. Patch tests or application code
6. tarn rerun --failed # reruns only the failing subset
7. tarn diff prev last # confirm fixed / new / persistent
8. Full report.json only when 3–6 are insufficient
Agent rules (read before doing anything else)
- Never slurp
report.json if failures.json is sufficient. failures.json is a 1–50 KB file with one entry per failing step. report.json is often megabytes. tarn failures is the correct entry point, not cat .tarn/last-run.json | jq ....
- Never parse the full run JSON when a summary + inspect will answer the question.
tarn inspect last FILE::TEST::STEP opens one step's request/response/assertion block directly. Use it instead of grepping the whole report.
- Never blame business logic before confirming the response shape didn't drift. See the "Response-shape drift" section below — most
assertion_failed + skipped_due_to_failed_capture clusters are a single upstream capture path that no longer matches the new response body.
- Cascade skips (
skipped_due_to_failed_capture) are suppressed noise. Fix the root cause first. tarn failures already collapses them into └─ cascades: N skipped. Do NOT open each cascade skip individually; they will all resolve when the upstream step is fixed.
- Always
tarn validate before tarn run. Always use --format json (or llm) when parsing results programmatically.
Failures-first commands
tarn failures
tarn failures --format json
tarn failures --include-cascades
tarn failures --run <run_id>
tarn inspect last
tarn inspect last tests/users.tarn.yaml::create-user::1
tarn inspect last --format json --filter-category assertion_failed
tarn rerun --failed
tarn rerun --failed --run <run_id>
tarn diff prev last
tarn diff prev last --format json --filter-category assertion_failed
Aliases for run ids: last, latest, @latest (most recent archive), prev (the one before latest).
Deep-inspection flow (only when failures-first is insufficient)
Reach for the full JSON report ONLY when you genuinely need something failures + inspect can't surface — e.g. a cross-step captures map, a passed-step response body, or a bespoke grep against every executed request. Otherwise the small artifacts are always preferred:
.tarn/runs/<run_id>/summary.json — run id, timings, exit code, failed-file list
.tarn/runs/<run_id>/failures.json — one entry per failing step with coordinates, category, message, request method/url, response status, ~500-char redacted body excerpt
.tarn/runs/<run_id>/report.json — full structured report (megabytes; last resort)
.tarn/summary.json / .tarn/failures.json — pointers to the latest run's artifacts
Commands
tarn run
tarn run tests/users.tarn.yaml
tarn run --format json
tarn run --format json --json-mode compact
tarn run --format llm
tarn run --format compact
tarn run --tag smoke
tarn run --env staging
tarn run --only-failed
tarn run --only-fails
tarn run --verbose-responses
tarn run --max-body 16384
tarn run --no-progress
tarn run --test-filter create-user
tarn run --step-filter 2
tarn run --step-filter "Login"
tarn run --select tests/users.tarn.yaml::create-user
tarn run --select tests/users.tarn.yaml::create-user::0
tarn run --ndjson
tarn run --ndjson --format json=out.json
tarn run --redact-header x-custom-secret
tarn run --no-fixtures
tarn run --fixture-retention 10
tarn run --parallel -j 4
tarn run --parallel --no-parallel-warning
tarn summary .tarn/last-run.json
tarn summary - --format compact
tarn validate
tarn validate tests/users.tarn.yaml
tarn validate --format json
tarn env
tarn env --json
tarn fmt
tarn fmt tests/users.tarn.yaml
tarn fmt --check
tarn list
tarn failures
tarn failures --run <id> --format json
tarn rerun --failed
tarn inspect last
tarn inspect last FILE::TEST::STEP
tarn diff prev last
failures, rerun --failed, inspect, and diff all read the per-run archives under .tarn/runs/<run_id>/ introduced in NAZ-400/401. They are the preferred failure-triage entry points. See the Failures-First Loop section near the top of this file.
Output formats for agents vs humans
When stdout is not a TTY and no --format is set, tarn auto-selects llm so piped output is always LLM-digestible without an external parser. When stdout is a TTY, the default is human. Pass --format human explicitly to force boxed human output into a pipe.
--format llm — one-line grep-friendly verdict (tarn: PASS 498/500 steps, 2 failed, 14 files, 16.3s) followed by only failure blocks with request/response/assertion details. Stable ordering across runs. Use this when an LLM will read the output.
--format compact — per-file badge + inline failure expansion + grouped failure summary. Middle ground between human and llm. Respects -v to show captured values per test.
--format human — the colored TTY-friendly boxed output. Default on a TTY.
--format json — the full structured report. Huge; pipe into tarn summary or --only-failed.
tarn summary <run.json> — re-render a saved JSON report as llm or compact format without re-running the tests. Accepts - for stdin (tarn run --format json | tarn summary -).
Streaming and filtering
tarn run streams per-test output as each test finishes (per-file in --parallel mode). With --format human the stream writes to stdout; with structured formats (json, junit, tap, html, curl) it writes to stderr so stdout stays parseable.
--only-failed drops passing files, tests, and steps from both human and JSON output. Summary counts still reflect the full run.
--no-progress disables streaming and prints the final report in one batch at the end — use it when a CI harness already timestamps every line.
--verbose-responses embeds request/response bodies and resolved captures in the JSON / compact / llm report for every step, pass or fail. Per-step opt-in via step-level debug: true. Default body cap is 8 KiB; override with --max-body <bytes>.
Streaming and selective execution
--select FILE[::TEST[::STEP]] narrows a run to specific files, tests, or steps. Repeatable — multiple selectors union. ANDs with --tag. STEP accepts either a step name or a 0-based integer index. Step-level selection runs ONLY that step with no prior steps, so any captures produced by earlier steps will be unset — prefer test-level selectors for chained flows and reserve step-level selection for isolated smoke checks.
--test-filter NAME / --step-filter NAME-OR-INDEX apply a wildcard selector across every discovered file — shorter than --select when you want the same filter everywhere.
--ndjson streams one JSON event per line on stdout: file_started, step_finished (with phase: setup|test|teardown), test_finished, file_finished, and a final done event carrying the aggregated summary. Failing step_finished events include failure_category, error_code, and assertion_failures. In parallel mode each file's events are emitted atomically on file_finished to avoid interleaving across files. --ndjson composes with --format json=path to write a final report to a file while streaming events on stdout. It collides with any other structured format that would also write to stdout — pick one stdout consumer.
Writing Tests
Minimal Test
name: Health check
steps:
- name: GET /health
request:
method: GET
url: "{{ env.base_url }}/health"
assert:
status: 200
Full-Featured Test
name: User CRUD
env:
base_url: "http://localhost:3000"
defaults:
headers:
Content-Type: "application/json"
setup:
- name: Authenticate
request:
method: POST
url: "{{ env.base_url }}/auth/login"
body:
email: "admin@example.com"
password: "{{ env.admin_password }}"
capture:
token: "$.token"
assert:
status: 200
tests:
create-and-fetch:
description: Create a user then fetch it
steps:
- name: Create user
request:
method: POST
url: "{{ env.base_url }}/users"
headers:
Authorization: "Bearer {{ capture.token }}"
body:
name: "Jane Doe"
email: "jane-{{ $uuid }}@example.com"
capture:
user_id: "$.id"
assert:
status: 201
duration: "< 500ms"
body:
"$.name": "Jane Doe"
"$.id": { type: string, not_empty: true }
- name: Fetch user
request:
method: GET
url: "{{ env.base_url }}/users/{{ capture.user_id }}"
headers:
Authorization: "Bearer {{ capture.token }}"
assert:
status: 200
body:
"$.name": "Jane Doe"
teardown:
- name: Cleanup
request:
method: DELETE
url: "{{ env.base_url }}/users/{{ capture.user_id }}"
headers:
Authorization: "Bearer {{ capture.token }}"
assert:
status: { in: [200, 204, 404] }
Key Rules
- File extension must be
.tarn.yaml
name is required at top level
- Must have either
steps (flat) or tests (grouped) — not both
setup runs once before all tests; teardown runs even if tests fail
- Steps within a test are sequential and share captures
- Each test group is independent
- Captures preserve JSON types (numbers stay numbers, not strings)
- Automatic cookie jar is on by default; disable with
cookies: "off" or set cookies: "per-test" to clear the default jar between named tests within a file. Setup and teardown still share the file-level jar, and named cookie jars (multi-user scenarios) are untouched. The --cookie-jar-per-test CLI flag is equivalent at the command line and overrides the file setting except when the file sets cookies: "off" — that always wins.
Environment Variables
Six-layer resolution (highest priority wins):
--var key=value CLI flag
- Shell environment
$VAR
tarn.env.local.yaml (git-ignored secrets)
tarn.env.{name}.yaml (per-environment, selected via --env name)
tarn.env.yaml (shared defaults)
- Inline
env: block in the test file
Reference in YAML: {{ env.variable_name }}
Environment file format:
base_url: "http://localhost:3000"
api_version: "v1"
Captures
Capture values from responses for use in subsequent steps via {{ capture.name }}.
capture:
user_id: "$.id"
token: "$.auth.token"
session:
header: "set-cookie"
regex: "session=([^;]+)"
csrf_cookie:
cookie: "csrf_token"
final_url:
url: true
status_code:
status: true
body_text:
body: true
regex: "ID: (\\d+)"
Optional / conditional captures
Use when a field may or may not appear in the response — typical for
idempotent endpoints that return different shapes on 201 Created vs
409 Conflict, or list endpoints that might be empty.
capture:
middle_name:
jsonpath: "$.middle_name"
optional: true
count:
jsonpath: "$.total"
default: 0
label:
jsonpath: "$.label"
default: "unnamed"
created_id:
jsonpath: "$.id"
when:
status: 201
ok_id:
jsonpath: "$.id"
when:
status:
in: [200, 201]
Step-level if: / unless:
Skip an entire step when an interpolated expression is falsy / truthy.
Pairs well with optional: captures for idempotent flows.
steps:
- name: Lookup widget
request:
method: GET
url: "{{ env.base_url }}/widgets/{{ env.slug }}"
capture:
widget_id:
jsonpath: "$.id"
optional: true
- name: Create widget (only when missing)
unless: "{{ capture.widget_id }}"
request:
method: POST
url: "{{ env.base_url }}/widgets"
body: { slug: "{{ env.slug }}" }
- name: Audit widget
if: "{{ capture.widget_id }}"
request:
method: POST
url: "{{ env.base_url }}/widgets/{{ capture.widget_id }}/audit"
Truthy rules: empty / whitespace / "false" / "0" / "null" / unresolved {{ ... }} placeholders are falsy; every other value is truthy. Skipped steps report with failure_category: skipped_by_condition and passed: true so they never flip the exit code.
Assertions
See references/assertion-reference.md for the complete operator list.
Quick reference for the most common assertions:
assert:
status: 200
status: "2xx"
status: { in: [200, 201] }
duration: "< 500ms"
headers:
content-type: "application/json"
x-request-id: 'matches "^[a-f0-9-]+"'
body:
"$.name": "Jane"
"$.id": { type: string, not_empty: true }
"$.age": { gt: 0, lt: 200 }
"$.tags": { type: array, contains: "admin" }
"$.email": { matches: "^[^@]+@[^@]+$" }
Multiple operators on the same JSONPath combine with AND logic.
Built-in Functions
Use in any string field:
{{ $uuid }} — UUID v4 (alias for $uuid_v4)
{{ $uuid_v4 }} — random UUID v4
{{ $uuid_v7 }} — time-ordered UUID v7 (Unix-ms prefix)
{{ $timestamp }} — Unix epoch seconds
{{ $now_iso }} — ISO 8601 datetime
{{ $random_hex(8) }} — random hex string of length N
{{ $random_int(1, 100) }} — random integer in range
Faker (EN locale)
{{ $email }}, {{ $first_name }}, {{ $last_name }}, {{ $name }}, {{ $username }}, {{ $phone }}
{{ $word }}, {{ $words(3) }}, {{ $sentence }}, {{ $slug }}
{{ $alpha(8) }}, {{ $alnum(8) }}, {{ $choice(a, b, c) }}, {{ $bool }}
{{ $ipv4 }}, {{ $ipv6 }}
Reproducible runs
Set TARN_FAKER_SEED=<u64> or tarn.config.yaml: faker.seed: <u64> to freeze every RNG-backed built-in for the process. The environment variable wins. Wall-clock generators ($timestamp, $now_iso, and the timestamp prefix of $uuid_v7) stay real-time — seeding only pins randomness.
Formatting
Tarn ships tarn fmt, a canonical .tarn.yaml formatter.
tarn fmt [PATH] reformats whole files in place. Omit PATH to reformat every .tarn.yaml under the current working directory.
tarn fmt --check is the CI variant: exit 0 if every file is already formatted, 1 if any file would change. No files are written in check mode.
- The
tarn::format::format_document library surface is shared with tarn-lsp's textDocument/formatting request, so CLI and editor output are byte-identical.
- Range formatting is deliberately unsupported —
tarn fmt is whole-document only.
Structured Validation and Env Introspection
Two CLI subcommands return structured JSON for editors and CI.
tarn validate --format json emits {"files": [{"file", "valid", "errors": [{"message", "line", "column"}]}]}. YAML syntax errors carry precise line and column extracted from serde_yaml; semantic parser errors without a known location fall back to message-only (line and column are optional). Exit code is 0 when every file is valid, 2 otherwise. The default human format is unchanged.
tarn env --json dumps the resolved env chain with provenance. Inline vars declared in tarn.config.yaml environments are redacted against redaction.env (case-insensitive) so secrets never appear in stdout. The per-environment file field is source_file (matching the VS Code extension contract). Environments are sorted alphabetically. Exit code is 0 on success, 2 on configuration error.
JSON Output
tarn run --format json returns structured results. Key fields for diagnosis:
summary.status — "PASSED" or "FAILED"
files[].tests[].steps[].status — per-step result
files[].tests[].steps[].failure_category — why it failed
files[].tests[].steps[].assertions.failures[] — assertion details with expected, actual, message
files[].tests[].steps[].request — sent request (failed steps only)
files[].tests[].steps[].response — received response (failed steps only)
files[].tests[].steps[].remediation_hints — suggested fixes
See references/json-output.md for the full schema.
Failure Categories
| Category | Meaning | Typical Fix |
|---|
assertion_failed | Request succeeded, assertion mismatch | Fix assertion or application |
response_shape_mismatch | JSONPath missed because the response shape drifted | Replace the JSONPath with the suggested candidate and add a guard assertion |
connection_error | DNS/connect/TLS failure | Check URL, server status, network |
timeout | Request exceeded allowed time | Increase timeout or fix server |
parse_error | Invalid YAML, JSONPath, or config | Fix syntax |
capture_error | Capture extraction failed | Fix JSONPath or response shape |
unresolved_template | {{ ... }} couldn't resolve before send | Check env or upstream capture; optional captures produce a distinct message |
skipped_due_to_failed_capture | Downstream step skipped because a capture it needed failed earlier | Fix the root cause; skips are suppressed noise, not new failures |
skipped_due_to_fail_fast | Downstream step skipped after an earlier failure in the same test | Fix the first failing step |
skipped_by_condition | Step-level if: / unless: evaluated to skip | Intentional; passed: true, never flips exit code |
skipped_by_policy | Shell command: step skipped because the run was not granted --allow-exec | Intentional security default; pass --allow-exec (CLI) or set allow_exec: true in tarn.config.yaml once the file is trusted |
command_failed | Shell command: step exited non-zero, was killed, or its stdout_regex capture missed | Inspect the failure message for the exit code / signal / regex pattern; rerun the script manually to reproduce |
Diagnosis Loop (structured report, step level)
Use this when you already have a single failing step open (e.g. via tarn inspect last FILE::TEST::STEP --format json):
1. Read failure_category first — it tells you which branch below applies
2. If assertion_failed → read assertions.failures[].expected vs actual
3. If response_shape_mismatch → apply the suggested replacement JSONPath after checking response.body
4. If connection_error → check request.url for unresolved {{ }} templates
5. If timeout → check server state / increase step timeout
6. If capture_error → check previous step's response shape (see "Response-shape drift")
7. If unresolved_template → check env chain or upstream capture
8. If skipped_due_to_failed_capture/skipped_due_to_fail_fast → IGNORE; fix the root cause grouped by `tarn failures`
9. Fix YAML or application → tarn rerun --failed → tarn diff prev last
Response-shape drift (check this before blaming business logic)
Symptom. One step fails with a status-mismatch or capture error. Several downstream steps in the same test are marked skipped_due_to_failed_capture. tarn failures collapses them into one root-cause group plus └─ cascades: N skipped.
Why it usually happens. The endpoint's response shape changed. The old capture JSONPath no longer matches. Every downstream step that needed the captured value is now unreachable.
Worked example — the reopen-request incident. A create/reopen endpoint used to return:
{ "uuid": "e4f2…", "status": "pending" }
and the test captured user_id: "$.uuid". The endpoint now returns an envelope:
{ "request": { "uuid": "e4f2…" }, "stageStatus": "pending" }
$.uuid no longer matches. The step's own status/body assertions fail (or its capture fails); every downstream GET /requests/{{ capture.user_id }} step skips with skipped_due_to_failed_capture.
Recovery loop.
tarn failures --format json — read the one root-cause group. Cascade count tells you how many steps will unblock once the root cause is fixed.
tarn inspect last FILE::TEST::STEP --format json — open the failing step. Read response.body_excerpt (or the full response block) and identify the new shape.
- Find the field in the new response. For the example above:
$.request.uuid instead of $.uuid.
- Patch the capture path and add a body-type assertion as a guard rail so the next drift is caught at the failing step, not at the downstream cascade.
Before:
capture:
user_id: "$.uuid"
assert:
status: 201
After:
capture:
user_id: "$.request.uuid"
assert:
status: 201
body:
"$.request": { type: object }
"$.request.uuid": { type: string, not_empty: true }
tarn rerun --failed — replay only the failing (file, test) pair; don't rerun the whole suite.
tarn diff prev last --format json — confirm the root cause moved to fixed and that no new failures appeared. persistent items need another pass.
Mutation vs read-response patterns
Shape drift is disproportionately common on mutation endpoints. Use these defaults:
- Mutation endpoints (
POST/PUT/PATCH) often return an envelope distinct from the read shape — e.g. {"request": {...}, "meta": {...}} vs the read endpoint's {"uuid": "..."}. Assert the envelope explicitly with a type assertion and capture from the wrapped path ($.request.uuid, not $.uuid). Do NOT reuse a read-shape capture on a mutation response.
- Read endpoints (
GET /resource/:id) typically return the resource directly and drift less, but paginated/list endpoints wrap in {"items": [...], "page": N} and must never be flattened — capture from $.items[0].id, not $[0].id.
- When authoring a new capture, do a one-shot
GET /resource/:id (or, for mutations, a live POST with --dry-run: false and debug: true on the step) and copy the minimal JSONPath from the observed body. Never capture from memory.
Shell command: steps (NAZ-464)
A step can be either a request: (HTTP) or a command: (shell) — never both. command: lets you run fixture-prep / version-stamping helpers inline with your suite without a wrapper script.
setup:
- name: Bump fixture version
command:
run: "python3 bin/bump-version.py --version {{ $timestamp }}"
pass_env: [PATH, PYTHON]
workdir: "scripts/fixtures"
capture:
bumped_version:
stdout_regex: "version=([^\\s]+)"
result_code:
exit_code: true
Security model — command: is INERT by default
Without an explicit opt-in, every command: step is skipped with failure_category: skipped_by_policy. The runner does not spawn the child process at all. To authorize execution:
- CLI:
tarn run --allow-exec ... (preferred for ad-hoc runs and CI)
- Project config:
allow_exec: true in tarn.config.yaml (for trusted, project-owned repos)
- MCP: pass
"allow_exec": true in the tarn_run / tarn_run_agent / tarn_rerun_failed tool params
A freshly cloned repo never executes commands until the human opts in. This is intentional — a malicious .tarn.yaml cannot exfiltrate secrets just by being cloned and a default tarn run.
Env scrubbing (pass_env)
The child process gets a tiny baseline env: PATH, HOME (USERPROFILE on Windows), TMPDIR/TEMP/TMP. Anything else must be allowlisted via pass_env: [VAR1, VAR2]. Tarn's own {{ env.x }} and {{ capture.x }} chain is NEVER implicitly forwarded — secrets in tarn.env.local.yaml stay scoped to template interpolation. If you need a tarn env variable in the shell, add it to pass_env only if the parent process actually exports it; otherwise interpolate it into command.run directly (run: "FOO={{ env.foo }} python3 ...").
Captures from commands
Each entry under command.capture: must use exactly one of:
stdout_regex: "PATTERN" — capture group 1 (or full match if no group). A miss fails the step under failure_category: command_failed unless optional: true.
exit_code: true — capture the literal exit code as an integer.
assert: and poll: are only valid on request: steps; the runner rejects them on command steps at parse time.
When commands fail
failure_category: command_failed covers all of: non-zero exit, signal kill (Unix), and required-but-missed stdout regex. The error message identifies which case fired. skipped_by_policy is a benign skip (passed: true) — the run keeps green when nothing else failed.
Exit Codes
- 0 — all tests passed
- 1 — one or more tests failed
- 2 — configuration or parse error
- 3 — runtime error (network failure, timeout)
MCP Integration
Tarn ships with tarn-mcp, an MCP server for Claude Code, opencode, Cursor, and Windsurf.
See references/mcp-integration.md for setup and tool reference.
Parallel Execution and Isolation
--parallel -j N schedules whole files across N workers. Tests within a file always run sequentially. Two markers let you opt files in or out of the parallel pool:
name: Payments
serial_only: true
group: postgres
tests:
refund:
serial_only: true
steps: [...]
serial_only: true on a file (or any of its named tests) pins it onto the serial worker after every parallel bucket finishes.
group: "name" runs every file with the same group string on the same worker, serially — useful when tests share an external resource (DB, message broker, S3 bucket) but different resources can still parallelize across buckets.
- Running
--parallel without parallel_opt_in: true in tarn.config.yaml prints a one-line stderr warning. Set it once after you've verified your suite is safe to parallelize.
parallel_opt_in: true
parallel: false
Fixtures (.tarn/fixtures/** and .tarn/state.json)
Every run writes two kinds of artifacts under .tarn/ (gitignored by default):
.tarn/fixtures/<file-hash>/<test-slug>/<step-index>/ — per-step request/response/captures (redacted), retaining up to 5 rolling-history files plus a latest-passed.json copy. Feeds the LSP's inline JSONPath hover and the tarn.diffLastPassing debug command. Disable with --no-fixtures; override retention with --fixture-retention N.
.tarn/state.json — human-readable snapshot of the last run (pass/fail counts, failure list, env metadata, argv). Atomic rewrite. Tail this file from an LLM harness to answer "what just happened?" without re-running.
.tarn/last-run.json — always-on machine-readable report of the last run (includes args, env_name, working_directory, start_time, end_time so "rerun last failures" replays with the same env). Override path with --report-json PATH, disable with --no-last-run-json.
Editor Integration via tarn-lsp
Tarn ships tarn-lsp, an editor-agnostic LSP 3.17 stdio server delivered as its own binary in the same release pipeline as tarn and tarn-mcp (and bundled inside the Tarn Docker image). Any LSP client — Claude Code, VS Code, Neovim, Helix, Zed, IntelliJ — can register it against .tarn.yaml files to get the same static intelligence the VS Code extension provides.
Agent-relevant capabilities:
- Schema-aware completion — top-level and nested keys from the bundled Tarn testfile schema, env keys sorted by resolution priority, captures in scope for the current step, builtin snippets with parameter placeholders.
- Hover — provenance for
{{ env.* }} (full resolution chain), {{ capture.* }} (declaring step + JSONPath source), and {{ $builtin }} signatures. Hover over a JSONPath literal in assert.body.* also evaluates the expression against the step's last recorded response and appends the result inline.
- Diagnostics —
publishDiagnostics driven by tarn::validation::validate_document, with ranges matching the JSON report locations.
- Navigation — go-to-definition and references for env keys (workspace-scoped) and captures (per-test), plus rename with collision detection.
- Code lens —
Run test and Run step lenses on every test and step. Stable command IDs tarn.runTest / tarn.runStep, selector format FILE::TEST::STEP_INDEX. The server emits the lenses but does NOT execute them — the client (Claude Code, VS Code) is responsible for running the selector via tarn run --select ....
- Formatting —
textDocument/formatting shares the tarn::format::format_document library surface with the tarn fmt CLI, so edits are byte-identical to a CLI reformat.
- Code actions —
extract env var (lift a string literal into an inline env: key), capture this field (stub a capture: from a JSONPath literal under the cursor), scaffold assert from response (generate a pre-typed assert.body block from a recorded response).
- Quick fix —
CodeActionKind::QUICKFIX shares the tarn::fix_plan library with the MCP tool tarn_fix_plan. The MCP path is report-driven, the LSP path is diagnostic-driven, the remediation engine is the same.
workspace/executeCommand tarn.evaluateJsonpath — evaluates a JSONPath expression against either an inline response payload or a recorded step fixture. Returns a typed result: { result: "match", value }, { result: "no_match", available_top_keys }, or { result: "no_fixture", message }. Agents iterate on a wrong JSONPath path without ever rerunning the test.
tarn.explainFailure — given { file, test?, step? }, returns a structured blob: expected vs actual, the last passing step's captures, and a root-cause hint (5xx → server error, 401/403 → auth, JSONPath no-match → response shape changed, unresolved capture → upstream failed). Paste the blob into a reasoning loop.
- Run-from-gutter commands —
tarn.runFile, tarn.runTest, tarn.runStep, tarn.runLastFailures. Stream tarn/progress notifications, return the final RunOutcome.
- Step-through debugger —
tarn.debugTest starts a session that pauses between steps and publishes tarn/captureState notifications carrying the current captures map and last response. Drive it with tarn.debugContinue, tarn.debugStepOver, tarn.debugRerunStep, tarn.debugRestart, tarn.debugStop. tarn.getCaptureState polls the current snapshot.
- Response diff vs last passing —
tarn.diffLastPassing returns a structured status / headers / body diff against the most recent passing fixture for a step, or { error: "no_baseline" } when no passing run exists.
- Fixture store access —
tarn.getFixture reads the latest fixture for a step (or { error: "no-fixture" }); tarn.clearFixtures removes them.
All commands return { schema_version: 1, data: ... } envelopes so the contract can evolve without breaking clients. See docs/commands.json for the full manifest and parameter shapes.
Recorded fixtures live under .tarn/fixtures/<file-hash>/<test-slug>/<step-index>/ — the runner writes them automatically on every run. The LSP reads this directory to drive hover / JSONPath evaluation / diff. No more manual sidecar convention.
See docs/TARN_LSP.md for the full spec and editors/claude-code/tarn-lsp-plugin/README.md for the Claude Code plugin install flow.
Advanced Features
Includes
Reuse shared steps across files:
setup:
- include: ./shared/auth-setup.tarn.yaml
with:
role: admin
Polling
Wait for async operations:
- name: Wait for processing
request:
method: GET
url: "{{ env.base_url }}/jobs/{{ capture.job_id }}"
poll:
until:
body:
"$.status": "completed"
interval: "2s"
max_attempts: 10
Named Cookie Jars
Multi-user scenarios:
- name: Admin login
cookies: "admin"
request:
method: POST
url: "{{ env.base_url }}/login"
body: { email: "admin@test.com", password: "pass" }
- name: User login
cookies: "user"
request:
method: POST
url: "{{ env.base_url }}/login"
body: { email: "user@test.com", password: "pass" }
Lua Scripts
Escape hatch for complex validation:
- name: Custom check
request:
method: GET
url: "{{ env.base_url }}/data"
script: |
local body = json.decode(response.body)
assert(#body.items > 0, "expected items")
for _, item in ipairs(body.items) do
assert(item.price > 0, "price must be positive: " .. item.name)
end
GraphQL
- name: Query users
request:
method: POST
url: "{{ env.base_url }}/graphql"
graphql:
query: |
query GetUser($id: ID!) {
user(id: $id) { name email }
}
variables:
id: "{{ capture.user_id }}"
assert:
status: 200
body:
"$.data.user.name": { not_empty: true }
Multipart Upload
- name: Upload file
request:
method: POST
url: "{{ env.base_url }}/upload"
multipart:
fields:
- name: "title"
value: "My Photo"
files:
- name: "file"
path: "./fixtures/test.jpg"
content_type: "image/jpeg"
Auth Helpers
defaults:
auth:
bearer: "{{ capture.token }}"
basic:
username: "{{ env.api_user }}"
password: "{{ env.api_pass }}"
Redirect Assertions
assert:
redirect:
url: "https://api.example.com/final"
count: 2
Troubleshooting
Always start at tarn failures — it groups by root cause, collapses cascade skips, and points at a single failing step. Only after that does it make sense to inspect a specific step.
"Unresolved template" in URL — The {{ env.x }} or {{ capture.x }} was not resolved. Check that the env var exists in the resolution chain or that the previous capture step passed. If the cascade count is non-zero, fix the upstream capture first, then tarn rerun --failed.
"Connection refused" — The target server is not running. Start it before running tests.
"Capture extraction failed" — The JSONPath does not match the actual response body. See "Response-shape drift" above. Open the failing step with tarn inspect last FILE::TEST::STEP --format json, read response.body_excerpt, and fix the JSONPath against the observed body, not from memory.
"Parse error" — Invalid YAML syntax. Run tarn validate to see the exact error location.
Tests pass individually but fail together — Steps share captures within a test group. Check for capture name collisions or ordering issues.
A lot of downstream steps are skipped — That's skipped_due_to_failed_capture fanning out from one upstream failure. Do NOT open each skipped step. Run tarn failures and fix the single root-cause group at the top.
File Organization
Recommended project structure:
tests/
health.tarn.yaml # simple smoke tests
users.tarn.yaml # user CRUD flows
auth.tarn.yaml # authentication tests
shared/
auth-setup.tarn.yaml # reusable auth steps
tarn.env.yaml # shared env defaults
tarn.env.local.yaml # local secrets (git-ignored)
tarn.env.staging.yaml # staging environment
tarn.config.yaml # optional project config