run-tck
Help an SDK implementor run the A2A TCK against their System Under Test (SUT). Use when the user wants to validate their A2A agent implementation, debug TCK failures, or understand conformance results.
| name | run-tck |
| description | Help an SDK implementor run the A2A TCK against their System Under Test (SUT). Use when the user wants to validate their A2A agent implementation, debug TCK failures, or understand conformance results. |
| compatibility | Requires Python 3.11+ and uv |
| allowed-tools | Bash Read Glob Grep Agent |
Follow these steps to help an SDK implementor run the A2A TCK against their SUT.
Check that the TCK project is set up:
uv run python --version # Python 3.11+ required
If dependencies are not installed, run:
uv pip install -e .
Ask the user for their SUT URL (e.g., http://localhost:9999).
Verify the SUT is running by fetching its agent card:
curl -s <sut-host>/.well-known/agent-card.json | uv run python -m json.tool
The agent card must be served at <sut-host>/.well-known/agent-card.json, per Section 8.2 of the A2A spec.
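For scripted checks, the same lookup can be sketched in Python with only the standard library. The helper names `agent_card_url` and `fetch_agent_card` and the 5-second timeout are illustrative assumptions, not part of the TCK.

```python
import json
from urllib.request import urlopen

# Well-known path for the agent card, as stated in the text above.
WELL_KNOWN_PATH = "/.well-known/agent-card.json"


def agent_card_url(sut_host: str) -> str:
    """Build the agent card URL, tolerating a trailing slash on the host."""
    return sut_host.rstrip("/") + WELL_KNOWN_PATH


def fetch_agent_card(sut_host: str, timeout: float = 5.0) -> dict:
    """Fetch and parse the agent card (hypothetical helper).

    Raises urllib.error.URLError if the SUT is unreachable and
    json.JSONDecodeError if the response body is not valid JSON.
    """
    with urlopen(agent_card_url(sut_host), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_agent_card("http://localhost:9999")` against a running SUT should return the parsed card as a dictionary; a raised exception is a signal that the SUT is down or the card is not valid JSON.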
The agent card must include a supportedInterfaces array. Each entry needs:
- protocolBinding: one of "JSONRPC", "GRPC", or "HTTP+JSON"
- url: the endpoint URL for that transport

Example:
{
"name": "My Agent",
"supportedInterfaces": [
{ "protocolBinding": "JSONRPC", "url": "http://localhost:9999/jsonrpc" },
{ "protocolBinding": "HTTP+JSON", "url": "http://localhost:9999/a2a" }
]
}
If the agent card is missing or malformed, help the user fix it before proceeding.
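A quick structural check of the card can be sketched as below. It only covers the two fields described above; the function name `check_agent_card` and the wording of the messages are illustrative assumptions, and the TCK's own validation may be stricter.

```python
from typing import Any

# Transport names listed in the text above.
KNOWN_BINDINGS = {"JSONRPC", "GRPC", "HTTP+JSON"}


def check_agent_card(card: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means no problems were detected."""
    interfaces = card.get("supportedInterfaces")
    if not isinstance(interfaces, list) or not interfaces:
        return ["supportedInterfaces is missing, empty, or not an array"]

    problems: list[str] = []
    for index, entry in enumerate(interfaces):
        if not isinstance(entry, dict):
            problems.append(f"supportedInterfaces[{index}] is not an object")
            continue
        binding = entry.get("protocolBinding")
        if not isinstance(binding, str) or binding not in KNOWN_BINDINGS:
            problems.append(
                f"supportedInterfaces[{index}].protocolBinding is {binding!r}, "
                f"expected one of {sorted(KNOWN_BINDINGS)}"
            )
        url = entry.get("url")
        if not isinstance(url, str) or not url:
            problems.append(f"supportedInterfaces[{index}].url is missing or empty")
    return problems
```

Passing the example card shown above yields an empty list, while a card without supportedInterfaces yields a single message describing the missing array.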
Important: Always use uv run to invoke the test runner so that the
project's virtual environment and dependencies are available.
Running ./run_tck.py directly may fail if the system Python lacks pytest or
other dependencies.
Reports are always generated in reports/ after every run. Point the user to
reports/compatibility.html and reports/tck_report.html after the run completes.
Important: Always run TCK commands from the TCK project root directory
(a2a-tck/), not from a subdirectory like sut/a2a-python/.
Start with MUST-level tests to catch blocking issues first:
uv run ./run_tck.py --sut-host <sut-host> --level must -v
Optionally filter by transport (grpc, jsonrpc, http_json):
uv run ./run_tck.py --sut-host <sut-host> --transport jsonrpc --level must -v
Once MUST tests pass, run the full suite:
uv run ./run_tck.py --sut-host <sut-host> -v
After a run completes, read reports/compatibility.json to get a structured
overview of all failures. The per_requirement object lists every requirement
with its status, transports, errors, and test IDs.
To show the user all failures at a glance, extract requirements where
status is "FAIL" and present them in a table.
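The extraction can be sketched in Python as follows. This assumes that per_requirement is a mapping from requirement ID to an object with "status" and "errors" keys; the exact schema is not shown here, so treat the key layout and the helper names `load_report` and `failing_requirements` as assumptions to verify against a real report.

```python
import json
from pathlib import Path
from typing import Any


def load_report(path: str = "reports/compatibility.json") -> dict[str, Any]:
    """Read the machine-readable report written by a TCK run."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def failing_requirements(report: dict[str, Any]) -> list[tuple[str, list]]:
    """Return (requirement_id, errors) pairs for requirements whose status is FAIL.

    Assumes per_requirement maps requirement IDs to objects (an assumption).
    """
    per_requirement = report.get("per_requirement", {})
    rows: list[tuple[str, list]] = []
    for requirement_id, entry in sorted(per_requirement.items()):
        if isinstance(entry, dict) and entry.get("status") == "FAIL":
            # The errors field may be empty or absent for failing requirements.
            rows.append((requirement_id, entry.get("errors") or []))
    return rows
```

Calling `failing_requirements(load_report())` from the TCK project root after a run gives the rows for the table, sorted by requirement ID.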
For detailed diagnosis and GitHub issue drafting, use the diagnose-failure skill. It will gather the requirement context, spec text, failure details, build a curl reproducer, and draft a ready-to-file issue.
For a quick triage without a full diagnosis, you can also:
Use pytest's -k flag to isolate a specific test:
uv run ./run_tck.py --sut-host <sut-host> -- -k "test_name" -v
Or use verbose log output for maximum detail:
uv run ./run_tck.py --sut-host <sut-host> --verbose-log -- -k "test_name"
Some tests (e.g., gRPC streaming subscribe) may hang indefinitely due to SUT
bugs. Use pytest's --deselect flag to skip them:
uv run ./run_tck.py --sut-host <sut-host> -v -- --deselect "tests/path/to/stuck_test"
Multiple --deselect flags can be combined. This lets the rest of the suite
run to completion while the stuck test is investigated separately.
- /.well-known/agent-card.json
- protocolBinding values don't match any known transport, or the --transport
  filter excludes all declared transports
- compatibility.json: The errors field is often empty for failing
  requirements. Re-run the specific test with --verbose-log and
  -- --tb=long to get the actual error message and stack trace.
- Do not pipe output to tail or other commands when running in the
  background: the pipe buffers all output and produces nothing until the
  process finishes. Run without pipes instead.

| Level | Meaning | Test behavior |
|---|---|---|
| MUST | Absolute requirement | Hard failure — blocks compatibility |
| SHOULD | Expected unless valid reason to differ | Expected failure (xfail) — does not block |
| MAY | Truly optional | Skipped if agent doesn't declare the capability |
Every TCK run generates:
- reports/compatibility.json: machine-readable results with per-requirement and per-transport breakdowns
- reports/compatibility.html: self-contained HTML report with executive summary
- reports/tck_report.html: standard pytest-html report
- reports/junitreport.xml: JUnit XML for CI integration

Read reports/compatibility.json to get a structured view of which requirements passed/failed per transport.
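A small check that a run actually produced all four files can be sketched as below. The file names come from the list above; the helper name `missing_reports` is an illustrative assumption.

```python
from pathlib import Path

# Report files that every TCK run is expected to write, per the list above.
EXPECTED_REPORTS = (
    "compatibility.json",
    "compatibility.html",
    "tck_report.html",
    "junitreport.xml",
)


def missing_reports(reports_dir: str = "reports") -> list[str]:
    """Return the expected report file names that are absent from reports_dir."""
    base = Path(reports_dir)
    return [name for name in EXPECTED_REPORTS if not (base / name).is_file()]
```

An empty result means all reports are present; any names returned point to a run that stopped early or was started from the wrong directory.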
Guide the user through a fix-and-retest cycle: