a2a-python-sut
// Work with the a2a-python SUT (System Under Test). Use when the user wants to regenerate the Python SUT from Gherkin scenarios, build it, run it, or test it with the TCK.
| name | a2a-python-sut |
| description | Work with the a2a-python SUT (System Under Test). Use when the user wants to regenerate the Python SUT from Gherkin scenarios, build it, run it, or test it with the TCK. |
| compatibility | Requires Python 3.11+ and uv |
| allowed-tools | Bash(make:*) Bash(uv:*) Bash(python:*) Bash(python3:*) Bash(curl:*) Bash(kill:*) Bash(lsof:*) Bash(pkill:*) Read Edit Write Glob Grep Agent |
The a2a-python SUT is a Python application generated from Gherkin .feature files in scenarios/. It implements the A2A protocol using the a2a-python SDK (a2a-sdk) and serves as a conformance target for TCK tests.
scenarios/*.feature → codegen (parser + steps + python_emitter) → sut/a2a-python/
1. Gherkin scenarios (scenarios/*.feature) define SUT behavior via messageId prefix matching
2. The code generator (codegen/) parses .feature files and emits a Python project
3. Jinja2 templates (codegen/a2a-python/*.j2) produce sut_agent.py and pyproject.toml
4. The generated output (sut/a2a-python/) is a complete runnable Python project

| File | Template | Purpose |
|---|---|---|
| sut_agent.py | sut_agent.py.j2 | Main entry point; TckAgentExecutor with messageId-prefix routing, agent card, and server setup for all three transports |
| pyproject.toml | pyproject.toml.j2 | Project config with a2a-sdk dependency (installed from local path) |
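The scenario-to-code step can be pictured with a small emitter. This is only a sketch under stated assumptions: the `Scenario` record and `emit_router` function are hypothetical, and plain string building stands in for the Jinja2 templates the real codegen uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical record for one parsed scenario (not the real codegen model)."""
    prefix: str  # messageId prefix, e.g. "tck-complete-task"
    reply: str   # text the generated handler answers with


def emit_router(scenarios: list) -> str:
    """Return Python source for a route(message_id) function."""
    lines = ["def route(message_id):"]
    # Emit longer prefixes first so a specific prefix is never shadowed
    # by a shorter one that it starts with.
    for s in sorted(scenarios, key=lambda s: len(s.prefix), reverse=True):
        lines.append(f"    if message_id.startswith({s.prefix!r}):")
        lines.append(f"        return {s.reply!r}")
    lines.append("    return None")
    return "\n".join(lines) + "\n"


source = emit_router([Scenario("tck-complete-task", "Hello from TCK")])

# Load the generated source to show that it is valid, runnable Python.
namespace: dict = {}
exec(source, namespace)
route = namespace["route"]
```

Sorting by prefix length is a design choice of this sketch; whether the real emitter orders branches this way is not stated in the source.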
The generated TckAgentExecutor matches on the messageId prefix from incoming messages:
if message_id.startswith('tck-complete-task'):
    await updater.complete(updater.new_agent_message([Part(text="Hello from TCK")]))
    return
The TCK tests use tck_id("complete-task") which generates tck-complete-task-<session_hex>, matching the prefix.
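The contract between the test-side ID and the executor-side prefix can be exercised without the SDK. The sketch below is illustrative only: `tck_id` here is a hypothetical stand-in that follows the tck-<name>-<session_hex> shape described above, and `RecordingUpdater` replaces the SDK's updater with a class that merely records calls.

```python
import asyncio
import uuid
from typing import Optional


def tck_id(name: str, session_hex: Optional[str] = None) -> str:
    """Hypothetical stand-in for the TCK helper: tck-<name>-<session_hex>."""
    return f"tck-{name}-{session_hex or uuid.uuid4().hex}"


class RecordingUpdater:
    """Stand-in for the SDK's task updater; it only records completions."""

    def __init__(self) -> None:
        self.completed: list = []

    async def complete(self, text: str) -> None:
        self.completed.append(text)


async def execute(message_id: str, updater: RecordingUpdater) -> bool:
    """Return True when a scenario branch handled the message."""
    if message_id.startswith("tck-complete-task"):
        await updater.complete("Hello from TCK")
        return True
    return False


updater = RecordingUpdater()
handled = asyncio.run(execute(tck_id("complete-task", "abc123"), updater))
```

A message whose ID does not start with a known prefix falls through and returns False in this sketch; what the generated executor does in that case is not described in the source.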
The SUT runs all three transports in a single process:
- JSON-RPC: A2AStarletteApplication (Starlette)
- HTTP+JSON (REST): A2ARESTFastAPIApplication (FastAPI mounted on the Starlette app)
- gRPC: GrpcHandler (grpc.aio server on a separate port)

The a2a-python SDK version is controlled by the A2A_PYTHON_SDK_VERSION environment variable.
The default value is defined in codegen/python_emitter.py (_DEFAULT_A2A_PYTHON_SDK_VERSION).
The SDK is installed from a local path controlled by A2A_PYTHON_SDK_PATH (default: ../../../a2a-python, i.e., ~/Developer/a2aproject/a2a-python/).
Read the actual default version from codegen/python_emitter.py (grep for _DEFAULT_A2A_PYTHON_SDK_VERSION) and check the env var. Also check the local SDK's git tag to report its actual version:
cd ~/Developer/a2aproject/a2a-python && git describe --tags 2>/dev/null || git tag --sort=-creatordate | head -1
Report to the user which version will be used and propose setting the env var if they want a different version:
# Use default version
make codegen-a2a-python-sut
# Use a specific version
A2A_PYTHON_SDK_VERSION=1.0.0 make codegen-a2a-python-sut
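The env-var-with-default lookup can be sketched as follows. Everything except the two variable names and the default path is an assumption: `resolve_sdk_settings` is a hypothetical helper, the placeholder version is not the real default (that lives in codegen/python_emitter.py), and treating an empty variable as unset is a choice made for this example.

```python
import os
from typing import Mapping

# Placeholder only: the real default is _DEFAULT_A2A_PYTHON_SDK_VERSION
# in codegen/python_emitter.py and should be read from there.
_PLACEHOLDER_DEFAULT_VERSION = "0.0.0"
_DEFAULT_SDK_PATH = "../../../a2a-python"


def resolve_sdk_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Pick the SDK version and path, falling back to defaults when unset or empty."""
    version = env.get("A2A_PYTHON_SDK_VERSION") or _PLACEHOLDER_DEFAULT_VERSION
    path = env.get("A2A_PYTHON_SDK_PATH") or _DEFAULT_SDK_PATH
    return {"version": version, "path": path}


# Equivalent to: A2A_PYTHON_SDK_VERSION=1.0.0 make codegen-a2a-python-sut
pinned = resolve_sdk_settings({"A2A_PYTHON_SDK_VERSION": "1.0.0"})
```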
When Gherkin scenarios change, regenerate the Python project:
make codegen-a2a-python-sut
This runs uv run python -m codegen.generator --target a2a-python --output sut/a2a-python which:
- reads the scenarios/*.feature files
- resolves their steps through codegen/steps.py
- renders the templates in codegen/a2a-python/

To add new behaviors:
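1. Add the scenario to a .feature file in scenarios/
2. If the scenario needs a step that is not yet defined in codegen/steps.py, add a new entry
3. If that entry needs something the model does not yet represent, add it to codegen/model.py and handle it in codegen/python_emitter.py
4. Regenerate with make codegen-a2a-python-sut

Templates are in codegen/a2a-python/: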
- sut_agent.py.j2: handler routing, agent card, and server setup
- pyproject.toml.j2: Python project dependencies

After modifying templates, regenerate with make codegen-a2a-python-sut.
cd sut/a2a-python && uv sync
This installs the a2a-sdk from the local path and all other dependencies into a .venv in sut/a2a-python/.
Make sure ~/Developer/a2aproject/a2a-python/ exists and is the correct checkout. The pyproject.toml references it via [tool.uv.sources].

Start the SUT in the foreground:
cd sut/a2a-python && uv run python sut_agent.py
To run it in the background instead, use the command below. Note: when using Claude Code's run_in_background,
cd does not work; run uv run python sut_agent.py from the TCK project root
instead (which uses the project-level venv, not the SUT's).
cd sut/a2a-python && uv run python sut_agent.py &
The SUT listens on:
- JSON-RPC: http://localhost:9999 (POST)
- HTTP+JSON (REST): http://localhost:9999/a2a/rest (REST routes)
- gRPC: localhost:10000 (separate port, default is HTTP port + 1)

curl -s http://localhost:9999/.well-known/agent-card.json | python3 -m json.tool
The agent card should list all three supportedInterfaces (JSONRPC, HTTP+JSON, GRPC).
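The same check can be automated on the parsed card. This is a sketch under an assumption that should be verified against the actual card: each entry in supportedInterfaces is taken to carry its transport name in a `protocolBinding` field. `missing_bindings` is a hypothetical helper, and the sample card is made up for illustration.

```python
import json

EXPECTED_BINDINGS = {"JSONRPC", "HTTP+JSON", "GRPC"}


def missing_bindings(card: dict) -> set:
    """Return the expected transports that the card does not advertise."""
    interfaces = card.get("supportedInterfaces", [])
    # Assumption: the transport name is stored under "protocolBinding".
    found = {entry.get("protocolBinding") for entry in interfaces}
    return EXPECTED_BINDINGS - found


# Illustrative card, shaped like the output of the curl command above.
sample_card = json.loads("""
{
  "supportedInterfaces": [
    {"url": "http://localhost:9999", "protocolBinding": "JSONRPC"},
    {"url": "http://localhost:9999/a2a/rest", "protocolBinding": "HTTP+JSON"},
    {"url": "localhost:10000", "protocolBinding": "GRPC"}
  ]
}
""")

missing = missing_bindings(sample_card)
```

An empty result means all three transports are listed; anything else names the ones to investigate.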
When sending manual requests to the SUT, use tck/requirements/base.py as the source of truth for method names, and the sample_input fields in requirement specs (e.g., tck/requirements/core_operations.py) for the request payload format.
The A2A protocol uses protobuf enum naming conventions:
- Roles: ROLE_USER, ROLE_AGENT (not user/USER)
- Task states: TASK_STATE_COMPLETED, TASK_STATE_WORKING, etc.
- Parts: {"text": "..."} (not {"kind": "text", "text": "..."}; the protobuf oneof uses the field name to indicate part type)

SendMessage (JSON-RPC):
curl -s -X POST http://localhost:9999/ \
-H 'Content-Type: application/json' \
-d '{"jsonrpc":"2.0","id":"1","method":"SendMessage","params":{"message":{"messageId":"tck-complete-task-1234","role":"ROLE_USER","parts":[{"text":"Hello from TCK"}]}}}'
ListTasks (JSON-RPC):
curl -s -X POST http://localhost:9999/ \
-H 'Content-Type: application/json' \
-d '{"jsonrpc":"2.0","id":"1","method":"ListTasks","params":{"contextId":"<contextId>"}}'
GetTask (JSON-RPC):
curl -s -X POST http://localhost:9999/ \
-H 'Content-Type: application/json' \
-d '{"jsonrpc":"2.0","id":"1","method":"GetTask","params":{"id":"<taskId>"}}'
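The three curl payloads share one JSON-RPC envelope, which a small builder makes explicit. The helper names below (`jsonrpc_request`, `send_message_params`, `post`) are hypothetical; the method names and field values are copied from the commands above, and `post` only works against a running SUT.

```python
import json
import urllib.request


def jsonrpc_request(method: str, params: dict, request_id: str = "1") -> dict:
    """Wrap params in the JSON-RPC 2.0 envelope used by the curl examples."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def send_message_params(message_id: str, text: str) -> dict:
    """Build SendMessage params with protobuf-style role and part naming."""
    return {
        "message": {
            "messageId": message_id,
            "role": "ROLE_USER",
            "parts": [{"text": text}],
        }
    }


def post(url: str, body: dict) -> dict:
    """POST the body as JSON and decode the JSON reply (requires a running SUT)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = jsonrpc_request(
    "SendMessage",
    send_message_params("tck-complete-task-1234", "Hello from TCK"),
)
# With the SUT running: reply = post("http://localhost:9999/", body)
```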
If port 9999 is already in use:
lsof -ti:9999 | xargs kill -9
If gRPC port 10000 is already in use:
lsof -ti:10000 | xargs kill -9
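Before killing anything, it can help to confirm that a port is actually occupied. The following is a minimal sketch using only the standard library; `port_in_use` is a hypothetical helper, and it only detects listeners reachable on the given host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True when something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0


# Demonstrate against a throwaway listener on an OS-assigned port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
occupied = port_in_use(demo_port)
listener.close()
```

For the SUT, the ports to probe would be 9999 and 10000.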
Use the run-tck skill to run the TCK against this SUT with the host being http://localhost:9999.
Read reports/compatibility.json for structured results. For detailed diagnosis, use the diagnose-failure skill.
When tests fail, check the following:

- Agent card problems: check /.well-known/agent-card.json
- messageId prefix mismatch: compare tck_id() usage in tests vs prefix in .feature file
- SDK-level behavior: some requirements (e.g., A2A-Version header validation) are in the SDK itself, not the SUT; check if the SDK's server handlers implement the required behavior

The SUT only controls behavior inside TckAgentExecutor.execute() and cancel(). Protocol-level behavior (header validation, error code mapping, transport framing) is handled by the a2a-python SDK. If a failure is in protocol handling rather than business logic, it's an SDK issue; file it against a2aproject/a2a-python.
The typical development cycle is:
1. Edit scenarios/*.feature
2. Regenerate: make codegen-a2a-python-sut
3. Install dependencies: cd sut/a2a-python && uv sync
4. Start the SUT: cd sut/a2a-python && uv run python sut_agent.py
5. Run the TCK: uv run ./run_tck.py --sut-host http://localhost:9999 -- -k "test_name" -v

The codegen has its own unit tests:
make unit-test
This runs tests in tests/unit/codegen/ covering the parser, step resolution, Python emitter, and generator CLI.