bug-hunt
// Hunt for instrumentation bugs by analyzing code gaps and running e2e tests through DISABLED/RECORD/REPLAY cycle
| name | bug-hunt |
| description | Hunt for instrumentation bugs by analyzing code gaps and running e2e tests through DISABLED/RECORD/REPLAY cycle |
| disable-model-invocation | true |
$ARGUMENTS - The library name, optionally followed by focus context.
Format: <library> [focus on <area>]
Examples:
- /bug-hunt redis (broad bug hunting across all redis functionality)
- /bug-hunt redis focus on pub sub interactions (prioritize pub/sub patterns)
- /bug-hunt psycopg2 focus on async cursors and connection pooling (prioritize those areas)
Parsing: The first word of $ARGUMENTS is always the library name. Everything after it is the optional focus context. All references to <library> below mean this parsed first word, NOT the raw $ARGUMENTS string.
Use this mapping to clone the package source code for analysis:
| Library | GitHub Repo | Notes |
|---|---|---|
| aiohttp | https://github.com/aio-libs/aiohttp | |
| django | https://github.com/django/django | |
| fastapi | https://github.com/tiangolo/fastapi | |
| flask | https://github.com/pallets/flask | |
| grpc | https://github.com/grpc/grpc | Focus on src/python/grpcio/ |
| httpx | https://github.com/encode/httpx | |
| psycopg | https://github.com/psycopg/psycopg | Monorepo — focus on psycopg/ |
| psycopg2 | https://github.com/psycopg/psycopg2 | |
| redis | https://github.com/redis/redis-py | |
| requests | https://github.com/psf/requests | |
| sqlalchemy | https://github.com/sqlalchemy/sqlalchemy | |
| urllib | N/A | Built-in Python stdlib — no repo to clone |
| urllib3 | https://github.com/urllib3/urllib3 | |
Each library has a single e2e-tests directory (no ESM/CJS variants):
| Library | E2E test path |
|---|---|
| aiohttp | drift/instrumentation/aiohttp/e2e-tests/ |
| django | drift/instrumentation/django/e2e-tests/ |
| fastapi | drift/instrumentation/fastapi/e2e-tests/ |
| flask | drift/instrumentation/flask/e2e-tests/ |
| grpc | drift/instrumentation/grpc/e2e-tests/ |
| httpx | drift/instrumentation/httpx/e2e-tests/ |
| psycopg | drift/instrumentation/psycopg/e2e-tests/ |
| psycopg2 | drift/instrumentation/psycopg2/e2e-tests/ |
| redis | drift/instrumentation/redis/e2e-tests/ |
| requests | drift/instrumentation/requests/e2e-tests/ |
| sqlalchemy | drift/instrumentation/sqlalchemy/e2e-tests/ |
| urllib | drift/instrumentation/urllib/e2e-tests/ |
| urllib3 | drift/instrumentation/urllib3/e2e-tests/ |
Extract the library name (first word) and optional focus context (remaining words) from the arguments.
The library must be one of: aiohttp, django, fastapi, flask, grpc, httpx, psycopg, psycopg2, redis, requests, sqlalchemy, urllib, urllib3.
If the library is invalid, list the valid options and stop.
If focus context is provided, it will guide Phases 1 and 2 to prioritize that area of the library's functionality.
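The parsing and validation rules above can be sketched in a few lines of Python. This is only an illustration: the function name `parse_arguments` and the choice to raise `ValueError` for an invalid library are assumptions, not part of the command itself.

```python
from typing import Optional, Tuple

# Valid library names, copied from the list above.
VALID_LIBRARIES = (
    "aiohttp", "django", "fastapi", "flask", "grpc", "httpx", "psycopg",
    "psycopg2", "redis", "requests", "sqlalchemy", "urllib", "urllib3",
)


def parse_arguments(raw: str) -> Tuple[str, Optional[str]]:
    """Split the raw argument string into (library, focus context).

    Hypothetical helper: the first word is the library name and
    everything after it is kept verbatim as the optional focus context.
    """
    parts = raw.strip().split(maxsplit=1)
    if not parts or parts[0] not in VALID_LIBRARIES:
        # Invalid or missing library: list the valid options and stop.
        raise ValueError("library must be one of: " + ", ".join(VALID_LIBRARIES))
    library = parts[0]
    focus = parts[1].strip() if len(parts) > 1 else None
    return library, focus or None
```

For example, `parse_arguments("redis focus on pub sub interactions")` yields the library `redis` and the focus string `focus on pub sub interactions`.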
Check if Docker is running. If not, start it:
dockerd --storage-driver=vfs &>/tmp/dockerd.log &
# Wait for Docker to be ready
for i in $(seq 1 30); do
  docker info &>/dev/null && break
  sleep 1
done
docker info &>/dev/null || { echo "Docker failed to start. Check /tmp/dockerd.log"; exit 1; }
If Docker is already running, skip this step.
Build the base image. This is required before running any e2e tests:
cd <repo-root>
docker build -t python-e2e-base:latest -f drift/instrumentation/e2e_common/Dockerfile.base .
If the library has a GitHub repo (see mapping above), clone it for reference:
git clone --depth 1 <repo-url> /tmp/<library-name>-source
This is read-only reference material — you will NOT modify this repo.
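As a rough sketch of how the repo mapping drives this step, the helper below builds the clone command for a library and returns `None` for urllib, which has no repo to clone. The names `REPOS`, `SOURCE_SUBDIRS`, and `clone_command` are hypothetical; the URLs and subdirectories are taken from the mapping table.

```python
from typing import Dict, List, Optional

# Library -> GitHub repo, from the mapping table (None means nothing to clone).
REPOS: Dict[str, Optional[str]] = {
    "aiohttp": "https://github.com/aio-libs/aiohttp",
    "django": "https://github.com/django/django",
    "fastapi": "https://github.com/tiangolo/fastapi",
    "flask": "https://github.com/pallets/flask",
    "grpc": "https://github.com/grpc/grpc",
    "httpx": "https://github.com/encode/httpx",
    "psycopg": "https://github.com/psycopg/psycopg",
    "psycopg2": "https://github.com/psycopg/psycopg2",
    "redis": "https://github.com/redis/redis-py",
    "requests": "https://github.com/psf/requests",
    "sqlalchemy": "https://github.com/sqlalchemy/sqlalchemy",
    "urllib": None,  # built-in Python stdlib
    "urllib3": "https://github.com/urllib3/urllib3",
}

# Subdirectories to focus on, from the Notes column of the table.
SOURCE_SUBDIRS: Dict[str, str] = {
    "grpc": "src/python/grpcio/",
    "psycopg": "psycopg/",
}


def clone_command(library: str) -> Optional[List[str]]:
    """Return the shallow-clone command for a library, or None if it has no repo."""
    repo = REPOS[library]
    if repo is None:
        return None
    return ["git", "clone", "--depth", "1", repo, f"/tmp/{library}-source"]
```

The returned list could be passed to `subprocess.run(..., check=True)`; when it is `None`, the clone step is simply skipped.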
Create a dedicated branch for the bug hunt. Skip this step if you are already on a dedicated branch (e.g., in Claude Code Web, where each session has its own branch).
git checkout -b bug-hunt/<library>-$(date +%Y-%m-%d)
If focus context was provided, prioritize your analysis around that area. For example, if the focus is "pub sub interactions", concentrate on pub/sub-related code paths in the instrumentation, tests, and package source.
Read the instrumentation code at:
drift/instrumentation/<library>/instrumentation.py
Also check for any additional files in the same directory:
drift/instrumentation/<library>/
Identify:
Review the test files:
- drift/instrumentation/<library>/e2e-tests/src/app.py: all test endpoints (Flask/FastAPI/Django app)
- drift/instrumentation/<library>/e2e-tests/src/test_requests.py: which endpoints are called
- drift/instrumentation/<library>/e2e-tests/entrypoint.py: test orchestration and setup
Understand what functionality is already tested and identify coverage gaps.
If you cloned the package source, read it to understand:
If focus context was provided, prioritize bugs related to that area. You may still note other potential issues, but test the focus area first.
Reason about potential issues in the instrumentation. Consider:
with statements and async context managers handled?
Produce a prioritized list of potential bugs to investigate.
Create BUG_TRACKING.md in the e2e test directory:
# Path: drift/instrumentation/<library>/e2e-tests/BUG_TRACKING.md
# <library> Instrumentation Bug Tracking
Generated: <current date and time>
## Summary
- Total tests attempted: 0
- Confirmed bugs: 0
- No bugs found: 0
- Skipped tests: 0
---
## Test Results
(Tests will be documented below as they are completed)
For each potential bug, follow this workflow:
Navigate to the e2e test directory:
cd drift/instrumentation/<library>/e2e-tests/
Build and start Docker containers:
docker compose build
docker compose run --rm -d --name bug-hunt-app app /bin/bash -c "sleep infinity"
This starts the container in the background so you can exec into it.
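Clean any existing traces and logs. The command is wrapped in sh -c so that the globs expand inside the container rather than in the host shell:
docker exec bug-hunt-app sh -c 'rm -rf .tusk/traces/* .tusk/logs/*'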
Add a new endpoint to src/app.py that exercises the potential bug. Also add the corresponding request to src/test_requests.py.
Example for Flask:
@app.route("/test/my-new-test", methods=["GET"])
def my_new_test():
    # Your test code here
    return jsonify({"success": True})
Start server without instrumentation:
docker exec -e TUSK_DRIFT_MODE=DISABLED bug-hunt-app python src/app.py &
sleep 5
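The fixed `sleep 5` is the simplest way to wait for the server. If startup time varies, a polling check is one possible alternative, sketched below. Everything here is an assumption rather than part of the workflow: the helper names `wait_until_ready` and `server_responds`, the probe URL, and the idea of probing with `curl` through `docker exec`.

```python
import subprocess
import time
from typing import Callable


def wait_until_ready(check: Callable[[], bool], attempts: int = 30, delay: float = 1.0) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False


def server_responds(container: str = "bug-hunt-app",
                    url: str = "http://localhost:8000/") -> bool:
    """Hypothetical probe: True once curl inside the container can connect.

    Without -f, curl exits 0 for any HTTP response (even a 404), so this
    only checks that the server is accepting connections.
    """
    result = subprocess.run(
        ["docker", "exec", container, "curl", "-s", "-o", "/dev/null", url],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

A call such as `wait_until_ready(server_responds)` would then stand in for the fixed sleep, with a `False` result treated as a startup failure.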
Hit the endpoint:
docker exec bug-hunt-app curl -s http://localhost:8000/test/my-new-test
Verify: Response is correct and endpoint works.
Stop the server:
docker exec bug-hunt-app pkill -f "python src/app.py" || true
sleep 2
If the endpoint fails in DISABLED mode:
- Update BUG_TRACKING.md with status: "Skipped - Failed in DISABLED mode"
Clean traces and logs:
docker exec bug-hunt-app sh -c 'rm -rf .tusk/traces/* .tusk/logs/*'
Start server in RECORD mode:
docker exec -e TUSK_DRIFT_MODE=RECORD bug-hunt-app python src/app.py &
sleep 5
Hit the endpoint:
docker exec bug-hunt-app curl -s http://localhost:8000/test/my-new-test
Wait for spans to export:
sleep 3
Stop the server:
docker exec bug-hunt-app pkill -f "python src/app.py" || true
sleep 2
Check for issues:
- Endpoint returns an error or a different response than in DISABLED mode:
  Update BUG_TRACKING.md: Status "Confirmed Bug - RECORD mode failure", Failure Point "RECORD"
- No traces created (docker exec bug-hunt-app ls .tusk/traces/):
  Update BUG_TRACKING.md: Status "Confirmed Bug - No traces captured", Failure Point "RECORD"
Run the Tusk CLI to replay:
docker exec -e TUSK_ANALYTICS_DISABLED=1 -e TUSK_REQUIRE_INBOUND_REPLAY_SPAN=1 bug-hunt-app tusk drift run --print --output-format "json" --enable-service-logs
Check for issues:
- Test fails ("passed": false in JSON output):
  Update BUG_TRACKING.md: Status "Confirmed Bug - REPLAY mismatch", Failure Point "REPLAY"
- No logs created (docker exec bug-hunt-app ls .tusk/logs/):
  Update BUG_TRACKING.md: Status "Confirmed Bug - No replay logs", Failure Point "REPLAY"
- Logs contain socket warnings:
  docker exec bug-hunt-app sh -c 'grep -i "TCP connect() called from inbound request context" .tusk/logs/*.log'
  Update BUG_TRACKING.md: Status "Confirmed Bug - Unpatched dependency", Failure Point "REPLAY"
If all modes pass with no issues:
- Update BUG_TRACKING.md: Status "No Bug - Test passed all modes"
- Remove the test endpoint from src/app.py and src/test_requests.py
After each test, append to BUG_TRACKING.md:
### Test N: [Brief description]
**Status**: [Confirmed Bug | No Bug | Skipped]
**Endpoint**: `/test/endpoint-name`
**Failure Point**: [DISABLED | RECORD | REPLAY | N/A]
**Description**:
[What this test was trying to uncover]
**Expected Behavior**:
[What should happen]
**Actual Behavior**:
[What actually happened]
**Error Logs**:
[Relevant error messages, stack traces, or warnings]
**Additional Notes**:
[Observations, potential root causes, context]
---
Important: Update BUG_TRACKING.md immediately after each test — do not batch updates.
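The status rules from the DISABLED, RECORD, and REPLAY steps can be collected into one decision function, which may help keep entries consistent. This is a sketch under stated assumptions: `Outcome` and `classify` are hypothetical names, the order of checks within each phase is a guess, and using "DISABLED" as the failure point for skipped tests is inferred from the entry template rather than stated in the steps.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    status: str
    failure_point: str


def classify(*, disabled_ok: bool, record_ok: bool, traces_exist: bool,
             replay_passed: bool, logs_exist: bool, socket_warning: bool) -> Outcome:
    """Map the observations from one test run to a BUG_TRACKING.md status."""
    if not disabled_ok:
        # Assumption: a DISABLED-mode failure is recorded with failure point DISABLED.
        return Outcome("Skipped - Failed in DISABLED mode", "DISABLED")
    if not record_ok:
        return Outcome("Confirmed Bug - RECORD mode failure", "RECORD")
    if not traces_exist:
        return Outcome("Confirmed Bug - No traces captured", "RECORD")
    if not replay_passed:
        return Outcome("Confirmed Bug - REPLAY mismatch", "REPLAY")
    if not logs_exist:
        return Outcome("Confirmed Bug - No replay logs", "REPLAY")
    if socket_warning:
        return Outcome("Confirmed Bug - Unpatched dependency", "REPLAY")
    return Outcome("No Bug - Test passed all modes", "N/A")
```

Earlier phases take precedence here, so a test that already fails in RECORD mode is never reported as a REPLAY failure.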
After testing all potential bugs:
docker stop bug-hunt-app || true
docker compose down -v
Clean up cloned package source:
rm -rf /tmp/*-source
Final state of the e2e test files:
- src/app.py should contain ONLY the original endpoints + new endpoints that expose confirmed bugs
- src/test_requests.py should be updated to include requests to bug-exposing endpoints
- BUG_TRACKING.md should have accurate summary counts and all test results
Commit the changes:
git add drift/instrumentation/<library>/e2e-tests/
git commit -m "bug-hunt(<library>): add e2e tests exposing instrumentation bugs"
Push the branch (skip if in Claude Code Web where the session handles this):
git push origin bug-hunt/<library>-$(date +%Y-%m-%d)
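Since the final BUG_TRACKING.md is expected to carry accurate summary counts, one way to double-check them is to recount the Status lines of the recorded entries. The sketch below assumes each entry follows the `**Status**:` format of the entry template and that skipped tests count toward the total. `summary_counts` is a hypothetical helper name.

```python
import re
from collections import Counter
from typing import Dict

# Matches filled-in status lines such as "**Status**: Confirmed Bug - REPLAY mismatch".
# The unfilled template line ("**Status**: [Confirmed Bug | ...]") does not match.
STATUS_LINE = re.compile(r"^\*\*Status\*\*:\s*(Confirmed Bug|No Bug|Skipped)", re.MULTILINE)


def summary_counts(tracking_text: str) -> Dict[str, int]:
    """Recount the Summary section values from the entries in BUG_TRACKING.md."""
    counts = Counter(match.group(1) for match in STATUS_LINE.finditer(tracking_text))
    return {
        "Total tests attempted": sum(counts.values()),  # assumption: includes skipped tests
        "Confirmed bugs": counts["Confirmed Bug"],
        "No bugs found": counts["No Bug"],
        "Skipped tests": counts["Skipped"],
    }
```

Comparing this result with the numbers written in the Summary section gives a quick consistency check.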
- Create BUG_TRACKING.md before starting any tests
- Update BUG_TRACKING.md after each individual test
- BUG_TRACKING.md
- bug-hunt/ branch