| name | testing-strategy |
| description | Comprehensive guide for implementing AIDB tests following E2E-first philosophy, DebugInterface abstraction, and MCP response health standards |
| version | 1.0.0 |
| tags | ["testing","e2e","integration","unit","mcp","framework","debugging"] |
Priority: E2E → Integration → Unit (Highest ROI First)
All tests MUST be run via ./dev-cli test run:
```bash
./dev-cli test run -s {suite} [-k 'pattern'] [-l {lang}]
```
- `-k` (pattern) and `-l` (language) flags are supported
- `--local` - avoid forcing; suites know their natural execution environment, and forcing local execution causes unexpected behavior
- Direct `pytest` invocation is NOT supported

This skill guides you through creating and modifying tests for the AIDB project. The test infrastructure is complete - your job is to implement tests using proven patterns.
When implementing tests, you may also need:
For complete test architecture, see test infrastructure in src/tests/.
Why E2E First?
Testing Priority:
Goal: Verify we can launch and debug programs using various frameworks
Pattern:
We're testing that AIDB works WITH frameworks, not testing the frameworks themselves.
Don't just validate structure - validate content accuracy and payload efficiency.
Bad test:
```python
assert "locals" in response["data"]  # Structure only
```
Good test:
```python
# Structure
assert "locals" in response["data"]

# Content accuracy
assert response["data"]["locals"]["x"]["value"] == 10
assert response["data"]["locals"]["x"]["line"] == 5

# Efficiency (no junk)
assert len(response["data"]) <= 3       # No bloated payloads
assert len(response["summary"]) <= 200  # Concise summaries
```
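Checks like these can be centralized. Below is a minimal sketch of a reusable health assertion; the helper name, field names, and thresholds are hypothetical (mirroring the example above), not existing AIDB utilities:

```python
# Hypothetical helper bundling the three response-health checks:
# structure, content accuracy, and payload efficiency.
def assert_response_health(response, expected_locals, max_fields=3, max_summary=200):
    data = response["data"]
    assert "locals" in data  # structure
    for name, expected_value in expected_locals.items():
        # content accuracy: each expected local has the right value
        assert data["locals"][name]["value"] == expected_value
    assert len(data) <= max_fields  # no bloated payloads
    assert len(response["summary"]) <= max_summary  # concise summaries
```

A test would then call, for example, `assert_response_health(response, {"x": 10})` after each debug operation instead of repeating the individual asserts.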
Why critical:
- Variable substitution in launch configs (`${workspaceFolder}`, etc.)

Test thoroughly:
Critical Breakpoint Timing:
Set breakpoints when STARTING sessions (not after) to avoid race conditions with fast-executing programs:
```python
# ✅ CORRECT: breakpoints passed when starting the session
await debug_interface.start_session(program=prog, breakpoints=[{"line": 10}])

# ❌ WRONG: race condition with fast-executing programs
await debug_interface.start_session(program=prog)
await debug_interface.set_breakpoint(file=prog, line=10)  # May be too late!
```
Exception: Long-running processes (servers) where you attach. Reference: src/tests/aidb_shared/e2e/test_complex_workflows.py
The cornerstone of our test strategy is the DebugInterface abstraction - a unified API that works with both MCP tools and the direct API.
For implementation details, see:
- `src/tests/_interfaces/` - Debug interface source and docstrings

Why? One test validates both entry points.
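As a rough illustration of the idea only (the real implementation lives in `src/tests/_interfaces/`; these class and method names are assumptions), the abstraction boils down to two backends sharing one async surface:

```python
# Illustrative sketch of a unified debug interface: one abstract surface,
# two backends. The real AIDB interface may differ in names and signatures.
from abc import ABC, abstractmethod

class DebugInterface(ABC):
    @abstractmethod
    async def start_session(self, program, breakpoints=None): ...

    @abstractmethod
    async def set_breakpoint(self, file, line): ...

    @abstractmethod
    async def stop_session(self): ...

class MCPDebugInterface(DebugInterface):
    """Would route each call through MCP tool invocations."""

class APIDebugInterface(DebugInterface):
    """Would call the AIDB Python API directly."""
```

Because both backends satisfy the same contract, a single test body can be parametrized to run against either one.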
Hypothetical Example (illustrates pattern, not a real test file):
```python
import pytest

from tests._helpers.parametrization import parametrize_interfaces
# BaseE2ETest is assumed to be importable from the shared test base classes

class TestBreakpoints(BaseE2ETest):
    @parametrize_interfaces  # Runs twice: MCP and API
    @pytest.mark.asyncio
    async def test_set_breakpoint(self, debug_interface, simple_program):
        """Test works with BOTH MCP and API."""
        await debug_interface.start_session(program=simple_program)
        bp = await debug_interface.set_breakpoint(
            file=simple_program,
            line=5,
        )
        self.verify_bp.verify_breakpoint_verified(bp)
        await debug_interface.stop_session()
```
Key Points:
- `@parametrize_interfaces` runs the test with both MCP and API

The shared suite is AIDB's language-agnostic test foundation that validates core debugging capabilities across all supported languages using normalized, programmatically generated test programs.
Key Innovation: Semantic markers that map identical logic to language-specific line numbers.
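A minimal sketch of how marker resolution could work, assuming markers are embedded as comments in the generated programs (`resolve_marker`, the `@marker` comment syntax, and the comment-prefix table are all hypothetical, not the real AIDB helpers):

```python
# Hypothetical semantic-marker lookup: tests reference a marker name
# (e.g. "bp:loop-body") instead of hard-coding per-language line numbers.
from pathlib import Path

MARKER_COMMENT = {"python": "#", "javascript": "//", "java": "//"}

def resolve_marker(program: Path, marker: str, language: str) -> int:
    """Return the 1-based line number carrying the given semantic marker."""
    tag = f"{MARKER_COMMENT[language]} @{marker}"
    for lineno, line in enumerate(program.read_text().splitlines(), start=1):
        if tag in line:
            return lineno
    raise LookupError(f"marker {marker!r} not found in {program}")
```

With this shape, the same shared test can set a breakpoint at `resolve_marker(prog, "bp:loop-body", lang)` regardless of where that line falls in each language's generated program.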
Location: src/tests/aidb_shared/ (integration/ + e2e/)
What it tests:
For complete details, see DebugInterface.
Use shared suite when:
Use framework tests when:
```
src/tests/
├── aidb_shared/          # ⭐ Shared suite: language-agnostic debug fundamentals
│   ├── integration/      # Core debug operations (breakpoints, stepping, variables)
│   └── e2e/              # Complex workflows, parallel sessions
├── aidb/                 # Core API tests - organized by component
│   ├── adapters/         # Adapter-specific tests
│   ├── api/              # Public API tests
│   ├── audit/            # Audit logging tests
│   ├── dap/              # DAP client tests
│   └── session/          # Session management tests
├── aidb_mcp/             # MCP server tests - organized by component
├── frameworks/           # Framework integration tests
│   ├── python/           # Flask, FastAPI, pytest
│   ├── javascript/       # Express, Jest
│   └── java/             # Spring Boot, JUnit
├── _helpers/             # Test helpers and utilities
├── _fixtures/            # Shared fixtures
│   └── unit/             # ⭐ Unit test infrastructure (see below)
└── _assets/              # Test programs and data
    ├── framework_apps/   # Framework test applications
    └── test_programs/    # Generated programs for shared suite
```
The centralized unit test infrastructure at src/tests/_fixtures/unit/ provides reusable mocks, builders, and fixtures:
```
_fixtures/unit/
├── builders/       # DAPRequestBuilder, DAPResponseBuilder, DAPEventBuilder
├── dap/            # Transport, events, receiver mocks
├── session/        # Registry, lifecycle, state, child_manager mocks
├── adapter/        # Port, process, launch_orchestrator mocks
├── mcp/            # DebugAPI, MCPSessionContext mocks
├── api/            # Session, launch_config, breakpoint mocks
├── conftest.py     # Master fixture re-exports
├── context.py      # mock_ctx, null_ctx, tmp_storage
└── assertions.py   # UnitAssertions class
```
Usage Pattern:
```python
# In domain conftest.py (e.g., src/tests/aidb/dap/unit/conftest.py)
from tests._fixtures.unit.conftest import *  # noqa: F401, F403
from tests._fixtures.unit.builders import DAPEventBuilder, DAPResponseBuilder

# In test file
def test_something(mock_ctx, mock_transport):
    event = DAPEventBuilder.stopped_event(reason="breakpoint")
    # ...
```
Key Components:
Test suites run in different environments based on their requirements:
Local-Only Suites (no Docker):
- `cli` - CLI command tests
- `mcp` - MCP server unit/integration tests
- `core` - Core AIDB API tests
- `common` - Common utilities tests
- `logging` - Logging framework tests
- `ci_cd` - CI/CD workflow tests

Docker Suites (require containers):
- `shared` - Multi-language shared tests (parallel language containers)
- `frameworks` - Framework integration tests (parallel language containers)
- `launch` - Launch config tests (parallel language containers)

Why the split?
- Docker (with parallel language containers) is only needed for `shared`/`frameworks`/`launch`

Running tests:
```bash
./dev-cli test run -s mcp      # Local execution
./dev-cli test run -s shared   # Docker execution
```
Always use existing infrastructure:
- Base classes: `BaseE2ETest`, `BaseIntegrationTest`, `FrameworkDebugTestBase`
- Decorators: `@parametrize_interfaces`, `@parametrize_languages`
- Assertion helpers: `self.verify_bp`, `self.verify_exec`, `MCPAssertions`
- Constants: `StopReason`, `TestTimeouts`, `MCPTool`

For complete details, see E2E Patterns.
Study these real tests before writing new ones:
- Python: `test_flask_debugging.py`, `test_fastapi_debugging.py`, `test_pytest_debugging.py`
- JavaScript: `test_express_debugging.py`, `test_jest_debugging.py`
- Java: `test_springboot_debugging.py`, `test_junit_debugging.py`
- Launch: `test_launch_variable_resolution.py`, `test_session_target_handling.py`

For complete file paths and patterns, see E2E Patterns.
For hypothetical examples illustrating common patterns, see E2E Patterns.
Key patterns covered:
Look at E2E Patterns:
Don't start from scratch:
Don't create:
- Duplicate fixtures (check existing `conftest.py` files first)
- Duplicate constants (check `constants.py`)

Do create:
Current State: No performance baselines exist yet
Phase 1: Establish baselines
Phase 2: Regression testing
For now: Focus on functional correctness, not performance.
- `@parametrize_interfaces` for MCP/API coverage
- `FrameworkDebugTestBase`
- `test_launch_via_api()`
- `test_launch_via_vscode_config()`
- `test_dual_launch_equivalence()`
- `framework_name` attribute

CRITICAL: When tests fail, check logs BEFORE attempting fixes.
See: Debugging Failures for log locations, investigation workflow, and common patterns.
For CI test failures, use the ci-cd-workflows skill's troubleshooting guide.
| Resource | Content |
|---|---|
| E2E Patterns | Test patterns, markers, code reuse, working examples |
| Framework Tests | Dual-launch pattern, Flask/Express examples |
| DebugInterface | Unified API abstraction, shared suite architecture |
| Debugging Failures | Log locations, investigation workflow, common issues |
Test Infrastructure: src/tests/ (see _interfaces/, _fixtures/, _helpers/ for core components)
- `wip/test-implementation-backlog/CONTEXT.md`
- Flask (`test_flask_debugging.py`) and Express (`test_express_debugging.py`)

Internal Documentation:
- `src/tests/` - Test infrastructure (see `_interfaces/`, `_fixtures/`, `_helpers/`)
- `docs/developer-guide/overview.md` - System architecture

Code References:
- `src/aidb/dap/protocol.py` (fully typed)
- `src/tests/_helpers/` and `src/tests/_fixtures/`

Remember: