| name | run-tests |
| description | Comprehensive pytest testing and debugging framework. Use when running tests, debugging failures, fixing broken tests, or investigating test errors. Includes systematic investigation workflow with external AI tool consultation and verification strategies. |
This skill provides a systematic approach to running tests and debugging failures using pytest. The core workflow integrates investigation, external tool consultation, and verification to efficiently resolve test failures.
Key capabilities:
This skill may run operations that take up to 5 minutes. Be patient and wait for completion.
Use Bash(command="...", timeout=300000) rather than run_in_background=True for test suites, builds, or analysis. Polling BashOutput repeatedly creates spam and degrades the user experience. Long operations should run in the foreground with an appropriate timeout, not in the background with frequent polling.
# Test suite that might take 5 minutes (timeout in milliseconds)
result = Bash(command="pytest src/", timeout=300000) # Wait up to 5 minutes
# The command will block here until completion - this is correct behavior
# Don't use background + polling
bash_id = Bash(command="pytest", run_in_background=True)
output = BashOutput(bash_id) # Creates spam!
5-Phase Process:
Key principles:
Quick decision guide:
If unfamiliar with test organization:
# Quick summary
sdd test discover --summary
# Directory tree
sdd test discover --tree
# Quick run (stop on first failure)
sdd test run --quick
# Debug mode (verbose with locals and prints)
sdd test run --preset-debug
# Run specific test
sdd test run tests/test_module.py::test_function
# Coverage report
sdd test run --coverage
# List all presets
sdd test run --list
Or use pytest directly:
pytest -v # Verbose
pytest -vv -l -s # Very verbose, show locals, show prints
pytest -x # Stop on first failure
pytest -k "test_user" # Run tests matching pattern
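For example, `-k "test_user"` matches on substrings of test names. A minimal sketch (hypothetical test names) of what that pattern would and would not select:

```python
# test_users.py -- hypothetical module for illustration only

def test_user_creation():   # selected by -k "test_user"
    assert True

def test_user_deletion():   # selected by -k "test_user"
    assert True

def test_order_total():     # not selected: name does not contain "test_user"
    assert True
```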
For large test suites with many failures:
# Save output to timestamped file
sdd test run --preset-debug | tee /tmp/test-run-$(date +%Y%m%d-%H%M%S).log
For each failure:
When available: If codebase documentation exists (generated by sdd doc generate), use it for faster investigation.
Check availability:
sdd doc stats
Useful commands when debugging:
# Search for functions or concepts
sdd doc search "authentication"
# Show function definition
sdd doc show-function AuthService.login
# Find dependencies
sdd doc list-dependencies src/services/authService.ts
# Find what depends on a file (impact analysis)
sdd doc dependencies --reverse src/auth.py
Benefits:
If not available: Continue with standard file exploration. Run sdd doc generate to create documentation for future use.
CRITICAL: External tool consultation is mandatory for test failures when external tools exist.
sdd test check-tools
Decision:
All external tools operate in read-only mode. They analyze and suggest; YOU implement all fixes.
# Auto-route based on failure type
sdd test consult assertion --error "Full error message" --hypothesis "Your theory about the cause"
# Include code for context
sdd test consult exception --error "AttributeError: ..." --hypothesis "Missing return" --test-code tests/test_file.py --impl-code src/module.py
# Show routing matrix
sdd test consult --list-routing
# Manual tool selection
sdd test consult --tool gemini --prompt "Custom question..."
| Tool | Best For | Example Use |
|---|---|---|
| Gemini | Hypothesis validation, framework explanations, strategic guidance | "Why is this fixture not found?" |
| Codex | Code-level review, specific fix suggestions | "Review this code and suggest fixes" |
| Cursor | Repo-wide discovery, finding patterns | "Find all call sites" |
Use multi-agent consultation for:
# Auto-selects configured consensus agents
sdd test consult assertion --error "..." --hypothesis "..." --multi-agent
# Override which agents participate (comma-separated list)
sdd test consult exception --error "..." --hypothesis "..." --multi-agent --agents gemini,codex
Combine insights from:
# Make targeted changes using Edit tool
# Example: Add missing return statement
# Run the specific fixed test
sdd test run tests/test_module.py::test_function
# If passing, run full suite
sdd test run
# Verify no regressions
pytest tests/ -v
Add comments explaining:
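As a sketch of what such a commented fix might look like (hypothetical function and test names, not from this codebase), following the "add missing return statement" example above:

```python
def apply_discount(price, rate):
    discounted = price * (1 - rate)
    # Fix: the function previously computed `discounted` but never returned it,
    # so callers (and test_apply_discount) received None. Return the value to
    # match the documented contract.
    return discounted
```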
Check availability of external tools and get routing suggestions.
# Basic check
sdd test check-tools
# Get routing for specific failure type
sdd test check-tools --route assertion
sdd test check-tools --route fixture
Smart pytest runner with presets for common scenarios.
# List all presets
sdd test run --list
# Presets
sdd test run --quick # Stop on first failure
sdd test run --preset-debug # Verbose + locals + prints
sdd test run --coverage # Coverage report
sdd test run --fast # Skip slow tests
sdd test run --parallel # Run in parallel
# Run specific test
sdd test run tests/test_file.py::test_name
External tool consultation with auto-routing.
# Auto-route based on failure type
sdd test consult {assertion|exception|fixture|import|timeout|flaky} --error "..." --hypothesis "..."
# Include code
sdd test consult exception --error "..." --hypothesis "..." --test-code tests/test.py --impl-code src/module.py
# Multi-agent mode
sdd test consult assertion --error "..." --hypothesis "..." --multi-agent
# Manual tool selection
sdd test consult --tool {gemini|codex|cursor} --prompt "..."
# Show routing matrix
sdd test consult --list-routing
# Dry run
sdd test consult fixture --error "..." --hypothesis "..." --dry-run
Test structure analyzer and discovery.
# Quick summary
sdd test discover --summary
# Directory tree
sdd test discover --tree
# All fixtures
sdd test discover --fixtures
# All markers
sdd test discover --markers
# Detailed analysis
sdd test discover --detailed
# Analyze specific directory
sdd test discover tests/unit --summary
Available on all commands:
--no-color - Disable colored output
--verbose, -v - Show detailed output
--quiet, -q - Minimal output (errors only)
# Run test multiple times
pytest tests/test_flaky.py --count=10
# Run with random order
pytest --random-order
# Show fixture setup and teardown
pytest --setup-show tests/test_module.py
# List available fixtures
pytest --fixtures
Common fixture problems:
Check in order:
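As an illustration of one frequent problem, a broadly scoped fixture that hands out mutable state can leak changes between tests. A minimal sketch (hypothetical fixture and tests):

```python
import pytest

@pytest.fixture(scope="session")
def settings():
    # Created once per session: any mutation leaks into later tests
    return {"retries": 3}

def test_overrides_retries(settings):
    settings["retries"] = 0          # mutates the shared dict
    assert settings["retries"] == 0

def test_expects_default(settings):
    # Passes in isolation, fails when run after test_overrides_retries
    assert settings["retries"] == 3
```

Switching the fixture to function scope, or returning a fresh copy, restores test isolation.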
Quick reference for which tool to use based on failure type:
| Failure Type | Primary Tool | Secondary (if needed) | Why |
|---|---|---|---|
| Assertion mismatch | Codex | Gemini | Code-level bug analysis |
| Exceptions | Codex | Gemini | Precise code review |
| Import/packaging | Gemini | Cursor | Framework expertise |
| Fixture issues | Gemini | Cursor | Pytest scoping knowledge |
| Timeout/performance | Gemini + Cursor | - | Strategy + pattern discovery |
| Flaky tests | Gemini + Cursor | - | Diagnosis + state dependencies |
| Multi-file issues | Cursor | Gemini | Discovery + synthesis |
| Unclear errors | Gemini | Web search | Explanation first |
Query type routing:
When running tests to verify refactoring:
# Run full suite
sdd test run
# If all pass: Done! No consultation needed.
# If tests fail: Follow standard debugging workflow
Key point: Passing verification runs require no consultation. Only investigate failures.
If two tools give different recommendations:
Use additional tools when:
Consultation timeouts:
Timeouts are configured in .claude/ai_config.yaml (run-tests.consultation.timeout_seconds).
When tools time out:
Check whether the tool process is still running: ps aux | grep <tool>

| Recommended | If Unavailable | How to Compensate |
|---|---|---|
| Gemini | Codex or Cursor | Ask "why" with extra context; use web search |
| Codex | Gemini | Ask for very specific code examples |
| Cursor | Manual Grep + Gemini | Use Grep to find patterns, Gemini to analyze |
Multi-agent mode consults two agents in parallel and synthesizes their insights:
sdd test consult fixture --error "..." --hypothesis "..." --multi-agent
Output includes:
Benefits:
# Drop into debugger on failure
pytest --pdb
# Drop into debugger on first failure
pytest -x --pdb
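An alternative to --pdb is placing a breakpoint directly in the test or code under investigation; pytest disables output capture while the debugger is active. A minimal, self-contained sketch (the arithmetic stands in for the real computation under test):

```python
def test_discount_applied():
    price = 3 * 10 * 0.9   # stand-in for the real calculation under test
    breakpoint()           # pytest pauses here; inspect locals, then 'c' to continue
    assert price == 27
```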
# conftest.py
def pytest_configure(config):
    config.addinivalue_line("markers", "slow: marks tests as slow")
    config.addinivalue_line("markers", "integration: marks integration tests")
    config.addinivalue_line("markers", "unit: marks unit tests")

# Usage (in a test module)
import pytest

@pytest.mark.slow
def test_complex_calculation():
    pass
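Registered markers can then be used to select or skip tests at run time, e.g. `pytest -m slow` or `pytest -m "not slow"`.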
from unittest.mock import Mock, patch
def test_api_call():
    # fetch_data is the function under test, imported from your module
    with patch('requests.get') as mock_get:
        mock_get.return_value.json.return_value = {"status": "ok"}
        result = fetch_data()
        assert result["status"] == "ok"
        mock_get.assert_called_once()
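Note that `patch` targets the name where it is looked up: patching 'requests.get' works here only if the module defining fetch_data does `import requests` and calls `requests.get(...)`. If it instead does `from requests import get`, patch that module's own name (e.g. `patch('mymodule.get')`, with `mymodule` as a placeholder for the real module path).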
Check that __init__.py files exist
Use sdd test run --parallel to run tests in parallel
Run verbose (-v) for better visibility
Use -x to stop on first failure when debugging
Use the -l flag to see local variables

For test failures:
Check available external tools with sdd test check-tools

Skip consultation when:
A test debugging session is successful when: