| name | Auditing Test Quality |
| description | Automates test quality assessment, identifies vanity tests, and guides systematic improvement of test suites. Use when reviewing test suites, identifying shallow tests, enforcing behavioral test standards, or when the user mentions test quality, vanity tests, or test effectiveness. |
Auditing Test Quality
Systematically audit test quality to identify vanity tests and guide improvement.
What you should do
Automate test quality assessment, identify vanity tests, and guide systematic improvement of test suites.
Prerequisites
- Test audit script available (usually scripts/audit_tests.py or make audit-tests)
- Project follows behavioral test naming conventions
- Mock and assertion counting tools configured
Workflow
Step 1: Run test quality audit
# Run whichever entry point the project provides:
python scripts/audit_tests.py   # direct script
make audit-tests                # Makefile target
npm run audit:tests             # npm script
Expected output format:
Test Quality Audit Report
=========================
ISSUES FOUND: 12
tests/unit/test_user_service.py:
- test_save_user: Non-behavioral test name
- test_get_user: Too many mocks (7)
- test_update_user: Poor mock-to-assertion ratio (1/4)
tests/integration/test_api.py:
- test_endpoint: Non-behavioral test name
- test_auth: Poor mock-to-assertion ratio (2/6)
SUMMARY:
Total tests examined: 45
Tests with issues: 12 (26.7%)
Issues by type:
- Non-behavioral names: 8
- Too many mocks: 2
- Poor mock ratio: 4
- Vanity tests: 3
Step 2: Prioritize issues by severity
High priority (fix immediately):
- Tests with >5 mocks
- Tests with only mock assertions (no behavior verification)
- Tests that would pass with broken functionality
Medium priority (fix this sprint):
- Tests with poor mock-to-assertion ratio (<3:1)
- Tests with implementation-focused names
- Tests mocking internal business logic
Low priority (fix when touching code):
- Tests with vague names but correct behavior
- Tests with minor mock discipline violations
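The tiers above can be approximated mechanically once per-test counts are available. The following is a minimal sketch under stated assumptions: `classify_priority` is a hypothetical helper, "poor ratio" is approximated as more mocks than assertions, and any name lacking both `should` and `when` is flagged as low priority, because telling an implementation-focused name from a merely vague one needs human judgment.

```python
def classify_priority(name: str, mocks: int,
                      plain_asserts: int, mock_asserts: int) -> str:
    """Map per-test counts to 'high', 'medium', 'low', or 'ok'."""
    # High: more than 5 mocks, or only mock assertions (no behavior check).
    only_mock_assertions = mock_asserts > 0 and plain_asserts == 0
    if mocks > 5 or only_mock_assertions:
        return "high"

    # Medium (assumed stand-in for "poor ratio"): mocks outnumber assertions.
    if mocks > plain_asserts + mock_asserts:
        return "medium"

    # Low (assumed heuristic): the name is not of the should/when form.
    words = name.split("_")
    if "should" not in words or "when" not in words:
        return "low"

    return "ok"
```

Tests that "would pass with broken functionality" cannot be detected from counts alone; those still need a manual check or a mutation-style experiment.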
Step 3: Fix high-priority issues
For each high-priority test:
- Analyze the test purpose: What behavior should it verify?
- Identify true dependencies: What actually needs mocking?
- Rewrite the test:
  - Use behavioral naming: test_should_X_when_Y
  - Reduce mocks to external dependencies only
  - Add meaningful assertions about outcomes
- Verify the test fails when functionality breaks
Example transformation:
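from unittest.mock import Mock

# UserService, user_data, and in_memory_repository come from the project under test.

# Before: vanity test that only verifies the mock was called
def test_save_user():
    mock_db = Mock()
    service = UserService(mock_db)
    service.save(user_data)
    mock_db.save.assert_called_once_with(user_data)

# After: behavioral test that verifies the observable outcome
def test_should_return_user_id_when_saving_valid_user():
    service = UserService(in_memory_repository)
    user_data = {"name": "John", "email": "john@example.com"}
    user_id = service.save(user_data)
    assert user_id is not None
    saved_user = service.find_by_id(user_id)
    assert saved_user.name == "John"
    assert saved_user.email == "john@example.com"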
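The behavioral version of the test relies on an in-memory repository instead of a mock. The document does not show one, so here is one possible shape, offered as a sketch: `InMemoryUserRepository`, its `save`/`get` methods, and this minimal `UserService` are all hypothetical stand-ins for the project's real classes.

```python
import itertools
from types import SimpleNamespace


class InMemoryUserRepository:
    """Hypothetical stand-in for a database-backed repository."""

    def __init__(self):
        self._users = {}
        self._ids = itertools.count(1)  # simple auto-increment ids

    def save(self, user_data: dict) -> int:
        user_id = next(self._ids)
        # Store the record as an object so fields read as attributes.
        self._users[user_id] = SimpleNamespace(id=user_id, **user_data)
        return user_id

    def get(self, user_id: int):
        return self._users.get(user_id)


class UserService:
    """Minimal service, assumed here to delegate storage to a repository."""

    def __init__(self, repository):
        self._repository = repository

    def save(self, user_data: dict) -> int:
        return self._repository.save(user_data)

    def find_by_id(self, user_id: int):
        return self._repository.get(user_id)


def test_should_return_user_id_when_saving_valid_user():
    service = UserService(InMemoryUserRepository())
    user_id = service.save({"name": "John", "email": "john@example.com"})
    assert user_id is not None
    assert service.find_by_id(user_id).name == "John"
```

Because the fake stores and returns real data, the test fails if `save` stops persisting or `find_by_id` returns the wrong record, which a call-count assertion would not catch.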
Step 4: Fix medium-priority issues
Improving mock-to-assertion ratios:
from unittest.mock import Mock

# Before: three mocks, one assertion, and no check on the result
def test_user_creation():
    mock_db = Mock()
    mock_email = Mock()
    mock_logger = Mock()
    service = UserService(mock_db, mock_email, mock_logger)
    service.create_user({"email": "test@example.com"})
    mock_db.save.assert_called_once()

# After: same mocks, but the outcome and each collaborator are verified
def test_should_create_user_and_send_welcome_email_when_valid_data():
    mock_db = Mock()
    mock_email = Mock()
    mock_logger = Mock()
    service = UserService(mock_db, mock_email, mock_logger)
    result = service.create_user({"email": "test@example.com"})
    # Outcome assertions
    assert result["status"] == "created"
    assert result["user_id"] is not None
    # Interaction assertions, one per mocked collaborator
    mock_db.save.assert_called_once()
    mock_email.send_welcome.assert_called_once_with("test@example.com")
    assert mock_logger.info.call_count >= 1
Improving test names:
# Implementation-focused names (avoid)
def test_uses_bcrypt_hash(): ...
def test_calls_api_endpoint(): ...
def test_saves_to_database(): ...

# Behavioral names (prefer)
def test_should_hash_password_when_user_registers(): ...
def test_should_return_user_data_when_valid_token_provided(): ...
def test_should_persist_user_when_registration_completes(): ...
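The `test_should_X_when_Y` convention is regular enough to check with a pattern. The sketch below is one way to do that. `BEHAVIORAL_NAME` and `is_behavioral_name` are illustrative names, and the pattern assumes lowercase snake_case test names.

```python
import re

# test_should_<outcome>_when_<condition>, lowercase snake_case assumed
BEHAVIORAL_NAME = re.compile(r"^test_should_[a-z0-9_]+_when_[a-z0-9_]+$")


def is_behavioral_name(name: str) -> bool:
    """Return True if the test name follows test_should_X_when_Y."""
    return BEHAVIORAL_NAME.match(name) is not None
```

A strict pattern like this will also reject reasonable names such as `test_rejects_expired_token`, so a project may prefer a looser keyword check.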
Step 5: Remove vanity tests
For tests that only verify mocks were called:
- Check if behavior is covered elsewhere: If yes, delete the vanity test
- If not covered: Rewrite to test actual behavior
- Consider integration tests: Some behaviors need higher-level testing
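To see why a mock-only test counts as a vanity test, it can help to run it against a deliberately broken implementation. The sketch below does that. `BrokenUserService`, `vanity_test`, `behavioral_test`, and `passes` are hypothetical names invented for this illustration.

```python
from unittest.mock import Mock


class BrokenUserService:
    """Deliberately broken: calls the database but drops the returned id."""

    def __init__(self, db):
        self._db = db

    def save(self, user_data):
        self._db.save(user_data)
        return None  # bug: the id produced by the database is lost


def vanity_test(service_cls):
    mock_db = Mock()
    service_cls(mock_db).save({"name": "John"})
    mock_db.save.assert_called_once()  # only checks the interaction


def behavioral_test(service_cls):
    mock_db = Mock()
    mock_db.save.return_value = 42
    user_id = service_cls(mock_db).save({"name": "John"})
    assert user_id == 42  # checks the observable outcome


def passes(test, service_cls) -> bool:
    """Run a test function and report whether it passed."""
    try:
        test(service_cls)
        return True
    except AssertionError:
        return False
```

With these definitions, `passes(vanity_test, BrokenUserService)` is `True` while `passes(behavioral_test, BrokenUserService)` is `False`: only the behavioral test notices the bug.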
Step 6: Re-run audit and verify improvements
python scripts/audit_tests.py
make test
Step 7: Update audit baseline
python scripts/audit_tests.py --update-baseline
git add audit_results.json
git commit -m "test: update quality audit baseline after improvements"
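A committed baseline is typically used as a ratchet: later runs may match or improve on it but should not regress. The format of `audit_results.json` is not specified in this document, so the sketch below assumes a single `"issues_found"` key. Both that key and the `regressed` helper are hypothetical.

```python
import json
from pathlib import Path


def regressed(baseline_path: Path, current_issue_count: int) -> bool:
    """Return True if the current run has more issues than the baseline."""
    baseline = json.loads(baseline_path.read_text())
    # Assumption: the baseline stores its total under "issues_found".
    return current_issue_count > baseline["issues_found"]
```

A CI step could call this with the current count and fail the build when it returns `True`.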
Automation setup
Pre-commit hook
repos:
  - repo: local
    hooks:
      - id: test-quality-audit
        name: Test quality audit
        entry: python scripts/audit_tests.py
        language: system
        pass_filenames: false
        files: ^tests/.*\.py$
CI integration
name: Test Quality
on: [push, pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.9"
      - name: Run test quality audit
        run: |
          # The default shell exits on the first failing command,
          # so test the audit's exit status directly.
          if ! python scripts/audit_tests.py; then
            echo "Test quality issues found. See report above."
            exit 1
          fi
Makefile target
audit-tests: ## Run test quality audit
	@echo "Running test quality audit..."
	@# Each recipe line runs in its own shell, so check the exit status in the same command.
	@if python scripts/audit_tests.py; then \
		echo "✅ All tests pass quality checks"; \
	else \
		echo "❌ Test quality issues found"; \
		exit 1; \
	fi
Audit script customization
Language-specific configurations
Python (pytest):
MOCK_PATTERNS = ['mock', 'Mock', 'patch', 'MagicMock']
ASSERTION_PATTERNS = ['assert', 'assertEqual', 'assertTrue']
BEHAVIORAL_KEYWORDS = ['should', 'when', 'given', 'returns', 'rejects']
JavaScript (Jest):
const MOCK_PATTERNS = ['jest.fn', 'mock', 'spy', 'stub'];
const ASSERTION_PATTERNS = ['expect', 'toBe', 'toEqual', 'assert'];
const BEHAVIORAL_KEYWORDS = ['should', 'when', 'given', 'returns', 'rejects'];
Java (JUnit):
List<String> MOCK_PATTERNS = Arrays.asList("mock", "spy", "when", "verify");
List<String> ASSERTION_PATTERNS = Arrays.asList("assert", "verify", "assertEquals");
Custom quality rules
Projects can extend the audit with domain-specific rules:
- Performance test naming patterns
- Database test cleanup verification
- API test status code checking
- Security test coverage requirements
Quality metrics tracking
Track improvements over time:
python scripts/audit_tests.py --report=json > quality_report_$(date +%Y%m%d).json
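Once dated reports accumulate, a short script can turn them into a trend. The sketch below reads every `quality_report_YYYYMMDD.json` file in a directory and returns the share of tests with issues per day. The JSON keys `tests_with_issues` and `total_tests` are assumptions, since the report schema is not defined here, and `issue_rate_history` is a hypothetical helper.

```python
import json
from datetime import datetime
from pathlib import Path


def issue_rate_history(report_dir: Path) -> list[tuple[str, float]]:
    """Return (ISO date, share of tests with issues) pairs, oldest first."""
    history = []
    # YYYYMMDD file names sort chronologically when sorted as text.
    for path in sorted(report_dir.glob("quality_report_*.json")):
        stamp = path.stem.removeprefix("quality_report_")
        day = datetime.strptime(stamp, "%Y%m%d").date()
        report = json.loads(path.read_text())
        # Assumed keys; adjust to whatever the audit script actually emits.
        rate = report["tests_with_issues"] / report["total_tests"]
        history.append((day.isoformat(), rate))
    return history
```

The sketch uses `str.removeprefix`, which requires Python 3.9 or later.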
Benefits
- Prevent test debt: Catch quality issues before they accumulate
- Improve debugging: Better tests make failures easier to understand
- Reduce maintenance: Behavioral tests are more resilient to refactoring
- Increase confidence: Quality tests actually catch regressions
- Standardize practices: Consistent quality across team members