| name | testing-services |
| description | Writes unit tests for Python service classes using Arrange-Act-Assert pattern with proper mocking at boundaries. Tests behavior, not implementation. Mocks external systems only (API calls, file I/O, databases). Use when writing tests for services or fixing test coverage. |
| user-invocable | true |
| argument-hint | [service-class-name] |
Testing Services
When to Use This Skill
Activate this skill when:
- Writing unit tests for service classes
- Need to improve test coverage for services
- Unclear what to mock vs what to test directly
- Writing integration tests that test service interactions
Quick Start
Running Tests Locally
pytest tests/unit/ -v
pytest tests/integration/ -v
pytest tests/unit/ tests/integration/ -v
pytest tests/unit/ tests/integration/ --cov=src --cov-report=term-missing --cov-report=html
pytest tests/unit/services/test_entity_service.py -v
pytest tests/unit/services/test_entity_service.py::TestEntityService::test_create_entity_success -v
Understanding Test Results
pytest tests/ --cov=src --cov-report=term-missing
pytest tests/ --cov=src --cov-report=html
open htmlcov/index.html
xdg-open htmlcov/index.html
Quick Reference
AAA Pattern
def test_service_operation():
mock_dependency = Mock(spec=Dependency)
mock_dependency.method.return_value = "expected"
service = ServiceClass(mock_dependency)
result = service.operation(param="value")
assert result == "expected_outcome"
mock_dependency.method.assert_called_once_with("value")
Running Tests
pytest tests/ -v
pytest tests/unit/test_service.py -v
pytest tests/ --cov=src --cov-report=term-missing
pytest tests/ -k "test_create" -v
Coverage Guidelines
- Aim for 85%+ overall coverage
- Focus on testing behavior, not hitting targets
- Missing coverage acceptable for edge cases or trivial code
Test Isolation and Independence
Every test should be completely independent. Tests must not:
- Rely on execution order
- Share mutable state
- Depend on previous test results
- Affect other tests when run in parallel
Example
class TestTaskManagement:
def test_find_first_task(self, tmp_path):
"""Should find the first unchecked task"""
spec_file = tmp_path / "spec.md"
spec_file.write_text("""
- [ ] Task 1
- [ ] Task 2
""")
result = find_next_available_task(str(spec_file))
assert result == (1, "Task 1")
shared_state = {"count": 0}
def test_increment():
shared_state["count"] += 1
assert shared_state["count"] == 1
Mock at Boundaries Rule
The Golden Rule: Mock external systems only, never internal logic.
What to Mock ✅
External systems and infrastructure:
- HTTP API clients, database connections, repositories
- File system operations, subprocess calls
- External service clients (email, payment, etc.)
Service dependencies:
- Other services injected via constructor (enables isolation testing)
What NOT to Mock ❌
Internal logic:
- Domain models and dataclasses (see domain-modeling skill for understanding what domain models are)
- Pure functions, helper methods, static methods
- Business logic calculations
- Exceptions and error classes
Standard library basics:
- datetime (use fixed times), string/list/dict operations
Decision Guide
Ask: "Is this an external dependency or internal logic?"
- External dependency → Mock it
- Internal logic → Test it directly
Service Test Templates
Testing a Core Service
from unittest.mock import Mock
import pytest
from src.services.entity_service import EntityService
from src.domain.entity import Entity
from src.domain.exceptions import ValidationError
class TestEntityService:
"""Tests for EntityService."""
def test_create_entity_success(self):
mock_repository = Mock()
mock_validator = Mock()
mock_validator.is_valid.return_value = True
service = EntityService(mock_repository, mock_validator)
result = service.create_entity(name="Test Entity", value=42)
assert result.name == "Test Entity"
assert result.value == 42
mock_validator.is_valid.assert_called_once()
mock_repository.save.assert_called_once()
def test_create_entity_validation_failure(self):
mock_repository = Mock()
mock_validator = Mock()
mock_validator.is_valid.return_value = False
service = EntityService(mock_repository, mock_validator)
with pytest.raises(ValidationError, match="Invalid entity data"):
service.create_entity(name="Bad", value=-1)
mock_repository.save.assert_not_called()
def test_find_by_criteria_filters_correctly(self):
mock_repository = Mock()
entities = [
Entity(name="Low", value=10),
Entity(name="High", value=100),
]
mock_repository.list_all.return_value = entities
service = EntityService(mock_repository, Mock())
result = service.find_by_criteria(min_value=50)
assert len(result) == 1
assert result[0].name == "High"
Testing a Composite Service
from unittest.mock import Mock
import pytest
from src.services.workflow_service import WorkflowService
from src.domain.exceptions import WorkflowError, ValidationError
class TestWorkflowService:
"""Tests for WorkflowService."""
def test_execute_workflow_success(self):
mock_entity_service = Mock()
mock_notification_service = Mock()
mock_audit_service = Mock()
mock_entity = Mock(id="entity-123")
mock_entity_service.create_entity.return_value = mock_entity
service = WorkflowService(
mock_entity_service,
mock_notification_service,
mock_audit_service
)
result = service.execute_workflow(name="Test", value=100, user_id="user-456")
assert result.success is True
assert result.entity_id == "entity-123"
mock_entity_service.create_entity.assert_called_once_with("Test", 100)
mock_notification_service.notify_entity_created.assert_called_once()
mock_audit_service.log_success.assert_called_once()
def test_execute_workflow_handles_failure(self):
mock_entity_service = Mock()
mock_entity_service.create_entity.side_effect = ValidationError("Invalid")
mock_notification_service = Mock()
mock_audit_service = Mock()
service = WorkflowService(
mock_entity_service,
mock_notification_service,
mock_audit_service
)
with pytest.raises(WorkflowError, match="Entity creation failed"):
service.execute_workflow(name="Bad", value=-1, user_id="user-456")
mock_audit_service.log_failure.assert_called_once()
mock_notification_service.notify_entity_created.assert_not_called()
def test_calculate_priority_static_method(self):
result = WorkflowService.calculate_priority(value=200, user_tier="premium")
assert result == 4
result = WorkflowService.calculate_priority(value=150, user_tier="standard")
assert result == 1
Layer-Based Testing Strategy
Python applications often follow a layered architecture, and we test each layer differently:
┌─────────────────────────────────────────┐
│ CLI Layer │ ← Test command orchestration
│ (commands: prepare, finalize, etc.) │ Mock everything below
├─────────────────────────────────────────┤
│ Service Layer │ ← Test business logic
│ (services: task mgmt, reviewer mgmt) │ Mock infrastructure
├─────────────────────────────────────────┤
│ Infrastructure Layer │ ← Test external integrations
│ (git, github, filesystem) │ Mock external systems
├─────────────────────────────────────────┤
│ Domain Layer │ ← Test directly
│ (models, config, exceptions) │ Minimal/no mocking
└─────────────────────────────────────────┘
Key insight: The lower the layer, the less you mock. Domain models are pure logic and need minimal mocking. CLI commands orchestrate everything and need maximum mocking.
Domain Layer (99% coverage target)
Testing approach:
- ✅ Direct testing - Minimal mocking
- ✅ Test business rules - Validation logic, data transformations
- ✅ Test edge cases - Invalid inputs, boundary conditions
What to mock: Almost nothing. Domain models are pure logic.
Example:
class TestMarkdownFormatter:
"""Test suite for MarkdownFormatter functionality"""
def test_bold_formatting_for_github(self):
"""Should format text as bold using GitHub markdown syntax"""
formatter = MarkdownFormatter(for_slack=False)
result = formatter.bold("important")
assert result == "**important**"
Infrastructure Layer (97% coverage target)
Testing approach:
- ✅ Mock external systems - subprocess, HTTP, file I/O
- ✅ Test success and error paths - What happens on success? On failure?
- ✅ Verify system calls - Check correct commands are executed
What to mock: Everything external to Python (subprocess, network, filesystem)
Service Layer (95% coverage target)
Testing approach:
- ✅ Mock infrastructure - Mock git, GitHub API, filesystem
- ✅ Test business logic - Algorithms, data processing, validation
- ✅ Test service instantiation - Verify services are created with correct dependencies
- ✅ Test edge cases - No reviewers available, all tasks complete, etc.
What to mock: Infrastructure layer (subprocess, GitHub API, file I/O) and service dependencies
CLI Integration Tests (98% coverage target)
Testing approach:
- ✅ Mock service layer - Mock service classes and their methods
- ✅ Mock infrastructure - Mock git, GitHub API, filesystem operations
- ✅ Test command orchestration - Verify correct sequence of service calls
- ✅ Test service instantiation - Verify services are created with correct dependencies
- ✅ Test output - Verify CLI outputs are formatted correctly
What to mock: Service Layer classes and Infrastructure dependencies
Key Testing Patterns
Pattern: Use Real Domain Models
def test_process_order():
mock_repository = Mock()
service = OrderService(mock_repository)
result = service.process_order(customer_id="c123", items=["item1"], total=99.99)
assert isinstance(result, Order)
assert result.customer_id == "c123"
assert result.total == 99.99
Pattern: Test Exception Handling
def test_find_user_not_found():
mock_repository = Mock()
mock_repository.get_by_id.return_value = None
service = UserService(mock_repository)
with pytest.raises(UserNotFoundError, match="User u999 not found"):
service.find_user("u999")
Pattern: Verify Side Effects
def test_send_notification():
mock_email_client = Mock()
service = NotificationService(mock_email_client)
service.send_notification(user_email="user@example.com", message="Hello!")
mock_email_client.send.assert_called_once_with(
to="user@example.com",
subject="Notification",
body="Hello!"
)
Pattern: Parametrized Tests
import pytest
@pytest.mark.parametrize("value,tier,expected", [
(100, "premium", 2),
(100, "standard", 1),
(200, "premium", 4),
])
def test_calculate_priority(value, tier, expected):
result = WorkflowService.calculate_priority(value, tier)
assert result == expected
Mock vs MagicMock vs Real Objects
Use Mock(spec=ClassName)
For most scenarios - provides autocomplete and catches attribute errors.
mock_repository = Mock(spec=UserRepository)
mock_repository.find_by_id.return_value = User(id="123")
Use MagicMock
When you need magic methods (__len__, __iter__, __getitem__).
from unittest.mock import MagicMock
mock_collection = MagicMock()
mock_collection.__len__.return_value = 5
Use Real Objects
For domain models, value objects, exceptions.
user = User(id="123", name="Alice")
order = Order(items=[Item("book")])
mock_service.process.side_effect = ValidationError("Bad data")
Fixtures for Reusable Setup
import pytest
from unittest.mock import Mock
@pytest.fixture
def mock_repository():
return Mock(spec=UserRepository)
@pytest.fixture
def user_service(mock_repository):
return UserService(mock_repository)
def test_create_user(user_service, mock_repository):
mock_repository.save.return_value = User(id="123")
result = user_service.create_user(name="Alice")
assert result.id == "123"
Integration Tests
Integration tests verify services work together - only mock infrastructure.
def test_workflow_integration():
mock_database = Mock()
mock_email_client = Mock()
entity_service = EntityService(mock_database)
notification_service = NotificationService(mock_email_client)
workflow_service = WorkflowService(entity_service, notification_service)
result = workflow_service.execute_workflow(name="Test", value=100, user_id="u123")
assert result.success is True
mock_database.save.assert_called()
mock_email_client.send.assert_called()
One Concept Per Test
Each test should verify ONE specific behavior. If a test has multiple unrelated assertions, split it into separate tests.
Why? When a test fails, you should immediately know what broke. Multi-concept tests make debugging harder.
Exception for E2E Tests: End-to-end tests are expensive and slow to run (often taking minutes). For E2E tests, it's acceptable and encouraged to verify multiple related aspects of a workflow in a single test to minimize execution time. For example, an E2E test that triggers a workflow can verify:
- The workflow completes successfully
- A PR was created
- The PR has correct labels
- The PR has an AI summary
- The PR has cost information
This is a pragmatic trade-off: E2E tests prioritize execution speed over granular isolation, while unit/integration tests maintain strict one-concept-per-test discipline.
Example
def test_user_service():
service = UserService(mock_repo)
user = service.create_user("Alice")
assert user.name == "Alice"
found = service.find_user(user.id)
service.delete_user(user.id)
def test_create_user_sets_name():
service = UserService(mock_repo)
user = service.create_user("Alice")
assert user.name == "Alice"
def test_find_user_returns_existing():
service = UserService(mock_repo)
mock_repo.get_by_id.return_value = User(id="123")
found = service.find_user("123")
assert found.id == "123"
Coverage Guidelines
Current Coverage Status
Target: 85%+ overall coverage with meaningful tests that provide confidence in refactoring and catch real regressions.
What to Test
High Priority (90%+ coverage):
- Business logic (task management, data processing algorithms)
- Command orchestration (CLI commands, workflow coordination)
- Configuration validation
- Error handling paths
Medium Priority (70%+ coverage):
- Utilities and formatters
- API integrations (GitHub, external services)
- File operations and artifact handling
Low Priority (may skip):
- Simple getters/setters
- Third-party library wrappers
- CLI argument parsing (tested via E2E)
- Entry points (
__main__.py - tested via E2E)
Viewing Coverage Reports
pytest tests/ --cov=src --cov-report=html
open htmlcov/index.html
xdg-open htmlcov/index.html
pytest tests/ --cov=src --cov-report=term-missing
Coverage reports show which lines are tested and which are not, helping identify gaps in test coverage.
Common Anti-Patterns
❌ Mocking Internal Logic
def test_bad():
service = UserService(mock_repo)
with patch.object(UserService, 'validate_email', return_value=True):
service.create_user(email="test@example.com")
def test_good():
service = UserService(mock_repo)
service.create_user(email="test@example.com")
❌ Mocking Domain Models
mock_user = Mock(spec=User)
user = User(id="123", name="Alice")
❌ Testing Implementation Details
def test_bad():
service.create_user("Alice")
assert service._validate_name.called
def test_good():
user = service.create_user("Alice")
assert user.name == "Alice"
❌ Not Using spec= Parameter
mock_repo = Mock()
mock_repo.typo_method.return_value = "data"
mock_repo = Mock(spec=UserRepository)
Coverage Guidance
Skip testing:
- Trivial getters/setters
- Simple
__repr__ or __str__
- Framework boilerplate
- One-line defensive code
Focus coverage on:
- Business logic in services
- Complex conditionals and loops
- Error handling for important cases
- Integration points between services
Test Organization
Tests should follow consistent organization conventions for maintainability. See the python-code-style skill for code organization conventions that also apply to test files.
Directory Structure
Tests mirror the src/ directory structure:
tests/
├── conftest.py # Shared fixtures
├── unit/
│ ├── domain/ # Domain layer tests
│ │ ├── test_config.py
│ │ ├── test_models.py
│ │ └── test_exceptions.py
│ ├── infrastructure/ # Infrastructure layer tests
│ │ ├── git/
│ │ │ └── test_operations.py
│ │ ├── github/
│ │ │ ├── test_operations.py
│ │ │ └── test_actions.py
│ │ └── filesystem/
│ │ └── test_operations.py
│ └── services/ # Service layer tests
│ ├── formatters/
│ │ └── test_table_formatter.py
│ ├── test_entity_service.py
│ ├── test_workflow_service.py
│ └── test_statistics_service.py
├── integration/ # Integration tests
│ └── cli/ # CLI command integration tests
│ └── commands/
│ ├── test_prepare.py
│ ├── test_finalize.py
│ └── test_statistics.py
├── e2e/ # End-to-end tests
└── builders/ # Test helpers/factories
Naming Conventions
Class Names: TestModuleName or TestFunctionName
- ✅
TestEntityService
- ✅
TestCheckUserCapacity
- ❌
ServiceTests (wrong suffix)
- ❌
TestServices (too vague)
Test Method Names: test_<what>_<when>_<condition>
- ✅
test_find_user_returns_none_when_all_at_capacity
- ✅
test_create_branch_raises_error_when_branch_exists
- ✅
test_parse_data_extracts_id_and_name
- ❌
test_user (not descriptive)
- ❌
test_find_user_works (vague)
Fixture Names: <resource>_<state> or mock_<service>
- ✅
user_config, sample_spec_file, mock_github_api
- ❌
data, setup, fixture1
File Names: test_<module_name>.py
- ✅
test_entity_service.py
- ✅
test_operations.py
- ❌
entity_tests.py
CI/CD Integration
Automated Testing
Tests run automatically on:
- Every push to main branch
- Every pull request
- Via GitHub Actions workflow
Test Workflow Example
name: Run Tests
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.11'
- name: Install dependencies
run: |
pip install -r requirements.txt
pip install pytest pytest-cov
- name: Run tests with coverage
run: |
pytest tests/unit/ tests/integration/ \
--cov=src \
--cov-report=term-missing \
--cov-report=html \
--cov-fail-under=70
- name: Upload coverage report
uses: actions/upload-artifact@v3
with:
name: coverage-report
path: htmlcov/
PR Requirements
- All tests must pass
- Coverage must meet minimum threshold (typically 70%)
- New features require tests
- Bug fixes should include regression tests
Related Skills
- creating-services: For understanding service constructor patterns that you'll be testing
- domain-modeling: For understanding domain models and why not to mock them
- python-code-style: For code organization conventions that apply to test files
- identifying-layer-placement: For understanding which layer your code belongs to and how to test it
Further Reading