| name | integration-testing |
| description | Execute comprehensive integration testing of all EDPS skills working together, validate end-to-end workflows, and ensure seamless VS Code integration with performance and consistency validation. |
| license | MIT |
Integration & Testing Skill
Intent
Orchestrate comprehensive testing and validation of the complete EDPS skill ecosystem, ensuring all 22 skills integrate seamlessly, maintain consistent markdown input/output formats, meet performance standards, and provide excellent user experience within VS Code. This skill validates end-to-end workflows, detects integration issues, and ensures system reliability.
Inputs
- Skill Definitions:
.github/skills/*/SKILL.md (all skill definitions)
- Test Projects:
projects/*/ (sample projects for testing)
- Sample Data:
projects/*/artifacts/Sample Data/ (test inputs)
- Documentation:
README.md, skill documentation files
- Configuration: VS Code settings, extension configurations
Outputs
Files Generated:
projects/[test-project]/artifacts/Testing/integration-test-report.json - Structured test results
projects/[test-project]/artifacts/Testing/integration-test-report.md - Human-readable test report
projects/[test-project]/artifacts/Testing/performance-analysis.json - Performance metrics and analysis
projects/[test-project]/artifacts/Testing/workflow-validation-results.md - End-to-end workflow test results
projects/[test-project]/artifacts/Testing/markdown-consistency-report.md - Format consistency analysis
projects/[test-project]/artifacts/Testing/vscode-integration-test.md - VS Code integration validation
projects/[test-project]/artifacts/Testing/user-experience-assessment.md - UX evaluation results
JSON Structure (integration-test-report.json)
{
"test_execution": {
"test_suite_id": "integration-test-v1.0",
"execution_timestamp": "ISO8601",
"total_skills_tested": 22,
"test_environment": {
"vscode_version": "string",
"copilot_version": "string",
"os_platform": "Windows|Mac|Linux",
"test_project": "project_name"
},
"execution_time_total": "seconds",
"overall_status": "PASSED|FAILED|WARNING"
},
"skill_integration_tests": [
{
"skill_name": "requirements-ingest",
"test_status": "PASSED|FAILED|SKIPPED",
"execution_time": "seconds",
"input_validation": {
"format_check": "PASSED|FAILED",
"schema_validation": "PASSED|FAILED",
"error_handling": "PASSED|FAILED"
},
"output_validation": {
"json_structure": "PASSED|FAILED",
"markdown_format": "PASSED|FAILED",
"file_generation": "PASSED|FAILED",
"traceability_links": "PASSED|FAILED"
},
"integration_checks": {
"downstream_compatibility": "PASSED|FAILED",
"data_flow_integrity": "PASSED|FAILED"
},
"performance_metrics": {
"processing_time": "seconds",
"memory_usage": "MB",
"meets_standard": "boolean"
},
"issues": [
{
"severity": "CRITICAL|HIGH|MEDIUM|LOW",
"category": "performance|integration|format|functionality",
"description": "string",
"recommendation": "string"
}
]
}
],
"workflow_tests": [
{
"workflow_name": "complete_requirements_to_schedule",
"description": "Full pipeline from requirements ingestion to schedule generation",
"skill_chain": ["requirements-ingest", "goals-extract", "process-w5h", "domain-extractconcepts", "domain-alignentities", "domain-proposenewconcepts", "diagram-generatecollaboration", "process-scopemin", "plan-derivetasks", "plan-estimateeffort", "plan-buildschedule"],
"test_status": "PASSED|FAILED|WARNING",
"execution_time_total": "seconds",
"data_flow_validation": {
"input_to_output_consistency": "PASSED|FAILED",
"traceability_chain": "PASSED|FAILED",
"intermediate_file_validation": "PASSED|FAILED"
},
"workflow_steps": [
{
"step": 1,
"skill": "requirements-ingest",
"input_source": "sample_requirements.md",
"output_validation": "PASSED|FAILED",
"data_handoff": "PASSED|FAILED"
}
],
"end_to_end_metrics": {
"total_processing_time": "seconds",
"data_consistency_score": "0.0-1.0",
"error_rate": "percentage"
}
}
],
"performance_analysis": {
"overall_performance_rating": "EXCELLENT|GOOD|ACCEPTABLE|POOR",
"skills_meeting_standards": 22,
"skills_exceeding_targets": 5,
"performance_bottlenecks": [
{
"skill": "skill_name",
"bottleneck_type": "processing|memory|io",
"impact": "HIGH|MEDIUM|LOW",
"recommendation": "string"
}
],
"benchmark_results": {
"fastest_skill": {"name": "string", "time": "seconds"},
"slowest_skill": {"name": "string", "time": "seconds"},
"average_processing_time": "seconds",
"memory_efficiency_rating": "EXCELLENT|GOOD|ACCEPTABLE|POOR"
}
},
"consistency_validation": {
"markdown_format_consistency": "PASSED|FAILED",
"json_schema_compliance": "PASSED|FAILED",
"traceability_format_consistency": "PASSED|FAILED",
"file_naming_conventions": "PASSED|FAILED",
"cross_skill_reference_validation": "PASSED|FAILED",
"format_violations": [
{
"skill": "skill_name",
"violation_type": "markdown|json|naming|reference",
"description": "string",
"severity": "HIGH|MEDIUM|LOW"
}
]
},
"vscode_integration": {
"copilot_skill_loading": "PASSED|FAILED",
"skill_discovery": "PASSED|FAILED",
"workspace_file_integration": "PASSED|FAILED",
"error_handling_in_vscode": "PASSED|FAILED",
"user_prompt_responsiveness": "PASSED|FAILED",
"integration_issues": [
{
"component": "string",
"issue": "string",
"impact": "HIGH|MEDIUM|LOW"
}
]
},
"recommendations": {
"critical_fixes": ["string"],
"performance_optimizations": ["string"],
"integration_improvements": ["string"],
"documentation_updates": ["string"],
"user_experience_enhancements": ["string"]
}
}
Test Framework Architecture
Test Suite Categories
1. Individual Skill Testing
- Input Validation: Format checking, schema compliance, error handling
- Processing Logic: Core functionality validation, edge case handling
- Output Validation: JSON structure, markdown format, file generation
- Performance Testing: Execution time, memory usage, resource efficiency
- Documentation Testing: SKILL.md validation, example accuracy
2. Skill Integration Testing
- Data Flow: Input/output compatibility between connected skills
- Chain Validation: Multi-skill workflow execution
- Dependency Resolution: Prerequisites and sequence validation
- Error Propagation: Failure handling across skill boundaries
- State Management: Workspace consistency during skill chains
3. End-to-End Workflow Testing
- Complete Pipelines: Full requirements-to-schedule workflows
- Real Data Testing: Actual project scenarios and edge cases
- Traceability Validation: Reference chain integrity across all skills
- Output Quality: Final deliverable validation and coherence
- Workflow Performance: End-to-end execution timing and efficiency
4. VS Code Integration Testing
- Skill Loading: Copilot skill discovery and initialization
- Workspace Integration: File system interactions and workspace structure
- User Interface: Prompt handling and response formatting
- Error Handling: VS Code error display and user feedback
- Extension Compatibility: Integration with other VS Code extensions
5. Consistency & Standards Testing
- Format Compliance: Markdown, JSON, file naming standards
- Cross-Skill Consistency: Reference formats, terminology usage
- Schema Validation: JSON output structure compliance
- Documentation Standards: SKILL.md format and completeness
- Traceability Format: Reference link consistency across skills
Test Execution Methodology
Phase 1: Environment Setup
-
Test Project Initialization
- Create isolated test project structure
- Prepare diverse sample data sets
- Configure VS Code test environment
- Initialize baseline performance metrics
-
Skill Discovery & Validation
- Enumerate all available skills
- Validate SKILL.md files for completeness
- Check skill dependencies and prerequisites
- Verify VS Code skill loading
Phase 2: Individual Skill Testing
FOR each skill IN skill_directory:
1. Load skill definition and validate SKILL.md
2. Prepare test inputs (valid, invalid, edge cases)
3. Execute skill with monitoring:
- Execution time measurement
- Memory usage tracking
- Error handling validation
4. Validate outputs:
- JSON structure compliance
- Markdown format correctness
- File generation success
- Traceability link accuracy
5. Performance analysis:
- Compare against < 1 minute standard
- Memory efficiency assessment
- Resource usage optimization check
6. Document issues and recommendations
Phase 3: Integration Testing
FOR each skill_chain IN workflow_definitions:
1. Execute skill chain with data flow monitoring
2. Validate intermediate outputs at each step
3. Check data consistency across handoffs
4. Verify traceability chain integrity
5. Test error handling and recovery
6. Measure end-to-end performance
7. Validate final output quality
Phase 4: End-to-End Workflow Validation
FOR each complete_workflow IN test_scenarios:
1. Start with raw requirements input
2. Execute complete skill pipeline
3. Monitor data flow and transformations
4. Validate business logic preservation
5. Check output completeness and accuracy
6. Measure overall workflow performance
7. Assess user experience and usability
Phase 5: VS Code Integration Validation
Integration_Test_Suite:
1. Test skill discovery in VS Code
2. Validate Copilot prompt handling
3. Check workspace file interactions
4. Test error display and user feedback
5. Verify extension compatibility
6. Validate user workflow experience
Performance Standards & Validation
Performance Criteria
- Individual Skills: < 1 minute execution time
- Skill Chains: < 5 minutes for 5-skill sequences
- End-to-End: < 15 minutes for complete requirements-to-schedule
- Memory Usage: < 500MB peak per skill execution
- Error Rate: < 1% for valid inputs
- Format Compliance: 100% for output validation
Quality Assurance Metrics
- Input Validation: Handle malformed inputs gracefully
- Output Consistency: Maintain format standards across all skills
- Traceability: Preserve reference chains with 100% accuracy
- Documentation: Complete and accurate SKILL.md files
- User Experience: Intuitive prompts and clear error messages
Test Data Management
Sample Data Categories
- Requirements Documents: Varied formats, complexity levels, domains
- Domain Content: Technical, business, regulatory contexts
- Project Scenarios: Small, medium, large-scale project simulations
- Edge Cases: Malformed inputs, missing data, boundary conditions
- Integration Scenarios: Multi-skill workflows, complex dependencies
Test Environment Configuration
- Isolated Testing: Separate workspace for test execution
- Baseline Measurement: Clean environment performance baselines
- Resource Monitoring: CPU, memory, disk usage tracking
- Error Logging: Comprehensive error capture and analysis
- Result Archival: Test history and regression analysis
Validation Reporting
Automated Report Generation
- Executive Summary: High-level status and key metrics
- Detailed Analysis: Skill-by-skill performance and issues
- Integration Assessment: Workflow validation results
- Performance Benchmarks: Timing, resource usage, efficiency
- Recommendations: Prioritized improvement suggestions
- Trend Analysis: Performance changes over time
Continuous Integration Support
- Automated Testing: Triggered on skill updates
- Regression Detection: Compare against baseline metrics
- Quality Gates: Pass/fail criteria for releases
- Performance Monitoring: Trend analysis and alerting
- Documentation Updates: Auto-generated test reports
Error Handling & Recovery
Error Classification
- Critical: Complete skill failure, data corruption
- High: Performance degradation, format violations
- Medium: Minor inconsistencies, documentation gaps
- Low: Style issues, optimization opportunities
Recovery Procedures
- Graceful Degradation: Continue testing after individual failures
- Rollback Capability: Restore clean test environment
- Error Isolation: Prevent error propagation across tests
- Root Cause Analysis: Detailed failure investigation
- Remediation Tracking: Issue resolution monitoring
Usage Examples
Complete Integration Test
# Execute full integration test suite
Use the integration-testing skill to run comprehensive validation of all EDPS skills with the Banking Transactions sample project.
Expected outputs:
- Complete test report with pass/fail status for all 22 skills
- Performance analysis showing execution times and resource usage
- Integration validation results for all skill chains
- VS Code compatibility assessment
- Recommendations for any identified issues
Workflow-Specific Testing
# Test specific workflow chain
Run integration testing for the requirements-to-domain-model workflow using the AI Slowcooker project requirements.
Focus areas:
- requirements-ingest → goals-extract → process-w5h → domain-extractconcepts → domain-alignentities → domain-proposenewconcepts → diagram-generatecollaboration
- Data flow consistency across all handoffs
- Traceability preservation through the chain
- Performance within acceptable standards
Performance Benchmarking
# Performance validation focused test
Execute integration testing with emphasis on performance benchmarking for all skills using varied sample data sizes.
Performance criteria:
- Individual skills complete in < 1 minute
- Skill chains complete in < 5 minutes
- Memory usage stays below 500MB per skill
- Generate performance optimization recommendations
Regression Testing
# Validate after skill updates
Run integration testing to validate system stability after updating the goals-extract and plan-derivetasks skills.
Regression checks:
- Ensure updated skills maintain backward compatibility
- Validate integration points with dependent skills
- Confirm performance hasn't degraded
- Check output format consistency is preserved