تشغيل أي مهارة في Manus بنقرة واحدة

auditing-test-quality

النجوم٠

التفرعات٠

آخر تحديث٩ فبراير ٢٠٢٦ في ٠٢:٤٢

Automates test quality assessment, identifies vanity tests, and guides systematic improvement of test suites. Use when reviewing test suites, identifying shallow tests, enforcing behavioral test standards, or when the user mentions test quality, vanity tests, or test effectiveness.

التثبيت

التثبيت باستخدام Codex أو Claude انسخ هذا Prompt والصقه في Codex أو Claude أو مساعد آخر ليراجع صفحة Skill ويثبّتها لك.

تشغيل في Manus

المصدر

kynoptic

kynoptic/wikipedia-reliable-sources

فتح مستودع GitHub عرض مستودعات المنشئ

تنزيل

تشغيل في Manus

المهن ذات الصلةSOC

استنادا إلى تصنيف SOC المهني

محللو ضمان جودة البرمجيات والمختبرونمهن الحاسوب والرياضيات·SOC 15-1253

SKILL.md

readonly

المزيد من هذا المستودع

نفس المستودع

refactoring-deprecated-code

kynoptic/wikipedia-reliable-sources

Systematically scans for deprecated library functions, APIs, and language features, then automatically updates code to use modern alternatives. Handles migration guides, version compatibility, and integration testing. Use when detecting deprecation warnings, updating dependencies, addressing API evolution, or when the user mentions deprecated APIs, legacy code, or migration needs.

2026-02-090

reviewing-content-governance

kynoptic/wikipedia-reliable-sources

Reviews content against HMS IT governance standards including style guide, word list, and platform-specific guidelines. Checks voice and tone, word usage, formatting, structure, and compliance with content type requirements. Use when reviewing emails, knowledge base articles, website pages, or documents for style consistency, or when users mention "content review", "style guide check", "governance compliance", "editorial standards", "content quality", or "editorial review".

2026-02-090

creating-agent-skills

kynoptic/wikipedia-reliable-sources

Creates or improves Agent Skills following official documentation and best practices. Use when creating new skills, improving existing skills, evaluating skill quality, or ensuring skills follow naming conventions, structure requirements, and discovery patterns. Guides through description writing, progressive disclosure, workflows, and testing.

2026-02-090

debugging-runtime-errors

kynoptic/wikipedia-reliable-sources

Analyzes stack traces, error logs, and code context to identify root causes and suggest targeted fixes. Specializes in systematic troubleshooting across multiple languages. Use when encountering runtime errors, production issues, exceptions, crashes, or when the user mentions error messages, stack traces, or debugging needs.

2026-02-090

extension-create

kynoptic/wikipedia-reliable-sources

Creates slash commands, subagents/workflows, or agent skills using the Claude request type framework. Analyzes requirements, determines the optimal extension type (command/subagent/skill), decides project vs personal scope, and generates properly structured files following templates and best practices. Use when creating new commands, workflows, or skills, or when the user mentions "create a command", "new workflow", "build a skill", or requests custom Claude Code extensions.

2026-02-090

managing-github-issue-dependencies

kynoptic/wikipedia-reliable-sources

Manages GitHub issue blocking/blocked-by relationships using the native dependencies feature via GraphQL API. Use when showing that one issue must complete before another can proceed, querying dependency relationships, or preventing circular dependencies.

2026-02-090

name	Auditing Test Quality
description	Automates test quality assessment, identifies vanity tests, and guides systematic improvement of test suites. Use when reviewing test suites, identifying shallow tests, enforcing behavioral test standards, or when the user mentions test quality, vanity tests, or test effectiveness.

Auditing Test Quality

Systematically audit test quality to identify vanity tests and guide improvement.

What you should do

Automate test quality assessment, identify vanity tests, and guide systematic improvement of test suites.

Prerequisites

Test audit script available (usually scripts/audit_tests.py or make audit-tests)
Project follows behavioral test naming conventions
Mock and assertion counting tools configured

Workflow

Step 1: Run test quality audit

# Run the audit script
python scripts/audit_tests.py
# or
make audit-tests
# or
npm run audit:tests

Expected output format:

Test Quality Audit Report
=========================

ISSUES FOUND: 12

tests/unit/test_user_service.py:
  - test_save_user: Non-behavioral test name
  - test_get_user: Too many mocks (7)
  - test_update_user: Poor mock-to-assertion ratio (1/4)

tests/integration/test_api.py:
  - test_endpoint: Non-behavioral test name
  - test_auth: Poor mock-to-assertion ratio (2/6)

SUMMARY:
  Total tests examined: 45
  Tests with issues: 12 (26.7%)
  Issues by type:
    - Non-behavioral names: 8
    - Too many mocks: 2
    - Poor mock ratio: 4
    - Vanity tests: 3

Step 2: Prioritize issues by severity

High priority (fix immediately):

Tests with >5 mocks
Tests with only mock assertions (no behavior verification)
Tests that would pass with broken functionality

Medium priority (fix this sprint):

Tests with poor mock-to-assertion ratio (<3:1)
Tests with implementation-focused names
Tests mocking internal business logic

Low priority (fix when touching code):

Tests with vague names but correct behavior
Tests with minor mock discipline violations

Step 3: Fix high-priority issues

For each high-priority test:

Analyze the test purpose: What behavior should it verify?
Identify true dependencies: What actually needs mocking?
Rewrite the test:
- Use behavioral naming: test_should_X_when_Y
- Reduce mocks to external dependencies only
- Add meaningful assertions about outcomes
- Verify the test fails when functionality breaks

Example transformation:

# BEFORE (vanity test)
def test_save_user():
    mock_db = Mock()
    service = UserService(mock_db)
    service.save(user_data)
    mock_db.save.assert_called_once_with(user_data)

# AFTER (behavioral test)
def test_should_return_user_id_when_saving_valid_user():
    # Use real objects where possible
    service = UserService(in_memory_repository)
    user_data = {"name": "John", "email": "john@example.com"}

    user_id = service.save(user_data)

    # Verify actual behavior
    assert user_id is not None
    assert service.find_by_id(user_id).name == "John"
    assert service.find_by_id(user_id).email == "john@example.com"

Step 4: Fix medium-priority issues

Improving mock-to-assertion ratios:

# BEFORE: Poor ratio (1 assertion, 3 mocks)
def test_user_creation():
    mock_db = Mock()
    mock_email = Mock()
    mock_logger = Mock()
    service = UserService(mock_db, mock_email, mock_logger)

    service.create_user({"email": "test@example.com"})

    mock_db.save.assert_called_once()  # Only 1 assertion

# AFTER: Good ratio (4+ assertions, 3 mocks)
def test_should_create_user_and_send_welcome_email_when_valid_data():
    mock_db = Mock()
    mock_email = Mock()
    mock_logger = Mock()
    service = UserService(mock_db, mock_email, mock_logger)

    result = service.create_user({"email": "test@example.com"})

    # Multiple meaningful assertions
    assert result["status"] == "created"
    assert result["user_id"] is not None
    mock_db.save.assert_called_once()
    mock_email.send_welcome.assert_called_once_with("test@example.com")
    assert mock_logger.info.call_count >= 1

Improving test names:

# BEFORE: Implementation-focused names
def test_uses_bcrypt_hash()
def test_calls_api_endpoint()
def test_saves_to_database()

# AFTER: Behavior-focused names
def test_should_hash_password_when_user_registers()
def test_should_return_user_data_when_valid_token_provided()
def test_should_persist_user_when_registration_completes()

Step 5: Remove vanity tests

For tests that only verify mocks were called:

Check if behavior is covered elsewhere: If yes, delete the vanity test
If not covered: Rewrite to test actual behavior
Consider integration tests: Some behaviors need higher-level testing

Step 6: Re-run audit and verify improvements

# Run audit again to confirm fixes
python scripts/audit_tests.py

# Verify tests still pass
make test

# Check that tests fail when functionality breaks
# (Temporarily break a function and confirm its tests fail)

Step 7: Update audit baseline

# Update the expected audit results
python scripts/audit_tests.py --update-baseline

# Or commit the new audit report as baseline
git add audit_results.json
git commit -m "test: update quality audit baseline after improvements"

Automation setup

Pre-commit hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: test-quality-audit
        name: Test quality audit
        entry: python scripts/audit_tests.py
        language: system
        pass_filenames: false
        files: ^tests/.*\.py$

CI integration

# .github/workflows/test-quality.yml
name: Test Quality
on: [push, pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: 3.9
      - name: Run test quality audit
        run: |
          python scripts/audit_tests.py
          if [ $? -ne 0 ]; then
            echo "Test quality issues found. See report above."
            exit 1
          fi

Makefile target

audit-tests: ## Run test quality audit
	@echo "Running test quality audit..."
	@python scripts/audit_tests.py
	@if [ $$? -eq 0 ]; then \
		echo "✅ All tests pass quality checks"; \
	else \
		echo "❌ Test quality issues found"; \
		exit 1; \
	fi

Audit script customization

Language-specific configurations

Python (pytest):

MOCK_PATTERNS = ['mock', 'Mock', 'patch', 'MagicMock']
ASSERTION_PATTERNS = ['assert', 'assertEqual', 'assertTrue']
BEHAVIORAL_KEYWORDS = ['should', 'when', 'given', 'returns', 'rejects']

JavaScript (Jest):

const MOCK_PATTERNS = ['jest.fn', 'mock', 'spy', 'stub'];
const ASSERTION_PATTERNS = ['expect', 'toBe', 'toEqual', 'assert'];
const BEHAVIORAL_KEYWORDS = ['should', 'when', 'given', 'returns', 'rejects'];

Java (JUnit):

List<String> MOCK_PATTERNS = Arrays.asList("mock", "spy", "when", "verify");
List<String> ASSERTION_PATTERNS = Arrays.asList("assert", "verify", "assertEquals");

Custom quality rules

Projects can extend the audit with domain-specific rules:

Performance test naming patterns
Database test cleanup verification
API test status code checking
Security test coverage requirements

Quality metrics tracking

Track improvements over time:

# Generate quality report
python scripts/audit_tests.py --report=json > quality_report_$(date +%Y%m%d).json

# Track metrics:
# - Percentage of behavioral test names
# - Average mock-to-assertion ratio
# - Number of vanity tests eliminated
# - Test maintenance effort (time to update when requirements change)

Benefits

Prevent test debt: Catch quality issues before they accumulate
Improve debugging: Better tests make failures easier to understand
Reduce maintenance: Behavioral tests are more resilient to refactoring
Increase confidence: Quality tests actually catch regressions
Standardize practices: Consistent quality across team members