
hypothesis-driven-debugging

Stars: 4,306

Forks: 865

Last updated: June 17, 2026 at 09:00

Investigate compiler failures, test errors, or unexpected behavior through systematic minimal reproduction, 3-hypothesis testing, and verification. Always re-run builds and tests after changes.


Source: dotnet/fsharp





Hypothesis-Driven Debugging

A systematic, rigorous approach to debugging failures in the F# compiler codebase.

When to Use This Skill

Use this skill when:

- Investigating test failures (unit tests, integration tests, end-to-end tests)
- Debugging build errors or compilation failures
- Analyzing unexpected runtime behavior
- Troubleshooting performance regressions
- Examining warning/error message issues

Related: for a build, compile, or restore failure, run the binlog-analysis skill first. It fetches the build's MSBuild binary log and analyzes it live via the binlog-mcp MCP server (structured errors plus root-cause diagnosis), which is a fast way to scope the minimal reproduction below.

Core Principles

- Always start with a minimal reproduction
- Form multiple competing hypotheses
- Design verification for each hypothesis
- Document findings rigorously
- Re-run builds and tests after every change

Process

Step 1: Create Minimal Reproduction

Before forming hypotheses, create the smallest possible reproduction:

Extract the failure:

# For test failures - run just the failing test
dotnet test -- --filter-method "*YourTest*"

# For build failures - try to isolate the problematic file
# Create a minimal .fs file that reproduces the issue

Reduce to essentials:
- Remove unrelated code
- Simplify to the core issue
- Verify the minimal case still fails

Document the repro:

## Minimal Reproduction

File: test-case.fs
Command: dotnet test -- --filter-method "*TestName*"
Expected: <expected behavior>
Actual: <actual behavior>
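
While reducing a repro, it helps to re-check after every deletion that the symptom is still present. The sketch below is one possible way to do that in shell. `reproduces` is a hypothetical helper name, not part of the repository's tooling, and it assumes the symptom shows up as a fixed string (for example a diagnostic code) somewhere in the command's output.

```shell
# Hypothetical helper (not part of dotnet/fsharp tooling).
# Usage: reproduces <fixed-string> <command> [args...]
# Succeeds only if the command's combined stdout/stderr contains the string.
# The body runs in a subshell "( ... )" so its variables do not leak.
reproduces() (
    symptom=$1
    shift
    # The command's own exit status is ignored on purpose: a warning-only
    # repro can exit 0 and still show the symptom.
    output=$("$@" 2>&1) || true
    printf '%s\n' "$output" | grep -qF -- "$symptom"
)
```

For example, `reproduces FS3879 dotnet fsc test-case.fs` could be re-run after each simplification; once it starts returning non-zero, the last deletion removed something essential.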

Step 2: Form 3 Hypotheses

Always form at least 3 competing hypotheses about the root cause:

## Hypothesis 1: [Brief description]
**Theory**: The failure occurs because...
**How to verify**: Run/change X and observe Y
**Verification result**: [To be filled]
**Implications**: If true, this means...

## Hypothesis 2: [Brief description]
**Theory**: The failure occurs because...
**How to verify**: Add instrumentation/logging at point Z
**Verification result**: [To be filled]
**Implications**: If true, this means...

## Hypothesis 3: [Brief description]
**Theory**: The failure occurs because...
**How to verify**: Check assumption A by running test B
**Verification result**: [To be filled]
**Implications**: If true, this means...
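
The "at least 3" rule is easy to skip under time pressure, so it can be checked mechanically. The following is a minimal sketch, assuming hypotheses are written as `## Hypothesis N:` or `### Hypothesis N:` headings as in the templates in this document. Both function names are hypothetical.

```shell
# Hypothetical helper: count "## Hypothesis N:" / "### Hypothesis N:" headings.
count_hypotheses() {
    if [ -f "$1" ]; then
        # grep -c prints the match count; it exits 1 on zero matches, so ignore that.
        grep -cE '^#{2,3} Hypothesis [0-9]+:' "$1" || true
    else
        echo 0
    fi
}

# Fails (exit 1) when the file lists fewer than three hypotheses.
# The body runs in a subshell "( ... )" so its variables do not leak.
require_three_hypotheses() (
    file=${1:-HYPOTHESIS.md}
    count=$(count_hypotheses "$file")
    if [ "$count" -lt 3 ]; then
        echo "$file has $count hypotheses; at least 3 are required" >&2
        exit 1
    fi
)
```

Running `require_three_hypotheses` with no argument checks `HYPOTHESIS.md` in the current directory, which makes it usable as a gate before any verification work begins.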

Step 3: Verification Methods

For each hypothesis, use one or more verification methods:

Code Instrumentation

// Add temporary debugging output
printfn "DEBUG: Value at checkpoint: %A" someValue
printfn "DEBUG: Entering function X with args: %A %A" arg1 arg2

Minimal Test Cases

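// Create a focused test to verify specific behavior.
// functionUnderTest, input, and expectedValue are placeholders.
// [<Test>] is NUnit and `should equal` is FsUnit; in an xUnit project use [<Fact>] and Assert.Equal.
[<Test>]
let ``Hypothesis 1 verification test`` () =
    let result = functionUnderTest input
    result |> should equal expectedValue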

Build with Different Flags

# Try different configurations
./build.sh -c Debug
./build.sh -c Release

# Compare outputs
diff debug-output.log release-output.log
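
A raw `diff` of two full build logs tends to be noisy, since paths and timings differ between configurations. One option is to compare only which diagnostics each build emitted. The sketch below assumes diagnostics appear in the log as `error FSnnnn` or `warning FSnnnn`; `diagnostic_summary` and the summary file names are hypothetical.

```shell
# Hypothetical helper: summarize F# diagnostics in a build log as
# "<severity> <code> <count>" lines, e.g. "warning FS3879 2".
# Assumes diagnostics appear as "error FSnnnn" or "warning FSnnnn".
diagnostic_summary() {
    grep -oE '(error|warning) FS[0-9]+' "$1" | sort | uniq -c | awk '{ print $2, $3, $1 }'
}

# Possible usage with the log files from the step above:
#   diagnostic_summary debug-output.log   >debug-summary.txt
#   diagnostic_summary release-output.log >release-summary.txt
#   diff debug-summary.txt release-summary.txt
```

A diagnostic that appears in only one summary is a candidate for a configuration-specific hypothesis.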

Targeted Logging

# Enable verbose logging for specific component
export FSHARP_COMPILER_VERBOSE=1
dotnet build

Step 4: Document Findings

Maintain a HYPOTHESIS.md file in the working directory:

# Hypothesis Investigation

## Issue Summary
Brief description of the failure/bug being investigated.

## Minimal Reproduction
[Code/commands to reproduce]

## Hypotheses

### Hypothesis 1: Token position tracking issue
**Theory**: The warning check compares line numbers but lastNonCommentTokenLine is not being updated correctly.
**How to verify**: Add printfn debugging in LexFilter.fs to log every token and its line number.
**Verification result**: ✅ CONFIRMED - Logging showed LBRACE tokens were updating the tracking when they shouldn't.
**Implications**: Need to exclude LBRACE and potentially other structural tokens from tracking.

### Hypothesis 2: Lexer pattern matching order
**Theory**: The /// pattern might be matched after other patterns, losing context.
**How to verify**: Check lex.fsl pattern order and add logging in the /// rule.
**Verification result**: ❌ REFUTED - Pattern order is correct; /// is matched specifically.
**Implications**: Issue is not in the lexer pattern matching.

### Hypothesis 3: Test expectations wrong
**Theory**: The test expectations might not match actual compiler behavior.
**How to verify**: Manually compile test code and check actual warning positions.
**Verification result**: ⚠️ PARTIAL - Some tests had wrong expectations, but underlying issue still exists.
**Implications**: Fixed test expectations, but still need to address token tracking.

## Resolution
[Final solution and verification]

## Lessons Learned
- What worked well
- What to do differently next time
- Patterns to remember
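
Before writing the Resolution section, it is worth confirming that no hypothesis was left unverified. The sketch below lists every hypothesis whose result line still carries the `[To be filled]` placeholder from the Step 2 template. `pending_hypotheses` is a hypothetical name, and the script assumes the heading and `**Verification result**` line formats shown in this document.

```shell
# Hypothetical helper: print the heading of every hypothesis whose
# "**Verification result**" line still reads "[To be filled]".
# Defaults to HYPOTHESIS.md in the current directory.
pending_hypotheses() {
    awk '
        # Remember the most recent hypothesis heading.
        /^#+ Hypothesis [0-9]+:/ { heading = $0 }
        # Report it when its result line is still the template placeholder.
        /^\*\*Verification result\*\*:/ && /\[To be filled\]/ { print heading }
    ' "${1:-HYPOTHESIS.md}"
}
```

Empty output means every hypothesis has a recorded result; any printed heading still needs a verification run.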

Step 5: Critical - Always Re-run Tests

This step is mandatory after implementing any fix:

Build from scratch:

./build.sh -c Release
# Record: Time, exit code, number of errors

Run affected tests:

# For targeted testing
dotnet test -- --filter-class "*AffectedTestSuite*"

# Record: Passed, Failed, Skipped, Time

Verify the fix:
- Run the minimal reproduction - confirm it passes
- Run related tests - confirm no regressions
- Build the full project - confirm no new errors

Document results:

## Verification Results

Build:
- Command: ./build.sh -c Release
- Time: 4m 23s
- Errors: 0

Tests:
- Command: dotnet test -- --filter-class "*XmlDocTests*"
- Total: 61
- Passed: 56
- Failed: 0
- Skipped: 5
- Time: 2.1s

Minimal Repro:
- Status: ✅ PASSING
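
Copying commands, timings, and exit codes by hand is error-prone, so the recording can be scripted. The following is a minimal sketch: `run_and_record` and the results file name are hypothetical, elapsed time is measured in whole seconds, and `date +%s` is assumed to be available (it is on GNU, BSD, and BusyBox systems).

```shell
# Hypothetical helper: run a command, then append its command line,
# wall-clock time (whole seconds), and exit code to a results file.
# Usage: run_and_record <results-file> <command> [args...]
# The body runs in a subshell "( ... )" so its variables do not leak.
run_and_record() (
    results_file=$1
    shift
    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)
    {
        printf '%s\n' "- Command: $*"
        printf '%s\n' "- Time: $((end - start))s"
        printf '%s\n' "- Exit code: $status"
    } >>"$results_file"
    # Propagate the wrapped command's status so failures still stop a script.
    exit "$status"
)
```

For instance, `run_and_record verification.md ./build.sh -c Release` would append three list lines to `verification.md` that can be pasted into the Verification Results section. Test counts (passed, failed, skipped) still have to be taken from the test runner's own output.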

Example Workflow

# 1. Observe failure
dotnet test -- --filter-class "*XmlDocTests*"
# Result: 15 tests failing

# 2. Create minimal repro
cat > test-case.fs <<EOF
type R = { /// field doc
    Field: int
}
EOF
dotnet fsc test-case.fs
# Observe: Warning FS3879 incorrectly triggered

# 3. Form hypotheses (in HYPOTHESIS.md)
# - H1: LBRACE token incorrectly tracked
# - H2: Lexer pattern issue
# - H3: Test expectations wrong

# 4. Verify H1
# Add: printfn "DEBUG: Token %A at line %d" token lineNum
./build.sh -c Release && dotnet test ...
# Result: Confirms LBRACE is being tracked

# 5. Implement fix
# Exclude LBRACE from tracking in LexFilter.fs

# 6. CRITICAL: Re-run everything
./build.sh -c Release
# 4m 44.9s, 0 errors

dotnet test -- --filter-class "*XmlDocTests*"
# 61 total, 56 passed, 0 failed, 5 skipped, 2s

# 7. Verify minimal repro
dotnet fsc test-case.fs
# No warning - ✅ FIXED

# 8. Update HYPOTHESIS.md with results
# 9. Commit with evidence

Anti-Patterns to Avoid

❌ Don't:

- Skip the minimal reproduction
- Form only one hypothesis
- Make changes without verification
- Forget to re-run tests after fixes
- Claim "fixed" without build evidence

✅ Do:

- Start with the smallest possible repro
- Consider multiple explanations
- Verify each hypothesis systematically
- Always re-run build and tests
- Document commands, timings, and results

Integration with Development Workflow

After using this skill:

- Clean up temporary debugging code
- Remove or archive HYPOTHESIS.md
- Update documentation with lessons learned
- Add regression tests if appropriate
- Consider whether findings reveal deeper issues
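
Leftover instrumentation is easy to miss during cleanup. The sketch below searches `.fs` files for the `printfn "DEBUG:` prefix used in the instrumentation examples in this document. `find_debug_leftovers` is a hypothetical name, and the check only catches debug output that follows that exact prefix convention.

```shell
# Hypothetical helper: list leftover `printfn "DEBUG:` lines in .fs files
# under a directory (default: current directory).
# Prints "path:line:text" for each hit and exits 1 if any are found.
# The body runs in a subshell "( ... )" so its variables do not leak.
find_debug_leftovers() (
    # /dev/null forces grep to print file names even for a single file.
    matches=$(find "${1:-.}" -type f -name '*.fs' \
        -exec grep -nF 'printfn "DEBUG:' /dev/null {} +) || true
    if [ -n "$matches" ]; then
        printf '%s\n' "$matches"
        exit 1
    fi
)
```

Because it exits non-zero when something is found, a call such as `find_debug_leftovers src` can serve as a quick gate before committing.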

References

- Software Debugging Techniques
- Scientific Method Applied to Software
- F# Compiler build guide: docs/DEVGUIDE.md
- F# Compiler testing guide: docs/testing.md