surface-benchmarker
| name | surface-benchmarker |
| description | Guidelines for running and interpreting Surface API performance benchmarks. Use when modifying code in src/Hex1b/Surfaces/ to ensure performance is not regressed. |
This skill provides guidelines for AI agents to run and interpret performance benchmarks for the Hex1b Surface API. Run benchmarks whenever you modify code in src/Hex1b/Surfaces/ to ensure performance is not regressed.
| Action | Command |
|---|---|
| Run all Surface benchmarks | dotnet run -c Release --project benchmarks/Hex1b.Benchmarks -- --filter "Surface*" |
| Run specific benchmark | dotnet run -c Release --project benchmarks/Hex1b.Benchmarks -- --filter "*WriteText*" |
| Quick dry-run (sanity check) | dotnet run -c Release --project benchmarks/Hex1b.Benchmarks -- --filter "Surface*" --job dry |
ALWAYS run benchmarks after modifying:
| File/Area | Critical Benchmarks |
|---|---|
| SurfaceCell.cs | All (cells are used everywhere) |
| Surface.cs | CreateSurface_*, WriteText_*, Fill_*, Clone_* |
| CompositeSurface.cs | CompositeSurface_* |
| SurfaceComparer.cs | Compare_*, ToTokens_*, ToAnsiString_* |
| SurfaceDiff.cs | Compare_* |
| ComputeContext.cs | CompositeSurface_* (computed cells use this) |
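The mapping in the table above can be turned into a small lookup so that a changed file selects its critical benchmarks. This is a minimal sketch, not part of the repository: the function name `filters_for` is hypothetical, and the leading `*` on each pattern is an assumption made so the globs match fully qualified benchmark names.

```shell
# filters_for FILE: print the --filter glob patterns for a changed Surface file,
# following the File/Area table. Hypothetical helper, not part of Hex1b.
filters_for() {
  case "$(basename "$1")" in
    Surface.cs)
      echo '*CreateSurface_* *WriteText_* *Fill_* *Clone_*' ;;
    CompositeSurface.cs|ComputeContext.cs)
      echo '*CompositeSurface_*' ;;
    SurfaceComparer.cs)
      echo '*Compare_* *ToTokens_* *ToAnsiString_*' ;;
    SurfaceDiff.cs)
      echo '*Compare_*' ;;
    *)
      # SurfaceCell.cs and anything unrecognized: run the whole suite
      echo 'Surface*' ;;
  esac
}

filters_for src/Hex1b/Surfaces/SurfaceDiff.cs   # prints: *Compare_*
```

When passing the result to `--filter`, disable shell globbing first (`set -f`) so the patterns reach BenchmarkDotNet unexpanded.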
benchmarks/Hex1b.Benchmarks/
├── Hex1b.Benchmarks.csproj # Console app with BenchmarkDotNet
├── Program.cs # Entry point
└── SurfaceBenchmarks.cs # Surface API benchmarks
BenchmarkDotNet is a .NET performance benchmarking library that:
- Reports memory allocations (when [MemoryDiagnoser] is enabled)
┌─────────────────────────────────────────────────────────────────┐
│ 1. GlobalSetup │
│ - Runs once before all benchmarks │
│ - Creates pre-allocated surfaces, diffs, etc. │
│ - Sets up shared test data │
└─────────────────────────────────────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ 2. For each [Benchmark] method: │
│ a. Warmup phase (JIT compilation, cache warming) │
│ b. Pilot phase (determine optimal iteration count) │
│ c. Actual measurements (multiple iterations) │
│ d. Results aggregation │
└─────────────────────────────────────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ 3. Report Generation │
│ - Console output with statistics │
│ - Optional: HTML, CSV, JSON exports │
└─────────────────────────────────────────────────────────────────┘
| Attribute | Purpose |
|---|---|
| [MemoryDiagnoser] | Reports memory allocations (Gen0/1/2 collections, allocated bytes) |
| [Benchmark] | Marks a method as a benchmark |
| [GlobalSetup] | Runs once before all benchmarks to set up shared state |
| [Params(80, 160, 320)] | Parameterizes benchmarks with multiple values |
CreateSurface_*: Measures allocation and initialization cost for surfaces of various sizes.
[Benchmark]
public Surface CreateSurface_Small() => new Surface(80, 24); // Standard terminal
public Surface CreateSurface_Medium() => new Surface(160, 48); // Large terminal
public Surface CreateSurface_Large() => new Surface(320, 96); // Very large
public Surface CreateSurface_4K() => new Surface(480, 135); // 4K equivalent
What to watch for:
WriteText_*: Measures grapheme parsing and wide character handling.
[Benchmark]
public void WriteText_Short(); // "Hello"
public void WriteText_Medium(); // Standard sentence
public void WriteText_Long(); // Long paragraph
public void WriteText_WideChars(); // Chinese characters (2 cells each)
public void WriteText_FillScreen(); // 24 lines of text
What to watch for:
- FillScreen should scale linearly
Fill_*: Measures bulk cell assignment.
[Benchmark]
public void Fill_SmallRect(); // 20x10 region
public void Fill_FullScreen(); // 80x24
public void Fill_LargeScreen(); // 320x96
What to watch for:
Composite_*, CompositeSurface_*: Measures layer merging and transparency resolution.
[Benchmark]
public void Composite_SmallOntoSmall(); // 20x10 onto 80x24
public void Composite_MediumOntoLarge(); // 80x24 onto 320x96
public Surface CompositeSurface_Flatten_3Layers(); // Resolve 3 layers
public SurfaceCell CompositeSurface_GetCell_Resolved(); // Single cell lookup
public void CompositeSurface_GetAllCells(); // 80x24 = 1920 lookups
What to watch for:
- Flatten should be O(layers × cells)
Compare_*: Measures change detection.
[Benchmark]
public SurfaceDiff Compare_FullDiff(); // 100% of cells changed
public SurfaceDiff Compare_SparseDiff(); // ~10% of cells changed
public SurfaceDiff Compare_NoDiff(); // 0% changed
public SurfaceDiff CompareToEmpty(); // Initial render
What to watch for:
- NoDiff should be very fast (early exit when possible)
ToTokens_*, ToAnsiString_*: Measures ANSI escape sequence generation.
[Benchmark]
public IReadOnlyList<AnsiToken> ToTokens_FullDiff(); // Generate tokens
public string ToAnsiString_FullDiff(); // Tokens → string
public string ToAnsiString_SparseDiff(); // Fewer changes
What to watch for:
cd /home/midenn/Code/hex1b
dotnet run -c Release --project benchmarks/Hex1b.Benchmarks -- --filter "Surface*"
Expected runtime: 5-15 minutes depending on hardware.
dotnet run -c Release --project benchmarks/Hex1b.Benchmarks -- --filter "Surface*" --job dry
This runs fewer iterations to quickly verify benchmarks work without waiting for full statistical accuracy.
# Only WriteText benchmarks
dotnet run -c Release --project benchmarks/Hex1b.Benchmarks -- --filter "*WriteText*"
# Only Diff/Token generation
dotnet run -c Release --project benchmarks/Hex1b.Benchmarks -- --filter "*Compare*|*Token*|*Ansi*"
# Single benchmark
dotnet run -c Release --project benchmarks/Hex1b.Benchmarks -- --filter "*Fill_FullScreen*"
| Method | Mean | Error | StdDev | Gen0 | Allocated |
|-----------------------|------------|----------|----------|--------|-----------|
| CreateSurface_Small | 12.34 μs | 0.15 μs | 0.14 μs | 1.2345 | 15.36 KB |
| CreateSurface_Large | 98.76 μs | 1.23 μs | 1.15 μs | 9.8765 | 122.88 KB |
| Column | Meaning |
|---|---|
| Mean | Average execution time |
| Error | Half of 99.9% confidence interval |
| StdDev | Standard deviation (lower = more consistent) |
| Gen0 | Gen0 garbage collections per 1000 operations |
| Allocated | Bytes allocated per operation |
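Because the summary is a pipe-delimited table, the Mean column can be pulled out mechanically. The sketch below is one way to do that with awk. The function name `extract_means` is hypothetical, and it assumes the column order shown in the sample output (Method first, Mean second).

```shell
# extract_means: read a BenchmarkDotNet-style summary table on stdin and print
# "<method> <mean with unit>" for each benchmark row.
# Assumption: Method is the first column and Mean is the second.
extract_means() {
  awk -F'|' 'NF >= 3 {
    method = $2; mean = $3
    gsub(/^[ \t]+|[ \t]+$/, "", method)   # trim the Method cell
    gsub(/^[ \t]+|[ \t]+$/, "", mean)     # trim the Mean cell
    # skip the header row, the |---| separator row, and empty cells
    if (method == "" || method == "Method" || method ~ /^:?-+:?$/) next
    print method, mean
  }'
}

# Usage: pipe a saved copy of the summary table through the function.
printf '%s\n' \
  '| Method | Mean | Error |' \
  '|---|---|---|' \
  '| CreateSurface_Small | 12.34 us | 0.15 us |' | extract_means
# prints: CreateSurface_Small 12.34 us
```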
| Benchmark | Expected Range | Alert If |
|---|---|---|
| CreateSurface_Small (80×24) | 5-20 μs | > 50 μs |
| CreateSurface_4K (480×135) | 50-200 μs | > 500 μs |
| WriteText_Short | 0.5-2 μs | > 5 μs |
| WriteText_WideChars | 1-5 μs | > 10 μs |
| Fill_FullScreen | 2-10 μs | > 25 μs |
| Compare_NoDiff | 5-20 μs | > 50 μs |
| Compare_SparseDiff | 10-50 μs | > 100 μs |
| ToAnsiString_SparseDiff | 20-100 μs | > 250 μs |
Note: These are rough guidelines. Actual values depend on hardware.
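A threshold from the "Alert If" column can be checked with a few lines of shell. This is only a sketch: `check_alert` is a hypothetical helper, and it assumes both values are given in microseconds.

```shell
# check_alert NAME MEAN_US ALERT_US
# Prints OK or ALERT and returns non-zero when the measured mean exceeds the
# alert limit. Both numbers are assumed to be in microseconds.
check_alert() {
  awk -v name="$1" -v mean="$2" -v limit="$3" 'BEGIN {
    if (mean + 0 > limit + 0) {
      printf "ALERT %s: %s us > %s us\n", name, mean, limit
      exit 1
    }
    printf "OK %s: %s us\n", name, mean
  }'
}

check_alert CreateSurface_Small 12.34 50   # prints: OK CreateSurface_Small: 12.34 us
```

Since the expected ranges depend on hardware, limits like these are most useful when calibrated on the machine that runs the benchmarks.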
When making changes:
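# Save "before" results. By default the JSON exporter writes its files under
# BenchmarkDotNet.Artifacts/results/; stdout only carries the console log.
dotnet run -c Release --project benchmarks/Hex1b.Benchmarks -- \
--filter "Surface*" --exporters json
cp -r BenchmarkDotNet.Artifacts/results before-results
# Make changes...
# Run again, then compare the two result sets (manual inspection)
dotnet run -c Release --project benchmarks/Hex1b.Benchmarks -- \
--filter "Surface*" --exporters json
cp -r BenchmarkDotNet.Artifacts/results after-results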
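Once each run has been reduced to one mean per benchmark, the manual inspection can be partly automated. The sketch below assumes two plain-text files with one `<method> <mean>` pair per line, both in the same unit. That is a hypothetical intermediate format, not something BenchmarkDotNet emits directly, and `compare_means` is an illustrative name.

```shell
# compare_means BEFORE_FILE AFTER_FILE
# Each file holds "<method> <mean>" lines (hypothetical format, same unit in both).
# Prints: method, mean before, mean after, and the relative change in percent.
compare_means() {
  awk '
    NR == FNR { before[$1] = $2; next }      # first file: remember baseline means
    ($1 in before) && before[$1] > 0 {       # second file: report matching methods
      printf "%s %s %s %+.1f%%\n", $1, before[$1], $2,
             ($2 - before[$1]) / before[$1] * 100
    }' "$1" "$2"
}
```

A positive percentage means the benchmark became slower after the change. Benchmarks present in only one of the two files are not reported.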
Symptom: High Allocated column, many Gen0 collections.
Common causes:
- Boxing (SurfaceCell in object)
- StringBuilder
Fix strategies:
- Use ArrayPool<T>.Shared for temporary buffers
- Use Span<T> and stack allocation for small buffers
Symptom: Large surfaces are disproportionately slower than small ones.
Common causes:
Fix strategies:
Symptom: Operations are slower than expected for the amount of data.
Common causes:
Fix strategies:
- Use readonly and in parameters to avoid copies
When adding new functionality to the Surface API, add corresponding benchmarks:
[Benchmark]
public void NewFeature_TypicalCase()
{
// Benchmark the common case
_surface.NewMethod(typicalInput);
}
[Benchmark]
public void NewFeature_WorstCase()
{
// Benchmark the worst case for regression detection
_surface.NewMethod(worstCaseInput);
}
- Use [GlobalSetup] for shared state; don't measure setup time
Benchmarks can be integrated into CI to catch regressions:
# .github/workflows/benchmarks.yml (example)
- name: Run benchmarks
run: |
dotnet run -c Release --project benchmarks/Hex1b.Benchmarks -- \
--filter "Surface*" --exporters json
- name: Compare with baseline
run: |
# Compare against stored baseline, fail if regression > 20%
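The comparison step is left as a comment in the workflow above. One possible shape for it is a gate that fails the job when a mean regresses past the threshold. This is a sketch under assumptions: `regression_gate` is a hypothetical helper, the baseline and current means are expected to be extracted beforehand, and both must use the same unit.

```shell
# regression_gate BASELINE_MEAN CURRENT_MEAN [MAX_PCT]
# Prints the relative change and returns 1 when CURRENT_MEAN is more than
# MAX_PCT percent (default 20) slower than BASELINE_MEAN; returns 2 on a
# non-positive baseline.
regression_gate() {
  awk -v base="$1" -v cur="$2" -v max="${3:-20}" 'BEGIN {
    if (base + 0 <= 0) { print "invalid baseline"; exit 2 }
    pct = (cur - base) / base * 100
    printf "%+.1f%%\n", pct
    exit (pct > max ? 1 : 0)
  }'
}

regression_gate 100 125 || echo "regression exceeds 20%"
# prints: +25.0%
#         regression exceeds 20%
```

In a workflow step, a non-zero return from the gate can be used to fail the job.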
Before merging changes to src/Hex1b/Surfaces/: