Run any Skill in Manus with one click

$pwd:

triage-suite-failure

Name: Triage Suite Failure
Author: tv-labs

// Diagnose a failing Lua 5.3 official test suite file. Isolates the failure, classifies it, decides whether to fix-now or defer, and produces either a unit test + plan file (for fix-now) or a deferred-with-comment skip (for out-of-scope). Use this when investigating a specific suite file (`literals.lua`, `bitwise.lua`, etc.), when a /next-plan ship reveals downstream suite failures, when /suite-status shows new failures, or when the user asks "why is X failing". This skill does not ship code itself — it produces the artifacts (plan file, unit test, or skip tag) that other plans then ship via ship-a-plan.

Run Skill in Manus

$ git log --oneline --stat

stars:167

forks:8

updated:May 27, 2026 at 23:26

SKILL.md

readonly

related-skills.json

same repository

ship-a-plan.md

from "tv-labs/lua"

Execute one plan file from .agents/plans/ as a single PR against main. Reads the plan, verifies preconditions, implements only what's in scope, runs full validation, opens a PR, and updates the plan file's status. Stops before merging — review is human-gated. Use this skill when the user invokes /next-plan, asks to "ship the next plan", "start plan A1" or similar, or when picking up a specific plan file by id. One plan = one PR = one issue = one merge to main. Do not batch multiple plans into a single PR.

2026-05-07167

run-bench.md

from "tv-labs/lua"

Execute the benchmark harness under benchmarks/ and produce a markdown report comparing this build against the recorded baseline (and optionally against Luerl and PUC-Lua). Flag regressions over a configurable threshold. Use when the user asks for a benchmark run, after any executor or codegen change in Direction B, or when a Direction B plan's verification calls for it. STATUS: stub. Will be filled in when Direction B begins. The benchee harness is already in benchmarks/ (added in PR #143).

2026-05-04167

package.json

"author": "tv-labs"

"repository": "tv-labs/lua"

View GitHub Repository View Creator Repositories

$ install --global

$ download --local

Run Skill in Manus

$ useful --forSOC

Software Quality Assurance Analysts and TestersComputer and Mathematical Occupations15-1253L4

name

triage-suite-failure

description

Diagnose a failing Lua 5.3 official test suite file. Isolates the failure, classifies it, decides whether to fix-now or defer, and produces either a unit test + plan file (for fix-now) or a deferred-with-comment skip (for out-of-scope). Use this when investigating a specific suite file (`literals.lua`, `bitwise.lua`, etc.), when a /next-plan ship reveals downstream suite failures, when /suite-status shows new failures, or when the user asks "why is X failing". This skill does not ship code itself — it produces the artifacts (plan file, unit test, or skip tag) that other plans then ship via ship-a-plan.

triage-suite-failure

A reproducible workflow for understanding why a Lua 5.3 suite test fails and turning that understanding into a shippable plan.

What this skill is for

Reproducing a single suite failure deterministically.
Classifying it (lexer / parser / codegen / executor / stdlib / unimplemented).
Reducing it to a 5–20 line repro that lives in test/lua/vm/.
Producing one of three outputs:
1. A new plan file under .agents/plans/ if the fix is shippable.
2. A line-range skip entry in test/lua53_skips.exs with a one-line reason and (optionally) a tracking issue. See §6.C.
3. An update to an existing plan if this failure is part of an ongoing fix.

What this skill is NOT for

Implementing the fix. (That's ship-a-plan.)
Reasoning about whole-suite trends. (That's /suite-status.)
Designing new test infrastructure.

Workflow

1. Reproduce the failure standalone

The suite test runner has overhead and shared state. Always reproduce against a freshly-constructed Lua value, with sandbox config matching the suite runner.

Use this exact pattern (write to /tmp/triage_<file>.exs):

path = "test/lua53_tests/<file>.lua"
source = File.read!(path)
lua = Lua.new(exclude: [[:package], [:require], [:load], [:loadstring], [:loadfile], [:dofile]])
lua = Lua.set_lua_paths(lua, [Path.join(Path.dirname(path), "?.lua")])

try do
  {_, _} = Lua.eval!(lua, source)
  IO.puts(:pass)
rescue
  e ->
    msg = case e do
      %{value: v} when is_binary(v) -> v
      %{value: v} -> inspect(v)
      _ -> Exception.message(e)
    end
    IO.puts(String.slice(msg, 0, 2000))
end

Run with mix run /tmp/triage_<file>.exs. Note: do not use mix run -e for this — file scripts give clean stack traces.

2. Find the failing line

Suite files are large (hundreds to thousands of lines). To pinpoint the failure:

If the error includes a line number, jump straight to step 3.
Otherwise, instrument with line-print probes. Walk the file in halves: insert print("checkpoint A") halfway down, run; if it prints, the failure is below; if not, it's above. Bisect to a single statement.

For long-running tests (10+ seconds), run with a timeout via Task.yield and look at where stdout cut off — that's near the failure.

3. Reduce to a minimal repro

Take the failing statement and the minimum surrounding context. Goal: 5–20 lines that reproduce the same failure.

Remove all assert calls except the failing one.
Inline any helpers it depends on.
Replace data with the smallest value that triggers the bug.

4. Add the repro as a unit test

Path: test/lua/vm/<area>_<bug>_test.exs (or extend an existing file in the right area). Mark as a regression test:

# Regression test for Lua 5.3 suite: <file>.lua line N
# When fixed, this file should also pass: <related suite files>
test "table index returns nil for missing keys" do
  code = """
  local t = {}
  return t[5]
  """
  {[result], _} = Lua.eval!(code)
  assert result == nil
end

If the test passes (i.e. the bug doesn't reproduce in isolation), the suite failure is from interaction effects — investigate further or escalate.

5. Classify the failure

Use this taxonomy. Pick the most specific one.

Category	Symptom	Where to look
Lexer	`Failed to compile`, parse error, "no case clause matching {:comment, ...}"	`lib/lua/lexer.ex`
Parser	`Failed to compile`, "Expression statement must be a function call", "expected ..."	`lib/lua/parser.ex`
Codegen	Compiles but wrong instructions emitted (inspect with `Lua.Compiler.compile/1`)	`lib/lua/compiler/codegen.ex`
Executor	Runs but wrong value, `key N not found`, type errors on valid code	`lib/lua/vm/executor.ex`
Stdlib	"function 'X' not implemented", string/math/table edge case	`lib/lua/vm/stdlib*`
Unimplemented	Whole feature missing (coroutines, full debug, files)	N/A — defer
Semantic	Implementation correct per spec, but suite expects a Lua-5.3-specific behavior we don't match	depends — check the Lua 5.3 reference manual

6. Decide: fix-now, plan-it, or defer

Three outcomes:

A. Trivial fix in scope

If the fix is < 50 lines in one file and isolated, AND it's part of an in-progress plan: include it in that plan's branch with a note in ## Discoveries.

B. Multi-file impact OR new concern

Write a new plan file under .agents/plans/<id>-<slug>.md. Use the template in .agents/plans/README.md. Frontmatter:

id: next available in the current direction (A or B).
unlocks: list of suite files this plan should fix. Run them all beforehand to confirm; list only ones you expect to flip.
status: ready if independent, blocked if waiting on another plan.

Required sections:

Goal: one sentence.
Out of scope: the things this plan is NOT doing.
Success criteria: include the unit test path, the suite count delta, and the unit-test stability requirement.
Implementation notes: hypothesis, files to touch, what to look for.
Verification: mix test, mix test --only lua53.
Risks: known unknowns.

Then open a corresponding GitHub issue and link them via issue: frontmatter field.

C. Defer with a line-range skip

The bug is real but fixing it is out of scope for the current cycle (coroutines, GC, full goto CFG, etc.). Don't tag the whole suite file @tag :skip — that throws away signal from every other assertion in the file. Instead, add a line-range entry to test/lua53_skips.exs:

%{
  "pm.lua" => [
    %{lines: 245..289, category: :stdlib, reason: "string.gmatch frontier %f not implemented", issue: 312}
  ]
}

Entry fields:

:lines — a Range covering the failing statement and any dependent state. Use :all only for whole-file deferrals awaiting initial triage (existing seed entries do this; you should replace one with a real range).
:category — pick one from the table in §5 of this skill (:lexer | :parser | :codegen | :executor | :stdlib | :unimplemented | :semantic).
:reason — one line, stands alone. Do not reference plan ids (plans are ephemeral per CLAUDE.md repo conventions).
:issue — optional GitHub issue number, or nil. Issues are durable; use them when there is a tracked fix.

Pick the smallest range that contains the failure. The whole failing assert(...) line if it stands alone; the enclosing do ... end block if assertions depend on earlier locals. Then scan the range for other assert(...) calls — if there are more, either name them in :reason or split into multiple entries. Broad ranges silently hide bugs.

Edge cases that make a range fail to parse:

Multi-line strings [[...]] and long comments --[[...]] — both delimiters must be inside or outside the range; never split.
local x = ... referenced later in the file — skipping the declaration breaks the downstream x; widen the range to cover those uses too.
if/then/.../end — then block is skippable but if, then, and end must survive. (Lua allows empty then.)

If your range crosses a syntactic boundary, the suite test fails with a parse error pointing near the boundary; widen and re-run.

Re-run the file after adding the range:

mix test test/lua53_suite_test.exs --only lua53

Confirm: the file either passes (test name now reads pm.lua (47 lines skipped, 3 ranges)) or progresses to a new, well-located failure. If a parse error appeared, the range crossed a boundary — widen.

Leave the unit test from step 4 in place — even if the suite range is skipped, the unit test documents the bug and will catch a regression if a future change accidentally fixes it.

7. Output a triage report

Whatever the outcome, end with a short summary the user can paste into a PR or issue:

Triage: <suite-file>

Symptom: <one line>
Failing line: <file>.lua:<N>
Repro: test/lua/vm/<area>_test.exs:<N>
Classification: <category>
Decision: fix-in-plan | defer | follow-up

If fix-in-plan: created .agents/plans/<id>-<slug>.md
If defer: added range <N..M> to test/lua53_skips.exs (issue #<N>)
Lines skipped: <N..M>

Conventions

Always leave a unit test, even when deferring. The unit test outlives the triage session.
Always cite the suite file and line in test comments so future readers can trace back to the official test.
Never silence a failing test without a comment naming what would unblock it.
When in doubt about the Lua 5.3 spec, check the reference manual: https://www.lua.org/manual/5.3/

When you're not sure

Ask the user. Specifically: when classifying a "semantic" failure (where both interpretations are defensible), surface the Lua 5.3 manual section and let the user decide whether to match it.

triage-suite-failure

More from this repository

More from this repository

triage-suite-failure

What this skill is for

What this skill is NOT for

Workflow

1. Reproduce the failure standalone

2. Find the failing line

3. Reduce to a minimal repro

4. Add the repro as a unit test

5. Classify the failure

6. Decide: fix-now, plan-it, or defer

A. Trivial fix in scope

B. Multi-file impact OR new concern

C. Defer with a line-range skip

7. Output a triage report

Conventions

When you're not sure

triage-suite-failure

What this skill is for

What this skill is NOT for

Workflow

1. Reproduce the failure standalone

2. Find the failing line

3. Reduce to a minimal repro

4. Add the repro as a unit test

5. Classify the failure

6. Decide: fix-now, plan-it, or defer

A. Trivial fix in scope

B. Multi-file impact OR new concern

C. Defer with a line-range skip

7. Output a triage report

Conventions

When you're not sure