Diagnose a failing Lua 5.3 official test suite file. Isolates the failure, classifies it, decides whether to fix-now or defer, and produces either a unit test + plan file (for fix-now) or a deferred-with-comment skip (for out-of-scope). Use this when investigating a specific suite file (`literals.lua`, `bitwise.lua`, etc.), when a /next-plan ship reveals downstream suite failures, when /suite-status shows new failures, or when the user asks "why is X failing". This skill does not ship code itself — it produces the artifacts (plan file, unit test, or skip tag) that other plans then ship via ship-a-plan.
Execute one plan file from .agents/plans/ as a single PR against main. Reads the plan, verifies preconditions, implements only what's in scope, runs full validation, opens a PR, and updates the plan file's status. Stops before merging — review is human-gated. Use this skill when the user invokes /next-plan, asks to "ship the next plan", "start plan A1" or similar, or when picking up a specific plan file by id. One plan = one PR = one issue = one merge to main. Do not batch multiple plans into a single PR.
Execute the benchmark harness under benchmarks/ and produce a markdown report comparing this build against the recorded baseline (and optionally against Luerl and PUC-Lua). Flag regressions over a configurable threshold. Use when the user asks for a benchmark run, after any executor or codegen change in Direction B, or when a Direction B plan's verification calls for it. STATUS: stub. Will be filled in when Direction B begins. The benchee harness is already in benchmarks/ (added in PR #143).