en un clic
lezer-grammar
// Write, debug, and integrate Lezer grammars for the Liftosaur project. Use when creating or modifying .grammar files, evaluators, syntax highlighting, or CodeMirror integration.
// Write, debug, and integrate Lezer grammars for the Liftosaur project. Use when creating or modifying .grammar files, evaluators, syntax highlighting, or CodeMirror integration.
| name | lezer-grammar |
| description | Write, debug, and integrate Lezer grammars for the Liftosaur project. Use when creating or modifying .grammar files, evaluators, syntax highlighting, or CodeMirror integration. |
| disable-model-invocation | true |
| argument-hint | ["grammar description or task"] |
Write, fix, or integrate a Lezer grammar for: $ARGUMENTS
Before writing any grammar, read the relevant existing reference files:
liftoscript.grammar — expression-oriented grammar (operators, precedence, control flow)src/pages/planner/plannerExercise.grammar — line-oriented grammar (exercises, sets, mixed parsing)src/liftohistory/liftohistory.grammar — record-oriented grammar (dates, metadata, block structure)A .grammar file consists of:
@top directive — designates the root rule@tokens block — defines lexical tokens (regular language, no non-tail recursion)@skip directive — defines what to ignore between tokens@precedence directive (at rule level) — resolves shift/reduce conflicts@precedence (inside @tokens) — resolves token overlap conflicts@top Program { expression* }
@precedence { times @left, plus @left }
expression { NumberExpression | BinaryExpression }
BinaryExpression {
expression !plus "+" expression |
expression !times "*" expression
}
NumberExpression { Number }
@skip { space }
@tokens {
space { @whitespace+ }
Number { @digit+ }
}
ExerciseSet, Number) produce visible nodes in the syntax treeexpression, emptyLine) are anonymous/transparent — their content is inlined into the parent node, not appearing as a named node@tokens)Rules define tree structure. They can reference other rules AND tokens.
Operators:
* — zero or more+ — one or more? — optional| — alternation (lowest precedence)() — groupingExample:
ExerciseSet { (Rpe | Timer | SetPart | Weight)+ }
ExerciseLine { ExerciseName (SectionSeparator ExerciseSection)* linebreak }
@tokens)Tokens define lexical atoms. They can ONLY reference other tokens in the same @tokens block.
Character classes:
$[a-zA-Z] — include range$[.,] — include specific chars![/\n] — exclude (match anything except)_ — wildcard, any single characterBuilt-in classes:
@digit = $[0-9]@asciiLetter = $[a-zA-Z]@asciiLowercase = $[a-z]@asciiUppercase = $[A-Z]@whitespace = Unicode whitespace@eof = end of inputStrings: "keyword" matches literal text. Can be used in both rules and @tokens.
Examples:
@tokens {
Keyword { $[a-zA-Z_] $[0-9a-zA-Z_]* }
Int { @digit+ }
Float { @digit+ "." @digit+ }
Weight { ("+" | "-")? (Float | Int) ("lb" | "kg") }
NonSeparator { ![/{}() \t\n\r]+ }
LineComment { "//" ![\n]* linebreakOrEof }
QuotedString { '"' !["\n]* '"' }
}
@tokens)Resolves conflicts when multiple tokens match at the same position. Earlier items win:
@tokens {
@precedence { Weight, Percentage, Float, Int }
@precedence { LineComment, SectionSeparator }
}
Tokens are greedy (longest match wins). The DFA tracks all possible token states simultaneously and falls back to the last accepting state if a longer match fails.
Token precedence only matters when tokens can match at the same grammar position. If token A only appears in rule X and token B only in rule Y, no precedence is needed even if they match the same text — Lezer's tokenizer is context-aware.
Resolves shift/reduce conflicts in the grammar (e.g., operator precedence):
@precedence {
unary
fncall @left
times @left
plus @left
cmp @left
ternary @right
assign @right
}
Use !name markers in rules to reference precedence levels:
BinaryExpression {
expression !plus Plus expression |
expression !times Times expression
}
Associativity: @left (left-associative), @right (right-associative), or omitted.
@cut modifier: @precedence { keyword @cut } discards alternative parse branches once a !keyword marker is passed — improves performance and error messages.
@skipDefines tokens to silently consume between all other tokens:
@skip { space | LineComment | ";" | "{~" | "~}" }
Context-specific skip — disable skipping inside certain rules:
@skip {} {
String { stringOpen stringContent* stringClose }
}
@specialize and @extendHandle keywords that overlap with identifiers:
@specialize<Keyword, "if"> — when a Keyword token matches "if", produce a specialized "if" token instead@extend<Keyword, "none"> — like specialize but allows GLR (both interpretations possible)Used in rules:
IfExpression {
@specialize<Keyword, "if"> ParenthesisExpression BlockExpression
(@specialize<Keyword, "else"> BlockExpression)?
}
None { @specialize<Keyword, "none"> }
Parameterized rules for reuse:
commaSep<content> { "" | content ("," content)* }
FunctionExpression { FunctionName "(" commaSep<expression> ")" }
When the grammar is genuinely ambiguous, use ~name markers to split into parallel parse threads:
expression { GoodValue ~choice | BadValue ~choice }
Dynamic precedence tips the scale: A[@dynamicPrecedence=1] { ... } (range -10 to 10).
For patterns regular token rules cannot express (indentation, automatic semicolons):
@external tokens myTokenizer from "./tokens" { token1, token2 }
Implemented as ExternalTokenizer instances:
import { ExternalTokenizer } from "@lezer/lr";
import { token1 } from "./parser.terms";
export const myTokenizer = new ExternalTokenizer(
(input, stack) => {
if (someCondition) input.acceptToken(token1);
},
{ contextual: false, fallback: false, extend: false }
);
Maintain parse state accessible to external tokenizers:
@context trackContext from "./context.js"
Implemented as a ContextTracker:
import { ContextTracker } from "@lezer/lr";
export const trackContext = new ContextTracker({
start: initialValue,
shift(context, term, stack, input) { return newContext; },
reduce(context, term, stack, input) { return newContext; },
hash(context) { return hashNumber; },
});
Context-specific tokenization (e.g., inside strings or template literals):
@local tokens {
stringEnd[@name='"'] { '"' }
StringEscape { "\\" _ }
@else stringContent
}
The @else fallback captures everything not matched by the local tokens.
Conditional language variants:
@dialects { strictMode }
StrictKeyword[@dialect=strictMode] { "strict" }
Enable via parser.configure({ dialect: "strictMode" }).
Attach metadata:
StartTag[closedBy="EndTag"] { "<" }
Pseudo-props: @name (override node name), @detectDelim, @isGroup.
npm run grammar:planner # plannerExercise.grammar -> plannerExerciseParser.ts
npm run grammar:liftohistory # liftohistory.grammar -> liftohistoryParser.ts
These run:
lezer-generator --output <output.ts> <input.grammar>
The --output flag with a .ts extension produces TypeScript. The generator also creates a .terms.ts file with numeric term IDs.
Each generated parser file exports a parser via LRParser.deserialize():
import { LRParser } from "@lezer/lr";
export const parser = LRParser.deserialize({
version: 14,
states: "...", // serialized state machine
stateData: "...",
goto: "...",
nodeNames: "⚠ LineComment Program BinaryExpression ...",
maxTerm: 59,
skippedNodes: [0, 1],
tokenData: "...",
topRules: { Program: [0, 2] },
});
The .terms.ts file exports numeric constants for each node type:
export const Program = 1;
export const WorkoutRecord = 2;
export const Keyword = 6;
@lezer/generator (v1.2.3) — build-time grammar compiler (CLI + API)@lezer/lr (v1.3.7) — runtime LR parser@lezer/common — shared types: Tree, TreeCursor, SyntaxNode, NodeType@lezer/highlight (v1.1.6) — syntax highlighting tags@codemirror/language (v6.10.6) — CodeMirror language integrationimport { parser as liftoscriptParser } from "./liftoscript";
const tree = liftoscriptParser.parse(script);
const topNode = tree.topNode; // SyntaxNode for the root
Pattern 1: Linear cursor traversal — visit every node in the tree:
const cursor = tree.cursor();
do {
if (cursor.node.type.name === NodeName.StateVariable) {
// process node
}
} while (cursor.next());
Used for: scanning all nodes of a type (state variables, error detection, weight unit conversion).
Pattern 2: Child iteration — get all direct children of a node:
function getChildren(node: SyntaxNode): SyntaxNode[] {
const cur = node.cursor();
const result: SyntaxNode[] = [];
if (!cur.firstChild()) return result;
do {
result.push(cur.node);
} while (cur.nextSibling());
return result;
}
This helper is duplicated across all three evaluators. Use it to iterate children.
Pattern 3: Named child lookup — find specific children by type:
const numberNode = expr.getChild(NodeName.NumberExpression); // first match
const fnArgs = node.getChildren(PlannerNodeName.FunctionArgument); // all matches
Used for: extracting structured data from known node shapes.
Pattern 4: Type checking:
// By name (string comparison)
if (cursor.node.type.name === NodeName.BuiltinFunctionExpression) { ... }
// By numeric ID (faster, use .terms.ts imports)
if (child.type.id === WorkoutRecord) { ... }
// Error detection
if (cursor.node.type.isError) {
this.error("Syntax error", cursor.node);
}
Extract source text from node positions:
function getValue(node: SyntaxNode, script: string): string {
return script.slice(node.from, node.to);
}
Node positions (from, to) are byte offsets into the original string.
All three evaluators follow this pattern:
export class MySyntaxError extends SyntaxError {
public readonly line: number;
public readonly offset: number;
public readonly from: number;
public readonly to: number;
constructor(message: string, line: number, offset: number, from: number, to: number) {
super(message);
this.line = line;
this.offset = offset;
this.from = from;
this.to = to;
}
}
Line/offset are computed from node.from by splitting the source on newlines:
private getLineAndOffset(node: SyntaxNode): [number, number] {
const linesLengths = this.script.split("\n").map((l) => l.length + 1);
let offset = 0;
for (let i = 0; i < linesLengths.length; i++) {
if (node.from >= offset && node.from < offset + linesLengths[i]) {
return [i + 1, node.from - offset];
}
offset += linesLengths[i];
}
return [linesLengths.length, linesLengths[linesLengths.length - 1]];
}
Use cursor traversal with position tracking to rewrite parts of the source:
public switchWeightsToUnit(programNode: SyntaxNode, toUnit: IUnit): string {
const cursor = programNode.cursor();
let script = this.script;
let shift = 0;
do {
if (cursor.node.type.name === NodeName.WeightExpression) {
const from = cursor.node.from;
const to = cursor.node.to;
const newStr = convertWeight(cursor.node, toUnit);
const oldStr = script.slice(from + shift, to + shift);
script = script.substring(0, from + shift) + newStr + script.substring(to + shift);
shift += newStr.length - oldStr.length;
}
} while (cursor.next());
return script;
}
Track shift to account for length changes from earlier replacements.
Each grammar needs a manually maintained enum mapping node names:
Liftoscript (src/liftoscriptEvaluator.ts):
export enum NodeName {
Program = "Program",
BinaryExpression = "BinaryExpression",
NumberExpression = "NumberExpression",
WeightExpression = "WeightExpression",
StateVariable = "StateVariable",
BuiltinFunctionExpression = "BuiltinFunctionExpression",
Keyword = "Keyword",
// ... one entry per Capitalized grammar rule
}
Planner (src/pages/planner/plannerExerciseStyles.ts):
export enum PlannerNodeName {
Program = "Program",
ExerciseExpression = "ExerciseExpression",
ExerciseName = "ExerciseName",
SetPart = "SetPart",
// ... etc
}
Liftohistory — uses numeric term IDs from the auto-generated .terms.ts:
import { WorkoutRecord, ExerciseLine, SetPart, ... } from "./liftohistoryParser.terms";
if (child.type.id === WorkoutRecord) { ... }
Both approaches work. String comparison (type.name) is more readable; numeric ID comparison (type.id) is faster.
Wrap the parser with LRLanguage.define():
import { LRLanguage } from "@codemirror/language";
import { styleTags, tags as t } from "@lezer/highlight";
import { parser } from "./myParser";
const parserWithMetadata = parser.configure({
props: [
styleTags({
Number: t.number,
Keyword: t.keyword,
LineComment: t.lineComment,
StateVariable: t.variableName,
}),
],
});
export const myLanguage = LRLanguage.define({
name: "myLanguage",
parser: parserWithMetadata,
});
Map node names to @lezer/highlight tags. Selector syntax:
"Number" — match by node name"FunctionName/..." — recursive: tag applies to all descendants"CallExpression/VariableName" — path: child only when inside parent"*/Escape" — wildcard parentLiftosaur uses these common tag mappings:
export const plannerExerciseStyles = {
[`${PlannerNodeName.SetPart}/...`]: t.atom, // sets like "3x5"
[`${PlannerNodeName.Weight}/...`]: t.number, // "100kg"
[`${PlannerNodeName.Rpe}/...`]: t.number, // "@8"
[`${PlannerNodeName.Timer}/...`]: t.keyword, // "90s"
[PlannerNodeName.LineComment]: t.lineComment, // "// comment"
[PlannerNodeName.TripleLineComment]: t.blockComment, // "/// description"
[`${PlannerNodeName.FunctionName}/...`]: t.attributeName,
[`${PlannerNodeName.FunctionArgument}/...`]: t.attributeValue,
[PlannerNodeName.Week]: t.annotation, // "# Week 1"
[PlannerNodeName.Day]: t.docComment, // "## Day 1"
};
For non-CodeMirror rendering, use highlightTree:
import { highlightTree, classHighlighter, styleTags } from "@lezer/highlight";
export function highlight(script: string): string {
const tree = parserWithMetadata.parse(script);
const ranges: { from: number; to: number; clazz: string }[] = [];
highlightTree(tree, classHighlighter, (from, to, classes) => {
ranges.push({ from, to, clazz: classes });
});
return ranges.reduceRight((acc, range) => {
const span = `<span class="${range.clazz}">${acc.slice(range.from, range.to)}</span>`;
return acc.slice(0, range.from) + span + acc.slice(range.to);
}, script);
}
Embed one grammar inside another using parseMixed():
import { parseMixed } from "@lezer/common";
import { liftoscriptLanguage } from "../../liftoscriptLanguage";
const parserWithMetadata = plannerExerciseParser.configure({
props: [styleTags(plannerExerciseStyles)],
wrap: parseMixed((node) => {
return node.name === "Liftoscript" ? { parser: liftoscriptLanguage.parser } : null;
}),
});
The embedded region must be captured as a single token in the outer grammar:
Liftoscript { "{~" ![~]* "~}" }
This allows {~ if (cr >= 5) { state.weight += 5lb } ~} blocks in the planner grammar to be parsed and highlighted with the Liftoscript grammar.
| Pattern | Token | Example |
|---|---|---|
| Set notation | SetPart { Rep "x" (RepRange | Rep) } | 3x5, 3x5-8 |
| Weight | Weight { ("+" | "-")? (Float | Int) ("lb" | "kg") "+"? } (token) | 100kg, +5lb |
| Ask weight | AskWeight { "?+" } (token) | ?+ |
| Percentage | Percentage { ("+" | "-")? (Float | Int) "%" } (token) | 80%, -5% |
| RPE | Rpe { "@" PosNumber } (rule, "@" is inline token) | @8, @9.5 |
| Duration/Timer | Duration { @digit+ "s" } (token) | 90s |
| Section separator | SectionSeparator { "/" } (token) | / |
| Exercise name | ExerciseName { NonSeparator+ } (rule) | Bench Press |
| Comments | LineComment { "//" ![\n]* linebreakOrEof } (token) | // rest day |
| Unilateral reps | UnilateralReps { Rep "|" Rep } | 5|5 |
For formats where newlines are significant:
@skip { space } // space = $[ \t]+, NOT @whitespace (no newlines)
@tokens {
space { $[ \t]+ }
linebreakOrEof { linebreak | @eof }
linebreak { "\n" | "\r" | "\r\n" }
}
Always define linebreakOrEof to handle the last line without a trailing newline.
For exercise names and free-form text, exclude all structural characters:
NonSeparator { ![/{}() \t\n\r]+ }
Add characters as needed (e.g., @, |, ", : for liftohistory). Use @precedence for inline tokens that overlap:
@precedence { "+", "-", "x", NonSeparator }
When a sequence like keyword ":" "{" could be ambiguous, fuse it into a single token:
ExercisesOpen { "exercises:" $[ \t]* "{" }
The DFA's longest-match + fallback handles it: if the compound token fails (no {), it falls back to the shorter Keyword match.
Liftoscript expressions can be embedded in {~ ... ~} blocks:
Grammar:
Liftoscript { "{~" ![~]* "~}" }
Evaluator strips delimiters:
const liftoscriptText = script.slice(node.from + 2, node.to - 2); // strip {~ and ~}
@precedence { X, Y } in @tokens@precedence with !markers@tokens or restructureconst tree = parser.parse("input text");
const cursor = tree.cursor();
do {
console.log(
" ".repeat(cursor.node.depth ?? 0),
cursor.node.type.name,
`[${cursor.from}-${cursor.to}]`,
cursor.node.type.isError ? "ERROR" : ""
);
} while (cursor.next());
.grammar file, run npm run grammar:planner or npm run grammar:liftohistory@skip consuming significant whitespace — use $[ \t]+ not @whitespace+ for line-oriented formatsDuration { @digit+ "s" } must be a token, not a rule like Duration { Int "s" }, to avoid ambiguity between Int as reps and Int "s" as durationlinebreakOrEof — line-terminated rules must end with linebreakOrEof to handle the final lineWhen creating a new grammar:
.grammar file following the patterns abovepackage.json:
"grammar:myformat": "lezer-generator --output ./src/path/myParser.ts ./src/path/my.grammar"
npm run grammar:myformat.terms.ts numeric IDs)parser.parse(text)styleTags(...))LRLanguage with the configured parserparseMixed()highlightTree() with classHighlighter| File | Purpose |
|---|---|
liftoscript.grammar | Expression grammar (operators, control flow) |
src/pages/planner/plannerExercise.grammar | Exercise/set/week grammar |
src/liftohistory/liftohistory.grammar | History record grammar |
src/liftoscript.ts | Generated liftoscript parser |
src/liftoscriptEvaluator.ts | Liftoscript tree evaluator + NodeName enum |
src/parser.ts | ScriptRunner orchestrator |
src/pages/planner/plannerExerciseParser.ts | Generated planner parser |
src/pages/planner/plannerExerciseEvaluator.ts | Planner tree evaluator |
src/pages/planner/plannerExerciseStyles.ts | PlannerNodeName enum + highlight styles |
src/pages/planner/plannerExerciseCodemirror.ts | Planner CodeMirror language + autocomplete |
src/pages/planner/plannerHighlighter.ts | Planner HTML syntax highlighting |
src/liftoscriptLanguage.ts | Liftoscript CodeMirror language definition |
src/liftohistory/liftohistoryParser.ts | Generated liftohistory parser |
src/liftohistory/liftohistoryParser.terms.ts | Liftohistory term ID constants |
src/liftohistory/liftohistoryDeserializer.ts | Liftohistory tree deserializer |
Fix a production exception from Rollbar in an interactive Claude Code session. Fetches the occurrence data, analyzes the error, and either implements a fix or adds the error to the ignore list. Use when given a Rollbar occurrence ID.
Guidelines for writing React Native code in Liftosaur. Use when migrating web components to RN primitives, fixing RN performance issues, or building new RN screens.
Write idiomatic Liftoscript code for weightlifting programs. Use when creating or editing program files in programs/builtin/.
Writing style guide for prose content - program descriptions, technique pages, blog posts, and any user-facing text. Use when writing or editing descriptive content.
Capture important architectural decisions, bug patterns, feature designs, or other critical findings into the project knowledge base. Use proactively when significant technical decisions are made, non-obvious bugs are resolved, new subsystems are designed, or important product features are discussed.
Add a new server-rendered page to the Liftosaur website. Use when creating new public-facing pages with routing, SSR, and client hydration.