| name | add-language-support |
| description | Implements mutation testing support for new programming languages using tree-sitter grammars and the LanguageEngine trait. Triggers on "add language support", "add [language] support", "implement [language]", "new language", or mentions of specific languages like Python, TypeScript, C++, Java. |
| license | Apache-2.0 |
| metadata | {"author":"trailofbits","version":"1.1","mewt-version":">=2.0.0"} |
| allowed-tools | ["Read","Write","Edit","Bash","Glob","AskUserQuestion"] |
Adding Language Support to Mewt
Implements mutation testing support for new programming languages using tree-sitter grammars and the LanguageEngine trait.
When to Use
Use this skill when:
- Adding support for a new programming language (Python, JavaScript, Go, etc.)
- Implementing tree-sitter grammar integration
- User asks to "add language support" or "implement X language"
- Extending mewt's language capabilities
When NOT to Use
Not for running mewt: If the user wants to use mewt for mutation testing rather than add language support, use the mewt skill instead.
Not for grammar development: This skill assumes a tree-sitter grammar already exists. If no grammar exists, writing one is a separate, specialized task beyond this skill's scope.
Essential Principles
Grammar First: Never attempt to write a tree-sitter grammar from scratch. Always find and verify an existing tree-sitter grammar repository before proceeding. Grammar development requires specialized expertise.
API Simplicity: The LanguageEngine trait has only 4 methods. Keep implementations minimal - no helper methods, just the trait interface and inline grammar loading.
Common First: Start with COMMON_MUTATIONS only. Add language-specific mutations only for unique constructs not covered by common patterns. Most languages need zero custom mutations.
Per-Slug Mutation Tests: Every mutation slug exposed by the engine must have a dedicated test module under tests/<language>/mutations/<SLUG>.rs. The guard test in tests/languages.rs enforces this convention. New languages must be wired into the guard before landing.
Integration Conformance First: Every language's tests/<language>/integration_tests.rs must run the shared conformance harness from tests/conformance.rs first, then add language-specific integration assertions as needed.
Example Fixture Policy: Keep canonical example fixtures at tests/<language>/example.<ext> (JavaScript may keep multiple canonical fixtures: example.js, example.ts, example.jsx, example.tsx). Integration tests should treat these as smoke checks (!mutants.is_empty()), while per-slug tests should use inline fixtures for precise assertions.
Verification Required: Each phase has explicit exit criteria. Do not proceed to the next phase until all validation passes.
When to Use
- User explicitly requests adding support for a specific programming language
- User mentions "add language support" or "new language"
- User provides a tree-sitter grammar repository URL
- User asks how to extend mewt with new languages
When NOT to Use
- Language already supported (check src/languages/) - use existing implementation
- User wants to modify existing language support - use general code editing
- User wants to add mutations to existing language - edit src/languages/<lang>/mutations.rs
- No tree-sitter grammar exists - inform user and halt (cannot create grammars)
Linear Progression
Phase 1: Grammar Acquisition
Entry Criteria: User has requested language support
Actions:
- Check if tree-sitter grammar repository URL is provided
  - If not, ask user with common patterns:
    https://github.com/tree-sitter/tree-sitter-<language>
    https://github.com/<maintainer>/tree-sitter-<language>
  - Offer to search the web if needed
- Once URL confirmed, use Edit to add to grammars/update.sh (lines 15-23):
declare -A REPO_URLS=(
["<language>"]="<tree-sitter-repo-url>"
)
declare -A GRAMMAR_PATHS=(
["<language>"]=""
)
- Run grammar extraction:
cd grammars && bash update.sh <language> false
Exit Criteria:
Phase 2: Build System Integration
Entry Criteria: Grammar files extracted successfully
Actions:
- Use Edit to add build configuration to build.rs after line 57:
let <language>_dir: PathBuf = ["grammars", "<language>", "src"].iter().collect();
build_grammar(&<language>_dir, "tree-sitter-<language>");
- Verify compilation:
cargo check
Exit Criteria:
Phase 3: Language Engine Implementation
Entry Criteria: Grammar builds successfully
Actions:
- Create language module structure:
mkdir -p src/languages/<language>
- Write module declaration (src/languages/<language>/mod.rs):
pub mod engine;
pub mod mutations;
pub mod syntax;
- Create syntax mappings (src/languages/<language>/syntax.rs):
  - Use Read to examine grammars/<language>/src/node-types.json
  - Map node and field names from the grammar
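// Node kinds, copied verbatim from node-types.json.
pub mod nodes {
    // Referenced by the "ER" arm of the engine template; confirm the exact name in the grammar.
    pub const EXPRESSION_STATEMENT: &str = "expression_statement";
    pub const IF_STATEMENT: &str = "if_statement";
    pub const RETURN_STATEMENT: &str = "return_statement";
}
// Field names, copied verbatim from node-types.json.
pub mod fields {
    pub const CONDITION: &str = "condition";
    pub const ARGUMENTS: &str = "arguments";
}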
  - For every operator-focused slug (AOS, AAOS, BOS, BAOS, LOS, SAOS, COS), confirm which node kinds the grammar uses for those operators (e.g., augmented_assignment_expression versus binary_expression). Only plug a node kind into patterns::shuffle_* after verifying it in node-types.json, and include any language-specific operator tokens (such as Go's &^= or JavaScript's **=/>>>=).
- Create mutations file (src/languages/<language>/mutations.rs):
use crate::types::Mutation;
pub const <LANGUAGE>_MUTATIONS: &[Mutation] = &[];
- Implement engine (src/languages/<language>/engine.rs):
  - Use Read on src/languages/rust/engine.rs as reference
  - Follow the 4-method trait pattern:
use std::sync::OnceLock;

use tree_sitter::Language as TsLanguage;

use crate::LanguageEngine;
use crate::mutations::COMMON_MUTATIONS;
use crate::patterns;
use crate::types::{Mutant, Mutation, Target};
use crate::utils::{node_text, parse_source};

use super::mutations::<LANGUAGE>_MUTATIONS;
use super::syntax::{fields, nodes};

// Grammar handle, initialized once on first use.
static <LANGUAGE>_LANGUAGE: OnceLock<TsLanguage> = OnceLock::new();

// Symbol exported by the compiled grammar (see build.rs).
unsafe extern "C" {
    fn tree_sitter_<language>() -> *const tree_sitter::ffi::TSLanguage;
}

pub struct <Language>LanguageEngine {
    mutations: Vec<Mutation>,
}

impl <Language>LanguageEngine {
    pub fn new() -> Self {
        // Common mutations first, then any language-specific ones.
        let mut mutations: Vec<Mutation> = Vec::new();
        mutations.extend_from_slice(COMMON_MUTATIONS);
        mutations.extend_from_slice(<LANGUAGE>_MUTATIONS);
        Self { mutations }
    }
}

impl LanguageEngine for <Language>LanguageEngine {
    fn name(&self) -> &'static str {
        "<Language>"
    }

    fn extensions(&self) -> &[&'static str] {
        &["<ext>"]
    }

    fn get_mutations(&self) -> &[Mutation] {
        &self.mutations
    }

    fn mutate(&self, target: &Target) -> Vec<Mutant> {
        let source = &target.text;
        // Load the grammar inline; there is no separate tree_sitter_language() method.
        let language = <LANGUAGE>_LANGUAGE
            .get_or_init(|| unsafe { TsLanguage::from_raw(tree_sitter_<language>()) });
        let tree = match parse_source(source, language) {
            Some(t) => t,
            None => return Vec::new(),
        };
        let root = tree.root_node();
        let mut all_mutants = Vec::new();
        for m in &self.mutations {
            match m.slug {
                "ER" => {
                    all_mutants.extend(
                        patterns::replace(
                            root,
                            source,
                            &[nodes::EXPRESSION_STATEMENT, nodes::RETURN_STATEMENT],
                            // Rust-specific error expression carried over from the reference engine;
                            // substitute the target language's equivalent here and in the filter below.
                            "panic!(\"mewt\")",
                            &|node, src| !node_text(node, src).contains("panic!"),
                        )
                        .into_iter()
                        .map(|p| Mutant::from_partial(p, target, "ER")),
                    )
                }
                "IF" => {
                    all_mutants.extend(
                        patterns::replace_condition(
                            root,
                            source,
                            nodes::IF_STATEMENT,
                            fields::CONDITION,
                            &["if"],
                            "false",
                        )
                        .into_iter()
                        .map(|p| Mutant::from_partial(p, target, "IF")),
                    )
                }
                // Slugs without an arm produce no mutants.
                _ => {}
            }
        }
        all_mutants
    }
}

impl Default for <Language>LanguageEngine {
    fn default() -> Self {
        Self::new()
    }
}
Exit Criteria:
Phase 4: Language Registration
Entry Criteria: Engine implementation compiles
Actions:
- Use Edit to add module to src/languages/mod.rs:
pub mod <language>;
- Use Edit to register in src/main.rs (find the LanguageRegistry section):
registry.register(mewt::languages::<language>::engine::<Language>LanguageEngine::new());
Exit Criteria:
Phase 5: Tests and Examples
Entry Criteria: Language builds and registers
Actions:
- Scaffold the test directories:
mkdir -p tests/<language>/mutations
The mutation folder must contain one Rust module per slug (for example tests/<language>/mutations/AOS.rs).
- Write canonical example fixture(s):
  - Most languages: tests/<language>/example.<ext>.
  - JavaScript-family languages: one canonical fixture per supported extension (example.js, example.ts, example.jsx, example.tsx) as needed.
  - Keep examples syntactically valid and representative, but small enough to stay smoke-test friendly.
- Wire the language into the integration test entrypoint (tests/languages.rs):
mod <language>;
- Create tests/<language>/mod.rs to expose both integration and per-slug suites:
mod integration_tests;
mod mutations;
- Author tests/<language>/mutations/mod.rs that declares every slug module:
#![allow(non_snake_case)]
#[path = "AAOS.rs"]
mod aaos;
#[path = "AOS.rs"]
mod aos;
  - Use uppercase filenames that match the slug (<SLUG>.rs) and map to lowercase module names via #[path = ...] where needed.
  - Keep manual slug wiring in mutations/mod.rs in sync with files on disk.
- For each slug surfaced by engine.get_mutations(), create tests/<language>/mutations/<SLUG>.rs:
  - Prefer inline source fixtures for precise, slug-focused assertions.
  - Use integration helpers (create_test_target, shared slug assertion helpers) rather than duplicating setup code.
  - For operator families (AAOS/BAOS/SAOS/AOS/BOS/COS/LOS), cover all supported operators and include negative cases where useful.
- Create tests/<language>/integration_tests.rs with this pattern:
  - Define a thin create_test_target(...) wrapper that delegates to tests/utils.rs.
  - Run conformance::run_common_language_checks(...) for baseline behavior.
  - Add canonical example-file smoke tests (!mutants.is_empty()).
  - Add only language-specific integration assertions beyond conformance (for example parser edge cases).
- Update the guard in tests/languages.rs so the new language participates in per-slug coverage checks (import the engine and add a check_language(...) call).
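The per-slug convention can be pictured with a small std-only sketch. It is not the guard from tests/languages.rs; `missing_slug_tests` is a hypothetical helper that only illustrates the rule "one <SLUG>.rs file per slug, with mod.rs excluded".

```rust
use std::collections::BTreeSet;
use std::path::Path;

/// Hypothetical helper (not mewt's actual guard): returns every slug that has
/// no matching `<SLUG>.rs` file inside `mutations_dir`.
pub fn missing_slug_tests(slugs: &[&str], mutations_dir: &Path) -> Vec<String> {
    // Collect the file stems of all `.rs` files, skipping `mod.rs`.
    let present: BTreeSet<String> = match std::fs::read_dir(mutations_dir) {
        Ok(entries) => entries
            .filter_map(|entry| entry.ok())
            .map(|entry| entry.path())
            .filter(|path| path.extension().map_or(false, |ext| ext == "rs"))
            .filter_map(|path| {
                path.file_stem()
                    .and_then(|stem| stem.to_str())
                    .map(str::to_owned)
            })
            .filter(|stem| stem != "mod")
            .collect(),
        // A missing directory means every slug is uncovered.
        Err(_) => BTreeSet::new(),
    };

    slugs
        .iter()
        .filter(|slug| !present.contains(**slug))
        .map(|slug| slug.to_string())
        .collect()
}
```

An empty result means every slug is covered. File names are compared case-sensitively, which matches the uppercase <SLUG>.rs naming rule above.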
Exit Criteria:
Phase 6: Validation
Entry Criteria: All tests pass
Actions:
- Build release binary:
cargo build --release
- Verify mutations are registered:
./target/release/mewt print mutations --language <language>
  - Should list COMMON_MUTATIONS (ER, CR, IF, etc.)
- Generate and verify mutants:
./target/release/mewt print mutants --target tests/<language>/example.<ext>
  - Verify reasonable number of mutants
  - Check mutations are diverse (ER, IF, CR, etc.)
  - Verify line numbers are accurate
- Run full test suite:
just test
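When spot-checking that reported line numbers are accurate, it can help to compute the expected line for a known byte offset by hand. The sketch below assumes 1-based line numbers; `line_of` is a hypothetical helper, not part of mewt.

```rust
/// Hypothetical helper: 1-based line number of `byte_offset` within `source`.
/// Offsets past the end of the source are clamped to the last line.
pub fn line_of(source: &str, byte_offset: usize) -> usize {
    let end = byte_offset.min(source.len());
    // Count the newlines that occur strictly before the offset.
    source.as_bytes()[..end]
        .iter()
        .filter(|&&byte| byte == b'\n')
        .count()
        + 1
}
```

Compare the result for the start offset of a mutated node with the line printed for the corresponding mutant.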
Exit Criteria:
API Quick Reference
The 4-Method Trait
pub trait LanguageEngine: Send + Sync {
    fn name(&self) -> &'static str;
    fn extensions(&self) -> &[&'static str];
    fn get_mutations(&self) -> &[Mutation];
    fn mutate(&self, target: &Target) -> Vec<Mutant>;
}
Key changes from older versions:
- ✅ mutate() replaces apply_all_mutations()
- ✅ No tree_sitter_language() method - load grammar inline
- ✅ No helper methods - keep implementations minimal
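To see how the four methods fit together without pulling in tree-sitter, here is a toy, std-only sketch. The trait matches the one quoted above, but `Mutation`, `Target`, and `Mutant` are simplified stand-ins (the real types in mewt carry more fields), and `ToyEngine` and `engine_for` are hypothetical names used only for illustration.

```rust
// Simplified stand-ins for mewt's types; assumptions for illustration only.
#[derive(Clone, Debug, PartialEq)]
pub struct Mutation {
    pub slug: &'static str,
}

pub struct Target {
    pub text: String,
}

#[derive(Debug, PartialEq)]
pub struct Mutant {
    pub slug: &'static str,
    pub start: usize,
    pub end: usize,
    pub replacement: String,
}

pub trait LanguageEngine: Send + Sync {
    fn name(&self) -> &'static str;
    fn extensions(&self) -> &[&'static str];
    fn get_mutations(&self) -> &[Mutation];
    fn mutate(&self, target: &Target) -> Vec<Mutant>;
}

/// Toy engine: flips `true` literals using plain text search, which stands in
/// for the tree-sitter traversal a real engine performs.
pub struct ToyEngine {
    mutations: Vec<Mutation>,
}

impl ToyEngine {
    pub fn new() -> Self {
        Self { mutations: vec![Mutation { slug: "BL" }] }
    }
}

impl LanguageEngine for ToyEngine {
    fn name(&self) -> &'static str {
        "Toy"
    }

    fn extensions(&self) -> &[&'static str] {
        &["toy"]
    }

    fn get_mutations(&self) -> &[Mutation] {
        &self.mutations
    }

    fn mutate(&self, target: &Target) -> Vec<Mutant> {
        target
            .text
            .match_indices("true")
            .map(|(start, matched)| Mutant {
                slug: "BL",
                start,
                end: start + matched.len(),
                replacement: "false".to_string(),
            })
            .collect()
    }
}

/// Hypothetical lookup: picks the first engine that claims a file extension.
pub fn engine_for<'a>(
    engines: &'a [Box<dyn LanguageEngine>],
    ext: &str,
) -> Option<&'a dyn LanguageEngine> {
    for engine in engines {
        if engine.extensions().contains(&ext) {
            return Some(engine.as_ref());
        }
    }
    None
}
```

The point of the sketch is the shape: the struct owns only its mutation list, and everything else happens inside mutate().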
Common Mutation Patterns
Use Read on src/languages/rust/engine.rs for complete examples of:
| Slug | Pattern | Purpose |
|---|---|---|
| ER | patterns::replace() | Replace statements with errors |
| CR | patterns::replace() | Replace statements with comments |
| IF | patterns::replace_condition() | Replace if conditions with false |
| IT | patterns::replace_condition() | Replace if conditions with true |
| WF | patterns::replace_condition() | Replace while conditions with false |
| LC | patterns::swap_branches() | Swap true/false branches |
| BL | patterns::replace() | Replace boolean literals |
| AOS | patterns::replace_operator() | Replace arithmetic operators |
| BOS | patterns::replace_operator() | Replace bitwise operators |
| LOS | patterns::replace_operator() | Replace logical operators |
| COS | patterns::replace_operator() | Replace comparison operators |
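The condition-replacement rows (IF, IT, WF) share one idea: locate the condition's byte range and splice in a constant. The sketch below shows that idea with plain text search instead of tree-sitter nodes. `Edit`, `condition_edits`, and `apply_edit` are hypothetical names, not mewt's patterns API, and the search assumes a brace-delimited body written as `<keyword> <condition> {`.

```rust
/// Hypothetical byte-range replacement, standing in for a mutant.
#[derive(Debug, PartialEq)]
pub struct Edit {
    pub start: usize,
    pub end: usize,
    pub replacement: String,
}

/// Naive text search for `<keyword> <condition> {`. A real engine would read
/// the condition field of the grammar node instead of scanning text.
pub fn condition_edits(source: &str, keyword: &str, replacement: &str) -> Vec<Edit> {
    let pattern = format!("{keyword} ");
    let mut edits = Vec::new();
    let mut offset = 0;
    while let Some(pos) = source[offset..].find(&pattern) {
        let cond_start = offset + pos + pattern.len();
        match source[cond_start..].find(" {") {
            Some(len) => {
                edits.push(Edit {
                    start: cond_start,
                    end: cond_start + len,
                    replacement: replacement.to_string(),
                });
                // Continue scanning after this condition.
                offset = cond_start + len;
            }
            None => break,
        }
    }
    edits
}

/// Produces the mutated source for a single edit.
pub fn apply_edit(source: &str, edit: &Edit) -> String {
    let mut out = String::with_capacity(source.len() + edit.replacement.len());
    out.push_str(&source[..edit.start]);
    out.push_str(&edit.replacement);
    out.push_str(&source[edit.end..]);
    out
}
```

IF and IT differ only in the constant passed as `replacement`, and WF differs only in the keyword, which is why one pattern function can serve all three rows.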
Common Pitfalls
Node Type Mismatches
Problem: Using documentation names instead of actual grammar names
Solution: Always verify in grammars/<language>/src/node-types.json
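One way to make that verification mechanical is to list every kind mentioned in node-types.json and diff it against the constants in syntax.rs. The sketch below is a naive std-only scan rather than a real JSON parser; `node_kinds` and `missing_kinds` are hypothetical helpers, and the scan assumes kinds appear as string values of a "type" key, which is how tree-sitter's generated node-types.json is laid out.

```rust
use std::collections::BTreeSet;

/// Collects every string value that follows a `"type"` key in the JSON text.
/// Naive scan: fine for a quick check, not a substitute for a JSON parser.
pub fn node_kinds(json: &str) -> BTreeSet<String> {
    let key = "\"type\"";
    let mut kinds = BTreeSet::new();
    let mut rest = json;
    while let Some(pos) = rest.find(key) {
        rest = &rest[pos + key.len()..];
        // Expect `:` and then an opening quote; otherwise this was not a kind.
        let Some(after_colon) = rest.trim_start().strip_prefix(':') else {
            continue;
        };
        let Some(value) = after_colon.trim_start().strip_prefix('"') else {
            continue;
        };
        // Read up to the closing quote; an escaped character is kept literally.
        let mut kind = String::new();
        let mut chars = value.chars();
        while let Some(c) = chars.next() {
            match c {
                '"' => break,
                '\\' => {
                    if let Some(escaped) = chars.next() {
                        kind.push(escaped);
                    }
                }
                _ => kind.push(c),
            }
        }
        kinds.insert(kind);
    }
    kinds
}

/// Returns the wanted kinds that never appear in the grammar's node types.
pub fn missing_kinds<'a>(json: &str, wanted: &[&'a str]) -> Vec<&'a str> {
    let kinds = node_kinds(json);
    wanted
        .iter()
        .copied()
        .filter(|kind| !kinds.contains(*kind))
        .collect()
}
```

Feeding it the contents of grammars/<language>/src/node-types.json and the string values from syntax.rs gives a quick list of names that need correcting.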
FFI Function Naming
Problem: Incorrect external function name
Solution: Must be exactly tree_sitter_<language> (all lowercase, underscores for hyphens)
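The naming rule is simple enough to express directly. `ffi_symbol` below is a hypothetical helper that applies exactly the rule stated above (lowercase, hyphens become underscores); it is not part of mewt.

```rust
/// Hypothetical helper: derives the expected FFI symbol from a language name
/// by lowercasing it and turning hyphens into underscores.
pub fn ffi_symbol(language: &str) -> String {
    let normalized: String = language
        .chars()
        .map(|c| if c == '-' { '_' } else { c.to_ascii_lowercase() })
        .collect();
    format!("tree_sitter_{normalized}")
}
```

For a grammar repository named tree-sitter-c-sharp, for example, the rule yields tree_sitter_c_sharp.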
Missing Scanner
Problem: Build fails during C compilation
Solution: Some grammars need scanner.c - the update script handles this automatically
Over-Engineering
Problem: Adding helper methods, custom parsing logic
Solution: Keep it minimal - just implement the 4 trait methods
Too Many Custom Mutations
Problem: Adding language-specific mutations before verifying common ones work
Solution: Start with COMMON_MUTATIONS only. Most languages need zero custom mutations.
Compound Assignment Node Kinds
Problem: Assuming compound assignments share the binary_expression node kind
Solution: Check node-types.json for the exact node kind (e.g., augmented_assignment_expression, compound_assignment_expr) and wire AAOS/BAOS/SAOS to that. Include all language-specific operator tokens when configuring patterns::shuffle_operators.
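The reason the token list matters can be shown with a small sketch of operator shuffling: each operator in a family is replaced by every other member, so a token missing from the family is never mutated. `operator_replacements` is a hypothetical helper and does not reproduce the signature or behavior of patterns::shuffle_operators.

```rust
/// Hypothetical helper: the replacements for `current` are all other members
/// of its operator family. Operators outside the family yield nothing.
pub fn operator_replacements<'a>(current: &str, family: &[&'a str]) -> Vec<&'a str> {
    if !family.contains(&current) {
        return Vec::new();
    }
    family
        .iter()
        .copied()
        .filter(|op| *op != current)
        .collect()
}
```

With a family of `+=`, `-=`, and `*=`, a Go source line using `&^=` would produce no mutants until that token is added to the family.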
Success Checklist
Example: Adding Go Support
Condensed walkthrough assuming tree-sitter-go URL is known:
cd grammars && bash update.sh go false
mkdir -p src/languages/go
mkdir -p tests/go/mutations
cargo build --release
./target/release/mewt print mutations --language go
./target/release/mewt print mutants --target tests/go/example.go
just test