running-test262

Name: Running Test262
Author: SocketDev

// Run the test262 conformance suite against fleet parsers / runtimes (ultrathink acorn variants, socket-btm temporal-infra, future ports) using each repo's canonical runner. Never write homebrew test262 runners. Every parser/runtime in the fleet ships a runner under `test/scripts/test262-*.mts` and an unsupported-features config. Use this skill when asked to run spec tests, check conformance, debug a failing test262 case, or compare a parser against a reference implementation.

updated:May 28, 2026 at 18:24

package.json

"author": "SocketDev"

"repository": "SocketDev/socket-lib"

name	running-test262
description	Run the test262 conformance suite against fleet parsers / runtimes (ultrathink acorn variants, socket-btm temporal-infra, future ports) using each repo's canonical runner. Never write homebrew test262 runners. Every parser/runtime in the fleet ships a runner under `test/scripts/test262-*.mts` and an unsupported-features config. Use this skill when asked to run spec tests, check conformance, debug a failing test262 case, or compare a parser against a reference implementation.
user-invocable	true
allowed-tools	Bash(node:), Bash(pnpm:), Bash(ls:), Bash(cat:), Bash(grep:), Bash(find:), Read
model	claude-haiku-4-5
context	fork

running-test262

The fleet has multiple parsers and runtimes that conform to ECMA-262 or to a TC39 proposal:

ultrathink/packages/acorn/: the JS parser, multiple lang ports (cpp/go/rust/typescript).
ultrathink/packages/test262-parser-runner/: the canonical shared runner package.
socket-btm/packages/temporal-infra/: Temporal-proposal C++ port.

Each of them ships its own scripts/test262-*.mts runner plus an unsupported-features config. Running test262 by hand (downloading the suite, scanning the metadata blocks, running each test) is the wrong approach: the runners already encode the suite traversal, the per-feature skip logic, the harness setup, and the result aggregation. Always reach for the existing runner.

Test262 submodule pin

The fleet pins to a shared tc39/test262 SHA. As of 2026-05-21 both ultrathink + socket-btm pin 7e115f46a. When bumping in one repo, bump in the other so cross-fleet comparison stays apples-to-apples.

The pin is annotated in each repo's .gitmodules with a comment of the form # test262-YYYY.MM.DD (the commit date of the pinned SHA, enforced by the gitmodules-comment-guard hook).
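A minimal sketch of checking that annotation, assuming the comment sits on its own line in .gitmodules. The helper name, the sample submodule path, and the sample date are hypothetical; the real gitmodules-comment-guard hook may work differently.

```typescript
// Hypothetical helper: illustrates the `# test262-YYYY.MM.DD` pattern only.
const PIN_COMMENT = /^#\s*test262-(\d{4})\.(\d{2})\.(\d{2})\s*$/;

interface PinAnnotation {
  year: number;
  month: number;
  day: number;
}

// Return the first test262 pin annotation in the .gitmodules text, or undefined if none is present.
function findPinAnnotation(gitmodules: string): PinAnnotation | undefined {
  for (const line of gitmodules.split(/\r?\n/)) {
    const match = PIN_COMMENT.exec(line.trim());
    if (match) {
      return {
        year: Number(match[1]),
        month: Number(match[2]),
        day: Number(match[3]),
      };
    }
  }
  return undefined;
}

// Placeholder content: the date and submodule path below are made up for illustration.
const sampleGitmodules = [
  '# test262-2026.01.15',
  '[submodule "test262"]',
  '\tpath = test262',
  '\turl = https://github.com/tc39/test262.git',
].join('\n');

console.log(findPinAnnotation(sampleGitmodules));
```

Running the same check against both repos' .gitmodules files is a quick way to spot a pin that was bumped in only one of them.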

🚨 Strict allowlist policy

An allowlist entry is ONLY for non-parser test failures. Anything a parser should handle MUST NOT be allowlisted; it must be fixed in the parser. This is strict, and the runners enforce it through the design choices described below.

What counts as "non-parser":

Unimplemented TC39 feature: the proposal is at Stage 3+ but we haven't ported the grammar yet (e.g. decorators, source-phase imports). It goes in test262-config/test262.unsupported-features, keyed on the TC39 feature name (NOT a test path).
Runner / harness bug: the test runner itself produces a false signal (e.g. async-throws semantics, error-name matching). Fix the runner; don't allowlist the symptom.
Runtime-only test: the test exercises a runtime API (Reflect.*, Temporal.*) that the parser-conformance run can't evaluate. The runners skip these by classification, not by per-path allowlist.

What does NOT count and must be fixed in the parser:

"Parser rejects valid input." Fix the parser.
"Parser accepts invalid input." Fix the parser.
"Parser produces wrong AST shape." Fix the parser.
"Cross-impl divergence: Rust + TS pass, Go fails." Fix Go.

If you feel tempted to add a per-test-path allowlist entry, the answer is almost always "the parser needs fixing." The unsupported-features file is the only escape valve and it's feature-name-keyed by design. You can't sneak a parser bug past it.
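To make the feature-name-keyed rule concrete, here is a small sketch that flags entries that look like test paths. It assumes the config holds one entry per line with `#` comments; that file format and both function names are assumptions, not taken from the runners.

```typescript
// Assumed format: one feature name per line, `#` starts a comment.
function parseFeatureConfig(text: string): string[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.replace(/#.*$/, '').trim())
    .filter((line) => line.length > 0);
}

// An entry that contains a slash or ends in .js is a test path, not a feature name.
function findPathLikeEntries(entries: string[]): string[] {
  return entries.filter((entry) => entry.includes('/') || /\.js$/.test(entry));
}

// Sample config: two feature names plus one path that should be rejected.
const sampleConfig = [
  '# Stage 3+ features not yet ported',
  'decorators',
  'source-phase-imports',
  'test/language/statements/class/private-method.js',
].join('\n');

const offenders = findPathLikeEntries(parseFeatureConfig(sampleConfig));
console.log(offenders);
```

A check like this could run in review or CI so that a per-path entry never lands in the unsupported-features file unnoticed.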

Canonical runners per repo

| Repo | Runner | Skip config |
| --- | --- | --- |
| ultrathink/packages/acorn (multi-lane driver) | `test/test262-compare.mts` | per-lane runner config (inherits unsupported-features) |
| ultrathink/packages/acorn (per-lane) | `lang/<lane>/scripts/test262.mts` | `test262-config/test262.unsupported-features` (feature-name-keyed) |
| ultrathink/packages/test262-parser-runner | `bin/test262-parser-runner.mts` | passed via flags |
| socket-btm/packages/temporal-infra | `test/scripts/test262-temporal-runner.mts` | `test262-config/test262.allowlist` (Temporal-only path allowlist; reviewed manually for non-parser-fail justification) |

Invocation patterns

Multi-lane (recommended for cross-lane parity checks)

cd packages/acorn

# All 4 lanes, full suite
node test/test262-compare.mts

# Subset of lanes
node test/test262-compare.mts --lane rust,go

# All lanes, filtered to a single category
node test/test262-compare.mts --include 'language/expressions/await'

# Single test path, all lanes
node test/test262-compare.mts test/language/statements/class/private-method.js

Lanes: rust, go, cpp, typescript. Flags are forwarded to each per-lane runner.
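What a cross-lane parity check looks for can be sketched as follows. The result shape here is hypothetical; the actual output format of test/test262-compare.mts is not described in this skill, so cat the runner for the real one.

```typescript
// Hypothetical result shape: per-lane map of test path to outcome.
type Lane = 'rust' | 'go' | 'cpp' | 'typescript';
type Outcome = 'pass' | 'fail';
type LaneResults = Partial<Record<Lane, Record<string, Outcome>>>;

interface Divergence {
  path: string;
  passing: Lane[];
  failing: Lane[];
}

// A test diverges when at least one lane passes it and at least one lane fails it.
function findDivergences(results: LaneResults): Divergence[] {
  const lanes = Object.keys(results) as Lane[];
  const paths = new Set<string>();
  for (const lane of lanes) {
    for (const path of Object.keys(results[lane] ?? {})) {
      paths.add(path);
    }
  }
  const divergences: Divergence[] = [];
  for (const path of Array.from(paths).sort()) {
    const passing = lanes.filter((lane) => results[lane]?.[path] === 'pass');
    const failing = lanes.filter((lane) => results[lane]?.[path] === 'fail');
    if (passing.length > 0 && failing.length > 0) {
      divergences.push({ path, passing, failing });
    }
  }
  return divergences;
}

// Made-up results: every lane agrees on a.js, only go fails b.js.
const sampleResults: LaneResults = {
  rust: { 'a.js': 'pass', 'b.js': 'pass' },
  go: { 'a.js': 'pass', 'b.js': 'fail' },
  cpp: { 'a.js': 'pass', 'b.js': 'pass' },
  typescript: { 'a.js': 'pass', 'b.js': 'pass' },
};

console.log(findDivergences(sampleResults));
```

Under the strict allowlist policy, every divergence reported this way is a bug in the failing lane, not a candidate for an allowlist entry.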

Single-lane

# Per-lane direct invocation
cd packages/acorn/lang/rust && node scripts/test262.mts
cd packages/acorn/lang/go && node scripts/test262.mts
cd packages/acorn/lang/cpp && node scripts/test262.mts
cd packages/acorn/lang/typescript && node scripts/test262.mts

# socket-btm temporal-infra
cd socket-btm/packages/temporal-infra && node test/scripts/test262-temporal-runner.mts

Single-case debug

Pass the test path positionally:

# Single lane
node scripts/test262.mts test/language/expressions/await/await-in-nested-function.js

# All lanes
node test/test262-compare.mts test/language/expressions/await/await-in-nested-function.js

Targeted filtering

node scripts/test262.mts --include 'export'          # regex on path
node scripts/test262.mts --exclude 'surrogate'       # regex on path
node scripts/test262.mts --category module           # named feature group
node scripts/test262.mts --include 'class' --exclude 'async'
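A rough model of how the two path filters combine, assuming a test is selected when it matches --include (if given) and does not match --exclude (if given). That precedence is an assumption; the runner source is authoritative. The second sample path is invented for illustration.

```typescript
// Assumed semantics: include narrows the set, exclude then removes from it.
interface PathFilter {
  include?: RegExp;
  exclude?: RegExp;
}

function selectPaths(paths: string[], filter: PathFilter): string[] {
  return paths.filter((path) => {
    const included = filter.include ? filter.include.test(path) : true;
    const excluded = filter.exclude ? filter.exclude.test(path) : false;
    return included && !excluded;
  });
}

const samplePaths = [
  'test/language/statements/class/private-method.js',
  'test/language/statements/class/async-method.js', // hypothetical path
  'test/language/expressions/await/await-in-nested-function.js',
];

// Mirrors: node scripts/test262.mts --include 'class' --exclude 'async'
const selected = selectPaths(samplePaths, { include: /class/, exclude: /async/ });
console.log(selected);
```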

Vitest-integrated mode

Each repo also wires a vitest test that wraps the runner. Useful for CI integration and selective re-runs:

pnpm exec vitest run test/unit/test262.test.mts             # ultrathink acorn
pnpm exec vitest run test/unit/test262-temporal.test.mts    # socket-btm temporal

Common failure modes

Submodule missing. The test262 suite is a git submodule. If the runner errors with "test262 suite not found", run git submodule update --init --recursive.
Feature classification drift. The runner uses each test's metadata block (/*--- features: [...] ---*/) to decide whether to run or skip. If a new TC39 feature is added upstream, classify it in the unsupported-features config first; do not let the runner silently pass tests for features the parser doesn't implement.
"Allowlist drift" does NOT apply here. The acorn lanes don't carry a per-test-path allowlist. If a test starts passing or failing, that reflects the parser's behavior: either the parser and the test are both correct (good), or one of them is wrong and that's a bug.
Cross-fleet drift. ultrathink and socket-btm should pin the same tc39/test262 SHA. If you're investigating a flaky test, double-check both .gitmodules files first.
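The feature classification step can be sketched like this. It handles only the inline `features: [a, b]` form with a regex, where a real runner would more likely use a YAML parser. The `supported` set is a hypothetical addition used here to surface unclassified features; the skill only documents the unsupported-features config.

```typescript
type Classification = 'run' | 'skip' | 'unclassified';

// Pull the feature names out of a test's /*--- ... ---*/ metadata block.
function extractFeatures(source: string): string[] {
  const block = /\/\*---([\s\S]*?)---\*\//.exec(source);
  if (!block) {
    return [];
  }
  const list = /^\s*features:\s*\[([^\]]*)\]/m.exec(block[1]);
  if (!list) {
    return [];
  }
  return list[1]
    .split(',')
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

// Skip when any feature is unsupported; flag features that are in neither set.
function classifyTest(
  source: string,
  supported: ReadonlySet<string>,
  unsupported: ReadonlySet<string>,
): Classification {
  const features = extractFeatures(source);
  if (features.some((name) => unsupported.has(name))) {
    return 'skip';
  }
  if (features.some((name) => !supported.has(name))) {
    return 'unclassified';
  }
  return 'run';
}

// Build a minimal test file with the given inline feature list.
const withFeatures = (list: string): string =>
  ['/*---', 'description: sample', `features: [${list}]`, '---*/', ''].join('\n');

const supportedFeatures = new Set(['class']);
const unsupportedFeatures = new Set(['decorators']);

console.log(classifyTest(withFeatures('class, decorators'), supportedFeatures, unsupportedFeatures));
```

Treating an unknown feature as its own outcome, instead of defaulting to run, is one way to avoid silently passing tests for features the parser doesn't implement.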

Never write a homebrew runner

The existing runners encode dozens of edge cases (strict-mode harness wrapping, async-throws semantics, error-name matching, the negative.phase distinction between parse vs early errors). Recreating that surface from scratch reliably misses cases. If you find yourself wanting to "just run a few test262 files by hand," reach for the runner with a filter arg instead.
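As one illustration of those edge cases, the sketch below reads `negative.phase` from a test's metadata and maps it to what a parser-only run should expect. It assumes the current test262 phase names (`parse`, `resolution`, `runtime`) and uses a regex where a real runner would likely use a YAML parser; the function names are hypothetical and this is not the runners' actual logic.

```typescript
type ParserExpectation = 'must-reject' | 'must-accept';

// Read the phase from a `negative:` block inside the /*--- ... ---*/ metadata, if any.
function readNegativePhase(source: string): string | undefined {
  const block = /\/\*---([\s\S]*?)---\*\//.exec(source);
  if (!block) {
    return undefined;
  }
  // Capture the indented lines that follow `negative:`.
  const negative = /^negative:[ \t]*\r?\n((?:[ \t]+.*\r?\n)+)/m.exec(block[1]);
  if (!negative) {
    return undefined;
  }
  const phase = /^[ \t]+phase:[ \t]*([a-z]+)/m.exec(negative[1]);
  return phase ? phase[1] : undefined;
}

// Only a parse-phase negative test requires the parser itself to throw.
function parserExpectation(source: string): ParserExpectation {
  return readNegativePhase(source) === 'parse' ? 'must-reject' : 'must-accept';
}

// Build a minimal negative test with the given phase.
const negativeTest = (phase: string): string =>
  [
    '/*---',
    'description: sample',
    'negative:',
    `  phase: ${phase}`,
    '  type: SyntaxError',
    '---*/',
    '$DONOTEVALUATE();',
    '',
  ].join('\n');

console.log(parserExpectation(negativeTest('parse')));
console.log(parserExpectation(negativeTest('runtime')));
```

Even this small slice has to get the phase distinction right to avoid false failures, which is the reason to use a filter on the existing runner over anything hand-rolled.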

Reference

TC39 test262 repository: https://github.com/tc39/test262
Each runner's source is the source of truth for invocation flags and exit-code conventions; cat the runner first if the invocation is unclear.
The strict allowlist policy, multi-lane behavior, and tc39/test262 pin date are all encoded in this skill. Read this skill before touching either system.
