agent-reproduce-align

Use after a Codex or Claude Code feature has been implemented in Qwen Code to run the selected reference agent and Qwen Code under the same scenario, capture HTTP and terminal traces, compare request bodies, tool/function schemas, and outputs, and iterate until the reproduced behavior is close enough.

Overview

Install command

npx skills add https://github.com/QwenLM/qwen-code --skill agent-reproduce-align

Copy and paste this command into Claude Code to install the skill

Source

QwenLM/qwen-code

Stars: 24,887

Forks: 2,455

Updated: June 2, 2026 at 04:10

SKILL.md

name	agent-reproduce-align
description	Use after a Codex or Claude Code feature has been implemented in Qwen Code to run the selected reference agent and Qwen Code under the same scenario, capture HTTP and terminal traces, compare request bodies, tool/function schemas, outputs, and iterate until the reproduced behavior is close enough.

Agent Reproduce Align

Purpose

Use this skill when Qwen Code already has a candidate implementation and needs evidence-based parity with a selected reference agent: codex or claude-code. The goal is not byte-for-byte equality; it is matching the observable contract that matters for the feature.

Default target repo: the current working directory. Use a user-specified path only when the user explicitly provides one.

Reference Agent Selection

Use the same reference agent selected during $agent-reproduce-feature. If the earlier choice is unavailable, ask once and record the answer in the scenario or run notes.

Workflow

1. Re-state the parity target:
   - feature name and trigger
   - selected reference agent
   - one baseline prompt or interaction script
   - acceptable differences
   - must-match fields
2. Run the reference agent and Qwen Code in separate capture directories with the same scenario.
3. Capture the selected reference agent's local state before and after the reference run when state may affect parity.
4. Normalize traces with scripts/normalize_trace.py.
5. Compare normalized traces with scripts/compare_traces.py.
6. Inspect differences in this order:
   - reference-agent state changes that explain behavior
   - missing tool/function names
   - schema shape and required fields
   - model settings and response mode
   - prompt role/order differences that affect behavior
   - terminal-visible output and exit status
7. Patch Qwen Code, rerun the smallest failing scenario, and repeat.
8. Preserve only redacted minimal fixtures in the repo.

Read references/alignment-workflow.md before the first comparison pass.
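The parity target from the first workflow step can be written down as a small record so that later comparison passes refer to the same fields. The sketch below is illustrative only: the `ParityTarget` class, its field names, and the example values are assumptions made for this example and are not part of the skill's scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParityTarget:
    """Hypothetical record of the parity target (not part of the skill itself)."""

    feature: str                                   # feature name
    trigger: str                                   # how the feature is invoked
    reference_agent: str                           # "codex" or "claude-code"
    baseline: str                                  # one baseline prompt or interaction script
    acceptable_differences: tuple[str, ...] = ()   # differences that will not be chased
    must_match: tuple[str, ...] = ()               # fields that must agree

    def __post_init__(self) -> None:
        # The skill names exactly two reference agents.
        if self.reference_agent not in ("codex", "claude-code"):
            raise ValueError(f"unknown reference agent: {self.reference_agent!r}")
        # A target with nothing to match gives the comparison no stop condition.
        if not self.must_match:
            raise ValueError("list at least one must-match field")


# Example values are made up for illustration.
target = ParityTarget(
    feature="slash-help",
    trigger="/help",
    reference_agent="codex",
    baseline="codex exec '/help'",
    acceptable_differences=("model name", "exact prompt wording"),
    must_match=("tool names", "required schema fields", "exit status"),
)
```

Keeping the record immutable (`frozen=True`) makes it harder to loosen the must-match list quietly in the middle of an alignment loop.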

Common Commands

Normalize:

.qwen/skills/agent-reproduce-align/scripts/normalize_trace.py \
  .repro-runs/reference/http.jsonl \
  > .repro-runs/reference/normalized.json
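The contents of scripts/normalize_trace.py are not shown here, so the following is only a rough sketch of what normalizing a JSONL trace could involve. It assumes each line of the capture is one JSON object, and the `VOLATILE_KEYS` set is a guess based on the fields the comparison rules say not to chase (IDs, timestamps, token counts); the real script may use a different format and a different list.

```python
import json
from typing import Any

# Assumed volatile keys; the real script's list is not shown in this document.
VOLATILE_KEYS = {"id", "request_id", "timestamp", "created", "usage"}


def strip_volatile(value: Any) -> Any:
    """Recursively drop volatile keys from dicts, leaving other values intact."""
    if isinstance(value, dict):
        return {k: strip_volatile(v) for k, v in value.items() if k not in VOLATILE_KEYS}
    if isinstance(value, list):
        return [strip_volatile(v) for v in value]
    return value


def normalize_jsonl(text: str) -> list[Any]:
    """Parse JSONL text (one JSON object per line) and strip volatile keys."""
    return [strip_volatile(json.loads(line)) for line in text.splitlines() if line.strip()]


# A made-up trace line, for illustration only.
sample = '{"id": "req_1", "timestamp": 1, "body": {"model": "m", "tools": [{"name": "shell", "id": "t1"}]}}\n'
print(json.dumps(normalize_jsonl(sample), sort_keys=True))
# prints [{"body": {"model": "m", "tools": [{"name": "shell"}]}}]
```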

Compare:

.qwen/skills/agent-reproduce-align/scripts/compare_traces.py \
  .repro-runs/reference/normalized.json \
  .repro-runs/qwen/normalized.json
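To make the comparison step concrete, here is a minimal sketch of checking tool/function schemas between two normalized traces. It is not the logic of scripts/compare_traces.py, which is not shown here. The sketch assumes each tool is a dict with a `name` and a JSON-Schema-style `parameters` object containing `properties` and `required`; that shape, and the tool names in the example, are assumptions.

```python
from typing import Any

Tool = dict[str, Any]


def index_tools(tools: list[Tool]) -> dict[str, dict[str, Any]]:
    """Map tool name to its parameter schema (empty if absent)."""
    return {tool["name"]: tool.get("parameters", {}) for tool in tools}


def compare_tools(reference: list[Tool], candidate: list[Tool]) -> list[str]:
    """Report missing tools, then required-field and argument-name mismatches."""
    ref, cand = index_tools(reference), index_tools(candidate)
    findings: list[str] = []
    for name in sorted(ref.keys() - cand.keys()):
        findings.append(f"missing tool: {name}")
    for name in sorted(ref.keys() & cand.keys()):
        ref_required = sorted(ref[name].get("required", []))
        cand_required = sorted(cand[name].get("required", []))
        if ref_required != cand_required:
            findings.append(
                f"{name}: required fields differ: reference={ref_required} candidate={cand_required}"
            )
        ref_args = sorted(ref[name].get("properties", {}))
        cand_args = sorted(cand[name].get("properties", {}))
        if ref_args != cand_args:
            findings.append(
                f"{name}: argument names differ: reference={ref_args} candidate={cand_args}"
            )
    return findings


# Made-up schemas for illustration.
reference = [
    {"name": "shell", "parameters": {"properties": {"command": {}}, "required": ["command"]}},
    {"name": "read_file", "parameters": {"properties": {"path": {}}, "required": ["path"]}},
]
candidate = [
    {"name": "shell", "parameters": {"properties": {"cmd": {}}, "required": ["cmd"]}},
]
findings = compare_tools(reference, candidate)
for line in findings:
    print(line)
```

Missing tools are reported first, followed by required-field and argument-name mismatches, which mirrors the inspection order in the workflow.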

Run a paired shell scenario:

REPRO_REFERENCE_AGENT=codex \
.qwen/skills/agent-reproduce-align/scripts/run_pair_capture.sh \
  .repro-runs/slash-help \
  "codex exec '/help'" \
  "npm test -- --runInBand"

For Claude Code, set REPRO_REFERENCE_AGENT=claude-code and replace the first command with the discovered Claude Code command. When REPRO_REFERENCE_AGENT is set, the paired runner writes reference/state-before, reference/state-after, and reference/state-diff. Set REPRO_REFERENCE_STATE_ROOT=/tmp/some-root only for tests or custom state directories.

Use the paired runner only when shell quoting is simple. For interactive slash commands, run the two captures manually with tmux so each side receives the same keystrokes.
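The before/after state capture can be pictured as two directory snapshots and a diff between them. The sketch below is an assumption about the general idea, not a description of what run_pair_capture.sh writes into reference/state-diff. The file names `config.toml` and `history.jsonl` are invented for the example and do not describe any agent's real state layout.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's path relative to root to a SHA-256 digest of its contents."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Classify files as added, removed, or changed between two snapshots."""
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(k for k in before.keys() & after.keys() if before[k] != after[k]),
    }


# Simulate a reference run that edits one file and creates another.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "config.toml").write_text("model = 'a'\n")
    before = snapshot(root)
    (root / "config.toml").write_text("model = 'b'\n")
    (root / "history.jsonl").write_text("{}\n")
    after = snapshot(root)
    state_diff = diff_snapshots(before, after)

print(state_diff)
# prints {'added': ['history.jsonl'], 'removed': [], 'changed': ['config.toml']}
```

Hashing file contents rather than comparing modification times keeps the diff stable when a run rewrites a file with identical bytes.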

Comparison Rules

- Compare contracts before wording. Exact prompt text is usually implementation detail.
- Treat absent schemas, wrong required fields, or wrong argument names as high-signal failures.
- Treat output ordering as significant only when the user-visible workflow depends on it.
- Do not chase provider-specific endpoints, model names, IDs, timestamps, token counts, or ephemeral headers unless the feature depends on them.
- Do not chase every local state write. Treat state diffs as explanatory evidence unless the feature contract requires a particular config, memory, or permission side effect.
- Stop when Qwen Code passes the user-visible scenario and the remaining trace differences are documented as intentional.

Done Criteria

- Reference-agent and Qwen Code traces for the same scenario exist locally.
- Reference-agent state diff exists or state capture is documented as irrelevant for the scenario.
- The normalized comparison has no unexplained must-match differences.
- Qwen Code tests or smoke commands cover the fixed behavior.
- Any remaining mismatch is written down in the task notes or Qwen Code docs when it affects users.
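The "no unexplained must-match differences" criterion can be read as a simple filter over the remaining differences. The sketch below shows one possible way to express it; the `Difference` record, its fields, and the example paths are hypothetical and do not come from the skill's scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Difference:
    """Hypothetical record of one remaining trace difference."""

    path: str           # where the traces differ, e.g. "tools.shell.required"
    must_match: bool    # True if the parity target lists this as a must-match field
    note: str = ""      # documented reason the difference is intentional, if any


def unexplained(differences: list[Difference]) -> list[Difference]:
    """Return must-match differences that have no documented explanation."""
    return [d for d in differences if d.must_match and not d.note.strip()]


# Made-up differences for illustration.
differences = [
    Difference("model", must_match=False),
    Difference("tools.shell.required", must_match=True),
    Difference("messages[0].content", must_match=True, note="Prompt wording differs by design."),
]
blocking = unexplained(differences)
print([d.path for d in blocking])
# prints ['tools.shell.required']
```

Under this reading, the loop is done when `unexplained(...)` returns an empty list and the user-visible scenario passes.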

