auto-paper-improvement-loop

Autonomously improve a generated paper via GPT-5.4 xhigh review → implement fixes → recompile, for 2 rounds. Use when user says "改论文" (revise the paper), "improve paper", "论文润色循环" (paper polishing loop), "auto improve", or wants to iteratively polish a generated paper.

Install command

npx skills add https://github.com/tqLi99/claude-skills-for-writing --skill auto-paper-improvement-loop

Copy and paste this command into Claude Code to install the skill

Source

tqLi99/claude-skills-for-writing

Stars: 0

Forks: 0

Updated: April 2, 2026 at 02:21

SKILL.md

Auto Paper Improvement Loop: Review → Fix → Recompile

Autonomously improve the paper at: $ARGUMENTS

Context

This skill is designed to run after Workflow 3 (/paper-plan → /paper-figure → /paper-write → /paper-compile). It takes a compiled paper and iteratively improves it through external LLM review.

Unlike /auto-review-loop (which iterates on research — running experiments, collecting data, rewriting narrative), this skill iterates on paper writing quality — fixing theoretical inconsistencies, softening overclaims, adding missing content, and improving presentation.

For control, robotics, and systems papers targeting IEEE Transactions, treat this as a reviewer-style journal polishing loop, not a conference-style cosmetic pass. The default priority is:

claim-evidence alignment
theorem-assumption-proof consistency
comparison fairness and adequacy
IEEE Trans writing discipline
final formatting and compilation hygiene

Constants

MAX_ROUNDS = 2 — Two rounds of review→fix→recompile. Empirically, Round 1 catches structural issues (4→6/10), Round 2 catches remaining presentation issues (6→7/10). Diminishing returns beyond 2 rounds for writing-only improvements.
REVIEWER_MODEL = gpt-5.4 — Model used via Codex MCP for paper review.
REVIEW_LOG = PAPER_IMPROVEMENT_LOG.md — Cumulative log of all rounds, stored in paper directory.
TARGET_VENUE = IEEE_TRANS — Default venue family. Supported: IEEE_TAC, IEEE_TSMC, IEEE_TCYB, IEEE_TIE, IEEE_TNNLS, IEEE_TCSI, IEEE_ACCESS, AUTOMATICA, ICLR, NeurIPS, ICML.
HUMAN_CHECKPOINT = false — When true, pause after each round's review and present score + weaknesses to the user. The user can approve fixes, provide custom modification instructions, skip specific fixes, or stop early. When false (default), runs fully autonomously.
AUTO_PROCEED = true — When false, stop after each review round even if HUMAN_CHECKPOINT = false.

💡 Override: /auto-paper-improvement-loop "paper/" — venue: IEEE_TAC, human checkpoint: true

Inputs

Compiled paper — paper/main.pdf + LaTeX source files
All section .tex files — concatenated for review prompt
Optional claim artifacts — CLAIMS_FROM_RESULTS.md, PAPER_PLAN.md, AUTO_REVIEW.md, findings.md

Project Automation Policy

Before acting, resolve automation defaults in this precedence order:

Inline command arguments
PROJECT_AUTOMATION.md in the project root
CLAUDE.md in the project root
The constants in this skill
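
The precedence order above can be sketched as a small lookup helper. This is a minimal illustration, not the skill's actual mechanism: the function name `resolve_setting` is hypothetical, and it assumes the policy files store settings as plain `KEY = value` lines, which the skill does not specify.

```shell
# Hypothetical helper: resolve one automation setting by precedence.
# Assumption: policy files contain lines of the form "KEY = value".
# Usage: resolve_setting KEY INLINE_VALUE DEFAULT [PROJECT_ROOT]
resolve_setting() (
    key=$1
    inline=$2
    default=$3
    root=${4:-.}

    # 1. Inline command arguments win when provided.
    if [ -n "$inline" ]; then
        printf '%s\n' "$inline"
        exit 0
    fi

    # 2-3. PROJECT_AUTOMATION.md, then CLAUDE.md, both in the project root.
    for file in "$root/PROJECT_AUTOMATION.md" "$root/CLAUDE.md"; do
        [ -f "$file" ] || continue
        value=$(sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*/\1/p" "$file" | head -n 1)
        if [ -n "$value" ]; then
            printf '%s\n' "$value"
            exit 0
        fi
    done

    # 4. Fall back to the constant defined in this skill.
    printf '%s\n' "$default"
)
```

For example, `resolve_setting AUTO_PROCEED "" true` would print the value from PROJECT_AUTOMATION.md if that file sets it, then fall through to CLAUDE.md, and only then to the default `true`.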

For control, robotics, and systems papers, a good default is:

AUTO_PROCEED = false
HUMAN_CHECKPOINT = true

That keeps the loop autonomous at the edit level while still requiring your approval at each reviewer gate.

Before sending the paper to REVIEWER_MODEL, read:

../shared-references/agent-role-charter.md
../shared-references/anti-ai-writing.md
../shared-references/writing-principles.md
../shared-references/model-routing-policy.md

Routing:

gpt-5.4 owns the reviewer gate
Sonnet owns local rewrite application, citation cleanup, notation cleanup, and other mechanical fixes
Opus is used only if the review forces a story-level rewrite, claim contraction, or final whole-paper integration pass

State Persistence (Compact Recovery)

If the context window fills up mid-loop, Claude Code auto-compacts. To recover, this skill writes PAPER_IMPROVEMENT_STATE.json after each round:

{
  "current_round": 1,
  "threadId": "019ce736-...",
  "last_score": 6,
  "status": "in_progress",
  "timestamp": "2026-03-13T21:00:00"
}

On startup: if PAPER_IMPROVEMENT_STATE.json exists with "status": "in_progress" AND timestamp is within 24 hours, read it + PAPER_IMPROVEMENT_LOG.md to recover context, then resume from the next round. Otherwise (file absent, "status": "completed", or older than 24 hours), start fresh.

After each round: overwrite the state file. On completion: set "status": "completed".
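
The startup rule can be sketched as follows. This is an illustration under stated assumptions rather than the skill's own code: the function name `should_resume` is hypothetical, the field extraction relies on the flat one-field-per-line JSON layout shown above (a real JSON parser would be more robust), and the timestamp conversion assumes a `date` that supports `-d`, such as GNU date.

```shell
# Sketch: decide whether to resume from PAPER_IMPROVEMENT_STATE.json.
# Assumptions: flat JSON as in the example above; a date command with -d support.
# Usage: should_resume STATE_FILE [NOW_EPOCH]   -> prints "resume" or "fresh"
should_resume() (
    state=$1
    now=${2:-$(date +%s)}

    # No state file: start fresh.
    [ -f "$state" ] || { echo fresh; exit 0; }

    status=$(sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$state" | head -n 1)
    stamp=$(sed -n 's/.*"timestamp"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$state" | head -n 1)

    # Only an in-progress run with a readable timestamp can be resumed.
    [ "$status" = "in_progress" ] || { echo fresh; exit 0; }
    [ -n "$stamp" ] || { echo fresh; exit 0; }

    # Convert "2026-03-13T21:00:00" to epoch seconds (local time zone).
    saved=$(date -d "$(printf '%s' "$stamp" | tr 'T' ' ')" +%s 2>/dev/null) || { echo fresh; exit 0; }
    age=$((now - saved))

    # Resume only when the state is less than 24 hours old.
    if [ "$age" -ge 0 ] && [ "$age" -lt 86400 ]; then
        echo resume
    else
        echo fresh
    fi
)
```

Passing the current time as an optional second argument keeps the check easy to exercise with fixed inputs. Treating a timestamp in the future as stale is a choice made in this sketch, not something the skill states.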

Workflow

Step 0: Preserve Original

cp paper/main.pdf paper/main_round0_original.pdf

Step 1: Collect Paper Text

Concatenate all section files into a single text block for the review prompt:

# Collect all sections in order.
# The glob expands alphabetically, so name files in reading order (e.g., 01_intro.tex).
for f in paper/sections/*.tex; do
    echo "% === $(basename "$f") ==="
    cat "$f"
done > /tmp/paper_full_text.txt

Also gather any structured context that constrains valid edits:

PAPER_PLAN.md for claims-evidence mapping
CLAIMS_FROM_RESULTS.md for supported claim boundaries
AUTO_REVIEW.md or findings.md for known weaknesses already diagnosed

If these files exist, include their essential conclusions in the review briefing so the reviewer judges the paper against the actual supported claims, not an inflated reading of the prose.

Step 2: Round 1 Review

Send the full paper text to GPT-5.4 xhigh:

mcp__codex__codex:
  model: gpt-5.4
  config: {"model_reasoning_effort": "xhigh"}
  prompt: |
    You are reviewing a [TARGET_VENUE] paper. Please provide a detailed, structured review.

    ## Full Paper Text:
    [paste concatenated sections]

    ## Optional Structured Context
    [paste claims-from-results summary, paper plan highlights, prior reviewer conclusions if available]

    ## Review Instructions
    Please act as a senior professor in multi-agent control, robotics, and dynamical
    systems, with extensive journal reviewer and associate-editor experience. Also act
    like a hard scientific writing editor who can detect AI-looking prose immediately.
    If TARGET_VENUE is an IEEE Transactions journal, use the standards of a strong
    control / robotics / systems journal reviewer rather than a conference reviewer. Provide:
    1. **Overall Score** (1-10, where 6 = weak accept, 7 = accept)
    2. **Summary** (2-3 sentences)
    3. **Strengths** (bullet list, ranked)
    4. **Weaknesses** (bullet list, ranked: CRITICAL > MAJOR > MINOR)
    5. **For each CRITICAL/MAJOR weakness**: A specific, actionable fix
    6. **Missing References** (if any)
    7. **Verdict**: Ready for submission? Yes / Almost / No

    Focus on:
    - theoretical rigor
    - claims vs evidence alignment
    - writing clarity and self-containedness
    - notation consistency
    - AI-like phrasing, generic transitions, and hype language
    - whether the prose sounds like a field-native journal manuscript rather than LLM output

    If TARGET_VENUE is IEEE_TRANS or one of IEEE_TAC / IEEE_TSMC / IEEE_TCYB / IEEE_TIE / IEEE_TNNLS / IEEE_TCSI / IEEE_ACCESS, explicitly evaluate:
    - whether every main contribution is backed by a theorem, proposition, simulation, or comparison
    - whether assumptions are stronger than necessary or insufficiently justified
    - whether proof flow has hidden gaps, missing definitions, or informal leaps
    - whether simulation scenarios actually validate the claimed theory
    - whether comparisons to classical or representative baselines are missing
    - whether Abstract / Introduction / Conclusion read like an IEEE Transactions paper rather than AI-generated prose
    - whether claims are overstated relative to what is proved and simulated

    Do not ask for impossible fixes. Prefer the minimum grounded fix that materially improves publishability.
    When flagging AI-like writing, quote the concrete sentence pattern and suggest a more field-native rewrite direction.

Save the threadId for Round 2.

Step 2b: Human Checkpoint (if enabled)

Skip only if HUMAN_CHECKPOINT = false AND AUTO_PROCEED = true.

Present the review results and wait for user input:

📋 Round 1 review complete.

Score: X/10 — [verdict]
Key weaknesses (by severity):
1. [CRITICAL] ...
2. [MAJOR] ...
3. [MINOR] ...

Reply "go" to implement all fixes, give custom instructions, "skip 2" to skip specific fixes, or "stop" to end.

Parse the user response the same way /auto-review-loop does: approve, custom instructions, skip, or stop.

If AUTO_PROCEED = false, this checkpoint is mandatory even when HUMAN_CHECKPOINT = false.
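
The interaction of the two flags reduces to a single condition. A minimal sketch, assuming both settings arrive as the literal strings `true` or `false`; the function name `must_pause` is illustrative:

```shell
# Succeeds (exit status 0) when the loop must pause for the user after a review.
# The checkpoint is skipped only if HUMAN_CHECKPOINT=false AND AUTO_PROCEED=true.
must_pause() {
    human_checkpoint=$1
    auto_proceed=$2
    if [ "$human_checkpoint" = "false" ] && [ "$auto_proceed" = "true" ]; then
        return 1   # fully autonomous: no pause
    fi
    return 0       # every other combination pauses
}

# Print the full decision table for the four flag combinations.
for hc in true false; do
    for ap in true false; do
        if must_pause "$hc" "$ap"; then decision=pause; else decision=continue; fi
        echo "HUMAN_CHECKPOINT=$hc AUTO_PROCEED=$ap -> $decision"
    done
done
```

Only the combination HUMAN_CHECKPOINT=false, AUTO_PROCEED=true prints `continue`; the other three print `pause`.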

Step 3: Implement Round 1 Fixes

When fixes touch disjoint files or disjoint paper slices, they may be applied in parallel. Keep the review gate itself serial.

Parse the review and implement fixes by severity:

Priority order:

CRITICAL fixes (assumption mismatches, internal contradictions)
MAJOR fixes (overclaims, missing content, notation issues)
MINOR fixes (if time permits)

Trans-first priority override: if the target is an IEEE Transactions journal, apply this ordering inside CRITICAL/MAJOR:

claim-theorem-simulation mismatches
unjustified or missing assumptions
missing baseline/comparison framing
weak introduction story or contribution bullets
conclusion, abstract, and language polish

Common fix patterns:

| Issue | Fix Pattern |
|-------|-------------|
| Assumption-model mismatch | Rewrite assumption to match the model, add formal proposition bridging the gap |
| Overclaims | Soften language: "validate" → "demonstrate practical relevance", "comparable" → "qualitatively competitive" |
| Missing metrics | Add quantitative table with honest parameter counts and caveats |
| Theorem not self-contained | Add "Interpretation" paragraph listing all dependencies |
| Notation confusion | Rename conflicting symbols globally, add Notation paragraph |
| Missing references | Add to `references.bib`, cite in appropriate locations |
| Theory-practice gap | Explicitly frame theory as idealized; add synthetic validation subsection |
| Missing theorem-to-simulation mapping | Add a short bridge sentence or paragraph in Intro / Main Results / Simulation identifying which result validates which claim |
| IEEE-style abstract too generic | Rewrite into problem → method → theorem/result → validation structure; remove hype and background filler |
| Introduction reads like conference paper | Expand related-work synthesis, surface the technical gap earlier, and make contributions concrete and falsifiable |
| Missing limitations / scope boundary | Add an honest remark in Introduction, Discussion, or Conclusion about what is not claimed |
| AI-like prose | Remove filler transitions, repeated sentence templates, hype adjectives, and generic motivation; replace with paper-specific technical content |

Step 4: Recompile Round 1

cd paper && latexmk -C && latexmk -pdf -interaction=nonstopmode -halt-on-error main.tex
cp main.pdf main_round1.pdf

Verify: 0 undefined references, 0 undefined citations.
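
One way to run this verification is to count the relevant warnings in the compile log. This is a sketch, not part of the skill: the function name `count_undefined` is illustrative, and the pattern relies on the usual LaTeX and natbib warning wording. TeX wraps long log lines, so a warning with a very long key may be split across lines and missed.

```shell
# Sketch: count undefined-reference or undefined-citation warnings in a LaTeX log.
# Matches the usual wording, e.g.
#   LaTeX Warning: Reference `fig:x' on page 3 undefined on input line 120.
# Usage: count_undefined LOGFILE Reference|Citation
count_undefined() (
    log=$1
    kind=$2
    # A missing log means nothing to count.
    [ -f "$log" ] || { echo 0; exit 0; }
    # grep -c prints the number of matching lines (0 when there are none).
    n=$(grep -c "Warning: $kind .* undefined" "$log")
    echo "${n:-0}"
)
```

Run from the paper directory after compilation, `count_undefined main.log Reference` and `count_undefined main.log Citation` should both print 0 before the loop proceeds.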

Step 5: Round 2 Review

Use mcp__codex__codex-reply with the saved threadId:

mcp__codex__codex-reply:
  threadId: [saved from Round 1]
  model: gpt-5.4
  config: {"model_reasoning_effort": "xhigh"}
  prompt: |
    [Round 2 update]

    Since your last review, we have implemented:
    1. [Fix 1]: [description]
    2. [Fix 2]: [description]
    ...

    Please re-score and re-assess. Same format:
    Score, Summary, Strengths, Weaknesses, Actionable fixes, Verdict.

    Pay special attention to whether any remaining claims are still too strong for the available theory and simulations.

Step 5b: Human Checkpoint (if enabled)

Skip only if HUMAN_CHECKPOINT = false AND AUTO_PROCEED = true. Same as Step 2b — present Round 2 review, wait for user input.

Step 6: Implement Round 2 Fixes

Same process as Step 3. Typical Round 2 fixes:

Add controlled synthetic experiments validating theory
Further soften any remaining overclaims
Formalize informal arguments (e.g., truncation → formal proposition)
Strengthen limitations section

Step 7: Recompile Round 2

cd paper && latexmk -C && latexmk -pdf -interaction=nonstopmode -halt-on-error main.tex
cp main.pdf main_round2.pdf

Step 8: Format Check

After the final recompilation, run a format compliance check:

# 1. Page count vs venue limit
PAGES=$(pdfinfo paper/main.pdf | awk '/^Pages:/ {print $2}')
echo "Pages: $PAGES"

# 2. Overfull hbox warnings (content exceeding margins)
# grep -c already prints 0 when nothing matches, so default only when the log is missing
OVERFULL=$(grep -c "Overfull" paper/main.log 2>/dev/null)
echo "Overfull hbox warnings: ${OVERFULL:-0}"
grep "Overfull" paper/main.log 2>/dev/null | head -10

# 3. Underfull hbox warnings (loose spacing)
UNDERFULL=$(grep -c "Underfull" paper/main.log 2>/dev/null)
echo "Underfull hbox warnings: ${UNDERFULL:-0}"

# 4. Bad boxes summary
BADNESS=$(grep -c "badness" paper/main.log 2>/dev/null)
echo "Badness warnings: ${BADNESS:-0}"

Auto-fix patterns:

| Issue | Fix |
|-------|-----|
| Overfull hbox in equation | Wrap in `\resizebox` or break the equation with the `split` or `aligned` environment |
| Overfull hbox in table | Reduce font (`\small`/`\footnotesize`) or use `\resizebox{\linewidth}{!}{...}` |
| Overfull hbox in text | Rephrase sentence or add `\allowbreak` / `\-` hints |
| Over page limit | Move content to appendix, compress tables, reduce figure sizes |
| Underfull hbox (loose) | Rephrase for better line filling or add `\looseness=-1` |

If any overfull hbox > 10pt is found, fix it and recompile before documenting.
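
The 10pt threshold can be checked directly against the log. A minimal sketch, assuming the usual log wording `Overfull \hbox (12.3pt too wide) ...`; the function name `overfull_over` is illustrative:

```shell
# Sketch: print Overfull \hbox warnings wider than a threshold in points (default 10).
# Usage: overfull_over LOGFILE [LIMIT_PT]
overfull_over() (
    log=$1
    limit=${2:-10}
    # A missing log produces no output.
    [ -f "$log" ] || exit 0
    awk -v limit="$limit" '
        /Overfull \\hbox \(/ {
            amount = $0
            sub(/.*Overfull \\hbox \(/, "", amount)   # drop text before the number
            sub(/pt too wide.*/, "", amount)          # drop text after the number
            if (amount + 0 > limit) print             # keep only boxes over the limit
        }
    ' "$log"
)
```

Any output from `overfull_over paper/main.log` marks a box that should be fixed and recompiled before the results are documented.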

For IEEE Transactions venues, also check:

abstract length and style
presence of Index Terms
appendix placement and proof references
readability of all figures and tables in two-column format
whether the manuscript stays within a typical journal page range instead of drifting into unnecessary length

Step 9: Document Results

Create PAPER_IMPROVEMENT_LOG.md in the paper directory:

# Paper Improvement Log

## Score Progression

| Round | Score | Verdict | Key Changes |
|-------|-------|---------|-------------|
| Round 0 (original) | X/10 | No/Almost/Yes | Baseline |
| Round 1 | Y/10 | No/Almost/Yes | [summary of fixes] |
| Round 2 | Z/10 | No/Almost/Yes | [summary of fixes] |

## Round 1 Review & Fixes

<details>
<summary>GPT-5.4 xhigh Review (Round 1)</summary>

[Full raw review text, verbatim]

</details>

### Fixes Implemented
1. [Fix description]
2. [Fix description]
...

## Round 2 Review & Fixes

<details>
<summary>GPT-5.4 xhigh Review (Round 2)</summary>

[Full raw review text, verbatim]

</details>

### Fixes Implemented
1. [Fix description]
2. [Fix description]
...

## PDFs
- `main_round0_original.pdf` — Original generated paper
- `main_round1.pdf` — After Round 1 fixes
- `main_round2.pdf` — Final version after Round 2 fixes

Step 10: Summary

Report to user:

Score progression table
Number of CRITICAL/MAJOR/MINOR issues fixed per round
Final page count
Remaining issues (if any)

Feishu Notification (if configured)

After each round's review AND at final completion, check ~/.claude/feishu.json:

After each round: Send review_scored — "Round N: X/10 — [key changes]"
After final round: Send pipeline_done — score progression table + final page count
If config absent or mode "off": skip entirely (no-op)

Output

paper/
├── main_round0_original.pdf    # Original
├── main_round1.pdf             # After Round 1
├── main_round2.pdf             # After Round 2 (final)
├── main.pdf                    # = main_round2.pdf
└── PAPER_IMPROVEMENT_LOG.md    # Full review log with scores

Key Rules

Large file handling: If the Write tool fails due to file size, immediately retry using Bash (cat << 'EOF' > file) to write in chunks. Do NOT ask the user for permission — just do it silently.
Preserve all PDF versions — user needs to compare progression
Save FULL raw review text — do not summarize or truncate GPT-5.4 responses
Use mcp__codex__codex-reply for Round 2 to maintain conversation context
Always recompile after fixes — verify 0 errors before proceeding
Do not fabricate experimental results — synthetic validation must describe methodology, not invent numbers
Respect the paper's claims — soften overclaims rather than adding unsupported new claims
Global consistency — when renaming notation or softening claims, check ALL files (abstract, intro, method, experiments, theory sections, conclusion, tables, figure captions)
For IEEE Trans targets, theory and validation outrank prose polish — do not spend the loop on cosmetics while claim support is still weak
Do not silently add new contributions — only sharpen, narrow, justify, or reorganize existing supported contributions
Prefer explicit scope boundaries to vague optimism — a precise limitation is better than an inflated claim

Typical Score Progression

Typical trajectories differ by venue family:

IEEE Trans-style control / robotics / systems paper

| Round | Score | Key Improvements |
|-------|-------|------------------|
| Round 0 | 4-6/10 | Baseline: weak claim-evidence mapping, assumptions unclear, simulations undersold or misaligned |
| Round 1 | 6-7/10 | Fixed assumptions, narrowed claims, strengthened Intro / Main Results / Simulation connections |
| Round 2 | 7-8/10 | Added scope boundaries, comparison framing, theorem interpretation, cleaner IEEE journal prose |

Conference-style paper

| Round | Score | Key Improvements |
|-------|-------|------------------|
| Round 0 | 4-6/10 | Baseline: structure, overclaims, notation issues |
| Round 1 | 6-7/10 | Main content fixes |
| Round 2 | 7-8/10 | Presentation and compliance fixes |

For control, robotics, and systems work, the useful target is usually not "more polish" but "fewer unjustified claims and cleaner theorem-to-evidence closure."

name	auto-paper-improvement-loop
description	Autonomously improve a generated paper via GPT-5.4 xhigh review → implement fixes → recompile, for 2 rounds. Use when user says "改论文", "improve paper", "论文润色循环", "auto improve", or wants to iteratively polish a generated paper.
argument-hint	["paper-directory"]
allowed-tools	Bash(*), Read, Write, Edit, Grep, Glob, Agent, mcp__codex__codex, mcp__codex__codex-reply
