Read a theory markdown file (model/algorithm explanation), then generate a companion Jupyter notebook (.ipynb) via a four-phase pipeline: Research → Plan → Implement → Review.
Every notebook contains TWO parallel paths in a SINGLE file:
- Learning path: understand the model by building and running its key components
- Engineering path: use mature toolchains for a production-grade implementation
Path depth and form adapt independently based on feasibility decisions. All implementations target free Colab (T4 GPU when available; must also run on CPU as a fallback). The final notebook includes interview-oriented extension content and project expression content.
Code Tutorial Notebook Generator
Read a theory markdown file → Research → Plan → Implement → Review → Output .ipynb.
Input
$ARGUMENTS — the absolute path to a theory markdown file.
Output
Same directory as input: <number>_<ModelName>_代码实战.ipynb
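The naming rule can be sketched with `pathlib`. This is only an illustration: the helper `output_notebook_path` and its `number` and `model_name` parameters are hypothetical, because this document does not say how the two placeholders are derived from the theory file. The example path is made up and assumes a POSIX-style filesystem.

```python
from pathlib import Path


def output_notebook_path(theory_md: str, number: str, model_name: str) -> Path:
    """Build <number>_<ModelName>_代码实战.ipynb next to the input file.

    `number` and `model_name` are hypothetical parameters; how they are
    obtained is not specified in this document.
    """
    source = Path(theory_md)
    if not source.is_absolute():
        raise ValueError("expected an absolute path to the theory markdown file")
    # Same directory as the input, with the fixed filename suffix.
    return source.parent / f"{number}_{model_name}_代码实战.ipynb"


result = output_notebook_path("/notes/dl/03_Transformer.md", "03", "Transformer")
print(result)
```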
Read the target markdown file. Extract a structured summary:
model_name: "Transformer"
paper: "Attention Is All You Need (Vaswani et al., 2017)"
task_type: "seq2seq_translation"  # classification / generation / contrastive / detection / ...
core_components:
  - name: "Scaled Dot-Product Attention"
    formula: "Attention(Q,K,V) = softmax(QK^T / sqrt(d_k)) V"
  - name: "Multi-Head Attention"
    formula: "MultiHead(Q,K,V) = Concat(head_1,...,head_h) W^O"
  - name: "Position-wise FFN"
    formula: "FFN(x) = max(0, xW1+b1)W2+b2"
  - name: "Positional Encoding"
    formula: "PE(pos,2i) = sin(pos/10000^{2i/d})"
  # ... all components
training_vs_inference_diff: true  # Does training differ from inference? (e.g. teacher forcing vs autoregressive)
engineering_ecosystem:
  libraries: ["transformers", "torch.nn.Transformer"]  # known industrial libs for this model
  maturity: "mature"  # "mature" | "limited" | "none"
  typical_usage: "inference-only"  # "inference-only" | "fine-tune" | "train"
Step 1.2: Web Search for API & Version Verification
Mandatory: Before writing ANY code using non-built-in APIs, search official documentation:
PyTorch API signatures (nn.Transformer, nn.MultiheadAttention, etc.)
batch_first conventions and default values
Dataset download URLs and class signatures
Version-specific breaking changes
Use WebSearch or mcp__context7__query-docs for this. Do NOT rely on memorized API knowledge.
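As a complement to the documentation search (not a replacement for it), the installed environment can be inspected directly. The sketch below uses only the standard library; the helper names `installed_version` and `param_default` are hypothetical. The PyTorch part runs only when `torch` is installed.

```python
import inspect
import json
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def param_default(func, name: str):
    """Return the default value of parameter `name` in the signature of `func`."""
    param = inspect.signature(func).parameters[name]  # KeyError if absent
    if param.default is inspect.Parameter.empty:
        raise ValueError(f"{name!r} has no default value")
    return param.default


# Standard-library demonstration so the sketch runs anywhere.
print(param_default(json.dumps, "sort_keys"))

# The same check applied to PyTorch, only when it is installed.
if installed_version("torch") is not None:
    import torch.nn as nn

    print("torch", installed_version("torch"))
    try:
        print("batch_first default:", param_default(nn.MultiheadAttention, "batch_first"))
    except (KeyError, ValueError) as err:
        print("batch_first could not be verified:", err)
```

Reading the signature from the installed library avoids guessing at defaults, but it only describes the local version, so it does not reveal version-specific breaking changes.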
Step 1.3: Choose Dataset
Select the simplest dataset that demonstrates the model's core behavior.
Hard constraints:
Free Colab compatible: Must work with Colab free tier (~12GB RAM, free T4 GPU available, no persistent storage)
GPU-aware: Training should leverage GPU when available (device = 'cuda' if torch.cuda.is_available() else 'cpu'); must also run on CPU (slower but functional). Target training time: < 5 min on free T4 GPU, < 15 min on CPU.
No manual download: Single URL or torchvision.datasets auto-download
Small scale: < 10k samples for training. Reduce if needed.
| Task Type | Dataset | Source | Est. Train Time (CPU) |
|---|---|---|---|
| Seq2Seq (Translation) | Tatoeba eng-fra 8k pairs | URL download | ~5 min |
| Image Classification | FashionMNIST (subset 5k) | torchvision.datasets | ~3 min |
| Image Generation (GAN) | MNIST (subset 5k) | torchvision.datasets | ~5 min |
| Contrastive Learning | CIFAR-10 (subset 5k, no labels) | torchvision.datasets | ~5 min |
| Self-supervised (MAE) | CIFAR-10 (subset 5k) | torchvision.datasets | ~5 min |
| Graph Tasks | Synthetic graph (inline code) | torch.randn + adjacency | ~2 min |
| Text Classification | Synthetic or IMDB subset 2k | URL or inline | ~3 min |
| Regression | Synthetic (torch.randn) | Inline generation | ~1 min |
Key principle: Don't let GPU/data requirements become a barrier to learning. Students should be able to run the notebook immediately after opening it.
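A minimal sketch of how the constraints combine, using the synthetic regression case: inline data, a subset under 10k samples, device selection with a CPU fallback, and a timed training loop. The model, hyperparameters, and sample counts are illustrative assumptions, not values prescribed by this document.

```python
import time

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset, TensorDataset

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"

# Synthetic regression data generated inline: y = 3x + 1 + noise.
x = torch.randn(20_000, 1)
y = 3.0 * x + 1.0 + 0.1 * torch.randn_like(x)
full = TensorDataset(x, y)

# Keep the training set small (< 10k samples), as the constraints require.
train_set = Subset(full, range(5_000))
loader = DataLoader(train_set, batch_size=256, shuffle=True)

model = nn.Linear(1, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

start = time.perf_counter()
for epoch in range(5):
    for xb, yb in loader:
        xb, yb = xb.to(device), yb.to(device)  # same code path on GPU and CPU
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
elapsed = time.perf_counter() - start

print(f"device={device} samples={len(train_set)} loss={loss.item():.4f} time={elapsed:.1f}s")
```

Printing the elapsed time makes it easy to compare a run against the 5-minute GPU and 15-minute CPU targets.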
Step 1.4: Two-Layer Feasibility Decision
Based on the model's complexity and the engineering ecosystem, make two independent decisions:
Decision 1 — Learning Path Depth
What is the deepest learning artifact that still runs stably?
| Level | Criterion | Example Models |
|---|---|---|
| L1: Full mini training | Model trainable from scratch on free Colab (< 5 min GPU / < 15 min CPU) with toy dataset | |
Rule: Always choose the most runnable option. Prefer train > fine-tune > inference-only.
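The preference rule can be written as a small selection function. This is a sketch under one assumption: the set of feasible actions has already been determined elsewhere and is passed in by the caller. The name `choose_action` is hypothetical.

```python
# Preference order taken from the rule: train > fine-tune > inference-only.
PREFERENCE = ("train", "fine-tune", "inference-only")


def choose_action(feasible):
    """Pick the first feasible action in the documented preference order."""
    for action in PREFERENCE:
        if action in feasible:
            return action
    raise ValueError("no feasible action given; expected at least one of " + ", ".join(PREFERENCE))


choice = choose_action({"fine-tune", "inference-only"})
print(choice)
```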
Decision 4 — Comparison Table Dimensions
Automatically filled during Phase 3 implementation. The notebook must produce a fixed comparison table with these 6 dimensions: educational value, code volume, flexibility, stability, industrial fitness, applicable scenarios.
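One way to keep the six dimensions fixed is to render the table from a constant list and reject incomplete input. The column headings and the function name `comparison_table` are assumptions made for this sketch. The `TBD` cells are placeholders, since real values are only known after Phase 3.

```python
DIMENSIONS = [
    "Educational value",
    "Code volume",
    "Flexibility",
    "Stability",
    "Industrial fitness",
    "Applicable scenarios",
]


def comparison_table(learning: dict, engineering: dict) -> str:
    """Render a markdown table with exactly the six fixed dimensions."""
    missing = [d for d in DIMENSIONS if d not in learning or d not in engineering]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    rows = ["| Dimension | Learning path | Engineering path |", "|---|---|---|"]
    rows += [f"| {d} | {learning[d]} | {engineering[d]} |" for d in DIMENSIONS]
    return "\n".join(rows)


placeholder = {d: "TBD" for d in DIMENSIONS}  # real values are filled in Phase 3
table = comparison_table(placeholder, placeholder)
print(table)
```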
Record all decisions:
learning_path_depth: "L1"  # L1 / L2 / L3
learning_rationale: "2-layer Transformer trainable from scratch on CPU in ~5 min"
engineering_path_form: "E1"  # E1 / E2 / E3
engineering_rationale: "HuggingFace transformers provides full Seq2Seq pipeline"
engineering_action: "inference-only"  # train / fine-tune / inference-only
action_rationale: "Full model too large to fine-tune on CPU; demo via generate()"
Step 1.5: Component Mapping Table (Mandatory)
After decisions, produce a component mapping table that will be embedded in the notebook. Every paper component must appear:
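A coverage check for such a mapping table could look like the sketch below. The row fields (`learning`, `engineering`, `status`) and the function name `check_mapping` are hypothetical. The three status values are the ones named in the Phase 4 review checklist.

```python
ALLOWED_STATUS = {"Runnable", "Illustrative", "Explain-only"}


def check_mapping(paper_components, mapping):
    """Return a list of problems; an empty list means the table is complete."""
    problems = [f"missing component: {name}" for name in paper_components if name not in mapping]
    for name, row in mapping.items():
        if row.get("status") not in ALLOWED_STATUS:
            problems.append(f"invalid status for {name}: {row.get('status')!r}")
    return problems


components = ["Scaled Dot-Product Attention", "Multi-Head Attention", "Positional Encoding"]
mapping = {
    "Scaled Dot-Product Attention": {
        "learning": "hand-written function",
        "engineering": "library attention layer",
        "status": "Runnable",
    },
    "Multi-Head Attention": {
        "learning": "hand-written module",
        "engineering": "library attention layer",
        "status": "Done",  # not an allowed status, so it is reported
    },
}
issues = check_mapping(components, mapping)
print(issues)
```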
The notebook must contain separate, clearly labeled code cells for training and inference logic.
Add a markdown cell before each to explain the difference.
Show how both paths handle the training/inference distinction.
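To make the distinction concrete, here is a toy sketch that contrasts teacher-forced training with autoregressive greedy decoding. `TinyDecoder` is a hypothetical GRU model chosen for brevity and is not the architecture from any theory file. The data is random, so the outputs carry no meaning beyond their shapes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
VOCAB, DIM, BOS = 12, 16, 0


class TinyDecoder(nn.Module):
    """Toy GRU language model used only to contrast the two code paths."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens, hidden=None):
        out, hidden = self.rnn(self.embed(tokens), hidden)
        return self.head(out), hidden


model = TinyDecoder()
target = torch.randint(1, VOCAB, (4, 6))  # (batch, seq_len)
target[:, 0] = BOS

# Training: teacher forcing, the ground-truth prefix is fed in one pass.
model.train()
logits, _ = model(target[:, :-1])  # (4, 5, VOCAB)
loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), target[:, 1:].reshape(-1))
loss.backward()

# Inference: greedy autoregressive decoding, one token per step.
model.eval()
with torch.no_grad():
    token = torch.full((4, 1), BOS, dtype=torch.long)
    hidden, steps = None, []
    for _ in range(5):
        step_logits, hidden = model(token, hidden)  # (4, 1, VOCAB)
        token = step_logits.argmax(dim=-1)  # feed the prediction back in
        steps.append(token)
    generated = torch.cat(steps, dim=1)  # (4, 5)

print(loss.item(), tuple(generated.shape))
```

In a notebook the two halves would live in separate, labeled cells, each preceded by a markdown cell that explains why the input to the model differs.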
Step 2.3: Plan the Interview & Project Expression Section
Search for interview-relevant content about this model/algorithm:
Common interview questions (e.g. "Why does Transformer use Layer Norm instead of Batch Norm?")
Typical follow-up questions and edge cases
Comparison with related models that interviewers often ask about
Key concepts that are frequently tested
Engineering-oriented questions (e.g. "What is use_cache?", "Why batch inference?")
Use WebSearch with queries like:
"<model_name> 面试题" or "<model_name> interview questions deep learning"
"<model_name> 常见问题" or "<model_name> FAQ"
Also plan the project expression content: how to frame this model in a project interview context.
Phase 3: Implement
Step 3.1: Write Notebook Cells
Use NotebookEdit with edit_mode=insert to build the notebook cell by cell, following the plan from Phase 2.
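For orientation, the file that results from those inserts is plain JSON in the nbformat 4 layout. The sketch below builds a two-cell notebook in memory with the standard library. It is meant to show the target shape, not to replace NotebookEdit. The helper names and the cell contents are illustrative.

```python
import json


def markdown_cell(text: str) -> dict:
    """Minimal nbformat 4 markdown cell."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(code: str) -> dict:
    """Minimal nbformat 4 code cell that has not been executed yet."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": code,
    }


notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        code_cell("# Cell 0: environment setup goes here"),
        markdown_cell("## Section ii: Minimal Necessary Theory"),
    ],
}

# ensure_ascii=False keeps Chinese markdown text readable in the file.
serialized = json.dumps(notebook, ensure_ascii=False, indent=1)
print([cell["cell_type"] for cell in json.loads(serialized)["cells"]])
```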
Notebook Structure Specification
Cell 0: Environment Setup (code) — MANDATORY
This cell is REQUIRED as the very first code cell in every notebook. It must:
List all third-party packages the notebook needs (torch, torchvision, transformers, torch_geometric, etc.)
Provide both Colab (commented !pip install) and local Jupyter (subprocess) install paths
Be placed before any import statement so the notebook runs top-to-bottom on a fresh environment
Adapt the package list below to match the notebook's actual dependencies:
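A possible shape for Cell 0 is sketched here. The package list is an example, and the helper names `missing_packages` and `install` are hypothetical. The demo call uses `dry_run=True` so that it only reports the pip command; in a real notebook the local install path would be active, which means `dry_run=False`.

```python
# Colab: uncomment the next line instead of using the subprocess path below.
# !pip install -q torch torchvision transformers

import importlib.util
import subprocess
import sys

# import name -> pip name; adapt this to the notebook's real dependencies.
REQUIRED = {"torch": "torch", "torchvision": "torchvision", "transformers": "transformers"}


def missing_packages(required: dict) -> list:
    """Return pip names whose import name cannot be found in this environment."""
    return [pip for mod, pip in required.items() if importlib.util.find_spec(mod) is None]


def install(packages: list, dry_run: bool = False) -> list:
    """Install packages with the current interpreter's pip; return the command used."""
    if not packages:
        return []
    cmd = [sys.executable, "-m", "pip", "install", "-q", *packages]
    if not dry_run:
        subprocess.check_call(cmd)
    return cmd


# dry_run=True only reports the command; set it to False in the real notebook.
print(install(missing_packages(REQUIRED), dry_run=True))
```

Using `sys.executable -m pip` installs into the interpreter that runs the kernel, which avoids the common mismatch between the `pip` on the shell path and the notebook environment.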
[markdown] Paper citation, task definition, why this specific task demonstrates the model's core behavior
[markdown] (optional) Architecture family overview — where this model sits in its lineage (e.g. RNN → Attention → Transformer → BERT/GPT)
[markdown] Scope: what this notebook covers vs what it does not
Section ii: Minimal Necessary Theory (2-4 cells)
[markdown] Only the formulas and concepts needed to understand the code that follows. This is NOT a full theory recap — reference the companion .md for that.
[markdown] Component mapping table from Step 1.5 (mandatory). This table anchors both paths and lets readers see the full picture before diving into code.
[code] Inference demo (if not already shown in Sections iii/iv)
[markdown] Key insight: "The same fundamental algorithm, at different abstraction levels"
Section vii: Results & Boundaries (2-3 cells)
[code] Side-by-side loss curves / metrics / inference examples (where both paths produce comparable outputs)
[markdown] Boundaries of each path:
Learning path: "What you gain (deep understanding, full control) and what you lose (stability, scalability, engineering efficiency)"
Engineering path: "What you gain (speed, stability, production readiness) and what you lose (transparency, fine-grained control)"
[markdown] When to use which path in practice
Plot language rule: All plt.xlabel(), plt.ylabel(), plt.title(), plt.legend(), and plt.annotate() must use English text only (avoids matplotlib CJK font issues).
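The plot language rule can be checked mechanically by treating "ASCII only" as a rough proxy for "English only". That proxy is an assumption of this sketch, and `non_english_labels` is a hypothetical helper. With matplotlib, the strings could be collected from an axes object through `ax.get_title()`, `ax.get_xlabel()`, and `ax.get_ylabel()`.

```python
def non_english_labels(labels: dict) -> list:
    """Return the keys whose text contains non-ASCII characters (e.g. CJK)."""
    return [key for key, text in labels.items() if not text.isascii()]


labels = {"title": "Training loss", "xlabel": "Epoch", "ylabel": "损失"}
offenders = non_english_labels(labels)
print(offenders)
```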
Appendix A: Visualizations (optional, 2-3 cells)
Attention weight heatmap
Feature map / embedding visualization
Positional encoding pattern
Dimension sanity check
All figure text (titles, axis labels, legends, annotations) must be English.
Engineering questions (e.g. "What is use_cache?", "Why batch inference?", "How to choose decoding strategy?")
Phase 4: Review
Step 4.1: Self-Review Checklist
Before finalizing, verify every item:
Runs top-to-bottom: No undefined variables, correct cell execution order, no circular dependencies
Colab free-tier: Training < 5 min on free T4 GPU / < 15 min on CPU, RAM < 12GB; notebook must run on both GPU and CPU (device-agnostic)
Environment setup cell: Cell 0 is a code cell that installs ALL required packages (both Colab !pip commented + local subprocess active). It must list every third-party dependency the notebook uses — missing packages will cause ImportError on fresh environments
Dual-path completeness: Both learning and engineering paths are present and clearly labeled
Component mapping table: Present in Section ii, covers all paper components with status (Runnable/Illustrative/Explain-only)
No fabrication: Only implements what the theory file describes
Loss curve: Plotted after training (where training occurs)
Plot language: All matplotlib text (titles, labels, legends) is English only
Results & Boundaries: Section vii summarizes findings, trade-offs, and path boundaries
Device-agnostic: All tensors and models use .to(device), notebook runs on both GPU and CPU without code changes
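Part of the device-agnostic item can be automated with a simple text scan over the code cells. The sketch below flags cells that pin work to CUDA through `.cuda()` or `.to("cuda")`. The regular expression is a heuristic that will miss other patterns, and the function name is hypothetical. A notebook on disk could be loaded with `json.load` before being passed in.

```python
import re

# Heuristic: direct .cuda() calls or .to("cuda") / .to('cuda:0') style calls.
HARD_CODED_GPU = re.compile(r"\.cuda\(|\.to\(\s*['\"]cuda")


def hard_coded_gpu_cells(notebook: dict) -> list:
    """Return indices of code cells that pin tensors or models to CUDA."""
    hits = []
    for index, cell in enumerate(notebook["cells"]):
        if cell["cell_type"] != "code":
            continue
        source = cell["source"]
        text = "".join(source) if isinstance(source, list) else source
        if HARD_CODED_GPU.search(text):
            hits.append(index)
    return hits


demo = {
    "cells": [
        {"cell_type": "code", "source": "model = model.cuda()"},
        {"cell_type": "code", "source": ["x = x.to(device)\n", "y = y.to(device)"]},
        {"cell_type": "markdown", "source": "Avoid calling .cuda() directly."},
        {"cell_type": "code", "source": "device = 'cuda' if torch.cuda.is_available() else 'cpu'"},
    ]
}
flagged = hard_coded_gpu_cells(demo)
print(flagged)
```

The recommended `device = 'cuda' if torch.cuda.is_available() else 'cpu'` idiom is deliberately not matched, and markdown cells are skipped.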
Step 4.2: Codex MCP Code Review (mandatory)
After writing the complete notebook, invoke mcp__codex__codex to review the generated code:
Prompt: "Review this Jupyter notebook for correctness. Check:
1. Will all cells run top-to-bottom without errors?
2. Are tensor shapes consistent across all operations?
3. Are there any undefined variables or import issues?
4. Is the training loop correct (loss computation, backward, optimizer step)?
5. Are masks constructed correctly (padding mask, causal mask)?
6. Does the inference logic match the training logic appropriately?
7. Do both learning and engineering paths use consistent data and produce comparable outputs?
Report any bugs or issues found."
cd: <notebook_directory>
sandbox: read-only
If Codex reports issues, fix them before finalizing.
Code Style Rules
| Rule | Detail |
|---|---|
| Framework | PyTorch primary. Minimize external deps. |
| Language | Variable names: English. Comments: Chinese + English. Markdown cells: Chinese. Plots (title / axis / legend / annotation): English only. |