---
name: distil-cli
version: 4.0.0
description: >-
  Train task-specific small language models (SLMs) using the Distil Labs CLI
  and platform. Activate this skill when the user asks about: distil labs,
  distil CLI, the distil command, training small language models, knowledge
  distillation, creating SLMs from production traces, data preparation for
  model training, fine-tuning student models from teacher models, deploying
  trained SLMs, model evaluation metrics, task-specific model training,
  classification/QA/tool-calling model training, synthetic data generation
  for training, or any question about the distil platform, its configuration,
  or its workflows. Also activate when the user mentions: distil model,
  distil login, distil register, uploading training data, teacher evaluation,
  model retuning, model deployment with llama-cpp or vLLM, or when working
  with config.yaml / job_description.json / train.csv files for model training.
---
# Distil CLI
## Environment
| Environment | What you can do |
|---|---|
| Claude Code | Full end-to-end workflow: run CLI commands, prepare data, train, deploy |
| Claude.ai (browser) | Data preparation guidance only: help choose task types, create config files, format datasets. User runs CLI commands themselves. |
Windows users: the Distil CLI does not support Windows natively. Without WSL, CLI commands won't run, so users won't get the full end-to-end experience; they can fall back to the REST API via references/api-reference.md, or run the CLI inside WSL (see the sketch below). Call this out early if the user is on Windows.
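If WSL is available, the standard Linux install should work unchanged inside a WSL shell. This is a sketch under that assumption; the commands are the same ones listed in the Quickstart Checklist below.

```bash
# Inside a WSL shell (e.g. Ubuntu), install and update the CLI as on Linux
curl -fsSL https://cli-assets.distillabs.ai/install.sh | sh
distil update
distil login
```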
## Default to Workflows for Training Tasks
If the user wants to train a model end-to-end (any phrasing: "help me train a model", "build me an SLM", "I want to fine-tune for X"), do NOT answer ad hoc from the reference files. Instead:
- Initialize the run log. If this is a model-building workflow (not a pure lookup / Q&A), create model-building-log-<name>.md at the project root and write the opening entry (<name> = a descriptive slug for this project, which we'll later also use for distil model create); see the sketch after this list. See references/tasks/maintain-run-log.md for format and append triggers. This fires once per project regardless of which workflow branch (dataset vs. traces) the user ends up on; individual workflow Step 0s do not duplicate the init.
- Ask one disambiguating question: "Do you have a labeled dataset to start from, or production traces from an existing LLM application?"
- Load the matching workflow:
  - Dataset → workflows/dataset-to-model.md
  - Traces → workflows/traces-to-model.md
- Follow the workflow step-by-step. The workflow tells you which references to read at each step.
The workflows encode the right sequence, decision points, and checkpoints (e.g., confirming config before kicking off a 6+ hour training job, approving a trace-generated test set). Skipping the workflow and answering Q&A-style usually leads to skipped steps and missing analysis reports.
For specific lookup questions ("what's the API endpoint for X", "what does parameter Y mean", "how do I download predictions"), continue to use the routing table below — those don't need a workflow and don't need a run log.
## Intent Detection
Route the user to the right reference file based on their intent. Read the referenced file BEFORE answering.
If multiple intents apply, read multiple files. For data preparation, always read both the overview and the task-specific file.
### End-to-End Workflows (Use These First)
| User intent | Read this file |
|---|---|
| "Help me train a model" / "I want to build an SLM" / "Train a model for X" / "Fine-tune for Y" | First ask: dataset or traces? Then load the matching workflow below. |
| "Train from a dataset" / "dataset to model" / end-to-end with CSV data | workflows/dataset-to-model.md |
| "Train from production traces" / "traces to model" / end-to-end with traces | workflows/traces-to-model.md |
| "Model isn't good enough" / "How do I improve?" / "Retune" / iteration / "Teacher eval scored too low" | workflows/improving-a-model.md |
### Getting Started & Platform
| User intent | Read this file |
|---|---|
| "I'm new" / "How do I get started?" / "Install the CLI" | references/getting-started.md |
| "What is distil labs?" / "What can it do?" / "How does it work?" | references/platform-overview.md |
| "How do I use command X?" / "What CLI commands are available?" | references/cli-reference.md |
| "How do I use the API?" / "REST API" / "API authentication" / "programmatic access" | references/api-reference.md |
### Task & Model Selection
| User intent | Read this file |
|---|---|
| "What task type should I pick?" / "Classification or QA?" / describing their use case | references/task-selection-guide.md |
| "Which model should I use?" / "What student/teacher models are available?" / "Can I use model X?" | references/model-catalog.md |
| "How do I write a good task/input description?" / "What goes in job_description.json?" | references/job-description-guide.md |
### Data Preparation
Always read references/tasks/prepare-data/overview.md first, then the task-specific file.
| User intent | Read this file (after overview.md) |
|---|---|
| General data prep / "What files do I need?" | references/tasks/prepare-data/overview.md (just this one) |
| Question answering data | references/tasks/prepare-data/question-answering.md |
| Classification data | references/tasks/prepare-data/classification.md |
| Tool calling data | references/tasks/prepare-data/tool-calling.md |
| Multi-turn tool calling data | references/tasks/prepare-data/multi-turn-tool-calling.md |
| Open book QA / RAG data | references/tasks/prepare-data/open-book-qa.md |
| Closed book QA data | references/tasks/prepare-data/closed-book-qa.md |
### Pipeline Steps
| User intent | Read this file |
|---|---|
| "How do I upload my dataset?" / "upload-data command" | references/tasks/upload-dataset.md |
| "How do I use production traces?" / "upload-traces" / "reprocess traces" | references/tasks/upload-and-process-traces.md |
| "How do I run teacher evaluation?" / "Is my task feasible?" | references/tasks/teacher-evaluation.md |
| "How do I train?" / "Start training" / "Training status" | references/tasks/training.md |
| "How do I deploy?" / "Download model" / "Run inference" | references/tasks/deployment-integration.md |
| "distil login" / "Credit balance is too low" / 401 errors / "/login doesn't work for distil" | references/tasks/verify-auth.md |
| "Download predictions" / "per-example results" / "inspect model outputs" | references/tasks/retrieve-predictions.md |
| "Analyze predictions" / "write analysis report" / "compare teacher and student" | references/tasks/analyze-predictions.md |
| "Log my work" / "track iterations" / "keep a history" / "log progress" | references/tasks/maintain-run-log.md |
| "Approve the test set" / "review test set" / "is my test data good?" | references/tasks/test-set-approval.md |
| "Check my upload" / "is my train/test consistent" / "data distribution" | references/tasks/analyze-uploads.md |
| "How do I poll a long-running job?" / "Wait for training" / "grep status doesn't work" | references/tasks/polling-jobs.md |
### Configuration & Metrics
| User intent | Read this file |
|---|---|
| "How do I configure training?" / "config.yaml" / "tuning parameters" | references/configuration.md |
| "What are mutations?" / "Seed data doesn't cover all scenarios" / "Improve on specific domains" | references/mutations-guide.md |
| "What do these metrics mean?" / "How do I interpret results?" | references/evaluation-metrics.md |
## Quick Platform Summary
The core flow is: create model → prepare data → upload → teacher evaluation → train → deploy. Teacher evaluation is a feasibility check: if the teacher can solve the task, the student will learn it. Training takes several hours and produces a downloadable model you can deploy locally (llama-cpp, vLLM) or via the platform.
For the longer write-up of what Distil Labs is and why it exists, read references/platform-overview.md. For the list of supported task types, student models, and teacher models, always read references/model-catalog.md before recommending a specific model; availability changes over time.
## Quickstart Checklist
Minimum steps to go from zero to a trained model. Read the relevant reference files for details on each step.
```bash
# Install and authenticate
curl -fsSL https://cli-assets.distillabs.ai/install.sh | sh
distil update
distil login

# Create the model and upload your prepared data
distil model create my-model-name
distil model upload-data <model-id> --data ./my-data-dir

# Feasibility check: run teacher evaluation, then check its status/results
distil model run-teacher-evaluation <model-id>
distil model teacher-evaluation <model-id>

# Train, then check training status/results
distil model run-training <model-id>
distil model training <model-id>

# Download, deploy locally, and test
distil model download <model-id>
distil model deploy local <model-id>
distil model invoke <model-id>
```
Alternative: train from traces. Instead of preparing a dataset and running upload-data, prepare a traces.jsonl file with your production logs, a job_description.json, and a config.yaml, then run distil model upload-traces <model-id> --data ./my-traces-dir (layout sketched below). See references/tasks/upload-and-process-traces.md for trace formats and task compatibility (note: question-answering-open-book is not supported via traces).
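A sketch of one plausible upload layout, assuming the three files sit together in a single directory; exact trace formats and requirements live in references/tasks/upload-and-process-traces.md.

```bash
# Assumed layout (the three files from the note above in one directory):
#   my-traces-dir/
#     traces.jsonl           # production logs, one JSON trace per line
#     job_description.json
#     config.yaml
distil model upload-traces <model-id> --data ./my-traces-dir
```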
## Instructions
### Before Answering Any Question
- Identify the user's intent using the routing table above.
- Read the referenced file(s). Do not answer from memory alone; the reference files contain exact formats, constraints, and edge cases.
- If the user's intent spans multiple topics (e.g., "help me prepare classification data and train"), read all relevant files.
### Deterministic-First Principle
When helping users, exhaust all mechanical/lookup steps before engaging judgment. Check model compatibility constraints, file format requirements, and parameter defaults from the reference files first. Only then apply judgment for task selection, data quality assessment, or configuration tuning.
### Data Preparation Rules
- Always read references/tasks/prepare-data/overview.md before any task-specific data file. The overview contains shared requirements (directory structure, minimum example counts, file formats) that the task files assume you know.
- Ask the user what task type they need before preparing data. If unclear, read references/task-selection-guide.md and help them decide.
- Ask what student and teacher models they want. If unsure, read references/model-catalog.md. Default recommendation: Llama-3.2-1B-Instruct as student, openai.gpt-oss-120b as teacher.
- Before writing job_description.json, read references/job-description-guide.md for what makes each field good and which fields apply to which task type. Note in particular that input_description is only used by question-answering; including it for other task types has no effect. A hedged skeleton follows this list.
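For question answering, a skeleton written from the shell might look like the following. Only input_description is documented in this file, so the other field name is a placeholder; check references/job-description-guide.md for the actual schema before writing the file.

```bash
# Skeleton only: verify field names against references/job-description-guide.md.
# "task_description" is a placeholder field name; input_description is the one
# field named in this document (and it is only used by question-answering).
cat > job_description.json <<'EOF'
{
  "task_description": "Answer customer questions about our billing product.",
  "input_description": "A single customer question, optionally with account context."
}
EOF
```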
### In Claude Code
- Run CLI commands directly. Do not just tell the user what to run.
- After running distil model create, capture the model ID and use it in subsequent commands.
- Check status commands (upload-status, teacher-evaluation, training) to monitor progress.
- When training or evaluation is running, tell the user approximately how long it takes and suggest checking back.
- Polling long-running jobs: copy the canonical polling loop from references/tasks/polling-jobs.md verbatim. Do not write your own grep loop or sleep N && command chain; the former picks the wrong status pattern half the time, and the latter is blocked by Claude Code. Use while ...; do ...; sleep 60; done with the sleep inside the loop body (see the sketch after this list).
- Status checks always use --output json | jq; never grep human-readable output. The default text output also omits some metrics (notably LLM-as-a-Judge), so for analysis always use --output json too.
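The shape that rule describes looks roughly like the sketch below. It is illustrative only; copy the canonical loop from references/tasks/polling-jobs.md verbatim in real use, since the .status field and the "running" value here are assumptions.

```bash
# Illustrative shape only: the canonical loop lives in references/tasks/polling-jobs.md.
# Assumptions: a top-level .status field and "running" as the in-progress value.
while true; do
  status=$(distil model training "$MODEL_ID" --output json | jq -r '.status')
  echo "$(date +%H:%M) training status: $status"
  [ "$status" != "running" ] && break
  sleep 60   # sleep stays inside the loop body, per the rule above
done
```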
### In Claude.ai (Browser)
- Provide complete, copy-pasteable file contents (job_description.json, config.yaml, train.csv, test.csv).
- List the CLI commands the user should run in order, with their model ID placeholder.
- Explain what each command does and what to look for in the output.
### When the User Wants to Improve a Model
Read workflows/improving-a-model.md. It covers both iteration cases (teacher eval below thresholds, and training results that don't pass the DEPLOY bar) and consolidates the levers: job description, data, synthgen/mutations, student/teacher choice, tuning parameters, retune.
It also owns the iteration-N/ directory convention (one directory per attempt, reports written unsuffixed inside it; sketched below) and the token-burn awareness rule for iteration #3+: each iteration costs re-upload, teacher-eval credits, and Claude analysis tokens, so confirm the plan before racing into another attempt.
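One way the directory convention might look in practice; the report filename is a hypothetical example, and workflows/improving-a-model.md owns the actual convention.

```bash
# One directory per attempt; reports are written unsuffixed inside it.
mkdir -p iteration-2
# e.g. iteration-2/analysis-report.md ("analysis-report.md" is an illustrative name)
```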
### Command Aliases
distil model = distil models = distil m; all three work identically.
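For example, these invocations are interchangeable:

```bash
distil model download <model-id>
distil models download <model-id>
distil m download <model-id>
```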