sweep
Prepare and run a KV-cache compression sweep. Loads sweep configuration, validates prerequisites, and provides the exact commands needed. Use before starting any GPU experiment.
Search past decisions, failures, and experiment logs for relevant context before starting a task. Use this before any significant implementation or experiment to avoid repeating mistakes.
Analyze sweep results after completion. Computes accuracy, CFR, signal statistics, and generates comparison tables. Use after a sweep completes to understand the data.
Compare results across models (Qwen vs Llama) at matching compression configurations. Generates side-by-side tables and identifies cross-model patterns.
After completing a significant task or experiment, extract lessons learned and update the project knowledge base. Captures what worked, what failed, and what to remember for next time.
| name | sweep |
| description | Prepare and run a KV-cache compression sweep. Loads sweep configuration, validates prerequisites, and provides the exact commands needed. Use before starting any GPU experiment. |
Before running a sweep, verify prerequisites and provide the correct commands.
Check environment:
- Verify GPU access: uv run python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}, MPS: {torch.backends.mps.is_available()}')"
- Ensure the code is clean: make check

Check existing results:
- Count completed result files: find results -name "*_$1p.json" 2>/dev/null | wc -l
- List completed result files: find results -name "*_$1p.json" -exec basename {} \;
- Show progress per checkpoint file (line counts): find results -name "*.ckpt.jsonl" -exec wc -l {} \;

Validate model access:
- Check the local Hugging Face cache: ls ~/.cache/huggingface/hub/models--$(echo "$0" | sed 's|/|--|g')/ 2>/dev/null
- If the model is not cached, download it: bash scripts/download_models.sh

Provide the sweep command:
uv run kvguard sweep \
--num-prompts ${1:-500} \
--model "${0:-Qwen/Qwen2.5-7B-Instruct}" \
--output-dir results \
--max-new-tokens 512
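The model-access check in the prerequisites depends on how the Hugging Face hub cache names its folders: a repo id such as Qwen/Qwen2.5-7B-Instruct is stored under models--Qwen--Qwen2.5-7B-Instruct, with each slash becoming two hyphens. A minimal Python sketch of that check follows. It assumes the default cache location (~/.cache/huggingface/hub, which HF_HOME or HF_HUB_CACHE can override), and the helper names are hypothetical, not part of kvguard.

```python
from pathlib import Path
from typing import Optional


def hf_cache_dir_name(repo_id: str) -> str:
    """Map a hub repo id to its cache folder name: 'org/name' -> 'models--org--name'."""
    return "models--" + repo_id.replace("/", "--")


def is_model_cached(repo_id: str, cache_root: Optional[Path] = None) -> bool:
    """Return True if a cache folder for repo_id exists under cache_root."""
    # Assumption: default hub cache location; HF_HOME / HF_HUB_CACHE may move it.
    root = cache_root or Path.home() / ".cache" / "huggingface" / "hub"
    return (root / hf_cache_dir_name(repo_id)).is_dir()


if __name__ == "__main__":
    model = "Qwen/Qwen2.5-7B-Instruct"
    print(hf_cache_dir_name(model))
    if is_model_cached(model):
        print("model folder found in cache")
    else:
        print("model folder missing: run bash scripts/download_models.sh")
```

An existing folder only shows that a download was started, not that every weight file is present, so this is a quick pre-flight check rather than a guarantee.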
For the full Phase 2 pipeline (both models + train + eval):
nohup bash scripts/run_phase2.sh &
# Monitor with: bash scripts/check_status.sh
16 configs per model:
Expected output: results/{press}/{ModelShort}_{ratio}_{num_prompts}p.json
Checkpoint files: results/{press}/{ModelShort}_{ratio}.ckpt.jsonl (auto-resume on restart)
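Because result and checkpoint files follow fixed naming patterns, the state of a configuration can be read from the filesystem alone. The sketch below is illustrative only: the helper names are hypothetical, the values passed for press, model_short, and ratio are placeholders whose real formatting is decided by kvguard, and it assumes that each non-empty checkpoint line corresponds to one completed prompt (the source only states that checkpoints enable auto-resume).

```python
from pathlib import Path


def result_path(root: Path, press: str, model_short: str, ratio: str, num_prompts: int) -> Path:
    """Final result file: results/{press}/{ModelShort}_{ratio}_{num_prompts}p.json."""
    return root / press / f"{model_short}_{ratio}_{num_prompts}p.json"


def checkpoint_path(root: Path, press: str, model_short: str, ratio: str) -> Path:
    """Checkpoint file: results/{press}/{ModelShort}_{ratio}.ckpt.jsonl."""
    return root / press / f"{model_short}_{ratio}.ckpt.jsonl"


def config_status(root: Path, press: str, model_short: str, ratio: str, num_prompts: int) -> str:
    """Classify one configuration as 'done', 'partial (n/N)', or 'pending'."""
    if result_path(root, press, model_short, ratio, num_prompts).exists():
        return "done"
    ckpt = checkpoint_path(root, press, model_short, ratio)
    if ckpt.exists():
        # Assumption: one non-empty JSONL line per completed prompt.
        with ckpt.open() as fh:
            completed = sum(1 for line in fh if line.strip())
        return f"partial ({completed}/{num_prompts})"
    return "pending"
```

For example, config_status(Path("results"), press, model_short, ratio, 500) can be called once per configuration before launching a sweep to see which runs would be resumed rather than started from scratch.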