name	sweep
description	Prepare and run a KV-cache compression sweep. Loads sweep configuration, validates prerequisites, and provides the exact commands needed. Use before starting any GPU experiment.

Sweep Preparation and Execution

Before running a sweep, verify prerequisites and provide the correct commands.

Check environment:
- Run uv run python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}, MPS: {torch.backends.mps.is_available()}')" to verify GPU access
- Run make check to ensure code is clean
Check existing results:
- Count files: find results -name "*_$1p.json" 2>/dev/null | wc -l
- List what's done: find results -name "*_$1p.json" -exec basename {} \;
- Check checkpoints: find results -name "*.ckpt.jsonl" -exec wc -l {} \;
Validate model access:
- Check if model is cached: ls ~/.cache/huggingface/hub/models--$(echo "$0" | sed 's|/|--|g')/ 2>/dev/null
- If not cached, suggest: bash scripts/download_models.sh
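The cache check above relies on how the Hugging Face hub names its cache directories: a `models--` prefix, with every `/` in the model ID replaced by `--`. A minimal sketch of that mapping as reusable helpers; the function names `hf_cache_dir` and `model_is_cached` are hypothetical, and the default root assumes the standard cache location (environment variables such as `HF_HOME` can move it):

```shell
# Hypothetical helper: print the hub cache directory for a model ID.
# $1: model ID, e.g. "org/name"
# $2: hub cache root (optional; defaults to the standard location, an assumption)
hf_cache_dir() {
  _hub_dir="${2:-$HOME/.cache/huggingface/hub}"
  # Replace every "/" in the model ID with "--".
  _dir_name=$(printf '%s' "$1" | sed 's|/|--|g')
  printf '%s/models--%s\n' "$_hub_dir" "$_dir_name"
}

# Succeeds when the cache directory for the model exists.
model_is_cached() {
  [ -d "$(hf_cache_dir "$@")" ]
}

# Usage:
# model_is_cached "Qwen/Qwen2.5-7B-Instruct" || echo "run: bash scripts/download_models.sh"
```

A directory that exists does not guarantee a complete download, so this is only a quick first check.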

Provide the sweep command:

uv run kvguard sweep \
  --num-prompts ${1:-500} \
  --model "${0:-Qwen/Qwen2.5-7B-Instruct}" \
  --output-dir results \
  --max-new-tokens 512

For the full Phase 2 pipeline (both models + train + eval):

nohup bash scripts/run_phase2.sh &
# Monitor with: bash scripts/check_status.sh
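By default `nohup` appends output to `nohup.out` in the current directory. If a dedicated log file and a recorded PID are preferred, the launch could be wrapped roughly as follows; `launch_bg` and the log file name `phase2.log` are hypothetical and not part of the repository scripts:

```shell
# Hypothetical wrapper: start a script under nohup, send stdout and stderr
# to a log file, and print the background PID for later checks.
# $1: script path
# $2: log file (optional; the default name is an assumption)
launch_bg() {
  nohup bash "$1" > "${2:-phase2.log}" 2>&1 &
  echo "$!"
}

# Usage:
# pid=$(launch_bg scripts/run_phase2.sh phase2.log)
# kill -0 "$pid" 2>/dev/null && echo "still running"
# tail -f phase2.log
```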

Each model runs 16 configs, which produce the following files:

Expected output: results/{press}/{ModelShort}_{ratio}_{num_prompts}p.json

Checkpoint files: results/{press}/{ModelShort}_{ratio}.ckpt.jsonl (auto-resume on restart)
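Because the final JSON file follows a fixed naming pattern, a config can be treated as finished when that file exists. A minimal sketch, assuming hypothetical helper names `result_path` and `is_done`; the press name, model short name, and ratio in the usage line are placeholders, and the exact formatting of the ratio in real file names is not confirmed here:

```shell
# Hypothetical helper: build the expected result path for one config.
# $1: press, $2: model short name, $3: ratio, $4: number of prompts
# $5: results root (optional; defaults to "results")
result_path() {
  printf '%s/%s/%s_%s_%sp.json\n' "${5:-results}" "$1" "$2" "$3" "$4"
}

# Succeeds when the final result file for the config already exists.
is_done() {
  [ -f "$(result_path "$@")" ]
}

# Usage (placeholder values):
# is_done some_press SomeModel 0.5 500 && echo "done" || echo "pending"
```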
