gsm8k-eval

// GSM8K evaluation protocol: answer extraction (####, \boxed, CoT), accuracy scoring, prompt formatting, few-shot exemplars, dataset loading, pitfalls. Use when: GSM8K, grade school math, openai/gsm8k, #### delimiter, parse_gsm8k_answer, detect_answer_failure, load_gsm8k, format_chat, math benchmark scoring, gsm8k few-shot, chain-of-thought eval.
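
The extraction order named in the description (`####` delimiter, then `\boxed{}`, then free-form chain-of-thought text) can be illustrated with a minimal sketch. This is not the skill's own `parse_gsm8k_answer`; the names `extract_answer` and `accuracy`, the number regex, and the "last number in the text" fallback are assumptions made for illustration only.

```python
import re
from typing import Optional, Sequence

# Integer or decimal, with optional sign and thousands separators (assumed format).
_NUMBER = re.compile(r"-?\d+(?:,\d{3})*(?:\.\d+)?")


def _normalize(num: str) -> str:
    """Drop thousands separators and trailing decimal zeros: '1,000.0' -> '1000'."""
    num = num.replace(",", "")
    if "." in num:
        num = num.rstrip("0").rstrip(".")
    return num


def extract_answer(text: str) -> Optional[str]:
    """Hypothetical helper: return the final numeric answer as a string, or None."""
    # 1) GSM8K reference format: the answer follows the last '####'.
    if "####" in text:
        match = _NUMBER.search(text.rsplit("####", 1)[1])
        if match:
            return _normalize(match.group())
    # 2) LaTeX-style \boxed{...}: use the last boxed expression.
    boxed = re.findall(r"\\boxed\{([^{}]*)\}", text)
    if boxed:
        match = _NUMBER.search(boxed[-1])
        if match:
            return _normalize(match.group())
    # 3) Fallback heuristic (an assumption): the last number anywhere in the text.
    numbers = _NUMBER.findall(text)
    return _normalize(numbers[-1]) if numbers else None


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Exact-match accuracy; a prediction with no extractable answer counts as wrong."""
    correct = 0
    for pred, ref in zip(predictions, references):
        got = extract_answer(pred)
        if got is not None and got == extract_answer(ref):
            correct += 1
    return correct / len(references)
```

Comparing normalized strings rather than floats avoids rounding surprises, at the cost of treating differently formatted but equal values (for example `1/2` and `0.5`) as mismatches.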

stars: 2
forks: 0
updated: March 23, 2026 at 21:16