$pwd:

agents-meet-rl

// Troubleshooter for agentic-RL training, evaluation, and experiment design on LLM agents (single or multi-agent, multi-turn, tool-augmented). Routes a user's symptom to fixes anchored in the corpus. Distilled from the AgentsMeetRL awesome list, snapshot 2026-05-23.
// TRIGGER when: the user is training, evaluating, or designing experiments for an RL-trained LLM agent; reports symptoms such as reward not moving, flat eval, KL/entropy/length blow-ups, retokenization drift, tool-call parse failures, credit assignment problems, async-rollout staleness, judge inconsistency, benchmark contamination, or pass@k vs pass@1 confusion; is choosing an ablation, baseline, framework, algorithm, reward, or data-curation strategy; or names GRPO, PPO, DAPO, veRL, OpenRLHF, slime, AReaL, RAGEN, or similar.
// SKIP: generic supervised LLM fine-tuning with no RL component; classical RL theory or tabular RL; non-LLM agents.

$ git log --oneline --stat
stars:1467
forks:56
updated:May 26, 2026, 15:49
File explorer
92 files
SKILL.md