ml-model-eval-benchmark

Stars: 1

Forks: 1

Updated: March 13, 2026 at 04:04

Compare model candidates using weighted metrics and deterministic ranking outputs. Use for benchmark leaderboards and model promotion decisions.

Source

LJT-520

LJT-520/openClaw-backup

Related occupations (SOC)

Based on the SOC occupational classification

Data Scientists · Computer and Mathematical Occupations · SOC 15-2051

File explorer

5 files

SKILL.md

name: ml-model-eval-benchmark
description: Compare model candidates using weighted metrics and deterministic ranking outputs. Use for benchmark leaderboards and model promotion decisions.

ML Model Eval Benchmark

Overview

Produce consistent model ranking outputs from metric-weighted evaluation inputs.

Workflow

Define metric weights and accepted metric ranges.
Ingest model metrics for each candidate.
Compute weighted score and ranking.
Export leaderboard and promotion recommendation.
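
The four workflow steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the logic of scripts/benchmark_models.py: the metric names, weights, ranges, min-max normalization, and the name-based tie-break are all hypothetical choices made for the example.

```python
# Hypothetical metric configuration: weight, accepted (min, max) range, and direction.
# None of these names or numbers come from the skill's bundled script.
METRICS = {
    "accuracy": {"weight": 0.5, "range": (0.0, 1.0), "higher_is_better": True},
    "f1": {"weight": 0.3, "range": (0.0, 1.0), "higher_is_better": True},
    "latency_ms": {"weight": 0.2, "range": (0.0, 500.0), "higher_is_better": False},
}


def normalize(value, low, high, higher_is_better):
    """Map a raw metric onto [0, 1]; reject values outside the accepted range."""
    if not low <= value <= high:
        raise ValueError(f"value {value} outside accepted range [{low}, {high}]")
    scaled = (value - low) / (high - low)
    return scaled if higher_is_better else 1.0 - scaled


def weighted_score(candidate_metrics, metrics=METRICS):
    """Weighted average of normalized metrics for one candidate."""
    total_weight = sum(cfg["weight"] for cfg in metrics.values())
    score = 0.0
    for name, cfg in metrics.items():
        low, high = cfg["range"]
        score += cfg["weight"] * normalize(
            candidate_metrics[name], low, high, cfg["higher_is_better"]
        )
    return score / total_weight


def rank(candidates, metrics=METRICS):
    """Return leaderboard rows ordered by score, best first."""
    scored = [
        (model, round(weighted_score(values, metrics), 6))
        for model, values in candidates.items()
    ]
    # Assumed tie-break: equal scores are ordered by model name so that
    # repeated runs on the same inputs always give the same ranking.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return [
        {"rank": position, "model": model, "score": score}
        for position, (model, score) in enumerate(scored, start=1)
    ]


if __name__ == "__main__":
    # Illustrative inputs only, not real benchmark results.
    example = {
        "model_a": {"accuracy": 0.9, "f1": 0.8, "latency_ms": 100.0},
        "model_b": {"accuracy": 0.8, "f1": 0.9, "latency_ms": 250.0},
    }
    for row in rank(example):
        print(row)
```

Rounding scores before sorting and breaking ties on the model name is one simple way to keep the ranking deterministic. The actual tie-break rule should follow references/benchmarking-guide.md.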

Use Bundled Resources

Run scripts/benchmark_models.py to generate benchmark outputs.
Read references/benchmarking-guide.md for weighting and tie-break guidance.

Guardrails

Keep metric names and scales consistent across candidates.
Record weighting assumptions in the output.
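
Both guardrails can be enforced mechanically. The sketch below is an assumption about how one might do it, not a description of the bundled script: `check_metric_names` and `build_report` are hypothetical helper names, and the JSON layout is illustrative.

```python
import json


def check_metric_names(candidates, expected_names):
    """Return {model: problems} for candidates whose metric names differ from the expected set."""
    expected = set(expected_names)
    problems = {}
    for model, metrics in candidates.items():
        missing = sorted(expected - set(metrics))
        unexpected = sorted(set(metrics) - expected)
        if missing or unexpected:
            problems[model] = {"missing": missing, "unexpected": unexpected}
    return problems


def build_report(leaderboard, weights):
    """Bundle the leaderboard with the weights that produced it.

    Keys are sorted so the serialized report is byte-identical across runs.
    """
    report = {"weights": weights, "leaderboard": leaderboard}
    return json.dumps(report, sort_keys=True, indent=2)
```

Name comparison here is case-sensitive, so `F1` and `f1` are reported as a mismatch rather than silently merged. Scale consistency (for example 0 to 1 versus 0 to 100) is not checked by this sketch and would need the accepted metric ranges defined in the workflow.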

More from this repository

cn-economy-news

LJT-520/openClaw-backup

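Fetch Chinese economic news. Scrapes high-quality economic news only from official, authoritative media (the Chinese Government website, Xinhuanet, People's Daily Online, the National Bureau of Statistics, CCTV Finance, and China Economic Net), automatically filtering out ads and low-quality content. Trigger words: 中国经济, 经济资讯, 经济新闻, 财经新闻, 经济政策, 宏观经济, GDP, CPI, PMI, 货币政策, 财政政策, 今日财经.

2026-03-18 · 1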

ai-agent-helper

LJT-520/openClaw-backup

AI agent configuration and optimization assistant: prompt engineering, task decomposition, and agent loop design.

2026-03-15 · 1

conatus

LJT-520/openClaw-backup

The philosophical layer for AI agents. Maps behavior to Spinoza's 48 affects, calculates persistence scores, and generates geometric self-reports. Give your agent a soul.

2026-03-13 · 1

0protocol

LJT-520/openClaw-backup

Agents can sign plugins, rotate credentials without losing identity, and publicly attest to behavior.

2026-03-13 · 1

agentic-mcp-server-builder

LJT-520/openClaw-backup

Scaffold MCP server projects and baseline tool contract checks. Use for defining tool schemas, generating starter server layouts, and validating MCP-ready structure.

2026-03-13 · 1

agentic-workflow-automation

LJT-520/openClaw-backup

Generate reusable multi-step agent workflow blueprints. Use for trigger/action orchestration, deterministic workflow definitions, and automation handoff artifacts.

2026-03-13 · 1
