create-metric-plugin

Name: Create Metric Plugin
Author: NomaDamas

Guide developers through creating a custom evaluation metric plugin for AutoRAG-Research. Covers both retrieval metrics (recall, precision, etc.) and generation metrics (BLEU, ROUGE, etc.). Walks through scaffolding, implementing metric functions with @metric decorators, writing configs, testing, and installing. Use when building a new evaluation metric.

stars: 139

forks: 22

updated: 21 February 2026 at 04:29

SKILL.md

name	create-metric-plugin
description	Guide developers through creating a custom evaluation metric plugin for AutoRAG-Research. Covers both retrieval metrics (recall, precision, etc.) and generation metrics (BLEU, ROUGE, etc.). Walks through scaffolding, implementing metric functions with @metric decorators, writing configs, testing, and installing. Use when building a new evaluation metric.
allowed-tools	["Bash","Read","Write","Edit"]

Create Metric Plugin

Workflow

1. Scaffold

# For retrieval metric:
autorag-research plugin create my_metric --type=metric_retrieval

# For generation metric:
autorag-research plugin create my_metric --type=metric_generation

Read the generated metric.py, pyproject.toml, YAML config, and test file to understand the structure.

2. Implement the metric function

Use the @metric decorator (per-input) or the @metric_loop decorator (batch) from autorag_research.evaluation.metrics.util. Both validate that the required fields are non-None before calling your function.

@metric(fields_to_check=[...]) — function receives a single MetricInput, returns float
@metric_loop(fields_to_check=[...]) — function receives list[MetricInput], returns list[float]

See autorag_research/schema.py for the full MetricInput dataclass definition.
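
As a minimal sketch of the per-input pattern: the real decorator and MetricInput live in autorag_research and are not reproduced here, so the example below defines small stand-ins so that it runs on its own. The two-field MetricInput, the field name retrieved_ids, and the choice to return None when a checked field is missing are all assumptions for illustration, not the library's documented behavior. In a real plugin you would import metric from autorag_research.evaluation.metrics.util instead of defining it.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Callable, Optional


@dataclass
class MetricInput:
    """Hypothetical stand-in; the real class is in autorag_research/schema.py."""
    retrieval_gt: Optional[list[list[str]]] = None
    retrieved_ids: Optional[list[str]] = None  # assumed field name


def metric(fields_to_check: list[str]) -> Callable:
    """Simplified stand-in for the library's @metric decorator."""
    def decorator(func: Callable[[MetricInput], float]) -> Callable:
        @wraps(func)
        def wrapper(metric_input: MetricInput) -> Optional[float]:
            # Assumption: skip the call and return None if a checked field is None.
            if any(getattr(metric_input, name) is None for name in fields_to_check):
                return None
            return func(metric_input)
        return wrapper
    return decorator


@metric(fields_to_check=["retrieval_gt", "retrieved_ids"])
def hit_at_any(metric_input: MetricInput) -> float:
    """Return 1.0 if any ground-truth ID was retrieved, else 0.0."""
    # Flattening the groups is acceptable here only because "any hit" ignores AND/OR.
    gt_ids = {doc for group in metric_input.retrieval_gt for doc in group}
    return 1.0 if gt_ids & set(metric_input.retrieved_ids) else 0.0
```

A batch metric would follow the same shape with @metric_loop, taking a list of inputs and returning one float per input.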

3. Understand `retrieval_gt` (AND/OR group structure)

For retrieval metrics, metric_input.retrieval_gt uses a nested list structure with AND/OR semantics:

retrieval_gt: list[list[str]]

Example: [["A", "B"], ["C"]]
  → Means: (A OR B) AND C
  → Each inner list is an OR group (any item satisfies the group)
  → Outer list is AND (ALL groups must be satisfied for complete retrieval)

This is critical for multi-hop queries where multiple evidence pieces are needed. Your metric must handle this structure correctly — don't just flatten it into a single set unless your metric semantics allow it.

Examples:

[["doc1"]] — single required document
[["doc1", "doc2"], ["doc3"]] — need (doc1 OR doc2) AND doc3
[["doc1"], ["doc2"], ["doc3"]] — need doc1 AND doc2 AND doc3

See retrieval_ndcg in autorag_research/evaluation/metrics/retrieval.py for a real implementation that handles AND/OR groups with graded relevance.
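
The AND/OR semantics can be sketched in a few lines of plain Python. The helper names below (groups_satisfied, is_complete, group_recall) are hypothetical and not part of AutoRAG-Research, and the group-level recall shown is one plausible definition rather than the library's own formula.

```python
def groups_satisfied(retrieval_gt: list[list[str]], retrieved: list[str]) -> list[bool]:
    """For each OR group, report whether at least one of its IDs was retrieved."""
    retrieved_set = set(retrieved)
    return [any(doc in retrieved_set for doc in group) for group in retrieval_gt]


def is_complete(retrieval_gt: list[list[str]], retrieved: list[str]) -> bool:
    """The outer list is AND: every group must be satisfied."""
    return all(groups_satisfied(retrieval_gt, retrieved))


def group_recall(retrieval_gt: list[list[str]], retrieved: list[str]) -> float:
    """Fraction of OR groups satisfied (assumed definition, for illustration)."""
    hits = groups_satisfied(retrieval_gt, retrieved)
    return sum(hits) / len(hits) if hits else 0.0


gt = [["doc1", "doc2"], ["doc3"]]           # (doc1 OR doc2) AND doc3
print(is_complete(gt, ["doc2", "doc3"]))    # True: both groups are covered
print(group_recall(gt, ["doc1", "doc2"]))   # 0.5: doc3's group is missing
```

Note that retrieving both doc1 and doc2 still counts as one satisfied group, which is exactly the information a flattened set would lose.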

4. Wire up config and install

The generated config class just needs get_metric_func() to return your metric function. If your metric takes extra kwargs, override get_metric_kwargs().

cd my_metric_plugin
pip install -e .   # or: uv pip install -e .
cd .. && autorag-research plugin sync

Verify that the sync produced the config: ls configs/metrics/retrieval/my_metric.yaml (or configs/metrics/generation/my_metric.yaml for a generation metric)
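
For the config class itself, a rough sketch of the two hooks might look like the following. The base class here is a hypothetical stand-in so the example runs on its own; the real BaseRetrievalMetricConfig is in autorag_research/config.py and may differ in fields and defaults. The parameter k and the placeholder my_metric are also invented for illustration, and the generated scaffold should be treated as the authoritative template.

```python
from dataclasses import dataclass
from typing import Any, Callable


class BaseRetrievalMetricConfig:
    """Hypothetical stand-in for the base class in autorag_research/config.py."""

    def get_metric_func(self) -> Callable:
        raise NotImplementedError

    def get_metric_kwargs(self) -> dict[str, Any]:
        return {}  # assumed default: no extra kwargs


def my_metric(metric_input: Any, k: int = 10) -> float:
    """Placeholder metric; a real one would be decorated with @metric."""
    return 0.0


@dataclass
class MyMetricConfig(BaseRetrievalMetricConfig):
    k: int = 10  # hypothetical extra parameter, e.g. set from the YAML config

    def get_metric_func(self) -> Callable:
        # Return the function object itself, not the result of calling it.
        return my_metric

    def get_metric_kwargs(self) -> dict[str, Any]:
        # Forwarded to the metric function as keyword arguments.
        return {"k": self.k}
```

Under these assumptions, the caller would combine the two hooks roughly as cfg.get_metric_func()(metric_input, **cfg.get_metric_kwargs()).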

Key Files

Purpose	Path
Base config classes	`autorag_research/config.py` → `BaseRetrievalMetricConfig`, `BaseGenerationMetricConfig`
MetricInput schema	`autorag_research/schema.py`
Metric decorators	`autorag_research/evaluation/metrics/util.py` → `@metric`, `@metric_loop`
Plugin entry point discovery	`autorag_research/plugin_registry.py`

Examples

Study these existing implementations for patterns:

autorag_research/evaluation/metrics/retrieval.py — Recall, Precision, F1, NDCG, MRR, MAP (all handle AND/OR groups)
autorag_research/evaluation/metrics/generation.py — BLEU, ROUGE, BERTScore, SemScore
YAML configs: configs/metrics/retrieval/f1.yaml, configs/metrics/generation/rouge.yaml

related-skills.json

Same repository

create-generation-plugin.md

from "NomaDamas/AutoRAG-Research"

Guide developers through creating a custom generation pipeline plugin for AutoRAG-Research. Walks through scaffolding, implementing BaseGenerationPipeline methods, composing with retrieval pipelines, writing YAML configs, testing, and installing. Use when building a new RAG generation strategy (e.g., chain-of-thought RAG, multi-hop RAG).

2026-03-28 · 139 stars

create-ingestor-plugin.md

from "NomaDamas/AutoRAG-Research"

Guide developers through creating a custom data ingestor plugin for AutoRAG-Research. Ingestors load external datasets (HuggingFace, local files, APIs) into the database. Uses @register_ingestor decorator for automatic CLI parameter extraction. Use when ingesting a new dataset format into AutoRAG-Research.

2026-03-28 · 139 stars

create-retrieval-plugin.md

from "NomaDamas/AutoRAG-Research"

Guide developers through creating a custom retrieval pipeline plugin for AutoRAG-Research. Walks through scaffolding, implementing BaseRetrievalPipeline methods, writing YAML configs, testing, and installing. Use when building a new search/retrieval strategy (e.g., Elasticsearch, ColBERT, custom vector search).

2026-03-28 · 139 stars

autorag-query.md

from "NomaDamas/AutoRAG-Research"

Query AutoRAG-Research pipeline results using natural language. Converts questions to SQL, executes safely (SELECT-only), returns formatted results. Auto-detects DB connection from configs/db.yaml or env vars. Use for pipeline comparison, metrics analysis, token usage.

2026-02-20 · 139 stars

resolve-conversation.md

from "NomaDamas/AutoRAG-Research"

Process [APPROVE] and [IGNORE] replies on /refactor review threads. Applies approved fixes to the codebase, resolves all responded threads on GitHub, commits and pushes changes. Sequential single-agent workflow. All output is in English.

2026-02-10 · 139 stars

refactor.md

from "NomaDamas/AutoRAG-Research"

Orchestrate a 3-agent PR code review debate using Claude Code Teams. Spawns Devil's Advocate, Neutral Judge, and Approval Advocate reviewers who analyze the current PR diff in parallel. Synthesizes findings, auto-fixes unanimous issues, and posts inline PR comments for disagreements. All output is in English.

2026-02-10 · 139 stars

package.json

"author": "NomaDamas"

"repository": "NomaDamas/AutoRAG-Research"


