generate-report

Name: Generate Report
Author: xvirobotics

// Generate a comprehensive summary report of the latest experiment including metrics, plots, and comparison with baseline. Use this after training and evaluation to create a shareable experiment summary.

stars:41

forks:6

updated: February 23, 2026, 03:44

SKILL.md

name	generate-report
description	Generate a comprehensive summary report of the latest experiment including metrics, plots, and comparison with baseline. Use this after training and evaluation to create a shareable experiment summary.
user-invocable	true
context	fork
allowed-tools	Bash, Read, Grep, Write
argument-hint	[experiment-name] e.g. 'transformer-v2-lr-sweep'

You are generating a comprehensive experiment report for this data science project. Your goal is to gather all available metrics, plots, and configuration details from the latest experiment and produce a clear, well-structured report that can be shared with the team.

Dynamic Context

Current branch: !git branch --show-current
Git commit: !git rev-parse --short HEAD 2>/dev/null || echo "unknown"
Recent experiment logs: !ls -lt reports/*.json experiments/*.json 2>/dev/null | head -5 || echo "No experiment logs found"
Available plots: !ls reports/figures/*.png reports/figures/*.svg 2>/dev/null | head -10 || echo "No plots found"
Checkpoints: !ls -lt checkpoints/*.pt checkpoints/*.pth 2>/dev/null | head -3 || echo "No checkpoints"
Config used: !ls configs/*.yaml configs/*.toml 2>/dev/null | head -3 || echo "No configs"

Experiment Name

If the user provided an experiment name: $ARGUMENTS

Otherwise, derive one from the branch name or the latest config file, or fall back to the current date.
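
The fallback order above can be sketched in Python. This is a minimal illustration, not part of the skill itself: the function name `derive_experiment_name`, the choice to skip generic branch names such as `main` and `master`, and the `experiment-<date>` format are all assumptions.

```python
import re
import subprocess
from datetime import date
from pathlib import Path


def derive_experiment_name(argument: str = "", config_dir: str = "configs") -> str:
    """Pick a name: explicit argument, then branch, then newest config, then date."""
    if argument.strip():
        return argument.strip()

    # Try the current git branch (assumption: generic branch names are not useful).
    try:
        branch = subprocess.run(
            ["git", "branch", "--show-current"],
            capture_output=True, text=True, check=False,
        ).stdout.strip()
    except OSError:  # git is not installed
        branch = ""
    if branch and branch not in {"main", "master"}:
        # Replace characters such as "/" so the name is safe to use in file names.
        return re.sub(r"[^A-Za-z0-9._-]+", "-", branch)

    # Fall back to the stem of the most recently modified config file.
    configs = [p for ext in ("*.yaml", "*.toml") for p in Path(config_dir).glob(ext)]
    if configs:
        return max(configs, key=lambda p: p.stat().st_mtime).stem

    # Last resort: a date-based name (hypothetical format).
    return f"experiment-{date.today().isoformat()}"
```

Calling `derive_experiment_name("transformer-v2-lr-sweep")` returns the argument unchanged; with an empty argument the result depends on the repository state.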

Report Generation Process

Step 1: Gather Experiment Data

Collect all available information about the latest experiment:

Metrics: Read the latest metrics JSON from reports/ or experiments/
Training logs: Look for training output logs, MLflow run data, or W&B run summaries
Configuration: Read the experiment config file (YAML/TOML)
Checkpoint metadata: Load the best checkpoint and extract epoch, metric, config
Dataset statistics: Look for data profiling outputs or read from data validation logs

# Find and read latest metrics
METRICS_FILE=$(ls -t reports/*.json experiments/*.json 2>/dev/null | head -1)
if [ -n "$METRICS_FILE" ]; then
    echo "=== Latest Metrics ==="
    cat "$METRICS_FILE"
fi

# Find config used
CONFIG_FILE=$(ls -t configs/*.yaml configs/*.toml 2>/dev/null | head -1)
if [ -n "$CONFIG_FILE" ]; then
    echo "=== Configuration ==="
    cat "$CONFIG_FILE"
fi
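
The shell snippet above only prints the newest file. If the metrics need to be used programmatically, a rough Python equivalent could look like the following. The helper name `latest_metrics` is hypothetical, and the sketch assumes the metrics files are plain JSON objects; note that a baseline file stored in `reports/` could be picked up if it happens to be the newest.

```python
import json
from pathlib import Path


def latest_metrics(search_dirs=("reports", "experiments")):
    """Return (path, parsed JSON) for the newest *.json file, or (None, {})."""
    candidates = [p for d in search_dirs for p in Path(d).glob("*.json")]
    if not candidates:
        return None, {}
    # "Newest" means most recently modified, mirroring `ls -t | head -1`.
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    with newest.open(encoding="utf-8") as fh:
        return newest, json.load(fh)
```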

Step 2: Gather Baseline Data

Look for baseline metrics to compare against:

Check for a reports/baseline_metrics.json or experiments/baseline.json
Check git history for previous metrics files: git log --oneline --all -- reports/*.json
If MLflow is configured, query for the baseline run
If no baseline exists, note this in the report
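
One way the file-based baseline lookup and the comparison could be implemented is sketched below. The helper names are hypothetical, and the `lower_is_better` set is an assumption: the skill does not say which metrics improve downward, so it should be adjusted per project. MLflow and git-history lookups are not covered here.

```python
import json
from pathlib import Path

# The two baseline locations named in this step.
BASELINE_CANDIDATES = ("reports/baseline_metrics.json", "experiments/baseline.json")


def load_baseline(root="."):
    """Return the first baseline metrics dict found, or None if no file exists."""
    for rel in BASELINE_CANDIDATES:
        path = Path(root) / rel
        if path.is_file():
            return json.loads(path.read_text(encoding="utf-8"))
    return None


def is_number(value):
    """True for ints and floats, but not for booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def compare_to_baseline(current, baseline, lower_is_better=("loss",)):
    """Build one comparison row per numeric metric present in both dicts."""
    rows = []
    for name in sorted(current.keys() & baseline.keys()):
        cur, base = current[name], baseline[name]
        if not (is_number(cur) and is_number(base)):
            continue  # skip strings, lists, nested objects
        delta = cur - base
        improved = delta < 0 if name in lower_is_better else delta > 0
        rows.append({"metric": name, "baseline": base, "current": cur,
                     "delta": delta, "improved": improved})
    return rows
```

When `load_baseline` returns `None`, the report should state that no baseline was available rather than leave the comparison table empty.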

Step 3: Generate Visualizations

If plots do not already exist, generate them:

python3 -c "
import json
from pathlib import Path

# Check if visualization script exists
viz_script = Path('src/evaluation/visualize.py')
if viz_script.exists():
    print('Visualization script found')
else:
    print('No visualization script found -- will generate basic plots')
"

Key visualizations to include:

Training curves: loss and metric over epochs (train vs. validation)
Confusion matrix: if classification task
Metric comparison bar chart: current vs. baseline
Feature importance: if available from the model or analysis

Step 4: Write the Report

Generate the report as a Markdown file at reports/experiment_report.md:

# Experiment Report: [Experiment Name]

**Date:** [current date]
**Branch:** [git branch]
**Commit:** [git commit hash]
**Author:** [generated by /generate-report skill]

---

## Executive Summary

[2-3 sentences: what was the experiment, what was the key result, and is it better than baseline?]

## Experiment Configuration

| Parameter | Value |
|-----------|-------|
| Model architecture | [from config] |
| Learning rate | [from config] |
| Batch size | [from config] |
| Epochs | [from config] |
| Optimizer | [from config] |
| Scheduler | [from config] |
| Random seed | [from config] |
| Dataset version | [from config or DVC] |

## Dataset Summary

| Split | Samples | Features | Classes |
|-------|---------|----------|---------|
| Train | [count] | [count] | [count or N/A] |
| Validation | [count] | [count] | [count or N/A] |
| Test | [count] | [count] | [count or N/A] |

## Results

### Final Metrics

| Metric | Value |
|--------|-------|
| [metric 1] | [value] |
| [metric 2] | [value] |
| ... | ... |

### Comparison with Baseline

| Metric | Baseline | Current | Delta | Improvement? |
|--------|----------|---------|-------|-------------|
| [metric 1] | [value] | [value] | [+/- value] | [Yes/No] |
| ... | ... | ... | ... | ... |

### Training Curves

![Training Loss](figures/training_loss.png)
![Validation Metric](figures/validation_metric.png)

### Confusion Matrix

![Confusion Matrix](figures/confusion_matrix.png)

## Analysis

### Key Findings
- [Finding 1: most important result]
- [Finding 2: notable pattern or observation]
- [Finding 3: any concerning behavior]

### Error Analysis
- [What types of errors does the model make?]
- [Are errors concentrated in specific classes or data subsets?]

### Comparison with Previous Experiments
- [How does this compare to previous runs?]
- [What changed and what impact did it have?]

## Recommendations

### Next Steps
1. [Actionable recommendation 1]
2. [Actionable recommendation 2]
3. [Actionable recommendation 3]

### Potential Improvements
- [Idea for model improvement]
- [Idea for data improvement]
- [Idea for training procedure improvement]

## Artifacts

| Artifact | Path |
|----------|------|
| Best checkpoint | checkpoints/best_model.pt |
| Metrics JSON | reports/metrics.json |
| Config file | configs/experiment.yaml |
| Training logs | experiments/[run-id]/ |
| Figures | reports/figures/ |

---

*Report generated automatically by the /generate-report skill.*
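
The metric tables in the template can be filled from data rather than typed by hand. The following is a small sketch of a table renderer; `markdown_table` is a hypothetical helper, and the four-decimal float format is an arbitrary choice. It does not escape `|` characters inside values.

```python
def markdown_table(headers, rows):
    """Render headers and rows as a pipe-delimited Markdown table."""
    def fmt(value):
        # Floats get four decimals; everything else is stringified as-is.
        return f"{value:.4f}" if isinstance(value, float) else str(value)

    lines = [
        "| " + " | ".join(headers) + " |",
        "|" + "|".join("---" for _ in headers) + "|",
    ]
    for row in rows:
        lines.append("| " + " | ".join(fmt(v) for v in row) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(markdown_table(["Metric", "Value"], [["accuracy", 0.9], ["epochs", 10]]))
```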

Step 5: Verify Report Quality

After writing the report:

Read it back and verify all placeholders are filled with actual data
Verify all referenced figure paths exist
Verify metrics values are reasonable (not NaN, not obviously wrong)
Ensure the executive summary accurately reflects the detailed results
Check that recommendations are specific and actionable, not generic

Report the path to the generated report file when complete.
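
The first three checks above are mechanical and could be automated along these lines. This is a heuristic sketch under stated assumptions: `check_report` is a hypothetical helper, any `[bracketed text]` that is not part of a link or image is treated as an unfilled placeholder (which can flag legitimate brackets), and figure paths are resolved relative to the report's directory, as in the template. The last two checks still need a human or model read-through.

```python
import math
import re
from pathlib import Path

# [text] that is not immediately followed by "(" (so links and images are skipped).
PLACEHOLDER = re.compile(r"\[[^\]\n]*\](?!\()")
# Markdown image syntax: ![alt](path)
IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def check_report(report_path, metrics):
    """Return a list of human-readable problems found in the report."""
    report = Path(report_path)
    text = report.read_text(encoding="utf-8")
    problems = [f"unfilled placeholder: {m.group(0)}" for m in PLACEHOLDER.finditer(text)]
    for match in IMAGE.finditer(text):
        # Figure paths are interpreted relative to the report file's directory.
        if not (report.parent / match.group(1)).is_file():
            problems.append(f"missing figure: {match.group(1)}")
    for name, value in metrics.items():
        if isinstance(value, float) and not math.isfinite(value):
            problems.append(f"non-finite metric: {name}={value}")
    return problems
```

An empty list means none of the mechanical checks failed.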

related-skills.json

same repository

evaluate-model.md

from "xvirobotics/metaskill"

Load the latest model checkpoint, run evaluation on the test set, and generate a metrics report with confusion matrix. Use this after training to assess model performance or to re-evaluate a specific checkpoint.

2026-02-23, stars: 41

run-pipeline.md

from "xvirobotics/metaskill"

Run the full data science pipeline: validate raw data, preprocess, engineer features, train model, and evaluate. Use this when you want to execute the end-to-end ML pipeline or re-run it after data or code changes.

2026-02-23, stars: 41

api-test.md

from "xvirobotics/metaskill"

Run API integration tests against the running backend, verify endpoints return expected responses and status codes. Use after deploying a preview or starting the dev server.

2026-02-23, stars: 41

build-and-test.md

from "xvirobotics/metaskill"

Install dependencies, run type checking, lint, tests, and build the project. Use after making code changes to verify nothing is broken.

2026-02-23, stars: 41

deploy-preview.md

from "xvirobotics/metaskill"

Build Docker images and launch a local preview environment with docker-compose. Use to test the full stack locally before merging.

2026-02-23, stars: 41

build-and-test.md

from "xvirobotics/metaskill"

Build the Xcode project and run the full test suite. Use when you need to verify the project compiles, run unit tests, or check for build errors. Reports pass/fail results with detailed error output.

2026-02-23, stars: 41

package.json

"author": "xvirobotics"

"repository": "xvirobotics/metaskill"
