run-pipeline

Name: Run Pipeline
Author: xvirobotics

Run the full data science pipeline: validate raw data, preprocess, engineer features, train model, and evaluate. Use this when you want to execute the end-to-end ML pipeline or re-run it after data or code changes.

stars: 41

forks: 6

updated: February 23, 2026 at 03:44

SKILL.md

name	run-pipeline
description	Run the full data science pipeline: validate raw data, preprocess, engineer features, train model, and evaluate. Use this when you want to execute the end-to-end ML pipeline or re-run it after data or code changes.
user-invocable	true
context	fork
allowed-tools	Bash, Read, Grep
argument-hint	[config-file] e.g. configs/experiment.yaml

You are executing the full data science pipeline for this project. Run each stage sequentially, verifying success before proceeding to the next stage. Stop immediately if any stage fails and report the error clearly.

Dynamic Context

Current branch: !git branch --show-current
Data directory contents: !ls data/ 2>/dev/null || echo "No data/ directory found"
Available configs: !ls configs/*.yaml 2>/dev/null || ls configs/*.toml 2>/dev/null || echo "No config files found"
Python environment: !which python3 && python3 --version 2>/dev/null || echo "Python not found"
Recent changes: !git diff --stat HEAD~3 2>/dev/null || echo "No recent commits"

Configuration

If the user provided a config file as an argument, use it: $ARGUMENTS

Otherwise, look for the default config at configs/experiment.yaml or configs/experiment.toml. The stages below refer to the chosen config path as $CONFIG_FILE.
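
The lookup order above can be sketched as a small helper. This is only an illustration: the function name `resolve_config` and the `root` parameter are hypothetical, and raising `FileNotFoundError` when nothing is found is an assumption rather than something this skill prescribes.

```python
from pathlib import Path
from typing import Optional, Sequence

# Default locations named in the Configuration section, in priority order.
DEFAULT_CONFIGS = ("configs/experiment.yaml", "configs/experiment.toml")


def resolve_config(argument: Optional[str], root: Path = Path("."),
                   defaults: Sequence[str] = DEFAULT_CONFIGS) -> Path:
    """Return the config path to use, preferring an explicit argument.

    `argument` stands in for $ARGUMENTS; None or an empty string means
    the user passed nothing. (Hypothetical helper, not part of the skill.)
    """
    if argument and argument.strip():
        candidate = root / argument.strip()
        if not candidate.is_file():
            raise FileNotFoundError(f"Config file not found: {candidate}")
        return candidate
    # No argument: fall back to the first default that exists.
    for name in defaults:
        candidate = root / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"No default config found; tried: {', '.join(defaults)}")
```

Failing loudly when no config exists keeps later stages from running with an undefined config path.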

Pipeline Stages

Execute each stage in order. After each stage, check for errors and verify outputs exist before proceeding.
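
As a rough sketch of the run-then-verify loop described above, the helper below runs commands in order and stops at the first failure. The name `run_stages` and the `(name, command, expected_outputs)` tuple shape are assumptions made for illustration; the skill itself does not define them.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Sequence, Tuple

# Each stage: (name, command as argv list, paths that must exist afterwards).
Stage = Tuple[str, List[str], Sequence[str]]


def run_stages(stages: Sequence[Stage]) -> List[Tuple[str, bool]]:
    """Run stages sequentially; stop after the first one that fails.

    A stage counts as successful only if the command exits with code 0
    and every expected output path exists.
    """
    results: List[Tuple[str, bool]] = []
    for name, command, expected_outputs in stages:
        completed = subprocess.run(command, capture_output=True, text=True)
        outputs_ok = all(Path(p).exists() for p in expected_outputs)
        succeeded = completed.returncode == 0 and outputs_ok
        results.append((name, succeeded))
        if not succeeded:
            # Surface captured stderr so the failure can be reported clearly.
            print(f"[{name}] failed (exit {completed.returncode})", file=sys.stderr)
            print(completed.stderr, file=sys.stderr)
            break
    return results
```

Stages after a failure never appear in the returned list, which makes it easy to report which stages were skipped.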

Stage 1: Environment Check

Verify the Python environment is ready:

python3 -c "import torch; import pandas; import numpy; print(f'PyTorch {torch.__version__}, pandas {pandas.__version__}, NumPy {numpy.__version__}')"

If imports fail, report which packages are missing and suggest pip install -r requirements.txt.
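
To report exactly which packages are missing without stopping at the first failed import, one option is to probe each module with `importlib.util.find_spec`. A minimal sketch, assuming the three packages from the command above; the helper name `missing_packages` is hypothetical.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located.

    find_spec() only looks the module up; it does not import it, so this
    stays fast even for heavy packages such as torch.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example: check the packages this pipeline relies on.
missing = missing_packages(["torch", "pandas", "numpy"])
if missing:
    print(f"Missing packages: {', '.join(missing)}")
    print("Suggested fix: pip install -r requirements.txt")
```

This check only confirms that a package can be found; the one-liner above remains the way to confirm that the imports actually succeed and to print versions.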

Stage 2: Data Validation

Run data validation on the raw data:

python3 -m src.data.validate --data-dir data/raw/

If the validation script does not exist, look for alternative patterns:

python3 src/data/validate.py
python3 -m pytest tests/test_data/ -v --tb=short
Check for pandera schemas in src/data/ and report their status

Verify: validation passes with no critical errors. Log any warnings.

Stage 3: Preprocessing

Run the preprocessing pipeline:

python3 -m src.data.preprocess --config $CONFIG_FILE

Alternative patterns:

python3 src/data/preprocess.py --config $CONFIG_FILE
dvc repro preprocess (if DVC pipeline is configured)

Verify: processed data files exist in data/processed/ (check for .parquet or .csv files).
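
The file check above could look like the following sketch. Treating zero-byte files as missing is an added assumption, and `find_processed_files` is a hypothetical name.

```python
from pathlib import Path
from typing import List


def find_processed_files(directory: Path = Path("data/processed")) -> List[Path]:
    """Return non-empty .parquet and .csv files directly inside `directory`.

    Assumption: an empty file is treated as a failed write and ignored.
    """
    if not directory.is_dir():
        return []
    return sorted(
        path for path in directory.iterdir()
        if path.is_file()
        and path.suffix in {".parquet", ".csv"}
        and path.stat().st_size > 0
    )
```

An empty result means the preprocessing stage should be reported as failed.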

Stage 4: Feature Engineering

Run feature engineering:

python3 -m src.features.build_features --config $CONFIG_FILE

Alternative patterns:

python3 src/features/build_features.py
dvc repro features

Verify: feature files exist in data/features/ with expected columns.
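
For CSV feature files, the column check can be done by reading only the header row. This sketch uses the standard library and does not cover .parquet files, which would need pandas or pyarrow; `missing_columns` and the example column names are hypothetical.

```python
import csv
from pathlib import Path
from typing import Iterable, Set


def missing_columns(csv_path: Path, expected: Iterable[str]) -> Set[str]:
    """Return the expected column names that are absent from the CSV header."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        # Read just the first row; an empty file yields an empty header.
        header = next(csv.reader(handle), [])
    return set(expected) - set(header)
```

For example, `missing_columns(Path("data/features/train.csv"), ["user_id", "label"])` returns an empty set when both columns are present (the file and column names here are illustrative only).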

Stage 5: Model Training

Run model training:

python3 -m src.models.training.trainer --config $CONFIG_FILE

Alternative patterns:

python3 src/models/train.py --config $CONFIG_FILE
python3 train.py --config $CONFIG_FILE

Monitor output for:

Loss values (should decrease over epochs)
Validation metrics at each epoch
Any NaN or Inf values (indicates numerical instability)
Out-of-memory errors

Verify: model checkpoint exists in checkpoints/ directory.
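
The monitoring checks above can be approximated by scanning the captured training output. The log format is an assumption: this sketch only recognizes lines containing `loss=<number>` or `loss: <number>` (prefixed names such as `val_loss` are deliberately ignored), and real trainers may log differently.

```python
import math
import re
from typing import Dict, Iterable, List

# Hypothetical log format: "loss=0.42" or "loss: 0.42" (also nan / inf).
LOSS_PATTERN = re.compile(
    r"\bloss\s*[=:]\s*([-+]?(?:nan|inf|\d+(?:\.\d+)?(?:e[-+]?\d+)?))",
    re.IGNORECASE,
)


def scan_training_log(lines: Iterable[str]) -> Dict[str, object]:
    """Collect loss values and flag NaN/Inf losses and out-of-memory errors."""
    losses: List[float] = []
    out_of_memory = False
    for line in lines:
        if "out of memory" in line.lower():
            out_of_memory = True
        match = LOSS_PATTERN.search(line)
        if match:
            losses.append(float(match.group(1)))
    non_finite = any(not math.isfinite(value) for value in losses)
    # Coarse check: the last finite loss is lower than the first one.
    loss_decreased = len(losses) >= 2 and not non_finite and losses[-1] < losses[0]
    return {
        "losses": losses,
        "non_finite": non_finite,
        "out_of_memory": out_of_memory,
        "loss_decreased": loss_decreased,
    }
```

Comparing only the first and last loss is a deliberately loose test, since losses rarely decrease monotonically from epoch to epoch.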

Stage 6: Evaluation

Run model evaluation on the test set:

python3 -m src.models.evaluation.evaluate --checkpoint checkpoints/best_model.pt --config $CONFIG_FILE

Alternative patterns:

python3 src/evaluation/evaluate.py
python3 evaluate.py --checkpoint checkpoints/best_model.pt

Verify: metrics JSON file exists in reports/ or experiments/.
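
One way to locate and read the metrics file is sketched below. Picking the most recently modified `.json` file under reports/ or experiments/ is an assumption, since this skill does not name the file; both helper names are hypothetical.

```python
import json
from pathlib import Path
from typing import Optional, Sequence

DEFAULT_DIRS = (Path("reports"), Path("experiments"))


def newest_metrics_file(dirs: Sequence[Path] = DEFAULT_DIRS) -> Optional[Path]:
    """Return the most recently modified .json file under `dirs`, if any."""
    candidates = [path for d in dirs if d.is_dir() for path in d.rglob("*.json")]
    return max(candidates, key=lambda path: path.stat().st_mtime, default=None)


def load_metrics(path: Path) -> dict:
    """Load a metrics file and require a JSON object at the top level."""
    with path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        raise ValueError(f"Expected a JSON object in {path}")
    return data
```

If `newest_metrics_file()` returns None, the evaluation stage should be reported as failed even when the command itself exited cleanly.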

Stage 7: Summary

After all stages complete, produce a summary:

Report which stages succeeded and which failed
Print the final evaluation metrics (read from the metrics JSON)
List all generated artifacts (checkpoints, processed data, feature files, metrics)
If any stage failed, provide the error message and suggest a fix
Report total pipeline execution time
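
A summary along these lines could be assembled from per-stage records. The `StageResult` structure and the output layout are illustrative assumptions; the skill does not prescribe a format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StageResult:
    """Outcome of one pipeline stage (hypothetical record type)."""
    name: str
    succeeded: bool
    seconds: float
    error: Optional[str] = None


def format_summary(results: List[StageResult]) -> str:
    """Render one line per stage plus the total execution time."""
    lines = []
    for result in results:
        status = "OK" if result.succeeded else "FAILED"
        lines.append(f"{result.name}: {status} ({result.seconds:.1f}s)")
        if result.error:
            lines.append(f"  error: {result.error}")
    total = sum(result.seconds for result in results)
    lines.append(f"Total time: {total:.1f}s")
    return "\n".join(lines)
```

Metrics and artifact listings would be appended to this text separately, once they have been read from disk.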

Error Handling

If a stage fails, do NOT proceed to the next stage. The one exception is validation warnings, which are non-blocking.
Capture stderr and stdout from each command
For Python errors, read the traceback and identify the root cause
For file-not-found errors, check if the expected directory structure exists
For import errors, report the missing package
For CUDA out-of-memory, suggest reducing batch size in the config
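
The triage rules above can be sketched as a simple classifier over captured stderr. The matched substrings are common Python and PyTorch error messages, but the function name and the wording of the returned hints are assumptions.

```python
import re


def classify_error(stderr: str) -> str:
    """Map captured stderr text to a short diagnosis and suggested fix."""
    text = stderr.lower()
    if "out of memory" in text:
        return "Out of memory: consider reducing the batch size in the config."
    # Python reports missing imports as: No module named 'package'
    missing = re.search(r"No module named '([^']+)'", stderr)
    if missing:
        return f"Missing package: {missing.group(1)}. Try pip install -r requirements.txt."
    if "filenotfounderror" in text or "no such file or directory" in text:
        return "File not found: check that the expected directory structure exists."
    if "traceback (most recent call last)" in text:
        # The final traceback line usually names the exception and its message.
        last_line = stderr.strip().splitlines()[-1]
        return f"Python error: {last_line}"
    return "Unknown error: inspect the captured stdout and stderr."
```

The checks run from most specific to least specific, so an out-of-memory traceback is not reported as a generic Python error.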

related-skills.json

same repository

evaluate-model.md

from "xvirobotics/metaskill"

Load the latest model checkpoint, run evaluation on the test set, and generate a metrics report with confusion matrix. Use this after training to assess model performance or to re-evaluate a specific checkpoint.

2026-02-23, 41 stars

generate-report.md

from "xvirobotics/metaskill"

Generate a comprehensive summary report of the latest experiment including metrics, plots, and comparison with baseline. Use this after training and evaluation to create a shareable experiment summary.

2026-02-23, 41 stars

api-test.md

from "xvirobotics/metaskill"

Run API integration tests against the running backend, verify endpoints return expected responses and status codes. Use after deploying a preview or starting the dev server.

2026-02-23, 41 stars

build-and-test.md

from "xvirobotics/metaskill"

Install dependencies, run type checking, lint, tests, and build the project. Use after making code changes to verify nothing is broken.

2026-02-23, 41 stars

deploy-preview.md

from "xvirobotics/metaskill"

Build Docker images and launch a local preview environment with docker-compose. Use to test the full stack locally before merging.

2026-02-23, 41 stars

build-and-test.md

from "xvirobotics/metaskill"

Build the Xcode project and run the full test suite. Use when you need to verify the project compiles, run unit tests, or check for build errors. Reports pass/fail results with detailed error output.

2026-02-23, 41 stars

package.json

"author": "xvirobotics"

"repository": "xvirobotics/metaskill"

$ useful --forSOC

Data Scientists, Computer and Mathematical Occupations, 15-2051, L4
