# new-service
// Create a complete new ML service from template — end-to-end scaffolding
| name | new-service |
| description | Create a complete new ML service from template — end-to-end scaffolding |
| allowed-tools | ["Read","Write","Edit","Grep","Glob","Bash(cp:*)","Bash(mkdir:*)","Bash(sed:*)","Bash(docker:*)","Bash(kubectl:*)","Bash(dvc:*)","Bash(terraform:*)"] |
| when_to_use | Use when creating a new ML microservice from scratch for a business problem. Examples: 'create a new churn prediction service', 'scaffold a fraud detection API', 'new service for loan default prediction' |
| argument-hint | <service-name> <business-problem> |
| arguments | ["service-name","business-problem"] |
| authorization_mode | {"scaffold_files":"AUTO","init_dvc":"AUTO","create_mlflow_experiment":"AUTO","wire_cicd":"AUTO","push_initial_commit":"CONSULT","escalation_triggers":[{"target_dir_exists":"STOP"},{"eda_artifacts_missing":"STOP"},{"service_name_collides":"STOP"}]} |
Guides creation of a complete, production-ready ML service using the template system.
Arguments:
- `$service-name`: Service slug (e.g., bankchurn, frauddetect)
- `$business-problem`: What the service predicts/classifies

Outcome: A fully deployed, tested, monitored ML service with all quality gates passing, drift detection running, and documentation complete.
Precondition: `templates/scripts/new-service.sh` exists and is executable.

Human checkpoint: Confirm requirements before scaffolding.
Answer these questions:
```bash
bash templates/scripts/new-service.sh "$service-name" "$service-slug"
```
Verify no remaining placeholders:
```bash
grep -r "{ServiceName}\|{service}\|{SERVICE}" $service-name/ --include="*.py" --include="*.yaml" | head -20
```
Success criteria: Directory created with zero remaining `{ServiceName}`, `{service}`, or `{SERVICE}` placeholders. If this is your first use of the template, run `examples/minimal/` to validate that it works.
- Define the Pandera schema in `src/$service-name/schemas.py`
- Track the raw dataset: `dvc add data/raw/dataset.csv`

Success criteria: Pandera schema validates sample data without errors. DVC tracking configured.
- `FeatureEngineer` class in `src/$service-name/training/features.py`
- Model definition in `src/$service-name/training/model.py`
- `Trainer.run()` in `src/$service-name/training/train.py`

Success criteria: `python -m src.$service-name.cli train --data data/raw/dataset.csv` completes with all quality gates passing.
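The template's actual quality-gate logic isn't shown here; a minimal sketch of the pattern `Trainer.run()` might follow, with hypothetical metric names and thresholds:

```python
# Illustrative quality gates checked at the end of training.
# Metric names and thresholds are assumptions, not the template's values.
GATES = {"roc_auc": 0.80, "recall": 0.60}

def check_quality_gates(metrics: dict, gates: dict = GATES) -> list:
    """Return a list of failed-gate messages; an empty list means all pass."""
    return [
        f"{name}: {metrics.get(name, 0.0):.3f} < {threshold}"
        for name, threshold in gates.items()
        if metrics.get(name, 0.0) < threshold
    ]

failures = check_quality_gates({"roc_auc": 0.85, "recall": 0.55})
# One failure: recall is below its 0.60 threshold.
```

In CI, a non-empty failure list would abort the run (e.g. `raise SystemExit` with the messages) so a regressed model never reaches packaging.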
- `app/schemas.py` defines request/response models
- `app/main.py` owns lifespan, `/health`, `/ready`, CORS, tracing, the error envelope, `/model/info`, and `/model/reload`
- `app/fastapi_app.py` owns `/predict`, `/predict_batch`, `/metrics`, model loading, feature parity, SHAP, and prediction logging

Endpoints and contract:
- `/predict` with ThreadPoolExecutor (NEVER call a sync predict directly in an async handler)
- `/predict?explain=true` with SHAP KernelExplainer
- `/predict_batch` for batch predictions (note: underscore, not slash)
- `/health` for liveness probe (200 while the process is alive)
- `/ready` for readiness probe (503 until warm-up complete — D-23)
- `/metrics` for Prometheus
- `FeatureEngineer.transform_inference()` aligned with training
- `predict_proba_wrapper` for SHAP in original feature space
- `tests/test_fastapi_template_contract.py` passing

Success criteria: `pytest tests/test_fastapi_template_contract.py tests/test_api.py -v` passes. `curl localhost:8000/health` returns healthy and `/ready` returns 200 only after the model is loaded and warmed.
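The ThreadPoolExecutor rule above exists because a CPU-bound `model.predict()` called directly inside an async handler blocks the event loop and stalls every in-flight request. A stdlib sketch of the offloading pattern (handler and model function names are illustrative; in the template this lives behind the `/predict` route in `app/fastapi_app.py`):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=4)

def _model_predict(features: list) -> float:
    # Stand-in for a blocking model.predict_proba() call.
    return sum(features) / len(features)

async def predict_handler(features: list) -> dict:
    loop = asyncio.get_running_loop()
    # Run the sync predict in the pool so the event loop stays responsive.
    score = await loop.run_in_executor(_executor, _model_predict, features)
    return {"churn_probability": score}

result = asyncio.run(predict_handler([0.2, 0.4, 0.6]))
```

The same shape applies in a FastAPI route: `await loop.run_in_executor(...)` inside the `async def` endpoint, with the executor created once at startup rather than per request.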
- `Dockerfile` (multi-stage, non-root, HEALTHCHECK)
- `.dockerignore` excludes `models/`, `data/raw/`, `tests/`

```bash
docker build -t $service-name:dev .
docker run -p 8000:8000 $service-name:dev
curl localhost:8000/health
```

Success criteria: Docker build succeeds. Container starts and `/health` returns 200.
- `templates/k8s/deployment.yaml`
- `templates/k8s/hpa.yaml`
- `templates/k8s/service.yaml`

Success criteria: `for o in gcp-dev gcp-staging gcp-prod aws-dev aws-staging aws-prod; do kustomize build k8s/overlays/$o; done` renders valid YAML for all 6 overlays.
In `infra/terraform/{cloud}/`: `terraform plan` → verify → `terraform apply`

Success criteria: `terraform plan` shows expected resources with no errors.
- `.github/workflows/ci.yml`
- `retrain-$service-name.yml` with quality gates

Success criteria: CI workflow triggers on PR and runs tests + lint + type check.
- `/metrics` exports `{service}_requests_total`, `{service}_request_duration_seconds`
- `templates/monitoring/grafana-dashboard.json`

Success criteria: Grafana dashboard shows live metrics. Alert rules configured.
- `drift_detection.py` with quantile-based bins

Success criteria: CronJob runs successfully. PSI metrics appear in Pushgateway.
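The quantile-binned PSI that `drift_detection.py` refers to can be sketched as follows; the bin count of 10 and the epsilon floor are assumptions, not the template's settings.

```python
import numpy as np

def psi(reference: np.ndarray, current: np.ndarray,
        n_bins: int = 10, eps: float = 1e-4) -> float:
    """Population Stability Index between a reference and a current sample."""
    # Bin edges come from the reference distribution's quantiles, so each
    # reference bin holds roughly 1/n_bins of the data even for skewed features.
    edges = np.quantile(reference, np.linspace(0, 1, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range current values
    ref_frac = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_frac = np.histogram(current, bins=edges)[0] / len(current)
    ref_frac = np.clip(ref_frac, eps, None)  # avoid log(0) on empty bins
    cur_frac = np.clip(cur_frac, eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 10_000)
same = rng.normal(0.0, 1.0, 10_000)      # no drift: PSI near 0
shifted = rng.normal(0.5, 1.0, 10_000)   # mean shift: PSI well above 0.1
```

A common rule of thumb is PSI < 0.1 stable, 0.1–0.25 moderate drift, > 0.25 significant drift; the shifted sample above lands in the drift range while the resampled baseline does not.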
- `README.md` with real metrics

Success criteria: README includes measured metrics, not estimates.
Success criteria: `pytest tests/ -v --cov=src --cov-report=term-missing` shows >= 90% coverage.
Never use `==` for ML package pinning — use `~=` (compatible release).

A service is production-ready when ALL of these pass: