一键导入
ml-workflow
Design ML workflows — experiment tracking, feature stores, model training, serving, monitoring for drift
用 Codex 或 Claude 帮你安装 复制这段 Prompt,粘贴到 Codex、Claude 或其他助手里,让它检查 Skill 页面并帮你完成安装。
菜单
Design ML workflows — experiment tracking, feature stores, model training, serving, monitoring for drift
用 Codex 或 Claude 帮你安装 复制这段 Prompt,粘贴到 Codex、Claude 或其他助手里,让它检查 Skill 页面并帮你完成安装。
基于 SOC 职业分类
Local git operations for syncing, branching, merging, and conflict resolution
GitHub interactions for issues, PRs, releases, and repository management
Use this skill when performing hardware security analysis for System-on-Chip components — threat modeling, verification scaffolding, compliance mapping, executive briefing, microarchitectural attack analysis, physical side-channel assessment, kernel security analysis, emerging hardware security, or TLA+ formal specification. Routes to the appropriate specialist. Trigger phrases include "threat model my SoC", "run STRIDE analysis", "generate SVA assertions", "compliance check against FIPS", "executive summary of findings", "Spectre analysis for cache", "DPA attack assessment", "kernel hardening review", "PQC hardware review", "TLA+ spec for access control". Do NOT use for software-only security, network security, or web application security.
Use when working with Terraform or OpenTofu - creating modules, writing tests (native test framework, Terratest), setting up CI/CD pipelines, reviewing configurations, choosing between testing approaches, debugging state issues, implementing security scanning (trivy, checkov), or making infrastructure-as-code architecture decisions
Security audit checklist for web applications. Use when reviewing, auditing, or hardening a web app's security posture. Covers rate limiting, auth headers, IP blocking, CORS, security middleware, input validation, file upload limits, ORM usage, and password hashing. Triggers on requests like "review security", "harden this app", "security audit", "check for vulnerabilities", or when building/reviewing API endpoints.
Use this skill when connecting AI or LLMs to data platforms. Covers MCP servers for warehouses, natural-language-to-SQL, embeddings for data discovery, LLM-powered enrichment, and AI agent data access patterns. Common phrases: "text-to-SQL", "MCP server for Snowflake", "LLM data enrichment", "AI agent access". Do NOT use for general data integration (use data-integration) or dbt modeling (use dbt-transforms).
| name | ML Workflow |
| department | alchemist |
| description | Design ML workflows — experiment tracking, feature stores, model training, serving, monitoring for drift |
| version | 1 |
| triggers | ["ML","machine learning","model","training","feature store","MLflow","experiment","drift","serving","inference","W&B","Weights & Biases","model deployment"] |
Design end-to-end ML workflows covering experiment tracking, feature engineering and storage, model training pipelines, model serving and deployment, A/B testing for models, and monitoring for data and model drift. Produces a workflow architecture, tool selection rationale, and operational runbook.
Before any tooling decisions, formalize:
Document the problem statement, target variable, evaluation metric, and success threshold.
Map raw data to model-ready features:
Set up reproducible experiment management:
Build a reproducible, automated training workflow:
Plan how predictions reach users:
For real-time serving, specify: latency SLA (p50/p99), throughput (requests/second), scaling strategy (auto-scale triggers), and fallback behavior (what happens if the model is unavailable?).
Plan controlled rollout of model changes:
Plan ongoing model health monitoring:
Define retraining policy: scheduled (weekly/monthly), triggered (drift detected), or continuous (online learning).
# ML Workflow: [Project/Model Name]
## Problem Definition
| Aspect | Detail |
|--------|--------|
| Problem type | ... |
| Target variable | ... |
| Business metric | ... |
| Evaluation metric | ... |
| Baseline performance | ... |
| Success threshold | ... |
## Feature Engineering
| Feature | Source | Transformation | Type | Leakage Risk |
|---------|--------|---------------|------|-------------|
| ... | ... | ... | ... | Low/Med/High |
**Feature store:** [Yes/No — tool choice and rationale]
## Experiment Tracking
| Aspect | Choice | Rationale |
|--------|--------|-----------|
| Tool | ... | ... |
| What's tracked | ... | ... |
| Organization | ... | ... |
## Training Pipeline
[ASCII diagram showing data → features → train → evaluate → register]
| Stage | Tool/Method | Notes |
|-------|------------|-------|
| Data split | ... | ... |
| Training | ... | ... |
| Tuning | ... | ... |
| Validation | ... | ... |
| Registry | ... | ... |
## Model Serving
| Aspect | Detail |
|--------|--------|
| Serving mode | Batch / Real-time / Streaming / Edge |
| Latency SLA | ... |
| Throughput | ... |
| Scaling | ... |
| Fallback | ... |
## A/B Testing
| Aspect | Detail |
|--------|--------|
| Traffic split | ... |
| Primary metric | ... |
| Guardrail metrics | ... |
| Min duration | ... |
| Rollback criteria | ... |
## Monitoring and Drift
| Monitor | Tool | Threshold | Action |
|---------|------|-----------|--------|
| Data drift | ... | ... | ... |
| Model drift | ... | ... | ... |
| Concept drift | ... | ... | ... |
| Operational | ... | ... | ... |
**Retraining policy:** [Scheduled / Triggered / Continuous — details]