machine-learning
Machine learning development patterns, model training, evaluation, and deployment. Use when building ML pipelines, training models, feature engineering, model evaluation, or deploying ML systems to production.
| Field | Value |
|---|---|
| name | machine-learning |
| description | Machine learning development patterns, model training, evaluation, and deployment. Use when building ML pipelines, training models, feature engineering, model evaluation, or deploying ML systems to production. |
| author | Joseph OBrien |
| status | unpublished |
| updated | 2025-12-23 |
| version | 1.0.1 |
| tag | skill |
| type | skill |
Comprehensive machine learning skill covering the full ML lifecycle from experimentation to production deployment.
Classification Types:
Success Metrics by Problem Type:
| Problem Type | Primary Metrics | Secondary Metrics |
|---|---|---|
| Binary Classification | AUC-ROC, F1 | Precision, Recall, PR-AUC |
| Multi-class | Macro F1, Accuracy | Per-class metrics |
| Regression | RMSE, MAE | R², MAPE |
| Ranking | NDCG, MAP | MRR |
| Clustering | Silhouette, Calinski-Harabasz | Davies-Bouldin |
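To make a few of the metrics in the table concrete, here is a minimal standard-library sketch of precision, recall, and F1 for binary labels, plus RMSE, MAE, and R² for regression. The function names are illustrative and not part of this skill; in practice a library such as scikit-learn would usually be used instead.

```python
import math


def binary_classification_metrics(y_true, y_pred):
    """Precision, recall, and F1 for labels encoded as 0 (negative) and 1 (positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """RMSE, MAE, and R^2 for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined for a constant target; report 0.0 in that case.
    r2 = 1 - ss_res / ss_tot if ss_tot else 0.0
    return {"rmse": rmse, "mae": mae, "r2": r2}
```

Threshold-free metrics such as AUC-ROC and PR-AUC need predicted scores rather than hard labels, so they are left out of this sketch.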
Data Quality Checks:
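The specific checks are not listed here. As one possible illustration, the sketch below reports per-column missing rates and the number of duplicate rows for a list of dict records. The choice of these two checks, the function name, and the record layout are all assumptions.

```python
def data_quality_report(rows, columns):
    """Summarise missing values and duplicates for a list of dict records.

    `rows` and `columns` are hypothetical inputs: each row is a dict, and a
    value of None (or an absent key) counts as missing.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    n = len(rows)
    missing_rate = {
        col: sum(1 for r in rows if r.get(col) is None) / n for col in columns
    }
    seen = set()
    duplicate_rows = 0
    for r in rows:
        key = tuple(r.get(col) for col in columns)
        if key in seen:
            duplicate_rows += 1  # same values in every checked column
        else:
            seen.add(key)
    return {"n_rows": n, "missing_rate": missing_rate, "duplicate_rows": duplicate_rows}
```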
Feature Engineering Patterns:
Train/Test Split Strategies:
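The strategies themselves are not enumerated here. One commonly used option for temporal data is a time-based split, where every test record is at least as recent as every training record, so the model is never evaluated on data older than what it was trained on. The sketch below assumes records are dicts with a sortable timestamp field; the function and parameter names are hypothetical.

```python
def time_based_split(records, timestamp_key, test_fraction=0.2):
    """Split records chronologically: oldest go to train, newest to test."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    if len(records) < 2:
        raise ValueError("need at least two records to split")
    ordered = sorted(records, key=lambda r: r[timestamp_key])
    # Keep at least one record on each side of the cut.
    n_test = min(max(1, round(len(ordered) * test_fraction)), len(ordered) - 1)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]
```

Unlike a random split, this keeps future information out of the training set, which matters whenever the target depends on time.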
Algorithm Selection Guide:
| Data Size | Problem | Recommended Models |
|---|---|---|
| Small (<10K) | Classification | Logistic Regression, SVM, Random Forest |
| Small (<10K) | Regression | Linear Regression, Ridge, SVR |
| Medium (10K-1M) | Classification | XGBoost, LightGBM, Neural Networks |
| Medium (10K-1M) | Regression | XGBoost, LightGBM, Neural Networks |
| Large (>1M) | Any | Deep Learning, Distributed training |
| Tabular | Any | Gradient Boosting (XGBoost, LightGBM, CatBoost) |
| Images | Classification | CNN, ResNet, EfficientNet, Vision Transformers |
| Text | NLP | Transformers (BERT, RoBERTa, GPT) |
| Sequential | Time Series | LSTM, Transformer, Prophet |
Hyperparameter Tuning:
Common Hyperparameters:
| Model | Key Parameters |
|---|---|
| XGBoost | learning_rate, max_depth, n_estimators, subsample |
| LightGBM | num_leaves, learning_rate, n_estimators, feature_fraction |
| Random Forest | n_estimators, max_depth, min_samples_split |
| Neural Networks | learning_rate, batch_size, layers, dropout |
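As a rough illustration of tuning the parameters in the table, here is a minimal random-search sketch in plain Python. The search space uses the XGBoost parameter names from the table, but the ranges are illustrative assumptions, and `toy_objective` is a stand-in for a real cross-validated score.

```python
import random

# Hypothetical search space: parameter names come from the table above,
# the ranges are assumptions chosen only for illustration.
SEARCH_SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-3, -0.5),  # log-uniform
    "max_depth": lambda rng: rng.randint(3, 10),
    "n_estimators": lambda rng: rng.choice([100, 200, 500, 1000]),
    "subsample": lambda rng: rng.uniform(0.5, 1.0),
}


def random_search(objective, space, n_trials=20, seed=0):
    """Sample n_trials configurations and return the best (params, score)."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: sample(rng) for name, sample in space.items()}
        score = objective(params)  # higher is better
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Stand-in for a validation score; a real one would train and evaluate a model."""
    return -abs(params["max_depth"] - 6) - abs(params["learning_rate"] - 0.1)
```

Calling `random_search(toy_objective, SEARCH_SPACE, n_trials=50)` returns the best sampled configuration. Sampling `learning_rate` on a log scale is a common choice because useful values span several orders of magnitude.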
Evaluation Best Practices:
Handling Imbalanced Data:
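One widely used technique for imbalanced data is class weighting, where each class is weighted inversely to its frequency. The sketch below uses the formula `n_samples / (n_classes * class_count)`, the same heuristic scikit-learn applies for `class_weight="balanced"`. The function name is illustrative, and whether weighting fits better than resampling depends on the problem.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class, inversely proportional to class frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}
```

For a dataset with 90 negatives and 10 positives, the positive class receives a weight nine times larger than the negative class, so errors on the minority class cost more during training.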
Model Serving Patterns:
Production Considerations:
What to Monitor:
Retraining Triggers:
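The triggers are not listed here. One plausible trigger is input drift, which can be estimated with the Population Stability Index (PSI) between a training-time baseline and recent production values of a feature. The sketch below is a simplified version with equal-width bins. The 0.2 threshold is a commonly quoted rule of thumb and should be treated as an assumption, as should the function names.

```python
import math


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a baseline sample and a recent sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant features

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def should_retrain(expected, actual, threshold=0.2):
    """Flag retraining when drift exceeds the (assumed) PSI threshold."""
    return population_stability_index(expected, actual) > threshold
```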
Track for every experiment:
models/
├── model_v1.0.0/
│ ├── model.pkl
│ ├── metadata.json
│ ├── requirements.txt
│ └── metrics.json
├── model_v1.1.0/
└── model_v2.0.0/
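A directory layout like the one above could be produced by a small helper such as the following sketch. The function name, its arguments, and the use of `pickle` for `model.pkl` are assumptions; the file names follow the tree.

```python
import json
import pickle
from pathlib import Path


def save_model_version(model, version, metrics, metadata, requirements, root="models"):
    """Write one versioned model directory: model.pkl plus its sidecar files."""
    target = Path(root) / f"model_v{version}"
    # exist_ok=False refuses to overwrite an already published version.
    target.mkdir(parents=True, exist_ok=False)
    with open(target / "model.pkl", "wb") as fh:
        pickle.dump(model, fh)
    (target / "metadata.json").write_text(json.dumps(metadata, indent=2))
    (target / "metrics.json").write_text(json.dumps(metrics, indent=2))
    (target / "requirements.txt").write_text("\n".join(requirements) + "\n")
    return target
```

Treating each version directory as immutable keeps every deployed model reproducible from its own metadata and pinned requirements.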
Continuous Integration:
Continuous Deployment:
For detailed patterns and code examples, load reference files as needed:
- references/preprocessing.md - Data preprocessing patterns and feature engineering techniques
- references/model_patterns.md - Model architecture patterns and implementation examples
- references/evaluation.md - Comprehensive evaluation strategies and metrics