r-ml
R machine learning packages. Use for classification, regression, clustering, deep learning, gradient boosting (xgboost, lightgbm), random forests, neural networks, and time series forecasting.
| Field | Value |
|---|---|
| name | r-ml |
| description | R machine learning packages. Use for classification, regression, clustering, deep learning, gradient boosting (xgboost, lightgbm), random forests, neural networks, and time series forecasting. |

| Sub-skill | Description |
|---|---|
| r-ml-frameworks | tidymodels, caret, mlr3, h2o |
| r-ml-boosting | xgboost, lightgbm, gbm |
| r-ml-trees | randomForest, ranger, rpart |
| r-ml-regularization | glmnet, lasso, elastic-net |
| r-ml-deeplearning | torch, keras, neural networks |
| r-ml-timeseries | prophet, fable, forecast |
| r-ml-survival | survival, survminer |
| r-ml-anomaly | AnomalyDetection, anomalize |

Machine learning and predictive modeling in R.

| Framework package | Description |
|---|---|
| caret ★ | Classification and Regression Training |
| mlr3 ★ | Next-gen extensible ML framework |
| tidymodels ★ | Tidyverse-friendly modeling |
| h2o | Deep learning, RF, GBM, GLM |

| Boosting package | Description |
|---|---|
| xgboost ★ | eXtreme Gradient Boosting |
| lightgbm ★ | Light Gradient Boosting Machine |
| gbm | Generalized Boosted Regression |
| bst | Gradient Boosting |
| mboost | Model-Based Boosting |
| CoxBoost | Cox models boosting |
| GAMBoost | GAM boosting |
| gamboostLSS | GAMLSS boosting |
| GMMBoost | Mixed models boosting |

| Tree and forest package | Description |
|---|---|
| randomForest | Breiman's random forests |
| ranger ★ | Fast random forests |
| randomForestSRC | RF for survival/regression/classification |
| rpart | Recursive partitioning trees |
| party | Recursive partitioning lab |
| partykit | Partitioning toolkit |
| C50 | C5.0 Decision Trees |
| Cubist | Rule-based regression |
| evtree | Evolutionary trees |
| tree | Classification/regression trees |
| bigrf | Big Random Forests |
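
As a quick illustration of the tree packages listed above, here is a minimal sketch that fits a single classification tree with rpart. The built-in `iris` data, the 30-row hold-out, and the names `train_data`, `test_data`, and `tree_fit` are assumptions chosen for the example, not part of the original skill.

```r
# Minimal sketch: a classification tree with rpart on the built-in iris data.
library(rpart)

set.seed(42)
# Hold out 30 rows for evaluation (the split size is an arbitrary choice).
test_idx   <- sample(nrow(iris), 30)
train_data <- iris[-test_idx, ]
test_data  <- iris[test_idx, ]

# method = "class" requests a classification tree.
tree_fit <- rpart(Species ~ ., data = train_data, method = "class")

# Predict class labels for the held-out rows and compute accuracy.
pred     <- predict(tree_fit, newdata = test_data, type = "class")
accuracy <- mean(pred == test_data$Species)
print(accuracy)
```

The same formula interface carries over to `ranger` and `randomForest`, so swapping the fitting call is usually enough to compare a single tree against a forest.
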

| Regularization package | Description |
|---|---|
| glmnet ★ | Lasso and elastic-net GLMs |
| lars | Least Angle Regression, Lasso |
| elasticnet | Elastic-Net, Sparse PCA |
| penalized | L1/L2 penalized estimation |
| ncvreg | SCAD/MCP regularization |
| grplasso | Group Lasso |
| grpreg | Grouped covariates regularization |
| L0Learn | Best subset selection |

| Deep learning package | Description |
|---|---|
| torch ★ | PyTorch-like tensors/NNs |
| MXNet | Flexible GPU deep learning |
| nnet | Feed-forward NNs |
| RSNNS | Stuttgart NN Simulator |
| keras | Keras interface |
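
For a small feed-forward network without any external runtime, nnet is often enough. The sketch below is an illustration only: the `iris` data and the values chosen for `size`, `decay`, and `maxit` are arbitrary assumptions, and it reports training-set accuracy rather than a proper hold-out estimate.

```r
# Minimal sketch: a single-hidden-layer network with nnet on iris.
library(nnet)

set.seed(1)
# size = number of hidden units; decay = weight-decay penalty.
net_fit <- nnet(Species ~ ., data = iris,
                size = 4, decay = 0.01, maxit = 200, trace = FALSE)

# type = "class" returns the predicted class labels as a character vector.
pred <- predict(net_fit, newdata = iris, type = "class")
train_accuracy <- mean(pred == iris$Species)
print(train_accuracy)
```
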

| Mixed-effects package | Description |
|---|---|
| lme4 ★ | Mixed-effects models |
| nlme | Mixed-effects with custom covariance |
| glmmTMB | Generalized mixed-effects |

| Time series package | Description |
|---|---|
| prophet ★ | Facebook's forecasting tool |
| fable | Tidy forecasting |
| forecast | Time series forecasting |

| Anomaly detection package | Description |
|---|---|
| AnomalyDetection | Twitter's anomaly detection |
| anomalize | Tidy anomaly detection |
| BreakoutDetection | Twitter's breakout detection |
| CausalImpact | Google's causal inference |

| SVM and kernel package | Description |
|---|---|
| kernlab | Kernel-based ML lab |
| e1071 | SVM, Naive Bayes, etc. |
| LiblineaR | Linear predictive models |
| svmpath | SVM path algorithm |
| penalizedSVM | Feature selection SVM |

| SOM and classification package | Description |
|---|---|
| kohonen | Self-Organizing Maps |
| Rsomoclu | Parallel SOM |
| hda | Heteroscedastic Discriminant Analysis |
| klaR | Classification and visualization |

| Feature selection package | Description |
|---|---|
| Boruta | All-relevant feature selection |
| FSelector | Subset-search/ranking selection |
| varSelRF | Variable selection with RF |

| Survival analysis package | Description |
|---|---|
| survival ★ | Survival analysis |
| survminer | Survival visualization |
| ahaz | Additive hazards regression |
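
A typical first pass with the survival package is a Kaplan-Meier estimate followed by a Cox model. The sketch below assumes the `lung` dataset that ships with survival; the choice of `age` and `sex` as covariates is only for illustration.

```r
# Minimal sketch: Kaplan-Meier curves and a Cox model on survival::lung.
library(survival)

# Kaplan-Meier survival estimates stratified by sex.
km_fit <- survfit(Surv(time, status) ~ sex, data = lung)
print(km_fit)

# Cox proportional hazards model with age and sex as covariates.
cox_fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
print(summary(cox_fit))
```

The `km_fit` object can be passed to `survminer::ggsurvplot()` for plotting if survminer is installed.
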

| Miscellaneous package | Description |
|---|---|
| arules | Association rules mining |
| rattle | GUI for data mining |
| rminer | Simplified NN/SVM usage |
| ROCR | Classifier performance visualization |
| SuperLearner | Ensemble learning |
| RWeka | Weka interface |
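
```r
# tidymodels workflow
# Assumes `df` is a data frame whose outcome column `target` is a factor.
library(tidymodels)

set.seed(123)  # make the random split reproducible
split <- initial_split(df, prop = 0.8)
train_data <- training(split)
test_data  <- testing(split)

# Random forest specification using the ranger engine
rf_spec <- rand_forest(trees = 100) %>%
  set_engine("ranger") %>%
  set_mode("classification")

# Preprocessing: center and scale numeric predictors only
rec <- recipe(target ~ ., data = train_data) %>%
  step_normalize(all_numeric_predictors())

# Bundle model and recipe, fit on the training set, predict on the test set
wf <- workflow() %>%
  add_model(rf_spec) %>%
  add_recipe(rec)

rf_fit <- fit(wf, data = train_data)
predict(rf_fit, new_data = test_data)
```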

```r
# xgboost
# Assumes `train_x` holds numeric features and `train_y` is a 0/1 numeric label vector.
library(xgboost)

dtrain <- xgb.DMatrix(data = as.matrix(train_x), label = train_y)
model <- xgb.train(
  params = list(objective = "binary:logistic", max_depth = 6),
  data = dtrain,
  nrounds = 100
)
```
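
```r
# caret (method = "rf" requires the randomForest package)
# Assumes `train_data` is a data frame with an outcome column `target`.
library(caret)
ctrl <- trainControl(method = "cv", number = 5)  # 5-fold cross-validation
model <- train(target ~ ., data = train_data, method = "rf", trControl = ctrl)
```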