| name | r-targets |
| description | Build and maintain reproducible data analysis pipelines in R using the targets package. Use when working with _targets.R files, creating computational workflows, managing dependencies between R analysis steps, debugging pipeline errors, optimizing performance, or implementing dynamic/static branching for large-scale analyses. |
R Targets Package Skill
This skill helps you effectively use the targets R package for building reproducible, scalable data analysis pipelines.
Core Concepts
What is targets?
targets is a Make-like pipeline tool for R that:
- Skips costly runtime for tasks already up to date
- Orchestrates computation with implicit parallel computing
- Abstracts files as R objects
- Tracks dependencies automatically through static code analysis
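As a rough illustration of the skipping behavior listed above, the sketch below builds a two-target pipeline in a throwaway directory and runs it twice. The target names `raw` and `total` and the directory name are made up for the example, and `tar_script()` is used only to keep the snippet self-contained.

```r
library(targets)

# Work in a throwaway directory so the demo does not touch a real project.
demo_dir <- file.path(tempdir(), "targets-skip-demo")
dir.create(demo_dir, showWarnings = FALSE)
old_wd <- setwd(demo_dir)

# Write a minimal _targets.R (hypothetical targets `raw` and `total`).
tar_script({
  list(
    tar_target(raw, c(1, 2, 3)),
    tar_target(total, sum(raw))
  )
}, ask = FALSE)

tar_make()                            # first run: builds both targets
outdated_after_run <- tar_outdated()  # names of targets that would rerun
tar_make()                            # second run: nothing changed, so both are skipped
total_value <- tar_read(total)

setwd(old_wd)
```

Editing the command of `raw` in the script and calling `tar_make()` again should rebuild `raw` and then `total`, since `total` depends on it.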
Key Files
- _targets.R: The target script file that defines your pipeline. Must return a list of target objects.
- _targets/: Data store containing:
  - _targets/meta/meta: Target metadata (text file)
  - _targets/objects/: Target output data
  - _targets/workspaces/: Debug workspaces for errored targets
Quick Start
Basic Pipeline Structure
library(targets)
library(tarchetypes)
tar_source()
tar_option_set(packages = c("readr", "dplyr", "ggplot2")) # readr provides read_csv()
list(
tar_target(file, "data.csv", format = "file"),
tar_target(data, read_csv(file)),
tar_target(model, fit_model(data)),
tar_target(plot, create_plot(model, data))
)
Essential Commands
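tar_make()                  # run the pipeline, skipping up-to-date targets
tar_outdated()              # list targets that would rerun
tar_visnetwork()            # interactive dependency graph
tar_manifest()              # data frame of targets and their commands
tar_read(target_name)       # return a target's value
tar_load(target_name)       # load a target's value into the current environment
tar_destroy()               # delete the entire _targets/ data store
tar_delete(target_name)     # delete a target's stored value; it reruns on the next tar_make()
tar_invalidate(target_name) # delete a target's metadata so it reruns; the stored value is kept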
Best Practices
Target Design
A good target should:
- Create a dataset, analyze a dataset, or summarize an analysis
- Be large enough to save meaningful time when skipped
- Be small enough that some targets can skip while others run
- Have no side effects (except file targets with format = "file")
- Return a single, meaningful, saveable value
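One way to reconcile a file-writing step with these guidelines is to have the function return the path it wrote, so that a `format = "file"` target can track the file. This is a minimal sketch. The helper name `write_summary`, the target name `summary_file`, and the output path are hypothetical.

```r
# Hypothetical helper: writes a data frame to disk and returns the path.
# Returning the path is what lets a format = "file" target track the file.
write_summary <- function(data, path) {
  utils::write.csv(data, path, row.names = FALSE)
  path
}

# Standalone check outside any pipeline, using a built-in dataset.
out_path <- write_summary(head(mtcars), tempfile(fileext = ".csv"))
file.exists(out_path)

# Inside _targets.R this might be used as (names are assumptions):
# tar_target(summary_file, write_summary(results, "summary.csv"), format = "file")
```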
Function-Oriented Workflows
Define functions in the R/ directory, not inline in _targets.R:
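# R/functions.R
get_data <- function(file) {
  read_csv(file) %>%
    filter(!is.na(value))
}
fit_model <- function(data) {
  lm(outcome ~ predictor, data)
}
# _targets.R
library(targets)
tar_source() # sources the scripts in R/
tar_option_set(packages = c("readr", "dplyr")) # needed by get_data()
list(
  tar_target(file, "data.csv", format = "file"), # track the input file for changes
  tar_target(data, get_data(file)),
  tar_target(model, fit_model(data))
)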
Storage Formats
Choose appropriate formats for your data:
| Format | Best For | Requirements |
|---|---|---|
| "rds" (default) | General R objects | base R |
| "qs" | Large/general objects | qs2 package |
| "feather" | Data frames | arrow package |
| "parquet" | Large data frames | arrow package |
| "file" | External files | Returns file path |
tar_option_set(format = "qs")               # default format for every target
tar_target(data, get_data(), format = "qs") # or set the format for a single target
Dynamic Branching
Dynamic branching creates targets at runtime based on data:
list(
tar_target(samples, c("A", "B", "C")),
tar_target(
analysis,
analyze_sample(samples),
pattern = map(samples)
),
tar_target(
combined,
combine_results(analysis)
)
)
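To see the branches being created at runtime, the pipeline above can be run in a scratch directory. The sketch below is a simplified variant: the target name `label` and the `paste0()` command stand in for `analysis` and the undefined `analyze_sample()`, and the directory name is made up.

```r
library(targets)

# Scratch directory so the demo leaves real projects alone.
demo_dir <- file.path(tempdir(), "targets-dynamic-demo")
dir.create(demo_dir, showWarnings = FALSE)
old_wd <- setwd(demo_dir)

tar_script({
  list(
    tar_target(samples, c("A", "B", "C")),
    # One branch per element of `samples`.
    tar_target(label, paste0("sample_", samples), pattern = map(samples)),
    # A downstream target without a pattern receives all branches aggregated.
    tar_target(combined, paste(label, collapse = ","))
  )
}, ask = FALSE)

tar_make()

label_values   <- tar_read(label)               # all branches, aggregated
first_branch   <- tar_read(label, branches = 1) # only the first branch
combined_value <- tar_read(combined)

setwd(old_wd)
```

Adding a fourth element to `samples` and rerunning should build only the new branch of `label` plus `combined`, while the three existing branches are skipped.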
Pattern Types
- map(x, y): One branch per tuple of elements
- cross(x, y): One branch per combination
- slice(x, index = c(1, 3)): Branch over specific indices
- head(x, n = 5): First n elements
- tail(x, n = 5): Last n elements
- sample(x, n = 5): Random sample
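The patterns above can be previewed without running a pipeline by using `tar_pattern()`, which takes a pattern plus the assumed length of each upstream target. In this sketch `x` and `y` are placeholder names and the lengths are arbitrary.

```r
library(targets)

# Each call returns a data frame with one row per branch that would be created.
mapped    <- tar_pattern(map(x, y), x = 3, y = 3)    # map() needs equal lengths: 3 branches
crossed   <- tar_pattern(cross(x, y), x = 3, y = 2)  # every combination: 3 * 2 = 6 branches
first_two <- tar_pattern(head(x, n = 2), x = 5)      # first 2 of 5 elements: 2 branches

nrow(mapped)
nrow(crossed)
nrow(first_two)
```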
Iteration Modes
- "vector" (default): Uses vctrs::vec_slice() and vctrs::vec_c()
- "list": Uses [[ for slicing and list() for aggregation
- "group": Branch over dplyr::group_by() row groups (use with tar_group())
Static Branching with tarchetypes
Static branching creates targets before the pipeline runs using metaprogramming:
library(targets)
library(tarchetypes)
values <- tibble::tibble(
  method = rlang::syms(c("method1", "method2")), # symbols: inserted as function names
  dataset = c("data1", "data2")                  # strings: inserted as literal values
)
tar_map(
  values = values,
  tar_target(analysis, method(dataset)),
  tar_target(summary, summarize(analysis))
)
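Because static branches exist before the pipeline runs, they can be inspected with `tar_manifest()` without executing anything. The sketch below writes the example above to a scratch directory and lists the generated targets. It assumes tarchetypes, tibble, and rlang are installed. `method1` and `method2` do not need to exist, because no command is run.

```r
library(targets)

demo_dir <- file.path(tempdir(), "targets-static-demo")
dir.create(demo_dir, showWarnings = FALSE)
old_wd <- setwd(demo_dir)

tar_script({
  library(tarchetypes)
  values <- tibble::tibble(
    method = rlang::syms(c("method1", "method2")),
    dataset = c("data1", "data2")
  )
  tar_map(
    values = values,
    tar_target(analysis, method(dataset)),
    tar_target(summary, summarize(analysis))
  )
}, ask = FALSE)

# One row per generated target: two rows of `values` times two targets.
manifest <- tar_manifest(fields = c("name", "command"))
print(manifest)

setwd(old_wd)
```

The `command` column shows each row of `values` substituted into the original commands, which is a quick way to check the metaprogramming before running `tar_make()`.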
Debugging Workflow
Step 1: Check Error Details
tar_meta(fields = error, complete_only = TRUE)
Step 2: Reproduce Error Locally
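tar_load_globals()    # load the pipeline's functions, global objects, and packages
tar_load(target_name) # load the upstream targets the failing command depends on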
Step 3: Interactive Debugging (if needed)
tar_make(callr_function = NULL, use_crew = FALSE) # run in the current session so browser() and debug() work
See references/TROUBLESHOOTING.md for detailed error solutions.
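The steps above can be tried end to end on a deliberately failing pipeline. This is a sketch under stated assumptions: the targets `x` and `bad` are invented, and `workspace_on_error = TRUE` is set so that the failed target's dependencies are saved under _targets/workspaces/.

```r
library(targets)

demo_dir <- file.path(tempdir(), "targets-debug-demo")
dir.create(demo_dir, showWarnings = FALSE)
old_wd <- setwd(demo_dir)

tar_script({
  tar_option_set(workspace_on_error = TRUE)
  list(
    tar_target(x, 1:3),
    tar_target(bad, stop("demo failure: ", sum(x)))  # always errors
  )
}, ask = FALSE)

# tar_make() stops with an error here, so wrap it to let the script continue.
try(tar_make(), silent = TRUE)

# Step 1: which targets failed, and with what message?
errors <- tar_meta(fields = error, complete_only = TRUE)
print(errors)

# Step 2: load the saved workspace of the failed target, then rerun
# pieces of its command interactively.
saved <- tar_workspaces()
tar_workspace(bad)
sum(x)

setwd(old_wd)
```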
Advanced Topics (See References)
Useful Utilities
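tar_deps(your_function)                # dependencies found by static code analysis
tar_pattern(cross(x, y), x = 3, y = 2) # preview branches; map(x, y) would need equal lengths
tar_meta(targets_only = TRUE)          # metadata for targets only, not functions or globals
tar_meta(fields = c("name", "seconds", "time", "error")) # selected metadata columns
tar_poll()                             # live progress summary in the console
tar_progress()                         # progress status of each target
tar_watch()                            # Shiny app for monitoring a running pipeline
tar_validate()                         # check the pipeline definition for problems
tar_glimpse()                          # lightweight dependency graph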
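As a small illustration of the static code analysis behind dependency tracking, `tar_deps()` can be given an expression directly. It quotes its argument, so nothing is evaluated. The names `fit_model`, `summarize_fit`, and `data` below are placeholders.

```r
library(targets)

# The expression is analyzed, not run, so the functions need not exist.
deps <- tar_deps({
  fit <- fit_model(data)
  summarize_fit(fit)
})

# Global names the code refers to; `fit` is assigned locally, so it should
# not be treated as an upstream dependency.
print(deps)
```

Any of these names that matches a target or a function defined in the pipeline would become an upstream dependency of a target with this command.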