| name | add-subworkflow |
| description | Scaffold a new Bactopia subworkflow that orchestrates existing modules. Creates main.nf with GroovyDoc and test files. Use when asked to add a new subworkflow, create a subworkflow, or wire up modules into a subworkflow. |
Add Subworkflow
Scaffold a Bactopia subworkflow that orchestrates one or more existing modules. Subworkflows are glue -- they wire modules together, aggregate results, and provide a clean interface for workflows.
Prerequisites
- The module(s) this subworkflow will use must already exist under modules/
- Read .claude/docs/standards/04-subworkflow-documentation.md for documentation standards
Interactive Questioning
This skill is interactive -- ask the user early and often, especially before creating files.
- Multiple questions at once: Use AskUserQuestion popups (up to 4 questions per batch). Mark the recommended option with "(Recommended)" at the end of its label and place it first.
- Single simple question: Just ask in chat, no popup needed.
- When in doubt: Ask. It's cheaper to clarify upfront than to regenerate files.
What a Subworkflow Contains
A subworkflow is a single main.nf file plus tests:
subworkflows/{tool}/
  main.nf                  # Workflow definition with GroovyDoc
  tests/
    main.nf.test           # nf-test specification
    main.nf.test.snap      # Snapshot (generated by nf-test)
    nextflow.config        # Includes ALL module.configs used by this subworkflow
    nf-test.config         # Standard nf-test config
    .nftignore             # Exclude unstable files from snapshots
No module.config or schema.json -- those belong to modules only.
Phased Workflow
Phase 1: Gather Information
Goal: Determine which modules to orchestrate and how, using interactive prompts.
Important: Use the AskUserQuestion tool for structured choices throughout this phase.
Present up to 4 questions per batch. Mark the recommended option with "(Recommended)"
at the end of its label and place it first in the options list.
- Ask the user which modules this subworkflow uses. Get the module paths (e.g., modules/nohuman/run, modules/mlst).
  - Read each module's main.nf to understand its inputs and outputs.
  - Does it call other subworkflows? If so, use the @subworkflows tag (not @modules) for those includes.
- Run the lookup command if package info is needed:
  bash .claude/skills/add-bactopia-tool/scripts/run-bactopia-scaffold.sh lookup {package_name} --bactopia-path . --pretty
- Determine the input type from the primary module's inputs, then run test-data discovery:
  bash .claude/skills/add-bactopia-tool/scripts/run-bactopia-scaffold.sh test-data --input-type {input_type} --bactopia-path . --pretty
  This returns species/accession combinations already used by similar modules, with pre-computed test_data_path, test_uncompressed_path, test_species, and test_sample_id values. Use the returned paths directly in the scaffold config (Phase 2) -- do NOT construct paths manually. Subworkflow tests use the compressed path (test_data_path).
- Batch 1: Design choices (AskUserQuestion, up to 3 questions)
  Question 1 -- Aggregation strategy:
  - CSVTK_CONCAT -- concatenate per-sample tabular output (most common) (Recommended)
  - Dedicated summary module -- the tool has its own aggregation command (rare)
  - No aggregation -- the tool doesn't produce per-sample tabular output
  Question 2 -- Test data species:
  Present the top 3-4 species from the test-data discovery results. Recommend species that exercise the tool's functionality. Include the accession in each option's description.
  Question 3 -- Aggregation field (only if CSVTK_CONCAT or a dedicated summary module was selected):
  - Which output field should be aggregated? (e.g., tsv, report, csv)
  - What format? (tsv or csv)
- Present the following for confirmation (derive from module main.nf and lookup):
  - Tool identity: name (snake_case), display name, one-sentence description
  - Outputs: from the primary module's output block (for @output GroovyDoc tags)
  - Citation key and keywords for GroovyDoc
- Final confirmation (AskUserQuestion, 1 question)
  After presenting the summary, ask:
  - Looks good, proceed to file generation
  - I need to make changes (user provides details via "Other" or notes)
Phase 2: File Generation
Goal: Generate the 5 subworkflow files using bactopia-scaffold.
- Construct the JSON config from the design decisions. Write it to /tmp/scaffold-config.json:
  {
    "tool": "{tool_name}",
    "display_name": "{DisplayName}",
    "description": "{One-sentence description}",
    "process_name": "{TOOL_NAME}",
    "package": "{package_name}",
    "version": "{version}",
    "build": "{build}",
    "home_url": "{github_url}",
    "input_type": "assembly",
    "has_database": false,
    "handles_gz": false,
    "layout": "flat",
    "resource_label": "process_low",
    "version_command": "{version_command}",
    "citation_key": "{citation_key}",
    "keywords": ["{keyword1}", "{keyword2}"],
    "aggregation": {
      "strategy": "csvtk_concat",
      "field": "{output_field}",
      "format": "{tsv|csv}"
    },
    "outputs": [
      {"name": "{field}", "extension": "{ext}", "description": "{desc}"}
    ],
    "parameters": [],
    "container_refs": {
      "toolName": "{from lookup}",
      "docker": "{from lookup}",
      "image": "{from lookup}"
    },
    "test_species": "{species}",
    "test_sample_id": "{sample_id}",
    "test_data_path": "{compressed_path}",
    "test_uncompressed_path": "{uncompressed_path}"
  }
For database-dependent subworkflows, also include:
  {
    "database": {
      "param_name": "{tool}_db",
      "test_path": "datasets/{tool}/{db_file}"
    }
  }
- Run the scaffold command:
  bash .claude/skills/add-bactopia-tool/scripts/run-bactopia-scaffold.sh subworkflow --config /tmp/scaffold-config.json --bactopia-path . --pretty
- The command creates 5 files:
  - subworkflows/{tool}/main.nf
  - subworkflows/{tool}/tests/main.nf.test
  - subworkflows/{tool}/tests/nextflow.config
  - subworkflows/{tool}/tests/nf-test.config
  - subworkflows/{tool}/tests/.nftignore
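The two ends of this phase can be checked mechanically. The sketch below is not part of bactopia-scaffold; both function names are hypothetical. The first helper flags template placeholders such as "{tool_name}" that were left in the config, and the second lists any of the five expected files that are missing after the scaffold run. It assumes a filled-in config contains no literal braces inside string values.

```shell
# Hypothetical helpers; neither ships with bactopia-scaffold.

# Print config lines that still contain a template placeholder such as
# "{tool_name}" or "{tsv|csv}". Succeeds (exit 0) when at least one is found.
# Non-empty JSON objects are not matched: the pattern rejects braces that
# enclose a quote, colon, or comma.
find_unfilled_placeholders() {
    local config="$1"
    grep -nE '\{[^{}":,]+\}' "$config"
}

# Print each expected scaffold output that is missing for a tool.
# Arguments: tool name, then the repository root (defaults to ".").
missing_scaffold_files() {
    local tool="$1" root="${2:-.}" rel
    for rel in main.nf tests/main.nf.test tests/nextflow.config \
               tests/nf-test.config tests/.nftignore; do
        [ -f "$root/subworkflows/$tool/$rel" ] || echo "subworkflows/$tool/$rel"
    done
    return 0
}
```

For example, run `find_unfilled_placeholders /tmp/scaffold-config.json` before the scaffold command and `missing_scaffold_files {tool}` from the repository root afterwards; empty output from both means nothing was flagged.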
Phase 3: Review & Customize
Goal: Review generated files and make tool-specific adjustments.
- Subworkflow main.nf -- review and customize:
  - The @input GroovyDoc should match the subworkflow's take channel name
  - For the CSVTK_CONCAT pattern, verify the gather field and format are correct
  - If the subworkflow uses modules not in the standard pattern (e.g., calls other subworkflows), add the appropriate @subworkflows tag and adjust includes
  - The @modules tag should use underscore-delimited directory keys: csvtk_concat, {tool}
- Test nextflow.config -- verify it includes ALL module.configs for processes in the subworkflow:
  - The primary module's config
  - csvtk/concat/module.config if using CSVTK_CONCAT
  - Any other module configs
- Test main.nf.test -- verify:
  - Test data paths match the species/sample chosen
  - Database input lines are present if needed
  - Snapshot fields include the right output field names
- Run the linter to catch structural issues before proceeding:
  bash .claude/skills/add-bactopia-tool/scripts/run-bactopia-lint.sh {tool} --bactopia-path .
  This runs bactopia-lint scoped to the new subworkflow (and module if it exists). Fix any FAILs before moving on. Common issues:
  - S011: misaligned include braces in subworkflow
  - S019: citation key not found in data/citations.yml
- Update data/citations.yml -- add the tool citation entry in alphabetical order (if not already present from a prior /add-module run):
  {tool}:
    name: "{ToolName}"
    link: "{github_url}"
    description: "{One-sentence description}"
    cite: "{Full citation text}"
- List all created files with full paths.
- Remind the user to run these follow-up steps in order:
  - /run-tests {tool} subworkflow --generate -- generate snapshots and verify the subworkflow test passes (new subworkflows have no existing snapshots)
  - The subworkflow needs a workflow entry point to be usable -- use /add-bactopia-tool if this is a standalone bactopia-tool
  The --generate flag is required because newly scaffolded subworkflows have no snapshot files yet. Without it, nf-test will fail immediately on missing snapshots.
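Before editing data/citations.yml, it can help to check whether the key is already there, since a missing key is what the S019 lint failure reports. The helper below is a hypothetical sketch, not part of the Bactopia tooling. It assumes each citation key sits on its own line as "key:" (optionally indented) and that tool names contain no regex metacharacters, which holds for snake_case names. A nested field whose name equals the tool name would produce a false positive.

```shell
# Hypothetical helper; assumes citation keys appear as "key:" at line start,
# optionally preceded by indentation.

# Succeed if the citations file already contains an entry for the key.
# Arguments: citation key, then the file (defaults to data/citations.yml).
citation_exists() {
    local key="$1" file="${2:-data/citations.yml}"
    grep -qE "^[[:space:]]*${key}:" "$file"
}
```

A typical use is `citation_exists {tool} || echo "add {tool} to data/citations.yml"`, run from the repository root.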
Subworkflow Patterns
The scaffold generates one of three patterns based on the aggregation.strategy:
| Strategy | Pattern | Include | Emit |
|---|---|---|---|
| csvtk_concat | CSVTK_CONCAT aggregation | gatherCsvtk from plugin | sample_outputs + run_outputs |
| dedicated_summary | Tool's own summary command | gatherFields from plugin | sample_outputs + run_outputs |
| none | No aggregation | No plugin | sample_outputs + Channel.empty() |
CSVTK_CONCAT is the default and most common (~80% of subworkflows).
Edge Cases
- Multi-module subworkflows (e.g., snippy + snpdists + gubbins): The scaffold generates a single-module pattern. For complex orchestration, generate the scaffold then manually adjust the includes and channel wiring.
- Composite subworkflows that call other subworkflows: Add the @subworkflows tag manually and adjust includes to point to ../../subworkflows/{name}/main instead of ../../modules/{name}/main.
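The include-path change for composite subworkflows is a plain text substitution, so it can be scripted. The function below is a hypothetical sketch: it rewrites every occurrence of ../../modules/{name}/main to ../../subworkflows/{name}/main in one file. It writes through a temporary file because `sed -i` takes different arguments in GNU and BSD sed. The @subworkflows GroovyDoc tag still has to be added by hand.

```shell
# Hypothetical helper: point the includes for one component at a subworkflow
# instead of a module. Arguments: component name, then the main.nf to edit.
retarget_include() {
    local name="$1" file="$2" tmp
    tmp="$(mktemp)" || return 1
    # "#" is the sed delimiter so that names containing "/" need no escaping.
    if sed "s#\.\./\.\./modules/${name}/main#../../subworkflows/${name}/main#g" \
        "$file" > "$tmp"; then
        # Copy the content back so the original file keeps its permissions.
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}
```

For example, `retarget_include {name} subworkflows/{tool}/main.nf` changes only the includes for {name}; includes for other modules are left untouched.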
Test Data Discovery
Test data paths are discovered dynamically from existing module tests using:
bash .claude/skills/add-bactopia-tool/scripts/run-bactopia-scaffold.sh test-data --input-type {type} --bactopia-path . --pretty
This scans modules/*/tests/main.nf.test for paths matching the input type and returns
pre-computed template variables. Always use the discovered paths -- never construct test
data paths manually. The output includes test_data_path (compressed, for subworkflow
tests), test_uncompressed_path (for module tests), test_species, and test_sample_id.
Supported input types: assembly, reads, assembly_reads, proteins, gff, genbank.
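A small wrapper can reject an unsupported input type before the discovery script is called. This is a sketch under stated assumptions: both function names are hypothetical, the list of types is copied from the sentence above, and the wrapper assumes it is run from the repository root, as the command shown earlier does.

```shell
# Hypothetical wrapper around the test-data command shown above.

# Succeed only for the input types listed as supported.
is_supported_input_type() {
    case "$1" in
        assembly|reads|assembly_reads|proteins|gff|genbank) return 0 ;;
        *) return 1 ;;
    esac
}

# Validate the input type, then run test-data discovery from the repo root.
# Returns 2 without calling the script when the type is not supported.
discover_test_data() {
    local input_type="$1"
    if ! is_supported_input_type "$input_type"; then
        echo "unsupported input type: $input_type" >&2
        return 2
    fi
    bash .claude/skills/add-bactopia-tool/scripts/run-bactopia-scaffold.sh \
        test-data --input-type "$input_type" --bactopia-path . --pretty
}
```

Calling `discover_test_data assembly` runs the same command as above, while a type that is not in the list fails fast with a message on stderr.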