Run any Skill in Manus with one click

bio-workflows-riboseq-pipeline

Stars943

Forks165

UpdatedJune 20, 2026 at 14:24

End-to-end Ribo-seq analysis from FASTQ through periodicity QC, P-site calibration, ORF detection, translation efficiency, and stalling. Use when orchestrating a full ribosome profiling pipeline and deciding harvest/dedup/alignment options and which downstream analyses the library can support.

Installation

Install with Codex or Claude Copy this prompt, paste it into Codex, Claude, or another assistant, and let it review the skill page and install it for you.

Run Skill in Manus

Source

GPTomics

GPTomics/bioSkills

View GitHub Repository View Creator Repositories

Download

Run Skill in Manus

Related occupationsSOC

Based on SOC occupation classification

Software DevelopersComputer and Mathematical Occupations·SOC 15-1252

File Explorer

3 files

SKILL.md

readonly

name	bio-workflows-riboseq-pipeline
description	End-to-end Ribo-seq analysis from FASTQ through periodicity QC, P-site calibration, ORF detection, translation efficiency, and stalling. Use when orchestrating a full ribosome profiling pipeline and deciding harvest/dedup/alignment options and which downstream analyses the library can support.
tool_type	mixed
primary_tool	STAR

Version Compatibility

Reference examples tested with: cutadapt 4.4+, umi_tools 1.1+, STAR 2.7.11+, SortMeRNA 4.3+, riboWaltz 2.0+, RiboCode 1.2+, riborex 2.4+, samtools 1.19+

Before using code patterns, verify installed versions match. If versions differ:

CLI: <tool> --version then <tool> --help to confirm flags
R: packageVersion('<pkg>') then ?function_name to verify parameters
Python: pip show <package> then help(module.function) to check signatures

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.

Ribo-seq Pipeline

"Analyze my ribosome profiling data from FASTQ to translation efficiency" -> Orchestrate UMI handling, trimming, rRNA depletion, footprint-aware alignment, periodicity QC, P-site calibration, ORF detection, and differential translation, gating each downstream analysis on library quality.

Pipeline overview

FASTQ -> UMI extract -> trim -> rRNA remove -> STAR (EndToEnd) -> dedup
      -> periodicity QC + P-site offsets -> [ORF detection | translation efficiency | stalling]

Two upstream facts gate the whole pipeline: how cells were harvested (CHX pre-treatment distorts dwell-time analysis) and whether the library has UMIs (decides deduplication). Periodicity QC is a hard gate: a library without 3-nt periodicity supports only gene-level counts, not ORF/stalling analysis.

Step 1: Preprocess

Goal: Produce a clean, footprint-aware alignment.

Approach: Extract UMIs first (if present), trim with a permissive floor, deplete rRNA before alignment, align end-to-end, and deduplicate only with UMIs. See riboseq-preprocessing for the decision tables.

# UMI-extract (if present) -> trim -> rRNA remove -> STAR EndToEnd -> dedup (UMI only)
cutadapt -a CTGTAGGCACCATCAAT --discard-untrimmed -m 15 -M 40 -o trimmed.fq.gz reads.fq.gz
bowtie2 -x contaminant_index -U trimmed.fq.gz --un-gz noncontam.fq.gz -S /dev/null -p 8
STAR --genomeDir STAR_index --readFilesIn noncontam.fq.gz --readFilesCommand zcat \
    --alignEndsType EndToEnd --seedSearchStartLmax 15 --outFilterMismatchNmax 2 \
    --quantMode TranscriptomeSAM --outSAMtype BAM SortedByCoordinate --outFileNamePrefix ribo_
samtools index ribo_Aligned.sortedByCoord.out.bam

--quantMode TranscriptomeSAM writes a SEPARATE ribo_Aligned.toTranscriptome.out.bam alongside the sorted genome BAM. RiboCode and riboWaltz transcriptome paths consume the TRANSCRIPTOME BAM; the sorted genome BAM is for plastid/genome-coordinate steps. With UMIs, deduplicate the transcriptome BAM too (umi_tools dedup --per-contig; see riboseq-preprocessing), or its ORF/periodicity inputs stay PCR-inflated.

Step 2: Periodicity QC and P-site offsets

Goal: Certify the library and obtain per-length P-site offsets.

Approach: Run riboWaltz to filter periodic read lengths and calibrate offsets; the frame-0 fraction is the pass/fail metric. See ribosome-periodicity.

library(riboWaltz)
annotation <- create_annotation("annotation.gtf")
reads <- bamtolist("bams", annotation = annotation)
reads <- length_filter(reads, length_filter_mode = "periodicity", periodicity_threshold = 50)
offsets <- psite(reads, extremity = "auto")   # per-length P-site offsets

Either riboWaltz (above) or the plastid metagene generate + psite CLI (used in the example script) is acceptable for offsets; pick one per project.

Step 3: Detect ORFs

Goal: Call translated ORFs once offsets are known.

Approach: Run RiboCode; read lengths come from the metaplots config, and -l is the longest-ORF toggle. See orf-detection.

prepare_transcripts -g annotation.gtf -f genome.fa -o annot
metaplots -a annot -r ribo_Aligned.toTranscriptome.out.bam -o metaplots
RiboCode -a annot -c metaplots_pre_config.txt -A CTG,GTG -l no -p 0.05 -o ribocode_result

Step 4: Translation efficiency

Goal: Test differential translation with matched RNA-seq.

Approach: Count both assays over the CDS and use a count-based GLM; use anota2seq when buffering vs control matters. See translation-efficiency.

library(riborex)
res <- riborex(rnaCntTable = rna_cds_counts, riboCntTable = ribo_cds_counts,
               rnaCond = cond, riboCond = cond, engine = "DESeq2")
sig <- res[which(res$padj < 0.05), ]

Step 5: Optional analyses

Stalling/pausing (only on flash-frozen no-drug data; see ribosome-stalling) and initiation-site mapping (needs a harringtonine/LTM library; see initiation-site-mapping) run off the same aligned BAM and calibrated offsets.

Common Errors

Symptom	Cause	Fix
Downstream analyses all noisy	Periodicity QC skipped	Gate ORF/stalling on the frame-0 fraction first
P-site offsets look wrong	Single hardcoded offset across lengths	Calibrate per length with riboWaltz
RiboCode uses wrong read lengths	`-l` passed read lengths	Read lengths come from metaplots; `-l` is a toggle
TE hits dominated by low-count genes	Ratio testing	Use riborex/Xtail/anota2seq count GLMs

Related Skills

ribo-seq/riboseq-preprocessing - UMI handling, trimming, rRNA removal, alignment
ribo-seq/ribosome-periodicity - Periodicity QC and P-site calibration
ribo-seq/orf-detection - Translated ORF calling
ribo-seq/translation-efficiency - Differential TE and buffering
ribo-seq/initiation-site-mapping - Start-codon mapping from TI-seq
differential-expression/deseq2-basics - Count-based differential testing

References

McGlincy NJ, Ingolia NT. 2017. Transcriptome-wide measurement of translation by ribosome profiling. Methods 126:112-129. doi:10.1016/j.ymeth.2017.05.028
Lauria F, Tebaldi T, Bernabò P, Groen EJN, Gillingwater TH, Viero G. 2018. riboWaltz: Optimization of ribosome P-site positioning in ribosome profiling data. PLoS Comput Biol 14(8):e1006169. doi:10.1371/journal.pcbi.1006169
Xiao Z, Huang R, Xing X, Chen Y, Deng H, Yang X. 2018. De novo annotation and characterization of the translatome with ribosome profiling data. Nucleic Acids Res 46(10):e61. doi:10.1093/nar/gky179

Version Compatibility

Reference examples tested with: cutadapt 4.4+, umi_tools 1.1+, STAR 2.7.11+, SortMeRNA 4.3+, riboWaltz 2.0+, RiboCode 1.2+, riborex 2.4+, samtools 1.19+

Before using code patterns, verify installed versions match. If versions differ:

CLI: <tool> --version then <tool> --help to confirm flags
R: packageVersion('<pkg>') then ?function_name to verify parameters
Python: pip show <package> then help(module.function) to check signatures

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.

Ribo-seq Pipeline

Pipeline overview

FASTQ -> UMI extract -> trim -> rRNA remove -> STAR (EndToEnd) -> dedup
      -> periodicity QC + P-site offsets -> [ORF detection | translation efficiency | stalling]

Step 1: Preprocess

Goal: Produce a clean, footprint-aware alignment.

# UMI-extract (if present) -> trim -> rRNA remove -> STAR EndToEnd -> dedup (UMI only)
cutadapt -a CTGTAGGCACCATCAAT --discard-untrimmed -m 15 -M 40 -o trimmed.fq.gz reads.fq.gz
bowtie2 -x contaminant_index -U trimmed.fq.gz --un-gz noncontam.fq.gz -S /dev/null -p 8
STAR --genomeDir STAR_index --readFilesIn noncontam.fq.gz --readFilesCommand zcat \
    --alignEndsType EndToEnd --seedSearchStartLmax 15 --outFilterMismatchNmax 2 \
    --quantMode TranscriptomeSAM --outSAMtype BAM SortedByCoordinate --outFileNamePrefix ribo_
samtools index ribo_Aligned.sortedByCoord.out.bam

Step 2: Periodicity QC and P-site offsets

Goal: Certify the library and obtain per-length P-site offsets.

Approach: Run riboWaltz to filter periodic read lengths and calibrate offsets; the frame-0 fraction is the pass/fail metric. See ribosome-periodicity.

library(riboWaltz)
annotation <- create_annotation("annotation.gtf")
reads <- bamtolist("bams", annotation = annotation)
reads <- length_filter(reads, length_filter_mode = "periodicity", periodicity_threshold = 50)
offsets <- psite(reads, extremity = "auto")   # per-length P-site offsets

Either riboWaltz (above) or the plastid metagene generate + psite CLI (used in the example script) is acceptable for offsets; pick one per project.

Step 3: Detect ORFs

Goal: Call translated ORFs once offsets are known.

Approach: Run RiboCode; read lengths come from the metaplots config, and -l is the longest-ORF toggle. See orf-detection.

prepare_transcripts -g annotation.gtf -f genome.fa -o annot
metaplots -a annot -r ribo_Aligned.toTranscriptome.out.bam -o metaplots
RiboCode -a annot -c metaplots_pre_config.txt -A CTG,GTG -l no -p 0.05 -o ribocode_result

Step 4: Translation efficiency

Goal: Test differential translation with matched RNA-seq.

Approach: Count both assays over the CDS and use a count-based GLM; use anota2seq when buffering vs control matters. See translation-efficiency.

library(riborex)
res <- riborex(rnaCntTable = rna_cds_counts, riboCntTable = ribo_cds_counts,
               rnaCond = cond, riboCond = cond, engine = "DESeq2")
sig <- res[which(res$padj < 0.05), ]

Step 5: Optional analyses

Common Errors

Symptom	Cause	Fix
Downstream analyses all noisy	Periodicity QC skipped	Gate ORF/stalling on the frame-0 fraction first
P-site offsets look wrong	Single hardcoded offset across lengths	Calibrate per length with riboWaltz
RiboCode uses wrong read lengths	`-l` passed read lengths	Read lengths come from metaplots; `-l` is a toggle
TE hits dominated by low-count genes	Ratio testing	Use riborex/Xtail/anota2seq count GLMs

Related Skills

ribo-seq/riboseq-preprocessing - UMI handling, trimming, rRNA removal, alignment
ribo-seq/ribosome-periodicity - Periodicity QC and P-site calibration
ribo-seq/orf-detection - Translated ORF calling
ribo-seq/translation-efficiency - Differential TE and buffering
ribo-seq/initiation-site-mapping - Start-codon mapping from TI-seq
differential-expression/deseq2-basics - Count-based differential testing

References

McGlincy NJ, Ingolia NT. 2017. Transcriptome-wide measurement of translation by ribosome profiling. Methods 126:112-129. doi:10.1016/j.ymeth.2017.05.028
Lauria F, Tebaldi T, Bernabò P, Groen EJN, Gillingwater TH, Viero G. 2018. riboWaltz: Optimization of ribosome P-site positioning in ribosome profiling data. PLoS Comput Biol 14(8):e1006169. doi:10.1371/journal.pcbi.1006169
Xiao Z, Huang R, Xing X, Chen Y, Deng H, Yang X. 2018. De novo annotation and characterization of the translatome with ribosome profiling data. Nucleic Acids Res 46(10):e61. doi:10.1093/nar/gky179

bio-workflows-riboseq-pipeline

Version Compatibility

Ribo-seq Pipeline

Pipeline overview

Step 1: Preprocess

Step 2: Periodicity QC and P-site offsets

Step 3: Detect ORFs

Step 4: Translation efficiency

Step 5: Optional analyses

Common Errors

Related Skills

References

More from this repository

Version Compatibility

Ribo-seq Pipeline

Pipeline overview

Step 1: Preprocess

Step 2: Periodicity QC and P-site offsets

Step 3: Detect ORFs

Step 4: Translation efficiency

Step 5: Optional analyses

Common Errors

Related Skills

References

More from this repository