| name | bio-tcr-bcr-analysis-immcantation-analysis |
| description | Analyze BCR repertoires for somatic hypermutation, clonal lineages, and B cell phylogenetics using the Immcantation framework. Use when studying B cell affinity maturation, germinal center dynamics, or antibody evolution. |
| tool_type | r |
| primary_tool | alakazam |
Version Compatibility
Reference examples tested with: MiXCR 4.6+, ggplot2 3.5+
Before using code patterns, verify installed versions match. If versions differ:
- R: packageVersion('<pkg>') then ?function_name to verify parameters
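If code fails with R errors such as "could not find function", "unused argument", or "object not found", inspect the installed
package (for example with help(package = '<pkg>') or args(function_name)) and adapt the example to match the actual API rather than retrying.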
Immcantation Analysis
"Analyze B cell repertoire evolution and clonal lineages" → Study somatic hypermutation, build B cell phylogenies, and track affinity maturation using the Immcantation framework for BCR repertoire analysis.
- R: shazam::plotMutability(), dowser::getTrees(), scoper::spectralClones()
Requires Immcantation suite: alakazam 1.3+, shazam 1.2+, scoper 1.3+, dowser 2.0+, tigger 1.1+.
Load and Format Data
Goal: Import AIRR-formatted repertoire data into the Immcantation framework for downstream analysis.
Approach: Read Change-O/AIRR tab-delimited files into R data frames with required V(D)J annotation columns.
library(alakazam)
library(shazam)
library(dplyr)
db <- readChangeoDb('clones_airr.tsv')
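Later steps in this document refer to specific AIRR columns by name, so it can help to confirm they are present right after loading. The following is a minimal base R sketch; the helper name missing_columns and the toy data frame are hypothetical and not part of alakazam.

```r
# Hypothetical helper (not part of alakazam): report which required
# columns are absent from a repertoire data frame.
missing_columns <- function(db, required) {
  setdiff(required, names(db))
}

# Columns referenced by the examples in this document
required <- c('sequence_alignment', 'germline_alignment_d_mask',
              'junction', 'v_call', 'j_call', 'sample_id')

# Toy stand-in for the data frame returned by readChangeoDb()
toy_db <- data.frame(
  sequence_alignment = 'ACGT',
  junction = 'TGTGCG',
  v_call = 'IGHV1-2*01',
  j_call = 'IGHJ4*02',
  stringsAsFactors = FALSE
)

absent <- missing_columns(toy_db, required)
if (length(absent) > 0) {
  message('Missing columns: ', paste(absent, collapse = ', '))
}
```

With real data, replace toy_db with db and stop early if anything is reported missing.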
Clonal Clustering
Goal: Group B cell sequences into clonal lineages based on junction sequence similarity.
Approach: Apply hierarchical clustering on nucleotide distance of junction regions with a threshold-based cutoff.
library(scoper)
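# Single-linkage hierarchical clustering on nucleotide distance between junctions
db <- hierarchicalClones(
  db,
  threshold = 0.15,
  method = 'nt',
  linkage = 'single',
  # Return a data frame with a clone_id column rather than a ScoperClones object
  summarize_clones = FALSE
)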
clone_sizes <- countClones(db, groups = 'sample_id')
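To make the 0.15 threshold concrete, the sketch below computes a length-normalized Hamming distance between two junctions of equal length. It assumes length normalization (scoper's normalize = 'len' default) and is an illustration only, not scoper's implementation. The function name and both junction sequences are hypothetical.

```r
# Illustrative helper (not scoper's code): Hamming distance between two
# equal-length nucleotide strings, divided by their length.
normalized_hamming <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  a_chars <- strsplit(a, '')[[1]]
  b_chars <- strsplit(b, '')[[1]]
  sum(a_chars != b_chars) / length(a_chars)
}

# Hypothetical 24-nt junctions differing at a single position
junction_a <- 'TGTGCGAGAGATCTTGACTACTGG'
junction_b <- 'TGTGCGAGAGGTCTTGACTACTGG'

d <- normalized_hamming(junction_a, junction_b)
same_clone <- d <= 0.15  # would fall within a 0.15 threshold
```

Here one mismatch over 24 positions gives a distance of 1/24, well under 0.15. Under the same measure, a 24-nt junction pair with four or more mismatches would exceed the threshold.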
Somatic Hypermutation Analysis
Goal: Quantify somatic hypermutation rates across replacement and silent categories for each clone.
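Approach: Compare observed sequences to germline alignments with shazam's observedMutations(), counting replacement (R) and silent (S) mutations and expressing them as per-sequence frequencies.
# regionDefinition = NULL pools the whole sequence, and frequency = TRUE
# yields the mu_freq_seq_r and mu_freq_seq_s columns used below.
db <- observedMutations(
  db,
  sequenceColumn = 'sequence_alignment',
  germlineColumn = 'germline_alignment_d_mask',
  regionDefinition = NULL,
  frequency = TRUE
)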
mutation_summary <- db %>%
group_by(clone_id) %>%
summarize(
mean_mu = mean(mu_freq_seq_r, na.rm = TRUE),
n_sequences = n()
)
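As a rough intuition for what a mutation frequency measures, the sketch below counts mismatches between an observed sequence and its germline over informative positions only. It is a simplified base R illustration, not shazam's implementation: it does not classify mutations as replacement or silent. The function name and toy sequences are hypothetical.

```r
# Simplified illustration (not shazam's implementation): fraction of
# informative positions at which the observed sequence differs from germline.
mutation_frequency <- function(observed, germline) {
  obs <- strsplit(toupper(observed), '')[[1]]
  germ <- strsplit(toupper(germline), '')[[1]]
  stopifnot(length(obs) == length(germ))
  # Ignore gaps and masked/ambiguous positions in either sequence
  skip <- c('N', '.', '-')
  informative <- !(obs %in% skip) & !(germ %in% skip)
  sum(obs[informative] != germ[informative]) / sum(informative)
}

# Toy aligned pair: 12 positions, 2 uninformative (a gap and an N), 1 mismatch
observed <- 'CAGGTG.AGCTA'
germline <- 'CAGGTGCANCTG'

freq <- mutation_frequency(observed, germline)
```

In this toy pair, 10 positions are informative and one differs, so the frequency is 0.1.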
Selection Analysis
Goal: Test whether observed replacement/silent mutation ratios deviate from neutral expectation, indicating positive or negative selection.
Approach: Estimate BASELINe selection strength (sigma) by comparing observed R/S ratios to a null model of somatic hypermutation targeting.
library(shazam)
# calcBaseline() is shazam's BASELINe function; it uses the HH_S5F targeting model by default
baseline <- calcBaseline(
db,
sequenceColumn = 'sequence_alignment',
germlineColumn = 'germline_alignment_d_mask',
testStatistic = 'focused',
regionDefinition = IMGT_V,
nproc = 4
)
selection <- summarizeBaseline(baseline, returnType = 'df')
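In BASELINe, a positive sigma suggests positive selection and a negative sigma suggests negative selection, provided the confidence interval excludes zero. The sketch below labels rows accordingly. It assumes the summary data frame carries columns named baseline_sigma, baseline_ci_lower, and baseline_ci_upper; verify these against your installed shazam version. The helper name and the numbers are hypothetical, for illustration only.

```r
# Hypothetical helper: call selection by whether the confidence interval
# for sigma lies entirely above or below zero.
classify_selection <- function(ci_lower, ci_upper) {
  ifelse(ci_lower > 0, 'positive',
         ifelse(ci_upper < 0, 'negative', 'not significant'))
}

# Toy values for illustration only (not real results); column names are
# assumed to match the summarizeBaseline() data frame output.
toy_selection <- data.frame(
  region = c('cdr', 'fwr', 'cdr'),
  baseline_sigma = c(0.40, -0.60, 0.05),
  baseline_ci_lower = c(0.10, -0.90, -0.20),
  baseline_ci_upper = c(0.70, -0.30, 0.30),
  stringsAsFactors = FALSE
)

toy_selection$call <- classify_selection(
  toy_selection$baseline_ci_lower,
  toy_selection$baseline_ci_upper
)
```

The same function can be applied to the selection data frame once its column names are confirmed.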
Build Clonal Lineage Trees
Goal: Reconstruct phylogenetic lineage trees for each B cell clone to visualize affinity maturation pathways.
Approach: Build maximum parsimony trees from clonal sequence alignments using PHYLIP's dnapars algorithm via dowser.
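library(dowser)
# Keep clones with at least 3 sequences
clones_multi <- db %>%
  group_by(clone_id) %>%
  filter(n() >= 3) %>%
  ungroup()
# Collapse each clone into the clone objects that dowser builds trees from
clones <- formatClones(clones_multi, minseq = 3)
# exec should point to the PHYLIP dnapars executable
trees <- getTrees(clones, build = 'dnapars', exec = 'dnapars')
# plotTrees() returns a list with one ggplot object per clone
plots <- plotTrees(trees)
plots[[1]]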
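Tree building with dnapars depends on an external PHYLIP installation, so a missing executable is a common cause of failure. A minimal base R check is sketched below. It assumes the executable is named dnapars and is expected on the PATH; installations that use a different name or location should pass the full path instead.

```r
# Look for the dnapars executable on the PATH; Sys.which() returns '' if absent
dnapars_path <- unname(Sys.which('dnapars'))

if (nzchar(dnapars_path)) {
  message('Found dnapars at: ', dnapars_path)
} else {
  message('dnapars not found on PATH; install PHYLIP or supply the full path to the executable')
}
```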
Germline Inference
Goal: Discover novel V gene alleles and correct V gene assignments using individual-level genotyping.
Approach: Infer novel alleles from mutation patterns with TIgGER, build a personalized genotype, and reassign allele calls.
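library(tigger)
# TIgGER expects germlines as a named vector of sequences, not a file path
germline_db <- readIgFasta('IMGT_Human_IGHV.fasta')
novel <- findNovelAlleles(
  db,
  germline_db = germline_db,
  nproc = 4
)
genotype <- inferGenotype(db, germline_db = germline_db, novel = novel)
# Convert the genotype table into sequences, then correct the V calls
genotype_db <- genotypeFasta(genotype, germline_db, novel)
db <- reassignAlleles(db, genotype_db)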
Visualization
Goal: Generate summary plots of mutation frequencies and V gene usage across samples.
Approach: Plot mutation frequency distributions with ggplot2 histograms and V gene usage bar charts via alakazam helpers.
library(ggplot2)
ggplot(db, aes(x = mu_freq_seq_r)) +
geom_histogram(bins = 50) +
facet_wrap(~ sample_id) +
labs(x = 'Replacement Mutation Frequency', y = 'Count')
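v_usage <- countGenes(db, gene = 'v_call', groups = 'sample_id', mode = 'gene')
# countGenes() returns gene, seq_count, and seq_freq columns; plot them with ggplot2
ggplot(v_usage, aes(x = gene, y = seq_freq)) +
  geom_col() +
  facet_wrap(~ sample_id) +
  labs(x = 'V Gene', y = 'Frequency')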
Related Skills
- mixcr-analysis - Generate input clonotype data
- vdjtools-analysis - Diversity metrics (TCR-focused)
- phylogenetics/tree-io - General tree concepts