| name | bio-workflows-spatial-pipeline |
| description | End-to-end spatial transcriptomics workflow for Visium/Xenium data. Covers data loading, preprocessing, spatial analysis, domain detection, and visualization with Squidpy. Use when analyzing spatial transcriptomics data. |
| tool_type | python |
| primary_tool | Squidpy |
| workflow | true |
| depends_on | ["spatial-transcriptomics/spatial-data-io","spatial-transcriptomics/spatial-preprocessing","spatial-transcriptomics/spatial-neighbors","spatial-transcriptomics/spatial-statistics","spatial-transcriptomics/spatial-domains","spatial-transcriptomics/spatial-visualization"] |
| qc_checkpoints | [{"after_loading":"Spots/cells detected, image aligned"},{"after_qc":"Low-quality spots filtered, genes detected"},{"after_clustering":"Spatial domains correspond to tissue regions"}] |
Version Compatibility
Reference examples tested with: matplotlib 3.8+, numpy 1.26+, scanpy 1.10+, squidpy 1.3+
Before using code patterns, verify installed versions match. If versions differ:
- Python:
pip show <package> then help(module.function) to check signatures
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
Spatial Transcriptomics Pipeline
"Analyze my spatial transcriptomics data end-to-end" → Orchestrate data loading (squidpy/scanpy), QC, normalization, spatial domain detection, deconvolution (cell2location), spatial neighbor analysis, cell-cell communication, and tissue visualization.
Complete workflow for analyzing Visium, Xenium, or other spatial transcriptomics data.
Workflow Overview
Spatial data (Space Ranger output)
|
v
[1. Load Data] ---------> Read Visium/Xenium
|
v
[2. QC & Preprocessing] -> Filter, normalize
|
v
[3. Clustering] --------> Standard scRNA-seq clustering
|
v
[4. Spatial Analysis] --> Neighbors, statistics
|
v
[5. Domain Detection] --> Spatial domains
|
v
[6. Visualization] -----> Spatial plots
|
v
Annotated spatial data
Primary Path: Squidpy + Scanpy
Step 1: Load Data
import scanpy as sc
import squidpy as sq
import numpy as np
import matplotlib.pyplot as plt
adata = sq.read.visium('spaceranger_output/')
adata = sc.read_10x_h5('filtered_feature_bc_matrix.h5')
adata.uns['spatial'] = ...
adata = sq.read.xenium('xenium_output/')
print(f'Loaded: {adata.n_obs} spots/cells, {adata.n_vars} genes')
Step 2: Quality Control
adata.var['mt'] = adata.var_names.str.startswith('MT-')
sc.pp.calculate_qc_metrics(adata, qc_vars=['mt'], inplace=True)
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
sc.pl.spatial(adata, color='total_counts', ax=axes[0], show=False)
sc.pl.spatial(adata, color='n_genes_by_counts', ax=axes[1], show=False)
sc.pl.spatial(adata, color='pct_counts_mt', ax=axes[2], show=False)
plt.savefig('qc_spatial.pdf')
sc.pp.filter_cells(adata, min_counts=500)
sc.pp.filter_cells(adata, min_genes=200)
sc.pp.filter_genes(adata, min_cells=10)
adata = adata[adata.obs.pct_counts_mt < 25, :]
print(f'After QC: {adata.n_obs} spots/cells')
Step 3: Normalization and Clustering
adata.layers['counts'] = adata.X.copy()
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata, n_top_genes=2000)
adata.raw = adata
adata = adata[:, adata.var.highly_variable]
sc.pp.scale(adata, max_value=10)
sc.tl.pca(adata, n_comps=50)
sc.pp.neighbors(adata, n_neighbors=15, n_pcs=30)
sc.tl.umap(adata)
sc.tl.leiden(adata, resolution=0.5)
sc.pl.spatial(adata, color='leiden', spot_size=1.5)
plt.savefig('clusters_spatial.pdf')
Step 4: Spatial Analysis
sq.gr.spatial_neighbors(adata, coord_type='generic', n_neighs=6)
sq.gr.nhood_enrichment(adata, cluster_key='leiden')
sq.pl.nhood_enrichment(adata, cluster_key='leiden')
plt.savefig('nhood_enrichment.pdf')
sq.gr.co_occurrence(adata, cluster_key='leiden')
sq.pl.co_occurrence(adata, cluster_key='leiden')
plt.savefig('co_occurrence.pdf')
sq.gr.spatial_autocorr(adata, mode='moran', n_perms=100, n_jobs=4)
svg = adata.uns['moranI'].sort_values('I', ascending=False)
top_svg = svg.head(20).index.tolist()
print('Top spatially variable genes:', top_svg[:10])
Step 5: Domain Detection
sq.gr.spatial_neighbors(adata, coord_type='generic', n_neighs=15)
sc.tl.leiden(adata, resolution=0.3, key_added='spatial_domains',
adjacency=adata.obsp['spatial_connectivities'])
sc.pl.spatial(adata, color='spatial_domains', spot_size=1.5)
plt.savefig('spatial_domains.pdf')
sc.pl.spatial(adata, color=['leiden', 'spatial_domains'], ncols=2)
plt.savefig('clusters_comparison.pdf')
Step 6: Visualization
genes = ['EPCAM', 'VIM', 'PTPRC', 'COL1A1']
sc.pl.spatial(adata, color=genes, ncols=2, spot_size=1.5, cmap='viridis')
plt.savefig('marker_genes_spatial.pdf')
sc.tl.rank_genes_groups(adata, 'leiden', method='wilcoxon')
sc.pl.rank_genes_groups_dotplot(adata, n_genes=5)
plt.savefig('cluster_markers.pdf')
adata.write('spatial_analyzed.h5ad')
Complete Workflow Script
import scanpy as sc
import squidpy as sq
import matplotlib.pyplot as plt
import os
data_dir = 'spaceranger_output'
output_dir = 'spatial_results'
os.makedirs(output_dir, exist_ok=True)
os.makedirs(f'{output_dir}/plots', exist_ok=True)
print('Loading data...')
adata = sq.read.visium(data_dir)
print(f'Loaded: {adata.n_obs} spots, {adata.n_vars} genes')
print('QC filtering...')
adata.var['mt'] = adata.var_names.str.startswith('MT-')
sc.pp.calculate_qc_metrics(adata, qc_vars=['mt'], inplace=True)
sc.pp.filter_cells(adata, min_counts=500)
sc.pp.filter_genes(adata, min_cells=10)
adata = adata[adata.obs.pct_counts_mt < 25, :]
print(f'After QC: {adata.n_obs} spots')
print('Processing...')
adata.layers['counts'] = adata.X.copy()
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata, n_top_genes=2000)
adata.raw = adata
adata = adata[:, adata.var.highly_variable]
sc.pp.scale(adata, max_value=10)
sc.tl.pca(adata, n_comps=50)
sc.pp.neighbors(adata, n_neighbors=15, n_pcs=30)
sc.tl.leiden(adata, resolution=0.5)
print('Spatial analysis...')
sq.gr.spatial_neighbors(adata, coord_type='generic', n_neighs=6)
sq.gr.nhood_enrichment(adata, cluster_key='leiden')
sq.gr.spatial_autocorr(adata, mode='moran', n_perms=100)
print('Creating plots...')
sc.pl.spatial(adata, color='leiden', spot_size=1.5, save='_clusters.pdf')
sq.pl.nhood_enrichment(adata, cluster_key='leiden', save='_nhood.pdf')
adata.write(f'{output_dir}/spatial_analyzed.h5ad')
print(f'Results saved to {output_dir}/')
Related Skills
- spatial-transcriptomics/spatial-data-io - Loading formats
- spatial-transcriptomics/spatial-preprocessing - QC details
- spatial-transcriptomics/spatial-statistics - Moran's I, co-occurrence
- spatial-transcriptomics/spatial-domains - Domain detection methods
- spatial-transcriptomics/spatial-deconvolution - Cell type estimation