| name | bio-genome-intervals-proximity-operations |
| description | Find nearest features, search within windows, and extend intervals using closest, window, flank, and slop operations. Use when performing TSS proximity analysis, assigning enhancers to genes, defining promoter regions, or finding nearby genomic features. |
| tool_type | mixed |
| primary_tool | bedtools |
Version Compatibility
Reference examples tested with: bedtools 2.31+
Before using code patterns, verify installed versions match. If versions differ:
- Python: pip show <package> then help(module.function) to check signatures
- CLI: <tool> --version then <tool> --help to confirm flags
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
Proximity Operations
"Find nearest features or extend intervals" → Identify the closest genomic feature to each interval, or expand intervals by a fixed flank size.
- CLI: bedtools closest -a peaks.bed -b genes.bed, bedtools slop -i peaks.bed -g genome.txt -b 1000
- Python: a.closest(b), a.slop(b=1000, g=genome) (pybedtools)
Operations for finding nearby features and extending intervals using bedtools and pybedtools.
Closest - Find Nearest Feature
CLI
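# Inputs must be coordinate-sorted (sort -k1,1 -k2,2n)
bedtools closest -a peaks.bed -b genes.bed > peaks_with_nearest.bed
# -d: append the distance to the nearest feature (0 if overlapping)
bedtools closest -a peaks.bed -b genes.bed -d > with_distance.bed
# -io: ignore features in B that overlap A
bedtools closest -a peaks.bed -b genes.bed -io > nearest_non_overlap.bed
# -s / -S: require same / opposite strand
bedtools closest -a peaks.bed -b genes.bed -s > same_strand.bed
bedtools closest -a peaks.bed -b genes.bed -S > opposite_strand.bed
# -iu ignores upstream features, leaving downstream (or overlapping) hits; -id does the reverse
bedtools closest -a peaks.bed -b genes.bed -D a -iu > downstream_only.bed
bedtools closest -a peaks.bed -b genes.bed -D a -id > upstream_only.bed
# -t: tie handling (all, first, last)
bedtools closest -a peaks.bed -b genes.bed -t all > all_ties.bed
bedtools closest -a peaks.bed -b genes.bed -t first > first_tie.bed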
Python
import pybedtools
# closest requires coordinate-sorted inputs
a = pybedtools.BedTool('peaks.bed').sort()
b = pybedtools.BedTool('genes.bed').sort()
result = a.closest(b)
result = a.closest(b, d=True)
result = a.closest(b, io=True)
result = a.closest(b, s=True)
result = a.closest(b, t='all')
result.saveas('closest.bed')
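The distance column added by -d is easy to misread. As far as the bedtools documentation describes it, overlapping features report 0 and non-overlapping features report the gap plus one, so bookended intervals sit at distance 1. The pure-Python sketch below reproduces that arithmetic for half-open BED coordinates on a single chromosome. `bed_distance` is a hypothetical helper, not part of bedtools or pybedtools, and the convention is worth confirming against the installed version.

```python
def bed_distance(a_start, a_end, b_start, b_end):
    """Distance between two half-open BED intervals on the same chromosome."""
    if a_start < b_end and b_start < a_end:
        return 0  # the intervals share at least one base
    if b_start >= a_end:
        return b_start - a_end + 1  # B lies to the right of A
    return a_start - b_end + 1  # B lies to the left of A


# A = chr1:100-200, B = chr1:500-1000 (non-overlapping)
far = bed_distance(100, 200, 500, 1000)
# A = chr1:500-600, B = chr1:500-1000 (overlapping)
overlapping = bed_distance(500, 600, 500, 1000)
print(far, overlapping)
```

This matters when filtering on the distance column: a threshold of 0 keeps only overlapping pairs, while a threshold of 1 also admits directly adjacent features.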
Window - Find Features Within Distance
CLI
bedtools window -a peaks.bed -b genes.bed -w 10000 > genes_within_10kb.bed
bedtools window -a peaks.bed -b genes.bed -l 5000 -r 2000 > asymmetric.bed
bedtools window -a peaks.bed -b genes.bed -w 10000 -sm > same_strand.bed
bedtools window -a peaks.bed -b genes.bed -l 5000 -r 2000 -sw > strand_aware.bed
Python
import pybedtools
a = pybedtools.BedTool('peaks.bed')
b = pybedtools.BedTool('genes.bed')
result = a.window(b, w=10000)
result = a.window(b, l=5000, r=2000)
result = a.window(b, w=10000, sm=True)
result.saveas('window.bed')
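Conceptually, window pads each A interval by the requested distances and then reports every B feature that overlaps the padded region. The sketch below illustrates that logic in plain Python. `window_hits` and its tuple layout are assumptions made for illustration, and strand handling (-sm, -sw) is left out.

```python
def window_hits(a, b_features, left=1000, right=1000):
    """Return B intervals that overlap A after padding A by left/right bases.

    Intervals are (chrom, start, end) tuples in half-open BED coordinates.
    """
    chrom, start, end = a
    win_start = max(0, start - left)  # never extend below coordinate 0
    win_end = end + right
    return [
        b for b in b_features
        if b[0] == chrom and b[1] < win_end and b[2] > win_start
    ]


peak = ("chr1", 5000, 5200)
genes = [("chr1", 100, 900), ("chr1", 14000, 16000), ("chr2", 5000, 6000)]
hits = window_hits(peak, genes, left=10000, right=10000)
print(hits)
```

Unlike closest, this returns every feature in range rather than a single nearest one, which is why window suits one-to-many assignments.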
Slop - Extend Interval Boundaries
CLI
bedtools slop -i peaks.bed -g genome.txt -b 100 > extended.bed
bedtools slop -i peaks.bed -g genome.txt -l 500 -r 100 > asymmetric.bed
bedtools slop -i peaks.bed -g genome.txt -l 500 -r 100 -s > strand_aware.bed
bedtools slop -i peaks.bed -g genome.txt -b 0.5 -pct > extend_50pct.bed
bedtools slop -i peaks.bed -g genome.txt -b 100 -header > with_header.bed
Python
import pybedtools
bed = pybedtools.BedTool('peaks.bed')
result = bed.slop(g='genome.txt', b=100)
result = bed.slop(g='genome.txt', l=500, r=100)
result = bed.slop(g='genome.txt', l=500, r=100, s=True)
result = bed.slop(g='genome.txt', b=0.5, pct=True)
result.saveas('extended.bed')
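The coordinate arithmetic behind slop is simple but has two details worth seeing explicitly: results are clamped to the chromosome bounds from the genome file, and with -s the left/right values swap for minus-strand features. The following is a minimal sketch of that behavior under those assumptions. The function name `slop_interval` and its signature are hypothetical, and -pct is not modeled.

```python
def slop_interval(start, end, chrom_size, left=0, right=0, strand="+", stranded=False):
    """Extend a half-open BED interval, clamped to [0, chrom_size]."""
    if stranded and strand == "-":
        # With -s, "left" means upstream, which is the higher coordinate on "-"
        left, right = right, left
    new_start = max(0, start - left)
    new_end = min(chrom_size, end + right)
    return new_start, new_end


# 500 bp left would go below 0, so the start is clamped
print(slop_interval(300, 400, 1000, left=500, right=100))
# Same request, strand-aware, on a minus-strand feature
print(slop_interval(300, 400, 1000, left=500, right=100, strand="-", stranded=True))
```

Clamping means intervals near chromosome ends come out shorter than requested, which is worth remembering when downstream steps assume fixed-width regions.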
Flank - Get Flanking Regions
CLI
bedtools flank -i peaks.bed -g genome.txt -b 100 > flanks.bed
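# Without -s, -l/-r are genomic left/right, not upstream/downstream
bedtools flank -i peaks.bed -g genome.txt -l 100 -r 0 > left_flank.bed
bedtools flank -i peaks.bed -g genome.txt -l 0 -r 100 > right_flank.bed
# With -s, -l is upstream relative to each feature's strand
bedtools flank -i peaks.bed -g genome.txt -l 500 -r 0 -s > upstream_strand.bed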
bedtools flank -i peaks.bed -g genome.txt -b 0.5 -pct > flank_50pct.bed
Python
import pybedtools
bed = pybedtools.BedTool('peaks.bed')
result = bed.flank(g='genome.txt', b=100)
result = bed.flank(g='genome.txt', l=100, r=0)
result = bed.flank(g='genome.txt', l=500, r=0, s=True)
result.saveas('flanks.bed')
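Where slop returns the original interval made larger, flank returns only the new regions beside it, so a -b request yields two output intervals per feature. The sketch below shows that difference in plain Python. `flank_intervals` is a hypothetical helper, and it assumes that a side with size 0 produces no output interval, which should be checked against the installed bedtools.

```python
def flank_intervals(start, end, chrom_size, left=0, right=0, strand="+", stranded=False):
    """Return the flanking intervals of a feature, excluding the feature itself."""
    if stranded and strand == "-":
        # With -s, "left" means upstream, which is the higher coordinate on "-"
        left, right = right, left
    flanks = []
    if left > 0:
        flanks.append((max(0, start - left), start))
    if right > 0:
        flanks.append((end, min(chrom_size, end + right)))
    return flanks


# Both sides: two intervals, neither containing the feature
print(flank_intervals(1000, 1200, 5000, left=100, right=100))
# 500 bp upstream of a minus-strand feature lies at higher coordinates
print(flank_intervals(1000, 1200, 5000, left=500, strand="-", stranded=True))
```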
Shift - Move Intervals
CLI
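# -s: shift every interval by a fixed number of bases (negative values move toward the chromosome start)
bedtools shift -i peaks.bed -g genome.txt -s 100 > shifted.bed
bedtools shift -i peaks.bed -g genome.txt -s -100 > shifted_left.bed
# -pct: interpret the shift as a fraction of each interval's length
bedtools shift -i peaks.bed -g genome.txt -s 0.5 -pct > shift_50pct.bed
# -p / -m: separate shifts for + and - strand features (used together, in place of -s)
bedtools shift -i peaks.bed -g genome.txt -p 100 -m -100 > strand_shifted.bed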
Python
import pybedtools
bed = pybedtools.BedTool('peaks.bed')
result = bed.shift(g='genome.txt', s=100)
result = bed.shift(g='genome.txt', s=-100)
result.saveas('shifted.bed')
Common Patterns
Find Peaks Within 10kb of TSS
awk -v OFS='\t' '{
if ($6 == "+") print $1, $2, $2+1, $4, $5, $6;
else print $1, $3-1, $3, $4, $5, $6;
}' genes.bed > tss.bed
bedtools window -a peaks.bed -b tss.bed -w 10000 > peaks_near_tss.bed
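For pipelines that stay in Python, the awk step above can be mirrored with a small function. This is a sketch that assumes tab-separated BED6 input with the strand in column 6. Like the awk version, it treats any strand other than "+" as minus. `tss_from_bed6` is a hypothetical name.

```python
def tss_from_bed6(line):
    """Convert one BED6 gene line into a 1 bp TSS interval (BED6 string)."""
    chrom, start, end, name, score, strand = line.rstrip("\n").split("\t")[:6]
    start, end = int(start), int(end)
    if strand == "+":
        tss_start, tss_end = start, start + 1  # TSS is the first base
    else:
        tss_start, tss_end = end - 1, end  # TSS is the last base
    return "\t".join([chrom, str(tss_start), str(tss_end), name, score, strand])


genes = [
    "chr1\t1000\t5000\tGENE_A\t0\t+",
    "chr1\t1000\t5000\tGENE_B\t0\t-",
]
tss_lines = [tss_from_bed6(g) for g in genes]
print("\n".join(tss_lines))
```

Because minus-strand TSS positions come from the end coordinate, the output is generally no longer sorted by start. Re-sort it before passing it to tools that require sorted input, such as closest.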
Create Promoter Regions
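# Option 1: take the 2 kb upstream flank, then extend it 500 bp downstream from the TSS
bedtools flank -i tss.bed -g genome.txt -l 2000 -r 0 -s | \
bedtools slop -i stdin -g genome.txt -l 0 -r 500 -s > promoters.bed
# Option 2: extend the 1 bp TSS interval directly; 1 bp longer than option 1 because the 500 bp is added after the TSS base
bedtools slop -i tss.bed -g genome.txt -l 2000 -r 500 -s > promoters_slop.bed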
Find Nearest Gene Within 100kb
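import pybedtools
# closest requires coordinate-sorted inputs
peaks = pybedtools.BedTool('peaks.bed').sort()
genes = pybedtools.BedTool('genes.bed').sort()
closest = peaks.closest(genes, d=True)
# -d reports -1 when no gene exists on the peak's chromosome, so require a non-negative distance
within_100kb = closest.filter(lambda x: 0 <= int(x.fields[-1]) <= 100000)
within_100kb.saveas('peaks_with_nearby_genes.bed')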
Enhancer-Gene Assignment
Goal: Link putative enhancers to potential target genes based on genomic proximity within a defined distance window.
Approach: Use bedtools window with a large symmetric window (e.g., 1Mb) to find all TSS sites near each enhancer, producing an enhancer-gene pair list for downstream regulatory analysis.
import pybedtools
enhancers = pybedtools.BedTool('enhancers.bed')
tss = pybedtools.BedTool('tss.bed')
assignments = enhancers.window(tss, w=1000000)
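# Combined A+B columns do not follow a standard BED layout, so skip automatic column names
df = assignments.to_dataframe(disable_auto_names=True, header=None)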
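Each row of the window output is one enhancer-TSS pair, so a typical next step is to group the rows by enhancer and rank candidate genes by distance. The sketch below does this in plain Python. It assumes BED3 enhancers followed by BED6 TSS columns, and measures distance from the enhancer midpoint to the TSS start. The helper `group_pairs`, the column layout, and the gene names are illustrative assumptions.

```python
from collections import defaultdict


def group_pairs(rows, n_enhancer_cols=3):
    """Map each enhancer to its candidate genes, nearest TSS first.

    rows: iterable of field lists from window output, where the first
    n_enhancer_cols fields are the enhancer and the rest are a BED6 TSS.
    """
    pairs = defaultdict(list)
    for fields in rows:
        chrom, e_start, e_end = fields[0], int(fields[1]), int(fields[2])
        tss = fields[n_enhancer_cols:]
        tss_pos, gene = int(tss[1]), tss[3]
        midpoint = (e_start + e_end) // 2
        pairs[(chrom, e_start, e_end)].append((abs(tss_pos - midpoint), gene))
    return {enh: sorted(genes) for enh, genes in pairs.items()}


rows = [
    ["chr1", "1000", "2000", "chr1", "501500", "501501", "GENE_A", "0", "+"],
    ["chr1", "1000", "2000", "chr1", "11500", "11501", "GENE_B", "0", "-"],
]
pairs = group_pairs(rows)
print(pairs)
```

With pybedtools, the rows could be supplied as `(iv.fields for iv in assignments)`, adjusting `n_enhancer_cols` to match the enhancer file.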
Genome File Format
# genome.txt format: chromosome<TAB>size
chr1 248956422
chr2 242193529
chr3 198295559
...
# Create from FASTA index
cut -f1,2 reference.fa.fai > genome.txt
# Download UCSC chromosome sizes
wget https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes
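A frequent source of errors with slop, flank, and shift is a genome file whose chromosome names do not match the BED file (for example `1` versus `chr1`). The sketch below parses genome-file lines into a dictionary and lists any BED chromosomes that are missing from it. Both function names are hypothetical. The parser reads only the first two columns, so it should also accept `.fai` lines.

```python
def read_genome(lines):
    """Parse 'chromosome<TAB>size' lines into a {chrom: size} dict."""
    sizes = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, size = line.split()[:2]
        sizes[chrom] = int(size)
    return sizes


def missing_chroms(bed_chroms, sizes):
    """Return BED chromosome names that the genome file does not define."""
    return sorted(set(bed_chroms) - set(sizes))


sizes = read_genome(["chr1\t248956422", "chr2\t242193529", ""])
print(missing_chroms(["chr1", "1", "chrM"], sizes))
```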
Key Parameters
| Option | Meaning | Description |
|---|---|---|
| closest -d | Distance | Report distance in last column |
| closest -io | Ignore overlap | Skip overlapping features |
| closest -D | Direction | Report signed distance (a/b/ref) |
| window -w | Window | Symmetric window size |
| window -l/-r | Left/Right | Asymmetric window |
| slop -b | Both | Extend both ends |
| slop -s | Strand | Strand-aware extension |
| flank -l/-r | Left/Right | Flank size by side |
Related Skills
- bed-file-basics - BED format fundamentals
- interval-arithmetic - intersect, subtract, merge
- gtf-gff-handling - Extract TSS from annotations
- chip-seq/peak-annotation - Peak annotation