| name | bio-workflows-crispr-editing-pipeline |
| description | End-to-end CRISPR experiment design from target selection to delivery-ready constructs. Covers guide RNA design, off-target assessment, and specialized editing strategies including knockouts, base editing, and HDR knockins. Use when designing complete CRISPR editing experiments for gene knockout, correction, or tagging. |
| tool_type | mixed |
| primary_tool | crisprscan |
| workflow | true |
| depends_on | ["genome-engineering/grna-design","genome-engineering/off-target-prediction","genome-engineering/base-editing-design","genome-engineering/prime-editing-design","genome-engineering/hdr-template-design"] |
| qc_checkpoints | [{"after_grna_design":"Activity score >0.6, no poly-T runs, GC 40-70%"},{"after_offtarget":"Specificity score >0.7, no coding off-targets with <3 mismatches"},{"after_template":"Homology arms verified, PAM disrupted in donor"}] |
Version Compatibility
Reference examples tested with: Biopython 1.83+, matplotlib 3.8+, numpy 1.26+, pandas 2.2+, primer3-py 2.0+
Before using code patterns, verify that installed versions match. If versions differ:
- Python: pip show <package>, then help(module.function) to check signatures
- CLI: <tool> --version, then <tool> --help to confirm flags
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
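The version check can also be scripted. The following is a minimal sketch, assuming the PyPI distribution names in `REQUIRED` match what is installed in your environment; `report_versions` is a hypothetical helper, not part of any of the listed packages.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names are assumptions based on the tested-versions list.
REQUIRED = ["biopython", "matplotlib", "numpy", "pandas", "primer3-py"]


def report_versions(packages):
    """Return {package: installed version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in report_versions(REQUIRED).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Compare the printed versions against the tested versions by hand; the sketch deliberately does not parse or compare version strings.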
CRISPR Editing Pipeline
"Design a complete CRISPR editing experiment for my target gene" → Orchestrate guide RNA design, off-target assessment, and strategy-specific template design (knockout, base editing, HDR knock-in, or prime editing) to produce delivery-ready constructs.
Complete workflow for CRISPR experiment design: from target gene to delivery-ready constructs with branching paths for different editing strategies.
Workflow Overview
Target Gene/Position
        |
        v
[1. Guide RNA Design] --> CRISPRscan / Rule Set 2 / DeepCRISPR
        |
        v
[2. Off-Target Assessment] --> Cas-OFFinder + CFD scoring
        |
        v
Decision Point: What type of edit?
        |
        +-------------------+--------------------+
        |                   |                    |
        v                   v                    v
[3a. Knockout]      [3b. Base Editing]     [3c. Knockin]
 Standard Cas9       CBE/ABE design         HDR template
 Frameshift          C>T or A>G             with homology arms
        |                   |                    |
        v                   v                    v
Final Constructs with Validation Primers
Prerequisites
pip install crisprscan biopython pandas numpy matplotlib
conda install -c bioconda primer3-py cas-offinder
pip install crisprtools
Primary Path: Gene Knockout
Step 1: Guide RNA Design
from Bio import SeqIO
from Bio.Seq import Seq
import pandas as pd
import re

def find_guides(sequence, pam='NGG'):
    '''Find all potential gRNA target sites with an NGG PAM on both strands.

    Only NGG is implemented; the pam argument is currently unused.
    'position' is the 0-based start of the 23-nt match on the + strand.
    '''
    guides = []
    seq_str = str(sequence).upper()
    # + strand: 20-nt protospacer followed by NGG (lookahead keeps overlapping sites)
    for match in re.finditer(r'(?=([ATCG]{20}[ATCG]GG))', seq_str):
        pos = match.start()
        target = match.group(1)[:20]
        pam_seq = match.group(1)[20:23]
        guides.append({
            'sequence': target,
            'pam': pam_seq,
            'position': pos,
            'strand': '+',
            'full_target': match.group(1)
        })
    # - strand: CCN followed by 20 nt on the + strand; report the guide 5'->3'
    for match in re.finditer(r'(?=(CC[ATCG][ATCG]{20}))', seq_str):
        pos = match.start()
        full = match.group(1)
        target = str(Seq(full[3:23]).reverse_complement())
        pam_seq = str(Seq(full[0:3]).reverse_complement())
        guides.append({
            'sequence': target,
            'pam': pam_seq,
            'position': pos,
            'strand': '-',
            'full_target': full
        })
    return pd.DataFrame(guides)

def score_guide(guide_seq):
    '''Score guide using Rule Set 2-like heuristics (not the trained model).'''
    score = 0.5
    # Overall GC content
    gc = (guide_seq.count('G') + guide_seq.count('C')) / len(guide_seq)
    if 0.4 <= gc <= 0.7:
        score += 0.2
    elif gc < 0.3 or gc > 0.8:
        score -= 0.2
    # Poly-T run penalty
    if 'TTTT' in guide_seq:
        score -= 0.3
    # PAM-proximal base preferences
    if guide_seq[-1] == 'G':
        score += 0.1
    if guide_seq[-2:] == 'GG':
        score -= 0.1
    # GC content of the PAM-proximal seed region
    seed = guide_seq[11:20]
    seed_gc = (seed.count('G') + seed.count('C')) / len(seed)
    if 0.4 <= seed_gc <= 0.7:
        score += 0.1
    return min(1.0, max(0.0, score))

gene_seq = '''ATGGATTTATCTGCTCTTCGCGTTGAAGAAGTACAAAATGTCATTAATGCTATGCAGAAAATCTTAGAGT
GTCCCATCTGTCTGGAGTTGATCAAGGAACCTGTCTCCACAAAGTGTGACCACATATTTTGCAAATTTTG'''

guides = find_guides(gene_seq.replace('\n', ''))
guides['activity_score'] = guides['sequence'].apply(score_guide)
# .copy() so that columns added later do not trigger SettingWithCopyWarning
good_guides = guides[guides['activity_score'] > 0.6].sort_values('activity_score', ascending=False).copy()
print(f'Found {len(good_guides)} high-scoring guides')
print(good_guides[['sequence', 'position', 'strand', 'activity_score']].head(10))
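The after_grna_design checkpoint (activity score >0.6, no poly-T runs, GC 40-70%) can be applied as an explicit filter. This is a minimal sketch; `passes_grna_qc` is a hypothetical helper, and treating "poly-T run" as four consecutive Ts is an assumption that mirrors the `'TTTT'` check in the scoring heuristic.

```python
def passes_grna_qc(guide_seq, activity_score, min_score=0.6,
                   gc_range=(0.40, 0.70), poly_t="TTTT"):
    """Return True if a guide passes the after_grna_design checkpoint."""
    seq = guide_seq.upper()
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    return (
        activity_score > min_score            # activity threshold
        and poly_t not in seq                 # no poly-T run (assumed: TTTT)
        and gc_range[0] <= gc <= gc_range[1]  # GC content within 40-70%
    )


# Example with a made-up 20-nt guide (60% GC, no poly-T run)
print(passes_grna_qc("GCGCATATGCGCATATGCGC", 0.8))
```

With a pandas table of guides, the same check could be applied row by row, for example with `DataFrame.apply(..., axis=1)`.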
Step 2: Off-Target Assessment
import subprocess
from pathlib import Path
import pandas as pd

def run_cas_offinder(guides_df, genome_fasta, output_dir, max_mismatches=4):
    '''Run Cas-OFFinder for off-target detection.'''
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    input_file = output_dir / 'cas_offinder_input.txt'
    # Input file: genome path, PAM pattern, then one "query mismatches" line per guide
    with open(input_file, 'w') as f:
        f.write(f'{genome_fasta}\n')
        f.write('NNNNNNNNNNNNNNNNNNNNNGG\n')
        for _, row in guides_df.iterrows():
            f.write(f"{row['sequence']}NNN {max_mismatches}\n")
    output_file = output_dir / 'offtargets.txt'
    # 'C' selects the CPU device
    subprocess.run([
        'cas-offinder', str(input_file), 'C', str(output_file)
    ], check=True)
    offtargets = pd.read_csv(output_file, sep='\t', header=None,
                             names=['pattern', 'chromosome', 'position', 'target',
                                    'strand', 'mismatches'])
    return offtargets

def calculate_specificity_score(guide_seq, offtargets_df):
    '''Calculate a mismatch-count specificity heuristic.

    This is a simplified stand-in for CFD scoring: each hit is penalized by its
    mismatch count only, not by mismatch position or type.
    '''
    # No off-target table (or no hits at all): treat the guide as fully specific
    if offtargets_df.empty:
        return 1.0
    # Match on the full 20-nt guide so guides sharing a prefix are not pooled
    guide_offtargets = offtargets_df[offtargets_df['pattern'].str.startswith(guide_seq)]
    if len(guide_offtargets) == 0:
        return 1.0
    # A genome-wide search also reports the on-target site as a 0-mismatch hit;
    # drop that row before scoring if it is present.
    penalties = {0: 1.0, 1: 0.5, 2: 0.2, 3: 0.1}
    penalty = sum(penalties.get(int(mm), 0.05) for mm in guide_offtargets['mismatches'])
    return max(0, 1 - penalty / 10)

# Placeholder: pass the run_cas_offinder() result instead of the empty DataFrame
good_guides = good_guides.copy()
good_guides['specificity_score'] = good_guides['sequence'].apply(
    lambda x: calculate_specificity_score(x, pd.DataFrame())
)
good_guides['combined_score'] = (good_guides['activity_score'] * 0.5 +
                                 good_guides['specificity_score'] * 0.5)
final_guides = good_guides.sort_values('combined_score', ascending=False).head(5)
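The after_offtarget checkpoint asks for no coding off-targets with fewer than 3 mismatches. Coding-region annotation is not part of this workflow, so the sketch below flags every close hit for manual review. Both helpers are hypothetical, and the assumption that one perfect match per guide is the on-target site should be checked against your own search.

```python
from collections import Counter, defaultdict


def summarize_offtargets(records):
    """Count hits per guide pattern, broken down by mismatch count.

    records: iterable of dicts with 'pattern' and 'mismatches' keys.
    """
    summary = defaultdict(Counter)
    for rec in records:
        summary[rec["pattern"]][int(rec["mismatches"])] += 1
    return {pattern: dict(counts) for pattern, counts in summary.items()}


def flag_guides(summary, close_threshold=3, allowed_perfect=1):
    """Return patterns that have close off-targets and need manual review.

    A guide is flagged if it has any hit with 1..close_threshold-1 mismatches,
    or more perfect matches than allowed (assumed: one on-target site).
    """
    flagged = []
    for pattern, counts in summary.items():
        perfect = counts.get(0, 0)
        close = sum(n for mm, n in counts.items() if 0 < mm < close_threshold)
        if close > 0 or perfect > allowed_perfect:
            flagged.append(pattern)
    return sorted(flagged)
```

If the off-target hits are held in a pandas DataFrame, `DataFrame.to_dict('records')` yields rows in the shape these helpers expect.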
Step 3a: Knockout Design (Frameshift)
def design_knockout(guide_row, target_sequence):
    '''Design a knockout and report the window to amplify for validation.

    Returns amplicon coordinates only (cut site +/- 100 bp, clipped to the
    sequence); validation primers still have to be designed within that window.
    '''
    guide_seq = guide_row['sequence']
    position = guide_row['position']
    # Cas9 cuts 3 bp from the PAM on the protospacer side.
    # For '-' guides, position marks the start of CCN on the + strand.
    cut_site = position + 17 if guide_row['strand'] == '+' else position + 6
    left_start = max(0, cut_site - 100)
    right_end = min(len(target_sequence), cut_site + 100)
    return {
        'guide_sequence': guide_seq,
        'pam': guide_row['pam'],
        'cut_site': cut_site,
        'expected_outcome': 'Frameshift indel',
        'validation_amplicon_start': left_start,
        'validation_amplicon_end': right_end
    }

ko_design = design_knockout(final_guides.iloc[0], gene_seq.replace('\n', ''))
print('Knockout Design:')
for k, v in ko_design.items():
    print(f'  {k}: {v}')
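A knockout relies on indels whose length is not a multiple of 3, since only those shift the reading frame. The sketch below classifies indel sizes accordingly; the helper names and the example indel sizes are hypothetical, standing in for values you would obtain from sequencing the validation amplicon.

```python
def is_frameshift(indel_size):
    """True if an indel of this size (negative = deletion) shifts the frame."""
    return indel_size % 3 != 0


def frameshift_fraction(indel_sizes):
    """Fraction of edited alleles (size != 0) that carry a frameshift."""
    edited = [size for size in indel_sizes if size != 0]
    if not edited:
        return 0.0
    return sum(is_frameshift(size) for size in edited) / len(edited)


# Made-up indel sizes: 0 means an unedited allele
observed = [-1, 2, -3, 0, 6, 1]
print(f"Frameshift fraction: {frameshift_fraction(observed):.2f}")
```

In-frame indels can still disrupt function if they remove critical residues, so this fraction is a lower-bound style estimate rather than a definitive knockout rate.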
Step 3b: Base Editing Design (CBE/ABE)
def design_base_edit(target_position, target_sequence, edit_type='CBE'):
    '''Find guides that place the target base in the editing window.

    CBE: C>T conversion (read as G>A when the guide targets the opposite strand)
    ABE: A>G conversion (read as T>C when the guide targets the opposite strand)
    Editing window: positions 4-8 in the protospacer (counting from PAM-distal)
    target_position is a 0-based index into target_sequence.
    '''
    # Base that must be present on the + strand for each editor and guide strand
    editable = {('CBE', '+'): 'C', ('CBE', '-'): 'G',
                ('ABE', '+'): 'A', ('ABE', '-'): 'T'}
    guides = find_guides(target_sequence)
    target_base = target_sequence[target_position].upper()
    suitable_guides = []
    for _, guide in guides.iterrows():
        guide_start = guide['position']
        if guide['strand'] == '+':
            # Protospacer spans start..start+19; position 1 is at start
            window_start = guide_start + 3
            window_end = guide_start + 7
        else:
            # 'position' is the first C of CCN; the protospacer spans
            # start+3..start+22 and position 1 (PAM-distal) is at start+22
            window_start = guide_start + 15
            window_end = guide_start + 19
        in_window = window_start <= target_position <= window_end
        if in_window and target_base == editable.get((edit_type, guide['strand'])):
            suitable_guides.append(guide)
    return pd.DataFrame(suitable_guides)

target_pos = 45  # 0-based index of the base to edit
cbe_guides = design_base_edit(target_pos, gene_seq.replace('\n', ''), 'CBE')
print(f'Found {len(cbe_guides)} CBE-compatible guides')
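Bystander edits arise when more than one editable base sits in the window. The following sketch lists the editable positions for a guide; `editable_positions` is a hypothetical helper, the protospacer is assumed to be written 5'->3' with position 1 at the PAM-distal end, and the example sequence is made up.

```python
def editable_positions(protospacer, edit_type="CBE", window=(4, 8)):
    """Return 1-based protospacer positions of editable bases in the window.

    CBE acts on C and ABE acts on A, both read on the protospacer strand.
    """
    target_base = {"CBE": "C", "ABE": "A"}[edit_type]
    start, end = window
    return [pos for pos in range(start, end + 1)
            if protospacer[pos - 1].upper() == target_base]


guide = "GATCCAGTACGTTAGCATGC"  # made-up 20-nt protospacer
cbe_sites = editable_positions(guide, "CBE")
print(f"CBE-editable positions: {cbe_sites}")
if len(cbe_sites) > 1:
    print("Bystander risk: more than one C in the editing window")
```

Guides that return a single position are the ones with no bystander candidates in the window.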
Step 3c: Knockin Design (HDR Template)
def design_hdr_template(guide_row, target_sequence, insert_sequence,
                        homology_arm_length=800):
    '''Design HDR donor template with homology arms.

    Homology arm length: 800bp is standard for plasmid donors.
    For ssODN, use 30-60bp arms.
    Arms are truncated when target_sequence is shorter than requested.
    '''
    position = guide_row['position']
    if guide_row['strand'] == '+':
        cut_site = position + 17
        pam_start = position + 20  # NGG on the + strand
    else:
        cut_site = position + 6
        pam_start = position       # CCN on the + strand
    left_arm_start = max(0, cut_site - homology_arm_length)
    left_arm = target_sequence[left_arm_start:cut_site]
    right_arm_end = min(len(target_sequence), cut_site + homology_arm_length)
    right_arm = target_sequence[cut_site:right_arm_end]
    guide_seq = guide_row['sequence']
    # The insert is placed exactly at the cut site
    donor = left_arm + insert_sequence + right_arm
    return {
        'guide_sequence': guide_seq,
        'cut_site': cut_site,
        'pam_start': pam_start,
        'left_arm': left_arm,
        'right_arm': right_arm,
        'insert': insert_sequence,
        'donor_template': donor,
        'donor_length': len(donor),
        'note': 'Remember to mutate PAM in donor to prevent re-cutting'
    }

gfp_sequence = 'ATGGTGAGCAAGGGCGAGGAG...'  # truncated here; supply the full coding sequence before use
flag_tag = 'GACTACAAAGACGATGACGACAAG'  # one common coding sequence for the FLAG epitope (DYKDDDDK)
hdr_design = design_hdr_template(final_guides.iloc[0], gene_seq.replace('\n', ''), flag_tag, 50)
print('HDR Design:')
print(f"  Left arm length: {len(hdr_design['left_arm'])}")
print(f"  Right arm length: {len(hdr_design['right_arm'])}")
print(f"  Total donor length: {hdr_design['donor_length']}")
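The donor still carries the PAM, so the note about mutating it has to be acted on before ordering. Below is a minimal sketch, assuming the arms meet exactly at the cut site (3 bp from the PAM) and that a single-base substitution is acceptable. `disrupt_pam` is a hypothetical helper; in coding sequence, pick a synonymous substitution instead of the fixed change used here.

```python
def disrupt_pam(left_arm, right_arm, strand):
    """Return (left_arm, right_arm) with one PAM base substituted.

    '+' guides: NGG sits at right_arm[3:6]; the first G becomes C (NGG -> NCG).
    '-' guides: CCN sits at left_arm[-6:-3]; the second C becomes G (CCN -> CGN).
    """
    if strand == "+":
        if right_arm[4:6].upper() != "GG":
            raise ValueError("expected NGG at right_arm[3:6]")
        right_arm = right_arm[:4] + "C" + right_arm[5:]
    else:
        if left_arm[-6:-4].upper() != "CC":
            raise ValueError("expected CCN at left_arm[-6:-3]")
        left_arm = left_arm[:-5] + "G" + left_arm[-4:]
    return left_arm, right_arm


# Made-up arms for a '+' strand guide: PAM "TGG" at right_arm[3:6]
left, right = disrupt_pam("AAAAAAAAAA", "ACTTGGACGT", "+")
print(left, right)
```

The ValueError guards catch arms that were not split at the expected cut site, which would otherwise lead to a silent edit of the wrong base.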
Visualization
import matplotlib.pyplot as plt

def plot_guide_landscape(guides_df, gene_length, exon_coords=None):
    '''Visualize guide positions and scores along gene.'''
    fig, axes = plt.subplots(2, 1, figsize=(14, 6), gridspec_kw={'height_ratios': [1, 2]})

    # Top panel: gene track with optional exon blocks
    ax1 = axes[0]
    ax1.axhline(y=0.5, color='gray', linewidth=10, solid_capstyle='butt')
    if exon_coords:
        for start, end in exon_coords:
            # xmin/xmax are axis fractions, hence the division by gene_length
            ax1.axhline(y=0.5, xmin=start/gene_length, xmax=end/gene_length,
                        color='steelblue', linewidth=20, solid_capstyle='butt')
    ax1.set_xlim(0, gene_length)
    ax1.set_ylim(0, 1)
    ax1.set_ylabel('Gene')
    ax1.set_xticks([])
    ax1.set_yticks([])

    # Bottom panel: activity score per guide, colored by threshold
    ax2 = axes[1]
    colors = ['green' if s > 0.6 else 'orange' if s > 0.4 else 'red'
              for s in guides_df['activity_score']]
    ax2.scatter(guides_df['position'], guides_df['activity_score'],
                c=colors, s=50, alpha=0.7)
    ax2.axhline(y=0.6, color='green', linestyle='--', alpha=0.5, label='Threshold')
    ax2.set_xlim(0, gene_length)
    ax2.set_ylim(0, 1)
    ax2.set_xlabel('Position (bp)')
    ax2.set_ylabel('Activity Score')
    ax2.legend()

    plt.tight_layout()
    plt.savefig('guide_landscape.pdf')
    return fig

plot_guide_landscape(guides, len(gene_seq.replace('\n', '')),
                     exon_coords=[(0, 50), (70, 130)])
Parameter Recommendations
| Step | Parameter | Value | Rationale |
|---|---|---|---|
| Guide design | Activity score | >0.6 | Standard threshold for reliable editing |
| Guide design | GC content | 40-70% | Optimal for binding and Cas9 activity |
| Off-target | Max mismatches | 4 | Catches most relevant off-targets |
| Off-target | Specificity score | >0.7 | Acceptable off-target profile |
| Base editing | Window | positions 4-8 | Optimal for BE3/BE4, ABE7.10 |
| HDR | Homology arms | 800bp | Standard for plasmid donors |
| HDR (ssODN) | Homology arms | 30-60bp | For single-strand oligo donors |
Troubleshooting
| Issue | Likely Cause | Solution |
|---|---|---|
| No high-scoring guides | GC-poor region | Expand search region, consider Cas12a |
| Many off-targets | Repetitive sequence | Use high-fidelity Cas9 (eSpCas9, HiFi) |
| Low HDR efficiency | NHEJ dominant | Add NHEJ inhibitors, use ssODN |
| Base editing outside window | Guide position | Redesign with target in positions 4-8 |
| Bystander edits | Multiple C/A in window | Design guides with single target base |
Output Files
| File | Description |
|---|---|
| guides_ranked.tsv | All guides with activity and specificity scores |
| offtargets.txt | Cas-OFFinder results |
| knockout_design.json | KO guide and validation primers |
| base_edit_design.json | CBE/ABE design with editing window |
| hdr_template.fasta | Donor template sequence |
| guide_landscape.pdf | Visualization of guide positions |
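The workflow code prints its results rather than writing these files. The sketch below shows one way to produce three of them with the standard library only; `write_outputs` and its arguments are hypothetical, and the 60-character FASTA line width is a convention rather than a requirement.

```python
import csv
import json
from pathlib import Path


def write_outputs(out_dir, guides, knockout, donor_name, donor_seq):
    """Write guides_ranked.tsv, knockout_design.json and hdr_template.fasta.

    guides: list of dicts (one per guide); knockout: JSON-serializable dict.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Ranked guides as a tab-separated table (skipped if there are no guides)
    if guides:
        with open(out_dir / "guides_ranked.tsv", "w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(guides[0]),
                                    delimiter="\t", lineterminator="\n")
            writer.writeheader()
            writer.writerows(guides)

    # Knockout design; numpy scalars must be converted to int/float first
    (out_dir / "knockout_design.json").write_text(json.dumps(knockout, indent=2))

    # Donor template as FASTA, wrapped at 60 characters per line
    lines = [donor_seq[i:i + 60] for i in range(0, len(donor_seq), 60)]
    (out_dir / "hdr_template.fasta").write_text(
        f">{donor_name}\n" + "\n".join(lines) + "\n")
    return out_dir
```

A pandas table of guides can be passed as `DataFrame.to_dict('records')`; values taken from a DataFrame row may be numpy scalars, which `json.dumps` rejects unless they are cast to built-in types.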
Related Skills
- genome-engineering/grna-design - Detailed scoring algorithms
- genome-engineering/off-target-prediction - Cas-OFFinder and CFD
- genome-engineering/base-editing-design - CBE/ABE specifics
- genome-engineering/prime-editing-design - pegRNA design
- genome-engineering/hdr-template-design - Donor optimization
- primer-design/primer-basics - Validation primer design