| name | bio-gene-regulatory-networks-differential-networks |
| description | Compare gene regulatory and co-expression networks between biological conditions to identify rewired regulatory relationships using DiffCorr. Detects gained, lost, and reversed gene-gene correlations between conditions. Use when comparing co-expression networks between disease vs control, treatment conditions, or developmental stages. |
| tool_type | r |
| primary_tool | DiffCorr |
Version Compatibility
Reference examples tested with: matplotlib 3.8+, numpy 1.26+, pandas 2.2+, scipy 1.12+, statsmodels 0.14+
Before using code patterns, verify installed versions match. If versions differ:
- Python:
pip show <package> then help(module.function) to check signatures
- R:
packageVersion('<pkg>') then ?function_name to verify parameters
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
Differential Networks
"Compare gene co-expression networks between my disease and control groups" → Test whether gene-gene correlations differ significantly between two conditions using Fisher's z-transform, identifying gained, lost, and reversed regulatory relationships.
- R:
DiffCorr::comp.2.cc.fdr() for differential correlation analysis
- Python: custom Fisher z-test with
scipy.stats and statsmodels for FDR correction
Compare co-expression and regulatory networks between biological conditions to identify rewired gene-gene relationships.
DiffCorr Workflow
DiffCorr uses Fisher's z-transform to test whether the correlation between two genes differs significantly between two conditions.
Required Libraries
library(DiffCorr)
library(igraph)
Input Preparation
expr_all <- read.csv('normalized_counts.csv', row.names = 1)
sample_info <- read.csv('sample_info.csv', row.names = 1)
expr_condition1 <- t(expr_all[, sample_info$condition == 'control'])
expr_condition2 <- t(expr_all[, sample_info$condition == 'disease'])
gene_vars <- apply(expr_all, 1, var)
top_genes <- names(sort(gene_vars, decreasing = TRUE))[1:3000]
expr_condition1 <- expr_condition1[, top_genes]
expr_condition2 <- expr_condition2[, top_genes]
Run DiffCorr Analysis
cor1 <- cor(expr_condition1, method = 'pearson')
cor2 <- cor(expr_condition2, method = 'pearson')
n1 <- nrow(expr_condition1)
n2 <- nrow(expr_condition2)
result <- comp.2.cc.fdr(
output.file = 'diffcorr_results.txt',
data1 = expr_condition1,
data2 = expr_condition2,
threshold = 0.05
)
Parse and Classify Results
diffcorr <- read.delim('diffcorr_results.txt')
classify_edge <- function(cor1, cor2, threshold = 0.3) {
if (abs(cor1) < threshold & abs(cor2) >= threshold) return('gained')
if (abs(cor1) >= threshold & abs(cor2) < threshold) return('lost')
if (cor1 > threshold & cor2 < -threshold) return('reversed')
if (cor1 < -threshold & cor2 > threshold) return('reversed')
return('unchanged')
}
diffcorr$edge_type <- mapply(classify_edge, diffcorr$cor1, diffcorr$cor2)
table(diffcorr$edge_type)
significant <- diffcorr[diffcorr$p.adj < 0.05, ]
rewired <- significant[significant$edge_type != 'unchanged', ]
print(paste('Significant rewired edges:', nrow(rewired)))
Identify Rewired Hub Genes
gene_rewiring <- c(rewired$gene1, rewired$gene2)
rewiring_counts <- sort(table(gene_rewiring), decreasing = TRUE)
top_rewired_genes <- head(rewiring_counts, 20)
print(top_rewired_genes)
DGCA (Alternative)
DGCA (Differential Gene Correlation Analysis) was archived from CRAN in May 2024. It is still available via GitHub.
devtools::install_github('andymckenzie/DGCA')
library(DGCA)
dgca_results <- ddcorAll(
inputMat = expr_all,
design = sample_info$condition,
compare = c('control', 'disease'),
adjust = 'BH',
nPerm = 0,
corrType = 'pearson'
)
sig_dgca <- dgca_results[dgca_results$pValDiff_adj < 0.05, ]
table(sig_dgca$Classes)
Python NetworkX Approach
Goal: Identify gene pairs whose co-expression relationship changes significantly between two conditions using a Python-native workflow.
Approach: Compute per-condition Pearson correlation matrices, test each gene pair for differential correlation using Fisher's z-transform, apply BH FDR correction, then classify edges as gained, lost, or reversed based on correlation magnitude thresholds.
import pandas as pd
import numpy as np
from scipy import stats
import networkx as nx
def build_correlation_network(expr_matrix, threshold=0.6):
'''Build network from correlation matrix, keeping edges above threshold.'''
cor_matrix = expr_matrix.corr(method='pearson')
genes = cor_matrix.columns.tolist()
G = nx.Graph()
G.add_nodes_from(genes)
for i in range(len(genes)):
for j in range(i + 1, len(genes)):
r = cor_matrix.iloc[i, j]
if abs(r) >= threshold:
G.add_edge(genes[i], genes[j], weight=r)
return G
def fisher_z_test(r1, n1, r2, n2):
'''Fisher z-transform test for difference between two correlations.'''
z1 = np.arctanh(np.clip(r1, -0.9999, 0.9999))
z2 = np.arctanh(np.clip(r2, -0.9999, 0.9999))
se = np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
z_stat = (z1 - z2) / se
p_value = 2 * stats.norm.sf(abs(z_stat))
return z_stat, p_value
def differential_network_analysis(expr_cond1, expr_cond2, fdr_threshold=0.05):
'''Compare correlation networks between two conditions.'''
genes = expr_cond1.columns.tolist()
n1, n2 = len(expr_cond1), len(expr_cond2)
cor1 = expr_cond1.corr()
cor2 = expr_cond2.corr()
results = []
for i in range(len(genes)):
for j in range(i + 1, len(genes)):
r1, r2 = cor1.iloc[i, j], cor2.iloc[i, j]
z_stat, p_val = fisher_z_test(r1, n1, r2, n2)
results.append({
'gene1': genes[i], 'gene2': genes[j],
'cor_cond1': r1, 'cor_cond2': r2,
'z_stat': z_stat, 'p_value': p_val
})
df = pd.DataFrame(results)
from statsmodels.stats.multitest import multipletests
_, df['p_adj'], _, _ = multipletests(df['p_value'], method='fdr_bh')
def classify(row, threshold=0.3):
r1, r2 = row['cor_cond1'], row['cor_cond2']
if abs(r1) < threshold and abs(r2) >= threshold:
return 'gained'
if abs(r1) >= threshold and abs(r2) < threshold:
return 'lost'
if r1 > threshold and r2 < -threshold:
return 'reversed'
if r1 < -threshold and r2 > threshold:
return 'reversed'
return 'unchanged'
df['edge_type'] = df.apply(classify, axis=1)
return df
expr_all = pd.read_csv('normalized_counts.csv', index_col=0).T
sample_info = pd.read_csv('sample_info.csv', index_col=0)
expr_ctrl = expr_all.loc[sample_info[sample_info['condition'] == 'control'].index]
expr_disease = expr_all.loc[sample_info[sample_info['condition'] == 'disease'].index]
gene_vars = expr_all.var()
top_genes = gene_vars.nlargest(3000).index.tolist()
expr_ctrl, expr_disease = expr_ctrl[top_genes], expr_disease[top_genes]
diff_results = differential_network_analysis(expr_ctrl, expr_disease)
significant = diff_results[diff_results['p_adj'] < 0.05]
print(significant['edge_type'].value_counts())
Visualize Differential Network
Goal: Render the rewired gene-gene connections as a color-coded network graph distinguishing gained, lost, and reversed edges.
Approach: Build a NetworkX graph from significant rewired edges, assign edge colors by change type, and draw with a spring layout.
import matplotlib.pyplot as plt
rewired = significant[significant['edge_type'] != 'unchanged']
G_diff = nx.Graph()
color_map = {'gained': '#2ca02c', 'lost': '#d62728', 'reversed': '#9467bd'}
for _, row in rewired.iterrows():
G_diff.add_edge(row['gene1'], row['gene2'],
color=color_map[row['edge_type']], weight=abs(row['z_stat']))
edge_colors = [G_diff[u][v]['color'] for u, v in G_diff.edges()]
fig, ax = plt.subplots(figsize=(12, 12))
pos = nx.spring_layout(G_diff, k=2, seed=42)
nx.draw_networkx(G_diff, pos, edge_color=edge_colors, node_size=50,
font_size=6, width=0.5, alpha=0.7, ax=ax)
import matplotlib.patches as mpatches
legend_handles = [mpatches.Patch(color=c, label=l) for l, c in color_map.items()]
ax.legend(handles=legend_handles, loc='upper left')
ax.set_title('Differential co-expression network')
plt.savefig('differential_network.pdf', bbox_inches='tight')
Statistical Considerations
| Consideration | Guideline | Rationale |
|---|
| Minimum samples per group | 20 recommended, 15 floor | Correlation estimates unstable below 15 |
| Multiple testing | BH FDR correction | Thousands of gene pairs tested |
| Correlation method | Pearson (default) | Spearman for non-linear; Pearson is standard |
| Gene filtering | Top 2000-5000 variable | Reduces test burden, focuses on informative genes |
| Effect size filter | abs(delta_r) > 0.3 | Avoid reporting trivially different correlations |
Related Skills
- coexpression-networks - Build WGCNA networks for individual conditions
- scenic-regulons - TF regulon inference from scRNA-seq
- differential-expression/deseq2-basics - DE analysis to complement network rewiring
- temporal-genomics/temporal-grn - Time-delayed regulatory inference from temporal data