dina-v1-population-activity-interpretation

DINA (Dual-Tower Image-Neural Alignment) framework for interpretable contrastive analysis of V1 population activity. Aligns visual stimuli and V1 responses in a shared latent space at the level of intermediate feature maps. Activation: DINA, V1 population activity, image-neural alignment, contrastive framework, calcium imaging decoding, visual computation.

Install command

npx skills add https://github.com/hiyenwong/ai_collection --skill dina-v1-population-activity-interpretation

Copy and paste this command into Claude Code to install the skill

Source

hiyenwong/ai_collection

Stars: 1

Forks: 0

Updated: June 4, 2026 at 02:00

SKILL.md

DINA: Dual-Tower Image-Neural Alignment for V1 Population Activity Interpretation

An interpretable contrastive framework that jointly trains a dual-tower architecture to align visual stimuli and V1 population responses in a shared latent space, enabling both accurate decoding and direct access to interpretable feature maps.

Metadata

Source: arXiv:2605.04309
Authors: Xin Wang, Zhuangzhi Gao, Hongyi Qin, Zhongli Wu, Feixiang Zhou, He Zhao
Published: 2026-05-05
Domain: Computational Neuroscience + Visual Processing

Core Methodology

Key Innovation

Traditional alignment-based approaches improve decoding accuracy from brain activity but provide limited insight into the neural computations underlying these improvements. DINA addresses this gap by training a dual-tower architecture that aligns visual stimuli and V1 population responses at the level of intermediate feature maps (not just final representations), enabling both accurate decoding AND direct access to interpretable computational mechanisms.

Technical Framework

Dual-Tower Architecture:
- Image Tower: Processes visual stimuli through hierarchical feature extraction
- Neural Tower: Processes V1 population activity through a parallel architecture
- Both towers project into a shared latent space at the level of intermediate feature maps

Contrastive Alignment:
- Positive pairs: each image and its corresponding V1 response are pulled together
- Negative pairs: mismatched image-response pairs are pushed apart
- Alignment occurs at multiple levels of the feature hierarchy

Interpretability Mechanism:
- Access to intermediate feature maps reveals which visual features drive neural responses
- Enables analysis of the spatial regions contributing to alignment
- Identifies sparse subsets of strongly responsive neurons
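
The pull-together and push-apart behaviour described under Contrastive Alignment is what an InfoNCE objective formalizes, and the Step-by-Step section lists InfoNCE as one candidate loss. As a sketch only: for a batch of N matched pairs, write z_i^img and z_i^neu for the embeddings of the i-th image and its V1 response (assumed L2-normalized) and τ for a temperature. These symbols are introduced here for illustration and are not taken from the paper, whose exact objective is not given in this document.

```latex
\mathcal{L}_{\mathrm{InfoNCE}}
  = -\frac{1}{N} \sum_{i=1}^{N}
    \log \frac{\exp\!\left( \langle z_i^{\mathrm{img}}, z_i^{\mathrm{neu}} \rangle / \tau \right)}
              {\sum_{j=1}^{N} \exp\!\left( \langle z_i^{\mathrm{img}}, z_j^{\mathrm{neu}} \rangle / \tau \right)}
```

The numerator rewards similarity within the matched pair, while the denominator penalizes similarity between image i and every other response in the batch. If alignment is applied at several levels of the hierarchy, one plausible reading is a sum of such terms, one per level.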

Key Findings (Mouse V1 Two-Photon Calcium Imaging)

- Decoding performance is supported primarily by coarse, low-level visual structure, not by semantic category or fine-grained detail
- Alignable feature maps emerge from multiple spatially distributed image regions
- Alignable features capture both shape and texture cues
- Features are predominantly reconstructed by sparse subsets of strongly responsive neurons and their functional interactions
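
The sparsity finding can be checked on your own data by asking how few neurons account for most of the total contribution. The sketch below is one simple way to do that, not the paper's analysis. The function `neurons_for_fraction` and the per-neuron scores are hypothetical; in practice the scores might come from ablation or gradient attribution, neither of which is specified in this document.

```python
def neurons_for_fraction(contributions, fraction=0.75):
    """Smallest number of neurons whose |contribution| covers `fraction` of the total.

    `contributions` holds one hypothetical attribution score per neuron.
    """
    magnitudes = sorted((abs(c) for c in contributions), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        return 0  # no neuron contributes anything
    running = 0
    for count, magnitude in enumerate(magnitudes, start=1):
        running += magnitude
        if running >= fraction * total:
            return count
    return len(magnitudes)


# Made-up scores for eight neurons, dominated by the first two.
scores = [50, 30, 5, 5, 4, 3, 2, 1]
k = neurons_for_fraction(scores, fraction=0.75)
print(f"{k} of {len(scores)} neurons cover 75% of the total contribution")
```

A small `k` relative to the population size is consistent with a sparse code; a `k` close to the population size suggests the contribution is spread evenly.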

Implementation Guide

Prerequisites

- Large-scale two-photon calcium imaging data from V1
- Corresponding visual stimuli (natural images)
- PyTorch or a similar deep learning framework
- GPU for contrastive training

Step-by-Step

1. Preprocess Neural Data: Denoise and normalize calcium traces, extract population activity vectors
2. Preprocess Visual Data: Extract multi-scale visual features (edges, textures, shapes)
3. Build Dual Towers: Design parallel architectures for image and neural processing
4. Define Contrastive Loss: InfoNCE or similar contrastive objective for paired alignment
5. Train Jointly: Optimize both towers simultaneously with shared latent space projection
6. Extract Feature Maps: Access intermediate representations for interpretability analysis
7. Analyze Spatial Contributions: Map which image regions drive neural alignment
8. Identify Key Neurons: Find sparse subsets of neurons most responsible for alignment
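
The first step above (normalizing traces and extracting population vectors) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: it uses a low-quantile baseline for ΔF/F, z-scores each neuron, and averages over a response window. The function names, the quantile, and the window are all hypothetical choices, and denoising is left out. It is written with the standard library only so the arithmetic is visible; real pipelines would vectorize it.

```python
import statistics


def delta_f_over_f(trace, baseline_quantile=0.1):
    """Return (F - F0) / F0, with F0 taken as a low quantile of the raw trace."""
    ordered = sorted(trace)
    f0 = ordered[int(baseline_quantile * (len(ordered) - 1))]
    if f0 <= 0:
        raise ValueError("baseline fluorescence must be positive")
    return [(f - f0) / f0 for f in trace]


def zscore(values):
    """Standardize one neuron's trace; a constant trace maps to all zeros."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def population_vector(traces, start, stop):
    """Mean z-scored dF/F per neuron over the frame window [start, stop)."""
    return [
        statistics.fmean(zscore(delta_f_over_f(trace))[start:stop])
        for trace in traces
    ]


# Two made-up neurons, six frames each; the stimulus occupies frames 3 to 5.
traces = [
    [100.0, 100.0, 100.0, 150.0, 200.0, 150.0],  # responds to the stimulus
    [80.0, 80.0, 80.0, 80.0, 80.0, 80.0],        # silent
]
vec = population_vector(traces, start=3, stop=6)
print(vec)
```

Each stimulus presentation then yields one such vector, which is the input the neural tower would receive.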

Code Sketch

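```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DINATower(nn.Module):
    """Simplified dual-tower model for image-neural alignment.

    Both towers are small MLPs over flattened inputs, and only their final
    embeddings are aligned. DINA itself aligns intermediate feature maps;
    this sketch only illustrates the contrastive objective.
    """

    def __init__(self, image_dim, neural_dim, latent_dim):
        super().__init__()
        # Image tower: flattened image (or image features) -> latent embedding
        self.image_tower = nn.Sequential(
            nn.Linear(image_dim, 512),
            nn.ReLU(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Neural tower: population activity vector -> latent embedding
        self.neural_tower = nn.Sequential(
            nn.Linear(neural_dim, 512),
            nn.ReLU(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, images, neural_responses):
        img_features = self.image_tower(images)
        neural_features = self.neural_tower(neural_responses)
        return img_features, neural_features

    def contrastive_loss(self, img_features, neural_features, temperature=0.07):
        """InfoNCE loss; row i of each batch is assumed to be a matched pair."""
        # L2-normalize so the logits are cosine similarities scaled by 1/temperature
        img_features = F.normalize(img_features, dim=-1)
        neural_features = F.normalize(neural_features, dim=-1)
        similarities = img_features @ neural_features.T / temperature
        labels = torch.arange(len(img_features), device=img_features.device)
        # Image-to-neural direction only; a symmetric variant would also
        # average in F.cross_entropy(similarities.T, labels)
        return F.cross_entropy(similarities, labels)
```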
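
To see what the loss in the sketch above rewards, it helps to work it through by hand on a tiny batch. The following is a dependency-free illustration with made-up 2-D embeddings. The helper names are hypothetical, and top-1 retrieval accuracy is used here as an assumed stand-in for decoding performance, not as the paper's metric.

```python
import math


def l2_normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def similarity_matrix(img_embs, neu_embs):
    """Cosine similarity between every image and every neural embedding."""
    img = [l2_normalize(v) for v in img_embs]
    neu = [l2_normalize(v) for v in neu_embs]
    return [[sum(a * b for a, b in zip(u, w)) for w in neu] for u in img]


def info_nce(sims, temperature=0.07):
    """Mean cross-entropy with the diagonal (matched pair) as the target."""
    total = 0.0
    for i, row in enumerate(sims):
        logits = [s / temperature for s in row]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(sims)


def top1_accuracy(sims):
    """Fraction of images whose most similar neural embedding is their own."""
    hits = sum(1 for i, row in enumerate(sims) if row.index(max(row)) == i)
    return hits / len(sims)


# Made-up embeddings: three images and their (roughly aligned) responses.
img = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
neu_matched = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
# Control: the same responses assigned to the wrong images.
neu_shuffled = [neu_matched[1], neu_matched[2], neu_matched[0]]

aligned = similarity_matrix(img, neu_matched)
shuffled = similarity_matrix(img, neu_shuffled)
print("loss aligned :", info_nce(aligned))
print("loss shuffled:", info_nce(shuffled))
print("top-1 aligned / shuffled:", top1_accuracy(aligned), top1_accuracy(shuffled))
```

The shuffled pairing doubles as a sanity check on real data: if a trained model scores about as well on shuffled pairs as on true pairs, the alignment is not driven by the stimulus-response correspondence.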

Applications

- Visual Neuroscience: Understanding computational mechanisms in primary visual cortex
- Brain-Computer Interfaces: Improved decoding of visual stimuli from neural activity
- Model Validation: Testing whether artificial vision models capture biological computation
- Cross-Species Comparison: Comparing V1 computation across species using a shared alignment framework
- Feature Visualization: Identifying which visual features drive neural population responses

Pitfalls

- Requires large-scale neural datasets (small datasets may not capture population dynamics)
- Contrastive alignment may capture superficial correlations rather than causal mechanisms
- Feature map interpretability depends on tower architecture design choices
- Mouse V1 findings may not generalize directly to primate or human visual cortex
- Two-photon calcium imaging has lower temporal resolution than electrophysiology

Related Skills

- primary-visual-cortex-v1-functions
- neural-encoding-evaluation-ground-truth
- connectome-constrained-neural-network
- eeg-visual-attention-decoding
