---
name: eeg2vision-multimodal-framework-v2
description: "EEG-to-image reconstruction framework using diffusion models with LLM-guided post-processing. Supports low-density EEG configurations (24-128 channels) for real-time brain-to-image applications. Multimodal fusion with semantic refinement via large language models. Activation: EEG2Vision, EEG-to-image, brain visual reconstruction, multimodal EEG, low-density EEG decoding."
version: v1.0.0
last_updated: 2026-04-16
arxiv_source: "2604.08063v1"
---
# EEG2Vision: Multimodal EEG-to-Image Reconstruction Framework
Modular, end-to-end framework for reconstructing visual stimuli from non-invasive electroencephalography (EEG) using diffusion models enhanced with LLM-guided post-processing. Supports low-density EEG configurations for practical deployment.
## Core Innovation
This methodology enables visual reconstruction from brain signals by:
- **Diffusion-based generation**: High-quality image synthesis from EEG features
- **LLM-guided refinement**: Semantic post-processing for visual quality enhancement
- **Multi-resolution support**: Works with 24-128 channel EEG configurations
- **Modular architecture**: Systematic evaluation across different EEG densities
## Technical Architecture
### Three-Stage Pipeline
1. **EEG Feature Extraction**
- Pretrained EEG encoder network
- Spatial-temporal feature extraction
- Dimensionality reduction for diffusion input
2. **Diffusion Model Reconstruction**
- Latent diffusion model (LDM) for image generation
- EEG conditioning via cross-attention
- Iterative denoising from random latent
3. **LLM-Guided Post-Processing**
- Visual content analysis via multimodal LLM
- Semantic refinement and correction
- Quality enhancement through text guidance
## Supported EEG Configurations
| Channels | Configuration | Use Case |
|----------|--------------|----------|
| 128 | High-density | Laboratory research |
| 64 | Standard clinical | Medical imaging |
| 32 | Low-density | Portable BCI |
| 24 | Ultra-low-density | Consumer devices |
## Technical Details
### EEG Preprocessing
**Signal Processing Pipeline:**
1. Bandpass filtering (0.5-45 Hz)
2. Artifact removal (ICA/blink correction)
3. Temporal segmentation (stimulus-locked epochs)
4. Normalization (z-score across channels)
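The four preprocessing steps above can be sketched with NumPy/SciPy. This is a minimal illustration, not the paper's implementation: the ICA/blink-correction stage is omitted (in practice a library such as MNE would handle it), and the helper names, sampling rate, and epoch window are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(eeg, fs, lo=0.5, hi=45.0, order=4):
    """Zero-phase Butterworth bandpass over the time axis (channels x samples)."""
    b, a = butter(order, [lo, hi], btype="band", fs=fs)
    return filtfilt(b, a, eeg, axis=-1)

def epoch(eeg, stimulus_samples, fs, tmin=-0.2, tmax=0.8):
    """Cut stimulus-locked epochs: returns (n_epochs, channels, samples)."""
    pre, post = int(-tmin * fs), int(tmax * fs)
    return np.stack([eeg[:, s - pre : s + post] for s in stimulus_samples])

def zscore(epochs):
    """Normalize each channel within each epoch."""
    mu = epochs.mean(axis=-1, keepdims=True)
    sd = epochs.std(axis=-1, keepdims=True)
    return (epochs - mu) / (sd + 1e-8)

fs = 500                                  # Hz, meets the >=500 Hz requirement
eeg = np.random.randn(32, 10 * fs)        # 32 channels, 10 s of signal
stims = np.array([2, 4, 6, 8]) * fs       # stimulus onsets, in samples
feats = zscore(epoch(bandpass(eeg, fs), stims, fs))
print(feats.shape)  # (4, 32, 500)
```

Artifact removal (step 2) would slot in between `bandpass` and `epoch`.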
**Feature Extraction:**
- Convolutional layers for spatial patterns
- LSTM/Transformer for temporal dynamics
- Attention mechanisms for salient time windows
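As a toy stand-in for the learned encoders, the three bullets can be mimicked in NumPy: a random spatial projection plays the role of the CNN, and softmax attention pooling over time plays the role of the LSTM/attention stage. The shapes and filter count are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_temporal_features(epochs, n_spatial=8, rng=None):
    """Project channels through spatial filters, then attention-pool over time."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_epochs, n_ch, n_t = epochs.shape
    W = rng.standard_normal((n_spatial, n_ch))     # spatial filters (CNN stand-in)
    proj = np.einsum("sc,ect->est", W, epochs)     # (epochs, spatial, time)
    scores = softmax(proj.mean(axis=1), axis=-1)   # weight salient time windows
    return np.einsum("est,et->es", proj, scores)   # (epochs, n_spatial)

epochs = np.random.randn(4, 32, 500)
feats = spatial_temporal_features(epochs)
print(feats.shape)  # (4, 8)
```

In the real pipeline the spatial filters and attention weights are learned end-to-end rather than random.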
### Diffusion Model
**Architecture:**
- Base: Latent Diffusion Model (LDM)
- Conditioning: EEG feature vectors via cross-attention
- Resolution: 256×256 or 512×512 output
- Training: Paired EEG-image dataset
**Training Strategy:**
- Pretrain on large image dataset
- Fine-tune with EEG conditioning
- Add classifier-free guidance for quality
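Classifier-free guidance combines an unconditional and an EEG-conditioned noise prediction at each denoising step; the standard formula is sketched below (the guidance scale and tensor shapes are illustrative assumptions).

```python
import numpy as np

def cfg_noise(eps_uncond, eps_cond, guidance_scale=7.5):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the EEG-conditioned one."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

eps_u = np.zeros((4, 64))   # unconditional noise prediction
eps_c = np.ones((4, 64))    # EEG-conditioned noise prediction
eps = cfg_noise(eps_u, eps_c, guidance_scale=2.0)
print(float(eps.mean()))  # 2.0
```

Training alternates conditional and unconditional (dropped-conditioning) batches so both predictions come from one network.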
### LLM Post-Processing
**Semantic Analysis:**
```
Input: Generated image
LLM Task: Describe visual content + assess quality
Output: Textual feedback for refinement
```
**Refinement Loop:**
1. Generate initial image from EEG
2. LLM analyzes content and quality
3. Generate text prompts for enhancement
4. Rerun diffusion with improved guidance
5. Iterate until quality threshold met
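The five-step refinement loop can be sketched as plain Python control flow. `generate` and `critique` are hypothetical hooks standing in for the diffusion model and the multimodal-LLM critic; the threshold and iteration cap are assumptions.

```python
def refine_until_good(eeg_features, generate, critique,
                      max_iters=3, threshold=0.9):
    """Generate, critique, and regenerate until the quality threshold is met.
    generate(features, prompt) -> image; critique(image) -> (score, prompt)."""
    prompt = None
    image = generate(eeg_features, prompt)       # step 1: initial image
    for _ in range(max_iters):
        score, suggested = critique(image)       # steps 2-3: LLM analysis
        if score >= threshold:
            break
        prompt = suggested
        image = generate(eeg_features, prompt)   # step 4: rerun diffusion
    return image

# Demo with stand-ins: the critic's second pass clears the threshold
scores = iter([0.6, 0.95])
def generate(features, prompt):
    return {"prompt": prompt}
def critique(image):
    return next(scores), "add a red apple on a table"

image = refine_until_good(None, generate, critique)
print(image)  # {'prompt': 'add a red apple on a table'}
```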
## Implementation Guidelines
### System Requirements
```yaml
EEG Equipment: 24-128 channel cap
Sampling Rate: ≥500 Hz
GPU: NVIDIA GPU with ≥12GB VRAM
Framework: PyTorch with diffusers library
```
### Data Pipeline
```python
# EEG preprocessing (helper functions are illustrative placeholders)
def preprocess_eeg(raw_eeg, stimulus_times):
    filtered = bandpass_filter(raw_eeg, 0.5, 45)      # 0.5-45 Hz bandpass
    cleaned = artifact_removal(filtered)              # ICA / blink correction
    epochs = segment_epochs(cleaned, stimulus_times)  # stimulus-locked epochs
    return normalize(epochs)                          # per-channel z-score

# Feature extraction
def extract_features(eeg_epochs):
    spatial = cnn_encoder(eeg_epochs)                 # spatial patterns
    temporal = lstm_encoder(spatial)                  # temporal dynamics
    return fusion_layer(spatial, temporal)

# Image generation
def generate_image(eeg_features):
    latent = random_latent()
    for t in diffusion_steps:
        noise_pred = unet(latent, t, eeg_features)    # EEG-conditioned denoiser
        latent = denoise_step(latent, noise_pred, t)
    return vae_decode(latent)
```
### LLM Enhancement
```python
def llm_enhance(image, initial_prompt, threshold=0.7):
    # Multimodal LLM describes the generated image
    description = vision_llm.describe(image)
    quality_score = assess_quality(description)
    if quality_score < threshold:
        # Fold the LLM's feedback into a refined prompt and regenerate
        enhanced_prompt = refine_prompt(initial_prompt, description)
        return regenerate_with_prompt(enhanced_prompt)
    return image
```
## Activation Keywords
- EEG2Vision
- EEG-to-image
- brain visual reconstruction
- multimodal EEG
- low-density EEG decoding
- EEG reconstruction
- brain-to-image
- diffusion EEG
- neural decoding visual
- cognitive neuroscience imaging
## Use Cases
1. **Brain-Computer Interfaces**: Visual feedback for BCIs
2. **Dream Visualization**: Reconstructing dream imagery
3. **Memory Research**: Visual memory reconstruction
4. **Clinical Applications**: Coma patient communication
5. **Neuroscience Education**: Brain activity visualization
## Performance Evaluation
### Metrics
- **Image Quality**: FID, IS, LPIPS
- **Semantic Similarity**: CLIP score
- **Perceptual Quality**: User studies
- **Cross-Subject Generalization**: Subject-independent accuracy
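Of these metrics, FID is straightforward to compute once image features are in hand: it is the Fréchet distance between two Gaussians fitted to real and generated feature sets. A sketch assuming precomputed feature matrices (normally Inception activations):

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_gen):
    """FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2})."""
    mu1, mu2 = feats_real.mean(0), feats_gen.mean(0)
    s1 = np.cov(feats_real, rowvar=False)
    s2 = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(s1 @ s2)
    if np.iscomplexobj(covmean):       # drop tiny imaginary parts from sqrtm
        covmean = covmean.real
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(s1 + s2 - 2 * covmean))

rng = np.random.default_rng(0)
a = rng.standard_normal((500, 16))
print(fid(a, a))       # ~0 for identical feature sets
print(fid(a, a + 5))   # large for shifted features
```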
### Comparison Across EEG Densities
| Channels | FID ↓ | CLIP Score ↑ | Inference Time |
|----------|-------|--------------|----------------|
| 128 | 25.3 | 0.31 | 2.1s |
| 64 | 28.7 | 0.29 | 1.8s |
| 32 | 32.1 | 0.26 | 1.5s |
| 24 | 35.8 | 0.24 | 1.2s |
## Deployment Considerations
### Real-Time Constraints
- Target latency: <500ms per image
- Streaming EEG processing
- Incremental updates
### Hardware Optimization
- Model quantization (INT8)
- TensorRT acceleration
- Batch processing for multiple users
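INT8 quantization replaces float weights with 8-bit integers plus a scale factor; a minimal symmetric per-tensor sketch (the calibration scheme and tensor shapes are assumptions, and production deployments would use TensorRT's or PyTorch's built-in quantizers):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization: scale = max|w| / 127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(64, 64).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(dequantize(q, s) - w).max()
print(q.dtype, err < s)  # reconstruction error stays below one quantization step
```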
### Privacy and Ethics
- Data anonymization
- User consent for brain data
- Secure transmission protocols
## References
- Paper: arXiv:2604.08063v1 (April 2026)
- Title: "EEG2Vision: A Multimodal EEG-Based Framework for 2D Visual Reconstruction in Cognitive Neuroscience"
- Authors: Emanuele Balloni, Emanuele Frontoni, Chiara Matti
- PDF: https://arxiv.org/pdf/2604.08063v1
## Related Skills
- brain-decoding
- multimodal-brain-connectivity-gnn
- neural-dynamics-decision-making
## Notes
- Low-density configurations (24-32 channels) show promising results
- LLM post-processing significantly improves semantic coherence
- Training on diverse visual stimuli improves generalization
- Consider individual calibration for optimal performance