---
name: eeg2vision-multimodal-framework-v2
description: "EEG-to-image reconstruction framework using diffusion models with LLM-guided post-processing. Supports low-density EEG configurations (24-128 channels) for real-time brain-to-image applications. Multimodal fusion with semantic refinement via large language models. Activation: EEG2Vision, EEG-to-image, brain visual reconstruction, multimodal EEG, low-density EEG decoding."
version: v1.0.0
last_updated: 2026-04-16
arxiv_source: "2604.08063v1"
---
# EEG2Vision: Multimodal EEG-to-Image Reconstruction Framework
Modular, end-to-end framework for reconstructing visual stimuli from non-invasive electroencephalography (EEG) using diffusion models enhanced with LLM-guided post-processing. Supports low-density EEG configurations for practical deployment.
## Core Innovation
This methodology enables visual reconstruction from brain signals by:
- **Diffusion-based generation**: High-quality image synthesis from EEG features
- **LLM-guided refinement**: Semantic post-processing for visual quality enhancement
- **Multi-resolution support**: Works with 24-128 channel EEG configurations
- **Modular architecture**: Systematic evaluation across different EEG densities
## Technical Architecture
### Three-Stage Pipeline
1. **EEG Feature Extraction**
- Pretrained EEG encoder network
- Spatial-temporal feature extraction
- Dimensionality reduction for diffusion input
2. **Diffusion Model Reconstruction**
- Latent diffusion model (LDM) for image generation
- EEG conditioning via cross-attention
- Iterative denoising from random latent
3. **LLM-Guided Post-Processing**
- Visual content analysis via multimodal LLM
- Semantic refinement and correction
- Quality enhancement through text guidance
## Supported EEG Configurations
| Channels | Configuration | Use Case |
|----------|--------------|----------|
| 128 | High-density | Laboratory research |
| 64 | Standard clinical | Medical imaging |
| 32 | Low-density | Portable BCI |
| 24 | Ultra-low-density | Consumer devices |
## Technical Details
### EEG Preprocessing
**Signal Processing Pipeline:**
1. Bandpass filtering (0.5-45 Hz)
2. Artifact removal (ICA/blink correction)
3. Temporal segmentation (stimulus-locked epochs)
4. Normalization (z-score across channels)
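The four preprocessing steps above can be sketched with NumPy/SciPy. This is a minimal illustration, not the paper's implementation: the ICA/blink-correction stage is omitted (in practice a library such as MNE would handle it), and the helper names, sampling rate, and epoch window are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(eeg, fs, lo=0.5, hi=45.0, order=4):
    """Zero-phase Butterworth bandpass over the time axis (channels x samples)."""
    b, a = butter(order, [lo, hi], btype="band", fs=fs)
    return filtfilt(b, a, eeg, axis=-1)

def epoch(eeg, stimulus_samples, fs, tmin=-0.2, tmax=0.8):
    """Cut stimulus-locked epochs: returns (n_epochs, channels, samples)."""
    pre, post = int(-tmin * fs), int(tmax * fs)
    return np.stack([eeg[:, s - pre : s + post] for s in stimulus_samples])

def zscore(epochs):
    """Normalize each channel within each epoch."""
    mu = epochs.mean(axis=-1, keepdims=True)
    sd = epochs.std(axis=-1, keepdims=True)
    return (epochs - mu) / (sd + 1e-8)

fs = 500                                  # Hz, meets the >=500 Hz requirement
eeg = np.random.randn(32, 10 * fs)        # 32 channels, 10 s of signal
stims = np.array([2, 4, 6, 8]) * fs       # stimulus onsets, in samples
feats = zscore(epoch(bandpass(eeg, fs), stims, fs))
print(feats.shape)  # (4, 32, 500)
```

Artifact removal (step 2) would slot in between `bandpass` and `epoch`.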
**Feature Extraction:**
- Convolutional layers for spatial patterns
- LSTM/Transformer for temporal dynamics
- Attention mechanisms for salient time windows
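As a toy stand-in for the learned encoders, the three bullets can be mimicked in NumPy: a random spatial projection plays the role of the CNN, and softmax attention pooling over time plays the role of the LSTM/attention stage. The shapes and filter count are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_temporal_features(epochs, n_spatial=8, rng=None):
    """Project channels through spatial filters, then attention-pool over time."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_epochs, n_ch, n_t = epochs.shape
    W = rng.standard_normal((n_spatial, n_ch))     # spatial filters (CNN stand-in)
    proj = np.einsum("sc,ect->est", W, epochs)     # (epochs, spatial, time)
    scores = softmax(proj.mean(axis=1), axis=-1)   # weight salient time windows
    return np.einsum("est,et->es", proj, scores)   # (epochs, n_spatial)

epochs = np.random.randn(4, 32, 500)
feats = spatial_temporal_features(epochs)
print(feats.shape)  # (4, 8)
```

In the real pipeline the spatial filters and attention weights are learned end-to-end rather than random.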
### Diffusion Model
**Architecture:**
- Base: Latent Diffusion Model (LDM)
- Conditioning: EEG feature vectors via cross-attention
- Resolution: 256×256 or 512×512 output
- Training: Paired EEG-image dataset
**Training Strategy:**
- Pretrain on large image dataset
- Fine-tune with EEG conditioning
- Add classifier-free guidance for quality
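Classifier-free guidance combines an unconditional and an EEG-conditioned noise prediction at each denoising step; the standard formula is sketched below (the guidance scale and tensor shapes are illustrative assumptions).

```python
import numpy as np

def cfg_noise(eps_uncond, eps_cond, guidance_scale=7.5):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the EEG-conditioned one."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

eps_u = np.zeros((4, 64))   # unconditional noise prediction
eps_c = np.ones((4, 64))    # EEG-conditioned noise prediction
eps = cfg_noise(eps_u, eps_c, guidance_scale=2.0)
print(float(eps.mean()))  # 2.0
```

Training alternates conditional and unconditional (dropped-conditioning) batches so both predictions come from one network.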
### LLM Post-Processing
**Semantic Analysis:**
```
Input: Generated image
LLM Task: Describe visual content + assess quality
Output: Textual feedback for refinement
```
**Refinement Loop:**
1. Generate initial image from EEG
2. LLM analyzes content and quality
3. Generate text prompts for enhancement
4. Rerun diffusion with improved guidance
5. Iterate until quality threshold met
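The five-step refinement loop can be sketched as plain Python control flow. `generate` and `critique` are hypothetical hooks standing in for the diffusion model and the multimodal-LLM critic; the threshold and iteration cap are assumptions.

```python
def refine_until_good(eeg_features, generate, critique,
                      max_iters=3, threshold=0.9):
    """Generate, critique, and regenerate until the quality threshold is met.
    generate(features, prompt) -> image; critique(image) -> (score, prompt)."""
    prompt = None
    image = generate(eeg_features, prompt)       # step 1: initial image
    for _ in range(max_iters):
        score, suggested = critique(image)       # steps 2-3: LLM analysis
        if score >= threshold:
            break
        prompt = suggested
        image = generate(eeg_features, prompt)   # step 4: rerun diffusion
    return image

# Demo with stand-ins: the critic's second pass clears the threshold
scores = iter([0.6, 0.95])
def generate(features, prompt):
    return {"prompt": prompt}
def critique(image):
    return next(scores), "add a red apple on a table"

image = refine_until_good(None, generate, critique)
print(image)  # {'prompt': 'add a red apple on a table'}
```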
## Implementation Guidelines
### System Requirements
```yaml
EEG Equipment: 24-128 channel cap
Sampling Rate: ≥500 Hz
GPU: NVIDIA GPU with ≥12GB VRAM
Framework: PyTorch with diffusers library
```
### Data Pipeline
```python
# EEG preprocessing (helper functions are illustrative placeholders)
def preprocess_eeg(raw_eeg, stimulus_times):
    filtered = bandpass_filter(raw_eeg, 0.5, 45)      # 0.5-45 Hz bandpass
    cleaned = artifact_removal(filtered)              # ICA / blink correction
    epochs = segment_epochs(cleaned, stimulus_times)  # stimulus-locked epochs
    return normalize(epochs)                          # per-channel z-score

# Feature extraction
def extract_features(eeg_epochs):
    spatial = cnn_encoder(eeg_epochs)                 # spatial patterns
    temporal = lstm_encoder(spatial)                  # temporal dynamics
    return fusion_layer(spatial, temporal)

# Image generation
def generate_image(eeg_features):
    latent = random_latent()
    for t in diffusion_steps:
        noise_pred = unet(latent, t, eeg_features)    # EEG-conditioned denoiser
        latent = denoise_step(latent, noise_pred, t)
    return vae_decode(latent)
```
### LLM Enhancement
```python
def llm_enhance(image, initial_prompt, threshold=0.7):
    # Multimodal LLM describes the generated image
    description = vision_llm.describe(image)
    quality_score = assess_quality(description)
    if quality_score < threshold:
        # Fold the LLM's feedback into a refined prompt and regenerate
        enhanced_prompt = refine_prompt(initial_prompt, description)
        return regenerate_with_prompt(enhanced_prompt)
    return image
```
## Activation Keywords
- EEG2Vision
- EEG-to-image
- brain visual reconstruction
- multimodal EEG
- low-density EEG decoding
- EEG reconstruction
- brain-to-image
- diffusion EEG
- neural decoding visual
- cognitive neuroscience imaging
## Use Cases
1. **Brain-Computer Interfaces**: Visual feedback for BCIs
2. **Dream Visualization**: Reconstructing dream imagery
3. **Memory Research**: Visual memory reconstruction
4. **Clinical Applications**: Coma patient communication
5. **Neuroscience Education**: Brain activity visualization
## Performance Evaluation
### Metrics
- **Image Quality**: FID, IS, LPIPS
- **Semantic Similarity**: CLIP score
- **Perceptual Quality**: User studies
- **Cross-Subject Generalization**: Subject-independent accuracy
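Of these metrics, FID is straightforward to compute once image features are in hand: it is the Fréchet distance between two Gaussians fitted to real and generated feature sets. A sketch assuming precomputed feature matrices (normally Inception activations):

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_gen):
    """FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2})."""
    mu1, mu2 = feats_real.mean(0), feats_gen.mean(0)
    s1 = np.cov(feats_real, rowvar=False)
    s2 = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(s1 @ s2)
    if np.iscomplexobj(covmean):       # drop tiny imaginary parts from sqrtm
        covmean = covmean.real
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(s1 + s2 - 2 * covmean))

rng = np.random.default_rng(0)
a = rng.standard_normal((500, 16))
print(fid(a, a))       # ~0 for identical feature sets
print(fid(a, a + 5))   # large for shifted features
```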
### Comparison Across EEG Densities
| Channels | FID ↓ | CLIP Score ↑ | Inference Time |
|----------|-------|--------------|----------------|
| 128 | 25.3 | 0.31 | 2.1s |
| 64 | 28.7 | 0.29 | 1.8s |
| 32 | 32.1 | 0.26 | 1.5s |
| 24 | 35.8 | 0.24 | 1.2s |
## Deployment Considerations
### Real-Time Constraints
- Target latency: <500ms per image
- Streaming EEG processing
- Incremental updates
### Hardware Optimization
- Model quantization (INT8)
- TensorRT acceleration
- Batch processing for multiple users
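INT8 quantization replaces float weights with 8-bit integers plus a scale factor; a minimal symmetric per-tensor sketch (the calibration scheme and tensor shapes are assumptions, and production deployments would use TensorRT's or PyTorch's built-in quantizers):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization: scale = max|w| / 127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(64, 64).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(dequantize(q, s) - w).max()
print(q.dtype, err < s)  # reconstruction error stays below one quantization step
```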
### Privacy and Ethics
- Data anonymization
- User consent for brain data
- Secure transmission protocols
## References
- Paper: arXiv:2604.08063v1 (April 2026)
- Title: "EEG2Vision: A Multimodal EEG-Based Framework for 2D Visual Reconstruction in Cognitive Neuroscience"
- Authors: Emanuele Balloni, Emanuele Frontoni, Chiara Matti
- PDF: https://arxiv.org/pdf/2604.08063v1
## Related Skills
- brain-decoding
- multimodal-brain-connectivity-gnn
- neural-dynamics-decision-making
## Notes
- Low-density configurations (24-32 channels) show promising results
- LLM post-processing significantly improves semantic coherence
- Training on diverse visual stimuli improves generalization
- Consider individual calibration for optimal performance