---
name: brain3d-eeg-multimodal
description: Brain3D - EEG-to-3D visual reconstruction via multimodal reasoning. Progressive transformation from EEG to 3D using geometry-aware generative reasoning with LLM-guided 3D-aware descriptions and diffusion-based generation. Activation: Brain3D, EEG-to-3D, brain 3D reconstruction, multimodal EEG, visual decoding 3D.
category: ai_collection
---

# Brain3D: EEG-to-3D Decoding of Visual Representations via Multimodal Reasoning

Based on the paper "Brain3D: EEG-to-3D Decoding of Visual Representations via Multimodal Reasoning" (arXiv:2604.08068v1, 2026).

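## Core Contribution

Brain3D reports the first reconstruction of 3D visual representations from EEG signals. It moves beyond the 2D image reconstruction that earlier work was limited to, and opens new directions for geometric understanding and neural decoding applications.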

## Architecture

### Multi-Stage Progressive Reconstruction Pipeline

```
┌─────────────────────────────────────────────────────────────────┐
│  Stage 1: EEG-to-Image                                          │
│  ├── Input: EEG signal (128/64/32/24 channels)                  │
│  ├── EEG-conditioned diffusion model                            │
│  └── Output: base 2D image                                      │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│  Stage 2: Multimodal LLM Semantic Extraction                    │
│  ├── Input: base 2D image                                       │
│  ├── Multimodal large language model (MLLM)                     │
│  └── Output: structured 3D-aware description                    │
│      └── "A smooth red sphere centered on a white backdrop..."  │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│  Stage 3: Diffusion-based 3D Generation                         │
│  ├── Input: 3D-aware description + image condition              │
│  ├── Diffusion model generation                                 │
│  └── Output: 3D-consistent multi-view images                    │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│  Stage 4: Single-Image-to-3D Reconstruction                     │
│  ├── Input: multi-view images                                   │
│  ├── Single-image 3D model (e.g. Zero123, One-2-3-45)           │
│  └── Output: coherent 3D mesh                                   │
└─────────────────────────────────────────────────────────────────┘
```

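## Key Techniques

### 1. EEG-Conditioned Diffusion Model

```python
import torch

# Pseudocode: EEGEncoder, UNet, denoise_step, and VAE_Decoder are placeholders.

# Encode the EEG signal into a conditioning vector
eeg_features = EEGEncoder(eeg_signal)  # [batch, eeg_dim]

# Start from Gaussian noise, then denoise step by step, conditioned on the EEG features
latent = torch.randn(latent_shape)
for t in reversed(range(timesteps)):
    noise_pred = UNet(latent, t, eeg_features)
    latent = denoise_step(latent, noise_pred, t)

# Decode the final latent into an image
image = VAE_Decoder(latent)
```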
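The loop above can be made concrete with a small, self-contained sketch. It assumes standard DDPM ancestral sampling with a linear noise schedule, which the source does not confirm as the sampler Brain3D uses. `toy_noise_predictor` is a hypothetical stand-in for the trained EEG-conditioned UNet, and the latent shape and feature size are arbitrary placeholders.

```python
import numpy as np

def make_schedule(timesteps, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule: betas, alphas, and cumulative alphas."""
    betas = np.linspace(beta_start, beta_end, timesteps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def toy_noise_predictor(latent, t, eeg_features):
    """Hypothetical stand-in for the EEG-conditioned UNet (not a trained model).

    It only mixes the latent with the conditioning vector so that the
    sampling loop has something to call.
    """
    return 0.1 * latent + 0.01 * eeg_features.mean()

def sample(eeg_features, latent_shape, timesteps=50, seed=0):
    """DDPM ancestral sampling conditioned on an EEG feature vector."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(timesteps)
    latent = rng.standard_normal(latent_shape)  # start from pure noise
    for t in reversed(range(timesteps)):
        eps = toy_noise_predictor(latent, t, eeg_features)
        # Posterior mean of x_{t-1} given the predicted noise
        mean = (latent - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            latent = mean + np.sqrt(betas[t]) * rng.standard_normal(latent_shape)
        else:
            latent = mean  # no noise is added at the final step
    return latent

eeg_features = np.ones(128)  # placeholder conditioning vector
latent = sample(eeg_features, latent_shape=(4, 8, 8))
print(latent.shape)  # (4, 8, 8)
```

In a real system the final latent would then go through a VAE decoder to produce the base 2D image; that step is omitted here because it requires trained weights.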

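### 2. Multimodal LLM 3D Description Extraction

```python
# Build the prompt; the image is passed to the model separately, not inlined as text
prompt = """
Analyze this image and produce a detailed 3D description covering:
1. Object geometry (shape, size, proportions)
2. Surface properties (material, texture, color)
3. Spatial relationships (position, orientation, background)
4. 3D structural features (volume, depth, silhouette)
"""

# Generate the 3D-aware description (MultimodalLLM is a placeholder interface)
description_3d = MultimodalLLM.generate(prompt, image=image)
```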
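For the description to be "structured", one option is to ask the model for JSON and validate the reply before it reaches the next stage. The sketch below is an illustration under that assumption: `Description3D`, `parse_description`, the four key names, and the sample reply are all hypothetical and hand-written, not taken from the paper or from any model output.

```python
import json
from dataclasses import dataclass

PROMPT_TEMPLATE = (
    "Analyze the attached image and return a JSON object with exactly these keys: "
    "geometry, surface, spatial, structure. Each value is one short sentence."
)

REQUIRED_KEYS = ("geometry", "surface", "spatial", "structure")

@dataclass
class Description3D:
    geometry: str   # shape, size, proportions
    surface: str    # material, texture, color
    spatial: str    # position, orientation, background
    structure: str  # volume, depth, silhouette

    def to_prompt(self) -> str:
        """Flatten the fields into one text condition for the generation stage."""
        return " ".join([self.geometry, self.surface, self.spatial, self.structure])

def parse_description(raw: str) -> Description3D:
    """Parse the model's reply, failing loudly if a required key is missing."""
    data = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys in MLLM reply: {missing}")
    return Description3D(**{key: str(data[key]) for key in REQUIRED_KEYS})

# Hand-written reply for illustration only (not real model output)
reply = json.dumps({
    "geometry": "A sphere of medium size.",
    "surface": "Smooth red surface.",
    "spatial": "Centered on a white background.",
    "structure": "Convex volume with a circular silhouette.",
})
desc = parse_description(reply)
print(desc.to_prompt())
```

Validating the reply at this boundary keeps a malformed description from silently degrading the multi-view generation that follows.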

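### 3. Geometry-Aware Generative Reasoning

**Key advantages**:
- Avoids a direct EEG-to-3D mapping, which would be overly complex and data-hungry
- Decomposes the task into manageable sub-problems
- Each stage can be optimized and validated independently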
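The point about independent validation can be sketched as a pipeline in which every stage carries its own output check. This is a minimal illustration, not the authors' implementation: `Stage`, `run_pipeline`, and the toy lambdas are hypothetical, and each "model" returns a small placeholder instead of real data.

```python
from typing import Any, Callable, List, NamedTuple

class Stage(NamedTuple):
    name: str
    run: Callable[[Any], Any]      # the stage's model call
    check: Callable[[Any], bool]   # validation of that stage's output

def run_pipeline(stages: List[Stage], x: Any) -> Any:
    """Run stages in order, validating each output before passing it on."""
    for stage in stages:
        x = stage.run(x)
        if not stage.check(x):
            raise ValueError(f"stage '{stage.name}' produced an invalid output")
    return x

# Toy stand-ins for the four stages (placeholders, not real models)
stages = [
    Stage("eeg_to_image",
          lambda eeg: {"image": [[0.5] * 4 for _ in range(4)]},
          lambda out: len(out["image"]) == 4),
    Stage("image_to_description",
          lambda out: {**out, "text": "a smooth red sphere"},
          lambda out: bool(out["text"].strip())),
    Stage("description_to_views",
          lambda out: {**out, "views": [out["image"]] * 6},
          lambda out: len(out["views"]) == 6),
    Stage("views_to_mesh",
          lambda out: {"vertices": 8, "faces": 12},
          lambda out: out["faces"] > 0),
]

mesh = run_pipeline(stages, x=[0.0] * 128)
print(mesh)  # {'vertices': 8, 'faces': 12}
```

Because a failure names the stage that produced it, a bad reconstruction can be traced to one sub-problem instead of the whole EEG-to-3D mapping.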

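## Experimental Results

### Performance Metrics

| Metric | Value | Notes |
|------|------|------|
| EEG decoding accuracy | 85.4% | 10-way Top-1 |
| CLIPScore | 0.648 | Semantic alignment |
| Geometric fidelity | High (qualitative) | Visual inspection |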
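The source does not spell out how the 10-way Top-1 figure is computed. A common protocol in brain-decoding work, assumed here, scores each reconstruction with a pretrained image classifier and counts a hit when the true class outranks N-1 randomly drawn distractor classes. The function name, the trial count, and the synthetic scores below are illustrative assumptions.

```python
import numpy as np

def n_way_top1(scores, labels, n_way=10, n_trials=50, seed=0):
    """N-way top-1 accuracy.

    scores: [num_samples, num_classes] classifier scores for each reconstruction.
    labels: [num_samples] ground-truth class index for each sample.
    Ties are resolved in favor of the true class (np.argmax returns the first maximum).
    """
    rng = np.random.default_rng(seed)
    num_samples, num_classes = scores.shape
    hits = 0
    for i in range(num_samples):
        others = np.delete(np.arange(num_classes), labels[i])
        for _ in range(n_trials):
            distractors = rng.choice(others, size=n_way - 1, replace=False)
            candidates = np.concatenate(([labels[i]], distractors))
            # Candidate 0 is the true class; a hit means it has the top score
            hits += int(np.argmax(scores[i, candidates]) == 0)
    return hits / (num_samples * n_trials)

# Synthetic check: the true class always receives the highest score
rng = np.random.default_rng(1)
scores = rng.random((20, 40))
labels = rng.integers(0, 40, size=20)
scores[np.arange(20), labels] = 2.0
print(n_way_top1(scores, labels))  # 1.0
```

Under this protocol chance level is 1/N, so 10% for the 10-way setting.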

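### Comparative Analysis

Compared with 2D reconstruction methods:
- Semantic consistency: comparable
- Geometric understanding: markedly improved
- Application potential: broader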

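## Application Scenarios

### 1. Augmented Reality / Virtual Reality

```
Application: brain-controlled 3D content generation
Scenario:
- The user imagines a 3D object
- EEG captures the brain signals
- Brain3D reconstructs the 3D model in real time
- The model is displayed in the AR/VR environment
```

### 2. Design Assistance

- Concept visualization for industrial design
- Brain-controlled generation of architectural proposals
- Support for artistic creation

### 3. Neuroscience Research

- Studying the neural encoding of 3D visual representations
- Exploring the mechanisms of spatial cognition
- Analyzing brain-behavior relationships

### 4. Medical Rehabilitation

- Spatial cognition rehabilitation training
- Assistance for visual impairment
- Neuroplasticity assessment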

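## Implementation Notes

### System Requirements

```python
Brain3D_System = {
    'eeg_device': '24 to 128 channels supported',
    'gpu_memory': '>=16 GB (for the diffusion models)',
    # Note: CLIP is an image-text encoder and cannot generate descriptions on its own
    'mllm': 'GPT-4V / CLIP / local multimodal model',
    'diffusion_model': 'Stable Diffusion XL',
    'image_to_3d': 'Zero123 / One-2-3-45 / InstantMesh'
}
```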
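Requirements like these can be checked at startup before any model is loaded. The sketch below is an illustration only: `Brain3DConfig`, its field names, and `validate` are hypothetical, and the thresholds simply restate the channel counts (24, 32, 64, 128) and the 16 GB GPU figure listed in this document.

```python
from dataclasses import dataclass
from typing import List

SUPPORTED_CHANNELS = (24, 32, 64, 128)  # channel counts listed for Stage 1
MIN_GPU_MEMORY_GB = 16                  # figure from the requirements above

@dataclass(frozen=True)
class Brain3DConfig:
    """Hypothetical config object; not part of any published Brain3D code."""
    eeg_channels: int
    gpu_memory_gb: float

    def validate(self) -> List[str]:
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        if self.eeg_channels not in SUPPORTED_CHANNELS:
            problems.append(f"unsupported channel count: {self.eeg_channels}")
        if self.gpu_memory_gb < MIN_GPU_MEMORY_GB:
            problems.append(
                f"GPU memory {self.gpu_memory_gb} GB is below {MIN_GPU_MEMORY_GB} GB"
            )
        return problems

ok = Brain3DConfig(eeg_channels=64, gpu_memory_gb=24).validate()
bad = Brain3DConfig(eeg_channels=16, gpu_memory_gb=8).validate()
print(ok)   # []
print(bad)  # two problems: channel count and GPU memory
```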

### Inference Flow

```python
class Brain3D:
    """Four-stage EEG-to-3D pipeline. The component classes are placeholders."""

    def __init__(self):
        self.eeg_encoder = EEGEncoder()
        self.diffusion = EEGConditionedDiffusion()
        self.mllm = MultimodalLLM()
        self.image3d = ImageTo3D()

    def reconstruct(self, eeg_signal):
        # Stage 1: EEG -> base 2D image, conditioned on the encoded EEG features
        eeg_features = self.eeg_encoder(eeg_signal)
        base_image = self.diffusion.generate(eeg_features)

        # Stage 2: 2D image -> structured 3D-aware description
        description = self.mllm.describe_3d(base_image)

        # Stage 3: description + image -> 3D-consistent multi-view images
        multi_view = self.diffusion.generate_3d(description, base_image)

        # Stage 4: multi-view images -> 3D mesh
        mesh = self.image3d.reconstruct(multi_view)

        return mesh
```

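### Training Data

- EEG datasets: visual stimuli paired with the corresponding EEG recordings
- Image-3D pairs: multi-view images paired with 3D models
- Overall scale: large-scale multimodal datasets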

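## Technical Advantages

| Feature | Conventional 2D reconstruction | Brain3D |
|------|------------|---------|
| Output dimensionality | 2D image | 3D model |
| Geometric understanding | Limited | Full |
| Application range | Restricted | Broad |
| Scalability | Moderate | High |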

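## Limitations and Future Directions

### Current Limitations

- High computational cost (multi-stage inference)
- Real-time operation remains a challenge
- 3D reconstruction accuracy depends on the underlying base models

### Future Directions

1. **End-to-end training**: jointly optimize all stages
2. **Real-time optimization**: model compression and acceleration
3. **Multimodal fusion**: combine with fMRI, MEG, and other modalities
4. **Interactive editing**: brain-controlled modification of 3D models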

## Paper Information

- **Authors**: Emanuele Balloni, Emanuele Frontoni, Chiara Matti, Marina Paolanti, Roberto Pierdicca, et al.
- **Published**: 2026-04-09
- **arXiv**: https://arxiv.org/abs/2604.08068v1
- **PDF**: https://arxiv.org/pdf/2604.08068v1

## Related Work

- EEG2Vision: 2D visual reconstruction
- Neural 3D Reconstruction
- Brain-Computer Interfaces for 3D Control
- Multimodal Foundation Models

## Trigger Phrases

- Brain3D
- EEG-to-3D
- brain 3D reconstruction
- multimodal EEG 3D
- visual decoding 3D
- 脑电三维重建 (EEG 3D reconstruction)
- 多模态脑解码 (multimodal brain decoding)
- 3D视觉神经解码 (3D visual neural decoding)