Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.
```bash
npx versuz@latest install hiyenwong-ai-collection-collection-skills-eeg-foundation-models-review
```

```bash
git clone https://github.com/hiyenwong/ai_collection.git
cp ai_collection/SKILL.MD ~/.claude/skills/hiyenwong-ai-collection-collection-skills-eeg-foundation-models-review/SKILL.md
```

---
name: eeg-foundation-models-review
description: Comprehensive review of EEG foundation models covering self-supervised learning, transfer learning, and downstream applications. Synthesizes current state-of-the-art approaches, benchmarks, and best practices. Based on arXiv:2604.16655 (April 2026).
tags: [EEG, foundation models, self-supervised learning, transfer learning, brain-computer interface, neural decoding]
---

# EEG Foundation Models Review

## Overview

Comprehensive survey of foundation models for EEG signal processing, covering pre-training strategies, architectural designs, transfer learning approaches, and downstream task performance.

**Paper**: arXiv:2604.16655 (April 2026)

## Key Categories

### 1. Self-Supervised Pre-training Objectives

1. **Masked Signal Modeling**
   - Mask random temporal segments
   - Predict masked content from context
   - Similar to BERT's masked language modeling
   - Works well for temporal structure learning

2. **Contrastive Learning**
   - Positive pairs: augmentations of the same EEG segment
   - Negative pairs: different subjects/sessions
   - Subject-invariant representation learning
   - Handles inter-subject variability

3. **Predictive Coding**
   - Predict future EEG samples from past context
   - Temporal dynamics modeling
   - Autoregressive pre-training
   - Captures neural oscillation patterns

4. **Cross-Modal Pre-training**
   - EEG + text, EEG + image, EEG + behavior
   - Multi-modal alignment objectives
   - Leverages abundant paired data
   - Enables zero-shot transfer

### 2. Architecture Families

#### Transformer-Based
- Multi-head self-attention for temporal modeling
- Positional encoding for temporal structure
- Scalable to large datasets
- Most popular approach (BrainBERT, NeuroBERT variants)

#### CNN-Based
- Local temporal feature extraction
- Efficient for short-range dependencies
- Lower computational requirements
- Good for real-time applications

#### Hybrid Architectures
- CNN for local features + Transformer for global context
- Best of both worlds
- More complex training pipeline

#### State-Space Models
- Selective SSMs (Mamba-style)
- Linear scaling with sequence length
- Emerging approach for long EEG recordings

### 3. Transfer Learning Strategies

#### Fine-tuning Approaches

1. **Full fine-tuning**
   - All layers updated
   - Best performance, highest compute
   - Risk of catastrophic forgetting

2. **Linear probing**
   - Only the classification head is trained
   - Fast evaluation of representation quality
   - Lower performance ceiling

3. **Parameter-efficient fine-tuning**
   - LoRA adapters
   - Prompt tuning
   - Prefix tuning
   - Minimal parameter updates

#### Domain Adaptation

1. **Subject-to-subject transfer**
   - Pre-train on many subjects
   - Adapt to a new subject with minimal data
   - Few-shot / zero-shot scenarios

2. **Task-to-task transfer**
   - Pre-train on one task (e.g., motor imagery)
   - Adapt to another (e.g., emotion recognition)
   - Cross-task generalization

3. **Cross-dataset transfer**
   - Pre-train on large public datasets
   - Fine-tune on domain-specific data
   - Handles distribution shift
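To make the masked-signal-modeling objective from Section 1 concrete, the following is a minimal PyTorch sketch: random temporal patches are masked and the model is trained to reconstruct them from the unmasked context. The class name, patch size, channel count, and encoder depth are illustrative assumptions, not settings taken from any model in the survey.

```python
import torch
import torch.nn as nn


class MaskedEEGPretrainer(nn.Module):
    """Minimal masked-signal-modeling sketch (hypothetical sizes): mask random
    temporal patches and reconstruct them from the unmasked context."""

    def __init__(self, n_channels: int = 32, patch_len: int = 40, d_model: int = 128):
        super().__init__()
        self.patch_len = patch_len
        self.embed = nn.Linear(n_channels * patch_len, d_model)   # patch -> token
        self.mask_token = nn.Parameter(torch.zeros(d_model))      # learnable [MASK]
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, n_channels * patch_len)    # token -> patch

    def forward(self, eeg: torch.Tensor, mask_ratio: float = 0.5) -> torch.Tensor:
        # eeg: (batch, channels, time); trailing samples that do not fill a patch are dropped
        b, c, _ = eeg.shape
        patches = eeg.unfold(2, self.patch_len, self.patch_len)   # (b, c, n_patches, patch_len)
        patches = patches.permute(0, 2, 1, 3).reshape(b, -1, c * self.patch_len)

        tokens = self.embed(patches)
        mask = torch.rand(b, tokens.shape[1], device=eeg.device) < mask_ratio
        tokens = torch.where(mask.unsqueeze(-1), self.mask_token, tokens)

        recon = self.head(self.encoder(tokens))
        # BERT-style objective: reconstruction loss only on the masked patches
        return nn.functional.mse_loss(recon[mask], patches[mask])


# Usage: 4-second windows at 250 Hz, 32 channels (illustrative numbers only)
model = MaskedEEGPretrainer()
loss = model(torch.randn(8, 32, 1000))
loss.backward()
```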
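As a minimal illustration of the fine-tuning strategies in Section 3, the sketch below shows linear probing: the pre-trained encoder is frozen and only a new classification head is trained. `build_linear_probe` and the stand-in encoder are hypothetical placeholders; a real EEG foundation model checkpoint would take the encoder's place.

```python
import torch
import torch.nn as nn


def build_linear_probe(encoder: nn.Module, feat_dim: int, n_classes: int):
    """Linear-probing sketch: freeze the pre-trained encoder and train only a
    linear classification head on top of its features."""
    for p in encoder.parameters():
        p.requires_grad = False                 # frozen backbone
    head = nn.Linear(feat_dim, n_classes)       # the only trainable parameters
    model = nn.Sequential(encoder, head)
    optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
    return model, optimizer


# Usage with a stand-in encoder mapping (batch, 32, 1000) EEG windows to 128-d features
encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 1000, 128))
model, opt = build_linear_probe(encoder, feat_dim=128, n_classes=4)

x, y = torch.randn(8, 32, 1000), torch.randint(0, 4, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()
```

In a real run the frozen encoder would also be put in `eval()` mode so that any normalization statistics stay fixed; when the linear probe's performance ceiling is too low, the same pattern extends to parameter-efficient methods such as LoRA adapters.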
### 4. Benchmark Datasets

#### Large-Scale Pre-training
- TUH EEG Corpus (largest clinical EEG dataset)
- Sleep-EDF / Sleep-EDFx
- CHB-MIT (seizure detection)
- BCI Competition datasets

#### Downstream Evaluation
- Motor imagery (BCI IV 2a/2b)
- Emotion recognition (DEAP, SEED)
- Sleep staging
- Seizure detection
- Cognitive state decoding

## Performance Insights

### Representation Quality
1. Self-supervised pre-training significantly outperforms random initialization
2. Contrastive learning excels at subject-invariant features
3. Masked modeling captures temporal structure better
4. Multi-modal pre-training enables cross-domain transfer

### Scalability
1. Performance scales with pre-training data size
2. Larger models show better transfer to low-data regimes
3. Diminishing returns beyond certain model sizes
4. Computational cost remains a bottleneck

### Practical Considerations
1. Channel configuration matters
   - Models trained on 64 channels don't transfer well to 32
   - Need channel-agnostic pre-training strategies
2. Sampling rate standardization needed
3. Artifact handling critical for clinical deployment
4. Interpretability tools needed for medical applications

## Best Practices

### Pre-training
1. Use diverse datasets covering multiple paradigms
2. Balance subject diversity and data volume
3. Include both clinical and research-grade EEG
4. Standardize the preprocessing pipeline

### Fine-tuning
1. Start with linear probing to assess representations
2. Use parameter-efficient methods for small target datasets
3. Apply domain-specific augmentations during fine-tuning
4. Monitor for distribution shift

### Evaluation
1. Report both within-subject and cross-subject performance
2. Use standardized metrics per task type
3. Include ablation studies on pre-training objectives
4. Compare against strong baselines (not just random initialization)

## Pitfalls

- Ignoring inter-subject variability during pre-training
- Over-reliance on a single pre-training objective
- Not evaluating cross-dataset generalization
- Overlooking channel mismatch between pre-training and fine-tuning
- Insufficient data augmentation for contrastive learning
- Neglecting computational cost for deployment

## Future Directions

1. Foundation models for emerging EEG modalities (dry electrodes, wearables)
2. Real-time inference optimization
3. Multi-modal fusion (EEG + fNIRS, EEG + eye-tracking)
4. Federated pre-training for privacy-preserving EEG models
5. Open-source benchmarks and standardized evaluation

## Related Skills

- eeg-foundation-model-adapters
- neural-encoding-evaluation-meeg
- meta-learning-in-context-brain-decoding

## References

- arXiv:2604.16655 (April 2026)