---
name: eeg-biomarker-robustness-cross-population
description: "Cross-population framework for evaluating robustness and generalizability of EEG biomarkers in multi-site clinical settings. Addresses cross-subject and cross-platform variation for reliable Parkinson's disease detection. Keywords: EEG biomarkers, cross-population, generalization, multi-site, Parkinson's disease, clinical reliability."
---
# Robust and Clinically Reliable EEG Biomarkers: A Cross-Population Framework
> Framework for developing EEG biomarkers that generalize across populations, sites, and recording platforms for reliable clinical deployment.
## Metadata
- **Source**: arXiv:2604.23933v1
- **Authors**: Nicholas R. Rasmussen, Longwei Wang, Rodrigue Rizk, et al.
- **Published**: 2026-04-27
## Core Methodology
### The Cross-Population Challenge
EEG biomarkers often fail when deployed across different:
- **Subjects**: Individual physiological differences
- **Sites**: Different hospitals/clinics with varying protocols
- **Platforms**: Different EEG hardware and software
- **Populations**: Different demographics, disease stages
### Three-Pillar Evaluation Framework
```
┌───────────────────────────────────────────────────────────────┐
│                ROBUST EEG BIOMARKER FRAMEWORK                 │
├───────────────────┬───────────────────┬───────────────────────┤
│     INTERNAL      │     EXTERNAL      │       CLINICAL        │
│    RELIABILITY    │    RELIABILITY    │        UTILITY        │
├───────────────────┼───────────────────┼───────────────────────┤
│ • Test-retest     │ • Cross-site      │ • Diagnostic          │
│   stability       │   generalization  │   accuracy            │
│ • Split-half      │ • Cross-platform  │ • Prognostic          │
│   consistency     │   robustness      │   value               │
│ • Intra-subject   │ • Cross-population│ • Treatment           │
│   variance        │   transfer        │   monitoring          │
└───────────────────┴───────────────────┴───────────────────────┘
```
## Implementation Guide
### Step 1: Biomarker Feature Extraction
```python
import numpy as np
from scipy import signal

class EEGBiomarkerExtractor:
    """Extract robust EEG biomarkers for clinical applications."""

    def __init__(self, fs=500):
        self.fs = fs
        self.bands = {
            'delta': (0.5, 4),
            'theta': (4, 8),
            'alpha': (8, 13),
            'beta': (13, 30),
            'gamma': (30, 100),
        }

    def extract_spectral_features(self, eeg_data):
        """Extract band-power and spectral features.

        Args:
            eeg_data: (channels, time) array
        Returns:
            Dictionary of spectral features
        """
        features = {}
        for band_name, (low, high) in self.bands.items():
            # Zero-phase bandpass filter to avoid phase distortion
            sos = signal.butter(4, [low, high], btype='band', fs=self.fs, output='sos')
            filtered = signal.sosfiltfilt(sos, eeg_data, axis=-1)
            # Mean squared amplitude = band power
            features[f'{band_name}_power'] = np.mean(filtered ** 2, axis=-1)
        # Relative power ratios (epsilon guards against division by zero)
        total_power = sum(features[f'{b}_power'] for b in self.bands)
        for band in self.bands:
            features[f'{band}_relative'] = features[f'{band}_power'] / (total_power + 1e-10)
        # Individual alpha peak frequency from the Welch PSD
        freqs, psd = signal.welch(eeg_data, fs=self.fs, nperseg=self.fs * 2)
        alpha_mask = (freqs >= 8) & (freqs <= 13)
        features['alpha_peak'] = freqs[alpha_mask][np.argmax(psd[:, alpha_mask], axis=1)]
        return features

    def extract_connectivity_features(self, eeg_data):
        """Extract functional connectivity via the phase-locking value (PLV)."""
        n_channels = eeg_data.shape[0]
        # Instantaneous phase of each channel, computed once rather than per pair
        phases = np.angle(signal.hilbert(eeg_data, axis=-1))
        connectivity = np.zeros((n_channels, n_channels))
        for i in range(n_channels):
            for j in range(i + 1, n_channels):
                plv = np.abs(np.mean(np.exp(1j * (phases[i] - phases[j]))))
                connectivity[i, j] = connectivity[j, i] = plv
        return {'plv_connectivity': connectivity}
```
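A quick standalone sanity check of the band-power logic used by the extractor above: a synthetic two-channel recording dominated by a 10 Hz rhythm should show its largest power in the alpha band. All signal parameters here are illustrative, not from the paper.

```python
import numpy as np
from scipy import signal

fs = 500
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
# Two channels of a 10 Hz (alpha-band) oscillation plus a little noise
eeg = np.vstack([np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 10 * t + 0.5)])
eeg += 0.1 * rng.standard_normal(eeg.shape)

bands = {'theta': (4, 8), 'alpha': (8, 13), 'beta': (13, 30)}
powers = {}
for name, (lo, hi) in bands.items():
    sos = signal.butter(4, [lo, hi], btype='band', fs=fs, output='sos')
    filtered = signal.sosfiltfilt(sos, eeg, axis=-1)
    powers[name] = np.mean(filtered ** 2, axis=-1)

# Alpha power should dominate on both channels
assert all(powers['alpha'] > powers['theta'])
assert all(powers['alpha'] > powers['beta'])
```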
### Step 2: Robustness Evaluation
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, ShuffleSplit, cross_val_score

class BiomarkerRobustnessEvaluator:
    """Evaluate biomarker robustness across populations."""

    def __init__(self, biomarker_extractor):
        self.extractor = biomarker_extractor
        self.classifier = RandomForestClassifier(n_estimators=100)

    def evaluate_internal_reliability(self, X, y, subject_ids):
        """Test-retest reliability and split-half consistency."""
        results = {}
        # Test-retest: correlation between repeated recordings of the same
        # subject (a simple proxy; a true ICC would additionally model
        # session and subject variance components)
        test_retest_scores = []
        for subject in np.unique(subject_ids):
            mask = subject_ids == subject
            if np.sum(mask) >= 2:
                # Pearson correlation between the first two sessions' features
                corr = np.corrcoef(X[mask][:2])[0, 1]
                test_retest_scores.append(corr)
        results['test_retest_reliability'] = np.mean(test_retest_scores)
        # Split-half consistency: classification accuracy on random 50/50 splits
        cv = ShuffleSplit(n_splits=10, test_size=0.5)
        results['split_half_accuracy'] = np.mean(
            cross_val_score(self.classifier, X, y, cv=cv))
        return results

    def evaluate_external_reliability(self, X, y, site_ids, platform_ids):
        """Cross-site and cross-platform generalization."""
        logo = LeaveOneGroupOut()
        results = {}
        # Hold out each site, then each platform, in turn
        results['cross_site_accuracy'] = np.mean(cross_val_score(
            self.classifier, X, y, cv=logo.split(X, y, site_ids)))
        results['cross_platform_accuracy'] = np.mean(cross_val_score(
            self.classifier, X, y, cv=logo.split(X, y, platform_ids)))
        return results

    def evaluate_clinical_utility(self, X_train, y_train, X_test, y_test):
        """Diagnostic accuracy on a held-out clinical test set."""
        from sklearn.metrics import precision_recall_fscore_support, roc_auc_score
        self.classifier.fit(X_train, y_train)
        y_pred = self.classifier.predict(X_test)
        y_prob = self.classifier.predict_proba(X_test)[:, 1]
        # Compute precision/recall/F1 once instead of three separate calls
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_test, y_pred, average='binary')
        return {
            'auc_roc': roc_auc_score(y_test, y_prob),
            'accuracy': np.mean(y_pred == y_test),
            'precision': precision,
            'recall': recall,
            'f1': f1,
        }
```
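To see how the leave-one-site-out scheme behaves, here is a self-contained sketch on synthetic data: three "sites" share one diagnostic feature but differ by a nuisance offset on the remaining features. All sample sizes, effect sizes, and seeds are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(42)
X_parts, y_parts, site_parts = [], [], []
for site in range(3):
    labels = rng.integers(0, 2, 40)
    feats = rng.standard_normal((40, 5))
    feats[:, 0] += 2.0 * labels      # diagnostic signal shared by all sites
    feats[:, 1:] += 1.5 * site       # nuisance site offset on other features
    X_parts.append(feats)
    y_parts.append(labels)
    site_parts.append(np.full(40, site))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)
sites = np.concatenate(site_parts)

# Leave-one-site-out: train on two sites, test on the held-out third
clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=LeaveOneGroupOut().split(X, y, sites))
print('per-site accuracy:', np.round(scores, 2))
```

Because the class signal here lives in a feature the site offset does not touch, held-out-site accuracy stays well above chance; shifting the offset onto the diagnostic feature instead would illustrate the "overfitting to training site" pitfall listed below.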
### Step 3: Harmonization Pipeline
```python
class EEGHarmonization:
    """ComBat-style location harmonization for multi-site EEG features.

    This simplified version removes additive site effects (mean shifts);
    full ComBat also rescales site variances with empirical-Bayes shrinkage.
    """

    def fit(self, X, site_ids):
        """Learn each site's offset from the overall feature mean."""
        self.overall_mean = np.mean(X, axis=0)
        self.site_effects = {}
        for site in np.unique(site_ids):
            site_mask = site_ids == site
            self.site_effects[site] = {
                'mean': np.mean(X[site_mask], axis=0) - self.overall_mean,
                'std': np.std(X[site_mask], axis=0),
            }
        return self

    def transform(self, X, site_ids):
        """Subtract the learned site offsets (unseen sites pass through)."""
        X_harmonized = X.copy()
        for site in np.unique(site_ids):
            if site not in self.site_effects:
                continue  # no correction available for sites unseen at fit time
            site_mask = site_ids == site
            X_harmonized[site_mask] -= self.site_effects[site]['mean']
        return X_harmonized
```
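The effect of the location step can be verified in a few lines: after per-site mean-centering, every site's feature mean coincides with the pooled mean. This standalone sketch mirrors the harmonization logic above on synthetic two-site data (offsets and shapes are illustrative).

```python
import numpy as np

rng = np.random.default_rng(1)
# Two sites with different additive offsets (simulated acquisition effects)
site_a = rng.standard_normal((50, 4)) + 2.0
site_b = rng.standard_normal((50, 4)) - 1.0
X = np.vstack([site_a, site_b])
sites = np.array([0] * 50 + [1] * 50)

# Mean-centering per site: the location part of ComBat-style harmonization
overall_mean = X.mean(axis=0)
X_harm = X.copy()
for s in np.unique(sites):
    mask = sites == s
    X_harm[mask] -= X[mask].mean(axis=0) - overall_mean

# After harmonization, both sites share the overall mean
assert np.allclose(X_harm[sites == 0].mean(axis=0), overall_mean)
assert np.allclose(X_harm[sites == 1].mean(axis=0), overall_mean)
```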
## Applications
- **Parkinson's Disease**: Reliable EEG biomarkers for early detection
- **Alzheimer's Disease**: Cross-site validation of diagnostic markers
- **Depression**: Objective biomarkers for treatment monitoring
- **Clinical Trials**: Standardized biomarkers for drug development
## Pitfalls
- **Overfitting to Training Site**: May not generalize to new sites
- **Platform Differences**: Sampling rates, electrode positions vary
- **Population Bias**: Training data may not represent target population
- **Temporal Drift**: Biomarkers may degrade over time
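A related pitfall with the small samples typical of clinical EEG studies is mistaking lucky accuracy for signal (Combrisson & Jerbi, 2015, in the references). A permutation test estimates the empirical chance level directly; this sketch uses an assumed setup of 30 random samples and 50 label permutations, all numbers illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
# Pure noise features and random labels: any accuracy here is chance
X = rng.standard_normal((30, 10))
y = rng.integers(0, 2, 30)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
observed = cross_val_score(clf, X, y, cv=5).mean()

# Permutation null distribution: shuffle labels and re-score
null_scores = []
for _ in range(50):
    y_perm = rng.permutation(y)
    null_scores.append(cross_val_score(clf, X, y_perm, cv=5).mean())
p_value = (1 + np.sum(np.array(null_scores) >= observed)) / (1 + len(null_scores))
print(f"observed accuracy={observed:.2f}, permutation p={p_value:.2f}")
```

A biomarker's cross-validated accuracy should be reported alongside such a permutation p-value rather than compared to a theoretical 50% chance level.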
## Related Skills
- eeg-tinnitus-biomarker-robustness
- eeg-hopfield-emotion-energy
- tms-eeg-biomarkers
- explainable-gnn-eeg-neurological
## References
- Rasmussen et al. (2026) Robust EEG Biomarkers, arXiv:2604.23933
- Fortin et al. (2017) Harmonization of multi-site diffusion tensor imaging data
- Combrisson & Jerbi (2015) Exceeding chance level by chance