---
name: bleg-llm-functions-as-powerful-fmri
description: "BLEG (Brain LLM Enhanced Graph) - using Large Language Models as fMRI graph enhancers for brain network analysis. LLM-augmented GNNs for sparse neurograph learning. Keywords: LLM, fMRI, brain network, GNN, graph enhancement, multimodal fusion."
---
# BLEG: LLM Functions as Powerful fMRI Graph-Enhancer for Brain Network Analysis
> A framework that leverages Large Language Models to enhance Graph Neural Networks for brain network analysis, addressing feature sparsity and knowledge limitations in traditional neurograph approaches.
## Metadata
- **Source**: arXiv:2604.07361
- **Authors**: Rui Dong, Zitong Wang, Jiaxing Li, et al.
- **Published**: 2026-04-01
- **Category**: Computer Vision and Pattern Recognition (cs.CV), Quantitative Methods (q-bio.QM)
## Core Methodology
### Motivation
Graph Neural Networks (GNNs) for brain network analysis face two fundamental challenges:
1. **Feature sparsity**: fMRI-derived node features are high-dimensional but information-sparse
2. **Domain knowledge limitations**: GNNs lack neuroscientific priors about brain organization
Large Language Models (LLMs) offer:
- Rich semantic representations
- Implicit neuroscientific knowledge from training corpora
- Natural language reasoning about brain functions
### BLEG Framework Architecture
#### Stage 1: LLM-Based Graph Enhancement
For each brain region (graph node), BLEG generates enhanced features:
```
Input: ROI name (e.g., "superior temporal gyrus")
LLM Query: "What are the functional characteristics of [ROI]?
What cognitive processes does it support?"
LLM Output: Semantic description → Embedding vector
Enhanced Feature: Concatenate fMRI feature + LLM embedding
```
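The enhancement step above can be sketched in a few lines; `llm_embed_fn` is a hypothetical stand-in for any sentence encoder (a concrete one is built in the Implementation Guide below):

```python
import torch

def enhance_node_feature(fmri_feat, llm_embed_fn, roi_name):
    """Concatenate an ROI's fMRI feature with its LLM-derived embedding."""
    # llm_embed_fn: hypothetical callable mapping a prompt string -> 1-D tensor
    prompt = (f"What are the functional characteristics of {roi_name}? "
              f"What cognitive processes does it support?")
    llm_emb = llm_embed_fn(prompt)
    return torch.cat([fmri_feat, llm_emb], dim=-1)

# Toy run with a zero-vector stand-in embedder
fake_embed = lambda prompt: torch.zeros(8)
enhanced = enhance_node_feature(torch.ones(4), fake_embed, "superior temporal gyrus")
print(enhanced.shape)  # torch.Size([12])
```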
#### Stage 2: Cross-Modal Fusion
Combines fMRI time series and LLM semantics:
```
Node Features:
├── fMRI-derived: [BOLD signal statistics, connectivity patterns]
├── LLM-derived: [Functional ontology, anatomical description]
└── Fused: Cross-attention between modalities
```
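One way to realize the cross-attention fusion sketched above is a standard multi-head attention layer in which fMRI features act as queries over the LLM semantic features. The dimensions and the single-direction design are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """fMRI node features attend over LLM semantic features per ROI."""
    def __init__(self, fmri_dim, llm_dim, hidden_dim, num_heads=4):
        super().__init__()
        self.fmri_proj = nn.Linear(fmri_dim, hidden_dim)
        self.llm_proj = nn.Linear(llm_dim, hidden_dim)
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)

    def forward(self, fmri_feat, llm_feat):
        # Shapes: (n_rois, fmri_dim) and (n_rois, llm_dim)
        q = self.fmri_proj(fmri_feat).unsqueeze(0)   # (1, n_rois, hidden)
        kv = self.llm_proj(llm_feat).unsqueeze(0)    # (1, n_rois, hidden)
        fused, _ = self.attn(q, kv, kv)              # fMRI queries LLM semantics
        return fused.squeeze(0)                      # (n_rois, hidden)

fusion = CrossModalFusion(fmri_dim=16, llm_dim=32, hidden_dim=64)
out = fusion(torch.randn(90, 16), torch.randn(90, 32))
print(out.shape)  # torch.Size([90, 64])
```

A bidirectional variant would add a second attention pass with LLM features as queries.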
#### Stage 3: Knowledge-Guided GNN
LLM-generated knowledge guides the message passing:
```python
# Traditional GNN
h_v = AGGREGATE({h_u for u in N(v)})
# Knowledge-guided GNN
functional_similarity = LLM_similarity(v, u)
h_v = AGGREGATE({h_u * functional_similarity for u in N(v)})
```
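The knowledge-guided aggregation above can be sketched with dense tensors (adjacency and similarity as full matrices; the function name and mean aggregator are assumptions):

```python
import torch

def knowledge_guided_aggregate(h, adj, llm_sim):
    """Mean-aggregate neighbor features, weighting each neighbor u of v
    by the precomputed LLM functional similarity sim(v, u)."""
    # h: (n, d) node features; adj, llm_sim: (n, n)
    weights = adj * llm_sim                          # keep only actual edges
    deg = weights.sum(dim=1, keepdim=True).clamp(min=1e-8)
    return (weights @ h) / deg                       # similarity-weighted mean

# Toy check on a 4-node ring: with sim == 1 this is a plain neighbor mean
h = torch.arange(12.).reshape(4, 3)
adj = torch.tensor([[0, 1, 0, 1],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [1, 0, 1, 0]], dtype=torch.float)
out = knowledge_guided_aggregate(h, adj, torch.ones(4, 4))
print(out[0])  # tensor([6., 7., 8.]) — mean of neighbors 1 and 3
```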
### Key Innovations
#### 1. Neuro-semantic Embeddings
LLMs encode brain regions into semantic space:
- Regions with similar functions cluster together
- Captures hierarchical brain organization
- Provides interpretable features
#### 2. Dynamic Knowledge Injection
At inference time, BLEG can:
- Query LLM for patient-specific explanations
- Generate natural language predictions
- Provide evidence-based reasoning
#### 3. Zero-Shot Transfer
LLM knowledge enables:
- Transfer to unseen brain disorders
- Adaptation to new imaging protocols
- Generalization across populations
## Implementation Guide
### Prerequisites
- Pretrained LLM (GPT-4, Llama, or similar)
- fMRI preprocessing pipeline (FSL, AFNI, or nipype)
- PyTorch Geometric or DGL for GNNs
- Brain parcellation atlas (AAL, Schaefer, or Destrieux)
### Step-by-Step
1. **Prepare fMRI Graphs**
```python
from nilearn.maskers import NiftiLabelsMasker
from nilearn.connectome import ConnectivityMeasure

# Extract the mean BOLD time series of each ROI defined by the atlas
masker = NiftiLabelsMasker(labels_img=atlas)
time_series = masker.fit_transform(fmri_img)  # (timepoints, n_rois)

# Compute the ROI-by-ROI correlation (functional connectivity) matrix
correlation_measure = ConnectivityMeasure(kind='correlation')
connectivity = correlation_measure.fit_transform([time_series])[0]

# Build a graph with ROIs as nodes and correlations as edge weights
G = create_graph_from_connectivity(connectivity, atlas_labels)  # user-defined helper
```
2. **Generate LLM Embeddings**
```python
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "sentence-transformers/all-MiniLM-L6-v2"
model = AutoModel.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def get_llm_embedding(roi_name):
    prompt = f"Brain region: {roi_name}. Function:"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Mean-pool token embeddings into a single vector per ROI
    return outputs.last_hidden_state.mean(dim=1).squeeze(0)

# Stack embeddings for all ROIs: shape (n_rois, llm_dim)
llm_features = torch.stack([get_llm_embedding(roi) for roi in atlas_labels])
```
3. **Create BLEG Model**
```python
import torch
import torch.nn as nn
import torch_geometric.nn as gnn

class BLEG(nn.Module):
    def __init__(self, fmri_dim, llm_dim, hidden_dim, num_classes):
        super().__init__()
        # Cross-modal feature fusion
        self.fusion = nn.Sequential(
            nn.Linear(fmri_dim + llm_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.3),
        )
        # GNN layers
        self.conv1 = gnn.GCNConv(hidden_dim, hidden_dim)
        self.conv2 = gnn.GCNConv(hidden_dim, hidden_dim)
        # Graph-level classifier
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, fmri_feat, llm_feat, edge_index, batch):
        # Concatenate fMRI and LLM node features, then fuse
        x = torch.cat([fmri_feat, llm_feat], dim=-1)
        x = self.fusion(x)
        # Graph convolution
        x = self.conv1(x, edge_index).relu()
        x = self.conv2(x, edge_index)
        # Readout: pool node embeddings into one graph embedding
        x = gnn.global_mean_pool(x, batch)
        return self.classifier(x)
```
4. **Knowledge-Guided Message Passing**
```python
from torch_geometric.nn import MessagePassing

class KnowledgeGuidedGNN(MessagePassing):
    def __init__(self, llm_similarity_matrix):
        super().__init__(aggr='mean')
        # Precomputed (n_rois, n_rois) LLM functional-similarity matrix
        self.register_buffer('llm_sim', llm_similarity_matrix)

    def forward(self, x, edge_index):
        return self.propagate(edge_index, x=x)

    def message(self, x_j, edge_index_i, edge_index_j):
        # Weight each message by the LLM similarity of its source and target
        sim = self.llm_sim[edge_index_i, edge_index_j]
        return x_j * sim.view(-1, 1)
```
### Training Configuration
- Learning rate: 1e-4 with Adam
- Batch size: 8-16 (whole-brain graphs are large)
- Dropout: 0.3-0.5
- Early stopping based on validation AUC
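The configuration above can be sketched as a minimal training loop with Adam at 1e-4 and AUC-based early stopping; the model, data, and validation-AUC callable are caller-supplied assumptions:

```python
import torch

def train_with_early_stopping(model, train_data, val_auc_fn,
                              epochs=50, lr=1e-4, patience=5):
    """Adam at lr=1e-4; stop when validation AUC stops improving."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = torch.nn.CrossEntropyLoss()
    best_auc, stale = 0.0, 0
    for _ in range(epochs):
        model.train()
        for x, y in train_data:              # batches of (features, labels)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
        auc = val_auc_fn(model)              # caller-supplied validation metric
        if auc > best_auc:
            best_auc, stale = auc, 0
        else:
            stale += 1
            if stale >= patience:            # early stopping on validation AUC
                break
    return best_auc

# Toy run: a linear stand-in model and a dummy AUC that plateaus immediately
model = torch.nn.Linear(4, 2)
data = [(torch.randn(8, 4), torch.randint(0, 2, (8,)))]
best = train_with_early_stopping(model, data, lambda m: 0.5, epochs=20, patience=3)
print(best)  # 0.5
```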
## Applications
- **Disease classification**: Alzheimer's, Parkinson's, depression
- **Cognitive state prediction**: Task decoding, mind-wandering detection
- **Brain age estimation**: Biological aging markers
- **Treatment response prediction**: Personalized medicine
## Pitfalls
1. **LLM hallucination**: Generated descriptions may be inaccurate
2. **Atlas dependency**: Performance varies with parcellation choice
3. **Computational cost**: LLM inference for each ROI is expensive
4. **Temporal resolution**: Static LLM embeddings miss dynamic brain states
5. **Interpretability gap**: LLM reasoning may not align with neuroscience
## Related Skills
- brain-graph-neural
- llm-neuroscience-applications
- multimodal-brain-fusion
- functional-connectivity-analysis
## Citation
```bibtex
@article{dong2026bleg,
  title={BLEG: LLM Functions as Powerful fMRI Graph-Enhancer for Brain Network Analysis},
  author={Dong, Rui and Wang, Zitong and Li, Jiaxing and others},
  journal={arXiv preprint arXiv:2604.07361},
  year={2026}
}
```