---
name: brain-to-speech-prosody-feature-engineering
description: Brain-to-speech synthesis from intracranial EEG using prosody feature engineering and transformer-based reconstruction — decodes speech with natural intonation and rhythm from neural signals. Maps neural activity to speech acoustic features including pitch contour, energy envelope, and timing. Use when: brain-to-speech decoding, intracranial EEG speech synthesis, prosody feature engineering, transformer-based neural decoding, speech reconstruction from brain signals, neural speech synthesis, iEEG acoustic feature mapping, natural speech decoding, speech BCI, neural-to-speech transformation. Activation: brain-to-speech, prosody engineering, speech decoding BCI, iEEG speech synthesis, neural speech reconstruction, transformer speech decoding, intonation decoding, rhythm decoding, neural acoustic mapping, speech brain-computer interface.
version: 1.0.0
metadata:
  hermes:
    tags: [brain-to-speech, prosody, transformer, iEEG, speech-decoding, BCI, acoustic-features, neural-synthesis]
    source_paper: "Brain-to-Speech Synthesis with Prosody Feature Engineering (arXiv:2603.12456)"
    date: 2026-03-18
---
# Brain-to-Speech Synthesis with Prosody Feature Engineering
## Overview
This framework reconstructs natural-sounding speech from intracranial EEG (iEEG) signals by:
1. Decoding prosody features (pitch, energy, rhythm) from neural activity
2. Using transformer-based models for neural-to-acoustic mapping
3. Synthesizing speech with natural intonation and timing
**Source Paper**: Brain-to-Speech Synthesis with Prosody Feature Engineering (arXiv:2603.12456, 2026-03-18)
## Core Architecture
```
┌──────────────────────────────────────────────┐
│ Brain-to-Speech Pipeline │
├──────────────────────────────────────────────┤
│ iEEG Neural Signals │
│ ↓ │
│ Temporal Feature Extraction │
│ ↓ │
│ Transformer Neural Decoder │
│ ↓ │
│ Prosody Feature Prediction: │
│ - Pitch contour (F0) │
│ - Energy envelope │
│ - Timing/rhythm │
│ ↓ │
│ Speech Synthesizer (vocoder) │
│ ↓ │
│ Reconstructed Speech │
└──────────────────────────────────────────────┘
```
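The "Temporal Feature Extraction" stage above amounts to framing the continuous iEEG stream into overlapping windows and summarizing each window per channel. A minimal stdlib-only sketch, assuming RMS amplitude as the summary statistic (the window and hop sizes are illustrative choices, not values from the paper):

```python
import math

def frame_rms(signal, win=10, hop=5):
    """Frame one iEEG channel into overlapping windows and return
    the RMS amplitude of each window.

    signal: list of raw samples for a single channel
    win:    window length in samples (illustrative)
    hop:    step between window starts in samples (illustrative)
    """
    feats = []
    for start in range(0, len(signal) - win + 1, hop):
        window = signal[start:start + win]
        feats.append(math.sqrt(sum(s * s for s in window) / win))
    return feats

# 30 samples with win=10, hop=5 yield (30 - 10) // 5 + 1 = 5 feature frames
feats = frame_rms([1.0] * 30)
```

In practice each frame would carry one such feature per electrode (often band-limited, e.g. high-gamma power), producing the (time, channels) matrix the decoder consumes.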
## Key Innovations
1. **Prosody-aware decoding**: Goes beyond phoneme prediction to capture natural speech rhythm and intonation
2. **Transformer architecture**: Captures long-range temporal dependencies in neural signals
3. **End-to-end training**: Joint optimization of neural decoding and speech synthesis
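One way the joint optimization in point 3 could be expressed is a weighted multi-task regression loss over the three prosody targets. The sketch below uses plain-Python MSE; the weights `w_pitch`, `w_energy`, and `w_dur` are illustrative assumptions, not values from the paper:

```python
def prosody_loss(pred, target, w_pitch=1.0, w_energy=1.0, w_dur=0.5):
    """Weighted multi-task loss over predicted prosody features.

    pred, target: dicts mapping "pitch" / "energy" / "duration"
                  to equal-length lists of per-frame values.
    Weights are illustrative, not taken from the paper.
    """
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

    return (w_pitch * mse(pred["pitch"], target["pitch"])
            + w_energy * mse(pred["energy"], target["energy"])
            + w_dur * mse(pred["duration"], target["duration"]))

target = {"pitch": [1.0, 3.0], "energy": [0.5, 0.5], "duration": [1.0, 1.0]}
pred_exact = dict(target)
loss_zero = prosody_loss(pred_exact, target)       # perfect prediction -> 0.0
pred_off = {**target, "pitch": [1.0, 2.0]}
loss_pitch = prosody_loss(pred_off, target)        # only pitch errs: 1.0 * 0.5
```

Because gradients flow from this loss back through the transformer, the decoding and synthesis targets are optimized jointly rather than in separate stages.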
## Implementation Pattern
```python
import torch
import torch.nn as nn


class ProsodyDecoder(nn.Module):
    """Transformer-based prosody feature decoder from iEEG."""

    def __init__(self, n_channels=256, d_model=512, n_heads=8, n_layers=6):
        super().__init__()
        self.input_proj = nn.Linear(n_channels, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            n_layers,
        )
        # Output heads for different prosody features
        self.pitch_head = nn.Linear(d_model, 1)     # F0 contour
        self.energy_head = nn.Linear(d_model, 1)    # Energy envelope
        self.duration_head = nn.Linear(d_model, 1)  # Timing

    def forward(self, neural_signal):
        """
        Args:
            neural_signal: iEEG data of shape (batch, time, channels)
        Returns:
            pitch, energy, duration predictions, each (batch, time, 1)
        """
        x = self.input_proj(neural_signal)
        x = self.encoder(x)
        pitch = self.pitch_head(x)
        energy = self.energy_head(x)
        duration = self.duration_head(x)
        return pitch, energy, duration
```
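A quick smoke test of this pattern: the self-contained sketch below repeats the same structure with scaled-down dimensions (so it runs in seconds) and pushes a random iEEG-shaped batch through it to confirm the per-frame output shapes. The dimensions are arbitrary test values, not the paper's configuration:

```python
import torch
import torch.nn as nn

class ProsodyDecoder(nn.Module):
    """Scaled-down copy of the decoder above, for a shape smoke test."""
    def __init__(self, n_channels=16, d_model=32, n_heads=4, n_layers=2):
        super().__init__()
        self.input_proj = nn.Linear(n_channels, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            n_layers,
        )
        self.pitch_head = nn.Linear(d_model, 1)
        self.energy_head = nn.Linear(d_model, 1)
        self.duration_head = nn.Linear(d_model, 1)

    def forward(self, x):
        x = self.encoder(self.input_proj(x))
        return self.pitch_head(x), self.energy_head(x), self.duration_head(x)

model = ProsodyDecoder()
ieeg = torch.randn(2, 50, 16)  # (batch, time, channels)
pitch, energy, duration = model(ieeg)
# Each head emits one scalar per time frame: shape (batch, time, 1)
```

Note that each head predicts one value per input frame, so the predicted prosody contours are time-aligned with the neural feature frames before being handed to the vocoder.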
## Applications
- **Speech BCI**: Restore communication for paralyzed patients
- **Voice prosthetics**: Natural-sounding speech for ALS patients
- **Neuroscience research**: Study neural basis of speech prosody
- **Silent speech interfaces**: Decode inner speech to text/audio
## Related Skills
- brain-to-speech-synthesis — Overview of brain-to-speech methods
- neural-decoder-quantum-error-correction — Neural decoding methods
- eeg-ieeg-bridge-bci — EEG to iEEG bridging