---
name: brain-to-speech-prosody-feature-engineering
description: Brain-to-speech synthesis from intracranial EEG using prosody feature engineering and transformer-based reconstruction — decodes speech with natural intonation and rhythm from neural signals. Maps neural activity to speech acoustic features including pitch contour, energy envelope, and timing. Use when: brain-to-speech decoding, intracranial EEG speech synthesis, prosody feature engineering, transformer-based neural decoding, speech reconstruction from brain signals, neural speech synthesis, iEEG acoustic feature mapping, natural speech decoding, speech BCI, neural-to-speech transformation. Activation: brain-to-speech, prosody engineering, speech decoding BCI, iEEG speech synthesis, neural speech reconstruction, transformer speech decoding, intonation decoding, rhythm decoding, neural acoustic mapping, speech brain-computer interface.
version: 1.0.0
metadata:
  hermes:
    tags: [brain-to-speech, prosody, transformer, iEEG, speech-decoding, BCI, acoustic-features, neural-synthesis]
    source_paper: "Brain-to-Speech Synthesis with Prosody Feature Engineering (arXiv:2603.12456)"
    date: 2026-03-18
---
# Brain-to-Speech Synthesis with Prosody Feature Engineering
## Overview
This framework reconstructs natural-sounding speech from intracranial EEG (iEEG) signals by:
1. Decoding prosody features (pitch, energy, rhythm) from neural activity
2. Using transformer-based models for neural-to-acoustic mapping
3. Synthesizing speech with natural intonation and timing
**Source Paper**: Brain-to-Speech Synthesis with Prosody Feature Engineering (arXiv:2603.12456, 2026-03-18)
## Core Architecture
```
┌──────────────────────────────────────────────┐
│ Brain-to-Speech Pipeline │
├──────────────────────────────────────────────┤
│ iEEG Neural Signals │
│ ↓ │
│ Temporal Feature Extraction │
│ ↓ │
│ Transformer Neural Decoder │
│ ↓ │
│ Prosody Feature Prediction: │
│ - Pitch contour (F0) │
│ - Energy envelope │
│ - Timing/rhythm │
│ ↓ │
│ Speech Synthesizer (vocoder) │
│ ↓ │
│ Reconstructed Speech │
└──────────────────────────────────────────────┘
```
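The "Temporal Feature Extraction" stage above amounts to framing the continuous iEEG stream into overlapping windows and summarizing each window per channel. A minimal stdlib-only sketch, assuming RMS amplitude as the summary statistic (the window and hop sizes are illustrative choices, not values from the paper):

```python
import math

def frame_rms(signal, win=10, hop=5):
    """Frame one iEEG channel into overlapping windows and return
    the RMS amplitude of each window.

    signal: list of raw samples for a single channel
    win:    window length in samples (illustrative)
    hop:    step between window starts in samples (illustrative)
    """
    feats = []
    for start in range(0, len(signal) - win + 1, hop):
        window = signal[start:start + win]
        feats.append(math.sqrt(sum(s * s for s in window) / win))
    return feats

# 30 samples with win=10, hop=5 yield (30 - 10) // 5 + 1 = 5 feature frames
feats = frame_rms([1.0] * 30)
```

In practice each frame would carry one such feature per electrode (often band-limited, e.g. high-gamma power), producing the (time, channels) matrix the decoder consumes.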
## Key Innovations
1. **Prosody-aware decoding**: Goes beyond phoneme prediction to capture natural speech rhythm and intonation
2. **Transformer architecture**: Captures long-range temporal dependencies in neural signals
3. **End-to-end training**: Joint optimization of neural decoding and speech synthesis
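One way the joint optimization in point 3 could be expressed is a weighted multi-task regression loss over the three prosody targets. The sketch below uses plain-Python MSE; the weights `w_pitch`, `w_energy`, and `w_dur` are illustrative assumptions, not values from the paper:

```python
def prosody_loss(pred, target, w_pitch=1.0, w_energy=1.0, w_dur=0.5):
    """Weighted multi-task loss over predicted prosody features.

    pred, target: dicts mapping "pitch" / "energy" / "duration"
                  to equal-length lists of per-frame values.
    Weights are illustrative, not taken from the paper.
    """
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

    return (w_pitch * mse(pred["pitch"], target["pitch"])
            + w_energy * mse(pred["energy"], target["energy"])
            + w_dur * mse(pred["duration"], target["duration"]))

target = {"pitch": [1.0, 3.0], "energy": [0.5, 0.5], "duration": [1.0, 1.0]}
pred_exact = dict(target)
loss_zero = prosody_loss(pred_exact, target)       # perfect prediction -> 0.0
pred_off = {**target, "pitch": [1.0, 2.0]}
loss_pitch = prosody_loss(pred_off, target)        # only pitch errs: 1.0 * 0.5
```

Because gradients flow from this loss back through the transformer, the decoding and synthesis targets are optimized jointly rather than in separate stages.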
## Implementation Pattern
```python
import torch
import torch.nn as nn


class ProsodyDecoder(nn.Module):
    """Transformer-based prosody feature decoder from iEEG."""

    def __init__(self, n_channels=256, d_model=512, n_heads=8, n_layers=6):
        super().__init__()
        self.input_proj = nn.Linear(n_channels, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            n_layers,
        )
        # Output heads for different prosody features
        self.pitch_head = nn.Linear(d_model, 1)     # F0 contour
        self.energy_head = nn.Linear(d_model, 1)    # Energy envelope
        self.duration_head = nn.Linear(d_model, 1)  # Timing

    def forward(self, neural_signal):
        """
        Args:
            neural_signal: iEEG data of shape (batch, time, channels)
        Returns:
            pitch, energy, duration predictions, each (batch, time, 1)
        """
        x = self.input_proj(neural_signal)
        x = self.encoder(x)
        pitch = self.pitch_head(x)
        energy = self.energy_head(x)
        duration = self.duration_head(x)
        return pitch, energy, duration
```
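A quick smoke test of this pattern: the self-contained sketch below repeats the same structure with scaled-down dimensions (so it runs in seconds) and pushes a random iEEG-shaped batch through it to confirm the per-frame output shapes. The dimensions are arbitrary test values, not the paper's configuration:

```python
import torch
import torch.nn as nn

class ProsodyDecoder(nn.Module):
    """Scaled-down copy of the decoder above, for a shape smoke test."""
    def __init__(self, n_channels=16, d_model=32, n_heads=4, n_layers=2):
        super().__init__()
        self.input_proj = nn.Linear(n_channels, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            n_layers,
        )
        self.pitch_head = nn.Linear(d_model, 1)
        self.energy_head = nn.Linear(d_model, 1)
        self.duration_head = nn.Linear(d_model, 1)

    def forward(self, x):
        x = self.encoder(self.input_proj(x))
        return self.pitch_head(x), self.energy_head(x), self.duration_head(x)

model = ProsodyDecoder()
ieeg = torch.randn(2, 50, 16)  # (batch, time, channels)
pitch, energy, duration = model(ieeg)
# Each head emits one scalar per time frame: shape (batch, time, 1)
```

Note that each head predicts one value per input frame, so the predicted prosody contours are time-aligned with the neural feature frames before being handed to the vocoder.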
## Applications
- **Speech BCI**: Restore communication for paralyzed patients
- **Voice prosthetics**: Natural-sounding speech for ALS patients
- **Neuroscience research**: Study neural basis of speech prosody
- **Silent speech interfaces**: Decode inner speech to text/audio
## Related Skills
- brain-to-speech-synthesis — Overview of brain-to-speech methods
- neural-decoder-quantum-error-correction — Neural decoding methods
- eeg-ieeg-bridge-bci — EEG to iEEG bridging