---
name: cortico-cerebellar-rnn
description: "Cortico-cerebellar modularity as architectural inductive bias for efficient temporal learning. Augments RNNs with cerebellar-inspired feedforward module for faster learning, higher performance, and parameter efficiency. Key finding: cortex functions as fixed reservoir while cerebellum drives learning. Use for bio-inspired RNN architecture design, cortico-cerebellar loop studies, temporal learning optimization, modular heterogeneous neural architectures, reservoir computing with cerebellar modules. Activation: cortico-cerebellar, CB-RNN, cerebellar RNN, fixed reservoir, cerebellar module, cortical-cerebellar loop, heterogeneous modularity, temporal learning efficiency, brain-inspired RNN."
---
# Cortico-Cerebellar RNN (CB-RNN)
## Overview
Methodology based on the finding that **cortico-cerebellar modularity** serves as a powerful architectural inductive bias for efficient temporal learning. It augments a cortex-inspired Elman-type RNN with a cerebellar-inspired feedforward module that provides a learned additive bias signal to the recurrent core.
**Source**: Voce, Giannakakis & Clopath (2026). "Cortico-cerebellar modularity as an architectural inductive bias for efficient temporal learning." arXiv:2605.10356 [q-bio.NC]
## Biological Motivation
| Brain Region | Computational Role | Network Analog |
|---|---|---|
| **Cerebral Cortex** | Recurrent, context-dependent computation over time | Elman RNN core |
| **Cerebellum** | Feedforward expansion-compression, rapid transformation & adaptive correction | Feedforward bias module |
| **Cortico-Cerebellar Loop** | Closed reciprocal interaction for flexible processing | RNN output → CB module → additive bias to RNN |
## CB-RNN Architecture
### Core Design
```
Input x_t → [Cortex RNN] → h_t → Output
                 ↑    │
bias_t (additive)│    │ h_t
                 │    ↓
          [Cerebellar Module]
```
**Cortex (Recurrent Core)**: Elman-type RNN that processes temporal sequences
**Cerebellum (Feedforward Module)**: takes the cortex hidden state, applies a learned expansion-compression transformation, and returns an additive bias to the cortex
### Key Implementation Pattern
```python
import torch
import torch.nn as nn


class CerebellarModule(nn.Module):
    """Cerebellar-inspired feedforward module providing an additive bias to the RNN core."""

    def __init__(self, hidden_dim, expansion_ratio=4):
        super().__init__()
        # Expansion-compression architecture (cerebellar microcircuit analog)
        expanded_dim = hidden_dim * expansion_ratio
        self.expand = nn.Linear(hidden_dim, expanded_dim)
        self.compress = nn.Linear(expanded_dim, hidden_dim)
        self.activation = nn.ReLU()

    def forward(self, h):
        """Transform the cortex state into a corrective bias signal."""
        z = self.activation(self.expand(h))
        return self.compress(z)


class CB_RNN(nn.Module):
    """Cortico-Cerebellar RNN: cortex RNN core + cerebellar feedforward module.

    For the fixed-reservoir regime, freeze the core externally (e.g.
    `model.recurrent.requires_grad_(False)`) rather than via a forward flag.
    """

    def __init__(self, input_dim, hidden_dim, output_dim, expansion_ratio=4):
        super().__init__()
        self.input_proj = nn.Linear(input_dim, hidden_dim)
        self.recurrent = nn.RNNCell(hidden_dim, hidden_dim)
        self.cerebellum = CerebellarModule(hidden_dim, expansion_ratio)
        self.output = nn.Linear(hidden_dim, output_dim)

    def forward(self, x, h0=None):
        """x: (batch, time, input_dim). Returns (outputs, final hidden state)."""
        batch_size = x.size(0)
        if h0 is None:
            h = x.new_zeros(batch_size, self.recurrent.hidden_size)
        else:
            h = h0
        outputs = []
        for t in range(x.size(1)):
            # Cortex input at this timestep
            input_t = self.input_proj(x[:, t])
            # Cerebellar bias computed from the previous cortex state
            bias = self.cerebellum(h)
            # Recurrent update with the cerebellar bias added to the input drive
            h = self.recurrent(input_t + bias, h)
            outputs.append(self.output(h))
        return torch.stack(outputs, dim=1), h
```
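A quick smoke test of the classes above (the dimensions here are illustrative choices, not values from the paper):
```python
import torch

model = CB_RNN(input_dim=8, hidden_dim=64, output_dim=2, expansion_ratio=4)
x = torch.randn(16, 50, 8)   # (batch, time, features)
outputs, h_final = model(x)
print(outputs.shape)         # torch.Size([16, 50, 2])
print(h_final.shape)         # torch.Size([16, 64])
```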
## Key Findings
### 1. Superior Learning Efficiency
- CB-RNN learns **faster** than parameter-matched fully recurrent baselines across temporal tasks
- Reaches **higher maximum performance** on tasks of varying difficulty
- More **parameter-efficient**: better performance per parameter
### 2. Fixed Reservoir Hypothesis
- **Critical finding**: Freezing the recurrent core after minimal training and delegating subsequent learning to the cerebellar module **preserves superior learning efficiency** (see the training sketch after this list)
- This suggests the **cerebellar module is the primary driver of efficiency**
- The **cortical network can largely function as a fixed reservoir** — it provides rich recurrent representations, while the cerebellar module adapts them to the task
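A minimal sketch of that frozen-reservoir regime, assuming the `CB_RNN` class above and an existing `data_loader`; the warm-up length and the choice to also freeze the input projection are our assumptions, not the paper's exact protocol:
```python
import itertools
import torch

model = CB_RNN(input_dim=8, hidden_dim=64, output_dim=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = torch.nn.MSELoss()
WARMUP_STEPS = 500  # hypothetical "minimal training" budget

for step, (x, y) in enumerate(data_loader):  # data_loader assumed to exist
    if step == WARMUP_STEPS:
        # Fix the cortex as a reservoir: stop gradients through the
        # recurrent core (and, as an assumption, the input projection).
        for p in itertools.chain(model.recurrent.parameters(),
                                 model.input_proj.parameters()):
            p.requires_grad_(False)
        # Rebuild the optimizer over the still-trainable parameters
        # (cerebellar module and readout).
        optimizer = torch.optim.Adam(
            (p for p in model.parameters() if p.requires_grad), lr=1e-3)
    outputs, _ = model(x)
    loss = criterion(outputs, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```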
### 3. Heterogeneous Modularity as Inductive Bias
- Different modules with **distinct architectures, functions, and timescales** create structural inductive biases
- Promotes efficient and generalizable learning
- Mitigates interference between learned functions
### 4. Representational Analysis
- **Distinct dimensionality and timescale profiles** across RNN and CB modules
- **Task-dependent division of labor** between cortex and cerebellum
- Post-training ablations confirm that **CB bias actively structures recurrent representations**
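One way to run such an ablation post-training is to zero the cerebellar bias at evaluation time and compare losses. A sketch, monkey-patching the module's forward on placeholder data (a simplification, not the paper's exact ablation protocol):
```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def eval_loss(model, x, y):
    outputs, _ = model(x)
    return F.mse_loss(outputs, y).item()

model = CB_RNN(input_dim=8, hidden_dim=64, output_dim=2)  # assume trained
x, y = torch.randn(16, 50, 8), torch.randn(16, 50, 2)     # placeholder data

with_cb = eval_loss(model, x, y)

# Ablate: replace the cerebellar bias with zeros, leaving the core untouched.
original_forward = model.cerebellum.forward
model.cerebellum.forward = lambda h: torch.zeros_like(h)
without_cb = eval_loss(model, x, y)
model.cerebellum.forward = original_forward  # restore

print(f"loss with CB: {with_cb:.4f}, loss without CB: {without_cb:.4f}")
```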
## Practical Applications
### When to Use CB-RNN
1. **Temporal sequence tasks** where learning speed matters
2. **Resource-constrained settings** needing parameter efficiency
3. **Continual learning** scenarios where freezing the core prevents catastrophic interference
4. **Bio-inspired architecture research** studying cortico-cerebellar interactions
5. **Reservoir computing** variants needing adaptive readout mechanisms
### Architecture Variants
- **GRU-based CB**: Replace the Elman RNN cell with a GRU cell (see paper Appendix C; sketch after this list)
- **Variable CB expansion ratio**: Larger ratios increase cerebellar capacity but add parameters
- **Multi-task setup**: Fixed cortex + task-specific cerebellar modules
- **Ablation studies**: Remove cerebellar module to verify its contribution (see paper Appendix H)
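A sketch of the GRU variant: only the recurrent cell changes, and the cerebellar module plus additive-bias wiring are inherited unchanged (the subclassing shorthand is ours, not the paper's code):
```python
import torch.nn as nn

class CB_GRU(CB_RNN):
    """CB-RNN with a gated (GRU) recurrent core in place of the Elman cell."""

    def __init__(self, input_dim, hidden_dim, output_dim, expansion_ratio=4):
        super().__init__(input_dim, hidden_dim, output_dim, expansion_ratio)
        # Swap the Elman cell for a GRU cell; forward() is inherited and
        # works because both cells share the (input, hidden) -> hidden API.
        self.recurrent = nn.GRUCell(hidden_dim, hidden_dim)
```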
## Experimental Setup Reference
From the paper's methodology:
- **Tasks**: Temporal processing tasks of varying difficulty (see paper Appendix F)
- **Parameter matching**: CB-RNN and baseline RNN matched on total parameter count (see Appendix B)
- **Training regimes**: Standard end-to-end, frozen cortex, ablation conditions
- **Analysis tools**: PCA for population dimensionality (see Appendix G)
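A minimal dimensionality readout, assuming collected hidden states. This sketch uses the participation ratio computed from PCA eigenvalues via scikit-learn; whether this matches the paper's exact Appendix G metric is an assumption:
```python
import numpy as np
from sklearn.decomposition import PCA

def participation_ratio(states):
    """Effective dimensionality: (sum of eigenvalues)^2 / sum of squared eigenvalues.
    `states`: (samples, units) array of hidden activations."""
    lam = PCA().fit(states).explained_variance_
    return lam.sum() ** 2 / (lam ** 2).sum()

# Placeholder activations; in practice, stack h_t (cortex) and the cerebellar
# expansion-layer activity collected from a trained model.
cortex_states = np.random.randn(1000, 64)
cb_states = np.random.randn(1000, 256)
print(f"cortex PR:     {participation_ratio(cortex_states):.1f}")
print(f"cerebellum PR: {participation_ratio(cb_states):.1f}")
```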
## Verification Steps
1. **Parameter matching**: Ensure the CB-RNN and baseline have equal total parameter counts (see the sketch after this list)
2. **Learning speed**: Compare loss curves — CB-RNN should converge faster
3. **Fixed reservoir test**: Freeze RNN weights after N steps, train only CB module — should maintain advantage
4. **Ablation**: Remove CB module — performance should drop to baseline RNN level
5. **Dimensionality analysis**: PCA of hidden states should show distinct profiles for cortex vs cerebellum
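Step 1 can be checked directly. A sketch assuming the `CB_RNN` above: the baseline's hidden size is grown until its total parameter count matches (this matching rule is ours, not necessarily the paper's Appendix B procedure):
```python
import torch.nn as nn

def n_params(module):
    return sum(p.numel() for p in module.parameters())

cb_model = CB_RNN(input_dim=8, hidden_dim=64, output_dim=2)
target = n_params(cb_model)

hidden = 64
while True:
    baseline_total = (n_params(nn.RNN(8, hidden, batch_first=True))
                      + n_params(nn.Linear(hidden, 2)))
    if baseline_total >= target:
        break
    hidden += 1
print(f"CB-RNN: {target} params; baseline RNN (hidden={hidden}): {baseline_total}")
```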
## Pitfalls
- **Expansion ratio too high**: Cerebellar module becomes dominant, losing modularity benefits
- **No parameter matching**: Comparing unequal parameter counts invalidates efficiency claims
- **Task too simple**: Advantages manifest on temporally complex tasks, not trivial ones
- **Over-freezing**: Freezing cortex too early (before minimal representations form) degrades performance
- **Missing ablation**: Without ablation studies, cannot attribute benefits to cerebellar module specifically
## Related Skills
- `spiking-neural-network-analysis` — For SNN analysis methods
- `heterogeneous-synaptic-dynamics` — For synaptic heterogeneity modeling
- `brain-inspired-intelligence-paradigm` — For broader brain-inspired design
- `working-memory-heterogeneous-delays` — For working memory in RNNs
- `nonlinear-rnn-fixed-connectivity-solution` — For analytical RNN solutions
## References
- Voce, A., Giannakakis, E., & Clopath, C. (2026). "Cortico-cerebellar modularity as an architectural inductive bias for efficient temporal learning." arXiv:2605.10356 [q-bio.NC]
- Goyal, A., & Bengio, Y. (2022). "Inductive biases for deep learning of higher-level cognition."
- Marr, D. (1969). "A theory of cerebellar cortex."
- Cayco-Gajic, N. A., & Silver, R. A. (2019). "Re-evaluating circuit mechanisms underlying pattern separation."