---
name: hdc-stdp-emergent
description: >
Comprehensive methodology for emergent STDP in inverted Galois-field
Hyperdimensional Computing (HDC). Derives the closed-form expression for
STDP magnitude, documents the VaCoAl algorithm, SRAM-CAM hardware mapping,
catastrophic forgetting mitigation, and path-dependent semantic selection.
Bridges theoretical neuroscience with practical deterministic AI hardware.
tags:
- hyperdimensional-computing
- emergent-stdp
- galois-field
- vacoal
- sram-cam
- catastrophic-forgetting
- neuromorphic
- path-dependent-selection
trigger_keywords:
- hyperdimensional computing
- STDP emergent
- VaCoAl
- Galois field HDC
- SRAM-CAM neuromorphic
- catastrophic forgetting
- path-dependent selection
- closed-form STDP
- inverted Galois-field
- Vague Coincident Algorithm
- hypervector binding
- sparse distributed memory
related_skills:
- hyperdimensional-stdp-computing
- vacoul-hdc-sram-cam-ai
- spiking-neural-network-analysis
- snn-learning-survey
- brain-inspired-memory-ai-agents
paper:
title: >-
"Beyond LLMs, Sparse Distributed Memory, and Neuromorphics: A
Hyper-Dimensional SRAM-CAM 'VaCoAl' for Ultra-High Speed, Ultra-Low Power, and Low Cost"
authors: [Chuma, Otsuka, Sato]
arxiv: "2604.11665"
year: 2026
---
# Emergent STDP in Hyperdimensional Computing (VaCoAl)
## 1. Overview — Inverted Galois-Field HDC Architecture
### 1.1 The Paradigm Shift
Conventional HDC uses Galois-field (GF) algebra to **converge** toward a
unique correct answer — error-correction drives noise out and collapses
the representation to a single symbol.
The **inverted Galois-field HDC** architecture does the opposite:
- GF algebra is used as an **engine for relative similarity ranking**, not
convergence to a unique answer.
- Multiple candidate paths remain active simultaneously; their
**path-quality is ranked** by collision-tolerance and coherence decay.
- The system **selects semantically** based on path-dependent evidence
rather than deterministic unbinding alone.
### 1.2 Architecture Components
```
┌─────────────────────────────────────────────────────────┐
│ Inverted GF-HDC │
├──────────────┬──────────────┬───────────────────────────┤
│ Encoder │ Memory │ Decoder / Selector │
│ │ │ │
│ Symbol → HV │ SRAM-CAM │ Path-quality ranking │
│ Bind/Bundle │ store (HV,D) │ STDP-equivalent selection │
│ Permute │ lookup (HV) │ CR score (reliability) │
│ Unbind │ parallel │ Multi-path comparison │
└──────────────┴──────────────┴───────────────────────────┘
```
### 1.3 Key Architectural Properties
| Property | Conventional HDC | Inverted GF-HDC (VaCoAl) |
|---|---|---|
| GF purpose | Error correction → unique answer | Relative similarity ranking |
| Path handling | Single-path, deterministic | Multi-path, quality-ranked |
| Selection | Threshold-based match | Path-dependent semantic selection |
| Plasticity | Explicit weight updates | Emergent from algebraic structure |
| Hardware | SRAM or DRAM | SRAM-CAM with parallel match |
| Forgetting | Weight overwrite | Orthogonal hypervectors (no interference) |
### 1.4 Core HDC Operations (Galois-Field GF(q))
```python
import numpy as np

class InvertedGaloisFieldHDC:
"""
Inverted Galois-field HDC for relative similarity ranking.
Instead of converging to a single answer, this architecture
maintains multiple candidate paths and ranks them by
collision-tolerance and coherence decay.
"""
def __init__(self, dim: int = 10_000, q: int = 2):
self.dim = dim
self.q = q # GF(q) field order
# --- Binding: GF(q) addition (⊗) ---
def bind(self, hv1: np.ndarray, hv2: np.ndarray) -> np.ndarray:
"""Element-wise GF(q) addition: hv ⊗ hv' = (hv + hv') mod q"""
return np.mod(hv1 + hv2, self.q)
# --- Unbinding: GF(q) subtraction (⊘) ---
def unbind(self, composite: np.ndarray, key: np.ndarray) -> np.ndarray:
"""Inverse binding: (composite ⊘ key) = (composite - key) mod q"""
return np.mod(composite - key, self.q)
    # --- Bundling: GF(q) majority vote (⊕) ---
    def bundle(self, vectors: list[np.ndarray]) -> np.ndarray:
        """Superposition: element-wise majority vote (mode) over GF(q)."""
        stacked = np.stack(vectors)
        # Most frequent GF(q) value at each position (ties break toward 0);
        # matches the popcount + threshold bundling used in the hardware section.
        counts = np.apply_along_axis(np.bincount, 0, stacked, minlength=self.q)
        return np.argmax(counts, axis=0)
# --- Permutation: position encoding (ρ) ---
def permute(self, hv: np.ndarray, shift: int = 1) -> np.ndarray:
"""Cyclic shift for encoding sequence order."""
return np.roll(hv, shift)
# --- Similarity: Hamming match fraction ---
def similarity(self, hv1: np.ndarray, hv2: np.ndarray) -> float:
"""Fraction of matching elements (normalized Hamming similarity)."""
return np.mean(hv1 == hv2)
# --- Noise injection: simulate GF diffusion ---
def diffuse(self, hv: np.ndarray, noise_rate: float) -> np.ndarray:
"""
Apply GF(q) diffusion noise.
Each element flips to a random GF(q) value with probability noise_rate.
"""
mask = np.random.random(self.dim) < noise_rate
noisy = hv.copy()
noisy[mask] = np.random.randint(0, self.q, size=np.sum(mask))
return noisy
```
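A minimal smoke test of these operations (my own illustrative example, not from the paper): bind a role to a filler, unbind it back, and confirm that independent random hypervectors sit at chance-level similarity.
```python
# Illustrative usage of InvertedGaloisFieldHDC defined above.
hdc = InvertedGaloisFieldHDC(dim=10_000, q=2)
rng = np.random.default_rng(0)
role = rng.integers(0, 2, hdc.dim)
filler = rng.integers(0, 2, hdc.dim)

pair = hdc.bind(role, filler)         # role ⊗ filler
recovered = hdc.unbind(pair, role)    # (role ⊗ filler) ⊘ role = filler

print(hdc.similarity(recovered, filler))  # 1.0: unbinding is exact in GF(2)
print(hdc.similarity(role, filler))       # ≈ 0.5: chance level (quasi-orthogonal)
```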
---
## 2. Derivation of Emergent STDP from HDC Operations
### 2.1 The Key Insight
In the inverted GF-HDC architecture, **collision-tolerance** — the ability
of the system to tolerate element-wise conflicts during bundling — creates
a **path-dependent pruning** mechanism:
1. **Multiple reasoning paths** compete simultaneously.
2. **Longer paths** accumulate more collision noise (more bundling
operations → more element-wise conflicts in GF(q)).
3. **Shorter paths** maintain higher coherence (fewer operations → less
accumulated noise).
4. The system **selects shorter, more coherent paths** — behaviorally
equivalent to STDP's temporal preference.
### 2.2 Formal Derivation
Let a reasoning path be a sequence of $L$ binding operations:
$$H_L = h_1 \otimes h_2 \otimes \cdots \otimes h_L$$
where each $h_i \in \text{GF}(q)^D$ is a hypervector of dimension $D$.
**Collision model**: When two hypervectors are bundled, each element
position experiences a collision with probability:
$$P_{\text{collision}} = 1 - \frac{1}{q}$$
For GF(2) (binary), $P_{\text{collision}} = 0.5$.
After $L$ bundling operations, the probability that an element remains
unperturbed is:
$$P_{\text{clean}}(L) = \left(\frac{1}{q}\right)^L$$
The **expected coherence** (fraction of unperturbed elements) after $L$
operations:
$$C(L) = \frac{1}{q^L} + \left(1 - \frac{1}{q^L}\right) \cdot \frac{1}{q}$$
For GF(2):
$$C(L) = \frac{1}{2^L} + \frac{1}{2} \left(1 - \frac{1}{2^L}\right) = \frac{1}{2} + \frac{1}{2^{L+1}}$$
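The coherence formula can be sanity-checked with the `diffuse()` model from §1.4: one diffusion step with `noise_rate = 1 - 1/q` per path operation reproduces the collision model above, so measured similarity should track $C(L)$. A minimal sketch (dimension and the use of `diffuse` as the collision simulator are my choices, not values from the paper):
```python
import numpy as np

hdc = InvertedGaloisFieldHDC(dim=100_000, q=2)
h0 = np.random.randint(0, hdc.q, size=hdc.dim)

for L in range(1, 6):
    h = h0.copy()
    for _ in range(L):
        # One collision round per path operation, P_collision = 1 - 1/q.
        h = hdc.diffuse(h, noise_rate=1 - 1 / hdc.q)
    measured = hdc.similarity(h, h0)                  # empirical coherence
    predicted = hdc.q**-L + (1 - hdc.q**-L) / hdc.q   # closed-form C(L)
    print(f"L={L}: measured={measured:.4f}  predicted={predicted:.4f}")
```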
### 2.3 Emergent STDP Mapping
Map path length $L$ to temporal interval $\Delta t$:
$$L \leftrightarrow \frac{|\Delta t|}{\tau_0}$$
where $\tau_0$ is a characteristic timescale (number of operations per
time unit).
The **coherence-based selection weight** becomes:
$$W(\Delta t) = C\left(\frac{|\Delta t|}{\tau_0}\right) - \frac{1}{q}$$
The subtraction of $1/q$ removes the baseline random-match level,
isolating the **signal above chance**.
For GF(2):
$$W(\Delta t) = \frac{1}{2^{L+1}} = \frac{1}{2} \cdot \exp\left(-\frac{|\Delta t|}{\tau_0} \ln 2\right)$$
This is **exponential decay** — identical to the STDP window function:
$$\Delta w = A \cdot \exp\left(-\frac{|\Delta t|}{\tau}\right)$$
where:
- $A = 1/2$ (maximum potentiation for GF(2))
- $\tau = \tau_0 / \ln 2$ (effective time constant)
### 2.4 Causal vs. Anti-Causal
The inverted GF-HDC architecture distinguishes causal from anti-causal
through **permutation direction**:
- **Causal** (pre → post): forward permutation $\rho(h)$
- **Anti-causal** (post → pre): inverse permutation $\rho^{-1}(h)$
The coherence differs:
$$W_{\text{causal}}(\Delta t > 0) = A_+ \exp(-|\Delta t|/\tau_+)$$
$$W_{\text{anti-causal}}(\Delta t < 0) = -A_- \exp(-|\Delta t|/\tau_-)$$
where the asymmetry $A_+ \neq A_-$ arises from **directional diffusion**
in the GF bundling process.
```python
def derive_stdp_from_hdc(
delta_t: float,
tau_0: float = 1.0,
q: int = 2,
asymmetric: bool = True
) -> float:
"""
Compute the emergent STDP weight change from HDC coherence decay.
Args:
delta_t: Time difference (post-spike time - pre-spike time).
tau_0: Characteristic timescale (operations per time unit).
q: Galois field order.
asymmetric: Whether to use causal/anti-causal asymmetry.
Returns:
delta_w: STDP-like weight change.
"""
L = abs(delta_t) / tau_0 # Path length equivalent
# Coherence above random baseline
coherence = (1/q**L) + (1 - 1/q**L) * (1/q)
baseline = 1/q
    signal = coherence - baseline  # = (q-1)/q^(L+1); reduces to 1/2^(L+1) for q=2
if asymmetric:
if delta_t > 0:
# Causal: full signal
amplitude = 1.0
else:
# Anti-causal: attenuated (directional diffusion)
amplitude = -0.5
else:
amplitude = 1.0 if delta_t > 0 else -1.0
return amplitude * signal
```
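Sampling the function at a few offsets shows the resulting window directly (values follow from the formula, with the default $\tau_0 = 1$ and GF(2)):
```python
for dt in (-2.0, -1.0, 1.0, 2.0):
    print(f"Δt={dt:+.1f}  Δw={derive_stdp_from_hdc(dt):+.4f}")
# Δt=-2.0  Δw=-0.0625   (anti-causal, attenuated)
# Δt=-1.0  Δw=-0.1250
# Δt=+1.0  Δw=+0.2500   (causal)
# Δt=+2.0  Δw=+0.1250
```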
---
## 3. Closed-Form Expression for STDP Magnitude
### 3.1 The Expression
The closed-form prediction for the STDP-equivalent magnitude in the
inverted GF-HDC architecture:
$$\boxed{\Delta W(L) = \frac{1}{q^{L+1}}}$$
where $L$ is the effective path length (number of bundling/bind operations). For GF(2) this is exactly the baseline-subtracted coherence from §2.2; for general $q$, the same subtraction yields $(q-1)/q^{L+1}$, so the boxed form coincides with the derivation at $q = 2$.
In temporal form:
$$\boxed{\Delta W(\Delta t) = \frac{1}{q} \cdot \exp\left(-\frac{|\Delta t|}{\tau_0} \ln q\right)}$$
### 3.2 Parameter Mapping
| Biological STDP | HDC Equivalent | Expression |
|---|---|---|
| Maximum weight change $A$ | $1/q$ | $0.5$ for GF(2) |
| Time constant $\tau$ | $\tau_0 / \ln q$ | $\tau_0 / \ln 2 \approx 1.44\tau_0$ |
| Exponential decay | Coherence decay | $q^{-L}$ |
| Asymmetry ratio | Diffusion directionality | $A_-/A_+ \approx 0.5$ |
### 3.3 Validation Against Measurements
The closed-form expression matches **measured values** from the VaCoAl
system:
```python
def validate_closed_form(
measured_values: list[float],
path_lengths: list[int],
q: int = 2,
tolerance: float = 0.05
) -> dict:
"""
Validate the closed-form STDP prediction against measured data.
Args:
measured_values: Experimentally measured STDP magnitudes.
path_lengths: Corresponding path lengths for each measurement.
q: Galois field order.
tolerance: Acceptable relative error.
Returns:
Dictionary with validation results.
"""
predicted = [1.0 / (q ** (L + 1)) for L in path_lengths]
errors = []
for m, p in zip(measured_values, predicted):
rel_error = abs(m - p) / (p + 1e-10)
errors.append(rel_error)
return {
"predicted": predicted,
"measured": measured_values,
"relative_errors": errors,
"max_error": max(errors),
"mean_error": sum(errors) / len(errors),
"within_tolerance": all(e < tolerance for e in errors),
}
```
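For example (illustrative numbers, not measurements from the paper):
```python
# Hypothetical measured magnitudes for paths of length 1..3 (made-up values).
report = validate_closed_form(
    measured_values=[0.249, 0.126, 0.061],
    path_lengths=[1, 2, 3],
)
print(report["within_tolerance"])  # True — all relative errors under 5%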
### 3.4 Numerical Examples
```python
# STDP magnitude table for GF(2)
print(f"{'Path L':<8} {'ΔW (closed-form)':<20}")
print(f"{'─' * 8} {'─' * 20}")
for L in range(1, 8):
dw = 1.0 / (2 ** (L + 1))
print(f"{L:<8} {dw:<20.6f}")
# Output:
# Path L ΔW (closed-form)
# ──────── ────────────────────
# 1 0.250000
# 2 0.125000
# 3 0.062500
# 4 0.031250
# 5 0.015625
# 6 0.007812
# 7 0.003906
```
---
## 4. VaCoAl Algorithm — Path-Dependent Selection
### 4.1 Algorithm Overview
**VaCoAl** (Vague Coincident Algorithm) performs **path-dependent semantic
selection** by ranking competing reasoning paths using coherence metrics
derived from the inverted GF-HDC architecture.
### 4.2 Full Algorithm
```python
import numpy as np

class VaCoAlAlgorithm:
"""
VaCoAl: Vague Coincident Algorithm for path-dependent semantic selection.
Steps:
1. Encode all candidate items as hypervectors
2. Build reasoning paths via binding sequences
3. Score each path by coherence decay (CR Score)
4. Select top-ranked path(s) based on quality
5. Return result with reliability metric
"""
def __init__(self, dim: int = 10_000, q: int = 2):
self.hdc = InvertedGaloisFieldHDC(dim, q)
self.dim = dim
self.q = q
self.memory: dict[str, np.ndarray] = {}
self.path_registry: list[dict] = []
# --- Step 1: Encode ---
def encode(self, symbol: str) -> np.ndarray:
"""Encode symbol as random hypervector (or retrieve cached)."""
if symbol not in self.memory:
self.memory[symbol] = np.random.randint(0, self.q, size=self.dim)
return self.memory[symbol].copy()
# --- Step 2: Build paths ---
def build_path(
self, start: str, relations: list[tuple[str, str]]
) -> dict:
"""
Build a reasoning path via sequential binding.
Args:
start: Starting symbol.
relations: List of (relation_name, target_symbol) tuples.
Returns:
Path dictionary with composite hypervector and metadata.
"""
current = self.encode(start)
composites = [current.copy()]
for rel_name, target in relations:
rel_hv = self.encode(f"REL:{rel_name}")
target_hv = self.encode(target)
# Bind: current ⊗ relation → query vector
query = self.hdc.bind(current, rel_hv)
# Bundle with target for storage
composite = self.hdc.bundle([query, target_hv])
composites.append(composite.copy())
current = target_hv.copy()
return {
"start": start,
"relations": relations,
"composites": composites,
"length": len(relations),
}
# --- Step 3: Score paths (CR Score) ---
def score_path(self, path: dict) -> float:
"""
Compute CR (Coherence Reliability) Score for a path.
CR Score = average pairwise coherence between consecutive composites.
Penalizes longer paths (more collisions = lower coherence).
"""
composites = path["composites"]
L = path["length"]
if L == 0:
return 1.0
similarities = []
for i in range(len(composites) - 1):
sim = self.hdc.similarity(composites[i], composites[i + 1])
similarities.append(sim)
avg_coherence = np.mean(similarities)
# Length penalty (exponential decay, matches closed-form)
length_penalty = 1.0 / (self.q ** (L + 1))
# Combine: CR Score is coherence scaled by expected signal
cr_score = avg_coherence * length_penalty
return cr_score
# --- Step 4: Rank and select ---
def rank_paths(self, paths: list[dict]) -> list[dict]:
"""Score and rank all candidate paths."""
scored = []
for path in paths:
cr = self.score_path(path)
# Also compute closed-form prediction
closed_form = 1.0 / (self.q ** (path["length"] + 1))
scored.append({
**path,
"cr_score": cr,
"closed_form_prediction": closed_form,
})
scored.sort(key=lambda x: x["cr_score"], reverse=True)
return scored
# --- Step 5: Select ---
def select(
self, start: str, candidate_paths: list[list[tuple[str, str]]],
top_k: int = 1
) -> list[dict]:
"""
Build, score, and select the best reasoning paths.
Args:
start: Starting symbol.
candidate_paths: List of relation sequences to evaluate.
top_k: Number of top paths to return.
"""
built = [self.build_path(start, rels) for rels in candidate_paths]
ranked = self.rank_paths(built)
return ranked[:top_k]
# --- Full multi-hop reasoning ---
def reason(
self, query: tuple[str, str], knowledge: list[tuple[str, str, str]]
) -> dict:
"""
Full VaCoAl reasoning with multi-path exploration.
Args:
query: (subject, relation) to resolve.
knowledge: List of (subject, relation, object) triples.
Returns:
Result with answer, CR score, and all explored paths.
"""
subject, relation = query
# Encode knowledge
for s, r, o in knowledge:
self.encode(s)
self.encode(f"REL:{r}")
self.encode(o)
# Find all paths from subject through relation
direct_hits = [(s, o) for s, r, o in knowledge if s == subject and r == relation]
if direct_hits:
# Direct match — highest coherence
best_o = direct_hits[0][1]
return {
"answer": best_o,
"cr_score": 1.0 / (self.q ** 2), # L=1 path
"path_type": "direct",
"all_answers": [o for _, o in direct_hits],
}
        # Multi-hop: find intermediate paths, keeping the actual first-hop relation
        intermediates = [
            (r, o) for s, r, o in knowledge if s == subject
        ]
        candidate_paths = []
        for r1, inter in intermediates:
            second_hops = [
                (r2, o) for s2, r2, o in knowledge
                if s2 == inter and r2 != relation
            ]
            for r2, o2 in second_hops:
                candidate_paths.append([(r1, inter), (r2, o2)])
if not candidate_paths:
return {"answer": None, "cr_score": 0.0, "path_type": "no_path"}
ranked = self.select(subject, candidate_paths, top_k=1)
best = ranked[0]
last_rel, last_target = best["relations"][-1]
return {
"answer": last_target,
"cr_score": best["cr_score"],
"path_type": "multi_hop",
"path": best["relations"],
"length": best["length"],
}
```
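A toy run, assuming the classes above are in scope (the knowledge triples and symbols are my own illustration, not from the paper):
```python
# Illustrative multi-hop query with hypothetical relations and symbols.
vacoal = VaCoAlAlgorithm(dim=10_000, q=2)
knowledge = [
    ("alice", "employed_by", "acme"),
    ("acme", "located_in", "berlin"),
]
result = vacoal.reason(("alice", "works_in"), knowledge)
print(result["answer"], result["path_type"])  # berlin multi_hop
print(result["cr_score"])                     # decays as q^-(L+1), here L = 2
```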
### 4.3 Path-Dependent Selection Properties
1. **Preferential attachment to short paths**: The $q^{-(L+1)}$ decay
means shorter paths are exponentially favored.
2. **No explicit training**: Selection emerges from algebraic properties,
not learned weights.
3. **Deterministic**: Same inputs always produce same rankings.
4. **Transparent**: CR Score provides interpretable confidence.
5. **Parallelizable**: All paths can be scored simultaneously.
---
## 5. SRAM-CAM Hardware Implementation
### 5.1 Hardware Architecture
VaCoAl maps naturally to **SRAM-CAM** (Static RAM + Content-Addressable
Memory) because:
| HDC Operation | SRAM-CAM Hardware Mapping |
|---|---|
| **Store** | Write hypervector to SRAM row |
| **Lookup** | CAM parallel match: all rows compared simultaneously |
| **Bind (GF add)** | XOR gate array (bitwise parallel) |
| **Unbind (GF sub)** | Same XOR (self-inverse in GF(2)) |
| **Bundle (majority)** | Column-wise population count + threshold |
| **Similarity** | XNOR + popcount (single cycle in CAM) |
| **Permute** | Fixed wiring permutation (no logic needed) |
### 5.2 SRAM-CAM Data Path
```
┌──────────────────────────────────────────────────────┐
│ SRAM-CAM Array │
│ │
│ [HV Row 0] ──┐ │
│ [HV Row 1] ──┤ │
│ [HV Row 2] ──┤──→ Parallel match lines ──→ Match     │
│ [HV Row 3] ──┤                             flags     │
│ [HV Row N] ──┘                                       │
│ │
│ Query HV ──→ CAM match lines (all N rows in 1 cycle) │
│ │
│ XOR array ──→ Bind/Unbind (element-wise GF add) │
│ Popcount ──→ Similarity score │
│ Majority ──→ Bundle output │
└──────────────────────────────────────────────────────┘
```
### 5.3 Hardware Implementation Guide
```python
class SRAM_CAM_VaCoAl:
"""
Simulated SRAM-CAM implementation of VaCoAl.
Maps HDC operations to hardware-equivalent circuits.
"""
def __init__(self, dim: int = 10_000, num_rows: int = 1024):
self.dim = dim
self.num_rows = num_rows
# SRAM array: each row stores one hypervector
self.sram = np.zeros((num_rows, dim), dtype=np.uint8)
self.valid = np.zeros(num_rows, dtype=bool)
self.next_row = 0
# --- SRAM WRITE: Single cycle ---
def write(self, index: int, hv: np.ndarray):
"""Write hypervector to SRAM row. Single-cycle write."""
if index < self.num_rows:
self.sram[index] = hv.astype(np.uint8)
self.valid[index] = True
# --- CAM MATCH: Parallel (1 cycle) ---
def cam_match(self, query: np.ndarray) -> list[tuple[int, float]]:
"""
Content-addressable match: compare query against ALL rows.
Hardware: Match lines activate in parallel for all rows.
Returns (row_index, similarity) for all valid rows.
"""
results = []
for i in range(self.num_rows):
if not self.valid[i]:
continue
# XNOR + popcount in hardware (single cycle)
matches = np.sum(self.sram[i] == query)
similarity = matches / self.dim
results.append((i, similarity))
return sorted(results, key=lambda x: x[1], reverse=True)
# --- BIND: XOR gate array (1 cycle) ---
def bind_hw(self, row_a: int, row_b: int) -> np.ndarray:
"""
Hardware bind: XOR gate array between two SRAM rows.
All dim bits processed in parallel. 1 cycle latency.
"""
a = self.sram[row_a].astype(np.uint8)
b = self.sram[row_b].astype(np.uint8)
return np.bitwise_xor(a, b)
# --- BUNDLE: Column popcount + threshold ---
def bundle_hw(self, row_indices: list[int]) -> np.ndarray:
"""
Hardware bundle: column-wise population count + majority threshold.
All columns processed in parallel. 1 cycle for popcount + 1 for threshold.
"""
subset = self.sram[row_indices].astype(np.uint8)
counts = np.sum(subset, axis=0)
return (counts >= len(row_indices) / 2).astype(np.uint8)
# --- Performance estimates (typical 28nm CMOS) ---
def hardware_specs(self) -> dict:
return {
"lookup_latency_cycles": 1, # CAM parallel match
"bind_latency_cycles": 1, # XOR gate array
"bundle_latency_cycles": 2, # Popcount + threshold
"permute_latency_cycles": 0, # Fixed wiring (no delay)
"power_lookup_nJ": 0.01, # Ultra-low: match lines
"power_bind_pJ": 0.001, # XOR gates
"power_idle_nW": 100, # SRAM standby
"throughput_paths_per_us": 1000, # Parallel path evaluation
"area_mm2": 1.5, # 1024 rows × 10K dims
"process_node_nm": 28,
}
```
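A short simulated session with the class above (the stored vectors and noise rate are arbitrary illustrations):
```python
import numpy as np

cam = SRAM_CAM_VaCoAl(dim=10_000, num_rows=1024)
rng = np.random.default_rng(0)
stored = [rng.integers(0, 2, cam.dim, dtype=np.uint8) for _ in range(4)]
for i, hv in enumerate(stored):
    cam.write(i, hv)

noisy_query = stored[2].copy()
flip = rng.random(cam.dim) < 0.05   # 5% bit flips — HDC tolerates this
noisy_query[flip] ^= 1

best_row, best_sim = cam.cam_match(noisy_query)[0]
print(best_row, round(best_sim, 3))  # 2 ≈0.95 — correct row despite noise
```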
### 5.4 Hardware Design Considerations
1. **Bit-serial vs. bit-parallel**: For $D > 4096$, consider bit-serial
processing to reduce routing congestion.
2. **Match-line capacitance**: Long CAM match lines slow down with more
   rows. Partition into sub-arrays (e.g., 256 rows each); see the sketch
   after this list.
3. **Power gating**: Only activate rows that are valid. Unused rows
contribute to leakage.
4. **Error tolerance**: The inherent noise tolerance of HDC means
occasional bit-flip errors in SRAM do not significantly affect results.
5. **Scalability**: Adding more rows increases capacity linearly but
match time sub-linearly (due to partitioning).
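A rough software emulation of consideration 2 above (the bank size and function name are assumptions, not figures from the paper):
```python
import numpy as np

def partitioned_cam_match(sram: np.ndarray, valid: np.ndarray,
                          query: np.ndarray, bank_rows: int = 256):
    """Emulate banked CAM search: each bank's match lines are independent,
    so match-line capacitance stays bounded as the array grows."""
    hits = []
    for start in range(0, sram.shape[0], bank_rows):
        rows = slice(start, start + bank_rows)
        sims = (sram[rows] == query).mean(axis=1)  # XNOR + popcount per row
        for offset, sim in enumerate(sims):
            if valid[rows][offset]:
                hits.append((start + offset, float(sim)))
    return sorted(hits, key=lambda h: h[1], reverse=True)
```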
---
## 6. Catastrophic Forgetting Mitigation
### 6.1 Why HDC Avoids Catastrophic Forgetting
Traditional neural networks suffer from catastrophic forgetting because:
- **Shared weights**: New patterns modify the same weights used by old
patterns.
- **Gradient interference**: Updates for new data corrupt old knowledge.
Inverted GF-HDC **does not have this problem**:
- **Orthogonal storage**: Each symbol is stored as a random high-dimensional
  hypervector. With $D = 10{,}000$, random pairs are quasi-orthogonal: their
  similarity sits at chance level ($\approx 0.5$ Hamming match for GF(2),
  $\approx 0$ cosine for bipolar $\pm 1$ encodings).
- **No weight sharing**: Adding a new item simply writes a new row — it
does not modify existing rows.
- **Additive memory**: Bundling is additive; removing an item requires
unbinding, not weight overwriting.
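The quasi-orthogonality claim is easy to verify empirically (a minimal check; the bipolar view is used only to read off the cosine):
```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 2, 10_000)
b = rng.integers(0, 2, 10_000)

print(np.mean(a == b))          # ≈ 0.5 — chance-level Hamming match
ap, bp = 2 * a - 1, 2 * b - 1   # bipolar {-1, +1} view of the same vectors
cos = ap @ bp / (np.linalg.norm(ap) * np.linalg.norm(bp))
print(cos)                      # ≈ 0 — quasi-orthogonal
```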
### 6.2 Forgetting Mechanisms (When Desired)
While catastrophic forgetting is avoided, the system supports
**controlled forgetting**:
```python
import numpy as np

class ForgettingMechanism:
"""
Controlled forgetting mechanisms for HDC memory.
These are optional — the base system has no forgetting.
"""
def __init__(self, hdc: InvertedGaloisFieldHDC):
self.hdc = hdc
self.usage_counts: dict[str, int] = {}
self.timestamps: dict[str, float] = {}
    def track_access(self, symbol: str, current_time: float = 0.0):
        """Track symbol access frequency and recency for the forgetting policies."""
        self.usage_counts[symbol] = self.usage_counts.get(symbol, 0) + 1
        self.timestamps[symbol] = current_time
def lru_forget(
self, memory: dict[str, np.ndarray],
keep_top_k: int
) -> dict[str, np.ndarray]:
"""
        Frequency-based (LFU-style) forgetting:
        keep the top_k most frequently accessed symbols.
"""
sorted_items = sorted(
memory.items(),
key=lambda x: self.usage_counts.get(x[0], 0),
reverse=True
)
return dict(sorted_items[:keep_top_k])
def time_decay_forget(
self, memory: dict[str, np.ndarray],
current_time: float, decay_threshold: float
) -> dict[str, np.ndarray]:
"""
Time-based decay forgetting.
Remove symbols not accessed since decay_threshold.
"""
return {
symbol: hv for symbol, hv in memory.items()
if self.timestamps.get(symbol, 0) > decay_threshold
}
def capacity_forget(
self, memory: dict[str, np.ndarray],
max_items: int, strategy: str = "lru"
) -> dict[str, np.ndarray]:
"""
Capacity-based forgetting when memory is full.
"""
if len(memory) <= max_items:
return memory
if strategy == "lru":
return self.lru_forget(memory, max_items)
elif strategy == "random":
import random
to_remove = random.sample(
list(memory.keys()), len(memory) - max_items
)
return {k: v for k, v in memory.items() if k not in to_remove}
else:
raise ValueError(f"Unknown strategy: {strategy}")
```
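Illustrative capacity management with the class above (symbols and access counts are made up):
```python
hdc = InvertedGaloisFieldHDC(dim=10_000, q=2)
forgetter = ForgettingMechanism(hdc)
memory = {f"item_{i}": np.random.randint(0, 2, 10_000) for i in range(5)}

for symbol, n in [("item_0", 3), ("item_1", 1), ("item_2", 7)]:
    for _ in range(n):
        forgetter.track_access(symbol)

kept = forgetter.capacity_forget(memory, max_items=2)
print(sorted(kept))  # ['item_0', 'item_2'] — most frequently accessed survive
```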
### 6.3 Interference Analysis
Even without explicit forgetting, there is **theoretical interference**
from bundling multiple items:
**Expected interference after storing $N$ items in $D$ dimensions (GF(2))**:
$$P_{\text{error}}(N, D) \approx \frac{1}{2} \cdot \text{erfc}\left(\frac{\sqrt{D}}{2\sqrt{N}}\right)$$
For $D = 10{,}000$, $N = 1{,}000$:
$$P_{\text{error}} \approx \frac{1}{2} \cdot \text{erfc}\left(\frac{100}{2\sqrt{1000}}\right) \approx \frac{1}{2} \cdot \text{erfc}(1.58) \approx 0.013$$
**Practical guidance**:
- $D = 10{,}000$: Can store ~1,000 items with <2% error
- $D = 100{,}000$: Can store ~10,000 items with <2% error
- Error probability decreases **exponentially** with $D$
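These numbers follow directly from the erfc expression; a one-liner to reproduce them (SciPy assumed available):
```python
import numpy as np
from scipy.special import erfc

def interference_probability(D: int, N: int) -> float:
    """P_error ≈ 0.5 · erfc(√D / (2√N)) for GF(2) bundling (Section 6.3)."""
    return 0.5 * erfc(np.sqrt(D) / (2 * np.sqrt(N)))

print(interference_probability(10_000, 1_000))    # ≈ 0.013
print(interference_probability(100_000, 10_000))  # ≈ 0.013 — same √(D/N) ratio
```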
### 6.4 Comparison with Neural Network Forgetting
| Aspect | Traditional NN | Inverted GF-HDC |
|---|---|---|
| **New pattern storage** | Modifies shared weights | Writes new row (no modification) |
| **Old pattern degradation** | Gradient interference | Only from bundling collisions |
| **Degradation rate** | Rapid (catastrophic) | Gradual (controlled by $D/N$ ratio) |
| **Mitigation needed** | Replay, regularization, elastic weights | None by default; optional capacity mgmt |
| **Perfect recall** | Impossible after many updates | Achievable if $D \gg N$ |
---
## 7. Comparison with Traditional STDP
### 7.1 Side-by-Side Comparison
| Feature | Biological STDP | Emergent STDP (Inverted GF-HDC) |
|---|---|---|
| **Mechanism** | Calcium-dependent synaptic modification | Collision-tolerance in GF bundling |
| **Time window** | ~20-100 ms | Path length (abstract time) |
| **Potentiation** | Pre→Post: $A_+ e^{-\Delta t/\tau_+}$ | Coherence: $1/q^{L+1}$ |
| **Depression** | Post→Pre: $-A_- e^{\Delta t/\tau_-}$ | Anti-causal diffusion attenuation |
| **Implementation** | Biological synapses | XOR gates + popcount in SRAM-CAM |
| **Energy per update** | ~10⁶ ATP molecules | ~1 pJ (CMOS) |
| **Speed** | Milliseconds | Nanoseconds (hardware) |
| **Determinism** | Stochastic (probabilistic release) | Deterministic (algebraic) |
| **Predictability** | Empirical fit | Closed-form exact prediction |
| **Scalability** | Limited by biology | Linear with $D$ and rows |
| **Hardware** | N/A (biological) | Standard CMOS SRAM-CAM |
| **Training required** | No (inherent to synapse) | No (emergent from architecture) |
| **Forgetting** | Natural synaptic decay | Orthogonal storage (no interference) |
### 7.2 Why Emergent STDP Matters
1. **No explicit learning rule needed**: The STDP-like behavior emerges
from the algebraic structure — there is no separate "learning" phase.
2. **Hardware efficiency**: Implementing STDP in neuromorphic hardware
requires complex spike-timing circuits. In GF-HDC, it's just XOR +
popcount.
3. **Predictable**: The closed-form expression allows **a priori**
prediction of learning behavior — no empirical fitting.
4. **Scalable**: Works at any dimension $D$ and any number of paths.
5. **Bridges paradigms**: Shows that biological learning rules can emerge
from purely algebraic/computational structures.
### 7.3 Equivalence Proof Sketch
The equivalence between traditional STDP and emergent STDP rests on three
observations:
1. **Exponential decay**: Both use exponential decay with temporal/spatial
distance.
- STDP: $\Delta w \propto e^{-|\Delta t|/\tau}$
- HDC: $\Delta W \propto q^{-L} = e^{-L \ln q}$
2. **Causal asymmetry**: Both show stronger potentiation for causal
(pre→post) than anti-causal ordering.
- STDP: $A_+ > A_-$
- HDC: Directional diffusion creates $A_-/A_+ \approx 0.5$
3. **Path competition**: Both select the "best" path among competing
alternatives.
- STDP: Strongest synapses survive, weak ones depress
- HDC: Highest-coherence paths win, noisy ones are pruned
---
## 8. Implementation Guide — Step by Step
### Step 1: Set Up the HDC Basis
```python
import numpy as np
DIM = 10_000
Q = 2 # GF(2) — binary hypervectors
class HDCBasis:
def __init__(self, dim, q, seed=42):
self.dim = dim
self.q = q
self.rng = np.random.RandomState(seed)
self.vectors = {}
def get(self, symbol):
if symbol not in self.vectors:
self.vectors[symbol] = self.rng.randint(0, self.q, self.dim)
return self.vectors[symbol].copy()
def similarity(self, a, b):
return np.mean(a == b)
```
### Step 2: Implement Core Operations
```python
def bind(a, b, q=2):
"""GF(q) binding (element-wise addition)."""
return np.mod(a + b, q)
def unbind(composite, key, q=2):
"""GF(q) unbinding (element-wise subtraction)."""
return np.mod(composite - key, q)
def bundle(vectors, q=2):
    """GF(q) bundling (element-wise majority vote / mode)."""
    stacked = np.stack(vectors)
    counts = np.apply_along_axis(np.bincount, 0, stacked, minlength=q)
    return np.argmax(counts, axis=0)
def permute(hv, shift=1):
"""Cyclic permutation for position encoding."""
return np.roll(hv, shift)
```
### Step 3: Compute Emergent STDP
```python
def emergent_stdp(delta_t, tau_0=1.0, q=2):
"""
Compute emergent STDP magnitude from closed-form expression.
Args:
delta_t: Time difference (positive = causal).
tau_0: Operations per time unit.
q: GF order.
Returns:
Weight change (positive = potentiation, negative = depression).
"""
L = abs(delta_t) / tau_0
magnitude = 1.0 / (q ** (L + 1))
# Asymmetric: causal positive, anti-causal negative
sign = 1.0 if delta_t > 0 else -0.5
return sign * magnitude
```
### Step 4: Build and Rank Paths (VaCoAl)
```python
def build_and_rank_paths(start, path_candidates, basis, q=2):
"""
Build reasoning paths and rank by CR score.
Args:
start: Starting symbol.
path_candidates: List of relation sequences.
basis: HDCBasis instance.
q: GF order.
"""
scored_paths = []
for rels in path_candidates:
# Build path
current = basis.get(start)
composites = [current.copy()]
for rel_name, target in rels:
rel_hv = basis.get(f"REL:{rel_name}")
query = bind(current, rel_hv, q)
target_hv = basis.get(target)
composite = bundle([query, target_hv], q)
composites.append(composite.copy())
current = target_hv.copy()
# Score
L = len(rels)
similarities = [
basis.similarity(composites[i], composites[i+1])
for i in range(len(composites) - 1)
]
avg_coherence = np.mean(similarities)
cr_score = avg_coherence / (q ** (L + 1))
closed_form = 1.0 / (q ** (L + 1))
scored_paths.append({
"path": rels,
"cr_score": cr_score,
"closed_form": closed_form,
"length": L,
})
scored_paths.sort(key=lambda x: x["cr_score"], reverse=True)
return scored_paths
```
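A quick illustrative run, assuming Steps 1, 2, and 4 are loaded (the symbols are made up); the shorter path should rank first because the CR score carries the $q^{-(L+1)}$ decay:
```python
basis = HDCBasis(DIM, Q)
candidates = [
    [("capital_of", "france")],                          # length-1 path
    [("located_in", "europe"), ("contains", "france")],  # length-2 path
]
for p in build_and_rank_paths("paris", candidates, basis, q=Q):
    print(p["length"], round(p["cr_score"], 4), p["path"])
# Expected ordering: the length-1 path first (larger q^-(L+1) factor).
```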
### Step 5: Verify Closed-Form Prediction
```python
def verify_prediction(measured_dws, path_lengths, q=2):
"""Verify closed-form STDP prediction matches measurements."""
predicted = [1.0 / (q ** (L + 1)) for L in path_lengths]
print(f"{'Length':<8} {'Measured':<12} {'Predicted':<12} {'Error':<10}")
print("─" * 42)
for L, m, p in zip(path_lengths, measured_dws, predicted):
err = abs(m - p) / p
print(f"{L:<8} {m:<12.6f} {p:<12.6f} {err*100:<10.2f}%")
mean_err = np.mean([abs(m-p)/p for m, p in zip(measured_dws, predicted)])
print(f"\nMean relative error: {mean_err*100:.2f}%")
return mean_err < 0.05 # 5% tolerance
```
---
## 9. Quick Reference Card
| Concept | Formula / Value |
|---|---|
| **STDP closed-form** | $\Delta W(L) = q^{-(L+1)}$ |
| **Temporal form** | $\Delta W(\Delta t) = q^{-1} \cdot \exp(-|\Delta t|\ln q/\tau_0)$ |
| **GF(2) max weight** | $A = 0.5$ |
| **Effective time constant** | $\tau = \tau_0 / \ln 2 \approx 1.44\tau_0$ |
| **Interference prob** | $P_{\text{error}} \approx \frac{1}{2}\,\text{erfc}\!\left(\sqrt{D}/(2\sqrt{N})\right)$ |
| **Similarity** | Hamming match fraction: $\frac{1}{D}\sum_i [h_{1,i} = h_{2,i}]$ |
| **CR Score** | $\text{avg\_coherence} \times q^{-(L+1)}$ |
| **Hardware latency** | Lookup: 1 cycle, Bind: 1 cycle, Bundle: 2 cycles |
| **Energy per op** | ~1 pJ (CMOS 28nm) |
---
## 10. Pitfalls and Troubleshooting
### Pitfall 1: Dimensionality Too Low
- **Symptom**: High interference, poor recall, CR scores unreliable.
- **Fix**: Increase $D$ to at least 4,096. For >1,000 items, use $D \geq
10{,}000$.
### Pitfall 2: GF Order Mismatch
- **Symptom**: Binding/unbinding produces incorrect results.
- **Fix**: Ensure all operations use the same $q$. GF(2) uses XOR, GF(q>2)
uses modular arithmetic.
### Pitfall 3: Permutation Direction
- **Symptom**: Sequence decoding produces wrong order.
- **Fix**: Encode with $\rho^i$, decode with $\rho^{-i}$ (i.e., `np.roll(hv, -shift)`).
  A cyclic shift is self-inverse only when $2i \equiv 0 \pmod D$.
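A quick sanity check of the round-trip (pure NumPy):
```python
import numpy as np

h = np.random.randint(0, 2, 1000)
assert np.array_equal(np.roll(np.roll(h, 3), -3), h)  # ρ^{-i} undoes ρ^{i}
# Applying the SAME shift twice does not recover h (except for degenerate shifts),
# so decoding must use the opposite shift direction.
```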
### Pitfall 4: Bundle Saturation
- **Symptom**: Bundled vector converges to all-same value (no information).
- **Fix**: Limit bundle size or use weighted bundling. With $N$ items in
GF(2), need $D \gg N$ to avoid saturation.
### Pitfall 5: Misinterpreting Emergent STDP
- **Symptom**: Expecting biological fidelity from algebraic equivalence.
- **Fix**: The equivalence is at the **functional level** (same mathematical
form), not the mechanistic level (different physical process). Use it
for efficient computation, not neuroscience simulation.
---
## 11. Related Work and Further Reading
- **Kanerva (2009)**: Hyperdimensional Computing — foundational HDC paper
- **Plate (2003)**: Holographic Reduced Representation — binding theory
- **Gallant & Okaywe (2013)**: Representing object relations in HDC
- **Rachkovskij & Kussul (2001)**: HDC for analogical reasoning
- **Chuma, Otsuka, Sato (2026)**: VaCoAl paper — arXiv:2604.11665
- **Bi & Poo (1998)**: Original STDP discovery in biological synapses
- **Song, Abbott & Nelson (2000)**: STDP theory and modeling
---
## 12. When to Use This Skill
Use this skill when:
- Implementing **hyperdimensional computing** with path-dependent selection
- Designing **SRAM-CAM hardware** for neuromorphic inference
- Addressing **catastrophic forgetting** in continual learning systems
- Needing **deterministic reasoning** with interpretable confidence scores
- Exploring **emergent learning rules** from algebraic structures
- Building **edge AI** systems with ultra-low power requirements
- Researching **STDP equivalents** in non-biological systems