---
name: globally-optimal-snn-training
category: training
description: Globally optimal SNN training via parameter reconstruction. Convexifies parallel recurrent threshold networks (subsuming parallel SNNs) and uses parameter reconstruction to avoid surrogate gradient approximation errors. Use when training SNNs without surrogate gradients or seeking globally optimal solutions.
---

# Globally Optimal SNN Training via Parameter Reconstruction

## Problem (arXiv:2605.08022)

SNN training typically relies on **surrogate gradients** because the spike function is non-differentiable. The resulting approximation errors **accumulate across layers**, degrading training quality, especially in deep networks.

## Core Approach

### Theoretical Foundation

- Extends the convexification of parallel feedforward threshold networks to **parallel recurrent threshold networks** (one plausible formalization is sketched under "Toy Sketches" below)
- Parallel SNNs are a structured special case of parallel recurrent threshold networks
- This convexification enables globally optimal training without gradient approximation

### Parameter Reconstruction Algorithm

1. **Convexify the SNN**: Reformulate the non-convex SNN training problem as a convex optimization over parallel recurrent threshold networks
2. **Solve the convex problem**: Obtain globally optimal parameters in the convexified space
3. **Reconstruct SNN parameters**: Map the convex solution back to the original SNN parameter space
4. **Optional hybrid**: Combine with surrogate-gradient training for further improvement

A toy end-to-end sketch of steps 1-3, and of the hybrid variant, appears under "Toy Sketches" below.

### Advantages

- **No surrogate-gradient approximation**: Eliminates the errors that accumulate across layers
- **Global optimality**: The convex formulation guarantees finding the global optimum
- **Data scalability**: Performance scales consistently with dataset size
- **Robust across model configurations**: Works for a range of SNN architectures

### Usage Patterns

**Standalone**: Use parameter reconstruction as the sole training method

**Hybrid**: Combine with surrogate-gradient training for enhanced performance

- Initialize with parameter reconstruction, then fine-tune with surrogate gradients
- Or use reconstruction to refine surrogate-gradient-trained models

## Relationship to Existing Methods

| Method | Gradient Type | Optimality | Error Accumulation |
|--------|---------------|------------|--------------------|
| Surrogate Gradient | Approximate | Local | Yes (accumulates across layers) |
| Parameter Reconstruction | Exact (convex) | Global | No |
| Hybrid | Both | Improved | Reduced |

## Pitfalls

- **Network structure constraint**: Only applies to parallel recurrent threshold networks (subsuming parallel SNNs)
- **Reconstruction accuracy**: The mapping from the convex space back to the original parameter space may not be exact
- **Computational cost**: Convex optimization may be expensive for very large networks
- **Non-parallel SNNs**: Sequential/temporal SNNs may not fit this framework directly

## Activation Keywords

- globally optimal SNN, parameter reconstruction, convex SNN training
- surrogate gradient alternative, threshold network convexification
- parallel recurrent threshold network, SNN training without approximation
- recurrent SNN, exact gradient SNN
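## Toy Sketches

### Model Form

For orientation only: one plausible way to write down a parallel recurrent threshold network with a final-step linear readout. The symbols $w_j$, $u_j$, $b_j$, $v_j$ and the readout choice are illustrative assumptions, not taken from the paper. Each unit recurs only with itself ("parallel"), which is how a parallel SNN, i.e. a self-recurrent thresholded neuron, fits as a special case:

$$
s_{j,t} = \mathbb{1}\!\left[\, w_j^\top x_t + u_j\, s_{j,t-1} - b_j \ge 0 \,\right],
\qquad
f(x_{1:T}) = \sum_{j=1}^{m} v_j\, s_{j,T}.
$$

Because there is no cross-unit coupling, fixing each unit's spike sequence $s_{j,1:T}$ leaves $f$ linear in the readout weights $v$; that linearity is the lever a convex reformulation can exploit.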
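### Convexify, Solve, Reconstruct

The paper's algorithm is not reproduced here. The following is a minimal NumPy sketch of the convexify, solve, reconstruct pattern in the simplest possible setting: a feedforward (single-time-step, non-recurrent) parallel threshold network on toy regression data. The sampled candidate units, the ridge penalty, and the pruning threshold are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n samples, d features, scalar regression target.
n, d, m = 200, 3, 50
X = rng.normal(size=(n, d))
y = np.sin(X @ rng.normal(size=d))

# Step 1 (convexify): sample m candidate units. With each unit's
# spike pattern held fixed, the network output is linear in the
# output weights v, so training v is a convex least-squares problem.
W = rng.normal(size=(m, d))
b = rng.normal(size=m)
S = (X @ W.T >= b).astype(float)        # (n, m) fixed spike patterns

# Step 2 (solve the convex problem): ridge least squares over v has
# a unique global optimum in this convexified space. An l1 penalty
# would instead induce sparsity and prune units automatically.
lam = 1e-3
v = np.linalg.solve(S.T @ S + lam * np.eye(m), S.T @ y)

# Step 3 (reconstruct): read the network parameters (W_j, b_j, v_j)
# back off the units that carry non-negligible output weight.
keep = np.abs(v) > 1e-6
W_r, b_r, v_r = W[keep], b[keep], v[keep]

def forward(X):
    """Forward pass of the reconstructed parallel threshold network."""
    return ((X @ W_r.T >= b_r).astype(float)) @ v_r

print("train MSE:", np.mean((forward(X) - y) ** 2))
```

The recurrent case handled by the paper is substantially harder, since each unit's spike at time $t$ depends on its own past spikes; this sketch only shows the shape of the pipeline.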
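### Hybrid Fine-Tuning

The hybrid usage pattern, initialize with reconstruction and fine-tune with surrogate gradients, can be sketched in the same toy setting. This continues from the block above (`W_r`, `b_r`, `v_r`, `X`, `y`, `n` in scope). The rectangular surrogate and the learning-rate settings are common choices assumed here, not prescribed by the paper.

```python
# Fine-tune the reconstructed parameters with a surrogate gradient:
# the forward pass spikes exactly, while the backward pass replaces
# the Heaviside derivative with a rectangular surrogate 1[|z| < 0.5].
lr, epochs = 1e-2, 200
W_f, b_f, v_f = W_r.copy(), b_r.copy(), v_r.copy()

for _ in range(epochs):
    z = X @ W_f.T - b_f                  # pre-threshold potential
    s = (z >= 0).astype(float)           # exact spikes in the forward pass
    out = s @ v_f
    g_out = 2.0 * (out - y) / n          # d(MSE)/d(out)
    g_v = s.T @ g_out                    # exact gradient w.r.t. v
    g_z = np.outer(g_out, v_f) * (np.abs(z) < 0.5)  # surrogate backward
    W_f -= lr * (g_z.T @ X)
    b_f += lr * g_z.sum(axis=0)          # z = X @ w - b, so dz/db = -1
    v_f -= lr * g_v

print("fine-tuned MSE:",
      np.mean(((X @ W_f.T >= b_f).astype(float) @ v_f - y) ** 2))
```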