---
name: efficient-reasoning-bcr
description: 'Reduce LLM reasoning token consumption using Batched Contextual Reinforcement (BCR). Use when optimizing inference costs for reasoning tasks, implementing efficient Chain-of-Thought, or discovering task-scaling laws. Based on arXiv:2604.02322 - A Task-Scaling Law for Efficient Reasoning.'
---
# Efficient Reasoning via Batched Contextual Reinforcement
Reduce token consumption in reasoning LLMs through a simple structural modification.
## Problem
Chain-of-Thought reasoning achieves strong performance but:
- Excessive token consumption inflates inference costs
- Explicit length penalties cause optimization collapse
- Existing efficiency methods degrade reasoning quality
## BCR: Batched Contextual Reinforcement
**Key insight**: Train the model to solve N problems simultaneously within a shared context window, rewarding it purely on per-instance accuracy.
```python
# Standard: a single problem per inference call
problems = ["2 + 2 = ?", "Integrate x**2 from 0 to 1."]
prompt = problems[0]

# BCR: N problems packed into one shared context
prompt = "\n".join(
    [f"Problem {i}: {p}" for i, p in enumerate(problems, start=1)]
    + ["Solve all problems."]
)
```
This creates an **implicit token budget**: models must allocate tokens efficiently across all N problems.
## Task-Scaling Law Discovery
As N (concurrent problems) increases during inference:
- Per-problem token usage decreases monotonically
- Accuracy degrades far more gracefully than baselines
- N becomes a controllable throughput dimension
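As a rough illustration of how the shared window divides across N, here is a sketch assuming a fixed context size and an even linear split. (The linear split is an assumption for illustration only; the paper's relation between N and per-problem token usage is empirical.)

```python
# Sketch of the implicit per-problem budget under a fixed context window.
# CONTEXT_WINDOW and prompt_overhead are assumed values, not from the paper.
CONTEXT_WINDOW = 8_192

def per_problem_budget(n_problems: int, prompt_overhead: int = 512) -> int:
    """Tokens available per problem when N problems share one window."""
    return (CONTEXT_WINDOW - prompt_overhead) // n_problems

for n in (1, 2, 4, 8):
    print(n, per_problem_budget(n))  # budget shrinks as N grows
```

Raising N tightens every problem's budget simultaneously, which is what turns N into a throughput dial at inference time.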
## "Free Lunch" Phenomenon
At standard single-problem inference (N=1), BCR models:
- Reduce token usage by 15.8% to 62.6%
- Maintain or **improve** accuracy across benchmarks
- No explicit length supervision needed
This challenges the traditional accuracy-efficiency trade-off.
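To see what the reported reduction range means in practice, here is a hypothetical cost estimate. All prices and volumes below are assumptions for illustration, not figures from the paper; only the 15.8%–62.6% range comes from the reported results.

```python
# Hypothetical inference-cost illustration (assumed prices and volumes).
price_per_1k_tokens = 0.01   # assumed output-token price in USD
tokens_per_answer = 2_000    # assumed baseline CoT length
daily_answers = 100_000      # assumed daily volume

baseline_cost = daily_answers * tokens_per_answer / 1000 * price_per_1k_tokens
for reduction in (0.158, 0.626):  # reported BCR reduction range at N=1
    saved = baseline_cost * reduction
    print(f"{reduction:.1%} fewer tokens -> ${saved:,.0f}/day saved")
```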
## Key Findings
1. **Implicit budget > explicit penalties**: Avoids adversarial gradients and catastrophic optimization collapse
2. **Self-regulated efficiency**: Models autonomously eliminate redundant metacognitive loops
3. **Stable training**: Single-stage, no complex curricula or difficulty estimators needed
## Benchmarks
Tested on 1.5B and 4B model families across 5 mathematical benchmarks:
- GSM8K
- MATH
- AIME
- Olympiad
- College math
## Implementation Guidelines
When applying BCR:
1. **Training**: Modify batch structure to include N problems per context
2. **Reward**: Use per-instance accuracy only (no length penalty)
3. **N selection**: Higher N = more efficiency, slight accuracy trade-off
4. **Inference**: Can use N > 1 for throughput gains, or N = 1 for best accuracy
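Guideline 2 can be sketched as a minimal reward function, assuming the N answers have already been parsed out of the batched response (real use needs task-specific answer extraction and grading; exact string match below is a simplification):

```python
def bcr_reward(predicted: list[str], gold: list[str]) -> float:
    """BCR reward: mean per-instance accuracy over the N problems sharing
    one context. Note there is no length term anywhere -- efficiency
    emerges from the shared window, not from the reward."""
    assert len(predicted) == len(gold)
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

print(bcr_reward(["42", "7", "13"], ["42", "7", "12"]))  # 2 of 3 correct
```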
## When to Apply
- Reducing inference costs for reasoning models
- Training efficient reasoning without length supervision
- Implementing implicit token budgeting
- Avoiding optimization collapse from explicit penalties
- High-throughput batch reasoning
## Why It Works
BCR creates an implicit budget constraint through the shared context:
- Model must fit N solutions in fixed window
- Competition for tokens forces efficiency
- No adversarial gradients from explicit penalties
- Emergent self-regulation eliminates verbose reasoning
## Paper Reference
arXiv:2604.02322 - "Batched Contextual Reinforcement: A Task-Scaling Law for Efficient Reasoning" (Apr 2026)
## Activation Keywords
- efficient-reasoning-bcr
- efficient-reasoning-bcr 技能
- efficient-reasoning-bcr skill
## Tools Used
- `read` - Read documentation and references
- `web_search` - Search for related information
- `web_fetch` - Fetch paper or documentation
## Instructions for Agents
Follow these steps when applying this skill:
### Step 1: Implicit budget > explicit penalties
Rely on the shared-context structure to impose a token budget; do not add explicit length penalties, which introduce adversarial gradients and can collapse optimization.
### Step 2: Self-regulated efficiency
Let the model learn its own token allocation; redundant metacognitive loops are eliminated autonomously rather than by supervision.
### Step 3: Stable training
Use single-stage training; no curricula or difficulty estimators are needed.
### Step 4: Training
Modify the batch structure so each context contains N problems.
### Step 5: Reward
Reward per-instance accuracy only, with no length term.
## Examples
### Example 1: Basic Application
**User:** I need to apply Efficient Reasoning via Batched Contextual Reinforcement to my analysis.
**Agent:** I'll help you apply efficient-reasoning-bcr. First, let me understand your specific use case...
**Context:** Apply the methodology
### Example 2: Advanced Scenario
**User:** Complex analysis scenario
**Agent:** Based on the methodology, I'll guide you through the advanced application...
### Example 3: Research Lookup
**User:** What are the key considerations for efficient-reasoning-bcr?
**Agent:** Let me search for the latest research and best practices...