DocumentjeremylongshoreFree

clade-rate-limits

Handle Anthropic rate limits — understand tiers, implement backoff, Use when working with rate-limits patterns. optimize throughput, and monitor usage. Trigger with "anthropic rate limit", "claude 429", "anthropic throttling", "anthropic usage limits", "claude tokens per minute".

Repo bundle on Versuzjeremylongshore/claude-code-plugins-plus-skills1001 indexed entries (SKILL.md and CLAUDE.md) from this repository — open the full bundle view.

Open bundle →

View on GitHub ↗</>github.com/jeremylongshore/claude-code-plugins-plus-skills Yours? Claim it ↗

§ 01 — Stats

Stars2.2k

Prior1190

Quality—

Score—

Tasks—

§ 02 — Install

Get clade-rate-limits.

Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.

One-line install · Claude Code

npx versuz@latest install jeremylongshore-claude-code-plugins-plus-skills-plugins-saas-packs-claude-pack-skills-clade-rate-limits

Or clone the repo

$git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills.git

Or copy the SKILL.md manually

More Versuz picks

★ Featured$1.99

vz-bench-debug

Document

★ Featured$0.99

vz-scrape-runner

Web

Got something better ?Submit your skill — it enters tomorrow's cycle. No fee.

Submit yours →

§ 05 — Challenge

Think you can beat it?

$npx versuz challenge jeremylongshore-claude-code-plugins-plus-skills-plugins-saas-packs-claude-pack-skills-clade-rate-limits↵

Show SKILL.md content (~1.3k tokens)

---
name: clade-rate-limits
description: "Handle Anthropic rate limits \u2014 understand tiers, implement backoff,\n\
  Use when working with rate-limits patterns.\noptimize throughput, and monitor usage.\n\
  Trigger with \"anthropic rate limit\", \"claude 429\", \"anthropic throttling\"\
  ,\n\"anthropic usage limits\", \"claude tokens per minute\".\n"
allowed-tools: Read, Write, Edit
version: 1.0.0
license: MIT
author: Jeremy Longshore <jeremy@intentsolutions.io>
tags:
- saas
- anthropic
- claude
- rate-limits
compatibility: Designed for Claude Code
---
# Anthropic Rate Limits

## Overview
Anthropic enforces three types of limits: requests per minute (RPM), input tokens per minute (TPM), and output tokens per minute. Limits depend on your spend tier.

## Rate Limit Tiers

| Tier | Qualification | RPM | Input TPM | Output TPM |
|------|--------------|-----|-----------|------------|
| Tier 1 | Free | 50 | 40,000 | 8,000 |
| Tier 2 | $40+ spend | 1,000 | 80,000 | 16,000 |
| Tier 3 | $200+ spend | 2,000 | 160,000 | 32,000 |
| Tier 4 | $400+ spend | 4,000 | 400,000 | 80,000 |
| Scale | Custom | Custom | Custom | Custom |

> **Check your tier:** console.anthropic.com → Settings → Limits

## Response Headers
Every API response includes rate limit headers:
```
claude-ratelimit-requests-limit: 1000
claude-ratelimit-requests-remaining: 998
claude-ratelimit-requests-reset: 2025-01-01T00:01:00Z
claude-ratelimit-tokens-limit: 80000
claude-ratelimit-tokens-remaining: 79500
claude-ratelimit-tokens-reset: 2025-01-01T00:01:00Z
retry-after: 5
```

## Built-In SDK Retries
The SDK automatically retries 429 and 529 errors with exponential backoff:
```typescript
import Anthropic from '@claude-ai/sdk';

const client = new Anthropic({
  maxRetries: 3, // default: 2. Set to 0 to disable.
});
```

## Custom Backoff
```typescript
async function callWithBackoff(params: Anthropic.MessageCreateParams, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.messages.create(params);
    } catch (err) {
      if (err instanceof Anthropic.RateLimitError) {
        const retryAfter = Number(err.headers?.['retry-after'] || 2 ** attempt);
        const jitter = Math.random() * 1000;
        console.log(`Rate limited. Retry in ${retryAfter}s (attempt ${attempt + 1})`);
        await new Promise(r => setTimeout(r, retryAfter * 1000 + jitter));
      } else {
        throw err;
      }
    }
  }
  throw new Error('Exceeded max retries');
}
```

## Throughput Optimization
| Strategy | Impact |
|----------|--------|
| Use Message Batches API | Bypasses rate limits entirely (async, 24h SLA) |
| Use prompt caching | Cached tokens don't count toward input TPM |
| Use smaller models for simple tasks | Lower token counts = more requests per minute |
| Pre-count tokens with `countTokens` | Avoid wasted requests that will fail |
| Queue and batch requests | Smooth out bursts |

## Token Counting
```typescript
// Count before sending — avoid burning RPM on requests that'll fail
const count = await client.messages.countTokens({
  model: 'claude-sonnet-4-20250514',
  messages,
  system: systemPrompt,
});
console.log(`This request will use ${count.input_tokens} input tokens`);
```

## Python
```python
import anthropic
import time

client = anthropic.Anthropic(max_retries=5)

# Or manual handling:
try:
    message = client.messages.create(...)
except anthropic.RateLimitError as e:
    retry_after = float(e.response.headers.get("retry-after", 5))
    time.sleep(retry_after)
```

## Output
- Rate limit tier identified from response headers
- SDK configured with appropriate `maxRetries` setting
- Custom backoff implemented with jitter for high-throughput use cases
- Throughput optimized using batches, caching, or model selection

## Error Handling
| Error | Cause | Solution |
|-------|-------|----------|
| API Error | Check error type and status code | See `clade-common-errors` |

## Examples
See Rate Limit Tiers table, Response Headers section, Built-In SDK Retries, Custom Backoff implementation, and Throughput Optimization strategies above.

## Resources
- [Rate Limits Docs](https://docs.anthropic.com/en/api/rate-limits)
- [Message Batches](https://docs.anthropic.com/en/api/creating-message-batches) — no rate limits
- [Token Counting](https://docs.anthropic.com/en/api/counting-tokens)

## Next Steps
See `clade-cost-tuning` for cost optimization strategies.

## Prerequisites
- Completed `clade-install-auth`
- Understanding of HTTP response headers
- Familiarity with exponential backoff patterns

## Instructions

### Step 1: Review the patterns below
Each section contains production-ready code examples. Copy and adapt them to your use case.

### Step 2: Apply to your codebase
Integrate the patterns that match your requirements. Test each change individually.

### Step 3: Verify
Run your test suite to confirm the integration works correctly.