---
name: castai-cost-tuning
description: 'Maximize Kubernetes cost savings with CAST AI spot strategies and right-sizing.
  Use when analyzing cloud spend, optimizing spot-to-on-demand ratios,
  or configuring CAST AI for maximum savings.
  Trigger with phrases like "cast ai cost", "cast ai savings",
  "cast ai spot strategy", "reduce kubernetes cost", "cast ai budget".
  '
allowed-tools: Read, Write, Edit, Bash(curl:*), Grep
version: 1.0.0
license: MIT
author: Jeremy Longshore <jeremy@intentsolutions.io>
tags:
- saas
- kubernetes
- cost-optimization
- castai
compatibility: Designed for Claude Code
---
# CAST AI Cost Tuning
## Overview
Maximize Kubernetes cost savings through CAST AI: spot instance strategies, workload right-sizing, cluster hibernation, and savings tracking. Typical savings are 50-70% of cloud compute cost, with actual results depending on workload mix and how much capacity can run on spot.
## Prerequisites
- CAST AI Phase 2 enabled with full automation
- Savings report available (requires 24h+ of data)
- Understanding of workload criticality tiers
## Instructions
### Step 1: Analyze Current Savings
```bash
# Get savings breakdown
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/savings" \
  | jq '{
    currentMonthlyCost: .currentMonthlyCost,
    optimizedMonthlyCost: .optimizedMonthlyCost,
    monthlySavings: .monthlySavings,
    savingsPercentage: .savingsPercentage,
    spotSavings: .spotSavings,
    rightSizingSavings: .rightSizingSavings
  }'
```
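Before acting on the numbers, it can help to check that they are internally consistent. The sketch below recomputes savings from the two cost figures and flags mismatches. The field names mirror the jq projection above; the `SavingsSummary` type, the `checkSavings` helper, and the assumption that `savingsPercentage` is on a 0-100 scale are illustrative, not part of the CAST AI API.

```typescript
// Hypothetical shape: field names are taken from the jq projection above,
// the numeric types are an assumption.
interface SavingsSummary {
  currentMonthlyCost: number;
  optimizedMonthlyCost: number;
  monthlySavings: number;
  savingsPercentage: number; // assumed to be on a 0-100 scale
}

// Recompute savings from the two cost figures and list any value that
// disagrees with what the API returned.
function checkSavings(s: SavingsSummary, tolerance = 0.01): string[] {
  const problems: string[] = [];
  const expectedSavings = s.currentMonthlyCost - s.optimizedMonthlyCost;
  if (Math.abs(expectedSavings - s.monthlySavings) > tolerance) {
    problems.push(
      `monthlySavings is ${s.monthlySavings}, expected ${expectedSavings.toFixed(2)}`
    );
  }
  const expectedPercent =
    s.currentMonthlyCost > 0 ? (expectedSavings / s.currentMonthlyCost) * 100 : 0;
  if (Math.abs(expectedPercent - s.savingsPercentage) > 0.1) {
    problems.push(
      `savingsPercentage is ${s.savingsPercentage}, expected ${expectedPercent.toFixed(1)}`
    );
  }
  return problems;
}
```

An empty result means the figures agree; any entries are worth investigating before the numbers go into a report.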
### Step 2: Maximize Spot Usage
```bash
# Enable aggressive spot with diversity and fallbacks
curl -X PUT -H "X-API-Key: ${CASTAI_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/policies" \
  -d '{
    "enabled": true,
    "spotInstances": {
      "enabled": true,
      "clouds": ["aws"],
      "spotDiversityEnabled": true,
      "spotDiversityPriceIncreaseLimitPercent": 20,
      "spotBackups": {
        "enabled": true,
        "spotBackupRestoreRateSeconds": 600
      }
    }
  }'
```
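When the same policy is applied to several clusters, building the payload in code avoids hand-editing JSON. The sketch below produces the same body as the curl call above. `SpotPolicyOptions` and `buildSpotPolicy` are hypothetical names, and the validation rules are assumptions rather than documented API constraints.

```typescript
// Hypothetical options type; the payload field names are copied from the
// curl example above.
interface SpotPolicyOptions {
  clouds: string[];
  diversityPriceIncreaseLimitPercent: number;
  backupRestoreRateSeconds: number;
}

// Build the policy body, rejecting obviously invalid input first.
// The checks are illustrative assumptions, not API-enforced limits.
function buildSpotPolicy(opts: SpotPolicyOptions) {
  if (opts.clouds.length === 0) {
    throw new Error("at least one cloud is required");
  }
  if (opts.diversityPriceIncreaseLimitPercent < 0) {
    throw new Error("price increase limit must not be negative");
  }
  if (!Number.isInteger(opts.backupRestoreRateSeconds) || opts.backupRestoreRateSeconds <= 0) {
    throw new Error("backup restore rate must be a positive whole number of seconds");
  }
  return {
    enabled: true,
    spotInstances: {
      enabled: true,
      clouds: opts.clouds,
      spotDiversityEnabled: true,
      spotDiversityPriceIncreaseLimitPercent: opts.diversityPriceIncreaseLimitPercent,
      spotBackups: {
        enabled: true,
        spotBackupRestoreRateSeconds: opts.backupRestoreRateSeconds,
      },
    },
  };
}
```

Passing `JSON.stringify(buildSpotPolicy({ clouds: ["aws"], diversityPriceIncreaseLimitPercent: 20, backupRestoreRateSeconds: 600 }))` as the request body reproduces the example above.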
**Spot allocation strategy by workload tier:**
| Workload Type | Spot % | Rationale |
|---------------|--------|-----------|
| Batch jobs, CI runners | 100% spot | Interruptible, restartable |
| Stateless APIs (behind LB) | 80% spot | Can handle brief interruptions |
| Stateful services, databases | 0% spot | Use on-demand or reserved |
| ML training | 80-100% spot | Checkpointing handles interrupts |
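To turn the table into concrete replica counts, one option is to round the spot share down so any remainder lands on on-demand capacity. This is a minimal sketch: the tier names, `SPOT_TARGET_PERCENT`, and `splitReplicas` are hypothetical, and ML training uses the lower bound of its 80-100% range.

```typescript
// Hypothetical tier identifiers for the rows of the table above.
type WorkloadTier = "batch" | "stateless-api" | "stateful" | "ml-training";

// Target spot share per tier, taken from the table.
const SPOT_TARGET_PERCENT: Record<WorkloadTier, number> = {
  batch: 100,
  "stateless-api": 80,
  stateful: 0,
  "ml-training": 80, // lower bound of the 80-100% range
};

// Split a replica count into spot and on-demand. Rounding down keeps the
// spot share at or below the target, so the remainder runs on-demand.
function splitReplicas(
  tier: WorkloadTier,
  replicas: number
): { spot: number; onDemand: number } {
  if (!Number.isInteger(replicas) || replicas < 0) {
    throw new Error("replicas must be a non-negative integer");
  }
  const spot = Math.floor((replicas * SPOT_TARGET_PERCENT[tier]) / 100);
  return { spot, onDemand: replicas - spot };
}
```

For example, `splitReplicas("stateless-api", 10)` yields 8 spot and 2 on-demand replicas, while a 3-replica service gets 2 spot and 1 on-demand.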
### Step 3: Workload Right-Sizing
```bash
# Get resource waste analysis
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/workload-autoscaling/clusters/${CASTAI_CLUSTER_ID}/workloads" \
  | jq '[.items[] | select(.estimatedSavingsPercent > 20) | {
    name: .workloadName,
    namespace: .namespace,
    wastedCpu: (.currentCpuRequest - .recommendedCpuRequest),
    wastedMemory: (.currentMemoryRequest - .recommendedMemoryRequest),
    savingsPercent: .estimatedSavingsPercent
  }] | sort_by(-.savingsPercent) | .[0:10]'
```
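The same filter can be expressed in TypeScript when the analysis feeds a script rather than a terminal. The sketch below mirrors the jq pipeline: keep workloads above a savings threshold, compute the wasted CPU and memory, sort descending, and take the top entries. `WorkloadItem` follows the field names used in the jq filter and assumes they are numeric; `topWastefulWorkloads` is a hypothetical helper.

```typescript
// Field names follow the jq filter above; numeric types are an assumption.
interface WorkloadItem {
  workloadName: string;
  namespace: string;
  currentCpuRequest: number;
  recommendedCpuRequest: number;
  currentMemoryRequest: number;
  recommendedMemoryRequest: number;
  estimatedSavingsPercent: number;
}

interface WasteEntry {
  name: string;
  namespace: string;
  wastedCpu: number;
  wastedMemory: number;
  savingsPercent: number;
}

// Keep workloads above the threshold, compute waste, and return the
// highest-savings entries first.
function topWastefulWorkloads(
  items: WorkloadItem[],
  minSavingsPercent = 20,
  limit = 10
): WasteEntry[] {
  return items
    .filter((w) => w.estimatedSavingsPercent > minSavingsPercent)
    .map((w) => ({
      name: w.workloadName,
      namespace: w.namespace,
      wastedCpu: w.currentCpuRequest - w.recommendedCpuRequest,
      wastedMemory: w.currentMemoryRequest - w.recommendedMemoryRequest,
      savingsPercent: w.estimatedSavingsPercent,
    }))
    .sort((a, b) => b.savingsPercent - a.savingsPercent)
    .slice(0, limit);
}
```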
### Step 4: Cluster Hibernation (Dev/Staging)
```bash
# Hibernate non-production clusters during off-hours.
# Nodes scale to zero on schedule and resume on demand.
curl -X POST -H "X-API-Key: ${CASTAI_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/hibernate" \
  -d '{
    "schedule": {
      "enabled": true,
      "hibernateAt": "20:00",
      "wakeUpAt": "08:00",
      "timezone": "America/New_York",
      "weekdaysOnly": true
    }
  }'
```
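A schedule like the one above is easy to get wrong around midnight and time zones, so it can be worth checking locally what state a cluster should be in at a given instant. The sketch below is an interpretation, not CAST AI's scheduler: `HibernationSchedule` mirrors the payload fields, `isHibernated` is a hypothetical helper, and it assumes that `weekdaysOnly` leaves the cluster hibernated for the whole weekend.

```typescript
// Mirrors the "schedule" fields in the request above.
interface HibernationSchedule {
  hibernateAt: string; // "HH:MM", 24-hour clock
  wakeUpAt: string; // "HH:MM", 24-hour clock
  timezone: string; // IANA name, e.g. "America/New_York"
  weekdaysOnly: boolean;
}

// Convert "HH:MM" to minutes since midnight, rejecting malformed input.
function toMinutes(hhmm: string): number {
  const match = /^([01]\d|2[0-3]):([0-5]\d)$/.exec(hhmm);
  if (!match) throw new Error(`invalid time "${hhmm}", expected HH:MM`);
  return Number(match[1]) * 60 + Number(match[2]);
}

// Decide whether the cluster should be hibernated at the given instant.
// Assumption: with weekdaysOnly set, no wake-up happens on Sat/Sun.
function isHibernated(schedule: HibernationSchedule, at: Date): boolean {
  // Intl throws a RangeError for a non-IANA timezone string.
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: schedule.timezone,
    weekday: "short",
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  }).formatToParts(at);
  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";

  const weekday = get("weekday");
  if (schedule.weekdaysOnly && (weekday === "Sat" || weekday === "Sun")) {
    return true;
  }
  // Some runtimes render midnight as "24" when hour12 is false.
  const now = (Number(get("hour")) % 24) * 60 + Number(get("minute"));
  const sleep = toMinutes(schedule.hibernateAt);
  const wake = toMinutes(schedule.wakeUpAt);
  // The awake window is [wake, sleep); it wraps midnight if wake > sleep.
  const awake =
    wake < sleep ? now >= wake && now < sleep : now >= wake || now < sleep;
  return !awake;
}
```

With the schedule from the request above, a Monday at 10:00 New York time evaluates to awake, while Monday at 21:00 and any time on Saturday evaluate to hibernated.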
### Step 5: Cost Tracking Dashboard
```typescript
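// castaiGet is not defined in this skill. The minimal version below is an
// assumption: an authenticated GET using the same X-API-Key header as the
// curl examples (requires Node 18+ for the global fetch).
async function castaiGet(path: string): Promise<any> {
  const response = await fetch(`https://api.cast.ai${path}`, {
    headers: { "X-API-Key": process.env.CASTAI_API_KEY ?? "" },
  });
  if (!response.ok) {
    throw new Error(`CAST AI GET ${path} failed: ${response.status}`);
  }
  return response.json();
}

interface CostReport {
  cluster: string;
  period: string; // "YYYY-MM"
  currentCost: number;
  optimizedCost: number;
  savings: number;
  spotPercent: number; // share of nodes (not cost) running on spot
}

async function generateMonthlyCostReport(
  clusterIds: string[]
): Promise<CostReport[]> {
  const reports: CostReport[] = [];
  for (const clusterId of clusterIds) {
    // Fetch cluster details, savings, and nodes concurrently per cluster
    const [cluster, savings, nodes] = await Promise.all([
      castaiGet(`/v1/kubernetes/external-clusters/${clusterId}`),
      castaiGet(`/v1/kubernetes/clusters/${clusterId}/savings`),
      castaiGet(`/v1/kubernetes/external-clusters/${clusterId}/nodes`),
    ]);
    const spotNodes = nodes.items.filter(
      (n: { lifecycle: string }) => n.lifecycle === "spot"
    ).length;
    reports.push({
      cluster: cluster.name,
      period: new Date().toISOString().slice(0, 7),
      currentCost: savings.currentMonthlyCost,
      optimizedCost: savings.optimizedMonthlyCost,
      savings: savings.monthlySavings,
      spotPercent:
        nodes.items.length > 0
          ? (spotNodes / nodes.items.length) * 100
          : 0,
    });
  }
  return reports;
}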
```
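The report array above still needs to be presented somewhere. As one possible approach, the sketch below renders it as a Markdown table suitable for a wiki page or chat message. `toMarkdownTable` is a hypothetical helper, the `CostReport` interface is repeated so the block stands alone, and costs are printed without a currency symbol because the API's currency is not stated here.

```typescript
// Same shape as the CostReport interface in Step 5.
interface CostReport {
  cluster: string;
  period: string;
  currentCost: number;
  optimizedCost: number;
  savings: number;
  spotPercent: number;
}

// Render one table row per cluster, with costs to two decimal places
// and the spot share rounded to a whole percent.
function toMarkdownTable(reports: CostReport[]): string {
  const header = "| Cluster | Period | Current | Optimized | Savings | Spot % |";
  const divider = "|---|---|---|---|---|---|";
  const rows = reports.map(
    (r) =>
      `| ${r.cluster} | ${r.period} | ${r.currentCost.toFixed(2)} | ` +
      `${r.optimizedCost.toFixed(2)} | ${r.savings.toFixed(2)} | ` +
      `${r.spotPercent.toFixed(0)} |`
  );
  return [header, divider, ...rows].join("\n");
}
```

Calling `toMarkdownTable(await generateMonthlyCostReport(clusterIds))` inside an async function produces the full table.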
## Cost Optimization Checklist
- [ ] Spot instances enabled with diversity
- [ ] Workload autoscaler right-sizing resources
- [ ] Dev/staging clusters hibernated off-hours
- [ ] Empty node downscaler enabled
- [ ] Instance families include latest generation (cheaper)
- [ ] Reserved/savings plan for baseline on-demand nodes
- [ ] Weekly savings report review
## Error Handling
| Issue | Cause | Solution |
|-------|-------|----------|
| Savings lower than expected | Too many on-demand constraints | Relax node template constraints |
| Spot interruptions too frequent | Single instance type | Enable spot diversity |
| Hibernation not triggering | Schedule timezone is not a valid IANA name | Use an IANA timezone such as `America/New_York` |
| Right-sizing too aggressive | Low headroom | Increase memory headroom to 20% |
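To make the headroom row concrete, the sketch below shows the arithmetic under one assumed interpretation: headroom is a percentage added on top of the recommended request. `applyHeadroom` is a hypothetical helper, and CAST AI's own headroom calculation may differ.

```typescript
// Add a percentage of headroom to a recommended request and round up,
// so the result never falls below the padded value.
// Assumption: headroom is applied as recommended * (1 + percent / 100).
function applyHeadroom(recommended: number, headroomPercent: number): number {
  if (recommended < 0 || headroomPercent < 0) {
    throw new Error("recommended value and headroom must not be negative");
  }
  return Math.ceil((recommended * (100 + headroomPercent)) / 100);
}
```

Under this interpretation, a 512 MiB recommendation with 20% headroom becomes a 615 MiB request.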
## Resources
- [CAST AI Savings Report](https://docs.cast.ai/docs/getting-started)
- [Spot Instance Best Practices](https://docs.cast.ai/docs/autoscaler-settings)
- [Cluster Hibernation](https://docs.cast.ai/docs/autoscaling-cluster-hibernation)
## Next Steps
For architecture patterns, see `castai-reference-architecture`.