Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.
npx versuz@latest install jmagly-aiwg-agentic-code-frameworks-sdlc-complete-skills-regression-performagit clone https://github.com/jmagly/aiwg.gitcp aiwg/SKILL.MD ~/.claude/skills/jmagly-aiwg-agentic-code-frameworks-sdlc-complete-skills-regression-performa/SKILL.md---
namespace: aiwg
name: regression-performance
platforms: [all]
description: Detect performance regressions by comparing benchmarks across versions with latency, throughput, and statistical significance analysis
---
# regression-performance
Detect performance regressions by comparing benchmarks across versions, analyzing latency/throughput degradation, and providing statistical significance testing.
## Triggers
Alternate expressions and non-obvious activations (primary phrases are matched automatically from the skill description):
- "latency regression" → performance benchmark comparison
- "p99" / "p95" → percentile-based performance metrics
- "benchmark diff" → performance baseline comparison
## Purpose
This skill detects performance regressions across software versions by:
- Comparing latency metrics (p50, p95, p99) between baseline and current versions
- Detecting throughput regressions (requests/sec, transactions/sec)
- Identifying memory regressions (heap growth, memory leaks)
- Analyzing resource utilization (CPU, disk I/O, network)
- Running benchmark comparisons with statistical significance testing
- Generating performance regression reports with visualizations
## Behavior
When triggered, this skill:
1. **Identifies baseline version**:
- Detect last known good version from git tags
- Load baseline benchmark results
- Extract performance metrics from monitoring
2. **Runs performance benchmarks**:
- Execute load tests using k6, Artillery, or wrk
- Capture latency distributions (p50, p95, p99)
- Measure throughput (req/s, TPS)
- Profile memory usage and heap growth
- Monitor CPU and I/O utilization
3. **Performs statistical comparison**:
- Calculate delta and percentage change
- Apply statistical significance tests (t-test, Mann-Whitney U)
- Determine if degradation exceeds threshold
- Account for variance and noise
4. **Detects regression patterns**:
- Latency spikes at specific percentiles
- Throughput capacity reduction
- Memory leak indicators (growing heap)
- CPU saturation points
- I/O bottlenecks
5. **Generates regression report**:
- Performance comparison tables
- Percentile distribution graphs
- Time-series trend analysis
- Root cause indicators
- Recommendations
6. **Logs regression findings**:
- Create regression register entry
- Tag commits with performance impact
- Alert on threshold violations
## Performance Metrics Model
```
┌─────────────────────┐
│ BASELINE v2.3.0 │
├─────────────────────┤
│ p50: 45ms │
│ p95: 120ms │
│ p99: 180ms │
│ RPS: 2500 │
│ Mem: 256MB │
└─────────────────────┘
│
▼ Compare
┌─────────────────────┐
│ CURRENT v2.4.0 │
├─────────────────────┤
│ p50: 52ms (+15%) │ ⚠️ REGRESSION
│ p95: 145ms (+21%) │ ⚠️ REGRESSION
│ p99: 220ms (+22%) │ ⚠️ REGRESSION
│ RPS: 2100 (-16%) │ ⚠️ REGRESSION
│ Mem: 312MB (+22%) │ ⚠️ REGRESSION
└─────────────────────┘
│
▼
┌─────────────────────┐
│ REGRESSION REPORT │
│ │
│ Type: Latency │
│ Severity: HIGH │
│ Confidence: 99.5% │
│ Root Cause: TBD │
└─────────────────────┘
```
## Metric Categories
### Latency Metrics
| Metric | Description | Threshold | Tool |
|--------|-------------|-----------|------|
| p50 (median) | 50th percentile latency | +10% | k6, Artillery, wrk |
| p95 | 95th percentile latency | +15% | k6, Artillery, wrk |
| p99 | 99th percentile latency | +20% | k6, Artillery, wrk |
| max | Maximum observed latency | +30% | k6, Artillery, wrk |
### Throughput Metrics
| Metric | Description | Threshold | Tool |
|--------|-------------|-----------|------|
| Requests/sec | HTTP requests per second | -10% | k6, wrk, ab |
| Transactions/sec | Business transactions per second | -10% | Custom |
| Bytes/sec | Network throughput | -15% | iperf3, iftop |
| Queries/sec | Database query throughput | -10% | pgbench, sysbench |
### Memory Metrics
| Metric | Description | Threshold | Tool |
|--------|-------------|-----------|------|
| Heap size | JavaScript heap usage | +20% | Node.js heap snapshot |
| RSS | Resident set size | +20% | ps, top |
| Memory growth rate | MB/hour increase | >10 MB/hour | Continuous profiling |
| GC pressure | Garbage collection frequency | +30% | Node.js --trace-gc |
### Resource Metrics
| Metric | Description | Threshold | Tool |
|--------|-------------|-----------|------|
| CPU utilization | Average CPU usage | +20% | mpstat, top |
| Disk I/O wait | I/O wait percentage | +25% | iostat |
| Network bandwidth | Network utilization | +15% | iftop, nethogs |
| File descriptors | Open file handles | +30% | lsof |
## Benchmark Tools Integration
### k6 Load Testing
```javascript
// benchmark.k6.js
import http from 'k6/http';
import { check, sleep } from 'k6';
export let options = {
stages: [
{ duration: '2m', target: 100 }, // Ramp up
{ duration: '5m', target: 100 }, // Steady state
{ duration: '2m', target: 0 }, // Ramp down
],
thresholds: {
'http_req_duration': ['p(50)<100', 'p(95)<200', 'p(99)<300'],
'http_req_failed': ['rate<0.01'],
},
};
export default function () {
const res = http.get('https://api.example.com/endpoint');
check(res, {
'status is 200': (r) => r.status === 200,
'response time < 200ms': (r) => r.timings.duration < 200,
});
sleep(1);
}
```
**Running comparison**:
```bash
# Baseline
k6 run --out json=baseline-results.json benchmark.k6.js
# Current version
k6 run --out json=current-results.json benchmark.k6.js
# Compare
./compare-k6-results.sh baseline-results.json current-results.json
```
### Artillery Load Testing
```yaml
# artillery-config.yml
config:
target: 'https://api.example.com'
phases:
- duration: 120
arrivalRate: 10
rampTo: 50
- duration: 300
arrivalRate: 50
- duration: 120
arrivalRate: 50
rampTo: 0
plugins:
metrics-by-endpoint:
stripQueryString: true
scenarios:
- name: "API Performance Test"
flow:
- get:
url: "/api/users"
- get:
url: "/api/products"
- post:
url: "/api/orders"
json:
product_id: 123
quantity: 2
```
**Running comparison**:
```bash
# Baseline
artillery run --output baseline.json artillery-config.yml
# Current
artillery run --output current.json artillery-config.yml
# Compare
artillery report baseline.json --output baseline-report.html
artillery report current.json --output current-report.html
./compare-artillery-results.sh baseline.json current.json
```
### wrk HTTP Benchmarking
```bash
# Simple throughput test
wrk_benchmark() {
local version=$1
local output_file=$2
wrk -t12 -c400 -d30s \
--latency \
--timeout 10s \
https://api.example.com/endpoint \
> "$output_file"
}
# Baseline
wrk_benchmark "v2.3.0" "wrk-baseline.txt"
# Current
wrk_benchmark "v2.4.0" "wrk-current.txt"
# Compare
./parse-wrk-results.sh wrk-baseline.txt wrk-current.txt
```
### Apache Bench (ab)
```bash
# Quick regression check
ab_compare() {
local baseline_version=$1
local current_version=$2
echo "=== Baseline ${baseline_version} ==="
ab -n 10000 -c 100 https://api.example.com/ > ab-baseline.txt
echo "=== Current ${current_version} ==="
ab -n 10000 -c 100 https://api.example.com/ > ab-current.txt
# Extract key metrics
echo "Comparison:"
echo "Baseline RPS: $(grep 'Requests per second' ab-baseline.txt | awk '{print $4}')"
echo "Current RPS: $(grep 'Requests per second' ab-current.txt | awk '{print $4}')"
}
```
### Hyperfine (CLI tool benchmarking)
```bash
# For CLI tool performance
hyperfine \
--warmup 3 \
--export-json comparison.json \
'git checkout v2.3.0 && npm run build && npm test' \
'git checkout v2.4.0 && npm run build && npm test'
```
## Statistical Significance Testing
### T-Test for Mean Comparison
```typescript
function detectLatencyRegression(
baseline: number[],
current: number[]
): RegressionAnalysis {
const baselineMean = mean(baseline);
const currentMean = mean(current);
const delta = currentMean - baselineMean;
const deltaPercent = (delta / baselineMean) * 100;
// Perform Welch's t-test
const tTestResult = welchTTest(baseline, current);
return {
metric: 'latency',
baseline: {
mean: baselineMean,
stddev: stddev(baseline),
n: baseline.length,
},
current: {
mean: currentMean,
stddev: stddev(current),
n: current.length,
},
delta,
deltaPercent,
pValue: tTestResult.pValue,
isSignificant: tTestResult.pValue < 0.05,
isRegression: deltaPercent > 10 && tTestResult.pValue < 0.05,
confidence: 1 - tTestResult.pValue,
};
}
```
### Mann-Whitney U Test (Non-Parametric)
```typescript
function detectThroughputRegression(
baseline: number[],
current: number[]
): RegressionAnalysis {
const baselineMedian = median(baseline);
const currentMedian = median(current);
const delta = currentMedian - baselineMedian;
const deltaPercent = (delta / baselineMedian) * 100;
// Use Mann-Whitney U for non-normal distributions
const uTestResult = mannWhitneyU(baseline, current);
return {
metric: 'throughput',
baseline: {
median: baselineMedian,
iqr: iqr(baseline),
n: baseline.length,
},
current: {
median: currentMedian,
iqr: iqr(current),
n: current.length,
},
delta,
deltaPercent,
pValue: uTestResult.pValue,
isSignificant: uTestResult.pValue < 0.05,
isRegression: deltaPercent < -10 && uTestResult.pValue < 0.05,
confidence: 1 - uTestResult.pValue,
};
}
```
### Multiple Comparison Correction
```typescript
// Apply Bonferroni correction when testing multiple metrics
function detectMultiMetricRegressions(
metrics: MetricComparison[],
familyAlpha: number = 0.05
): RegressionResult[] {
const adjustedAlpha = familyAlpha / metrics.length;
return metrics.map(metric => ({
...metric,
adjustedAlpha,
isSignificantCorrected: metric.pValue < adjustedAlpha,
}));
}
```
## Threshold Configuration
Configure regression detection thresholds in `.aiwg/config/performance-thresholds.yaml`:
```yaml
performance_thresholds:
latency:
p50:
warning: 10% # Warn if p50 increases by >10%
critical: 20% # Critical if p50 increases by >20%
p95:
warning: 15%
critical: 25%
p99:
warning: 20%
critical: 30%
throughput:
requests_per_sec:
warning: -10% # Warn if RPS decreases by >10%
critical: -20%
transactions_per_sec:
warning: -10%
critical: -20%
memory:
heap_size:
warning: 20%
critical: 40%
growth_rate:
warning: 10 # MB/hour
critical: 25
resources:
cpu_utilization:
warning: 20%
critical: 40%
io_wait:
warning: 25%
critical: 50%
statistical:
significance_level: 0.05 # p-value threshold
min_sample_size: 100 # Minimum observations
multiple_comparison_correction: bonferroni
```
## Memory Leak Detection
### Heap Snapshot Comparison
```typescript
interface HeapAnalysis {
version: string;
timestamp: string;
totalSize: number;
usedSize: number;
objectCounts: Record<string, number>;
retainedPaths: RetainedPath[];
}
function detectMemoryLeak(
baseline: HeapAnalysis,
current: HeapAnalysis
): MemoryLeakReport {
const growthRate = (current.usedSize - baseline.usedSize) /
(current.timestamp - baseline.timestamp);
const suspectObjects = Object.keys(current.objectCounts)
.filter(type => {
const baseCount = baseline.objectCounts[type] || 0;
const currCount = current.objectCounts[type];
const growth = currCount - baseCount;
return growth > 1000; // >1000 additional objects
})
.map(type => ({
type,
baseline: baseline.objectCounts[type] || 0,
current: current.objectCounts[type],
delta: current.objectCounts[type] - (baseline.objectCounts[type] || 0),
}));
return {
isLeak: growthRate > 10 * 1024 * 1024, // >10MB/hour
growthRate,
suspectObjects,
recommendation: suspectObjects.length > 0
? `Investigate ${suspectObjects[0].type} accumulation`
: 'Run extended profiling session',
};
}
```
### Continuous Memory Monitoring
```bash
#!/bin/bash
# monitor-memory-regression.sh
monitor_memory() {
local duration_minutes=$1
local interval_seconds=$2
local output_file=$3
echo "timestamp,rss,heap_used,heap_total,external" > "$output_file"
for i in $(seq 1 $((duration_minutes * 60 / interval_seconds))); do
local stats=$(curl -s http://localhost:3000/metrics/memory)
echo "$(date +%s),$stats" >> "$output_file"
sleep "$interval_seconds"
done
}
# Run for 30 minutes, sample every 10 seconds
monitor_memory 30 10 memory-profile.csv
# Analyze for leaks
python analyze-memory-trend.py memory-profile.csv
```
## Performance Regression Report Format
### Executive Summary
```markdown
# Performance Regression Report
**Project**: my-api-service
**Analysis Date**: 2024-01-25
**Baseline Version**: v2.3.0
**Current Version**: v2.4.0
**Test Duration**: 10 minutes
**Load Profile**: 100 concurrent users, steady state
## Executive Summary
**Regression Detected**: YES (HIGH severity)
**Metrics Affected**: 5 of 8 metrics show significant degradation
**Statistical Confidence**: 99.5%
**Recommendation**: DO NOT DEPLOY - requires investigation
| Category | Status | Details |
|----------|--------|---------|
| Latency | ⚠️ REGRESSION | p50: +15%, p95: +21%, p99: +22% |
| Throughput | ⚠️ REGRESSION | -16% requests/sec |
| Memory | ⚠️ REGRESSION | +22% heap, potential leak |
| CPU | ✅ PASS | Within acceptable range |
| I/O | ✅ PASS | No significant change |
```
### Detailed Metrics Comparison
```markdown
## Latency Metrics
| Metric | Baseline | Current | Delta | % Change | p-value | Regression? |
|--------|----------|---------|-------|----------|---------|-------------|
| p50 | 45ms | 52ms | +7ms | +15.6% | 0.001 | ⚠️ YES |
| p75 | 78ms | 92ms | +14ms | +17.9% | 0.002 | ⚠️ YES |
| p90 | 105ms | 128ms | +23ms | +21.9% | 0.001 | ⚠️ YES |
| p95 | 120ms | 145ms | +25ms | +20.8% | 0.001 | ⚠️ YES |
| p99 | 180ms | 220ms | +40ms | +22.2% | 0.001 | ⚠️ YES |
| max | 450ms | 580ms | +130ms | +28.9% | 0.023 | ⚠️ YES |
**Statistical Test**: Welch's t-test
**Significance Level**: α = 0.05 (Bonferroni corrected: 0.0083)
**Confidence**: 99.9% that degradation is real
### Latency Distribution
```
Baseline v2.3.0 Current v2.4.0
│ │
600 │ │ *
│ │
500 │ │ *
│ │ *
400 │ │ *
│ │
300 │ │ **
│ │ *
200 │ * │ **
│ ** │ **
100 │ **** │ **
│ ***** │**
0 └───────────────── └─────────────────
0 25 50 75 95 99 0 25 50 75 95 99
Percentile Percentile
```
```
### Throughput Regression
```markdown
## Throughput Metrics
| Metric | Baseline | Current | Delta | % Change | p-value | Regression? |
|--------|----------|---------|-------|----------|---------|-------------|
| Requests/sec | 2500 | 2100 | -400 | -16.0% | 0.003 | ⚠️ YES |
| Total requests | 150000 | 126000 | -24000 | -16.0% | - | - |
| Failed requests | 150 (0.1%) | 1260 (1.0%) | +1110 | +740% | 0.001 | ⚠️ YES |
**Statistical Test**: Mann-Whitney U test
**Confidence**: 99.7% that throughput decreased
### Time Series Comparison
```
Requests per Second Over Time
3000 ┤
│ Baseline ─────
2500 │ ████████████████████████████████████
│
2000 │ Current ─ ─ ─ ─
│ ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
1500 │
│
1000 │
│
500 │
│
0 └────────────────────────────────────────
0 2m 4m 6m 8m 10m
Time
```
```
### Memory Analysis
```markdown
## Memory Metrics
| Metric | Baseline | Current | Delta | % Change | Leak Detected? |
|--------|----------|---------|-------|----------|----------------|
| Heap Used | 256 MB | 312 MB | +56 MB | +21.9% | ⚠️ Suspected |
| RSS | 384 MB | 468 MB | +84 MB | +21.9% | - |
| Growth Rate | 2 MB/hour | 18 MB/hour | +16 MB/hour | +800% | ⚠️ YES |
| GC Frequency | 12/min | 18/min | +6/min | +50% | ⚠️ Pressure |
**Memory Leak Analysis**:
- **Leak Detected**: YES (high confidence)
- **Growth Rate**: 18 MB/hour (threshold: 10 MB/hour)
- **Suspect Objects**: Array (+12500 instances), Closure (+8200 instances)
- **Recommendation**: Take heap snapshot and analyze retained paths
### Heap Growth Over Time
```
Heap Size (MB)
400 ┤ Current ╱╱
│ ╱╱╱╱╱
350 │ ╱╱╱╱╱
│ ╱╱╱╱╱
300 │ ╱╱╱╱╱
│ ╱╱╱╱╱
250 │ ╱╱╱╱╱ Baseline ────
│ ╱╱╱╱╱ ──────────────────────────
200 │
│
150 │
│
100 │
└────────────────────────────────────────
0 10 20 30 40 50 60
Time (minutes)
```
```
### Root Cause Indicators
```markdown
## Root Cause Analysis
**Automated Analysis**:
### Likely Culprits
1. **Synchronous I/O in Request Path** (HIGH confidence)
- Evidence: p99 latency +22%, I/O wait +15%
- Pattern: Blocking operations correlate with latency spikes
- Recommendation: Profile with `clinic doctor` or `0x`
2. **Memory Accumulation** (HIGH confidence)
- Evidence: Linear heap growth at 18 MB/hour
- Pattern: Array and Closure object growth
- Recommendation: Take heap snapshots at t=0 and t=30min, analyze diff
3. **Increased Computation** (MEDIUM confidence)
- Evidence: CPU +12%, throughput -16%
- Pattern: Less throughput despite similar CPU usage
- Recommendation: Profile with `perf` or `clinic flame`
### Recommended Investigation Steps
1. **Git Bisect**: Identify introducing commit
```bash
git bisect start HEAD v2.3.0
git bisect run ./performance-test.sh
```
2. **Heap Snapshot Diff**:
```bash
node --heap-prof app.js # Baseline
# Run load test
node --heap-prof app.js # Current
# Compare snapshots
```
3. **Flame Graph Analysis**:
```bash
clinic flame -- node app.js
# Analyze hot paths
```
```
### Recommendations
```markdown
## Recommendations
### Immediate Actions (DO NOT DEPLOY)
- [ ] **Do not deploy v2.4.0** - Severity is HIGH
- [ ] Run git bisect to identify introducing commit
- [ ] Take heap snapshots and analyze memory accumulation
- [ ] Profile CPU with flame graphs to identify hot paths
### Investigation Priority
| Priority | Action | Owner | ETA |
|----------|--------|-------|-----|
| P0 | Identify regression commit via bisect | Regression Analyst | 2 hours |
| P0 | Analyze heap snapshots for leak | Performance Engineer | 4 hours |
| P1 | Profile CPU for hot paths | Performance Engineer | 4 hours |
| P1 | Review recent I/O changes | Developer | 2 hours |
### Prevention (After Fix)
- [ ] Add performance regression tests to CI
- [ ] Set up continuous memory profiling
- [ ] Enable automated alerts for p99 > 200ms
- [ ] Add load testing to pull request checks
```
## Usage Examples
### Full Performance Comparison
```
User: "Performance regression check between v2.3.0 and v2.4.0"
Skill executes:
1. Checkout v2.3.0, run k6 benchmark → baseline-results.json
2. Checkout v2.4.0, run k6 benchmark → current-results.json
3. Statistical comparison → 5 of 8 metrics regressed
4. Memory profiling → leak detected (+18 MB/hour)
5. Generate report → .aiwg/reports/performance-regression-20240125.md
Output:
"⚠️ PERFORMANCE REGRESSION DETECTED
Severity: HIGH
Confidence: 99.5%
Metrics Affected: Latency (+15-22%), Throughput (-16%), Memory Leak
Report: .aiwg/reports/performance-regression-20240125.md
DO NOT DEPLOY - Investigation required"
```
### Quick Latency Check
```
User: "Check latency regression"
Skill executes:
1. Compare current p99 against baseline
2. Statistical t-test for significance
3. Quick report
Output:
"Latency Regression Detected:
p50: 52ms (baseline: 45ms) → +15.6% ⚠️
p95: 145ms (baseline: 120ms) → +20.8% ⚠️
p99: 220ms (baseline: 180ms) → +22.2% ⚠️
Statistical confidence: 99.9%
Exceeds threshold: YES (>10%)
Recommendation: Investigate before deploying"
```
### Memory Leak Detection
```
User: "Detect memory leak"
Skill executes:
1. Monitor memory for 30 minutes
2. Calculate growth rate
3. Identify accumulating objects
Output:
"⚠️ MEMORY LEAK DETECTED
Growth Rate: 18 MB/hour (threshold: 10 MB/hour)
Suspect Objects:
- Array: +12500 instances
- Closure: +8200 instances
Heap snapshots saved:
- .aiwg/profiling/heap-t0.heapsnapshot
- .aiwg/profiling/heap-t30.heapsnapshot
Recommendation: Analyze with Chrome DevTools Memory Profiler"
```
### Benchmark Comparison
```
User: "Compare performance with main branch"
Skill executes:
1. Stash current changes
2. Benchmark main branch
3. Restore changes
4. Benchmark current branch
5. Statistical comparison
Output:
"Performance Comparison: feature-branch vs main
Latency (p99):
main: 180ms
feature-branch: 185ms
Delta: +2.8% (within threshold ✓)
Throughput:
main: 2500 req/s
feature-branch: 2480 req/s
Delta: -0.8% (within threshold ✓)
Verdict: NO REGRESSION DETECTED
Safe to merge"
```
## Integration
This skill uses:
- `artifact-metadata`: Load benchmark results from previous runs
- `project-awareness`: Detect git tags and version history
- `regression-analyst` agent: For detailed root cause analysis when regression detected
This skill is used by:
- CI/CD pipelines for automated regression gates
- Developers for pre-merge performance validation
- Release managers for go/no-go decisions
## Automated Detection in CI
### GitHub Actions Example
```yaml
# .github/workflows/performance-gate.yml
name: Performance Regression Gate
on:
pull_request:
branches: [main]
jobs:
performance-gate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0 # Full history for baseline
- name: Setup Node.js
uses: actions/setup-node@v3
with:
node-version: '20'
- name: Install dependencies
run: npm ci
- name: Benchmark baseline (main branch)
run: |
git checkout main
npm run build
k6 run --out json=baseline.json benchmark.k6.js
- name: Benchmark current (PR branch)
run: |
git checkout ${{ github.head_ref }}
npm run build
k6 run --out json=current.json benchmark.k6.js
- name: Performance regression analysis
id: regression
run: |
npm run analyze:performance -- \
--baseline baseline.json \
--current current.json \
--threshold 10 \
--output regression-report.md
- name: Comment PR with results
uses: actions/github-script@v6
with:
script: |
const fs = require('fs');
const report = fs.readFileSync('regression-report.md', 'utf8');
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: report
});
- name: Fail if regression detected
if: steps.regression.outputs.regression == 'true'
run: |
echo "Performance regression detected - see report"
exit 1
```
## Output Locations
- Performance reports: `.aiwg/reports/performance-regression-{date}.md`
- Benchmark data: `.aiwg/benchmarks/{version}-{timestamp}.json`
- Heap snapshots: `.aiwg/profiling/heap-{version}-{timestamp}.heapsnapshot`
- Flame graphs: `.aiwg/profiling/flame-{version}-{timestamp}.html`
- Regression register: `.aiwg/testing/regression-register/REG-PERF-{id}.yaml`
## References
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/agents/regression-analyst.md - Regression analysis agent
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/agents/performance-engineer.md - Performance optimization agent
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/schemas/testing/regression.yaml - Regression schema
- @.aiwg/research/findings/REF-013-metagpt.md - Debug memory pattern for performance history
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/rules/executable-feedback.md - Execution validation requirements