Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.
npx versuz@latest install seb155-atlas-plugin-dist-atlas-dev-addon-skills-workflow-benchmarkgit clone https://github.com/seb155/atlas-plugin.gitcp atlas-plugin/SKILL.MD ~/.claude/skills/seb155-atlas-plugin-dist-atlas-dev-addon-skills-workflow-benchmark/SKILL.md---
name: workflow-benchmark
description: "Flow-analytics baseline + comparison + report. For performance benchmarks or feature usage analysis."
mode: [personal, all]
effort: medium
superpowers_pattern: [iron_law, red_flags]
see_also: [flow-analytics, workflow-data-analysis, workflow-cost-tracking]
thinking_mode: adaptive
version: 6.1.0
tier: [core, dev, admin]
category: analytics
emoji: "📈"
triggers: ["benchmark", "performance test", "compare versions", "metrics comparison"]
schema_version: 1
output_schema_ref: ".blueprint/schemas/workflow-step-result-v1.json"
resumable: true
parallelizable_groups: []
estimated_duration_min: 60
persona_tags: [engineer]
requires_hitl: false
workflow_steps:
- step: 1
name: "Baseline capture"
skill: flow-analytics
gate: MANDATORY
purpose: "Capture current state metrics before any change. Multiple runs for variance."
parallelizable: false
depends_on: []
model_preference: sonnet
effort: medium
- step: 2
name: "Apply change / comparison condition"
skill: document-generator
gate: MANDATORY
purpose: "Clear isolated change. 1 variable at a time. Document what changed."
parallelizable: false
depends_on: [1]
model_preference: sonnet
effort: low
- step: 3
name: "Post-change capture"
skill: flow-analytics
gate: MANDATORY
purpose: "Same metrics as baseline. Multiple runs. Statistical significance check."
parallelizable: false
depends_on: [2]
model_preference: sonnet
effort: medium
- step: 4
name: "Delta report"
skill: decision-log
gate: HARD_GATE
purpose: "Baseline vs post, % delta, statistical confidence, recommendation"
parallelizable: false
depends_on: [3]
model_preference: opus
effort: medium
---
<red-flags>
| Thought | Reality |
|---|---|
| "Single run is enough" | Variance exists. Minimum 3 runs, report median + range. |
| "5% improvement is a win" | 5% may be within noise. Check statistical significance. |
| "Change multiple things at once" | No. 1 variable at a time. Multi-change = can't attribute causation. |
</red-flags>
# Workflow: Benchmark
## Success output
```json
{
"workflow": "benchmark",
"status": "completed",
"metric": "latency p50 | throughput | cost | ...",
"baseline": N,
"post_change": M,
"delta_pct": "+X% | -X%",
"statistically_significant": true,
"runs": 5,
"report_path": "memory/bench-*.md"
}
```