---
name: test-value-decide
description: "Decide whether to write a test for a class/function and at which tier (T0-T4). This skill should be used when about to write a test, asking 'should I test this?', '/a-test-decide', or reviewing a PR with new tests. Implements TVT — Test Value Tiering framework."
mode: [coding, engineering]
effort: low
---
# TVT — Test Value Tiering Decision
Before writing a test, ask: **does this code deserve a test, and at which tier?**
The 5 tiers cover the complete decision space:
| Tier | When | Example | Test count |
|------|------|---------|------------|
| **T0 — Skip** | Pydantic w/o validators, getters/setters, type aliases, auto-gen, <10 LOC re-exports | `class UserResponse(BaseModel): email: str` | 0 |
| **T1 — Unit smoke** | Pure functions, simple algos, helpers <50 LOC | `parse_wbs_code("WBS-01-A")` | 1-3 |
| **T2 — Unit business** | Single-service methods with branches, validators with rules | `recompute_hours()` w/ 4 branches | 5-10 |
| **T3 — Integration** | Multi-service composition, GET handlers, services w/ 2 DI deps | `list_instruments` endpoint | 3-7 |
| **T4 — Smoke + E2E** | Orchestrators ≥3 DI, POST/PUT/DELETE, alembic migrations | `create_instrument` POST | 1-3 smoke + 1-2 E2E |
## Decision flowchart
```
1. Is the file < 10 LOC and only re-exports / type aliases? → T0
2. Is the file auto-generated (`# generated by`)? → T0
3. Touches alembic/versions/ OR @router.(post|put|delete)? → T4 (mandatory)
4. ≥3 DI deps in __init__ OR file matches *Orchestrator*? → T4 (mandatory)
5. Pydantic schema with NO validators? → T0
6. Pure function (no I/O, no side effects)? → T1
7. Single-service method with branches OR validator? → T2
8. Multi-service composition OR FastAPI GET handler? → T3
9. Default → T2
```
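The flowchart is an ordered list of checks where the first match wins, so the mandatory T4 rules (steps 3 and 4) take precedence over the later T0-T3 rules. A minimal Python sketch of that ordering follows. It is not the implementation of `scripts/tvt-classify.py`; the `FileFacts` dataclass and its field names are hypothetical, and the sketch assumes each fact has already been extracted from the file.

```python
from dataclasses import dataclass


@dataclass
class FileFacts:
    """Hypothetical pre-computed facts about one source file."""
    loc: int = 0
    only_reexports: bool = False          # only re-exports / type aliases
    auto_generated: bool = False          # contains a "# generated by" marker
    touches_migrations: bool = False      # lives under alembic/versions/
    has_write_route: bool = False         # @router.post / put / delete
    di_dep_count: int = 0                 # dependencies injected in __init__
    is_orchestrator: bool = False         # name matches *Orchestrator*
    schema_without_validators: bool = False
    pure_function: bool = False           # no I/O, no side effects
    branching_or_validator: bool = False  # single-service branches or validator
    multi_service_or_get: bool = False    # composition or FastAPI GET handler


def decide_tier(f: FileFacts) -> str:
    """Walk the nine flowchart steps in order; the first match wins."""
    if f.loc < 10 and f.only_reexports:            # step 1
        return "T0"
    if f.auto_generated:                           # step 2
        return "T0"
    if f.touches_migrations or f.has_write_route:  # step 3 (mandatory)
        return "T4"
    if f.di_dep_count >= 3 or f.is_orchestrator:   # step 4 (mandatory)
        return "T4"
    if f.schema_without_validators:                # step 5
        return "T0"
    if f.pure_function:                            # step 6
        return "T1"
    if f.branching_or_validator:                   # step 7
        return "T2"
    if f.multi_service_or_get:                     # step 8
        return "T3"
    return "T2"                                    # step 9: default
```

Because the checks run top to bottom, a pure helper that lives in a file with a POST route is still classified T4.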
## Promotion rules (escalate +1 tier when ANY apply)
- High fan-in (file imported by ≥5 modules)
- Production incident reference (`# bug-2026-XX-XX`)
- Money path (estimate/cost/rate/price)
- HITL gate (irreversible action)
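As a rough illustration, the promotion rule can be written as a single step up the tier list. The function and parameter names below are hypothetical. The sketch also makes two assumptions the rules do not state: several matching triggers still add only one tier, and promotion is capped at T4.

```python
TIERS = ["T0", "T1", "T2", "T3", "T4"]


def promote(
    tier: str,
    *,
    high_fan_in: bool = False,   # file imported by >= 5 modules
    incident_ref: bool = False,  # production incident reference in the code
    money_path: bool = False,    # estimate / cost / rate / price
    hitl_gate: bool = False,     # irreversible action
) -> str:
    """Escalate by exactly one tier if ANY promotion rule applies."""
    if not any((high_fan_in, incident_ref, money_path, hitl_gate)):
        return tier
    index = TIERS.index(tier)
    # Assumption: T4 is the ceiling, so promoting T4 leaves it unchanged.
    return TIERS[min(index + 1, len(TIERS) - 1)]
```

For example, a T1 helper on a money path becomes T2, while a T4 orchestrator stays at T4.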
## How to invoke (project-specific)
In a Synapse codebase context:
```bash
# Classify a single file
python3 scripts/tvt-classify.py backend/app/services/foo.py
# Classify a directory
python3 scripts/tvt-classify.py --batch backend/app/services/
# Generate baseline report
python3 scripts/tvt-classify.py --report > memory/tvt-baseline.md
```
The lefthook `tvt-hint` step runs automatically on staged `backend/app/*.py` files
during pre-commit and prints the recommended tier as advisory output.
## When to override
The classifier is **advisory**. Override it with engineering judgment when:
- Domain expertise tells you a "T1" function is load-bearing
- Past incident shows tests at this layer would have caught the bug
- Coverage tool flags a critical-path uncovered branch
**Log overrides** in `.claude/decisions.jsonl` with rationale + date. Having the
classifier learn from these entries is a Phase 5b stretch goal.
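One way to record an override is to append a JSON object per line to the log. The sketch below is illustrative only: the source specifies just the file path and that each entry carries a rationale and a date, so the function name `log_override` and the record fields (`file`, `classifier_tier`, `chosen_tier`) are assumptions.

```python
import json
from datetime import date
from pathlib import Path


def log_override(path, file, classifier_tier, chosen_tier, rationale, today=None):
    """Append one override record as a single JSON line and return it."""
    record = {
        "date": (today or date.today()).isoformat(),
        "file": file,                        # hypothetical field name
        "classifier_tier": classifier_tier,  # hypothetical field name
        "chosen_tier": chosen_tier,          # hypothetical field name
        "rationale": rationale,
    }
    log_path = Path(path)
    # Create the parent directory on first use.
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier decisions intact.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

A call might look like `log_override(".claude/decisions.jsonl", "backend/app/services/foo.py", "T1", "T2", "helper is load-bearing")`.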
## Anti-patterns (you'll know it when you see it)
1. Tests that mock every dep (testing the mock framework, not the code)
2. Tests that assert implementation (`mock.assert_called_with(...)` for every internal call)
3. Tests that merely restate production code (assert `User(email="x").email == "x"` right after defining `User`)
4. Tests that depend on test ordering (module-level fixtures with mutable state)
5. Tests that print "OK" but don't assert anything
If you wrote one of these, you're at the wrong tier. Re-check the flowchart.
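To make anti-patterns 1 and 2 concrete, the sketch below contrasts a test that only checks an internal call on a mocked dependency with one that asserts the observable result. `RateTable` and `estimate_cost` are hypothetical names invented for this illustration and do not come from any real codebase.

```python
from unittest.mock import MagicMock


class RateTable:
    """Hypothetical dependency: looks up an hourly rate by role."""

    def rate_for(self, role: str) -> float:
        return {"engineer": 120.0, "technician": 80.0}[role]


def estimate_cost(rates: RateTable, role: str, hours: float) -> float:
    """Hypothetical production function under test."""
    return rates.rate_for(role) * hours


def test_asserts_implementation():
    # Anti-patterns 1 and 2: the dependency is mocked and the test only
    # verifies that an internal call happened. The result is never checked,
    # so a wrong formula in estimate_cost would still pass.
    rates = MagicMock(spec=RateTable)
    rates.rate_for.return_value = 120.0
    estimate_cost(rates, "engineer", 2.0)
    rates.rate_for.assert_called_with("engineer")


def test_asserts_behavior():
    # Preferred: use the real collaborator and assert the observable result.
    assert estimate_cost(RateTable(), "engineer", 2.0) == 240.0
```

Both tests pass today, but only the second one fails if `estimate_cost` starts returning the wrong number.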
## Default = T2
When unsure, T2. Don't under-test (T1 is too low: false confidence). Don't
over-test (T3 is too high: slow CI for what could be 5 unit cases).
## Companion skills + rules
- **testing-funnel.md** — G0-G4 pyramid (WHERE to put a test)
- **testing-mock-budget.md** — 5 rules (HOW NOT to mock)
- **CATO** (cato-orchestration.md) — runtime sibling (ROUTE existing tests)
- **audit-enforcement-protocol.md** — 8-layer scoring (TVT inherits this structure)
- **tdd** skill — strict red-green cycle (orthogonal: TDD = how to develop, TVT = whether to develop a test)
## Why this exists
Two incidents motivate the framework:
1. **2026-04-14 — Skeleton deletion**: 380 auto-generated unit tests removed
(71% of `backend/tests/unit/`). Each one was just `test_module_imports()` —
redundant with app startup smoke. They added maintenance overhead with zero
logic coverage.
2. **2026-04-16 — Persona-bug**: 49 unit tests with `MagicMock(spec=User)`
passed green. A single real `curl` request caught the attribute drift in 10 s. The mocked tests
gave false confidence; smoke tests would have caught it earlier.
TVT codifies the decision so future commits don't re-incur these costs.