test-value-decide

Decide whether to write a test for a class/function and at which tier (T0-T4). This skill should be used when about to write a test, asking 'should I test this?', '/a-test-decide', or reviewing a PR with new tests. Implements TVT — Test Value Tiering framework.

View on GitHub: github.com/seb155/atlas-plugin

§ 02 — Install

Get test-value-decide.

Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.

One-line install · Claude Code

$ npx versuz@latest install seb155-atlas-plugin-skills-test-value-decide

Or clone the repo

$ git clone https://github.com/seb155/atlas-plugin.git

Or copy the SKILL.md manually

$ cp atlas-plugin/SKILL.MD ~/.claude/skills/seb155-atlas-plugin-skills-test-value-decide/SKILL.md

---
name: test-value-decide
description: "Decide whether to write a test for a class/function and at which tier (T0-T4). This skill should be used when about to write a test, asking 'should I test this?', '/a-test-decide', or reviewing a PR with new tests. Implements TVT — Test Value Tiering framework."
mode: [coding, engineering]
effort: low
---

# TVT — Test Value Tiering Decision

Before writing a test, ask: **does this code deserve a test, and at which tier?**

The 5 tiers cover the complete decision space:

| Tier | When | Example | Test count |
|------|------|---------|------------|
| **T0 — Skip** | Pydantic w/o validators, getters/setters, type aliases, auto-gen, <10 LOC re-exports | `class UserResponse(BaseModel): email: str` | 0 |
| **T1 — Unit smoke** | Pure functions, simple algos, helpers <50 LOC | `parse_wbs_code("WBS-01-A")` | 1-3 |
| **T2 — Unit business** | Single-service methods with branches, validators with rules | `recompute_hours()` w/ 4 branches | 5-10 |
| **T3 — Integration** | Multi-service composition, GET handlers, services w/ 2 DI deps | `list_instruments` endpoint | 3-7 |
| **T4 — Smoke + E2E** | Orchestrators ≥3 DI, POST/PUT/DELETE, alembic migrations | `create_instrument` POST | 1-3 smoke + 1-2 E2E |
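
The "Test count" column can be read as a budget per tier. The sketch below is one possible encoding, not part of the classifier: `TEST_BUDGET` and `budget_verdict` are hypothetical names, and the T4 range of 2-5 is an assumption obtained by summing "1-3 smoke + 1-2 E2E".

```python
# Hypothetical encoding of the "Test count" column as (min, max) totals per tier.
TEST_BUDGET = {
    "T0": (0, 0),
    "T1": (1, 3),
    "T2": (5, 10),
    "T3": (3, 7),
    "T4": (2, 5),  # assumption: 1-3 smoke + 1-2 E2E, summed
}


def budget_verdict(tier: str, test_count: int) -> str:
    """Return 'under', 'ok', or 'over' for a test count at the given tier."""
    low, high = TEST_BUDGET[tier]
    if test_count < low:
        return "under"
    if test_count > high:
        return "over"
    return "ok"
```

For example, `budget_verdict("T0", 2)` reports `"over"`: any test on a T0 file is more than the table asks for.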

## Decision flowchart

```
1. Is the file < 10 LOC and only re-exports / type aliases?  → T0
2. Is the file auto-generated (`# generated by`)?            → T0
3. Touches alembic/versions/ OR @router.(post|put|delete)?  → T4 (mandatory)
4. ≥3 DI deps in __init__ OR file matches *Orchestrator*?    → T4 (mandatory)
5. Pydantic schema with NO validators?                       → T0
6. Pure function (no I/O, no side effects)?                  → T1
7. Single-service method with branches OR validator?         → T2
8. Multi-service composition OR FastAPI GET handler?         → T3
9. Default                                                   → T2
```
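
The flowchart is an ordered rule list where the first match wins. A minimal sketch of that ordering follows. It is not the code in `scripts/tvt-classify.py`: `FileFacts`, its fields, and `decide_tier` are hypothetical, and the harder judgments (purity, branching, multi-service composition) are assumed to be supplied by the caller rather than detected.

```python
import re
from dataclasses import dataclass


@dataclass
class FileFacts:
    """Hypothetical, pre-computed facts about one source file."""
    path: str
    source: str
    loc: int
    only_reexports: bool = False      # only re-exports / type aliases
    di_dep_count: int = 0             # DI deps in __init__
    is_pydantic_schema: bool = False
    has_validators: bool = False
    is_pure_function: bool = False    # no I/O, no side effects
    has_branches: bool = False        # single-service method with branches
    is_multi_service: bool = False


MUTATING_ROUTE = re.compile(r"@router\.(post|put|delete)\b")
GET_ROUTE = re.compile(r"@router\.get\b")


def decide_tier(f: FileFacts) -> str:
    """Apply rules 1-9 in order; the first matching rule decides the tier."""
    if f.loc < 10 and f.only_reexports:                                   # rule 1
        return "T0"
    if "# generated by" in f.source.lower():                              # rule 2
        return "T0"
    if "alembic/versions/" in f.path or MUTATING_ROUTE.search(f.source):  # rule 3
        return "T4"
    if f.di_dep_count >= 3 or "Orchestrator" in f.path:                   # rule 4
        return "T4"
    if f.is_pydantic_schema and not f.has_validators:                     # rule 5
        return "T0"
    if f.is_pure_function:                                                # rule 6
        return "T1"
    if f.has_branches or f.has_validators:                                # rule 7
        return "T2"
    if f.is_multi_service or GET_ROUTE.search(f.source):                  # rule 8
        return "T3"
    return "T2"                                                           # rule 9
```

Because the mandatory T4 checks (rules 3 and 4) run before the Pydantic and pure-function checks, a file that mixes a schema with a POST handler still lands on T4.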

## Promotion rules (escalate +1 tier when ANY apply)

- High fan-in (file imported by ≥5 modules)
- Production incident reference (`# bug-2026-XX-XX`)
- Money path (estimate/cost/rate/price)
- HITL gate (irreversible action)
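
A sketch of how the promotion step could be applied on top of a base tier. All names here are hypothetical. Two assumptions go beyond the text: the tier is capped at T4, and several triggers firing at once still escalate by only one tier (reading "+1 tier when ANY apply" literally). The incident-marker regex assumes the `# bug-YYYY-MM-DD` shape shown above.

```python
import re

TIERS = ["T0", "T1", "T2", "T3", "T4"]

# Assumed marker format, modelled on the `# bug-2026-XX-XX` example.
INCIDENT_REF = re.compile(r"#\s*bug-\d{4}-\d{2}-\d{2}")


def has_incident_reference(source: str) -> bool:
    """True if the source carries a production-incident marker comment."""
    return INCIDENT_REF.search(source) is not None


def promote(base_tier: str, *, fan_in: int = 0, has_incident_ref: bool = False,
            on_money_path: bool = False, is_hitl_gate: bool = False) -> str:
    """Escalate by one tier if any promotion rule applies (capped at T4)."""
    triggered = fan_in >= 5 or has_incident_ref or on_money_path or is_hitl_gate
    if not triggered:
        return base_tier
    index = TIERS.index(base_tier)
    return TIERS[min(index + 1, len(TIERS) - 1)]
```

Under these assumptions, a T1 helper imported by six modules becomes T2, and a T4 file stays at T4 regardless of how many rules fire.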

## How to invoke (project-specific)

In a Synapse codebase context:

```bash
# Classify a single file
python3 scripts/tvt-classify.py backend/app/services/foo.py

# Classify a directory
python3 scripts/tvt-classify.py --batch backend/app/services/

# Generate baseline report
python3 scripts/tvt-classify.py --report > memory/tvt-baseline.md
```

The lefthook `tvt-hint` step runs automatically on staged backend/app/*.py files
during pre-commit and prints the recommended tier as advisory output.

## When to override

The classifier is **advisory**. Override it with engineering judgment when:
- Domain expertise tells you a "T1" function is load-bearing
- Past incident shows tests at this layer would have caught the bug
- Coverage tool flags a critical-path uncovered branch

**Log overrides** in `.claude/decisions.jsonl` with rationale + date. Having the
classifier learn from these entries is a Phase 5b stretch goal.
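
One way to append such an entry, as a sketch. Only the file path, the rationale, and the date come from the text above. The function name `log_override` and the field names (`file`, `suggested_tier`, `chosen_tier`) are assumptions, not a documented schema.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional


def log_override(log_path: Path, file: str, suggested_tier: str,
                 chosen_tier: str, rationale: str,
                 when: Optional[date] = None) -> dict:
    """Append one override record as a single JSON line and return it."""
    entry = {
        "date": (when or date.today()).isoformat(),
        "file": file,
        "suggested_tier": suggested_tier,  # hypothetical field name
        "chosen_tier": chosen_tier,        # hypothetical field name
        "rationale": rationale,
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # JSONL: one object per line, opened in append mode so history is kept.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry
```

A call such as `log_override(Path(".claude/decisions.jsonl"), "backend/app/services/foo.py", "T1", "T2", "load-bearing helper")` adds one line and leaves earlier entries untouched.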

## Anti-patterns (you'll know it when you see it)

1. Tests that mock every dep (testing the mock framework, not the code)
2. Tests that assert implementation (`mock.assert_called_with(...)` for every internal call)
3. Tests that merely restate production code (asserting `User(email="x").email == "x"` right after constructing it)
4. Tests that depend on test ordering (module-level fixtures with mutable state)
5. Tests that print "OK" but don't assert anything

If you wrote one of these, you're at the wrong tier. Re-check the flowchart.
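
To make anti-pattern 5 concrete, here is a sketch contrasting it with T1-style tests that assert behaviour. `parse_wbs_code` below is a hypothetical stand-in written for this example. The real function's behaviour is not described here, so the return shape is an assumption.

```python
def parse_wbs_code(code: str) -> tuple[str, str]:
    """Hypothetical stand-in: split 'WBS-01-A' into ('01', 'A')."""
    prefix, number, suffix = code.split("-")
    if prefix != "WBS":
        raise ValueError(f"unexpected prefix: {prefix!r}")
    return number, suffix


def test_parse_prints_ok():
    # Anti-pattern 5: runs the code and prints, but asserts nothing,
    # so it stays green even if the return value is wrong.
    parse_wbs_code("WBS-01-A")
    print("OK")


def test_parse_returns_parts():
    # T1 smoke: asserts the observable result of the pure function.
    assert parse_wbs_code("WBS-01-A") == ("01", "A")


def test_parse_rejects_unknown_prefix():
    # T1 smoke: asserts the error path without relying on a test framework.
    try:
        parse_wbs_code("XYZ-01-A")
    except ValueError:
        return
    raise AssertionError("expected ValueError for a non-WBS prefix")
```

The first test would keep passing if `parse_wbs_code` started returning the parts in the wrong order. The second would fail immediately.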

## Default = T2

When unsure, T2. Don't under-test (T1 is too low: false confidence). Don't
over-test (T3 is too high: slow CI for what could be 5 unit cases).

## Companion skills + rules

- **testing-funnel.md** — G0-G4 pyramid (WHERE to put a test)
- **testing-mock-budget.md** — 5 rules (HOW NOT to mock)
- **CATO** (cato-orchestration.md) — runtime sibling (ROUTE existing tests)
- **audit-enforcement-protocol.md** — 8-layer scoring (TVT inherits this structure)
- **tdd** skill — strict red-green cycle (orthogonal: TDD = how to develop, TVT = whether to develop a test)

## Why this exists

Two incidents motivate the framework:

1. **2026-04-14 — Skeleton deletion**: 380 auto-generated unit tests removed
   (71% of `backend/tests/unit/`). Each one was just `test_module_imports()` —
   redundant with app startup smoke. They added maintenance overhead with zero
   logic coverage.

2. **2026-04-16 — Persona-bug**: 49 unit tests with `MagicMock(spec=User)`
   passed green. One real curl caught the attribute drift in 10s. Mocked tests
   gave false confidence; smoke tests would have caught it earlier.

TVT codifies the decision so future commits don't re-incur these costs.