---
name: skill-discovery-loop
description: "Voyager-style skill auto-discovery loop: idle-curiosity propose gap → skill-creator draft → adversary test → canary deploy → eval gate → promote OR rollback. HITL gate first 3 auto-skills."
mode: [personal, all]
effort: high
version: 1.0.0
tier: [admin]
---

# Skill Discovery Loop — Voyager Autonomy Pattern

Closed-loop skill auto-discovery inspired by **NVIDIA Voyager** (Wang et al.,
2023): an autonomous agent proposes new abilities, drafts them, stress-tests
them, deploys them progressively, and promotes or discards them based on
empirical evidence — all without a human in the inner loop.

This is the v7.0 W4.1 capstone of the autonomy stack. It does **not**
introduce new logic: it **composes** existing skills (one per stage for
Stages 1-5, plus `forgejo-pr` and `decision-log` for Stage 6) into a single
self-perpetuating workflow that grows the ATLAS skill library over time.

> **HITL gate (W4.1 mandatory)**: the **first three auto-created skills**
> require explicit Seb sign-off at the promotion stage. After three
> successful auto-skills land in `main` and survive 30 days without rollback,
> the loop transitions to **fully autonomous** — Seb is notified async via
> Routines, and only intervenes on explicit failure escalations.

## When to invoke

- `/atlas skill-discovery-loop --once` — manual single iteration
- `/atlas skill-discovery-loop --status` — list recent auto-skills + state
- Daily cron via Anthropic Routines (target cadence: ≥1 auto-skill/month
  at steady state, Q4 2026)
- After idle-curiosity surfaces a gap with `confidence ≥ 0.7`
- When the user's request mentions "grow the skill library", "Voyager loop",
  "auto-discover skills", or "self-improve ATLAS"

## Workflow — The Voyager Loop (6 stages)

```
┌──────────────────────────────────────────────────────────────────────┐
│  Stage 1: PROPOSE                  (idle-curiosity)                  │
│   └─ Detect skill gap from session telemetry                         │
│   └─ Output: gap-proposal.json {topic, evidence, confidence}         │
│                  │                                                   │
│                  ▼                                                   │
│  Stage 2: DRAFT                    (skill-creator)                   │
│   └─ Generate SKILL.md frontmatter + body from proposal              │
│   └─ Output: skills/<auto-name>/SKILL.md (in draft/ branch)          │
│                  │                                                   │
│                  ▼                                                   │
│  Stage 3: STRESS-TEST              (W3.3 skill-adversary)            │
│   └─ Apply 8 attack patterns: prompt injection, edge inputs,         │
│      ambiguous triggers, scope creep, hallucination bait, etc.       │
│   └─ Gate: ≥6/8 attacks survived → continue, else → discard          │
│                  │                                                   │
│                  ▼                                                   │
│  Stage 4: CANARY DEPLOY            (W3.2 skill-canary-deployer)      │
│   └─ Mirror 10% session traffic for 50-invocation window             │
│   └─ Watch error rate via skill-scorecard JSONL telemetry            │
│   └─ Auto-rollback if error rate >2%                                 │
│                  │                                                   │
│                  ▼                                                   │
│  Stage 5: EVAL GATE                (W3.1 skill-regression-test)      │
│   └─ Run golden eval suite (LLM-as-judge + deterministic asserts)    │
│   └─ Gate: pass-rate ≥ baseline AND no regression on existing skills │
│                  │                                                   │
│                  ▼                                                   │
│  Stage 6: PROMOTE OR ROLLBACK                                        │
│   └─ HITL gate (first 3 auto-skills): block at PR for Seb sign-off   │
│   └─ Auto-promote (after 3 successful): merge to main + tag          │
│   └─ Rollback: archive draft + log decision in decisions.jsonl       │
└──────────────────────────────────────────────────────────────────────┘
```
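
The pass/fail decisions in Stages 3-5 can be sketched as a single function. This is a minimal illustration under stated assumptions, not the loop's implementation: `Gates` and `evaluate_gates` are hypothetical names, the thresholds are copied from the diagram, and the "no regression on existing skills" check is left out because it lives inside `skill-regression-test`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gates:
    """Thresholds copied from the diagram; the class itself is hypothetical."""
    adversary_min_pass: int = 6            # of 8 attack patterns
    canary_error_budget_pct: float = 2.0   # rollback when the rate exceeds this
    regression_min_baseline_ratio: float = 1.0


def evaluate_gates(attacks_survived: int, canary_error_pct: float,
                   pass_rate: float, baseline_pass_rate: float,
                   gates: Gates = Gates()) -> str:
    """Return the name of the first failing stage, or "promote" if all pass."""
    if attacks_survived < gates.adversary_min_pass:
        return "adversary"
    if canary_error_pct > gates.canary_error_budget_pct:
        return "canary"
    if pass_rate < baseline_pass_rate * gates.regression_min_baseline_ratio:
        return "regression"
    return "promote"


print(evaluate_gates(7, 0.4, 1.0, 1.0))   # promote
print(evaluate_gates(6, 1.2, 0.9, 1.0))   # regression
```

Returning the first failing stage mirrors the diagram's top-to-bottom order: a draft that fails the adversary gate never reaches canary traffic.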

## Reuse pointers (no new logic — pure orchestration)

| Stage | Composed skill                         | Source |
|-------|----------------------------------------|--------|
| 1     | `idle-curiosity`                       | `skills/idle-curiosity/SKILL.md` |
| 2     | `skill-creator` (Anthropic core)       | `skills/skill-creator/SKILL.md` (or `skill-management` for ATLAS-specific scaffold) |
| 3     | `skill-adversary` (W3.3)               | `skills/skill-adversary/SKILL.md` |
| 4     | `skill-canary-deployer` (W3.2)         | `skills/skill-canary-deployer/SKILL.md` |
| 5     | `skill-regression-test` (W3.1)         | `skills/skill-regression-test/SKILL.md` |
| 6     | `forgejo-pr` + `decision-log`          | `skills/forgejo-pr/SKILL.md`, `skills/decision-log/SKILL.md` |

**Zero net-new code paths** — this skill is a workflow contract. All side
effects flow through composed children's existing telemetry, hooks, and
deployment mechanisms.

## CLI surface

```bash
# Manual single iteration (synchronous)
atlas skill-discovery-loop --once
  # → runs Stages 1-6 in sequence; blocks at the HITL gate for the first 3 auto-skills

# Status / observability
atlas skill-discovery-loop --status
  # → prints table:
  #   | auto-name           | stage       | created    | result     |
  #   |---------------------|-------------|------------|------------|
  #   | auto-grafana-tuner  | promoted    | 2026-04-12 | live       |
  #   | auto-pg-vacuum-tip  | rolled-back | 2026-04-19 | regression |
  #   | auto-traefik-debug  | canary      | 2026-04-28 | watching   |

# Specific stage isolation (dev/debug)
atlas skill-discovery-loop --propose-only       # Stage 1 only
atlas skill-discovery-loop --resume <auto-name> # restart from last stage
atlas skill-discovery-loop --abandon <auto-name> # archive draft + log

# Routines integration (cloud cron)
atlas skill-discovery-loop --routine-create
  # → registers Anthropic Routine: daily at 03:00 EDT, --once
```

## Configuration

`~/.atlas/skill-discovery-loop.yaml` (created on first run):

```yaml
hitl:
  required_signoffs: 3              # promote auto-skills 1-3 with Seb approval
  signoffs_completed: 0             # incremented only after main-merge + 30d
gates:
  adversary_min_pass: 6             # of 8 attacks
  canary_error_budget_pct: 2.0
  canary_window_invocations: 50
  regression_min_baseline_ratio: 1.0
cadence:
  proposals_per_day_max: 1
  steady_state_target: "1 skill/month"
naming:
  prefix: "auto-"                   # all auto-created skills use auto-* prefix
  reserved_words: [atlas, core, admin, dev]  # forbidden in auto-names
```
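
The `naming` rules can be checked with a few lines of Python. This is a sketch only: `is_valid_auto_name` is a hypothetical helper, and two details are assumptions not stated in the config, namely that names are lowercase kebab-case (as in the examples in this document) and that a reserved word is forbidden only when it appears as a whole hyphen-separated segment.

```python
import re

PREFIX = "auto-"                                   # naming.prefix
RESERVED_WORDS = {"atlas", "core", "admin", "dev"}  # naming.reserved_words


def is_valid_auto_name(name: str) -> bool:
    """Check the prefix rule and reject reserved words used as segments."""
    if not name.startswith(PREFIX):
        return False
    # Assumption: lowercase kebab-case, no empty segments.
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name):
        return False
    segments = name[len(PREFIX):].split("-")
    return not any(segment in RESERVED_WORDS for segment in segments)


print(is_valid_auto_name("auto-grafana-tuner"))   # True
print(is_valid_auto_name("auto-atlas-helper"))    # False
```

Matching on whole segments keeps names such as `auto-developer-notes` legal; a substring match would be stricter and would reject them.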

## HITL gate semantics (CRITICAL)

The first 3 auto-created skills MUST satisfy ALL of:

1. PR opened against `main` with label `auto-skill-hitl-required`
2. Seb posts approving review with the exact phrase `auto-skill: APPROVE`
3. PR description contains the full Stage 1-5 audit trail (proposal,
   adversary report, canary metrics, regression diff)
4. Skill survives 30 days post-merge with zero rollbacks

Only when all 4 conditions hold for 3 distinct auto-skills does
`signoffs_completed` reach 3 and the loop transition to autonomous mode.

**Escape valve**: any auto-skill can be force-rolled-back via
`atlas skill-discovery-loop --abandon <auto-name>` regardless of stage.
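
The four conditions above can be expressed as one predicate. The sketch below is illustrative: `PullRequest` is a hypothetical record (not a Forgejo schema), and detecting the audit trail by looking for the four section names in the PR description is an assumption made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

HITL_LABEL = "auto-skill-hitl-required"
APPROVAL_PHRASE = "auto-skill: APPROVE"
# Assumption: the audit trail is recognised by these section names.
AUDIT_SECTIONS = ("proposal", "adversary report", "canary metrics", "regression diff")
SURVIVAL_DAYS = 30


@dataclass
class PullRequest:
    """Hypothetical record; field names are illustrative only."""
    labels: List[str]
    approving_reviews: List[str]   # bodies of Seb's approving reviews
    description: str
    merged_on: Optional[date]
    rollbacks: int


def counts_as_signoff(pr: PullRequest, today: date) -> bool:
    """True only when all four HITL conditions hold for this PR."""
    has_label = HITL_LABEL in pr.labels
    approved = any(APPROVAL_PHRASE in body for body in pr.approving_reviews)
    has_audit = all(s in pr.description.lower() for s in AUDIT_SECTIONS)
    survived = (pr.merged_on is not None
                and today - pr.merged_on >= timedelta(days=SURVIVAL_DAYS)
                and pr.rollbacks == 0)
    return has_label and approved and has_audit and survived
```

Under this reading, `signoffs_completed` would be the number of distinct auto-skills whose PR satisfies `counts_as_signoff`.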

## Telemetry & observability

Each loop iteration appends one JSONL line to
`~/.atlas/skill-discovery-loop.jsonl`:

```json
{"iteration": 24, "auto_name": "auto-pg-vacuum-tip", "stage_reached": "regression",
 "stage_results": {"propose": "ok", "draft": "ok", "adversary": "6/8",
                   "canary": "1.2% err", "regression": "fail"},
 "outcome": "rolled-back", "duration_s": 487,
 "ts": "2026-04-19T03:14:02-04:00"}
```

The `--status` subcommand renders this ledger plus a 90-day moving average
of "skills proposed → skills promoted" funnel conversion. SLO target:
**≥10% propose-to-promote conversion** in steady state.
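
A rough sketch of the conversion figure follows. It makes several assumptions: each ledger line is one complete JSON object, every line counts as one proposal, a promoted iteration records `"outcome": "promoted"` (the document only shows `"rolled-back"`), and every `ts` carries a UTC offset. It computes a single 90-day window ratio, not a moving average, and `funnel_conversion` is a hypothetical name.

```python
import json
from datetime import datetime, timedelta, timezone


def funnel_conversion(jsonl_text: str, now: datetime, window_days: int = 90) -> float:
    """Fraction of iterations inside the window whose outcome is "promoted"."""
    cutoff = now - timedelta(days=window_days)
    proposed = promoted = 0
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        # Accept a trailing "Z" on Python versions that predate 3.11.
        ts = datetime.fromisoformat(record["ts"].replace("Z", "+00:00"))
        if ts < cutoff:
            continue
        proposed += 1
        if record["outcome"] == "promoted":
            promoted += 1
    return promoted / proposed if proposed else 0.0


ledger = "\n".join([
    '{"outcome": "promoted", "ts": "2026-04-12T07:10:00Z"}',
    '{"outcome": "rolled-back", "ts": "2026-04-19T07:14:02Z"}',
])
print(funnel_conversion(ledger, datetime(2026, 5, 1, tzinfo=timezone.utc)))  # 0.5
```

A result of `0.10` or higher would meet the SLO stated above.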

## Rationale (Voyager → ATLAS)

Voyager grew Minecraft skills via environment feedback (block-world physics).
ATLAS grows engineering skills via session telemetry (success/error rates,
user reactions, reuse counts). The composition pattern preserves
Voyager's three invariants:

1. **Open-ended exploration** — proposals are not constrained to a fixed
   taxonomy; idle-curiosity surfaces whatever gap is empirically real.
2. **Iterative refinement** — failed adversary or regression rounds feed
   back into the proposal corpus (negative training signal).
3. **Skill library compounds** — each promoted auto-skill becomes a
   primitive callable by future proposals (composition multiplier).

## Failure modes & recovery

| Failure                                | Detection                       | Recovery                                          |
|----------------------------------------|---------------------------------|---------------------------------------------------|
| idle-curiosity returns no proposals    | Stage 1 stdout empty            | Skip iteration, log `no-proposal`, retry next day |
| skill-creator drafts malformed YAML    | Frontmatter parse fails         | Abandon draft, log `draft-malformed`              |
| skill-adversary <6/8 pass              | adversary report `pass < 6`     | Abandon draft, feed report into `decisions.jsonl` |
| Canary error >2% over 50 invocations   | scorecard JSONL rolling window  | Auto-rollback via skill-canary-deployer           |
| Regression eval fails baseline         | regression-test exit ≠ 0        | Abandon draft, log diff for next iteration        |
| HITL signoff timeout (>14d, first 3)   | PR labeled `auto-skill-stale`   | Auto-close PR, archive draft, retry next month    |
| forgejo-pr open fails (rate limit, 5xx)| exit code ≠ 0                   | Backoff 1h, retry once, then escalate to Seb      |

All failures are **non-fatal to the loop itself** — the next scheduled
iteration runs as if the failed attempt never happened, except that the
failed proposal's topic is added to a 30-day cooldown to avoid re-proposing
the same gap immediately.
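
The 30-day cooldown can be sketched as a lookup over past failures. This is an illustration under assumptions: the `failures` mapping (topic to date of the most recent failed attempt) and both function names are hypothetical, and topics are compared as exact strings, whereas a real proposer would likely need some normalization.

```python
from datetime import date, timedelta
from typing import Dict, Set

COOLDOWN_DAYS = 30


def topics_on_cooldown(failures: Dict[str, date], today: date) -> Set[str]:
    """Topics whose most recent failure is less than COOLDOWN_DAYS old."""
    return {topic for topic, failed_on in failures.items()
            if today - failed_on < timedelta(days=COOLDOWN_DAYS)}


def can_propose(topic: str, failures: Dict[str, date], today: date) -> bool:
    """A topic may be proposed again once its cooldown has elapsed."""
    return topic not in topics_on_cooldown(failures, today)


failures = {"pg vacuum tip": date(2026, 4, 19)}
print(can_propose("pg vacuum tip", failures, date(2026, 5, 18)))  # False (29 days)
print(can_propose("pg vacuum tip", failures, date(2026, 5, 19)))  # True (30 days)
```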

## Worked example (Stage 1-6 audit trail)

```text
# Iteration #17, 2026-04-12 03:00 EDT
Stage 1 PROPOSE   → idle-curiosity surfaced gap: "Grafana dashboard tuner"
                    confidence=0.82, evidence=12 sessions debugging panels manually
Stage 2 DRAFT     → skill-creator generated skills/auto-grafana-tuner/SKILL.md
                    (frontmatter valid, body 287 lines)
Stage 3 ADVERSARY → 7/8 attacks survived (failed: prompt-injection in panel JSON)
                    → continue (≥6/8 threshold met)
Stage 4 CANARY    → 10% mirror, 50 invocations, error rate 0.4%
                    → continue (under 2% budget)
Stage 5 REGRESS   → golden eval: 14/14 pass, no regression on 73 existing skills
                    → continue
Stage 6 PROMOTE   → HITL gate active (signoffs_completed=0/3)
                    → PR opened, awaiting `auto-skill: APPROVE` from Seb
```

After Seb approves and the skill survives 30 days in `main`,
`signoffs_completed` increments to 1. After 3 such cycles, the gate flips to
autonomous and Seb receives Routines digest emails instead of PR review pings.

## Anti-patterns (what this loop will NOT do)

- ❌ Propose skills outside the admin tier without explicit Seb invocation
- ❌ Skip any of the 5 gates "because the proposal looks safe"
- ❌ Auto-promote during the first-3-skill HITL window
- ❌ Run more than 1 proposal/day (avoid skill-spam)
- ❌ Create skills with names not prefixed `auto-`
- ❌ Modify existing skills (loop only CREATES; modifications go through
  normal `atlas-dev-self` workflow)

## References

- Plan section H W4.1 — Voyager autonomy pattern
- W3.1 `skill-regression-test/SKILL.md` — eval gate
- W3.2 `skill-canary-deployer/SKILL.md` — gradual rollout
- W3.3 `skill-adversary/SKILL.md` — stress-test gate
- `idle-curiosity/SKILL.md` — gap proposer
- `skill-creator/SKILL.md` — draft generator
- Voyager paper: Wang et al., "Voyager: An Open-Ended Embodied Agent with
  Large Language Models" (2023, arXiv:2305.16291)
- Anthropic Routines API — daily cron substrate