---
name: grill-with-docs
description: "Requirements elicitation anchored on existing domain model docs (CLAUDE.md, PRD, .blueprint/). Use when 'grill on feature X', 'requirements elicitation', 'spec the feature', or before implementing anything ambiguous."
mode: [engineering]
effort: medium
version: 2.1.0
tier: [admin]
attribution: "Cherry-picked from mattpocock/skills (MIT, see CREDITS.md). Socratic enhancements: sp5 2026-05-02. Re-merged upstream practical-workflow tactics + ported CONTEXT-FORMAT.md + ADR-FORMAT.md companion files: 2026-05-03 (per memory/audit-mattpocock-drift-2026-05-03.md)."
see_also: [vision-alignment, brainstorming, plan-builder, to-prd, scope-check, decision-log]
thinking_mode: adaptive
---
# Grill With Docs — Socratic Requirements Elicitation
> Stress-test a plan or feature idea against the project's existing domain model. Sharpen
> terminology, surface hidden variables, and force precision BEFORE implementation. Anchored
> on the Synapse + AXOIQ documented landscape (CLAUDE.md, `.blueprint/PRD-*`, MEMORY.md, ADRs).
>
> **SOTA 2026** — enhanced with Socratic Q-Tree (3 levels), Adversarial Debate (proponent vs
> skeptic), and Hidden-Var Drag (5 categories). Precedes `/plan-builder` in the standard flow.
## When to Invoke
### Manual triggers
- `/grill-me "feature description"` (canonical — use command for UX)
- "grill on feature X" / "grill me on this" / "grill-with-docs"
- "requirements elicitation" / "elicit requirements"
- "spec the feature" / "stress-test this plan"
- "challenge this against the domain model"
### Automatic triggers
Auto-fire BEFORE implementation when ANY of these signals:
- User asks to implement a feature with < 3 sentences of spec
- User uses domain terms (`E-Part`, `M-Part`, `WBS`, `package_material_rules`) without binding to concrete behaviour
- A new endpoint/service/page is requested but no PRD section exists yet
- User says "add X" without naming the persona, the entry-point, or the data source
### Skip triggers (do NOT grill)
- A PRD section in `.blueprint/PRD-SYNAPSE-ENTERPRISE.md` already covers this feature
→ call out the section + skip elicitation
- User has produced a written spec / sub-plan in `.blueprint/plans/*.md`
→ run `vision-alignment` instead (alignment, not elicitation)
- Trivial bugfixes / one-line config tweaks
- Pure refactor with no behaviour change
## Anti-Pattern Guard
Before grilling, scan the documented landscape:
```bash
# Verify the user hasn't already specced this
rg -l "<feature-keyword>" .blueprint/PRD-*.md .blueprint/plans/ memory/ 2>/dev/null
```
If hits found → STOP grilling, surface the existing doc, ask user if they want to
**extend** the existing spec or **replace** it. Do NOT re-elicit from scratch — that
wastes user time and introduces drift.
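As a minimal, self-contained illustration of the extend-vs-replace decision (the file name and keyword below are invented for the demo; the real scan is the `rg` call above):
```shell
# Sketch of the guard decision on a throwaway fixture.
tmp=$(mktemp -d)
printf '# PRD section\nauto-classify rules\n' > "$tmp/PRD-demo.md"

if grep -rl "auto-classify" "$tmp" >/dev/null 2>&1; then
  guard=hit    # existing doc found: surface it, ask extend vs replace
else
  guard=miss   # no prior spec: proceed to the workflow below
fi
echo "$guard"
rm -rf "$tmp"
```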
## Workflow
### Step 1 — Context Loading (read-only, fast)
Scan these sources in this order. Use `rg` and `grep`, NOT full reads:
| Source | Looks For |
|--------|-----------|
| `CLAUDE.md` (root + worktree) | Synapse principles, MBSE 4-layer, part lifecycle, persona list |
| `.blueprint/PRD-SYNAPSE-ENTERPRISE.md` | Existing feature specs, persona definitions |
| `.blueprint/MEGA-PLAN.md` | Cross-plan strategic context |
| `.blueprint/plans/INDEX.md` + targeted sub-plans | Adjacent or parent work |
| `memory/MEMORY.md` + `memory/feedback_*.md` | User preferences, prior decisions, lessons |
| `.blueprint/AUDITS-REGISTRY.md` | Constraints from past audits (corpo-leak, Zustand, TVT) |
| Code (only if domain term ambiguous) | `rg "class X" backend/app/` to confirm canonical naming |
Output of Step 1: a 5-7 line "context summary" — what the docs already say about this
feature area. Surface to user explicitly: "Here's what the docs already establish: …"
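A hypothetical context summary, showing the shape only (the feature, sections, and findings below are invented):
```
Here's what the docs already establish:
- PRD § Estimation covers manual hour entry but has no spec for auto-recompute.
- CLAUDE.md: estimates sit at the ESTIMATE node, downstream of PROCURE.
- MEMORY.md: user previously rejected AI-assisted estimation (determinism rule).
- AUDITS-REGISTRY.md: TVT audit constrains estimate writes to the service layer.
- No sub-plan in .blueprint/plans/ currently touches estimation.
```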
### Step 2 — Generate Grilling Questions
Produce **5-7 questions** targeting domain-model gaps. Cover at minimum:
1. **Persona** — Which of the 8 personas (I&C / EL / ME / Process / PM / Procurement /
Admin / Client) is the primary user? Secondary?
2. **Engineering chain placement** — Where does this fit in `IMPORT → CLASSIFY → ENGINEER
→ SPEC GROUP → E-BOM → PROCURE → ESTIMATE → OUTPUTS`?
3. **Part lifecycle** — Does this touch G-Part / E-Part / M-Part / I-Part / S-Part? Which
transitions?
4. **MBSE layer** — Which of the 4 layers (QUOI / OÙ / COMMENT / QUI — what / where / how / who) does this affect?
5. **Multi-tenant** — How does the `project_id` filter work for this? Do THM-012 and the 4 REF
   projects behave the same?
6. **Data source** — Where does the data live? `material_catalog` / `package_material_rules`
/ `frm_item_presentation` / `instruments` / new table?
7. **Determinism** — Same input = same output? Any AI dependency to flag?
For each question, also propose a **recommended answer** based on Step 1 context. The
user can accept / refine / reject. Recommended answers anchor the conversation.
### Step 3 — AskUserQuestion (MANDATORY per ATLAS rule)
Ask the questions **one at a time** via the `AskUserQuestion` tool — never via free text.
This is non-negotiable per `~/.claude/CLAUDE.md` global rule.
Per question:
- Pose the question with the recommended answer as default option
- Wait for the user's reply before asking the next
- If the user can't answer or says "tu décides" ("you decide") → log a gap, propose the
  most-likely default, and mark it `assumed:` in the spec output
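A sketch of one question turn, content only (the exact `AskUserQuestion` tool schema is not reproduced here, and the feature is invented):
```
Q2/6 — Engineering chain placement
Where does auto-recompute fit in the chain?
  [A] ESTIMATE (recommended — it consumes procured quantities)
  [B] E-BOM (recompute at BOM build time)
  [C] Other (describe)
```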
### Step 4 — Sharpen / Cross-Reference
While grilling:
- **Glossary clash** — if the user uses a term that conflicts with `CLAUDE.md` or
`.blueprint/PRD-*` → call it out: "You said `M-Part`, but the lifecycle defines that
as a procurement-stage part. Did you mean `E-Part` (engineering stage)?"
- **Code clash** — if user states a behaviour that contradicts the code → surface it:
"Your description says `recompute_hours` ignores salvage rate, but the branching in
`backend/app/services/estimate_service.py` includes it."
- **Concrete scenario** — for each ambiguity, invent one concrete scenario. "Walk me
through THM-012 building 3, panel 2A, package P-201 — what happens?"
### Step 5 — Detect Showstoppers (refuse to proceed)
Refuse to produce a spec if ANY of these unresolved:
- No persona identified (≠ "everyone uses it")
- Multi-tenant filter undefined (would leak data across projects)
- Data source undefined ("we'll figure it out") = future hardcode incident
- Engineering chain placement unclear (cannot wire into existing pipeline)
If a showstopper is unresolved → tell the user, document the gap in `memory/grill-gap-<topic>.md`,
abandon the elicitation cleanly. Better to refuse than produce a wishful spec.
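One possible shape for the gap log (section names are a suggestion, not a fixed schema):
```
# Grill gap — <topic>
- Date / session: …
- Feature as stated: one-sentence restatement
- Questions answered: e.g. 4/7
- Showstopper(s): e.g. multi-tenant filter undefined
- What would unblock: the decision or data needed to resume
```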
### Step 6 — Produce Spec Output
If all questions answered → produce a **spec markdown** ready for `to-prd` or
`plan-builder` downstream:
```
.blueprint/plans/draft-<feature-slug>.md
```
Sections (minimum):
- **Persona(s)** — primary + secondary
- **Engineering chain placement** — node in the pipeline
- **Part lifecycle touched** — which G/E/M/I/S transitions
- **MBSE layer(s)** — QUOI / OÙ / COMMENT / QUI
- **Multi-tenant rule** — how `project_id` is enforced
- **Data source** — table(s) + columns + new schema if any
- **Determinism contract** — explicit "no AI in prod" reaffirmation
- **Acceptance scenarios** — 3-5 concrete scenarios from Step 4
The spec is a **draft** — `to-prd` (W4.6) folds it into `PRD-SYNAPSE-ENTERPRISE.md`,
`plan-builder` produces the executable sub-plan.
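A minimal skeleton for `draft-<feature-slug>.md`, mirroring the section list above (headings and layout are a suggestion):
```
# Draft spec — <feature-slug>
## Persona(s)
Primary: … | Secondary: … | Anti-user: …
## Engineering chain placement
…
## Part lifecycle touched
…
## MBSE layer(s)
…
## Multi-tenant rule
…
## Data source
…
## Determinism contract
Same input = same output; no AI in prod.
## Acceptance scenarios
1. …
```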
## Pair With (downstream)
| Skill | Role |
|-------|------|
| `vision-alignment` | If feature feels strategic — run BEFORE grilling for roadmap fit |
| `brainstorming` | If multiple architectural options — run BEFORE grilling to enumerate |
| `to-prd` (W4.6) | Folds the grill output into the PRD |
| `plan-builder` | Produces the 15+5 section sub-plan from the PRD section |
| `scope-check` | Mid-execution drift detection on the grilled spec |
## Pocock → ATLAS Adaptations
| Pocock pattern | ATLAS adaptation |
|----------------|-------------------|
| `CONTEXT.md` glossary | Synapse uses CLAUDE.md + `.blueprint/PRD-*` as SSoT — no separate glossary file |
| `docs/adr/` lazy creation | ATLAS uses `.blueprint/AUDITS-REGISTRY.md` + `.claude/decisions.jsonl` — log there, not new ADR per term |
| Inline CONTEXT.md updates | Inline `.blueprint/PRD-SYNAPSE-ENTERPRISE.md` updates — only via `to-prd` downstream, never directly here |
| Free-text questions | **MANDATORY `AskUserQuestion`** per global rule (non-negotiable) |
| ADR offered sparingly | ADRs in `infrastructure/docs/ADR/` (homelab) or `projects/atlas-plugin/docs/ADR/` (atlas) — leave to `decision-log` skill |
| Single-language English | Bilingual FR/Quebecois friendly — questions can be French if user French |
## Anti-Patterns
1. **Grilling without context load (Step 1)** — produces generic questions, wastes user's time
2. **Grilling AFTER spec exists** — re-elicits what's already documented → drift
3. **Free-text questions** — violates global AskUserQuestion rule
4. **Producing a spec without persona** — every Synapse feature MUST name a persona
5. **Ignoring multi-tenant** — features that don't filter `project_id` = data leak risk
6. **Batching questions** — the Pocock rule is one question at a time; batching loses context
7. **Refusing to refuse** — if showstoppers unresolved, abandon. Wishful specs cost weeks
to retract (see `memory/feedback_time_estimation_always_wrong.md`)
8. **Using grill-me to write PRD** — `/grill-me` produces a draft spec only. `/to-prd`
folds the draft into `PRD-SYNAPSE-ENTERPRISE.md`. **NEVER** write to the PRD from here.
---
## Socratic Q-Tree (SOTA 2026 Enhancement)
> Reference: `references/socratic-patterns.md` | Source: Socratic-Solver (emergentmind.com)
> Three levels of questioning depth, auto-selected based on ambiguity signals.
### Level Selection Logic
| Signal detected | Run levels |
|-----------------|-----------|
| Feature described in ≥ 5 sentences, persona named, data source specified | L1 only |
| Feature described in < 5 sentences OR missing persona OR data source unclear | L1 + L2 |
| L2 surfaces 2+ unstated assumptions OR `--level 3` flag | L1 + L2 + L3 |
| Multi-tenant/auth touched OR 2+ critical risks at L3 OR `--debate` flag | L1 + L2 + L3 + Debate |
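The selection table can be sketched as a small shell function. The signal names and flag encoding are invented, and the multi-tenant/auth signal for Debate is reduced to a risk count for brevity; this is an illustration of the logic, not a real CLI:
```shell
# Map ambiguity signals to the questioning levels to run.
select_levels() {
  sentences=$1; persona_named=$2; source_clear=$3; l2_gaps=$4; l3_risks=$5
  out="L1"
  if [ "$sentences" -lt 5 ] || [ "$persona_named" = no ] || [ "$source_clear" = no ]; then
    out="$out+L2"                                  # vague feature: challenge it
    [ "$l2_gaps" -ge 2 ] && out="$out+L3"          # unstated assumptions: constrain
  fi
  [ "$l3_risks" -ge 2 ] && out="$out+Debate"       # critical risks: adversarial pass
  echo "$out"
}

levels=$(select_levels 3 no no 2 0)   # vague feature, 2 unstated assumptions
echo "$levels"
```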
### Level 1 — Clarify (always)
**Goal**: establish identity. 5 fixed questions tied to Synapse domain model.
1. **Persona** — Which of the 8 personas (I&C / EL / ME / Process / PM / Procurement /
Admin / Client) is primary? Secondary? Anti-user (who must NOT use it)?
2. **Chain placement** — Where does this fit in `IMPORT → CLASSIFY → ENGINEER → SPEC GROUP
→ E-BOM → PROCURE → ESTIMATE → OUTPUTS`?
3. **Part lifecycle** — Which G→E→M→I→S transitions does this trigger or depend on?
4. **Data source** — Which table(s): `material_catalog`, `package_material_rules`,
`frm_item_presentation`, `instruments`, or new?
5. **Multi-tenant** — How does `project_id` filtering apply? Do THM-012 AND the 4 REF projects
   behave identically?
Output: context summary + 5 questions with recommended answers.
### Level 2 — Challenge
**Goal**: surface assumptions the user didn't articulate. 3-5 questions using 3 tactics.
**Tactic A — Reverse the spec**: "If we DON'T build this, what breaks or stays manual?"
This surfaces whether the feature is essential or nice-to-have.
**Tactic B — Counter-example**: "Give me one concrete scenario where your stated rule
produces the wrong result." Forces precision on edge conditions.
**Tactic C — Dependency probe**: "What must already be true (in the data, in the code,
in the user's workflow) for this to work?" Uncovers hidden prerequisites.
Example Level 2 questions:
- "You said 'auto-classify'. If classification returns no match, what happens?"
- "If the same package exists in THM-012 AND BRTZ, should the rule apply identically?"
- "What existing feature would break if we deployed this as described?"
### Level 3 — Constrain
**Goal**: bound the solution space before design begins. 3-4 constraint questions.
**Type P (Performance)**: "What is the acceptable response time for X, given Y records?"
**Type S (Security)**: "Who must NOT see this data? What role is required to execute this?"
**Type R (Rollback)**: "If we revert this feature in 2 weeks, what data becomes permanently orphaned?"
**Type E (Edge)**: "What happens at zero records? At 10,000? Under concurrent edits?"
Level 3 output includes: 2-3 constraint statements + 1-2 flagged risks ready for decision-log.
---
## Adversarial Debate (SOTA 2026 Enhancement)
> Reference: `references/adversarial-debate.md` | Source: Mitsubishi multi-agent pattern (2026)
> Two conceptual sub-agents stress-test the spec from opposing directions.
> Pattern: atlas-team compatible (the two roles can be spawned as real parallel agents in V2).
### When to Run
- Explicit `--debate` flag
- Level 3 surfaces 2+ critical risks
- Feature touches: multi-tenant data, authentication, billing/cost calculations, HITL gates
### Proponent Agent
**Role**: "Defend this spec as stated. Find the 3 strongest reasons this approach is correct."
Inputs: draft spec from Levels 1-3 + existing domain-model docs.
Produces:
1. 3 arguments FOR the approach (anchored on domain-model evidence)
2. 1 risk the approach explicitly accepts (and why it's acceptable)
3. 1 success criterion that would prove this works
### Skeptic Agent
**Role**: "Challenge this spec. Find the 3 most likely production failure modes."
Inputs: same as proponent.
Produces:
1. 3 points where the spec breaks in production (concrete failure scenario per point)
2. 1 alternative approach that avoids the highest-risk assumption
3. 1 hidden dependency the spec silently relies on
### Resolution (Orchestrator)
After both agents respond:
1. Identify 1-2 genuine disagreements (proponent's strength vs skeptic's failure mode)
2. Pose each genuine disagreement to user via `AskUserQuestion` for resolution
3. Log resolved disagreements as decision-log entries (`.claude/decisions.jsonl`)
4. Unresolved disagreements after 2 turns → showstopper (abandon cleanly)
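A hypothetical shape for a resolved-disagreement entry in `.claude/decisions.jsonl` (field names and values are invented; match the project's actual schema):
```json
{"date": "2026-05-03", "skill": "grill-with-docs", "topic": "auto-recompute salvage rate", "proponent": "include salvage per estimate_service branching", "skeptic": "salvage data missing for REF projects", "resolution": "include; backfill REF salvage defaults", "resolved_by": "user"}
```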
---
## Hidden-Var Drag (SOTA 2026 Enhancement)
> "Hidden variables" = invisible spec dependencies that drag the feature toward production
> failure if not surfaced during elicitation. Run AFTER Level 2, BEFORE Level 3.
### 5 Categories
| Category | Probe questions | Risk if missed |
|----------|----------------|----------------|
| **domain** | Term matches CLAUDE.md glossary? MBSE layer correct? Chain placement confirmed? | Terminology drift → wrong implementation |
| **persona** | Named persona from 8? Secondary user? Anti-user (who must NOT)? | Features built for "everyone" work for no one |
| **edge** | Zero records? Max records? Concurrent edits? Partial import? | 3 AM failures on edge inputs |
| **perf** | Hot path? (see `performance-discipline.md`) Max latency? ParadeDB BM25 or PG? | Perf budget violated post-launch |
| **sec** | `project_id` filter confirmed? Audit trail needed? Role required? RBAC impact? | Multi-tenant data leak (irreversible) |
### Probe Execution
For each category, run 1-2 targeted questions based on the feature description.
A category is "covered" when its core invariant is confirmed (not necessarily perfect).
An uncovered category at this stage → **promoted to a Level 3 constraint question**.
### Hidden-Var Summary
After probing, emit a table:
```
| Category | Status | Gap (if any) |
|----------|----------|---------------------------------|
| domain | ✅ clear | — |
| persona | ⚠️ gap | Anti-user not defined |
| edge | ✅ clear | — |
| perf | ⚠️ gap | Latency budget not stated |
| sec | ✅ clear | project_id filter confirmed |
```
Gaps → promoted to Level 3 or showstopper (if sec category is uncovered).
---
## /grill-me ≠ /to-prd (NON-NEGOTIABLE Distinction)
| Dimension | `/grill-me` | `/to-prd` |
|-----------|-------------|-----------|
| **Output** | `draft-<slug>.md` + decision-log entry | Section in `PRD-SYNAPSE-ENTERPRISE.md` |
| **Trigger** | Feature is ambiguous, pre-implementation | Spec is sharp, post-grilling |
| **Audience** | Developer + AI (internal, iterative) | Stakeholders + future devs (canonical) |
| **Produces PRD?** | ❌ NEVER | ✅ YES |
| **Decision-log?** | ✅ ALWAYS | ❌ Not its job |
| **Reversible?** | ✅ draft can be discarded | ⚠️ PRD edit = official record |
**Refusal pattern**: if user says "grill me and write the PRD" → `/grill-me` MUST:
1. Complete grilling and produce the draft spec
2. State explicitly: "Draft spec created. Use `/to-prd` to fold into the PRD."
3. NOT touch `PRD-SYNAPSE-ENTERPRISE.md`
This is a golden.jsonl refusal case (sp5-014) — weight 2.5.
## Output Discipline
- Per turn: **1 question** via `AskUserQuestion`, ≤ 5 lines of preamble, recommended answer included
- Per session: **1 draft spec** (`draft-<slug>.md`), gap log if abandoned, decision-log entry always
- Never: free-text question, multi-question turn, spec without persona, PRD write
## References
- Source: https://github.com/mattpocock/skills `engineering/grill-with-docs`
- License: MIT (see `CREDITS.md` after batch consolidation)
- Plan (v1): `.blueprint/plans/ultrathink-regarde-ce-qui-abundant-petal.md` Section H W4.5
- Plan (v2): `.blueprint/plans/sp5-grill-me-socratic-2026.md`
- Memory: `memory/lesson_cherrypick_distribute_vs_dedicated_wave.md`
- Socratic patterns: `skills/grill-with-docs/references/socratic-patterns.md`
- Adversarial debate: `skills/grill-with-docs/references/adversarial-debate.md`
- Command: `commands/grill-me.md`
- Companion: `vision-alignment`, `brainstorming`, `plan-builder`, `to-prd`, `decision-log`
- Socratic-Solver: https://www.emergentmind.com/topics/socratic-solver
- SocraticAgent: https://www.emergentmind.com/topics/socraticagent
- Mitsubishi adversarial debates: https://us.mitsubishielectric.com/en/pr/global/2026/0120/
- Towards-AI: https://towardsai.net/p/machine-learning/the-socratic-prompt-how-to-make-a-language-model-stop-guessing-and-start-thinking
- Synapse domain SSoT: `CLAUDE.md`, `.blueprint/PRD-SYNAPSE-ENTERPRISE.md`, `memory/MEMORY.md`
- Persona list: `.blueprint/PRD-SYNAPSE-ENTERPRISE.md` § Personas
- Engineering chain: `CLAUDE.md` § IDENTITY
- Part lifecycle: `CLAUDE.md` § IDENTITY (G→E→M→I→S)
- MBSE 4-layer: `CLAUDE.md` § IDENTITY (QUOI / OÙ / COMMENT / QUI)
- Performance discipline: `.claude/rules/performance-discipline.md`
---
## Practical workflow tactics (re-merged from upstream 2026-05-03)
The atlas v2.0.0 SOTA enhancements (Socratic Q-Tree, Adversarial Debate, Hidden-Var Drag) augment but do not replace upstream's practical tactics. Per audit `memory/audit-mattpocock-drift-2026-05-03.md`, these upstream tactics are now exposed via companion references:
- **Domain awareness** — before grilling, locate the project's existing `CLAUDE.md` glossary section, `.blueprint/PRD-*.md` glossary, or `.claude/decisions.jsonl` ADRs. Do not re-litigate established decisions.
- **Cross-reference with code** — when the user asserts a behavior, grep the codebase to verify before challenging. Saves rounds of unnecessary debate.
- **Sharpen fuzzy language** — when a term is used loosely, name it explicitly during the conversation and add it to the glossary source on the spot (CONTEXT.md upstream; CLAUDE.md / `.blueprint/PRD-*` in ATLAS, per the adaptations table).
- **Challenge against the glossary** — every claim should map to a known term. Unmappable claims surface either (a) a missing glossary term, or (b) a confused requirement.
- **Offer ADRs sparingly** — only when the rationale would actually be needed by a future explorer. Skip ephemeral ("not worth it now") and self-evident reasons.
- **Discuss concrete scenarios** — ground abstract claims in scenarios. "When the user clicks X with Y selected, what happens?" beats "the system should be flexible."
Companion references (ported 2026-05-03 from upstream `mattpocock/skills`):
- [references/CONTEXT-FORMAT.md](references/CONTEXT-FORMAT.md) — DDD glossary template (compatible with `ubiquitous-language` skill)
- [references/ADR-FORMAT.md](references/ADR-FORMAT.md) — ADR template (used by `decision-log` skill)