Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.
npx versuz@latest install m2ai-st-metro-skill-forge-skills-spec-gap-detectorgit clone https://github.com/m2ai-st-metro/skill-forge.gitcp skill-forge/SKILL.MD ~/.claude/skills/m2ai-st-metro-skill-forge-skills-spec-gap-detector/SKILL.md--- name: spec-gap-detector description: Stress-test any agent prompt or specification for ambiguity, missing constraints, and edge cases that would cause random behavior at scale. --- # Specification Gap Detector Takes any prompt, spec, or instruction set written for an AI agent and stress-tests it for gaps that would cause unpredictable behavior at scale. ## Trigger Use when the user says "check this spec", "review this prompt", "stress test this", "is this spec tight enough", "spec review", or provides a prompt/instruction set and asks if it's ready for production. ## Phase 1: Intake Accept the specification. This can be: - A prompt or system prompt for an agent - A SKILL.md file - A CLAUDE.md section - A task description for a mission/scheduled task - Any structured instruction set If the user points to a file, read it. If they paste it, use it directly. ## Phase 2: Gap Analysis Analyze the spec across 7 dimensions. For each, identify specific gaps: ### 1. Ambiguous Edge Cases - Where could two reasonable interpretations exist? - What inputs would make the agent guess? - Flag any "use your judgment" without bounds ### 2. Missing Success Criteria - How does the agent know it succeeded? - Are there measurable outputs defined? - Is "done" clearly defined? ### 3. Unclear Hard vs. Soft Constraints - Which rules are absolute (MUST/NEVER) vs. preferences (SHOULD/prefer)? - Are there implicit constraints that aren't stated? - Could the agent reasonably violate an unstated rule? ### 4. Missing Error Handling - What should happen when the agent can't complete a step? - Are fallback behaviors defined? - Is escalation to human specified for uncertain cases? ### 5. Context Dependencies - Does the spec assume context that may not be present? - Are external dependencies (APIs, files, services) explicitly listed? - What happens if a dependency is unavailable? ### 6. Scale Behavior - Would this produce consistent results across 100 runs? - Are there non-deterministic paths that should be constrained? - Could token/context limits cause mid-task degradation? ### 7. Security & Trust Boundaries - Can the agent access more than it needs? - Are destructive operations gated? - Is there PII/credential exposure risk? ## Phase 3: Scoring Score the spec: | Dimension | Score (1-5) | Critical Gaps | |-----------|-------------|---------------| | Edge Cases | X | [list] | | Success Criteria | X | [list] | | Constraints | X | [list] | | Error Handling | X | [list] | | Context Deps | X | [list] | | Scale Behavior | X | [list] | | Trust Boundaries | X | [list] | | **Overall** | **X/5** | | **Rating scale**: 1 = will break immediately, 2 = breaks at scale, 3 = works but fragile, 4 = production-ready with minor gaps, 5 = bulletproof ## Phase 4: Tightening Suggestions For each gap scored 3 or below, provide a specific rewrite or addition: ``` GAP: [description] RISK: [what goes wrong without this] FIX: [exact text to add/change in the spec] ``` Limit to the top 5 most impactful fixes. Prioritize gaps that would cause the worst outcomes at scale. ## Phase 5: Output Present the full analysis, then offer: - "Want me to apply the fixes directly?" (if editing a file) - "Want the tightened spec as a new file?" - "Want just the fixes as a checklist?" ## Source Extracted from Nate Kadlac newsletter (2026-03-26) -- "The K-Shaped AI Labor Market" -- specification precision as a core AI-native skill.