Documentm2ai-st-metroFree

artifact-contract-validator

Validate a directory of claimed deliverables for file-type authenticity, formula presence in spreadsheets, embedded media in presentations, and PDF metadata sanity. Catches AI-generated fakes (HTML renamed as .pptx, empty spreadsheet cells passed off as data). Use before shipping any AI-generated deliverable set to a client or reviewer.

Repo bundle on Versuzm2ai-st-metro/skill-forge81 indexed entries (SKILL.md and CLAUDE.md) from this repository — open the full bundle view.

Open bundle →

View on GitHub ↗</>github.com/m2ai-st-metro/skill-forge Yours? Claim it ↗

§ 01 — Stats

Stars1

Prior1099

Quality—

Score—

Tasks—

§ 02 — Install

Get artifact-contract-validator.

Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.

One-line install · Claude Code

$npx versuz@latest install m2ai-st-metro-skill-forge-skills-artifact-contract-validator

Or clone the repo

$git clone https://github.com/m2ai-st-metro/skill-forge.git

Or copy the SKILL.md manually

$cp skill-forge/SKILL.MD ~/.claude/skills/m2ai-st-metro-skill-forge-skills-artifact-contract-validator/SKILL.md

More Versuz picks

★ Featured$1.99

vz-bench-debug

Document

★ Featured$0.99

vz-scrape-runner

Web

Got something better ?Submit your skill — it enters tomorrow's cycle. No fee.

Submit yours →

§ 05 — Challenge

Think you can beat it?

$npx versuz challenge m2ai-st-metro-skill-forge-skills-artifact-contract-validator↵

Show SKILL.md content (~1.4k tokens)

---
name: artifact-contract-validator
description: Validate a directory of claimed deliverables for file-type authenticity, formula presence in spreadsheets, embedded media in presentations, and PDF metadata sanity. Catches AI-generated fakes (HTML renamed as .pptx, empty spreadsheet cells passed off as data). Use before shipping any AI-generated deliverable set to a client or reviewer.
---

# Artifact Contract Validator

Runs per-format integrity checks on a directory of deliverables and produces a pass/fail report per file. The core problem it solves: AI models frequently produce files in the wrong format while giving them the correct extension — a "PowerPoint" that is actually an HTML file, a "spreadsheet" with no formulas, a "PDF" with no embedded text. This skill catches those failures before they reach a human reviewer.

## Trigger

Use when the user says "/artifact-contract-validator", "validate my deliverables", "check these files are real", "run artifact checks", "verify output formats", or after any AI-generated multi-artifact task completes.

## Phase 1: Intake

Ask the user for:
1. **Deliverable directory** — path to the folder containing the output files.
2. **Expected artifact contract** (optional) — a list of expected files and their types. If not provided, infer from the directory contents.

Example contract:
```
quarterly-report.pptx   → PowerPoint with embedded images
financial-model.xlsx     → Excel with at least one formula
exec-summary.pdf         → PDF with selectable text (not a scan)
press-release.docx       → Word document with tracked changes off
```

## Phase 2: Per-Format Validation

Run the following checks for each file type found in the directory. Use Python's standard library (`zipfile`, `struct`) or lightweight tools; do not install external packages without asking.

### .pptx / .pptx-like files
- Open as ZIP. A valid .pptx contains `ppt/presentation.xml`.
- Check `ppt/slides/` directory exists and has at least one slide.
- Check `ppt/media/` for embedded images if the contract specifies "with embedded images."
- **Fail signal**: ZIP opens but lacks `ppt/` tree → HTML or other format with .pptx extension.

### .xlsx / .xls files
- Open as ZIP. A valid .xlsx contains `xl/workbook.xml`.
- Check `xl/sharedStrings.xml` for non-empty string content.
- Search `xl/worksheets/sheet*.xml` for `<f>` tags (formula cells).
- **Fail signal**: No `<f>` tags → spreadsheet has no formulas despite claiming to be a financial model.

### .pdf files
- Read first 4 bytes: must be `%PDF`.
- Scan for `/Font` entries (presence indicates selectable text, not a scanned image-only PDF).
- Check file size > 10 KB (near-empty PDFs are suspicious).
- **Fail signal**: Starts with `<html` or `<!DOCTYPE` → HTML renamed to .pdf.

### .docx files
- Open as ZIP. A valid .docx contains `word/document.xml`.
- Check `word/document.xml` has non-trivial content (> 500 bytes).
- **Fail signal**: ZIP lacks `word/` tree → not a real Word document.

### Other / unknown extensions
- Report file size and MIME type detected from magic bytes.
- Flag if extension and detected type mismatch.

## Phase 3: Report

Produce a structured report:

```
=================================================================
ARTIFACT CONTRACT VALIDATION REPORT
=================================================================
Directory:   {path}
Validated:   {date}
Files:       {count}

Results
-------
{filename}          {PASS | FAIL | WARN}
  {failure reason if applicable}

{filename}          PASS

Summary
-------
Passed:    {n}
Failed:    {n}
Warnings:  {n}

{If failures}: The following files did not pass authenticity checks
and should be regenerated before delivery:
  - {filename}: {specific issue}
=================================================================
```

## Phase 4: Remediation Guidance

For each failed file, provide the specific regeneration instruction:

- **HTML-as-PPTX**: "Re-run the generation prompt with an explicit instruction: 'Output a real .pptx file using python-pptx, not an HTML file.'"
- **No-formula spreadsheet**: "Re-run with: 'Use openpyxl to write real Excel formulas in column C. Do not hard-code values — use =SUM() and cell references.'"
- **Scan-only PDF**: "Re-run with: 'Generate the PDF using reportlab or weasyprint to produce selectable text, not a screenshot.'"

## What This Does NOT Do

- Does not validate the *content* of files — only format authenticity.
- Does not check for copyright or data accuracy.
- Does not execute or render files — all checks are structural/byte-level.
- Does not require external packages to be pre-installed — it generates inline Python to perform the checks.

## Source

Extracted from Nate Kadlac newsletter (2026-04-28): "ChatGPT 5.5 scored 87 where the next best model scored 67. Here's what that gap looks like in real work." The Multi-Artifact Work Package template (Nate's Prompt 2) includes a verification layer that confirms artifacts are real files in real formats — this skill operationalizes that verification layer as a standalone, reusable validator.