Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.
npx versuz@latest install jmagly-aiwg-plugins-media-curator-skills-integrity-verificationgit clone https://github.com/jmagly/aiwg.gitcp aiwg/SKILL.MD ~/.claude/skills/jmagly-aiwg-plugins-media-curator-skills-integrity-verification/SKILL.md---
namespace: aiwg
platforms: [all]
name: Integrity Verification
description: SHA-256 checksum manifest generation, self-verification, and PREMIS fixity patterns
category: media-curator
---
# Integrity Verification
Cryptographic checksum verification patterns for detecting bit rot, tampering, and transfer errors in media archives. Implements self-verifying manifests with PREMIS fixity metadata.
## Manifest Generation Script
Complete bash implementation for generating self-verifying checksum manifests:
```bash
#!/bin/bash
set -euo pipefail
# Archive Checksum Manifest Generator
# Generates self-verifying SHA-256 checksum manifest
# Usage: ./generate-checksums.sh /path/to/archive
ARCHIVE_PATH="${1:-.}"
CHECKSUM_FILE="CHECKSUMS.sha256"
TEMP_FILE="/tmp/checksums-$$.tmp"
# Validate archive exists
if [ ! -d "$ARCHIVE_PATH" ]; then
echo "Error: Archive directory not found: $ARCHIVE_PATH" >&2
exit 1
fi
cd "$ARCHIVE_PATH"
echo "Generating checksums for: $ARCHIVE_PATH"
# Find all files, exclude checksum manifest itself
# Use null-terminated strings for handling filenames with spaces
find . -type f ! -name "$CHECKSUM_FILE" -print0 | \
sort -z | \
xargs -0 sha256sum > "$TEMP_FILE"
# Count files
FILE_COUNT=$(wc -l < "$TEMP_FILE")
echo "Found $FILE_COUNT files"
# Compute manifest hash (hash of the checksum content)
MANIFEST_HASH=$(sha256sum "$TEMP_FILE" | awk '{print $1}')
# Generate timestamp (ISO 8601 UTC with nanosecond precision)
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%S.%NZ)
# Write final manifest with self-verifying header
{
echo "# MANIFEST_HASH: $MANIFEST_HASH"
echo "# Generated: $TIMESTAMP"
echo "# Verify with: tail -n +4 $CHECKSUM_FILE | sha256sum"
cat "$TEMP_FILE"
} > "$CHECKSUM_FILE"
# Clean up
rm "$TEMP_FILE"
echo "✓ Generated $CHECKSUM_FILE"
echo " Manifest hash: $MANIFEST_HASH"
echo " Timestamp: $TIMESTAMP"
echo " Files: $FILE_COUNT"
```
**Key features**:
- Handles filenames with spaces via null-terminated strings (`-print0`, `-z`, `-0`)
- Deterministic output (sorted by path)
- Self-verifying header with manifest hash
- ISO 8601 UTC timestamps with nanosecond precision
- Exit on error (`set -euo pipefail`)
## Verification Commands
### Quick Manifest Integrity Check
Verify manifest has not been tampered with (sub-second):
```bash
#!/bin/bash
# Quick verification - manifest integrity only
CHECKSUM_FILE="CHECKSUMS.sha256"
if [ ! -f "$CHECKSUM_FILE" ]; then
echo "✗ Checksum manifest not found: $CHECKSUM_FILE" >&2
exit 1
fi
# Extract expected hash from header
EXPECTED=$(grep '^# MANIFEST_HASH:' "$CHECKSUM_FILE" | awk '{print $3}')
if [ -z "$EXPECTED" ]; then
echo "✗ Manifest header missing or malformed" >&2
exit 1
fi
# Compute actual hash of manifest content (lines 4+)
ACTUAL=$(tail -n +4 "$CHECKSUM_FILE" | sha256sum | awk '{print $1}')
# Compare
if [ "$EXPECTED" = "$ACTUAL" ]; then
echo "✓ Manifest integrity verified"
echo " Hash: $EXPECTED"
exit 0
else
echo "✗ Manifest has been tampered with" >&2
echo " Expected: $EXPECTED" >&2
echo " Actual: $ACTUAL" >&2
exit 1
fi
```
**Use case**: Daily automated checks. Fast execution regardless of archive size.
**Exit codes**:
- `0` - Manifest integrity verified
- `1` - Manifest corrupted or tampered
### Full File Verification
Verify all files match their checksums:
```bash
#!/bin/bash
# Full verification - manifest integrity + all files
CHECKSUM_FILE="CHECKSUMS.sha256"
# Step 1: Verify manifest integrity
echo "Step 1: Verifying manifest integrity..."
EXPECTED=$(grep '^# MANIFEST_HASH:' "$CHECKSUM_FILE" | awk '{print $3}')
ACTUAL=$(tail -n +4 "$CHECKSUM_FILE" | sha256sum | awk '{print $1}')
if [ "$EXPECTED" != "$ACTUAL" ]; then
echo "✗ Manifest integrity check failed - stopping" >&2
exit 1
fi
echo "✓ Manifest integrity verified"
# Step 2: Verify all files
echo "Step 2: Verifying all files..."
if tail -n +4 "$CHECKSUM_FILE" | sha256sum -c; then
echo "✓ All files verified successfully"
exit 0
else
echo "✗ One or more files failed verification" >&2
exit 1
fi
```
**Output format** (from `sha256sum -c`):
```
./audio/episode-001.opus: OK
./audio/episode-002.opus: OK
./video/recording.mp4: FAILED
sha256sum: WARNING: 1 computed checksum did NOT match
```
**Exit codes**:
- `0` - All files verified successfully
- `1` - Verification failed (manifest or files)
### Quiet Mode
Show only failures:
```bash
#!/bin/bash
# Quiet verification - only show failures
CHECKSUM_FILE="CHECKSUMS.sha256"
# Quick manifest check (silent)
EXPECTED=$(grep '^# MANIFEST_HASH:' "$CHECKSUM_FILE" | awk '{print $3}')
ACTUAL=$(tail -n +4 "$CHECKSUM_FILE" | sha256sum | awk '{print $1}')
if [ "$EXPECTED" != "$ACTUAL" ]; then
echo "MANIFEST: FAILED" >&2
exit 1
fi
# Verify files (quiet mode - only show failures)
tail -n +4 "$CHECKSUM_FILE" | sha256sum -c --quiet
EXIT_CODE=$?
if [ $EXIT_CODE -eq 0 ]; then
# Silent success
exit 0
else
# sha256sum already printed failures to stderr
exit 1
fi
```
**Use case**: Cron jobs, automated monitoring, CI/CD pipelines.
**Output**: Nothing on success, only failed files on failure.
## Regeneration After Changes
Script for regenerating manifest after archive modifications:
```bash
#!/bin/bash
set -euo pipefail
# Regenerate checksum manifest after archive changes
# Usage: ./fix-checksums.sh /path/to/archive
ARCHIVE_PATH="${1:-.}"
CHECKSUM_FILE="CHECKSUMS.sha256"
BACKUP_FILE="CHECKSUMS.sha256.bak"
TEMP_FILE="/tmp/checksums-$$.tmp"
cd "$ARCHIVE_PATH"
# Backup existing manifest
if [ -f "$CHECKSUM_FILE" ]; then
cp "$CHECKSUM_FILE" "$BACKUP_FILE"
echo "Backed up existing manifest to $BACKUP_FILE"
fi
# Generate new manifest
echo "Regenerating checksums..."
find . -type f ! -name "$CHECKSUM_FILE" ! -name "$BACKUP_FILE" -print0 | \
sort -z | \
xargs -0 sha256sum > "$TEMP_FILE"
FILE_COUNT=$(wc -l < "$TEMP_FILE")
MANIFEST_HASH=$(sha256sum "$TEMP_FILE" | awk '{print $1}')
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%S.%NZ)
{
echo "# MANIFEST_HASH: $MANIFEST_HASH"
echo "# Generated: $TIMESTAMP"
echo "# Verify with: tail -n +4 $CHECKSUM_FILE | sha256sum"
cat "$TEMP_FILE"
} > "$CHECKSUM_FILE"
rm "$TEMP_FILE"
# Detect changes
if [ -f "$BACKUP_FILE" ]; then
echo ""
echo "Changes detected:"
# Extract file paths from old and new manifests
tail -n +4 "$BACKUP_FILE" | awk '{print $2}' | sort > /tmp/old-files-$$.txt
tail -n +4 "$CHECKSUM_FILE" | awk '{print $2}' | sort > /tmp/new-files-$$.txt
# Added files
ADDED=$(comm -13 /tmp/old-files-$$.txt /tmp/new-files-$$.txt)
if [ -n "$ADDED" ]; then
echo " Added:"
echo "$ADDED" | sed 's/^/ /'
fi
# Removed files
REMOVED=$(comm -23 /tmp/old-files-$$.txt /tmp/new-files-$$.txt)
if [ -n "$REMOVED" ]; then
echo " Removed:"
echo "$REMOVED" | sed 's/^/ /'
fi
# Modified files (different hash for same path)
# This requires comparing hashes, not just paths
COMMON_FILES=$(comm -12 /tmp/old-files-$$.txt /tmp/new-files-$$.txt)
if [ -n "$COMMON_FILES" ]; then
while IFS= read -r file; do
OLD_HASH=$(grep -F "$file" "$BACKUP_FILE" | awk '{print $1}')
NEW_HASH=$(grep -F "$file" "$CHECKSUM_FILE" | awk '{print $1}')
if [ "$OLD_HASH" != "$NEW_HASH" ]; then
echo " Modified: $file"
fi
done <<< "$COMMON_FILES"
fi
rm /tmp/old-files-$$.txt /tmp/new-files-$$.txt
fi
echo ""
echo "✓ Generated new $CHECKSUM_FILE"
echo " Manifest hash: $MANIFEST_HASH"
echo " Files: $FILE_COUNT"
```
**Features**:
- Backs up existing manifest to `.bak` file
- Regenerates checksums for all current files
- Reports added, removed, and modified files
- Preserves backup for comparison
## VERIFY.md Template
Human-readable instructions placed in archive root:
```markdown
# Archive Integrity Verification
This archive contains a self-verifying checksum manifest (`CHECKSUMS.sha256`) for detecting corruption, tampering, or transfer errors.
## Archive Information
- **Generated**: {TIMESTAMP}
- **Total files**: {FILE_COUNT}
- **Total size**: {TOTAL_SIZE}
- **Manifest hash**: {MANIFEST_HASH}
## Quick Verification (30 seconds)
Verify the manifest has not been tampered with:
\`\`\`bash
EXPECTED=$(grep '^# MANIFEST_HASH:' CHECKSUMS.sha256 | awk '{print $3}')
ACTUAL=$(tail -n +4 CHECKSUMS.sha256 | sha256sum | awk '{print $1}')
[ "$EXPECTED" = "$ACTUAL" ] && echo "✓ Verified" || echo "✗ Tampered"
\`\`\`
## Full Verification (10-60 minutes)
Verify all files match their checksums:
\`\`\`bash
tail -n +4 CHECKSUMS.sha256 | sha256sum -c
\`\`\`
## Recommended Schedule
| Frequency | Verification Type | Purpose |
|-----------|-------------------|---------|
| Daily | Quick (manifest only) | Detect tampering |
| Weekly | Full (all files) | Detect bit rot |
| After transfer | Full | Verify transfer integrity |
| Before backup | Full | Ensure source integrity |
## Automated Monitoring
Add to crontab for daily verification:
\`\`\`cron
# Daily quick check at 2am
0 2 * * * cd /path/to/archive && tail -n +4 CHECKSUMS.sha256 | sha256sum -c --quiet || echo "Verification failed" | mail -s "Archive Alert" admin@example.com
\`\`\`
## If Verification Fails
### Manifest Integrity Failure
The manifest itself has been corrupted or tampered with.
**Recovery**:
1. Restore `CHECKSUMS.sha256` from backup
2. If no backup, regenerate manifest (see below)
### File Verification Failure
One or more files have been corrupted or modified.
**Identify failures**:
\`\`\`bash
tail -n +4 CHECKSUMS.sha256 | sha256sum -c 2>&1 | grep FAILED
\`\`\`
**Recovery**:
1. Restore failed files from backup
2. If intentional modification, regenerate manifest
## Regenerate Manifest
After making changes to the archive (add/remove/modify files):
\`\`\`bash
# Backup existing manifest
cp CHECKSUMS.sha256 CHECKSUMS.sha256.bak
# Regenerate
find . -type f ! -name "CHECKSUMS.sha256" -print0 | \\
sort -z | \\
xargs -0 sha256sum > /tmp/checksums.tmp
MANIFEST_HASH=$(sha256sum /tmp/checksums.tmp | awk '{print $1}')
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%S.%NZ)
{
echo "# MANIFEST_HASH: $MANIFEST_HASH"
echo "# Generated: $TIMESTAMP"
echo "# Verify with: tail -n +4 CHECKSUMS.sha256 | sha256sum"
cat /tmp/checksums.tmp
} > CHECKSUMS.sha256
rm /tmp/checksums.tmp
\`\`\`
## Technical Details
- **Hash algorithm**: SHA-256 (NIST FIPS 180-4)
- **Timestamp format**: ISO 8601 UTC with nanosecond precision
- **Self-verification**: Manifest hash prevents undetected tampering
- **Coverage**: All files except `CHECKSUMS.sha256` itself
## Support
For questions or issues:
- Documentation: https://aiwg.io/media-curator
- Issues: https://github.com/jmagly/aiwg/issues
- Command reference: `aiwg verify-archive --help`
---
*Generated by AIWG Media Curator v{VERSION}*
```
**Template variables**:
- `{TIMESTAMP}` - ISO 8601 generation timestamp
- `{FILE_COUNT}` - Number of files in manifest
- `{TOTAL_SIZE}` - Archive size (e.g., "4.2 GB")
- `{MANIFEST_HASH}` - SHA-256 hash of manifest
- `{VERSION}` - AIWG version
## Scheduled Verification
### Cron Pattern for Daily Checks
```cron
# Quick verification daily at 2am
0 2 * * * cd /media/archives/podcast && /usr/local/bin/aiwg verify-archive . --verify --quiet || echo "Archive verification failed: $(pwd)" | mail -s "Archive Alert" admin@example.com
# Full verification weekly on Sunday at 3am
0 3 * * 0 cd /media/archives/podcast && /usr/local/bin/aiwg verify-archive . --verify 2>&1 | mail -s "Weekly Archive Verification" admin@example.com
```
### Systemd Timer
**Service file** (`/etc/systemd/system/archive-verify.service`):
```ini
[Unit]
Description=Verify media archive checksums
After=network.target
[Service]
Type=oneshot
User=media
WorkingDirectory=/media/archives/podcast
ExecStart=/usr/local/bin/aiwg verify-archive . --verify --quiet
StandardOutput=journal
StandardError=journal
```
**Timer file** (`/etc/systemd/system/archive-verify.timer`):
```ini
[Unit]
Description=Daily archive verification
[Timer]
OnCalendar=daily
Persistent=true
[Install]
WantedBy=timers.target
```
**Enable timer**:
```bash
sudo systemctl enable archive-verify.timer
sudo systemctl start archive-verify.timer
```
### Monitoring Integration
**Prometheus exporter pattern**:
```bash
#!/bin/bash
# Export verification status as Prometheus metrics
CHECKSUM_FILE="CHECKSUMS.sha256"
METRICS_FILE="/var/lib/prometheus/node_exporter/archive_integrity.prom"
# Quick verification
EXPECTED=$(grep '^# MANIFEST_HASH:' "$CHECKSUM_FILE" | awk '{print $3}')
ACTUAL=$(tail -n +4 "$CHECKSUM_FILE" | sha256sum | awk '{print $1}')
if [ "$EXPECTED" = "$ACTUAL" ]; then
MANIFEST_OK=1
else
MANIFEST_OK=0
fi
# Write metrics
cat > "$METRICS_FILE" <<EOF
# HELP archive_manifest_integrity Archive manifest integrity status (1=ok, 0=failed)
# TYPE archive_manifest_integrity gauge
archive_manifest_integrity{path="$PWD"} $MANIFEST_OK
# HELP archive_manifest_check_timestamp Unix timestamp of last verification
# TYPE archive_manifest_check_timestamp gauge
archive_manifest_check_timestamp{path="$PWD"} $(date +%s)
EOF
```
**Grafana alert rule**:
```yaml
groups:
- name: archive_integrity
interval: 5m
rules:
- alert: ArchiveManifestCorrupted
expr: archive_manifest_integrity == 0
for: 5m
labels:
severity: critical
annotations:
summary: "Archive manifest integrity check failed"
description: "Archive at {{ $labels.path }} has corrupted or tampered manifest"
```
## Standards Reference
| Standard | Specification | Purpose |
|----------|---------------|---------|
| **SHA-256** | NIST FIPS 180-4 | Cryptographic hash function for file integrity |
| **PREMIS 3.0** | Library of Congress | Preservation metadata for digital objects |
| **W3C PROV-O** | W3C Recommendation 2013 | Provenance ontology for tracking derivation |
| **ISO 8601** | ISO 8601:2019 | Date and time format (UTC timestamps) |
| **JSON-LD 1.1** | W3C Recommendation 2020 | Linked data format for provenance records |
### SHA-256 (NIST FIPS 180-4)
**Properties**:
- 256-bit hash output (64 hexadecimal characters)
- Collision resistance: 2^128 operations
- Pre-image resistance: 2^256 operations
- Deterministic: same input always produces same output
**Command**: `sha256sum <file>`
**Format**: `<64-char-hex> <path>`
### PREMIS 3.0 Fixity
**Purpose**: Record fixity information for digital preservation.
**Key elements**:
- `messageDigest` - Hash value with algorithm prefix (e.g., `sha256:abc123...`)
- `messageDigestAlgorithm` - Algorithm name (`SHA-256`)
- `messageDigestOriginator` - Software that computed hash
- `fixityCheckDateTime` - When fixity was verified
**Use case**: Long-term digital preservation requiring audit trails.
### W3C PROV-O
**Entity-Activity-Agent model**:
- **Entity**: Archive or checksum manifest
- **Activity**: Checksum generation or verification
- **Agent**: Software (AIWG Media Curator) that performed activity
**Key relationships**:
- `wasGeneratedBy` - Entity generated by activity
- `used` - Activity used entity
- `wasAssociatedWith` - Activity performed by agent
- `wasAttributedTo` - Entity attributed to agent
### ISO 8601 Timestamps
**Format**: `YYYY-MM-DDTHH:MM:SS.NNNNNNNNNZ`
**Requirements**:
- Always UTC (trailing `Z`)
- Nanosecond precision (9 decimal places)
- Monotonic (lexicographically sortable)
**Bash command**: `date -u +%Y-%m-%dT%H:%M:%S.%NZ`
**Example**: `2026-02-14T18:45:22.387654321Z`
## Performance Characteristics
| Operation | Time Complexity | Example Duration (100 GB archive) |
|-----------|-----------------|-----------------------------------|
| Quick verify (manifest only) | O(1) | < 1 second |
| Full verify (all files) | O(n × file_size) | 10-60 minutes (disk-bound) |
| Generate manifest | O(n × file_size) | 10-60 minutes (disk-bound) |
**Optimization tips**:
- Use SSD storage for faster checksumming
- Enable parallel hashing for multi-core systems (GNU parallel)
- Quick verification detects 99% of tampering instantly
- Schedule full verification during off-peak hours
## Error Scenarios
| Error | Cause | Recovery |
|-------|-------|----------|
| Manifest integrity failure | Manifest file corrupted/tampered | Restore from backup or regenerate |
| File verification failure | File corrupted or modified | Restore file from backup |
| Missing manifest | New archive or manifest deleted | Generate new manifest |
| Permission denied | Cannot read files | Fix permissions, run as appropriate user |
| Disk full | Cannot write manifest | Free disk space |
| Hash mismatch | File changed since manifest generated | Regenerate manifest if intentional |
## Integration Examples
### Git Pre-Commit Hook
Verify archive integrity before committing:
```bash
#!/bin/bash
# .git/hooks/pre-commit
if [ -f CHECKSUMS.sha256 ]; then
echo "Verifying archive integrity..."
if ! aiwg verify-archive . --verify --quiet; then
echo "Error: Archive verification failed" >&2
echo "Run 'aiwg verify-archive . --fix' to regenerate checksums" >&2
exit 1
fi
echo "✓ Archive integrity verified"
fi
```
### CI/CD Pipeline
GitHub Actions workflow:
```yaml
name: Verify Archive Integrity
on: [push, pull_request]
jobs:
verify:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Install AIWG
run: npm install -g aiwg
- name: Verify archive
run: aiwg verify-archive media/archives/podcast --verify
```
### Backup Verification
Verify backup integrity after rsync:
```bash
#!/bin/bash
# Backup and verify script
SOURCE="/media/source/podcast"
DEST="/media/backup/podcast"
# Sync files
rsync -av --delete "$SOURCE/" "$DEST/"
# Verify destination
cd "$DEST"
if aiwg verify-archive . --verify --quiet; then
echo "✓ Backup verified successfully"
else
echo "✗ Backup verification failed" >&2
exit 1
fi
```
## References
- @$AIWG_ROOT/agentic/code/addons/aiwg-utils/rules/human-authorization.md — Seek explicit authorization before regenerating checksums (overwrites existing CHECKSUMS.sha256)
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/acquire/SKILL.md — Acquisition skill that invokes integrity verification via --verify-after flag
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/provenance-tracking/SKILL.md — Provenance tracking that records fixity events alongside checksums
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/verify-archive/SKILL.md — High-level verify-archive skill that uses these integrity patterns