Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.
npx versuz@latest install jmagly-aiwg-plugins-media-curator-skills-gap-documentationgit clone https://github.com/jmagly/aiwg.gitcp aiwg/SKILL.MD ~/.claude/skills/jmagly-aiwg-plugins-media-curator-skills-gap-documentation/SKILL.md---
namespace: aiwg
name: gap-documentation
platforms: [all]
description: Document missing content in media archives with standardized GAP-NOTE markers that guide acquisition and measure completeness
---
# GAP-NOTE Documentation Skill
A systematic approach to documenting missing content in media archives using standardized gap markers that guide acquisition, measure completeness, and enable parallel collection workflows.
## Overview
GAP-NOTE.md files serve as placeholders in directories where content is expected but not yet acquired. They transform empty directories from ambiguous voids into actionable instructions, creating a clear separation between "not yet collected" and "intentionally excluded."
This pattern emerged from field testing during music archive curation (issue #81), where agents needed explicit guidance on acquisition priorities and strategies without requiring central coordination.
## The GAP-NOTE.md Template
### Standard Structure
```markdown
# GAP-NOTE: {Directory Purpose}
## Expected Content
{What should be here}
## Acquisition Strategy
{How to obtain it}
## Priority
{HIGH | MEDIUM | LOW}
## Sources
- {Source 1}
- {Source 2}
```
### Field Descriptions
**Title Line (`# GAP-NOTE: ...`)**
- Brief description of directory purpose
- Helps agents understand context at a glance
- Example: `# GAP-NOTE: BBC Radio Sessions (1990-1995)`
**Expected Content**
- Specific description of missing content
- Include formats, quality expectations, metadata requirements
- Be concrete enough to validate completion
- Example: "Lossless audio (FLAC/ALAC) of all BBC Radio 1 sessions, with broadcast date, presenter name, and tracklisting"
**Acquisition Strategy**
- Step-by-step instructions for obtaining content
- Include API endpoints, web scrapers, manual download URLs
- Specify required credentials, rate limits, legal considerations
- Example: "Use BBC Sounds API → filter by artist → download via yt-dlp → transcode to FLAC"
**Priority**
- Standardized levels: HIGH, MEDIUM, LOW
- Determines resource allocation for parallel workflows
- See Priority Scoring section below
**Sources**
- URLs, API endpoints, contact information
- Structured as bulleted list for easy parsing
- Include backup sources where available
- Example: `- https://www.bbc.co.uk/sounds/brand/b006wkqb`
## Lifecycle Management
### 1. Creation
Created when establishing directory structure but before content acquisition:
```bash
# Create directory and gap marker
mkdir -p "Artist Name/Radio Sessions/BBC Radio 1"
cat > "Artist Name/Radio Sessions/BBC Radio 1/GAP-NOTE.md" <<'INNER_EOF'
# GAP-NOTE: BBC Radio 1 Sessions
## Expected Content
Lossless audio recordings of all BBC Radio 1 sessions (1990-1995)
with metadata: broadcast date, presenter, tracklist, duration
## Acquisition Strategy
1. Query BBC Sounds API for artist sessions
2. Download via yt-dlp with metadata extraction
3. Transcode to FLAC if necessary
4. Extract and verify tracklist from audio
## Priority
HIGH
## Sources
- https://www.bbc.co.uk/sounds/brand/b006wkqb
- https://archive.org/details/bbcradio1sessions
INNER_EOF
```
### 2. Active Use
Agents read GAP-NOTE.md as instructions during acquisition:
```bash
# Find high-priority gaps
find . -name "GAP-NOTE.md" -exec grep -l "^## Priority$" {} \; | \
xargs grep -A1 "^## Priority$" | grep "HIGH" | cut -d: -f1
# Read acquisition strategy
grep -A20 "^## Acquisition Strategy$" "path/to/GAP-NOTE.md"
```
### 3. Completion and Removal
Once content acquired, GAP-NOTE.md is removed or replaced:
```bash
# Option 1: Remove when content present
rm "Artist Name/Radio Sessions/BBC Radio 1/GAP-NOTE.md"
# Option 2: Replace with catalog
mv "Artist Name/Radio Sessions/BBC Radio 1/GAP-NOTE.md" \
"Artist Name/Radio Sessions/BBC Radio 1/CATALOG.md"
```
## Priority Scoring
### HIGH Priority
**Criteria:**
- Core content essential to collection purpose
- Readily available from reliable sources
- Legal and ethical to acquire
- Low technical complexity
**Examples:**
- Radio sessions from major broadcasters (BBC, NPR)
- Album artwork from official label websites
- Press photos from artist official sites
- Lyrics from licensed databases
**Expected timeframe:** Hours to days
### MEDIUM Priority
**Criteria:**
- Supplementary content enhancing collection
- Requires moderate effort or coordination
- May involve multiple sources or processing steps
- Quality/completeness trade-offs acceptable
**Examples:**
- Concert bootlegs from trading communities
- Artist photos from Wikimedia Commons requiring attribution
- Interviews transcribed from audio/video
- Fan-created setlists cross-referenced with recordings
**Expected timeframe:** Days to weeks
### LOW Priority
**Criteria:**
- Nice-to-have content, non-essential
- Difficult to source or verify
- May require manual intervention
- Uncertain availability or legal status
**Examples:**
- Pre-label era demo recordings
- Rare promotional materials
- Personal photos from fan archives
- Unverified session information
**Expected timeframe:** Weeks to months (or indefinite)
## Completeness Measurement
### Gap Inventory
Count remaining gaps to measure collection progress:
```bash
# Total gap count
find . -name "GAP-NOTE.md" | wc -l
# Gaps by priority
echo "=== Collection Gaps by Priority ==="
echo -n "HIGH: "
grep -r "^## Priority" --include="GAP-NOTE.md" . | grep -A1 "^## Priority$" | grep -c "HIGH"
echo -n "MEDIUM: "
grep -r "^## Priority" --include="GAP-NOTE.md" . | grep -A1 "^## Priority$" | grep -c "MEDIUM"
echo -n "LOW: "
grep -r "^## Priority" --include="GAP-NOTE.md" . | grep -A1 "^## Priority$" | grep -c "LOW"
```
### Completeness Percentage
Calculate completion based on gap ratio:
```bash
#!/bin/bash
# Calculate archive completeness
TOTAL_DIRS=$(find . -type d | wc -l)
GAP_DIRS=$(find . -name "GAP-NOTE.md" | wc -l)
COMPLETE_DIRS=$((TOTAL_DIRS - GAP_DIRS))
PERCENT=$((COMPLETE_DIRS * 100 / TOTAL_DIRS))
echo "Archive Completeness: $PERCENT%"
echo " Complete: $COMPLETE_DIRS/$TOTAL_DIRS directories"
echo " Gaps remaining: $GAP_DIRS"
```
### Gap Report
Generate structured report for review:
```bash
#!/bin/bash
# Generate gap report with locations and priorities
echo "# Archive Gap Report"
echo "Generated: $(date -I)"
echo ""
for priority in HIGH MEDIUM LOW; do
echo "## $priority Priority"
find . -name "GAP-NOTE.md" -print0 | while IFS= read -r -d '' file; do
if grep -A1 "^## Priority$" "$file" | grep -q "$priority"; then
dir=$(dirname "$file")
title=$(grep "^# GAP-NOTE:" "$file" | sed 's/# GAP-NOTE: //')
echo "- **$dir**"
echo " $title"
fi
done
echo ""
done
```
## Parallel Gap Fill Pattern
Multiple agents can work on gaps simultaneously by claiming GAP-NOTE files:
### 1. Claim a Gap
```bash
# Add agent claim to gap note
echo "" >> "path/to/GAP-NOTE.md"
echo "## Claimed By" >> "path/to/GAP-NOTE.md"
echo "Agent: $AGENT_ID" >> "path/to/GAP-NOTE.md"
echo "Timestamp: $(date -Iseconds)" >> "path/to/GAP-NOTE.md"
```
### 2. Execute Acquisition
Follow the acquisition strategy documented in the GAP-NOTE:
```bash
# Read strategy section
STRATEGY=$(sed -n '/^## Acquisition Strategy$/,/^##/p' "path/to/GAP-NOTE.md" | \
grep -v "^##")
# Execute steps (example)
while IFS= read -r step; do
echo "Executing: $step"
# ... implement step execution ...
done <<< "$STRATEGY"
```
### 3. Validate and Close
```bash
# Verify content acquired
if [ -f "expected-content-file.flac" ]; then
# Move GAP-NOTE to completion log
mkdir -p .archive-logs/completed-gaps
mv "path/to/GAP-NOTE.md" ".archive-logs/completed-gaps/$(date -I)-gap-filled.md"
echo "Gap filled successfully"
else
# Update GAP-NOTE with failure info
echo "" >> "path/to/GAP-NOTE.md"
echo "## Acquisition Attempt" >> "path/to/GAP-NOTE.md"
echo "Failed: $(date -Iseconds)" >> "path/to/GAP-NOTE.md"
echo "Reason: Content not available from listed sources" >> "path/to/GAP-NOTE.md"
fi
```
## Real-World Examples
### Example 1: Radio Sessions
```markdown
# GAP-NOTE: BBC Radio 1 Sessions (1990-1995)
## Expected Content
Lossless audio recordings (FLAC/ALAC) of all BBC Radio 1 sessions
between 1990-1995, including:
- Full session audio
- Broadcast date and presenter name
- Complete tracklist with song titles
- Duration and audio quality metadata
## Acquisition Strategy
1. Query BBC Sounds API: `https://api.bbc.co.uk/sounds/search?q={artist}`
2. Filter results for Radio 1 sessions in date range
3. Download via yt-dlp: `yt-dlp --extract-audio --audio-format flac {url}`
4. Extract metadata using `ffprobe` and validate against expected format
5. Organize into subdirectories by broadcast date
## Priority
HIGH
## Sources
- https://www.bbc.co.uk/sounds/brand/b006wkqb
- https://archive.org/details/bbcradio1sessions
- https://musicbrainz.org (for session metadata validation)
```
### Example 2: Album Artwork
```markdown
# GAP-NOTE: Album Artwork Collection
## Expected Content
High-resolution album covers (minimum 1400x1400px, JPEG or PNG)
for all studio albums, EPs, and major singles:
- Front cover (required)
- Back cover (optional)
- CD/vinyl labels (optional)
- Embedded artwork extracted from audio files
## Acquisition Strategy
1. Check MusicBrainz Cover Art Archive API
2. Query Discogs API for releases: `https://api.discogs.com/artists/{id}/releases`
3. Download largest available image from each release
4. Fallback: extract embedded artwork from FLAC files using `metaflac --export-picture-to=cover.jpg`
5. Validate dimensions and format
## Priority
MEDIUM
## Sources
- https://coverartarchive.org/release/{mbid}
- https://api.discogs.com/releases/{id}
- Embedded in existing FLAC files
```
### Example 3: Interview Transcripts
```markdown
# GAP-NOTE: Interview Archive
## Expected Content
Transcripts of all major interviews (print, radio, video) from 1988-2000:
- Full text transcription
- Interview date and publication/broadcast source
- Interviewer name
- Topic tags (e.g., "album recording", "tour", "influences")
## Acquisition Strategy
1. Search online archives: Rock's Backpages, music magazine databases
2. For video/audio interviews: use Whisper API for transcription
3. Manual transcription for short clips (<5 minutes)
4. Cross-reference dates with tour/album timeline for validation
5. Format as markdown with YAML frontmatter
## Priority
LOW
## Sources
- https://www.rocksbackpages.com
- https://archive.org/details/audio (radio interviews)
- YouTube channels (fan uploads, use Whisper for transcription)
```
## Management Commands
### Create Gap Template
```bash
# Function to scaffold new gap note
create_gap_note() {
local dir="$1"
local title="$2"
local priority="${3:-MEDIUM}"
mkdir -p "$dir"
cat > "$dir/GAP-NOTE.md" <<INNER_EOF
# GAP-NOTE: $title
## Expected Content
{Describe what content should be here}
## Acquisition Strategy
1. {Step 1}
2. {Step 2}
## Priority
$priority
## Sources
- {Source URL 1}
INNER_EOF
echo "Created gap note: $dir/GAP-NOTE.md"
}
# Usage
create_gap_note "Artist/Radio Sessions/NPR Tiny Desk" \
"NPR Tiny Desk Concert" \
"HIGH"
```
### Bulk Status Check
```bash
# Check all gaps and report status
check_gaps() {
echo "=== Archive Gap Status ==="
echo "Generated: $(date)"
echo ""
find . -name "GAP-NOTE.md" -print0 | sort -z | while IFS= read -r -d '' file; do
dir=$(dirname "$file")
title=$(grep "^# GAP-NOTE:" "$file" | sed 's/# GAP-NOTE: //')
priority=$(grep -A1 "^## Priority$" "$file" | tail -n1 | tr -d ' ')
claimed=$(grep -q "^## Claimed By$" "$file" && echo "[CLAIMED]" || echo "")
printf "%-60s [%s] %s\n" "$dir" "$priority" "$claimed"
done
}
```
### Clean Completed Gaps
```bash
# Archive completed gap notes
archive_completed_gaps() {
mkdir -p .archive-logs/completed-gaps
find . -name "GAP-NOTE.md" -print0 | while IFS= read -r -d '' file; do
dir=$(dirname "$file")
# Check if directory has actual content now
content_count=$(find "$dir" -type f ! -name "GAP-NOTE.md" | wc -l)
if [ "$content_count" -gt 0 ]; then
echo "Gap filled: $dir"
mv "$file" ".archive-logs/completed-gaps/$(date -I)-$(basename "$dir" | tr ' ' '-').md"
fi
done
}
```
## Integration with Media Curator Framework
### Gap Detection
The Media Curator framework can automatically detect gaps:
```bash
# Detect expected-but-missing directories
aiwg media-curator detect-gaps --archive /path/to/archive
```
### Gap Filling Workflow
```bash
# Fill high-priority gaps
aiwg media-curator fill-gaps --priority HIGH --parallel 3
```
### Gap Reporting
```bash
# Generate gap report
aiwg media-curator gap-report --format markdown > GAPS.md
```
## Best Practices
1. **Be specific in Expected Content** - Vague descriptions lead to incomplete acquisitions
2. **Test acquisition strategies** - Verify each step works before documenting
3. **Update priority as context changes** - Source availability affects urgency
4. **Use consistent formatting** - Enables automated parsing and reporting
5. **Archive completed gaps** - Maintain history of what was filled and when
6. **Review gaps periodically** - Sources and priorities evolve over time
7. **Claim gaps before starting** - Prevents duplicate effort in parallel workflows
## See Also
- `@$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/catalog-generation.md` - Replacing GAP-NOTEs with catalogs
- `@$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/archive-organization.md` - Directory structure conventions
- Issue #81 (gitea) - Original field testing that discovered this pattern
## References
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/check-completeness/SKILL.md — Completeness analysis that identifies gaps which this skill documents
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/analyze-artist/SKILL.md — Artist analysis that establishes the canonical discography used to detect gaps
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/find-sources/SKILL.md — Source discovery used to locate items identified as gaps
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/provenance-tracking/SKILL.md — Provenance tracking applied when gaps are resolved and acquisitions recorded