Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.
npx versuz@latest install jmagly-aiwg-plugins-media-curator-skills-check-completenessgit clone https://github.com/jmagly/aiwg.gitcp aiwg/SKILL.MD ~/.claude/skills/jmagly-aiwg-plugins-media-curator-skills-check-completeness/SKILL.md---
namespace: aiwg
name: check-completeness
platforms: [all]
description: Analyze collection completeness against canonical discography and generate prioritized gap report
commandHint:
argumentHint: '"<artist>" [--collection <path>] [--gaps-only] [--priority high|medium|low|all] [--output report.yaml]'
allowedTools: 'Bash, Read, Write, Glob, Grep, WebSearch, WebFetch'
model: sonnet
category: media-curator
---
# check-completeness
Analyze collection completeness by comparing actual inventory against canonical discography, identify gaps, score coverage, and generate prioritized acquisition plans.
## Purpose
Answer collection completeness questions:
- "How complete is my [artist] collection?"
- "What am I missing?"
- "What should I acquire next?"
- "What legendary performances am I missing?"
- "Where should I upgrade quality?"
Produces comprehensive gap reports with actionable acquisition strategies prioritized by importance, availability, and cost.
## Parameters
### Required
**`<artist>`** - Artist name (quoted if contains spaces)
- Matched against collection directory structure
- Used to fetch canonical discography from MusicBrainz
- Example: `"Pink Floyd"` or `"The Beatles"`
### Optional
**`--collection <path>`** - Collection root directory
- Default: Current working directory
- Must contain artist subdirectories
- Example: `/mnt/media/Music`
**`--gaps-only`** - Show only missing items, skip completeness scoring
- Faster execution when you just need gap list
- Useful for quick "what's missing?" queries
**`--priority <level>`** - Filter gaps by priority level
- Values: `high`, `medium`, `low`, `all` (default: `all`)
- Example: `--priority high` shows only critical gaps
**`--output <file>`** - Write report to file instead of stdout
- Format determined by extension: `.yaml`, `.json`, `.md`
- Default: stdout (terminal display)
- Example: `--output gaps-report.yaml`
**`--canonical-source <source>`** - Canonical discography source
- Values: `musicbrainz` (default), `discogs`, `allmusic`
- Falls back to secondary sources if primary fails
- Example: `--canonical-source discogs`
**`--include-legendary`** - Include legendary/bootleg content in analysis
- Default: Official releases only
- Adds section for high-priority unofficial recordings
- Requires legendary catalog file or web search
**`--quality-threshold <level>`** - Minimum quality for "complete" status
- Values: `lossless`, `320kbps`, `256kbps`, `any` (default: `any`)
- Example: `--quality-threshold lossless` treats 320kbps as needing upgrade
**`--create-gap-notes`** - Generate GAP-NOTE.md files for missing items
- Creates placeholder files in artist directory structure
- Each contains acquisition strategy and priority
- Enables parallel gap-fill workflow
**`--incremental`** - Incremental update mode
- Loads previous state file if exists
- Only re-fetches canonical if >7 days old
- Faster for routine monitoring
**`--cost-estimate`** - Include cost estimates in report
- Estimates based on typical pricing for format/availability
- Adds budget planning section to output
- Example range: "$15-25 for used CD"
## Workflow
### 1. Load Collection Inventory
```bash
# Use existing scan results if available
SCAN_FILE=".media-curator/scans/${ARTIST_SLUG}/scan-latest.json"
if [[ -f "$SCAN_FILE" ]]; then
INVENTORY=$(cat "$SCAN_FILE")
else
# Trigger fresh scan
/scan-collection "$ARTIST" --format json --output "$SCAN_FILE"
INVENTORY=$(cat "$SCAN_FILE")
fi
```
Inventory provides:
- All albums/releases present in collection
- Track counts and file formats
- Quality levels (lossless/lossy bitrates)
- File sizes and directory structure
### 2. Fetch Canonical Discography
```bash
# Check for cached canonical data
CANONICAL_FILE=".media-curator/completeness/${ARTIST_SLUG}/canonical.yaml"
CACHE_AGE_DAYS=$((( $(date +%s) - $(stat -c %Y "$CANONICAL_FILE" 2>/dev/null || echo 0) ) / 86400))
if [[ $CACHE_AGE_DAYS -gt 7 ]] || [[ ! -f "$CANONICAL_FILE" ]]; then
# Fetch fresh canonical discography
# MusicBrainz API example:
MBID=$(musicbrainz-lookup "$ARTIST" --type artist --format json | jq -r '.artists[0].id')
RELEASES=$(musicbrainz-releases "$MBID" --format json)
# Parse into canonical.yaml structure
echo "$RELEASES" | parse-canonical > "$CANONICAL_FILE"
fi
```
Canonical discography includes:
- Studio albums with track counts and years
- EPs and singles with A/B sides
- Official live albums
- Compilations (filtered by novelty)
- Official box sets
**MusicBrainz advantages:**
- Comprehensive and accurate
- Includes release dates and track counts
- Distinguishes editions (original, remaster, deluxe)
- Free API with rate limiting
**Discogs alternative:**
- More complete for rare/indie releases
- Includes pressing details and variants
- Better for vinyl collectors
- Requires API key
### 3. Build Extended Content Catalog
If `--include-legendary` flag set:
```bash
# Load legendary performances catalog
LEGENDARY_FILE=".media-curator/completeness/${ARTIST_SLUG}/legendary.yaml"
if [[ ! -f "$LEGENDARY_FILE" ]]; then
# Web search for "artist legendary performances bootleg"
# Or load from community-maintained catalog
# Or use private tracker tags (if integrated)
fetch-legendary-catalog "$ARTIST" > "$LEGENDARY_FILE"
fi
```
Extended catalog includes:
- Notable covers (tribute albums, sessions)
- Soundtrack contributions
- Collaboration tracks (appears on other artists' albums)
- Legendary live performances (soundboard, historical significance)
- Unreleased demos and outtakes
### 4. Compare Inventory to Canonical
```yaml
# Comparison logic pseudocode
for release in canonical_releases:
match = find_in_inventory(release.title, release.year)
if match:
if match.track_count == release.track_count:
release.status = "complete"
else:
release.status = "partial"
release.missing_tracks = release.track_count - match.track_count
release.quality = match.quality
release.source = match.source
else:
release.status = "missing"
```
**Fuzzy matching rules:**
- Tolerate minor title differences ("The Album" vs "Album, The")
- Match by year if title ambiguous
- Prefer exact track count match over year match
- Flag multiple matches as potential duplicates
### 5. Score Completeness
```python
# Weighted scoring formula
weights = {
'studio_albums': 0.40,
'eps': 0.15,
'singles': 0.15,
'live_albums': 0.10,
'notable_covers': 0.10,
'legendary': 0.10
}
component_scores = {}
for category, weight in weights.items():
total = len(canonical[category])
complete = sum(1 for item in canonical[category] if item.status == 'complete')
partial = sum(0.5 for item in canonical[category] if item.status == 'partial')
component_scores[category] = ((complete + partial) / total * 100) if total > 0 else 100
overall_score = sum(component_scores[cat] * weights[cat] for cat in weights)
# Quality bonus
lossless_pct = (lossless_count / total_count) * 100
if lossless_pct > 80:
overall_score += 5
```
### 6. Identify and Prioritize Gaps
```yaml
# Gap prioritization logic
for release in canonical_releases:
if release.status == "missing" or release.status == "partial":
priority = calculate_priority(release)
availability = check_availability(release)
cost = estimate_cost(release, availability)
gaps.append({
'item': release.title,
'type': release.type,
'year': release.year,
'priority': priority,
'availability': availability,
'estimated_cost': cost,
'reason': priority_reason(release)
})
gaps.sort(key=lambda x: (priority_order[x['priority']], x['year']))
```
**Priority calculation:**
```python
def calculate_priority(release):
score = 0
# Type importance
if release.type == 'studio_album':
score += 40
elif release.type in ['ep', 'single']:
score += 15
elif release.type == 'live_album':
score += 10
# Critical acclaim (if available)
if release.rating and release.rating > 8.0:
score += 20
# Classic period (artist-specific)
if release.year in artist_classic_years:
score += 15
# Completes a set
if is_part_of_trilogy(release) and have_other_parts(release):
score += 10
# Historical significance
if release.legendary or release.historically_significant:
score += 25
# Assign tier
if score >= 60:
return "HIGH"
elif score >= 30:
return "MEDIUM"
else:
return "LOW"
```
### 7. Check Availability
```python
def check_availability(release):
sources = []
# Streaming services (easy, legal, lossy)
if available_on_streaming(release):
sources.append({
'type': 'streaming',
'services': ['Apple Music', 'Spotify', 'Tidal'],
'quality': '256kbps AAC',
'cost': 'subscription',
'difficulty': 'easy'
})
# Digital purchase (legal, often lossless)
if available_on_bandcamp(release):
sources.append({
'type': 'digital_purchase',
'platform': 'Bandcamp',
'quality': 'FLAC / 320kbps MP3',
'cost': '$7-12',
'difficulty': 'easy'
})
# Physical (CD/vinyl)
discogs_listings = check_discogs(release)
if discogs_listings:
sources.append({
'type': 'physical',
'format': 'CD',
'availability': f"{len(discogs_listings)} listings",
'price_range': f"${min(discogs_listings)} - ${max(discogs_listings)}",
'difficulty': 'moderate'
})
# Private trackers (torrents, requires ratio)
if user_has_tracker_access():
tracker_results = search_trackers(release)
if tracker_results:
sources.append({
'type': 'private_tracker',
'sites': tracker_results.sites,
'quality': 'FLAC',
'cost': 'ratio',
'difficulty': 'moderate' if tracker_results.seeded else 'hard'
})
return sources
```
### 8. Generate Report
Output format depends on `--output` extension or defaults to rich terminal display:
**Terminal output (default):**
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Pink Floyd Collection Completeness Report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 Overall Score: 92.3% (Excellent Collection)
🎵 Canonical Releases: 94.1%
✅ Studio Albums: 16/17 (94.1%)
✅ EPs: 4/4 (100%)
✅ Singles: 24/30 (80.0%)
✅ Live Albums: 12/15 (80.0%)
🎸 Extended Content: 78.5%
✅ Notable Covers: 8/12 (66.7%)
✅ Soundtracks: 3/5 (60.0%)
⚠️ Legendary: 12/47 (25.5%)
💿 Quality Distribution:
🟢 Lossless (FLAC): 87% (298 files)
🟡 High (320kbps): 11% (38 files)
🔴 Medium (256kbps): 2% (6 files)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Top 5 Acquisition Priorities
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔴 HIGH: The Endless River (2014)
Missing: Complete studio album
Reason: Only missing album from official discography
Sources:
• Apple Music (stream rip, 256kbps) - FREE with subscription
• Bandcamp (FLAC) - $9.99
• Discogs (used CD) - $12-18
Recommended: Bandcamp FLAC purchase
🔴 HIGH: Live at Pompeii (1972) - Soundboard
Missing: Legendary performance
Reason: Exceptional quality, historically significant
Sources:
• Private tracker (FLAC soundboard) - Ratio cost
• YouTube rip (256kbps, unofficial) - FREE but low quality
Recommended: Private tracker if available
🟡 MEDIUM: Obscured by Clouds B-sides
Missing: 3 tracks from singles
Reason: Completes album era, unique content
Sources:
• Compilation "Works" (1983) - Discogs $15-20
• Individual singles (rare) - eBay ~$30 each
Recommended: Works compilation
🟡 MEDIUM: The Wall (24bit/96kHz remaster)
Upgrade: Current 320kbps → Lossless remaster
Reason: Significant audio improvement, bonus tracks
Sources:
• HDtracks (24bit FLAC) - $19.99
• Qobuz (24bit streaming) - Subscription
Recommended: Wait for HDtracks sale (<$15)
🟢 LOW: Relics (1971 compilation)
Missing: Compilation of earlier singles
Reason: Low priority, all tracks owned on original albums
Sources:
• Streaming (256kbps) - FREE
• CD (out of print) - eBay $20-30
Recommended: Stream rip if desired for convenience
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Acquisition Plan Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Next 5 acquisitions:
1. The Endless River - Bandcamp FLAC ($9.99)
2. Live at Pompeii - Private tracker (ratio)
3. Obscured by Clouds B-sides - Works CD ($18)
4. The Wall remaster - Wait for sale (~$15)
5. (Optional) Relics - Stream rip (FREE)
💰 Estimated cost: $43-60 (excluding sale wait items)
⏱️ Estimated time: 1-2 weeks
🎯 Next score: ~96.5% (Completionist tier)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Report saved to: .media-curator/completeness/pink-floyd/gap-report-2026-02-14.yaml
GAP-NOTE files created: 5 items
```
**YAML output (--output report.yaml):**
Uses the full gap report structure defined in the Completeness Tracker agent documentation.
**Markdown output (--output report.md):**
Human-readable report with sections, tables, and acquisition instructions.
### 9. Create GAP-NOTE Files (if --create-gap-notes)
```bash
for gap in high_priority_gaps:
NOTE_PATH="${COLLECTION}/${ARTIST}/${gap.year} - ${gap.title}/GAP-NOTE.md"
mkdir -p "$(dirname "$NOTE_PATH")"
cat > "$NOTE_PATH" << EOF
# GAP-NOTE: ${gap.title} (${gap.year})
## Expected Content
- ${gap.type}: ${gap.track_count} tracks
- ${gap.description}
- Rating: ${gap.rating}/10
## Acquisition Strategy
${format_acquisition_steps(gap.sources)}
## Priority
**${gap.priority}** - ${gap.reason}
## Sources
${format_source_links(gap)}
- Last searched: Never
EOF
done
```
## Examples
### Example 1: Basic completeness check
```bash
/check-completeness "Pink Floyd" --collection /mnt/media/Music
```
Output: Full terminal report with scores and top 5 priorities
### Example 2: Just show high-priority gaps
```bash
/check-completeness "The Beatles" --gaps-only --priority high
```
Output:
```
High-Priority Gaps for The Beatles:
1. Let It Be Sessions (1970) - Legendary unreleased sessions
2. Live at Hollywood Bowl (original 1977) - Out of print, reissue available
3. Anthology outtakes (various) - 12 tracks not on official release
```
### Example 3: Create acquisition workflow
```bash
/check-completeness "Led Zeppelin" \
--include-legendary \
--create-gap-notes \
--output gaps-report.yaml
```
Creates:
- `gaps-report.yaml` - Full structured report
- GAP-NOTE.md files in each missing album directory
- Updated completeness state file
### Example 4: Incremental monitoring
```bash
# Weekly cron job
/check-completeness "Grateful Dead" \
--incremental \
--quality-threshold lossless \
--output /reports/weekly-$(date +%F).yaml
```
Fast execution, only re-fetches canonical if stale.
## Integration with Other Commands
**Before completeness check:**
```bash
/scan-collection "Artist" --format json # Generates inventory
```
**After completeness check:**
```bash
/acquisition-plan "Artist" --from-gaps gaps-report.yaml # Plan downloads
/quality-audit "Artist" --upgrades-only # Find quality improvement targets
```
**Parallel gap filling:**
```bash
/completeness-fill "Artist" --priority high --source streaming &
/completeness-fill "Artist" --priority high --source physical &
wait
/check-completeness "Artist" --incremental # Re-check after acquisition
```
## Error Handling
**MusicBrainz API failure:**
- Fall back to Discogs API
- If both fail, use cached canonical (warn if >30 days old)
- Suggest manual canonical.yaml creation
**Artist not found:**
- Suggest fuzzy matches from collection directory names
- Offer to use directory name as canonical source
**No collection inventory:**
- Auto-trigger `/scan-collection` if not found
- Error if collection path invalid
## Performance
**Expected execution time:**
- Small collection (<500 files): 5-10 seconds
- Medium collection (500-2000 files): 15-30 seconds
- Large collection (>2000 files): 30-60 seconds
- Incremental mode: 50% faster (uses cached canonical)
**Rate limiting:**
- MusicBrainz: 1 request/second (built-in delay)
- Discogs: 60 requests/minute (with API key)
- Wait and retry on 429 responses
## Output Files
All output written to `.media-curator/completeness/{artist-slug}/`:
```
.media-curator/completeness/pink-floyd/
├── canonical.yaml # Cached canonical discography
├── legendary.yaml # Extended content catalog
├── state.json # Completeness tracking state
├── gap-report-2026-02-14.yaml
├── gap-report-2026-01-15.yaml # Historical reports
└── history.json # Completeness score over time
```
## Success Criteria
- Completeness scores accurate within ±2%
- Zero false positives (claiming missing when present)
- Priority assignments align with collector consensus >90%
- Availability checks current within 7 days
- Report generation <60 seconds for any collection size
## References
- @$AIWG_ROOT/agentic/code/addons/aiwg-utils/rules/research-before-decision.md — Research canonical discography (MusicBrainz, Discogs) before determining completeness
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/analyze-artist/SKILL.md — Artist analysis that establishes the canonical discography used as completeness baseline
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/gap-documentation/SKILL.md — Gap documentation skill used to record identified missing items
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/find-sources/SKILL.md — Source discovery skill used to locate missing releases identified by completeness check