
extract-figures-from-paper

Extract every figure from a research-paper PDF with its caption, tight crops (no redundant prose), and white padding for breathing room. Use when the user wants to pull figures out of a paper, given a PDF path, paper title, or description.



§ 02 — Install

Get extract-figures-from-paper.

Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.

One-line install · Claude Code

$ npx versuz@latest install vivekkarmarkar-claude-code-os-skills-extract-figures-from-paper

Or clone the repo

$ git clone https://github.com/VivekKarmarkar/claude-code-os.git

Or copy the SKILL.md manually

$ cp claude-code-os/SKILL.MD ~/.claude/skills/vivekkarmarkar-claude-code-os-skills-extract-figures-from-paper/SKILL.md


---
name: extract-figures-from-paper
description: Extract every figure from a research-paper PDF with its caption, tight crops (no redundant prose), and white padding for breathing room. Use when the user wants to pull figures out of a paper, given a PDF path, paper title, or description.
---

# Extract Figures From Paper

Given a research-paper PDF, produce one PNG per figure, each containing the figure and its caption, tightly cropped by the actual geometric extent of figure content, with white padding for breathing room.

## When to Use

- User says "extract figures from this paper", "pull out all the figures", "get me the figures from X"
- User provides a PDF path, paper title, or descriptive reference to a paper

## Inputs

Parse from user context:
- **PDF path** — required. If user gives a title / description instead, resolve via the `find-paper-by-title` skill or ask.
- **Output directory** — default `extracted_figures/`
- **Stem** — filename prefix, default = PDF basename. Use something short (e.g. `paper1`) when extracting from multiple papers into one directory.
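The defaults above can be resolved in a few lines. This is only a sketch: `resolve_inputs` is a hypothetical helper, and it assumes "PDF basename" means the file name without its `.pdf` extension.

```python
from pathlib import Path
from typing import Optional


def resolve_inputs(pdf_path: str, out_dir: Optional[str] = None, stem: Optional[str] = None) -> dict:
    """Apply the documented defaults to the three inputs."""
    pdf = Path(pdf_path)
    return {
        "pdf": pdf,
        # Default output directory named in the Inputs list.
        "out_dir": Path(out_dir) if out_dir else Path("extracted_figures"),
        # Assumption: the default stem is the basename without the extension.
        "stem": stem if stem else pdf.stem,
    }
```

Passing an explicit short stem such as `paper1` keeps file names readable when several papers share one output directory.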

## Method (why this works)

The naive approach ("render the page and crop above the caption") fails on title pages, multi-column layouts, and pages with inline prose above the figure. Instead:

1. **Ground truth from captions**: scan all text blocks, match `^(?:Fig\.?|Figure)\s*(\d+)` at block start. Every matching block number is a figure that must be extracted.
2. **Content-relative bbox**: for each caption, collect bboxes of (a) raster images via `page.get_image_info()` and (b) vector drawings via `page.get_drawings()` that lie strictly above the caption and below the nearest caption higher on the same page. Union them.
3. **Include the caption**: union the caption's own bbox into the crop, so the caption is always present.
4. **Padding**: pad the PDF crop bbox by 4 pt, then after rasterizing at 200 DPI, add 40 px of white border via `PIL.ImageOps.expand` for visual breathing room.
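The four steps can be sketched without any PDF library by treating every bbox as an `(x0, y0, x1, y1)` tuple in PDF points, with y growing downward. This is a minimal illustration, not the shipped `extract_figures.py`: the helper names (`caption_number`, `union`, `figure_crop`, `output_size_px`) are hypothetical. In a real run the text blocks would likely come from a call such as `page.get_text("blocks")`, and the content boxes from `page.get_image_info()` and `page.get_drawings()`.

```python
import re
from typing import Iterable, List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in PDF points

# Step 1: a caption is a text block that starts with "Fig", "Fig." or "Figure" and a number.
CAPTION_RE = re.compile(r"^(?:Fig\.?|Figure)\s*(\d+)")


def caption_number(block_text: str) -> Optional[int]:
    """Return the figure number if the block starts with a caption label, else None."""
    match = CAPTION_RE.match(block_text)
    return int(match.group(1)) if match else None


def union(boxes: Iterable[BBox]) -> BBox:
    """Smallest bbox that contains every box in `boxes`."""
    x0s, y0s, x1s, y1s = zip(*boxes)
    return (min(x0s), min(y0s), max(x1s), max(y1s))


def figure_crop(caption: BBox, content: List[BBox], upper_limit: float,
                pad_pt: float = 4.0) -> BBox:
    """Crop box for one figure.

    `upper_limit` is the bottom edge of the nearest caption higher on the
    page, or 0 (the page top) when there is none.
    """
    # Step 2: keep content that starts below the previous caption and ends above this one.
    above = [b for b in content if b[1] >= upper_limit and b[3] < caption[1]]
    # Step 3: the caption's own bbox is always part of the union.
    x0, y0, x1, y1 = union(above + [caption])
    # Step 4 (first half): pad the crop in PDF points.
    return (x0 - pad_pt, y0 - pad_pt, x1 + pad_pt, y1 + pad_pt)


def output_size_px(crop: BBox, dpi: int = 200, border_px: int = 40) -> Tuple[int, int]:
    """Approximate PNG size: points -> pixels at `dpi` (72 pt per inch), plus the white border."""
    width = round((crop[2] - crop[0]) * dpi / 72) + 2 * border_px
    height = round((crop[3] - crop[1]) * dpi / 72) + 2 * border_px
    return (width, height)
```

The size estimate is approximate because the rasterizer may round differently. The white border itself is added after rasterizing, as step 4 describes, with `PIL.ImageOps.expand`.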

Invariants to verify after running:
- `set(extracted) == set(ground_truth)` — no figure missing
- every PNG opens and is non-trivially large
- filename number matches caption number
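These invariants could be checked mechanically along the following lines. The sketch makes two assumptions that do not come from the skill: output files are named `<stem>_fig<N>.png` (a hypothetical naming scheme), and a PNG "opens" if it starts with the PNG signature, which is a weaker stand-in for actually decoding it with Pillow. `check_invariants` and `min_bytes` are illustrative names.

```python
import re
from pathlib import Path
from typing import Iterable, List

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def check_invariants(out_dir: str, stem: str, ground_truth: Iterable[int],
                     min_bytes: int = 1024) -> List[str]:
    """Return a list of problems; an empty list means every invariant holds."""
    # Assumed naming scheme: <stem>_fig<N>.png
    pattern = re.compile(rf"^{re.escape(stem)}_fig(\d+)\.png$")
    found = {}
    for path in Path(out_dir).iterdir():
        match = pattern.match(path.name)
        if match:
            found[int(match.group(1))] = path

    problems = []
    expected = set(ground_truth)
    # Invariant 1: extracted figure numbers equal the caption-derived ground truth.
    if expected - set(found):
        problems.append(f"missing figures: {sorted(expected - set(found))}")
    if set(found) - expected:
        problems.append(f"unexpected figures: {sorted(set(found) - expected)}")
    # Invariant 2: each file looks like a PNG and is not trivially small.
    for number, path in sorted(found.items()):
        data = path.read_bytes()
        if not data.startswith(PNG_SIGNATURE):
            problems.append(f"{path.name}: not a PNG")
        elif len(data) < min_bytes:
            problems.append(f"{path.name}: only {len(data)} bytes")
    return problems
```

Comparing file-name numbers against the ground-truth set covers the third invariant only indirectly. Confirming that the caption inside each image carries the same number still needs a visual check.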

## How to Run

```bash
python3 ~/.claude/skills/extract-figures-from-paper/extract_figures.py <PDF> \
    [--out DIR] [--stem NAME] [--dpi 200] [--pad 40]
```

The script prints a JSON report and exits with code 2 if any figure is missing.
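A caller can combine the report and the exit code roughly as follows. This is a hedged sketch: `run_extraction` is a hypothetical wrapper, and it assumes stdout holds only the JSON report and that the exit code is 0 on full success. The report's fields are not specified here, so the sketch treats it as opaque JSON.

```python
import json
import subprocess
from typing import Any, List, Tuple


def run_extraction(cmd: List[str]) -> Tuple[Any, bool]:
    """Run an extraction command and return (report, complete).

    `complete` is False when the process exits with code 2, the documented
    signal that at least one figure is missing.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode not in (0, 2):
        # Any other exit code is treated as a crash rather than a report.
        raise RuntimeError(f"exit code {proc.returncode}: {proc.stderr.strip()}")
    # Assumption: stdout contains the JSON report and nothing else.
    report = json.loads(proc.stdout)
    return report, proc.returncode == 0
```

Treating exit code 2 as "report available but incomplete" lets the caller still inspect which figures were produced before deciding whether to retry.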

### Example (two papers into one directory)

```bash
python3 ~/.claude/skills/extract-figures-from-paper/extract_figures.py \
    "Chen-tactile tomography-1.pdf" --out extracted_figures --stem paper2

python3 ~/.claude/skills/extract-figures-from-paper/extract_figures.py \
    "Autonomous_Robotic_Tissue_Palpation...pdf" --out extracted_figures --stem paper1
```

### Verification after running

Always visually spot-check at least 2 outputs with the Read tool to confirm:
- ✅ the caption text ("Fig. N. ...") is visible at the bottom
- ✅ no title/abstract/body prose bleeds into the crop
- ✅ multi-panel figures (a/b/c/...) are kept together

If a caption is missing or prose bleeds in, compare against the reference results in `examples/` that shipped with the skill; those are known-good.

## Dependencies

- `pymupdf` (imports as `fitz`)
- `Pillow`

Install: `pip install pymupdf pillow`
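A quick pre-flight check can confirm both packages are importable before running the script. The helper name `missing_packages` is hypothetical. The mapping pairs each import name with its pip package name, as listed above.

```python
from importlib.util import find_spec
from typing import Dict, List

# import name -> pip package name
REQUIRED: Dict[str, str] = {"fitz": "pymupdf", "PIL": "pillow"}


def missing_packages(required: Dict[str, str] = REQUIRED) -> List[str]:
    """Return the pip names of required packages that cannot be imported."""
    return [pip_name for module, pip_name in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing dependencies. Install with: pip install " + " ".join(missing))
```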

## Reference Results

`examples/` contains the JSON report from the session that produced this skill:
- Paper 1 (Beber et al., robotic tissue palpation): 5/5 figures
- Paper 2 (Chen, tactile tomography): 16/16 figures
- 21/21 total, zero missing, all captions included, all tight crops.

See `examples/report.json` for the exact extraction output and `examples/README.md` for the story behind the method (including the two bugs that shaped it: missing captions and prose bleed).