---
name: reproduce-page-basic-tex
description: Reproduce a single research-paper PDF page as a standalone LaTeX source file (`.tex` only — NO compile, NO open, NO viewer launch). Prose taken verbatim from the PDF text layer, equations hand-transcribed from a visual reading, figures as placeholder boxes, tables reconstructed via booktabs. Runs fully autonomously with no confirmation prompts. Use when the user invokes `/reproduce-page-basic-tex` with a PDF path and page number, or when they want batch `.tex` generation without per-page interactive viewing.
---
# reproduce-page-basic-tex
Produce ONLY the `.tex` source file for a single PDF page reconstruction. This is the source-only variant of `reproduce-page-basic` — it stops after writing the `.tex` file and never runs `pdflatex` or `xdg-open`.
**Scope boundary** (identical to `reproduce-page-basic`): figures become placeholder boxes with prose descriptions; citations stay as literal `[N]` markers; output uses standard `article` class; not pixel-perfect; no BibTeX; no scanned PDFs.
## When to use this skill vs. `reproduce-page-basic`
| If you want… | Use |
|---|---|
| A `.tex` file plus a rendered `.pdf` you can look at immediately | `reproduce-page-basic` |
| A `.tex` file only, with no side effects, for batch processing or later compilation | **`reproduce-page-basic-tex`** (this skill) |
| To reproduce many pages with a shell loop, compiling them all later with your own command | **`reproduce-page-basic-tex`** (no per-page viewer pop-ups) |
| To study one page interactively | `reproduce-page-basic` |
Both skills share `NOTES.md`, `preamble_stable.tex`, `helpers/`, and `examples/`: in this skill, these files and directories are symlinks back to `reproduce-page-basic`. Any update to the judgment-call notes or example set therefore applies to both skills automatically.
## Arguments
- `<pdf_path>` — absolute path to the source PDF (must have a text layer)
- `<page_number>` — 1-indexed page number as opened in a PDF viewer
- `[<out_dir>]` — optional; defaults to the current working directory
## Pipeline (execute these steps in order, autonomously)
**Execute steps 1–3 without pausing to ask for permission between them, then stop as step 4 directs.** If `pdftotext` or the Write tool needs environmental approval, that's fine, but do not introduce confirmation prompts of your own (e.g., "should I proceed with step 2?"). The skill's contract is: PDF + page in, `.tex` file out, no interactive pauses.
### 1. Extract prose verbatim
```bash
python3 ~/.claude/skills/reproduce-page-basic-tex/helpers/extract_pdf_text.py \
<pdf_path> --page <N> > <out_dir>/<stem>_page<N>_raw.txt
```
Treat the resulting `_raw.txt` as the source of truth for all prose. Do NOT retype prose from memory or from the visual reading — only from this file. This is the key invariant that prevents hallucination.
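As a rough sketch of what this extraction step amounts to, assuming the helper is a thin wrapper around `pdftotext -layout -f N -l N` as the resource list in this skill describes. The function names below are hypothetical illustrations, not the helper's actual code:

```python
import subprocess


def build_pdftotext_cmd(pdf_path: str, page: int) -> list[str]:
    """Build a pdftotext command that extracts exactly one page to stdout."""
    if page < 1:
        raise ValueError("page numbers are 1-indexed")
    # -layout keeps the physical layout; -f/-l set the first and last page;
    # a trailing "-" sends the text to stdout instead of a file.
    return ["pdftotext", "-layout", "-f", str(page), "-l", str(page), pdf_path, "-"]


def extract_page_text(pdf_path: str, page: int) -> str:
    """Run pdftotext (assumed to be on PATH) and return the page's text layer."""
    result = subprocess.run(
        build_pdftotext_cmd(pdf_path, page),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Setting `-f` and `-l` to the same value is what restricts the output to a single page.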
### 2. Read the page visually (equations, figures, tables, layout)
Use the Read tool with `pages: "<N>"`:
```
Read(<pdf_path>, pages: "<N>")
```
From this visual reading, extract:
- **Equations** — hand-transcribe to LaTeX (pdftotext mangles math; ignore those line fragments)
- **Figures** — describe via the placeholder template (don't reproduce graphically)
- **Tables** — reproduce structure via `booktabs` + `multirow`
- **Section / subsection headers** — layout cues the text layer doesn't preserve reliably
- **Printed journal page number** — read the number actually displayed on the page; it is what goes into the `\cfoot` footer
### 3. Write the `.tex` file
Combine the cleaned prose (step 1) with the hand-transcribed equations and figure/table descriptions (step 2) into a standalone, compilable `.tex` file at `<out_dir>/<stem>_page<N>.tex`. The file must:
1. **Start with the stable preamble.** See `preamble_stable.tex` (symlinked in this skill) and the complete files in `examples/wei-explicit-inverse/`. Add `booktabs`, `multirow`, `array` to the preamble only if the page contains a table.
2. **Use the new footer convention**: `\cfoot{\small <PRINTED_PAGE> $\to$ \arabic{page}}` + `\setcounter{page}{1}`, where `<PRINTED_PAGE>` is the journal's printed page number.
3. **Set the equation counter** via `\setcounter{equation}{M}` where M is the last equation number from the previous page. Skip this line if the page has no equations.
4. **Apply prose cleanup rules** from `NOTES.md §6`:
   - Soft hyphens (U+00AD): join word halves
   - Line breaks inside paragraphs: collapse to spaces
   - Unicode en-dashes: convert to LaTeX `--`
   - Unicode right quotation marks: convert to ASCII `'`
   - Escape `%`, `&`, `$`, `_`, `#` in prose
5. **Preserve typesetting anomalies verbatim** — do NOT "fix" the paper. See `NOTES.md §5` for known anomaly types.
6. **Figures as placeholder boxes** following the template in `NOTES.md §7`.
7. **Tables** using the template in `NOTES.md §7a`.
8. **If the page ends mid-sentence, preserve the cut-off.** Do NOT complete the sentence from context.
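Items 2 to 4 of the list above are mechanical enough to sketch in code. The following is a minimal illustration, not the shipped helpers: `clean_prose` and `counter_lines` are hypothetical names, and the sketch assumes soft hyphens only appear at line ends in the raw text.

```python
import re
from typing import Optional

SOFT_HYPHEN = "\u00ad"
EN_DASH = "\u2013"
RIGHT_QUOTE = "\u2019"


def clean_prose(raw: str) -> str:
    """Apply the prose cleanup rules to raw text-layer output."""
    # Join word halves: drop each soft hyphen and any line break after it.
    text = re.sub(SOFT_HYPHEN + r"\s*", "", raw)
    paragraphs = []
    for block in re.split(r"\n\s*\n", text):
        # Collapse line breaks (and runs of spaces) inside a paragraph.
        line = " ".join(block.split())
        if not line:
            continue
        line = line.replace(EN_DASH, "--").replace(RIGHT_QUOTE, "'")
        # Escape the LaTeX specials named in the rules: % & $ _ #
        line = re.sub(r"([%&$_#])", r"\\\1", line)
        paragraphs.append(line)
    return "\n\n".join(paragraphs)


def counter_lines(printed_page: str, last_equation: Optional[int] = None) -> list[str]:
    """Footer and counter lines; the equation counter is skipped when not given."""
    lines = [
        r"\cfoot{\small " + printed_page + r" $\to$ \arabic{page}}",
        r"\setcounter{page}{1}",
    ]
    if last_equation is not None:
        lines.append(r"\setcounter{equation}{" + str(last_equation) + "}")
    return lines
```

Escaping runs after the dash and quote conversions, so the inserted `--` and `'` are never touched by the escape step. Equations are deliberately out of scope here: they come from the visual reading, not from this text.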
### 4. STOP
- Do NOT run `pdflatex`. The user will compile the `.tex` file themselves (or via a batch script) when they choose to.
- Do NOT run `xdg-open` or any viewer.
- Do NOT append notes to `NOTES.md` or to any other file. This skill is a pure function: PDF + page in, `.tex` file out, no other state touched.
- Do NOT ask "should I also compile it now?" or "want me to open the PDF?". The user invoked this skill specifically to avoid those follow-ups.
Report completion by listing the output path(s): the `.tex` file you wrote, and the `_raw.txt` scratch file from step 1 (which the user can delete freely — it's intermediate).
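The two output paths follow directly from the arguments. A small sketch, assuming `<stem>` means the PDF file name without its extension (the skill does not define it explicitly) and using a hypothetical function name:

```python
from pathlib import Path
from typing import Optional


def output_paths(pdf_path: str, page: int, out_dir: Optional[str] = None) -> tuple[Path, Path]:
    """Return (tex_path, raw_txt_path) for one reconstructed page."""
    # Assumption: <stem> is the PDF file name without its extension.
    stem = Path(pdf_path).stem
    # <out_dir> defaults to the current working directory.
    base = Path(out_dir) if out_dir else Path.cwd()
    return base / f"{stem}_page{page}.tex", base / f"{stem}_page{page}_raw.txt"
```

For example, `output_paths("/data/paper.pdf", 5, "out")` yields `out/paper_page5.tex` and `out/paper_page5_raw.txt`.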
## The invariant you must preserve
> **Prose comes from the text layer. Equations come from the visual. Neither tool crosses into the other's domain.**
`pdftotext` is character-exact on prose but mangles equations. The Read tool sees equations cleanly but will hallucinate prose. Using each tool only inside its strength zone is what gives the output its fidelity.
If you ever find yourself retyping a sentence "from memory" because the `_raw.txt` had some weird character, STOP. Either the `_raw.txt` is correct (and you need to escape the weird character in LaTeX), or the PDF has no text layer (in which case this skill cannot handle it — report that to the user and stop).
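One way to catch the no-text-layer case early is to check whether extraction produced anything at all. This is a heuristic sketch with hypothetical function names, not part of the skill's helpers; a scanned page typically extracts to nothing but whitespace and form feeds.

```python
def has_text_layer(raw_text: str) -> bool:
    """Heuristic: a text layer exists if extraction produced any non-whitespace."""
    # str.strip() also removes form feeds, which can appear as page separators.
    return bool(raw_text.strip())


def require_text_layer(raw_text: str, pdf_path: str, page: int) -> None:
    """Raise instead of letting the pipeline continue on an empty page."""
    if not has_text_layer(raw_text):
        raise RuntimeError(
            f"{pdf_path} page {page} has no extractable text; "
            "this skill cannot handle it"
        )
```

A page that is mostly a figure can still pass this check with very little text, so the result is a lower bound on usability rather than a guarantee.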
## Resources in this skill
All resources below are symlinks to the base `reproduce-page-basic` skill. Any update to the base skill's resources applies here automatically.
- **`NOTES.md`** (symlink) — judgment calls collected from 11 worked Wei page reconstructions. Math notation conventions, prose cleanup rules, figure placeholder template, table template, known typesetting anomalies. Consult as a decision reference.
- **`preamble_stable.tex`** (symlink) — the stable LaTeX preamble that has compiled cleanly across 11 pages.
- **`helpers/`** (symlinked directory) — four atomic CLI tools:
  - `extract_pdf_text.py` — wraps `pdftotext -layout -f N -l N`
  - `clean_soft_hyphens.py` — strips U+00AD, U+2019, converts dashes
  - `reflow_paragraphs.py` — collapses per-line output to paragraph-per-blank-line
  - `escape_latex.py` — escapes LaTeX special characters in prose
- **`examples/wei-explicit-inverse/`** (symlinked directory) — 11 worked reconstructions of Wei et al. (Appl. Math. Modelling 134, 2024). Each is a complete `.tex` → `.pdf` pair. See `examples/wei-explicit-inverse/README.md` for the per-page content map.
## When to consult which example
| If the page has… | Consult |
|---|---|
| Only prose (no equations, no figures) | `examples/wei-explicit-inverse/wei_page1.tex` |
| Equations + prose | `wei_page2.tex` or `wei_page4.tex` |
| A figure | `wei_page3.tex` or `wei_page5.tex` |
| A result figure comparison grid | `wei_page9.tex` or `wei_page10.tex` |
| A table | `wei_page11.tex` |
| A typesetting anomaly | `NOTES.md §5` |
| Uncertainty about a math symbol | `NOTES.md §4` |
## Batch composition (the point of the source-only variant)
Because this skill has no per-page side effects, it composes cleanly with shell loops:
```bash
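# Reproduce all pages of a paper as .tex files, no viewer pop-ups.
# The /reproduce-page-basic-tex line is a Claude Code skill invocation, not a shell binary;
# it stands for one skill run per page.
for p in $(seq 1 22); do
  /reproduce-page-basic-tex paper.pdf "$p"
done
# Compile them all in a separate pass
for f in *_page*.tex; do pdflatex -interaction=nonstopmode "$f"; done
# Or concatenate a subset (page numbers in file names are not zero-padded)
pdfunite paper_page{5,6,7,8}.pdf paper_methods.pdf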
```
This is the McIlroy-style composition the blueprint's architecture document describes: the atomic source-generator is the primitive, and compile+view workflows live at the user level where they belong.
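One practical wrinkle when merging more than nine pages: a shell glob sorts lexicographically, so `paper_page10.pdf` comes before `paper_page2.pdf`. A small sketch of ordering by page number instead, with hypothetical helper names:

```python
import re
from pathlib import Path

_PAGE_RE = re.compile(r"_page(\d+)$")


def page_number(path: str) -> int:
    """Extract N from a '<stem>_page<N>.<ext>' file name."""
    match = _PAGE_RE.search(Path(path).stem)
    if match is None:
        raise ValueError(f"no _page<N> suffix in {path!r}")
    return int(match.group(1))


def in_page_order(paths: list[str]) -> list[str]:
    """Sort page files numerically rather than lexicographically."""
    return sorted(paths, key=page_number)
```

The sorted list can then be passed to `pdfunite` as its input arguments. The `_raw.txt` scratch files do not match the pattern and raise a `ValueError`, so they cannot slip into a merge by accident.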
## Do-not-touch rules
- NEVER modify the input PDF. It is read-only.
- NEVER overwrite an existing `<stem>_page<N>.tex` without first renaming the existing file to `<stem>_page<N>.tex.bak`. Reconstructions are work product, not scratch.
- NEVER "fix" typesetting anomalies in the source paper. Reproduce verbatim.
- NEVER retype prose from memory. It must come from `pdftotext` output.
- NEVER compile the `.tex` file or open a PDF viewer. That is the base skill's job, not this one's.
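The backup rule in the list above can be sketched as follows. The function name is hypothetical, and the sketch assumes a single `.bak` generation is enough, which is all the rule asks for:

```python
import os
from pathlib import Path


def write_tex_with_backup(tex_path: Path, content: str) -> None:
    """Write content to tex_path, first moving any existing file to '<name>.bak'."""
    tex_path = Path(tex_path)
    if tex_path.exists():
        backup = tex_path.with_name(tex_path.name + ".bak")
        # os.replace overwrites an older .bak; keeping several generations
        # of backups would need a different naming scheme.
        os.replace(tex_path, backup)
    tex_path.write_text(content, encoding="utf-8")
```

Renaming before writing means a failed write leaves the previous reconstruction intact under the `.bak` name.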