---
name: find-evidence-in-paper
description: Find verbatim evidence in a source research-paper PDF that supports, contradicts, or is relevant to a user's claim or hypothesis. Appends an indexed entry to a per-paper journal file with the verbatim excerpt, its page number and structural context, and AI commentary on the evidential relationship (supports, contradicts, partially supports, or silent). Use when the user invokes `/find-evidence-in-paper` with a claim they want checked against the paper's actual text.
---
# find-evidence-in-paper
**Category 1 skill: find verbatim evidence in a source PDF for or against a user's claim.**
Third and final sibling of `/content-to-highlight-in-paper` and `/ask-question-about-paper`. All three produce structurally identical journal entries in the same YAML format. They differ only in how the AI interprets the user's input:
| Skill | User provides | AI finds |
|---|---|---|
| `/content-to-highlight-in-paper` | Context cue about content to highlight | The matching verbatim excerpt |
| `/ask-question-about-paper` | A question | The verbatim excerpt that answers it |
| **`/find-evidence-in-paper`** (this skill) | **A claim or hypothesis** | **The verbatim excerpt that supports or contradicts it** |
## The architectural commitment
**The AI is a spotlight, not an oracle.**
When the user says "Find evidence that the explicit method is always better than the implicit method," the AI does NOT say "yes, the explicit method is better." The AI finds the **verbatim passage** that bears on the claim — which might *support* it, *contradict* it, *partially support* it, or reveal that the paper is *silent* on the topic. The `comments` field transparently states the evidential relationship.
This skill is especially important for the "scientific temperament" commitment: the evidence might not say what the user hopes it says. The AI's job is to locate what the paper *actually* says, not to confirm the user's prior. A passage that *contradicts* the user's claim is as valuable — arguably more valuable — than one that confirms it.
## Arguments
- `<paper_context>` — stem, PDF filename, or informal name. Inferred from session context if unambiguous.
- `<claim>` — the user's claim, hypothesis, or statement they want evidence for. Can range from precise ("the explicit method uses exactly 56 optimization variables") to broad ("the method works well under noise").
If zero arguments, ask interactively.
## Pipeline
### Step 1 — Gather context
Same as the other Cat-1 skills. Identify source PDF path, stem, reconstruction directory, total page count, and the user's claim.
### Step 2 — Ensure raw text layer exists
Same as the other Cat-1 skills. Check for `<stem>_pageN_raw.txt` files; generate missing ones via `extract_pdf_text.py`.
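Assuming the layout described here (one `<stem>_pageN_raw.txt` file per PDF page in the reconstruction directory), the existence check can be sketched as follows. `missing_raw_pages` is a hypothetical helper name, not part of the skill's shipped helpers:

```python
from pathlib import Path

def missing_raw_pages(recon_dir: str, stem: str, total_pages: int) -> list[int]:
    """Return the 1-based PDF page numbers whose raw text file is absent."""
    recon = Path(recon_dir)
    return [n for n in range(1, total_pages + 1)
            if not (recon / f"{stem}_page{n}_raw.txt").exists()]
```

Any page numbers returned would then be fed to `extract_pdf_text.py`; that helper's exact CLI is not specified here, so this sketch covers only the existence check.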
### Step 3 — Find the evidential excerpt
This step is the most judgment-intensive of the three Cat-1 skills because the AI must:
1. **Parse the claim's testable content**: what specific assertion is being made? What would a supporting passage look like? What would a contradicting passage look like?
2. **Search across all `_raw.txt` files** for passages bearing on the claim:
| Claim type | Search strategy |
|---|---|
| Specific factual ("uses 56 variables") | Search for the number + surrounding technical terms |
| Comparative ("method A is better than B") | Search for passages comparing the two methods, especially tables with quantitative results |
| Existence ("the paper addresses X") | Search for X and related terms; absence across all `_raw.txt` files is itself a finding (handle per Step 5) |
| Causal ("X happens because of Y") | Search for Y near causal language ("because", "due to", "caused by", "leads to") |
| Broad ("the method is robust") | Search for "robust" and related terms; also look for caveats, failure cases, limitations |
3. **Classify the evidential relationship** between the excerpt and the claim:
| Relationship | When to use | Example |
|---|---|---|
| `supports` | The excerpt directly confirms the claim | Claim: "explicit is faster." Excerpt: "328 vs 4354 iterations." |
| `contradicts` | The excerpt directly refutes the claim | Claim: "explicit always wins." Excerpt: "the explicit method may perform less effectively when no prior knowledge is available." |
| `partially_supports` | The excerpt confirms part of the claim but not all, or confirms it with caveats | Claim: "error is under 5%." Excerpt: "average relative error of only 5%" (boundary case) |
| `silent` | The paper does not address the claim at all | No relevant passage found in any `_raw.txt` file |
Record the relationship in the `comments` field, not in a separate field — the YAML schema stays identical to the other Cat-1 skills.
4. **If multiple passages bear on the claim, pick the STRONGEST one** — the passage that most directly addresses the claim's core assertion. Note alternatives in `comments`. If the evidence is split (some passages support, others contradict), pick the one that the user would most want to see first and note the counter-evidence with page references.
5. **The excerpt MUST come from the `_raw.txt` file** — same invariant as always.
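The scan in item 2 can be sketched as below — a minimal illustration under stated assumptions, not the mandated implementation. `search_raw_pages` and its signature are hypothetical; the term lists per claim type come from the table above, and the AI still judges which hit bears most directly on the claim:

```python
import re
from pathlib import Path

def search_raw_pages(recon_dir: str, stem: str, terms: list[str],
                     window: int = 120) -> list[tuple[str, str, str]]:
    """Case-insensitive scan of every <stem>_pageN_raw.txt for any term.

    Returns (filename, matched term, surrounding snippet) triples so the
    caller can judge which passage bears most directly on the claim.
    """
    hits = []
    for path in sorted(Path(recon_dir).glob(f"{stem}_page*_raw.txt")):
        text = path.read_text(errors="replace")
        for term in terms:
            for m in re.finditer(re.escape(term), text, re.IGNORECASE):
                start = max(0, m.start() - window)
                hits.append((path.name, term, text[start:m.end() + window]))
    return hits
```

Because each hit keeps its source filename, the verbatim-excerpt invariant is preserved: whatever snippet is chosen can be traced back to its `_raw.txt` file.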
### Step 4 — Enrich with spatial context
Same as the other Cat-1 skills. Visual Read of the identified page for section heading, nearby landmarks, paragraph position, printed page number.
### Step 5 — Handle "paper is silent on this claim"
If the paper genuinely doesn't address the claim:
- `status: null`
- `excerpt`: omitted
- `location`: null
- `comments`: explain what the paper IS silent about, and what the closest related discussion is (with page references). Make it clear this is "no evidence found" not "evidence of absence" — unless the paper explicitly states that something was NOT studied, in which case that statement IS evidence and should be recorded as the excerpt with relationship `contradicts` or `partially_supports`.
**Important nuance**: "the paper doesn't mention X" and "the paper says X was not studied" are different findings. The first is `status: null` (silence). The second is `status: matched` with the excerpt being the passage where the authors say X was not studied.
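A null entry can be assembled from the schema fields used by this skill. This sketch returns a plain dict (field names from the entry template; serialization to YAML is a separate step, and `null_entry` is a hypothetical helper):

```python
def null_entry(claim: str, closest_discussion: str) -> dict:
    """Build a journal entry for a claim the paper is silent on.

    No `excerpt` key is present: nothing verbatim may be fabricated
    when the paper says nothing about the claim.
    """
    return {
        "request": claim,
        "status": None,   # silence, not evidence of absence
        "location": None,
        "comments": ("No passage in any _raw.txt file bears on this claim. "
                     f"Closest related discussion: {closest_discussion}"),
    }
```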
### Step 6 — Construct entry and append
Same mechanism as the other Cat-1 skills:
```bash
cat > /tmp/entry.yaml <<'EOF'
request: |
  <the user's claim, verbatim>
status: matched   # or null if the paper is silent (Step 5)
excerpt: |
  <the verbatim passage from _raw.txt bearing on the claim>
location:
  page_pdf: <int>
  page_printed: <int or null>
  section: <string or null>
  surrounding_context: |
    <navigation description>
  before: |
    <preceding sentence(s)>
  after: |
    <following sentence(s)>
  bbox: null
comments: |
  <EVIDENTIAL RELATIONSHIP: supports / contradicts / partially_supports.
  WHY this excerpt bears on the claim. Any caveats, alternative passages
  considered, or counter-evidence found elsewhere in the paper (with page
  references). If the claim is broad, note what aspect of the claim this
  excerpt addresses and what aspects remain unaddressed.>
EOF
python3 ~/.claude/skills/find-evidence-in-paper/helpers/append_journal_entry.py \
  "<reconstruction_dir>/find-evidence-in-paper_<stem>.md" \
  /tmp/entry.yaml \
  --paper <stem> \
  --skill find-evidence-in-paper
```
### Step 7 — Report to the user
**For a matched entry:**
```
Added entry [N] to find-evidence-in-paper_<stem>.md:
Claim: "<user's claim>"
Status: matched
Evidence: [supports | contradicts | partially_supports]
Page: PDF <N> (printed <M>)
Excerpt: "<verbatim excerpt, first ~120 chars>..."
Context: <surrounding context>
Comments: <evidential reasoning>
```
**For a null entry (paper is silent):**
```
Added entry [N] to find-evidence-in-paper_<stem>.md:
Claim: "<user's claim>"
Status: null (paper is silent on this claim)
Comments: <what the paper does discuss that's closest>
```
Then **STOP**. Do NOT:
- State your own opinion on whether the claim is true
- Summarize the paper's conclusions
- Suggest alternative claims to check
- Invoke any Cat-2 highlighting skill
## Why this skill matters more than the other two
`/content-to-highlight-in-paper` is navigation — "find this text." `/ask-question-about-paper` is retrieval — "find the answer." This skill is **critical reading** — "does the paper actually support what I think it supports?" That's the highest-value operation in the "AI as spotlight" paradigm because it's the one most prone to confirmation bias when done manually. A human skimming a paper can easily read what they want to read. An AI that honestly classifies excerpts as `supports`, `contradicts`, or `partially_supports` is a bias-correction mechanism, not just a search engine.
The `contradicts` case is especially load-bearing. If the user's claim is "the explicit method always outperforms implicit," and the AI finds the passage on page 19 where the authors explicitly say "the explicit inverse method may indeed perform less effectively compared to the implicit inverse method" — that's the passage the user most needs to see and would be most likely to skip in a manual skim.
## The invariant
> **The excerpt comes from `_raw.txt`. The evidential classification goes in `comments`. Never conflate the paper's words with the AI's interpretation.**
## Journal file specification
Same as `/content-to-highlight-in-paper`. YAML documents, `---` separated, metadata first, indexed entries after.
**Filename**: `find-evidence-in-paper_<stem>.md`
**Location**: `<reconstruction_dir>/`
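Given that layout (`---`-separated YAML documents, metadata first, indexed entries after), a reader-side sketch — assuming each separator sits alone on its own line, which the shared append helper is expected to maintain:

```python
def split_journal(text: str) -> tuple[str, list[str]]:
    """Split journal text into (metadata document, list of entry documents)."""
    docs = [d.strip() for d in text.split("\n---\n") if d.strip()]
    if not docs:
        return "", []
    return docs[0], docs[1:]
```

Each returned string is itself a YAML document and can be parsed individually; counting `docs[1:]` gives the next entry index for an append.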
## Resources
- **`helpers/append_journal_entry.py`** — symlink → `content-to-highlight-in-paper/helpers/append_journal_entry.py`. Shared `CleanDumper` helper.
- Transitively uses `extract_pdf_text.py` and `get_total_pages.py` from the reproduce-page-basic family.
## Do-not-touch rules
- NEVER modify the source PDF.
- NEVER modify existing `_raw.txt` or `.tex` files.
- NEVER delete or edit past journal entries. Append-only.
- NEVER fabricate an excerpt. `status: null` for silence.
- NEVER put the evidential classification in the `excerpt` field. That goes in `comments`.
- NEVER suppress a `contradicts` finding because it's inconvenient. The user's independence from confirmation bias is the whole point.
- NEVER compile any `.tex` file.
- NEVER invoke a Cat-2 highlighting skill.