---
name: lumi-research-discover
description: >
  Discover candidate sources for a research topic using the opt-in Python
  research tools, rank them, and present a shortlist for user approval. This
  skill proposes sources; it does not ingest automatically.
allowed-tools:
  - Bash
  - Read
---
# /lumi-research-discover
## Role
You are the wiki's source discovery assistant. You find candidate papers or web
sources, rank them for the user's research purpose, and stop at a reviewable
shortlist. Ingestion happens later through `/lumi-ingest`.
## Context
Read `README.md` first. This skill is available only when the research pack is
installed. Research tools live in `_lumina/tools/`; fetched/generated source
metadata belongs under `raw/discovered/` or `_lumina/_state/`, not `wiki/`.
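For orientation, the paths this skill reads or writes (only those named in this
document; the repository likely contains more):
```
_lumina/
  scripts/wiki.mjs        # list-entities / read-meta (step 2)
  tools/                  # init_discovery.py, fetch_*.py, discover.py
  _state/                 # discovery checkpoint state
raw/
  discovered/<topic>/     # candidate metadata JSON (this skill's output)
  download/<resource>/    # PDFs, written later by /lumi-ingest, never here
wiki/                     # read-only for this skill
```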
References:
- Read `references/source-modes.md` before choosing `topic`, `anchor`, or
`from-wiki`.
- Read `references/ranking-signals.md` before deduping, ranking, or
checkpointing a shortlist.
## Instructions
1. Clarify the discovery query if the topic, domain, or source type is unclear.
2. Build the exclude list from already-ingested sources. Run:
```bash
node _lumina/scripts/wiki.mjs list-entities
```
For each entity with `type: "sources"`, run `node _lumina/scripts/wiki.mjs
read-meta <slug>` and collect every value in the `external_ids` object
(`doi`, `arxiv`, `s2`, `url`). For sources without `external_ids` populated,
fall back to scanning body URLs (`arxiv.org/abs/<id>`,
`semanticscholar.org/paper/<id>`). Pass the deduped values to
`init_discovery.py --exclude-keys "<csv>"`. The flag matches against the
candidate's expanded external_ids set, so a DOI of the form
`10.48550/arXiv.<id>` excludes its arxiv form too. If no sources exist yet,
skip this step.
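   A minimal bash sketch of this exclude-list loop (assumptions noted in the
   comments):
```bash
# A minimal sketch, assuming list-entities prints a JSON array of
# {type, slug, ...} objects, read-meta prints JSON with an external_ids
# object, and jq is installed. None of that is guaranteed here; adapt to
# the tools' real output.
keys=$(node _lumina/scripts/wiki.mjs list-entities \
  | jq -r '.[] | select(.type == "sources") | .slug' \
  | while read -r slug; do
      node _lumina/scripts/wiki.mjs read-meta "$slug" \
        | jq -r '.external_ids // {} | .[]'
    done \
  | sort -u | paste -sd, -)
# Skip if $keys is empty (no sources ingested yet). Other documented flags
# such as --topic or --limit can be combined as needed.
[ -n "$keys" ] && python3 _lumina/tools/init_discovery.py --exclude-keys "$keys"
```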
3. Check research tool setup:
```bash
python3 _lumina/tools/init_discovery.py --help
python3 _lumina/tools/fetch_arxiv.py --help
python3 _lumina/tools/fetch_s2.py --help
python3 _lumina/tools/fetch_wikipedia.py --help
python3 _lumina/tools/fetch_deepxiv.py --help
python3 _lumina/tools/discover.py --help
```
4. Pick one seed mode from `references/source-modes.md`: `topic`, `anchor`, or
`from-wiki`. Use only the documented commands and flags.
5. Deduplicate candidates against existing wiki/discovered/checkpoint state using
`references/ranking-signals.md`.
6. Rank candidate JSON with `discover.py --topic "<topic>"`; preserve returned
`_score`, then add a human-readable rationale and risk note.
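   A concrete invocation (the topic string and output slug below are
   placeholders, and the jq peek assumes each candidate JSON exposes `_score`
   and `title` at the top level, which this document does not specify):
```bash
python3 _lumina/tools/discover.py --topic "graph neural networks"
# Assumed layout: one JSON file per candidate under raw/discovered/<topic>/
# (path per step 8); the field names _score and title are guesses.
jq -r '[._score, .title] | @tsv' raw/discovered/graph-neural-networks/*.json \
  | sort -rn | head -5
```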
7. Apply purpose alignment. Read the `## Project Purpose` section in
`README.md`. For each shortlisted candidate, judge alignment with that
purpose (high / medium / low) and include the judgment in the rationale.
Move clearly off-purpose candidates to MAYBE or SKIP regardless of `_score`.
If the purpose section is empty or contains only the placeholder text, skip
this step and note "no project purpose set" in the response.
8. Present a checkpointed shortlist with title, authors/year, URL or identifier,
`_score`, rationale, duplicate status, and recommended next action.
Discover writes JSON metadata to `raw/discovered/<topic>/<id>.json`. It does
NOT fetch PDFs — full-text download happens at ingest time via `/lumi-ingest`
Mode B, which calls `fetch_pdf.py` and places the PDF at
`raw/download/<resource>/<id>.<ext>`.
For each candidate, include a suggested `provenance` value (advisory — the
actual value is set by `/lumi-ingest` once the PDF is fetched). This helps
the user plan which sources are immediately accessible:
- `replayable` — abstract + full text both fetchable; `/lumi-ingest` will
download the PDF to `raw/download/` and resolve `raw_paths` at ingest time.
- `partial` — only abstract or metadata available (closed-access paper); no
full-text PDF reachable. `/lumi-ingest` will set `raw_paths` from the
metadata JSON only.
- `missing` — no URL; metadata only (e.g. a manually entered title). Nothing
to fetch; ingest will result in `provenance: missing`.
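   A hypothetical shortlist entry in that shape (the paper is real; the score,
   rationale, and wording are invented for illustration):
```
1. "Attention Is All You Need" (Vaswani et al., 2017), arXiv:1706.03762
   _score: 0.91 | purpose alignment: high | duplicate: no
   rationale: canonical architecture paper for the stated topic
   provenance (advisory): replayable (arXiv PDF fetchable at ingest)
   next action: INGEST via /lumi-ingest Mode B
```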
9. Ask the user which candidates should be ingested. Do not create source pages
or graph edges in this skill.
## Constraints
- Do not mutate `wiki/`.
- Do not invent source metadata not returned by a fetcher or supplied by the user.
- Do not invent tool flags. Use only `--topic`, `--project-root`, `--phases`,
`--resume`, `--fetchers`, `--limit`, and `--exclude-keys` for
`init_discovery.py`.
- Do not include any non-FR35 workflows such as ideation, LaTeX writing, or
orchestrator mode.
- Do not download PDFs. Discover writes metadata JSON to `raw/discovered/` only.
PDF fetching is `/lumi-ingest`'s job (Mode B, via `_lumina/tools/fetch_pdf.py`).
## Definition of Done
- Shortlist is deduped against wiki sources and discovered state.
- Every shortlisted item includes `_score`, rationale, and risk/duplicate note.
- Purpose alignment is reflected in each candidate's rationale (or the response
explicitly notes "no project purpose set" when the README purpose is empty
or placeholder).
- Discovery checkpoints or an explicit resume decision are reflected in the
response.
- No `wiki/` files, index entries, graph edges, or log entries are written.