openclaw-release-maintainer openclaw-secret-scanning-maintainer openclaw-small-bugfix-sweep openclaw-test-performance openclaw-testing openhue

WebopenclawFree

openclaw-test-heap-leaks

Investigate OpenClaw pnpm test memory growth, Vitest OOMs, RSS spikes, and heap snapshot deltas.

Repo bundle on Versuzopenclaw/openclaw113 indexed entries (SKILL.md and CLAUDE.md) from this repository — open the full bundle view.

Open bundle →

View on GitHub ↗</>github.com/openclaw/openclaw Yours? Claim it ↗

§ 01 — Stats

Stars371.3k

Prior1247

Quality—

Score—

Tasks—

§ 02 — Install

Get openclaw-test-heap-leaks.

Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.

One-line install · Claude Code

$npx versuz@latest install openclaw-openclaw-agents-skills-openclaw-test-heap-leaks

Or clone the repo

$git clone https://github.com/openclaw/openclaw.git

Or copy the SKILL.md manually

$cp openclaw/SKILL.MD ~/.claude/skills/openclaw-openclaw-agents-skills-openclaw-test-heap-leaks/SKILL.md

More Versuz picks

★ Featured$1.99

vz-scrape-runner

Web

Got something better ?Submit your skill — it enters tomorrow's cycle. No fee.

Submit yours →

§ 05 — Challenge

Think you can beat it?

$npx versuz challenge openclaw-openclaw-agents-skills-openclaw-test-heap-leaks↵

Show SKILL.md content (~2.0k tokens)

---
name: openclaw-test-heap-leaks
description: Investigate OpenClaw pnpm test memory growth, Vitest OOMs, RSS spikes, and heap snapshot deltas.
---

# OpenClaw Test Heap Leaks

Use this skill for test-memory investigations. Do not guess from RSS alone when heap snapshots are available. Treat snapshot-name deltas as triage evidence, not proof, until retainers or dominators support the call.

For **runtime fixes** (e.g., closure leaks in long-running services like the gateway), see [Validating runtime fixes](#validating-runtime-fixes-not-test-memory) below — that uses a dedicated harness, not the test-parallel snapshot machinery.

## Workflow

1. Reproduce the failing shape first.
- Match the real entrypoint if possible. For Linux CI-style unit failures, start with:
- `pnpm canvas:a2ui:bundle && OPENCLAW_TEST_MEMORY_TRACE=1 OPENCLAW_TEST_HEAPSNAPSHOT_INTERVAL_MS=60000 OPENCLAW_TEST_HEAPSNAPSHOT_DIR=.tmp/heapsnap OPENCLAW_TEST_WORKERS=2 OPENCLAW_TEST_MAX_OLD_SPACE_SIZE_MB=6144 pnpm test`
- Keep `OPENCLAW_TEST_MEMORY_TRACE=1` enabled so the wrapper prints per-file RSS summaries alongside the snapshots.
- If the report is about a specific shard or worker budget, preserve that shape.
- Before you analyze snapshots, identify the real lane names from `[test-parallel] start ...` lines or `pnpm test --plan`. Do not assume a single `unit-fast` lane; local plans often split into `unit-fast-batch-*`.

2. Wait for repeated snapshots before concluding anything.
- Take at least two intervals from the same lane.
- Compare snapshots from the same PID inside the real lane directory such as `.tmp/heapsnap/unit-fast-batch-2/`.
- Use `.agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs` to compare either two files directly or the earliest/latest pair per PID in one lane directory.
- If the helper suggests transformed-module retention, confirm the top entries in DevTools retainers/dominators before calling it solved.

3. Classify the growth before choosing a fix.
- If growth is dominated by Vite/Vitest transformed source strings, `Module`, `system / Context`, bytecode, descriptor arrays, or property maps, treat it as likely retained module graph growth in long-lived workers.
- If growth is dominated by app objects, caches, buffers, server handles, timers, mock state, sqlite state, or similar runtime objects, treat it as a likely cleanup or lifecycle leak.
- If the names are ambiguous, stop short of a confident label and inspect retainers/dominators in DevTools for the top deltas.

4. Fix the right layer.
- For likely retained transformed-module growth in shared workers:
- Prefer timing and hotspot-driven scheduling fixes first. Check whether the file is already represented in `test/fixtures/test-timings.unit.json` and whether `scripts/test-update-memory-hotspots.mjs` should refresh the measured hotspot manifest before hand-editing behavior overrides.
- Move hotspot files out of the real shared lane by updating `test/fixtures/test-parallel.behavior.json` only when timing-driven peeling is insufficient.
- Prefer `singletonIsolated` for files that are safe alone but inflate shared worker heaps.
- If the file should already have been peeled out by timings but is absent from `test/fixtures/test-timings.unit.json`, call that out explicitly. Missing timings are a scheduling blind spot.
- For real leaks:
- Patch the implicated test or runtime cleanup path.
- Look for missing `afterEach`/`afterAll`, module-reset gaps, retained global state, unreleased DB handles, or listeners/timers that survive the file.

5. Verify with the most direct proof.
- Re-run the targeted lane or file with heap snapshots enabled if the suite still finishes in reasonable time.
- If snapshot overhead pushes tests over Vitest timeouts, fall back to the same lane without snapshots and confirm the RSS trend or OOM is reduced.
- For wrapper-only changes, at minimum verify the expected lanes start and the snapshot files are written.

## Heuristics

- Do not call everything a leak. In this repo, large `unit-fast` or `unit-fast-batch-*` growth can be a worker-lifetime problem rather than an application object leak.
- `scripts/test-parallel.mjs` and `scripts/test-parallel-memory.mjs` are the primary control points for wrapper diagnostics.
- The lane names printed by `[test-parallel] start ...` and `[test-parallel][mem] summary ...` tell you where to focus.
- When one or two files account for most of the delta and they are missing from timings, reducing impact by isolating them is usually the first pragmatic fix.
- When the same retained object families grow across multiple intervals in the same worker PID, trust the snapshots over intuition, then confirm ambiguous calls with retainer evidence.

## Snapshot Comparison

- Direct comparison:
- `node .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs before.heapsnapshot after.heapsnapshot`
- Auto-select earliest/latest snapshots per PID within one lane:
- `node .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs --lane-dir .tmp/heapsnap/unit-fast-batch-2`
- Useful flags:
- `--top 40`
- `--min-kb 32`
- `--pid 16133`

Read the top positive deltas first. Large positive growth in module-transform artifacts suggests lane isolation; large positive growth in runtime objects suggests a real leak. If the names alone do not settle it, open the same snapshot pair in DevTools and inspect retainers/dominators for the top rows before declaring root cause.

## Validating runtime fixes (not test-memory)

The workflow above is for diagnosing Vitest worker memory growth. For
validating that a runtime/closure fix actually releases captured state, use the
dedicated harness:

- `pnpm leak:embedded-run` — runs `scripts/embedded-run-abort-leak.ts`. Loops N
aborted runs in a function-shaped scope mimicking `runEmbeddedAttempt`,
writes heap snapshots, and reports a PASS/FAIL verdict on retention growth
using `FinalizationRegistry` for tracked-instance counting plus RSS delta.

Modes:

- `closure-extracted` (default) — production fix shape (helper at module scope).
- `closure-inline` — pre-fix shape (closure inside the runner scope). Use as a
sensitivity check: if it passes you've broken the harness, not fixed a bug.
- `synthetic-leak` — deliberately retains via a module-level bucket. Use to
confirm the harness can detect leaks before trusting a PASS on a real fix.

Snapshots land in `.tmp/embedded-run-abort-leak/`. Diff with the same script
as above:

```
node .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs \
.tmp/embedded-run-abort-leak/baseline-*.heapsnapshot \
.tmp/embedded-run-abort-leak/batch-N-*.heapsnapshot --top 30
```

When fixing a different runtime leak, add a new harness alongside this one
rather than retrofitting it. The fixture function should mimic the lexical
scope of the function where the leak lives, not be a generic abort-loop.

## Output Expectations

When using this skill, report:

- The exact reproduce command.
- Which lane and PID were compared.
- The dominant retained object families from the snapshot delta.
- Whether the issue is a likely real leak or likely shared-worker retained module growth, plus whether retainers/dominators confirmed it.
- The concrete fix or impact-reduction patch.
- What you verified, and what snapshot overhead prevented you from verifying.

openclaw-test-heap-leaks

Get openclaw-test-heap-leaks.

vz-bench-debug

vz-scrape-runner

Think you can beat it?