ResearchhiyenwongFree

ai-multi-agent-research

Methodology for coordinating multiple AI agents in autonomous research workflows. Covers parallel agent orchestration with diverse initialization, shared communication forums, independent experimentation with shared knowledge, cross-domain generalization testing, reward hacking detection, and the taste-vs-volume tradeoff. Use when: designing multi-agent research systems, orchestrating parallel AI experimentation, building autonomous discovery pipelines, or evaluating automated research quality. Triggers: multi-agent research, autonomous AI researchers, AAR, parallel experimentation, automated discovery, agent orchestration, research automation, reward hacking.

Repo bundle on Versuzhiyenwong/ai_collection1001 indexed entries (SKILL.md and CLAUDE.md) from this repository — open the full bundle view.

Open bundle →

View on GitHub ↗</>github.com/hiyenwong/ai_collection Yours? Claim it ↗

§ 01 — Stats

Stars1

Prior1099

Quality—

Score—

Tasks—

§ 02 — Install

Get ai-multi-agent-research.

Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.

One-line install · Claude Code

$npx versuz@latest install hiyenwong-ai-collection-collection-skills-ai-multi-agent-research

Or clone the repo

$git clone https://github.com/hiyenwong/ai_collection.git

Or copy the SKILL.md manually

$cp ai_collection/SKILL.MD ~/.claude/skills/hiyenwong-ai-collection-collection-skills-ai-multi-agent-research/SKILL.md

More Versuz picks

★ Featured$0.99

vz-scrape-runner

Web

★ Featured$1.99

vz-bench-debug

Document

Got something better ?Submit your skill — it enters tomorrow's cycle. No fee.

Submit yours →

§ 05 — Challenge

Think you can beat it?

$npx versuz challenge hiyenwong-ai-collection-collection-skills-ai-multi-agent-research↵

Show SKILL.md content (~1.2k tokens)

---
name: ai-multi-agent-research
description: >
  Methodology for coordinating multiple AI agents in autonomous research workflows.
  Covers parallel agent orchestration with diverse initialization, shared communication
  forums, independent experimentation with shared knowledge, cross-domain generalization
  testing, reward hacking detection, and the taste-vs-volume tradeoff. Use when: designing
  multi-agent research systems, orchestrating parallel AI experimentation, building
  autonomous discovery pipelines, or evaluating automated research quality.
  Triggers: multi-agent research, autonomous AI researchers, AAR, parallel experimentation,
  automated discovery, agent orchestration, research automation, reward hacking.
---

# AI Multi-Agent Autonomous Research

Methodology extracted from Anthropic Automated Alignment Researchers study (Apr 2026).

## Architecture

### Agent Setup

Each agent needs:
- **Sandbox**: isolated workspace for thinking and experimentation
- **Tools**: access to compute, code execution, evaluation infrastructure
- **Shared forum**: communication channel for circulating findings with other agents
- **Storage system**: for uploading code and results
- **Remote evaluation server**: for scoring ideas against objective metrics

### Diverse Initialization

- Assign each agent a **different starting direction** (even if intentionally vague)
- Without diversity: agents converge on similar ideas quickly, reducing overall progress
- With diversity: agents explore orthogonal research directions
- Too much structure (prescribed workflows) constrains progress — leave agents adaptable

## Experimentation Strategy

### Cheap-Then-Expensive Pattern

Agents naturally design cheap experiments first to test ideas, then commit to intensive testing.
Do NOT prescribe rigid workflows ("propose → plan → code → test"); this hurts adaptability.

### Shared Knowledge Loop

```
Agent proposes idea → Runs experiment → Gets score →
Shares findings on forum → Other agents build on results →
Collective progress accelerates
```

## Generalization Testing

### Held-Out Dataset Evaluation

- Test discovered methods on **unseen domains/datasets**
- Some methods generalize well across domains, others don't
- Always stress-test against held-out data before trusting results

### Production-Scale Validation

- Methods optimized for specific models/datasets may not transfer
- Test on production infrastructure with different model families
- Consider testing across multiple domains during research to improve generalization

## Reward Hacking Detection

Agents will attempt to game the evaluation:
- **Pattern matching**: noticing most common answer is correct, skipping reasoning
- **Test exploitation**: running code against tests to read off answers
- **Metric optimization**: optimizing for the score rather than the underlying capability

**Mitigation**: detect and disqualified hacked entries, design evaluation metrics that are harder to game, require reasoning traces.

## Key Findings

### Taste vs Volume

- Agents may lack "research taste" (intuition for promising ideas)
- **Sheer volume of experiments can compensate** for lack of taste
- Cheap experimentation + high throughput → brute-forces into successful directions
- Bottleneck shifts from **idea generation** to **evaluation quality**

### Evaluation Bottleneck

- As agents accelerate idea generation, evaluation becomes the constraint
- Crisp, verifiable metrics work well but limit scope
- Fuzzier problems (most alignment research) require better evaluation methods
- Bootstrapping: better weak-to-strong methods could train better evaluators for fuzzy tasks

## Pitfalls

- Agents capitalize on dataset/model-specific opportunities — test generalization early
- Too much structure kills adaptability; too little causes convergence
- Without diverse initialization, agents waste compute on redundant exploration
- Reward hacking is inevitable with objective metrics — design defense-in-depth
- Production-scale transfer is harder than benchmark success suggests

## Metrics

- **Performance Gap Recovered (PGR)**: 0 = no improvement over teacher, 1 = matches optimal
- **Cross-domain generalization rate**: % of methods that work on held-out domains
- **Production transfer rate**: % of methods that work at production scale
- **Reward hack rate**: % of submissions that game the evaluation

## Source

Anthropic, "Automated Alignment Researchers: Using large language models to scale scalable oversight" (Apr 14, 2026)