---
name: autoresearch-pipeline-for-ai-safety-research
description: Autoresearch pipeline in which an LLM agent iteratively modifies baseline algorithms against dense quantitative feedback to discover new state-of-the-art methods for AI safety research
---
# Autoresearch Pipeline for AI Safety Research
## Overview
**Source:** arXiv:2603.24511v1 (Claudini)
**Utility:** 0.95
**Topic:** LLM agent autonomous research for discovering new algorithms
**Key Contribution:** Autoresearch pipeline achieves SOTA results in adversarial attack discovery
## Activation Keywords
- autoresearch pipeline
- LLM agent autonomous research
- automated AI safety research
- iterative algorithm discovery
- Claude Code research automation
## Core Innovation
### Problem
- AI safety research often manual and slow
- Existing methods provide good starting points but optimization needed
- Dense quantitative feedback available but not leveraged
### Solution
**Autoresearch Pipeline:**
1. **Start from existing implementations** - Strong baseline (e.g., GCG)
2. **LLM agent iteration** - Claude Code explores modifications
3. **Quantitative evaluation** - Attack success rate (ASR) feedback
4. **Discover new algorithms** - SOTA results achieved
### Key Results
| Target Model | New Algorithm | Best Baseline | Improvement |
|--------------|---------------|---------------|-------------|
| GPT-OSS-Safeguard-20B | 40% ASR | ≤10% ASR | +30 pp |
| Meta-SecAlign-70B | 100% ASR | 56% ASR | +44 pp |
## Pipeline Architecture
```
Existing Methods → LLM Agent Exploration → Iterative Refinement → Evaluation → New Discovery
       ↓                    ↓                       ↓                  ↓             ↓
 Baseline Code         Modification         Algorithm Changes      ASR Test    SOTA Results
```
### Implementation Framework
```python
class AutoresearchPipeline:
    def __init__(self, baseline_method, evaluation_fn, agent):
        self.baseline = baseline_method
        self.evaluate = evaluation_fn
        self.agent = agent  # Claude Code-like agent

    def run(self, n_iterations=100):
        current_algorithm = self.baseline
        best_score = self.evaluate(current_algorithm)  # score the baseline first
        for i in range(n_iterations):
            # Agent explores modifications to the current best algorithm
            modifications = self.agent.suggest_modifications(current_algorithm)
            # Try each modification; keep only strict improvements
            for mod in modifications:
                # apply_modification and log_discovery are assumed helpers
                new_algorithm = apply_modification(current_algorithm, mod)
                score = self.evaluate(new_algorithm)
                if score > best_score:
                    current_algorithm = new_algorithm
                    best_score = score
                    log_discovery(mod, score)
        return current_algorithm, best_score
```
## Key Principles
### 1. Strong Starting Points
- Existing methods provide foundation
- Don't start from scratch
- Leverage prior research
### 2. Dense Quantitative Feedback
- Clear optimization objective
- Measurable outcomes (ASR, accuracy, etc.)
- Direct feedback drives improvement
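Dense feedback can be as simple as an attack-success-rate function over a fixed probe set. A minimal sketch of such an evaluator; the `is_refusal` heuristic and its marker list are illustrative assumptions, not taken from the paper:

```python
def is_refusal(response: str) -> bool:
    # Illustrative heuristic: treat common refusal phrases as a failed attack.
    refusal_markers = ("i can't", "i cannot", "i won't", "as an ai")
    return any(marker in response.lower() for marker in refusal_markers)

def attack_success_rate(attack_fn, prompts, model_fn) -> float:
    """Fraction of prompts for which the attacked model does not refuse."""
    if not prompts:
        return 0.0
    successes = sum(
        0 if is_refusal(model_fn(attack_fn(p))) else 1
        for p in prompts
    )
    return successes / len(prompts)
```

Because the metric is a single scalar in [0, 1], every candidate modification gets directly comparable feedback on each iteration.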
### 3. Agent Capabilities
- Code generation/modification
- Literature understanding
- Creative exploration
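The pipeline only needs the agent to expose one capability: proposing candidate modifications. A hypothetical interface sketch (the method name matches the pseudocode above; the stand-in implementation is purely illustrative):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class ResearchAgent(Protocol):
    """Hypothetical interface the pipeline assumes; names are illustrative."""

    def suggest_modifications(self, algorithm: str) -> list:
        """Propose candidate edits to the current algorithm's source."""
        ...

class RandomSuffixAgent:
    """Trivial stand-in agent: proposes appending variant markers."""

    def suggest_modifications(self, algorithm: str) -> list:
        return [algorithm + f"  # variant {i}" for i in range(3)]
```

Any real agent (Claude Code-like code generation, literature-aware exploration) can be swapped in as long as it satisfies this protocol.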
### 4. Iterative Refinement
- Many small modifications
- Gradual improvement accumulation
- Exploration vs exploitation balance
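One standard way to balance exploration against exploitation is an epsilon-greedy acceptance rule: always keep strict improvements, but occasionally accept a non-improving modification to escape local optima. A sketch; this rule is a common heuristic, not one prescribed by the paper:

```python
import random

def accept(new_score, best_score, epsilon=0.1, rng=None):
    """Greedy acceptance with probability-epsilon exploration."""
    rng = rng or random.Random()
    if new_score > best_score:
        return True  # exploit: keep strict improvements
    return rng.random() < epsilon  # explore: occasionally accept a regression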
## Application Domains
| Domain | Starting Point | Objective | Suitability |
|--------|----------------|-----------|-------------|
| Adversarial Attacks | GCG, AutoPrompt | ASR maximization | ✅ Excellent |
| Prompt Optimization | Base prompts | Task performance | ✅ Good |
| Architecture Search | Known architectures | Accuracy | ✅ Good |
| Hyperparameter Tuning | Default configs | Validation score | ✅ Good |
| Algorithm Discovery | Existing algorithms | Benchmark scores | ✅ Excellent |
## Safety Considerations
⚠️ **Important**: This pipeline can be used for both defensive and offensive research.
### Defensive Applications
- Discover robust defense mechanisms
- Identify vulnerabilities before attackers
- Stress-test safety systems
### Offensive Applications
- Create new attack algorithms
- Jailbreak safety measures
- Prompt injection optimization
### Recommended Use
- **Prioritize defensive research**
- Use for authorized security testing only
- Follow ethical guidelines
- Report findings responsibly
## Relation to Self-Evolution
| Self-Evolution Concept | Autoresearch Pipeline |
|------------------------|----------------------|
| Learn → Apply → Reflect → Improve | Baseline → Modify → Evaluate → Discover |
| Delegation to Specialists | Agent handles code exploration |
| Dense Feedback | Quantitative ASR metrics |
| Ship or It Doesn't Count | Published SOTA algorithms |
## Implementation for OpenClaw
### Potential Applications
1. **Skill Optimization**
- Start from existing skills
- Agent modifies instructions
- Evaluate on task performance
2. **Agent Improvement**
- Optimize agent behaviors
- Discover new workflows
- Quantitative success metrics
3. **Workflow Discovery**
- Find better processes
- Optimize existing workflows
- Task completion metrics
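For skills and workflows, "task performance" can be operationalized as a pass rate over a fixed evaluation suite, mirroring ASR for attacks. A minimal harness sketch; `run_task`, the checker callables, and the name `task_pass_rate` are all illustrative assumptions:

```python
def task_pass_rate(run_task, skill, tasks):
    """Mean pass rate of `skill` over (task_input, checker) pairs.

    run_task(skill, task_input) -> output; checker(output) -> bool.
    Both callables are illustrative assumptions.
    """
    if not tasks:
        return 0.0
    passed = sum(1 for task_input, checker in tasks
                 if checker(run_task(skill, task_input)))
    return passed / len(tasks)
```

Holding the task suite fixed across iterations keeps scores comparable, so the optimization loop can rank candidate skill modifications reliably.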
### Example: Skill Autoresearch
```python
class SkillAutoresearch:
    def optimize_skill(self, base_skill, evaluation_tasks, n_iterations=100):
        current_skill = base_skill
        best_performance = evaluate_skill(current_skill, evaluation_tasks)
        for iteration in range(n_iterations):
            # Agent suggests skill modifications
            suggestions = self.agent.analyze_skill(current_skill)
            for suggestion in suggestions:
                # apply_suggestion and evaluate_skill are assumed helpers
                modified_skill = apply_suggestion(current_skill, suggestion)
                # Evaluate on tasks; keep only strict improvements
                performance = evaluate_skill(modified_skill, evaluation_tasks)
                if performance > best_performance:
                    current_skill = modified_skill
                    best_performance = performance
        return current_skill
```
## Best Practices
1. **Define Clear Objectives** - Measurable success metrics
2. **Set Constraints** - Safety boundaries, computational limits
3. **Document Discoveries** - Track all improvements
4. **Validate Transfers** - Test generalization to other contexts
5. **Report Responsibly** - Ethical disclosure for security findings
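Practice 3 ("Document Discoveries") can be implemented as an append-only log of every accepted modification and its score, keeping discoveries auditable after the run. A file-backed sketch of the `log_discovery` helper the pipeline pseudocode assumes; the path argument and JSON-lines schema are my additions, not from the source:

```python
import json
import time

def log_discovery(path, modification, score):
    """Append one JSON line per accepted modification (schema is illustrative)."""
    record = {"ts": time.time(), "modification": modification, "score": score}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```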
## Description
An autoresearch pipeline in which an LLM agent starts from strong existing baselines, iteratively modifies them, and uses dense quantitative feedback to discover new state-of-the-art algorithms for AI safety research.
## Tools Used
- `read` - Read documentation and references
- `web_search` - Search for related information
- `web_fetch` - Fetch paper or documentation
## Instructions for Agents
Follow these steps when applying this skill:
### Step 1: Start from existing implementations
Pick a strong, well-understood baseline (e.g., GCG) rather than starting from scratch.
### Step 2: LLM agent iteration
Have the agent (e.g., Claude Code) propose and apply modifications to the current algorithm.
### Step 3: Quantitative evaluation
Score every candidate against a dense, measurable objective such as attack success rate (ASR).
### Step 4: Discover new algorithms
Keep strict improvements, iterate, and document any modification that advances the state of the art.
### Step 5: Skill Optimization
Apply the same loop to skills and workflows: modify, evaluate on tasks, and retain what measurably improves.
## Examples
### Example 1: Basic Application
**User:** I need to apply Autoresearch Pipeline for AI Safety Research to my analysis.
**Agent:** I'll help you apply autoresearch-pipeline. First, let me understand your specific use case...
**Context:** Apply the methodology
### Example 2: Advanced Application
**User:** What are the key considerations for autoresearch-pipeline?
**Agent:** Let me search for the latest research and best practices...
## References
- Paper: https://arxiv.org/abs/2603.24511
- GitHub: https://github.com/romovpa/claudini
- Related: `self-evolving-agents-survey`
---
**Created:** 2026-03-28
**Source:** arXiv:2603.24511v1 - "Claudini: Autoresearch Discovers SOTA Adversarial Attack Algorithms"
⚠️ **Note**: Focus on research methodology, not attack details. Use for defensive research only.