[![Versuz · safety-scan](https://versuz.dev/badge/skill/ruvnet-ruflo-plugins-ruflo-aidefence-skills-safety-scan)](https://versuz.dev/skills/ruvnet-ruflo-plugins-ruflo-aidefence-skills-safety-scan)

Show SKILL.md content (~370 tokens)

---
name: safety-scan
description: Scan inputs for prompt injection, unsafe content, and adversarial attacks using AIDefence
argument-hint: "<input-text>"
allowed-tools: mcp__claude-flow__aidefence_scan mcp__claude-flow__aidefence_analyze mcp__claude-flow__aidefence_is_safe mcp__claude-flow__aidefence_learn mcp__claude-flow__aidefence_stats Bash
---

# Safety Scan

Scan content for prompt injection, jailbreak attempts, and unsafe patterns.

## When to use

Before processing untrusted input (user submissions, API payloads, webhook data), scan it to detect prompt injection, adversarial content, or policy violations.

## Steps

1. **Quick safety check** — call `mcp__claude-flow__aidefence_is_safe` with the input text for a boolean safe/unsafe result
2. **Deep analysis** — call `mcp__claude-flow__aidefence_analyze` for detailed threat classification and confidence scores
3. **Full scan** — call `mcp__claude-flow__aidefence_scan` for comprehensive multi-layer scanning
4. **Train defenses** — call `mcp__claude-flow__aidefence_learn` with confirmed threats to improve detection
5. **View stats** — call `mcp__claude-flow__aidefence_stats` for detection rates and false positive metrics

## Threat categories

- Prompt injection (direct and indirect)
- Jailbreak attempts
- Data exfiltration patterns
- Instruction override attacks
- Social engineering prompts