Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.
npx versuz@latest install evolution-foundation-evo-nexus-claude-skills-ai-image-creatorgit clone https://github.com/evolution-foundation/evo-nexus.gitcp evo-nexus/SKILL.MD ~/.claude/skills/evolution-foundation-evo-nexus-claude-skills-ai-image-creator/SKILL.md---
name: ai-image-creator
description: Generate PNG images using AI (multiple models via OpenRouter including Gemini, FLUX.2, Riverflow, SeedDream, GPT-5 Image, proxied through Cloudflare AI Gateway BYOK). Also analyze/describe existing images using multimodal AI vision. Use when user asks to "generate an image", "create a PNG", "make an icon", "make it transparent", "describe this image", "analyze this image", "what's in this image", "explain this image", or needs AI-generated visual assets for the project. Supports model selection via keywords (gemini, riverflow, flux2, seedream, gpt5), configurable aspect ratios/resolutions, transparent backgrounds (-t), reference image editing (-r), image analysis (--analyze), and per-project cost tracking (--costs).
allowed-tools: Bash, Read, Write
compatibility: Requires uv (Python runner) and network access. Environment variables for CF AI Gateway or direct API keys must be configured in shell profile (~/.zshrc on macOS, ~/.bashrc on Linux, or System Environment Variables on Windows).
metadata:
tags: image-generation, ai, openrouter, cloudflare, gemini, flux2, riverflow, seedream, gpt5
---
# AI Image Creator
Generate PNG images via multiple AI models, routed through Cloudflare AI Gateway BYOK or directly via OpenRouter/Google AI Studio.
## Model Selection
When the user mentions a model keyword in their image request, use the corresponding `--model` flag:
| Keyword | Model | Use When User Says |
|---------|-------|--------------------|
| `gemini` | [Google Gemini 3.1 Flash](https://openrouter.ai/google/gemini-3.1-flash-image-preview) (default) | "gemini", "generate an image" (no model specified) |
| `riverflow` | [Sourceful Riverflow v2 Pro](https://openrouter.ai/sourceful/riverflow-v2-pro) | "riverflow", "use riverflow" |
| `flux2` | [FLUX.2 Max](https://openrouter.ai/black-forest-labs/flux.2-max) | "flux2", "flux", "use flux" |
| `seedream` | [ByteDance SeedDream 4.5](https://openrouter.ai/bytedance-seed/seedream-4.5) | "seedream", "use seedream" |
| `gpt5` | [OpenAI GPT-5 Image](https://openrouter.ai/openai/gpt-5-image) | "gpt5", "gpt5 image", "use gpt5" |
## Instructions
> **Routing check:** If the user asks to **describe, analyze, or explain an existing image** (not generate a new one), skip directly to the **Image Analysis (`--analyze`)** section below. No prompt enhancement or output path needed.
### Step 1: Write Prompt
For long or complex prompts (recommended), write to `workspace/assets/prompts/prompt.txt` using the Write tool:
```
Write prompt text to workspace/assets/prompts/prompt.txt
```
For short prompts (under 200 chars, no special characters), pass inline via `--prompt`.
**CRITICAL — Prompt Quality Tips:**
- Be detailed and descriptive. Include style, colors, composition, background, and intended use.
- Good: "A flat-design globe icon with vertical timezone band lines in blue and teal, white background, clean vector style, suitable for a web app at 512x512 pixels"
- Bad: "globe icon"
- Specify "transparent background" or "white background" explicitly.
- For icons, mention the target size (e.g., "512x512", "favicon at 32x32").
- For photos, describe lighting, camera angle, and mood.
### Step 1.5: Prompt Enhancement (Optional — Progressive Disclosure)
Professional prompt patterns are available in 3 reference files. These are **not loaded by default** — only read them when the user's request matches a category or they explicitly ask for enhancement.
**Category Detection** — Match the user's request to a category:
| If request mentions... | Category | Also read |
|----------------------|----------|-----------|
| "product shot", "product photo", "hero image" | `product_hero` | `prompt-core.md` + `prompt-categories.md` § product_hero |
| "lifestyle", "in-use", "in context" | `lifestyle` | `prompt-core.md` + `prompt-categories.md` § lifestyle |
| "instagram", "social media", "tiktok", "pinterest" | `social_media` | `prompt-core.md` + `prompt-platforms.md` + `prompt-categories.md` § social_media |
| "banner", "ad", "email header" | `marketing_banner` | `prompt-core.md` + `prompt-platforms.md` + `prompt-categories.md` § marketing_banner. **Routing hint:** If user has an existing logo and wants multiple standard sizes → use composite mode instead (see `## Composite Banners`). |
| "website", "app", "logo", "ad format", "leaderboard", "skyscraper" | `web_app` | `prompt-core.md` + `prompt-platforms.md` + `prompt-categories.md` § web_app. **Routing hint:** For "logo banners" or "OG images with my logo" where user has existing logo → use `composite-banners.py`. For "design me a new logo" → use `generate-image.py`. |
| "brand kit", "logo banners", "banner sizes", "IAB sizes", "consistent banners" + user has existing logo | `composite` | Read `references/composite-reference.md`, use `composite-banners.py` |
| "icon", "favicon", "app icon" | `icon_logo` | `prompt-core.md` + `prompt-categories.md` § icon_logo |
| "mascot", "character", "illustration", "artwork" | `illustration` | `prompt-core.md` + `prompt-categories.md` § illustration |
| "food", "drink", "recipe", "restaurant" | `food_drink` | `prompt-core.md` + `prompt-categories.md` § food_drink |
| "building", "interior", "room", "architecture" | `architecture` | `prompt-core.md` + `prompt-categories.md` § architecture |
| "chart", "infographic", "data", "diagram" | `infographic` | `prompt-core.md` + `prompt-categories.md` § infographic |
| "t-shirt", "mug design", "poster", "POD", "print-on-demand" | `pod_design` | `prompt-core.md` + `prompt-platforms.md` + `prompt-categories.md` § pod_design |
| "describe", "analyze", "what's in this image", "explain image" | `analyze` | Skip prompt enhancement — use `--analyze` mode directly. Read `references/analyze-reference.md` for advanced analysis patterns |
| No match / simple request | — | Skip patterns, generate directly |
**When to skip enhancement:**
- User's prompt is already detailed (150+ words with camera/lighting/composition specifics)
- Simple/direct requests ("generate a blue circle on white background")
- User says "no pattern" or provides a fully formed prompt
**When to apply:**
- User says "use product_hero pattern" or "apply social_media pattern" (explicit)
- Request clearly matches a category above (auto-detect)
- User asks for "enhanced prompt" or "professional quality"
**Reference files** (in `references/` directory):
- `prompt-core.md` — Foundational rules: narrative prompting, camera/lens/lighting specs, text rendering rules, model recommendations
- `prompt-platforms.md` — Social media ratios, IAB ad sizes, web dimensions, POD specs — all mapped to `-a`/`-s` flags
- `prompt-categories.md` — 11 category formulas with templates and complete example prompts
### Step 1.5b: Output Path Convention
Save generated images to `workspace/assets/images/` in the workspace root. Use descriptive filenames:
- `workspace/assets/images/wallpaper-evolution-dark.png`
- `workspace/assets/images/icon-agent-512.png`
- `workspace/assets/images/banner-summit-2026.png`
- `workspace/assets/images/social-linkedin-post.png`
Create the directory if it doesn't exist. The `workspace/assets/` folder is gitignored.
### Step 1.5c: Provider Selection
The script auto-loads env vars from the workspace `.env`. Choose provider based on available keys:
- If `AI_IMG_CREATOR_CF_ACCOUNT_ID` + `AI_IMG_CREATOR_CF_GATEWAY_ID` + `AI_IMG_CREATOR_CF_TOKEN` are set → use default (gateway mode, no flag needed)
- If only `AI_IMG_CREATOR_OPENROUTER_KEY` is set → use default (`--provider openrouter`, implicit)
- If only `AI_IMG_CREATOR_GEMINI_KEY` is set → use `--provider google`
- **Do NOT use `source .env`** — the Python script loads it internally. Just run the command directly.
### Step 2: Run Generation Script
```bash
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
-o "workspace/assets/images/OUTPUT_NAME.png" \
[--provider openrouter|google] \
[-a "16:9"] \
[-s "2K"] \
[-m "model-id"] \
[-r "ref-image.png"] \
[-t]
```
With a specific model:
```bash
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
-o "OUTPUT_PATH" \
-m riverflow \
-p "A serene mountain lake at sunset"
```
With transparent background (requires ffmpeg + imagemagick):
```bash
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
-o "mascot.png" \
-t \
-p "A friendly robot mascot character"
```
With reference image for editing/style transfer (multimodal models only):
```bash
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
-o "edited.png" \
-r "original.png" \
-p "Change the background to a sunset scene"
```
Or with inline prompt (default model):
```bash
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
-o "OUTPUT_PATH" \
-p "A simple blue circle on white background"
```
### Step 3: Save Prompt (rename to match image)
After generation, rename the prompt file to match the output image name for historical reference:
```bash
mv workspace/assets/prompts/prompt.txt workspace/assets/prompts/OUTPUT_NAME.txt
```
Example: if image is `ai-entity-cosmos.png`, rename prompt to `ai-entity-cosmos.txt`.
### Step 4: Verify Output
```bash
file OUTPUT_PATH
```
Confirm it shows "PNG image data" and report the file path and size to the user.
### Step 5: Post-Processing (optional)
If the user needs resizing, format conversion, or other manipulation, first detect available image tools, then use them. See **Image Tools** section below.
## Parameters
| Argument | Short | Required | Default | Description |
|----------|-------|----------|---------|-------------|
| `--output` | `-o` | Yes | -- | Output file path (parent dirs auto-created) |
| `--prompt` | `-p` | No | -- | Inline prompt text |
| `--prompt-file` | -- | No | `../tmp/prompt.txt` | Path to prompt file |
| `--provider` | -- | No | `openrouter` | `openrouter` or `google` |
| `--aspect-ratio` | `-a` | No | model default | OpenRouter only: `1:1`, `16:9`, `9:16`, `3:2`, `2:3`, `4:3`, `3:4`, `4:5`, `5:4`, `21:9` |
| `--image-size` | `-s` | No | model default | OpenRouter only: `0.5K`, `1K`, `2K`, `4K` |
| `--model` | `-m` | No | `gemini` | Model keyword (`gemini`, `riverflow`, `flux2`, `seedream`, `gpt5`) or full model ID |
| `--ref` | `-r` | No | -- | Reference image file (repeatable). For editing/style transfer. Multimodal models only (gemini, gpt5) |
| `--analyze` | -- | No | -- | Analyze/describe a reference image (text-only output, no image generated). Requires `-r`. Multimodal models only |
| `--transparent` | `-t` | No | -- | Generate with transparent background. Requires ffmpeg + imagemagick |
| `--costs` | -- | No | -- | Display generation/cost history for this project and exit |
| `--list-models` | -- | No | -- | List available model keywords and exit |
## Environment Variables
| Variable | Required For | Description |
|----------|-------------|-------------|
| `AI_IMG_CREATOR_CF_ACCOUNT_ID` | Gateway mode | Cloudflare account ID |
| `AI_IMG_CREATOR_CF_GATEWAY_ID` | Gateway mode | AI Gateway name |
| `AI_IMG_CREATOR_CF_TOKEN` | Gateway mode | Gateway auth token |
| `AI_IMG_CREATOR_OPENROUTER_KEY` | Direct OpenRouter | OpenRouter API key (`sk-or-...`) |
| `AI_IMG_CREATOR_GEMINI_KEY` | Direct Google | Google AI Studio API key |
Gateway mode activates when all 3 `CF_*` vars are set. Falls back to direct mode if gateway fails.
For first-time setup, see `references/setup-guide.md`.
## Transparent Mode (`-t`)
Generates images with transparent backgrounds using a 3-step pipeline:
1. **Green screen generation** — Prompt is augmented to place subject on solid #00FF00 green
2. **FFmpeg chroma key** — Removes green background + green fringe from edges
3. **ImageMagick auto-crop** — Trims transparent padding
**Requirements:** `brew install ffmpeg imagemagick`
**Use cases:** Game sprites, icons, logos, mascots, marketing assets with transparency.
```bash
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
-o "sprite.png" -t -p "A pixel art treasure chest"
```
## Reference Images (`-r`)
Send existing images alongside text prompts for editing, style transfer, or guided generation. Supports multiple references. **Multimodal models only** (gemini, gpt5) — image-only models (riverflow, flux2, seedream) will error.
```bash
# Edit an existing image
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
-o "edited.png" -r "photo.png" -p "Make the background white"
# Style transfer with multiple references
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
-o "combined.png" -r "style1.png" -r "content.png" -p "Apply the style of the first image to the second"
```
Supported formats: PNG, JPEG, WebP, GIF.
## Image Analysis (`--analyze`)
Describe, analyze, or explain existing images using multimodal AI vision. Returns text-only output (no image generated). **Multimodal models only** (gemini, gpt5).
No `-o` output path needed. No prompt enhancement needed. The script outputs JSON to stdout with the model's analysis in the `analysis` field.
```bash
# Analyze with default prompt (describes subject, style, colors, composition, mood, text)
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
--analyze -r "photo.png"
# Analyze with custom prompt
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
--analyze -r "photo.png" -p "Describe this image in plain text and also in JSON structured output"
# Analyze with a specific model
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
--analyze -r "photo.png" -m gpt5 -p "What text is visible in this image?"
# Analyze multiple images together
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py \
--analyze -r "before.png" -r "after.png" -p "Compare these two images and describe the differences"
```
**JSON output format:**
```json
{"ok": true, "analyze": true, "analysis": "<model text>", "provider": "openrouter", "model": "...", "mode": "gateway", "elapsed_seconds": 3.2, "ref_images": 1}
```
**Incompatible flags:** `--analyze` cannot be combined with `-o`, `-t`, `-a`, or `-s`.
For advanced analysis prompt patterns (structured output, comparison, targeted analysis), read `references/analyze-reference.md`.
## Cost Tracking (`--costs`)
Every generation is logged to `.ai-image-creator/costs.json` in your project directory. View history:
```bash
uv run python ${CLAUDE_SKILL_DIR}/scripts/generate-image.py --costs
```
Shows per-model breakdown: generation count, total tokens, elapsed time, and recent entries. **Security:** Only non-sensitive data is logged (model, tokens, timing, file path). No API keys or credentials are ever stored.
Consider adding `.ai-image-creator/` to your `.gitignore`.
## Composite Banners
Generate consistent logo banners across multiple sizes from a JSON config. Uses ImageMagick for offline compositing — no API calls, no network required. Composites an existing logo/mark onto branded backgrounds with text at standard dimensions.
### Composite vs. AI Generation — Decision Rule
Use **composite-banners.py** when ALL of these are true:
- User has an existing logo/mark they want to use as-is (provides or references a logo file)
- User wants consistent branding across multiple standard sizes (not one creative image)
- The output is logo + text on a solid/gradient background (not a photograph, illustration, or creative design)
Use **generate-image.py** (AI generation) when ANY of these are true:
- User wants a creative/artistic banner design (describes a scene, mood, concept, or style)
- User wants AI to design the visual content (product shots, illustrations, creative layouts)
- User wants a single banner with artistic content, not a multi-size brand kit
**When composite mode applies**, read `references/composite-reference.md` for full config schema, preset dimensions, and font handling details.
### Quick Start
1. **Init config:** `uv run python ${CLAUDE_SKILL_DIR}/scripts/composite-banners.py --init`
2. **Edit** `banner-config.json` — set logo path, brand text, colors, banner sizes
3. **Validate:** `uv run python ${CLAUDE_SKILL_DIR}/scripts/composite-banners.py --validate`
4. **Generate:** `uv run python ${CLAUDE_SKILL_DIR}/scripts/composite-banners.py -c banner-config.json -o ./banners/`
### Composite Parameters
| Argument | Short | Default | Description |
|----------|-------|---------|-------------|
| `--config` | `-c` | `banner-config.json` | Config JSON path |
| `--output-dir` | `-o` | `.` | Output directory |
| `--name` | `-n` | all | Generate single banner by name |
| `--format` | `-f` | `png` | `png`, `webp`, `jpeg` |
| `--list-presets` | | | List IAB/social/web size presets |
| `--init` | | | Generate starter config |
| `--validate` | | | Check config, exit 0 or 2 |
| `--dry-run` | | | Preview without rendering |
| `--json` | | | Structured JSON to stdout |
| `--verbose` | `-v` | | Verbose output |
**Requirements:** ImageMagick 7 (`brew install imagemagick` or `apt install imagemagick`).
### Workflow Hints
**Starting composite mode:**
- Ask user for: logo file path, brand name, tagline text, brand colors (hex)
- If user doesn't have a logo yet → use generate-image.py to create one first
- Run `--init` to scaffold config, then help user fill in their brand values
**During generation:**
- Always run `--validate` before generating to catch font/logo issues early
- Use `--name` to iterate on one banner before generating the full set
- Show user 3-4 representative sizes (hero, OG, square, leaderboard) for approval
**After generation:**
- If user wants creative/artistic redesign of banner visuals → switch to generate-image.py (composite only does logo + text on gradient/solid backgrounds)
- If banners look too plain → suggest AI-generating a textured or photographic background first, then compositing the logo onto it
**Combined workflow (most powerful):**
1. Use generate-image.py to AI-create a hero background or textured pattern
2. Use composite-banners.py to overlay the logo + text onto that background at all standard sizes
This gives both creative AI visuals AND pixel-perfect logo consistency.
## Image Tools
On first invocation, detect available image manipulation tools:
```bash
which magick convert sips ffmpeg 2>/dev/null
```
### Available Tools
| Tool | Check | Key Operations |
|------|-------|----------------|
| **ImageMagick 7** (`magick`) | `magick --version` | Resize, crop, convert, composite |
| **ImageMagick 6** (`convert`) | `convert --version` | Same ops, legacy command name |
| **sips** (macOS) | `sips --help` | Resize, format conversion |
| **ffmpeg** | `ffmpeg -version` | Convert formats, resize |
### Common Post-Processing
```bash
# Resize
magick output.png -resize 512x512 icon-512.png
# Multiple sizes (icons)
for s in 16 32 48 64 128 256 512; do magick output.png -resize ${s}x${s} icon-${s}.png; done
# Convert to WebP
magick output.png output.webp
# Maskable icon (add safe-zone padding)
magick output.png -gravity center -extent 120%x120% maskable.png
# macOS sips resize
sips --resampleWidth 512 --resampleHeight 512 output.png --out icon-512.png
```
CRITICAL: Check tool availability before using. Prefer `magick` (IM7) over `convert` (IM6). If no tools found, inform user: `brew install imagemagick`.
## Common Issues
### "No API credentials configured"
**Cause:** Environment variables not set or not exported.
**Fix:** Add exports to `~/.zshrc` and run `source ~/.zshrc`. See `references/setup-guide.md`.
### "HTTP 401: Unauthorized"
**Cause:** Invalid or expired API key/token.
**Fix:** Check `AI_IMG_CREATOR_CF_TOKEN` (gateway) or `AI_IMG_CREATOR_OPENROUTER_KEY` (direct). Regenerate if needed.
### "No images in response"
**Cause:** Model returned text only (safety filter, unclear prompt, or unsupported request).
**Fix:** Make the prompt more specific and descriptive. Avoid prohibited content.
### "Connection error" / timeout
**Cause:** Network issue or image generation taking too long (120s timeout).
**Fix:** Retry. If persistent, try `--provider google` as alternative. Check CF gateway status.
## Detailed API Reference
For full API formats, response schemas, BYOK configuration, and curl examples:
see [references/api-reference.md](references/api-reference.md)
For first-time setup instructions:
see [references/setup-guide.md](references/setup-guide.md)