From b1008510f8efc402606c0093d5feb47cbbed16c6 Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 14:17:53 +0900 Subject: [PATCH 1/6] docs: add agent-model matching guide for newcomers - Add docs/guide/agent-model-matching.md with TL;DR table, detailed breakdown per agent, configuration examples, decision tree, common pitfalls, and default fallback chains - Update README.md to reference the guide in TOC, Just Install This section, and Features overview --- README.md | 6 ++ docs/guide/agent-model-matching.md | 166 +++++++++++++++++++++++++++++ 2 files changed, 172 insertions(+) create mode 100644 docs/guide/agent-model-matching.md diff --git a/README.md b/README.md index aa134f808..5a1e7a617 100644 --- a/README.md +++ b/README.md @@ -115,6 +115,7 @@ Yes, technically possible. But I cannot recommend using it. - [🪄 The Magic Word: `ultrawork`](#-the-magic-word-ultrawork) - [For Those Who Want to Read: Meet Sisyphus](#for-those-who-want-to-read-meet-sisyphus) - [Just Install This](#just-install-this) + - [Which Model Should I Use?](#which-model-should-i-use) - [For Those Who Want Autonomy: Meet Hephaestus](#for-those-who-want-autonomy-meet-hephaestus) - [Installation](#installation) - [For Humans](#for-humans) @@ -222,6 +223,10 @@ Need to look something up? It scours official docs, your entire codebase history If you don't want all this, as mentioned, you can just pick and choose specific features. +#### Which Model Should I Use? + +New to oh-my-opencode and not sure which model to pair with which agent? Check the **[Agent-Model Matching Guide](docs/guide/agent-model-matching.md)** — a quick reference for newcomers covering recommended models, fallback chains, and common pitfalls for each agent. 
+
### For Those Who Want Autonomy: Meet Hephaestus

![Meet Hephaestus](.github/assets/hephaestus.png)

@@ -307,6 +312,7 @@ See the full [Features Documentation](docs/features.md) for detailed information
- **Built-in MCPs**: websearch (Exa), context7 (docs), grep_app (GitHub search)
- **Session Tools**: List, read, search, and analyze session history
- **Productivity Features**: Ralph Loop, Todo Enforcer, Comment Checker, Think Mode, and more
+- **[Agent-Model Matching Guide](docs/guide/agent-model-matching.md)**: Which model works best with which agent

## Configuration

diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md
new file mode 100644
index 000000000..c78aa0b91
--- /dev/null
+++ b/docs/guide/agent-model-matching.md
@@ -0,0 +1,166 @@
+# Agent-Model Matching Guide for Newcomers
+
+> **Quick Reference**: Which model to use with which agent for the best results
+
+This guide helps you match the right AI model to each oh-my-opencode agent based on real-world usage and testing.
+
+---
+
+## TL;DR
+
+| Agent | Best Models | Avoid |
+|-------|-------------|-------|
+| **Sisyphus** | Claude Opus, Sonnet, Kimi K2.5, GLM 5 | GPT ❌ |
+| **Hephaestus** | GPT-5.3-codex only | Non-GPT ❌ |
+| **Prometheus** | Claude Opus | GPT (untested) |
+| **Atlas** | Claude Opus, GPT-5.2+ | — |
+
+---
+
+## Detailed Breakdown
+
+### Sisyphus (ultraworker)
+**Purpose**: Primary orchestrator for complex multi-step tasks
+
+**Recommended Models** (in order of preference):
+1. **Claude Opus-4-6** — The best overall performance
+2. **Claude Sonnet-4-6** — Satisfactory, often better than pure Claude Code + Opus
+3. **Kimi K2.5** — Good for broad tasks, excellent cost-performance
+4. **GLM 5** — Good for various tasks, though not as capable as Kimi on broad ones
+5. **MiniMax** — Budget option when cost matters
+
+**⚠️ NEVER USE GPT** — Sisyphus is optimized for Claude-style models and performs poorly with GPT.
+ +**Configuration Example**: +```json +{ + "agent": { + "sisyphus": { + "model": "anthropic/claude-opus-4-6", + "variant": "max" + } + } +} +``` + +--- + +### Hephaestus (deep worker) +**Purpose**: Deep coding tasks requiring extensive reasoning + +**Required Model**: **GPT-5.3-codex** (always) + +Think of Hephaestus as "Codex on steroids" — it's specifically designed and tuned for GPT models. + +**⚠️ DO NOT USE** if you don't have GPT access. DeepSeek *might* work but is not officially supported. + +**Configuration Example**: +```json +{ + "agent": { + "hephaestus": { + "model": "openai/gpt-5.3-codex", + "variant": "medium" + } + } +} +``` + +--- + +### Prometheus (planner) +**Purpose**: Strategic planning and work plan generation + +**Recommended Model**: **Claude Opus-4-6** (strongly recommended) + +Prometheus is optimized for Claude's reasoning capabilities. GPT compatibility is not yet tested but may be evaluated in the future. + +**Configuration Example**: +```json +{ + "agent": { + "plan": { + "model": "anthropic/claude-opus-4-6", + "variant": "max" + } + } +} +``` + +--- + +### Atlas (orchestrator) +**Purpose**: Todo list orchestration and multi-agent coordination + +**Recommended Models**: +1. **Claude Opus-4-6** — Best performance (recommended) +2. **GPT-5.2+** — Good enough, has GPT-optimized prompt + +Atlas has model-specific prompt detection and will automatically use GPT-optimized instructions when running on GPT models. + +**Configuration Example**: +```json +{ + "agent": { + "atlas": { + "model": "anthropic/claude-opus-4-6" + } + } +} +``` + +--- + +## Quick Decision Tree + +``` +Do you have GPT access? +├── YES → Use Hephaestus for deep coding, Atlas for orchestration +└── NO → Use Sisyphus (Claude/Kimi/GLM) for all tasks + +Need planning/strategy? +├── YES → Use Prometheus (Claude Opus recommended) +└── NO → Skip Prometheus, use other agents directly + +Complex multi-step task? 
+├── YES → Use Sisyphus (Claude-family models) +└── NO → Use category-specific agents or Hephaestus +``` + +--- + +## Common Pitfalls to Avoid + +1. **Don't use GPT with Sisyphus** — Performance will be subpar +2. **Don't use non-GPT with Hephaestus** — It's specifically built for GPT +3. **Don't force Prometheus on GPT** — It's untested; use Claude for now +4. **Don't overthink Atlas** — It adapts automatically to your model + +--- + +## Model Fallback Chains (Default Behavior) + +The system will automatically fall back through these chains if your preferred model is unavailable: + +**Sisyphus**: Opus → Kimi K2.5 → GLM 4.7 → Big Pickle +**Hephaestus**: GPT-5.3-codex only (no fallback) +**Prometheus**: Opus → Kimi K2.5 → GLM 4.7 → GPT-5.2 → Gemini 3 Pro +**Atlas**: Kimi K2.5 → GLM 4.7 → Opus → GPT-5.2 → Gemini 3 Pro + +--- + +## Tips for Newcomers + +- **Start with Sisyphus + Claude Opus** for general tasks +- **Use Hephaestus when you need deep reasoning** (requires GPT) +- **Try GLM 5 or Kimi K2.5** for cost-effective alternatives to Claude +- **Check the model requirements** in your config to avoid mismatches +- **Use `variant: "max"` or `variant: "high"`** for best results on capable models + +--- + +## See Also + +- [AGENTS.md](../AGENTS.md) — Full agent documentation +- [configurations.md](./configurations.md) — Configuration reference +- [orchestration-guide.md](./orchestration-guide.md) — How orchestration works From 3b8846e956a4143453d93bec601eea2e064807e7 Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 14:24:01 +0900 Subject: [PATCH 2/6] fix: correct Atlas model recommendations Atlas primary model is Kimi K2.5, not Opus. Updated TL;DR table and detailed breakdown to reflect actual recommended order: Kimi K2.5 > Sonnet > GPT. 
--- docs/guide/agent-model-matching.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md index c78aa0b91..3d6e24943 100644 --- a/docs/guide/agent-model-matching.md +++ b/docs/guide/agent-model-matching.md @@ -13,7 +13,7 @@ This guide helps you match the right AI model to each oh-my-opencode agent based | **Sisyphus** | Claude Opus, Sonnet, Kimi K2.5, GLM 5 | GPT ❌ | | **Hephaestus** | GPT-5.3-codex only | Non-GPT ❌ | | **Prometheus** | Claude Opus | GPT (untested) | -| **Atlas** | Claude Opus, GPT-5.2+ | — | +| **Atlas** | Kimi K2.5, Claude Sonnet, GPT-5.2+ | — | --- @@ -92,9 +92,10 @@ Prometheus is optimized for Claude's reasoning capabilities. GPT compatibility i ### Atlas (orchestrator) **Purpose**: Todo list orchestration and multi-agent coordination -**Recommended Models**: -1. **Claude Opus-4-6** — Best performance (recommended) -2. **GPT-5.2+** — Good enough, has GPT-optimized prompt +**Recommended Models** (in order of preference): +1. **Kimi K2.5** — Best for Atlas orchestration +2. **Claude Sonnet-4-6** — Strong alternative +3. **GPT-5.2+** — Good enough, has GPT-optimized prompt Atlas has model-specific prompt detection and will automatically use GPT-optimized instructions when running on GPT models. 
@@ -103,7 +104,7 @@ Atlas has model-specific prompt detection and will automatically use GPT-optimiz { "agent": { "atlas": { - "model": "anthropic/claude-opus-4-6" + "model": "kimi/kimi-k2.5" } } } From d9ee0d9c0db8a5743968fe12acae57b47b7a93c3 Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 15:01:34 +0900 Subject: [PATCH 3/6] docs: rewrite agent-model matching as technical guide for agents Rewrite agent-model-matching.md as a technical reference that: - Documents actual fallback chains from model-requirements.ts - Explains model-specific prompt routing (Prometheus/Atlas GPT detection) - Covers safe vs dangerous model substitutions with rationale - Includes task categories (visual-engineering, deep, quick, etc.) - Guides agents on how to explain model choices to users - Adds provider priority chain Also update installation.md to reference the guide when users want custom model configuration, with explanation of what is safe to change and why. --- docs/guide/agent-model-matching.md | 262 +++++++++++++++-------------- docs/guide/installation.md | 12 ++ 2 files changed, 148 insertions(+), 126 deletions(-) diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md index 3d6e24943..27fce9b66 100644 --- a/docs/guide/agent-model-matching.md +++ b/docs/guide/agent-model-matching.md @@ -1,167 +1,177 @@ -# Agent-Model Matching Guide for Newcomers +# Agent-Model Matching Guide -> **Quick Reference**: Which model to use with which agent for the best results - -This guide helps you match the right AI model to each oh-my-opencode agent based on real-world usage and testing. +> **For agents and users**: This document explains the principles behind oh-my-opencode's agent-model assignments. Use it to understand why each agent uses a specific model, and how to customize them correctly. 
--- -## TL;DR +## Why Model Matching Matters -| Agent | Best Models | Avoid | -|-------|-------------|-------| -| **Sisyphus** | Claude Opus, Sonnet, Kimi K2.5, GLM 5 | GPT ❌ | -| **Hephaestus** | GPT-5.3-codex only | Non-GPT ❌ | -| **Prometheus** | Claude Opus | GPT (untested) | -| **Atlas** | Kimi K2.5, Claude Sonnet, GPT-5.2+ | — | +Each oh-my-opencode agent has a **dedicated system prompt** optimized for a specific model family. Some agents (Atlas, Prometheus) ship separate prompts for GPT vs Claude models, with automatic routing via `isGptModel()` detection. Assigning the wrong model family to an agent doesn't just degrade performance — the agent may receive instructions formatted for a completely different model's reasoning style. + +**Key principle**: Agents are tuned to model families, not individual models. A Claude-tuned agent works with Opus, Sonnet, or Haiku. A GPT-tuned agent works with GPT-5.2 or GPT-5.3-codex. Crossing families requires a model-specific prompt (which only some agents have). --- -## Detailed Breakdown +## Agent-Model Map (Source of Truth) -### Sisyphus (ultraworker) -**Purpose**: Primary orchestrator for complex multi-step tasks +This table reflects the actual fallback chains in `src/shared/model-requirements.ts`. The first available model in the chain is used. -**Recommended Models** (in order of preference): -1. **Claude Opus-4-6** — The best overall performance -2. **Claude Sonnet-4-6** — Satisfiable, often better than pure Claude Code + Opus -3. **Kimi K2.5** — Good for broad tasks, excellent cost-performance -4. **GLM 5** — Good for various tasks, not as capable on broad tasks as Kimi -5. **MiniMax** — Budget option when cost matters +### Core Agents -**⚠️ NEVER USE GPT** — Sisyphus is optimized for Claude-style models and performs poorly on GPT. +| Agent | Role | Primary Model Family | Fallback Chain | Has GPT Prompt? 
| +|-------|------|---------------------|----------------|-----------------| +| **Sisyphus** | Main ultraworker | Claude | Opus → Kimi K2.5 → GLM 5 → Big Pickle | No — **never use GPT** | +| **Hephaestus** | Deep autonomous worker | GPT (only) | GPT-5.3-codex (medium) | N/A (GPT-native) | +| **Prometheus** | Strategic planner | Claude (default), GPT (auto-detected) | Opus → GPT-5.2 → Kimi K2.5 → Gemini 3 Pro | **Yes** — `src/agents/prometheus/gpt.ts` | +| **Atlas** | Todo orchestrator | Kimi K2.5 (default), GPT (auto-detected) | Kimi K2.5 → Sonnet → GPT-5.2 | **Yes** — `src/agents/atlas/gpt.ts` | +| **Oracle** | Architecture/debugging | GPT | GPT-5.2 → Gemini 3 Pro → Opus | No | +| **Metis** | Plan review consultant | Claude | Opus → Kimi K2.5 → GPT-5.2 → Gemini 3 Pro | No | +| **Momus** | High-accuracy reviewer | GPT | GPT-5.2 → Opus → Gemini 3 Pro | No | -**Configuration Example**: -```json +### Utility Agents + +| Agent | Role | Primary Model Family | Fallback Chain | +|-------|------|---------------------|----------------| +| **Explore** | Fast codebase grep | Grok/lightweight | Grok Code Fast 1 → MiniMax M2.5 → Haiku → GPT-5-nano | +| **Librarian** | Docs/code search | Lightweight | MiniMax M2.5 → Gemini 3 Flash → Big Pickle | +| **Multimodal Looker** | Vision/screenshots | Kimi/multimodal | Kimi K2.5 → Gemini 3 Flash → GPT-5.2 → GLM-4.6v | + +### Task Categories + +Categories are used for `background_task` and `delegate_task` dispatching: + +| Category | Purpose | Primary Model | Notes | +|----------|---------|---------------|-------| +| `visual-engineering` | Frontend/UI work | Gemini 3 Pro | Gemini excels at visual tasks | +| `ultrabrain` | Maximum intelligence | GPT-5.3-codex (xhigh) | Highest reasoning variant | +| `deep` | Deep coding | GPT-5.3-codex (medium) | Requires GPT availability | +| `artistry` | Creative/design | Gemini 3 Pro | Requires Gemini availability | +| `quick` | Fast simple tasks | Claude Haiku | Cheapest, fastest | +| 
`unspecified-high` | General high-quality | Claude Opus | Default for complex tasks | +| `unspecified-low` | General standard | Claude Sonnet | Default for standard tasks | +| `writing` | Text/docs | Kimi K2.5 | Best prose quality | + +--- + +## Model-Specific Prompt Routing + +### How It Works + +Some agents detect the assigned model at runtime and switch prompts: + +```typescript +// From src/agents/prometheus/system-prompt.ts +export function getPrometheusPrompt(model?: string): string { + if (model && isGptModel(model)) return getGptPrometheusPrompt() // XML-tagged, principle-driven + return PROMETHEUS_SYSTEM_PROMPT // Claude-optimized, modular sections +} +``` + +**Agents with dual prompts:** +- **Prometheus**: Claude prompt (modular sections) vs GPT prompt (XML-tagged, Codex plan mode style with explicit decision criteria) +- **Atlas**: Claude prompt vs GPT prompt (GPT-optimized todo orchestration) + +**Why this matters for customization**: If you override Prometheus to use a GPT model, the GPT prompt activates automatically. But if you override Sisyphus to use GPT — there is no GPT prompt, and performance will degrade significantly. + +### Model Family Detection + +`isGptModel()` matches: +- Any model starting with `openai/` or `github-copilot/gpt-` +- Model names starting with common GPT prefixes (`gpt-`, `o1-`, `o3-`, `o4-`, `codex-`) + +Everything else is treated as "Claude-like" (Claude, Kimi, GLM, Gemini). 
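
Sketched in TypeScript, the detection rules above amount to the following. This is an illustrative reconstruction, not the actual `src/shared` implementation — the constant name and exact structure are assumptions:

```typescript
// Illustrative sketch of the model-family detection described above.
// Names and structure are assumptions; the real logic lives in the source tree.
const GPT_NAME_PREFIXES = ["gpt-", "o1-", "o3-", "o4-", "codex-"];

function isGptModel(model: string): boolean {
  // Provider-qualified matches
  if (model.startsWith("openai/") || model.startsWith("github-copilot/gpt-")) {
    return true;
  }
  // Bare model-name prefixes (strip any provider qualifier first)
  const name = model.split("/").pop() ?? model;
  return GPT_NAME_PREFIXES.some((prefix) => name.startsWith(prefix));
}

isGptModel("openai/gpt-5.2");            // true  → GPT prompt
isGptModel("anthropic/claude-opus-4-6"); // false → Claude-like prompt
isGptModel("kimi/kimi-k2.5");            // false → Claude-like prompt
```

Because anything that fails these checks falls into the "Claude-like" bucket, overriding an agent to Kimi, GLM, or Gemini keeps it on the Claude-family prompt.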
+ +--- + +## Customization Guide + +### When to Customize + +Customize model assignments when: +- You have a specific provider subscription (e.g., only OpenAI, no Anthropic) +- You want to use a cheaper model for certain agents +- You're experimenting with new models + +### How to Customize + +Override in `oh-my-opencode.json` (user: `~/.config/opencode/oh-my-opencode.json`, project: `.opencode/oh-my-opencode.json`): + +```jsonc { - "agent": { - "sisyphus": { - "model": "anthropic/claude-opus-4-6", - "variant": "max" - } + "agents": { + "sisyphus": { "model": "kimi-for-coding/k2p5" }, + "atlas": { "model": "anthropic/claude-sonnet-4-6" }, + "prometheus": { "model": "openai/gpt-5.2" } // Will auto-switch to GPT prompt } } ``` ---- +### Safe Substitutions (same model family) -### Hephaestus (deep worker) -**Purpose**: Deep coding tasks requiring extensive reasoning +These swaps are safe because they stay within the same prompt family: -**Required Model**: **GPT-5.3-codex** (always) +| Agent | Default | Safe Alternatives | +|-------|---------|-------------------| +| **Sisyphus** | Claude Opus | Claude Sonnet, Kimi K2.5, GLM 5 (any Claude-like) | +| **Hephaestus** | GPT-5.3-codex | No alternatives — GPT only | +| **Prometheus** | Claude Opus | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | +| **Atlas** | Kimi K2.5 | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | +| **Oracle** | GPT-5.2 | Gemini 3 Pro, Claude Opus | -Think of Hephaestus as "Codex on steroids" — it's specifically designed and tuned for GPT models. +### Dangerous Substitutions (cross-family without prompt support) -**⚠️ DO NOT USE** if you don't have GPT access. DeepSeek *might* work but is not officially supported. +| Agent | Dangerous Override | Why | +|-------|-------------------|-----| +| **Sisyphus** → GPT | No GPT-optimized prompt exists. Sisyphus is deeply tuned for Claude-style reasoning. Performance drops dramatically. 
| +| **Hephaestus** → Claude | Hephaestus is purpose-built for GPT's Codex capabilities. Claude cannot replicate this. | +| **Explore** → Opus | Massive overkill and cost waste. Explore needs speed, not intelligence. | -**Configuration Example**: -```json -{ - "agent": { - "hephaestus": { - "model": "openai/gpt-5.3-codex", - "variant": "medium" - } - } -} -``` +### Explaining to Users + +When a user asks about model configuration, explain: + +1. **The default works out of the box** — the installer configures optimal models based on their subscriptions +2. **Each agent has a "home" model family** — Sisyphus is Claude-native, Hephaestus is GPT-native +3. **Some agents auto-adapt** — Prometheus and Atlas detect GPT models and switch to optimized prompts +4. **Cross-family overrides are risky** — unless the agent has a dedicated prompt for that family +5. **Cost optimization is valid** — swapping Opus → Sonnet or Kimi K2.5 for Sisyphus is fine and saves money +6. **Point to this guide** for the full fallback chains and rationale --- -### Prometheus (planner) -**Purpose**: Strategic planning and work plan generation +## Provider Priority -**Recommended Model**: **Claude Opus-4-6** (strongly recommended) - -Prometheus is optimized for Claude's reasoning capabilities. GPT compatibility is not yet tested but may be evaluated in the future. - -**Configuration Example**: -```json -{ - "agent": { - "plan": { - "model": "anthropic/claude-opus-4-6", - "variant": "max" - } - } -} -``` - ---- - -### Atlas (orchestrator) -**Purpose**: Todo list orchestration and multi-agent coordination - -**Recommended Models** (in order of preference): -1. **Kimi K2.5** — Best for Atlas orchestration -2. **Claude Sonnet-4-6** — Strong alternative -3. **GPT-5.2+** — Good enough, has GPT-optimized prompt - -Atlas has model-specific prompt detection and will automatically use GPT-optimized instructions when running on GPT models. 
- -**Configuration Example**: -```json -{ - "agent": { - "atlas": { - "model": "kimi/kimi-k2.5" - } - } -} -``` - ---- - -## Quick Decision Tree +When multiple providers are available, oh-my-opencode prefers: ``` -Do you have GPT access? -├── YES → Use Hephaestus for deep coding, Atlas for orchestration -└── NO → Use Sisyphus (Claude/Kimi/GLM) for all tasks - -Need planning/strategy? -├── YES → Use Prometheus (Claude Opus recommended) -└── NO → Skip Prometheus, use other agents directly - -Complex multi-step task? -├── YES → Use Sisyphus (Claude-family models) -└── NO → Use category-specific agents or Hephaestus +Native (anthropic/, openai/, google/) > Kimi for Coding > GitHub Copilot > OpenCode Zen > Z.ai Coding Plan ``` ---- - -## Common Pitfalls to Avoid - -1. **Don't use GPT with Sisyphus** — Performance will be subpar -2. **Don't use non-GPT with Hephaestus** — It's specifically built for GPT -3. **Don't force Prometheus on GPT** — It's untested; use Claude for now -4. **Don't overthink Atlas** — It adapts automatically to your model +Each fallback chain entry specifies which providers can serve that model. The system picks the first entry where at least one provider is connected. --- -## Model Fallback Chains (Default Behavior) +## Quick Decision Tree for Users -The system will automatically fall back through these chains if your preferred model is unavailable: +``` +What subscriptions do you have? -**Sisyphus**: Opus → Kimi K2.5 → GLM 4.7 → Big Pickle -**Hephaestus**: GPT-5.3-codex only (no fallback) -**Prometheus**: Opus → Kimi K2.5 → GLM 4.7 → GPT-5.2 → Gemini 3 Pro -**Atlas**: Kimi K2.5 → GLM 4.7 → Opus → GPT-5.2 → Gemini 3 Pro +├── Claude (Anthropic) → Sisyphus works optimally. Prometheus/Metis use Claude prompts. +├── OpenAI/ChatGPT → Hephaestus unlocked. Oracle/Momus use GPT. Prometheus auto-switches. +├── Both Claude + OpenAI → Full agent roster. Best experience. +├── Gemini only → Visual-engineering category excels. 
Other agents use Gemini as fallback. +├── GitHub Copilot only → Works as fallback provider for all model families. +├── OpenCode Zen only → Free-tier access to multiple models. Functional but rate-limited. +└── No subscription → Limited functionality. Consider OpenCode Zen (free). ---- - -## Tips for Newcomers - -- **Start with Sisyphus + Claude Opus** for general tasks -- **Use Hephaestus when you need deep reasoning** (requires GPT) -- **Try GLM 5 or Kimi K2.5** for cost-effective alternatives to Claude -- **Check the model requirements** in your config to avoid mismatches -- **Use `variant: "max"` or `variant: "high"`** for best results on capable models +For each user scenario, the installer (`bunx oh-my-opencode install`) auto-configures the optimal assignment. +``` --- ## See Also -- [AGENTS.md](../AGENTS.md) — Full agent documentation -- [configurations.md](./configurations.md) — Configuration reference -- [orchestration-guide.md](./orchestration-guide.md) — How orchestration works +- [Installation Guide](./installation.md) — Setup with subscription-based model configuration +- [Configuration Reference](../configurations.md) — Full config options including agent overrides +- [Overview](./overview.md) — How the agent system works +- [`src/shared/model-requirements.ts`](../../src/shared/model-requirements.ts) — Source of truth for fallback chains diff --git a/docs/guide/installation.md b/docs/guide/installation.md index 051887c2d..e8d27cf54 100644 --- a/docs/guide/installation.md +++ b/docs/guide/installation.md @@ -259,6 +259,18 @@ opencode auth login The plugin works perfectly by default. Do not change settings or turn off features without an explicit request. +### Custom Model Configuration + +If the user wants to override which model an agent uses, refer to the **[Agent-Model Matching Guide](./agent-model-matching.md)** before making changes. 
That guide explains: + +- **Why each agent uses its default model** — prompt optimization, model family compatibility +- **Which substitutions are safe** — staying within the same model family (e.g., Opus → Sonnet for Sisyphus) +- **Which substitutions are dangerous** — crossing model families without prompt support (e.g., GPT for Sisyphus) +- **How auto-routing works** — Prometheus and Atlas detect GPT models and switch to GPT-optimized prompts automatically +- **Full fallback chains** — what happens when the preferred model is unavailable + +Always explain to the user *why* a model is assigned to an agent when making custom changes. The guide provides the rationale for every assignment. + ### Verify the setup Read this document again, think about you have done everything correctly. From 36432fe18ed43dfc02b8933ada98341c285991aa Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 15:04:57 +0900 Subject: [PATCH 4/6] docs: add prompt design rationale from Codex plan mode analysis Expand model-specific prompt routing section with insights from the actual Prometheus GPT prompt development session: - Why Claude vs GPT models need fundamentally different prompts - Principle-driven (GPT) vs mechanics-driven (Claude) approach - "Decision Complete" concept from Codex Plan Mode - Why more rules help Claude but hurt GPT (contradiction surface) - Concrete size comparison (1100 lines Claude vs 300 lines GPT) --- docs/guide/agent-model-matching.md | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md index 27fce9b66..889b19d4b 100644 --- a/docs/guide/agent-model-matching.md +++ b/docs/guide/agent-model-matching.md @@ -55,6 +55,18 @@ Categories are used for `background_task` and `delegate_task` dispatching: ## Model-Specific Prompt Routing +### Why Different Models Need Different Prompts + +Claude and GPT models have fundamentally different instruction-following 
behaviors: + +- **Claude models** respond well to **mechanics-driven** prompts — detailed checklists, templates, step-by-step procedures, and explicit anti-patterns. More rules = more compliance. +- **GPT models** (especially 5.2+) have **stronger instruction adherence** and respond better to **principle-driven** prompts — concise principles, XML-tagged structure, explicit decision criteria. More rules = more contradiction surface area = more drift. + +This insight comes from analyzing OpenAI's Codex Plan Mode prompt alongside the GPT-5.2 Prompting Guide: +- Codex Plan Mode uses 3 clean principles in ~121 lines to achieve what Prometheus's Claude prompt does in ~1,100 lines across 7 files +- GPT-5.2's "conservative grounding bias" and "more deliberate scaffolding" mean it builds clearer plans by default, but needs **explicit decision criteria** (it won't infer what you want) +- The key concept is **"Decision Complete"** — a plan must leave ZERO decisions to the implementer. GPT models follow this literally when stated as a principle, while Claude models need enforcement mechanisms + ### How It Works Some agents detect the assigned model at runtime and switch prompts: @@ -68,10 +80,10 @@ export function getPrometheusPrompt(model?: string): string { ``` **Agents with dual prompts:** -- **Prometheus**: Claude prompt (modular sections) vs GPT prompt (XML-tagged, Codex plan mode style with explicit decision criteria) -- **Atlas**: Claude prompt vs GPT prompt (GPT-optimized todo orchestration) +- **Prometheus**: Claude prompt (~1,100 lines, 7 files, mechanics-driven with checklists and templates) vs GPT prompt (~300 lines, single file, principle-driven with XML structure inspired by Codex Plan Mode) +- **Atlas**: Claude prompt vs GPT prompt (GPT-optimized todo orchestration with explicit scope constraints) -**Why this matters for customization**: If you override Prometheus to use a GPT model, the GPT prompt activates automatically. 
But if you override Sisyphus to use GPT — there is no GPT prompt, and performance will degrade significantly. +**Why this matters for customization**: If you override Prometheus to use a GPT model, the GPT prompt activates automatically — and it's specifically designed for how GPT reasons. But if you override Sisyphus to use GPT — there is no GPT prompt, and performance will degrade significantly because Sisyphus's prompt is deeply tuned for Claude's reasoning style. ### Model Family Detection From 98d39ceea0d0ac5dfa97348c05aac8772c25c0b6 Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 15:09:05 +0900 Subject: [PATCH 5/6] docs: sync agent-model guide with latest catalog changes Update all fallback chains to match current model-requirements.ts: - Librarian: now minimax-m2.5-free -> gemini-flash -> big-pickle (free-tier first) - Explore: add minimax-m2.5-free as #2 after grok-code-fast-1 - Multimodal Looker: reorder to kimi-first (k2p5 -> kimi-free -> flash -> gpt-5.2) - Atlas: remove gemini-3-pro, keep kimi k2.5 -> sonnet -> gpt-5.2 - GLM 4.7 -> GLM 5 everywhere - Add venice provider for grok, opencode provider for glm-5 Add design philosophy section explaining the intelligence hierarchy: premium models for core agents, free-tier for utility agents, balanced for orchestrators. Document why utility agents intentionally use cheap models and why Kimi K2.5 appears as primary for multiple agents. --- docs/guide/agent-model-matching.md | 79 ++++++++++++++++++++---------- 1 file changed, 52 insertions(+), 27 deletions(-) diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md index 889b19d4b..6b564d43f 100644 --- a/docs/guide/agent-model-matching.md +++ b/docs/guide/agent-model-matching.md @@ -12,44 +12,58 @@ Each oh-my-opencode agent has a **dedicated system prompt** optimized for a spec --- +## Design Philosophy: Intelligence Where It Matters, Speed Everywhere Else + +The model catalog follows a clear hierarchy: + +1. 
**Core agents get premium models** — Sisyphus (Claude Opus), Hephaestus (GPT-5.3-codex), Prometheus (Opus/GPT-5.2). These agents handle complex multi-step reasoning where model quality directly impacts output. + +2. **Utility agents get fast, free-tier models** — Explore (Grok Code Fast → MiniMax M2.5 Free), Librarian (MiniMax M2.5 Free → Gemini Flash → Big Pickle). These agents do search, grep, and doc retrieval where speed matters more than deep reasoning. + +3. **Orchestrator agents get balanced models** — Atlas (Kimi K2.5 → Sonnet), Metis (Opus → Kimi K2.5). These need good instruction-following but don't need maximum intelligence. + +4. **Free-tier models are first-class citizens** — MiniMax M2.5 Free, Big Pickle, GPT-5-Nano, and Kimi K2.5 Free appear throughout fallback chains. This means oh-my-opencode works well even with OpenCode Zen (free) as the only provider. + +--- + ## Agent-Model Map (Source of Truth) This table reflects the actual fallback chains in `src/shared/model-requirements.ts`. The first available model in the chain is used. ### Core Agents -| Agent | Role | Primary Model Family | Fallback Chain | Has GPT Prompt? 
| -|-------|------|---------------------|----------------|-----------------| -| **Sisyphus** | Main ultraworker | Claude | Opus → Kimi K2.5 → GLM 5 → Big Pickle | No — **never use GPT** | -| **Hephaestus** | Deep autonomous worker | GPT (only) | GPT-5.3-codex (medium) | N/A (GPT-native) | -| **Prometheus** | Strategic planner | Claude (default), GPT (auto-detected) | Opus → GPT-5.2 → Kimi K2.5 → Gemini 3 Pro | **Yes** — `src/agents/prometheus/gpt.ts` | -| **Atlas** | Todo orchestrator | Kimi K2.5 (default), GPT (auto-detected) | Kimi K2.5 → Sonnet → GPT-5.2 | **Yes** — `src/agents/atlas/gpt.ts` | -| **Oracle** | Architecture/debugging | GPT | GPT-5.2 → Gemini 3 Pro → Opus | No | -| **Metis** | Plan review consultant | Claude | Opus → Kimi K2.5 → GPT-5.2 → Gemini 3 Pro | No | -| **Momus** | High-accuracy reviewer | GPT | GPT-5.2 → Opus → Gemini 3 Pro | No | +| Agent | Role | Fallback Chain (in order) | Has GPT Prompt? | +|-------|------|---------------------------|-----------------| +| **Sisyphus** | Main ultraworker | Opus (max) → Kimi K2.5 → Kimi K2.5 Free → GLM 5 → Big Pickle | No — **never use GPT** | +| **Hephaestus** | Deep autonomous worker | GPT-5.3-codex (medium) — no fallback | N/A (GPT-native) | +| **Prometheus** | Strategic planner | Opus (max) → **GPT-5.2 (high)** → Kimi K2.5 → Kimi K2.5 Free → Gemini 3 Pro | **Yes** — auto-switches | +| **Atlas** | Todo orchestrator | **Kimi K2.5** → Kimi K2.5 Free → Sonnet → GPT-5.2 | **Yes** — auto-switches | +| **Oracle** | Architecture/debugging | GPT-5.2 (high) → Gemini 3 Pro (high) → Opus (max) | No | +| **Metis** | Plan review consultant | Opus (max) → Kimi K2.5 → Kimi K2.5 Free → GPT-5.2 (high) → Gemini 3 Pro (high) | No | +| **Momus** | High-accuracy reviewer | GPT-5.2 (medium) → Opus (max) → Gemini 3 Pro (high) | No | ### Utility Agents -| Agent | Role | Primary Model Family | Fallback Chain | -|-------|------|---------------------|----------------| -| **Explore** | Fast codebase grep | Grok/lightweight | 
Grok Code Fast 1 → MiniMax M2.5 → Haiku → GPT-5-nano | -| **Librarian** | Docs/code search | Lightweight | MiniMax M2.5 → Gemini 3 Flash → Big Pickle | -| **Multimodal Looker** | Vision/screenshots | Kimi/multimodal | Kimi K2.5 → Gemini 3 Flash → GPT-5.2 → GLM-4.6v | +| Agent | Role | Fallback Chain (in order) | Design Rationale | +|-------|------|---------------------------|------------------| +| **Explore** | Fast codebase grep | Grok Code Fast 1 → **MiniMax M2.5 Free** → Haiku → GPT-5-Nano | Speed over intelligence. Grok Code is fastest for grep-style work. MiniMax Free as cheap fallback. | +| **Librarian** | Docs/code search | **MiniMax M2.5 Free** → Gemini 3 Flash → Big Pickle | Entirely free-tier chain. Doc retrieval doesn't need Opus-level reasoning. | +| **Multimodal Looker** | Vision/screenshots | **Kimi K2.5** → Kimi K2.5 Free → Gemini 3 Flash → GPT-5.2 → GLM-4.6v | Kimi excels at multimodal. Gemini Flash as lightweight vision fallback. | ### Task Categories Categories are used for `background_task` and `delegate_task` dispatching: -| Category | Purpose | Primary Model | Notes | -|----------|---------|---------------|-------| -| `visual-engineering` | Frontend/UI work | Gemini 3 Pro | Gemini excels at visual tasks | -| `ultrabrain` | Maximum intelligence | GPT-5.3-codex (xhigh) | Highest reasoning variant | -| `deep` | Deep coding | GPT-5.3-codex (medium) | Requires GPT availability | -| `artistry` | Creative/design | Gemini 3 Pro | Requires Gemini availability | -| `quick` | Fast simple tasks | Claude Haiku | Cheapest, fastest | -| `unspecified-high` | General high-quality | Claude Opus | Default for complex tasks | -| `unspecified-low` | General standard | Claude Sonnet | Default for standard tasks | -| `writing` | Text/docs | Kimi K2.5 | Best prose quality | +| Category | Purpose | Fallback Chain | Notes | +|----------|---------|----------------|-------| +| `visual-engineering` | Frontend/UI work | Gemini 3 Pro (high) → GLM 5 → Opus (max) → Kimi K2.5 | 
Gemini excels at visual tasks | +| `ultrabrain` | Maximum intelligence | GPT-5.3-codex (xhigh) → Gemini 3 Pro (high) → Opus (max) | Highest reasoning variant | +| `deep` | Deep coding | GPT-5.3-codex (medium) → Opus (max) → Gemini 3 Pro (high) | Requires GPT availability | +| `artistry` | Creative/design | Gemini 3 Pro (high) → Opus (max) → GPT-5.2 | Requires Gemini availability | +| `quick` | Fast simple tasks | Haiku → Gemini 3 Flash → GPT-5-Nano | Cheapest, fastest | +| `unspecified-high` | General high-quality | Opus (max) → GPT-5.2 (high) → Gemini 3 Pro | Default for complex tasks | +| `unspecified-low` | General standard | Sonnet → GPT-5.3-codex (medium) → Gemini 3 Flash | Default for standard tasks | +| `writing` | Text/docs | **Kimi K2.5** → Gemini 3 Flash → Sonnet | Kimi produces best prose quality | --- @@ -129,6 +143,8 @@ These swaps are safe because they stay within the same prompt family: | **Prometheus** | Claude Opus | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | | **Atlas** | Kimi K2.5 | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | | **Oracle** | GPT-5.2 | Gemini 3 Pro, Claude Opus | +| **Librarian** | MiniMax M2.5 Free | Gemini 3 Flash, Big Pickle, any lightweight model | +| **Explore** | Grok Code Fast 1 | MiniMax M2.5 Free, Haiku, GPT-5-Nano — speed is key | ### Dangerous Substitutions (cross-family without prompt support) @@ -137,6 +153,7 @@ These swaps are safe because they stay within the same prompt family: | **Sisyphus** → GPT | No GPT-optimized prompt exists. Sisyphus is deeply tuned for Claude-style reasoning. Performance drops dramatically. | | **Hephaestus** → Claude | Hephaestus is purpose-built for GPT's Codex capabilities. Claude cannot replicate this. | | **Explore** → Opus | Massive overkill and cost waste. Explore needs speed, not intelligence. | +| **Librarian** → Opus | Same — doc retrieval is a search task, not a reasoning task. Opus is wasted here. 
| ### Explaining to Users @@ -146,8 +163,10 @@ When a user asks about model configuration, explain: 2. **Each agent has a "home" model family** — Sisyphus is Claude-native, Hephaestus is GPT-native 3. **Some agents auto-adapt** — Prometheus and Atlas detect GPT models and switch to optimized prompts 4. **Cross-family overrides are risky** — unless the agent has a dedicated prompt for that family -5. **Cost optimization is valid** — swapping Opus → Sonnet or Kimi K2.5 for Sisyphus is fine and saves money -6. **Point to this guide** for the full fallback chains and rationale +5. **Cost optimization is valid** — swapping Opus → Sonnet or Kimi K2.5 for Sisyphus saves money with acceptable quality trade-off +6. **Utility agents are intentionally cheap** — Librarian and Explore use free-tier models by design. Don't "upgrade" them to Opus thinking it'll help — it just wastes tokens on simple search tasks +7. **Kimi K2.5 is a versatile workhorse** — it appears as primary for Atlas (orchestration), Multimodal Looker (vision), and writing tasks. It's consistently good across these roles without being expensive. +8. **Point to this guide** for the full fallback chains and rationale --- @@ -161,6 +180,11 @@ Native (anthropic/, openai/, google/) > Kimi for Coding > GitHub Copilot > OpenC Each fallback chain entry specifies which providers can serve that model. The system picks the first entry where at least one provider is connected. +**Notable provider mappings:** +- `venice` — alternative provider for Grok Code Fast 1 (Explore agent) +- `opencode` — serves free-tier models (Kimi K2.5 Free, MiniMax M2.5 Free, Big Pickle, GPT-5-Nano) and premium models via OpenCode Zen +- `zai-coding-plan` — GLM 5 and GLM-4.6v models + --- ## Quick Decision Tree for Users @@ -172,8 +196,9 @@ What subscriptions do you have? ├── OpenAI/ChatGPT → Hephaestus unlocked. Oracle/Momus use GPT. Prometheus auto-switches. ├── Both Claude + OpenAI → Full agent roster. Best experience. 
├── Gemini only → Visual-engineering category excels. Other agents use Gemini as fallback. +├── Kimi for Coding → Atlas, Multimodal Looker, writing tasks work great. Sisyphus usable. ├── GitHub Copilot only → Works as fallback provider for all model families. -├── OpenCode Zen only → Free-tier access to multiple models. Functional but rate-limited. +├── OpenCode Zen only → Free-tier access. Librarian/Explore work perfectly. Core agents functional but rate-limited. └── No subscription → Limited functionality. Consider OpenCode Zen (free). For each user scenario, the installer (`bunx oh-my-opencode install`) auto-configures the optimal assignment. From 6909e5fb4c631002204b8dbad317dd36bfdc7b7b Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 15:17:41 +0900 Subject: [PATCH 6/6] docs: restructure agent-model guide by model family and role Complete rewrite organized around model families, agent roles, task categories, and selection priority rules. - Model families: Claude-like (Kimi, GLM/Big Pickle), GPT, different-behavior (Gemini, MiniMax), speed-focused (Grok, Spark) - Agent roles: Claude-optimized, dual-prompt, GPT-native, utility - gpt-5.3-codex-spark: extremely fast but compacts too aggressively - Big Pickle = GLM 4.6 - Explicit guidance: do not upgrade utility agents to Opus - opencode models / opencode auth login references at top - Link to orchestration system guide for task categories --- docs/guide/agent-model-matching.md | 281 +++++++++++++---------------- 1 file changed, 130 insertions(+), 151 deletions(-) diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md index 6b564d43f..0d74538ca 100644 --- a/docs/guide/agent-model-matching.md +++ b/docs/guide/agent-model-matching.md @@ -1,214 +1,193 @@ # Agent-Model Matching Guide -> **For agents and users**: This document explains the principles behind oh-my-opencode's agent-model assignments. 
Use it to understand why each agent uses a specific model, and how to customize them correctly. +> **For agents and users**: How to pick the right model for each agent. Read this before customizing model settings. + +Run `opencode models` to see all available models on your system, and `opencode auth login` to authenticate with providers. --- -## Why Model Matching Matters +## Model Families: Know Your Options -Each oh-my-opencode agent has a **dedicated system prompt** optimized for a specific model family. Some agents (Atlas, Prometheus) ship separate prompts for GPT vs Claude models, with automatic routing via `isGptModel()` detection. Assigning the wrong model family to an agent doesn't just degrade performance — the agent may receive instructions formatted for a completely different model's reasoning style. +Not all models behave the same way. Understanding which models are "similar" helps you make safe substitutions. -**Key principle**: Agents are tuned to model families, not individual models. A Claude-tuned agent works with Opus, Sonnet, or Haiku. A GPT-tuned agent works with GPT-5.2 or GPT-5.3-codex. Crossing families requires a model-specific prompt (which only some agents have). +### Claude-like Models (instruction-following, structured output) + +These models respond similarly to Claude and work well with oh-my-opencode's Claude-optimized prompts: + +| Model | Provider(s) | Notes | +|-------|-------------|-------| +| **Claude Opus 4.6** | anthropic, github-copilot, opencode | Best overall. Default for Sisyphus. | +| **Claude Sonnet 4.6** | anthropic, github-copilot, opencode | Faster, cheaper. Good balance. | +| **Claude Haiku 4.5** | anthropic, opencode | Fast and cheap. Good for quick tasks. | +| **Kimi K2.5** | kimi-for-coding | Behaves very similarly to Claude. Great all-rounder. Default for Atlas. | +| **Kimi K2.5 Free** | opencode | Free-tier Kimi. Rate-limited but functional. | +| **GLM 5** | zai-coding-plan, opencode | Claude-like behavior. 
Good for broad tasks. | +| **Big Pickle (GLM 4.6)** | opencode | Free-tier GLM. Decent fallback. | + +### GPT Models (explicit reasoning, principle-driven) + +GPT models need differently structured prompts. Some agents auto-detect GPT and switch prompts: + +| Model | Provider(s) | Notes | +|-------|-------------|-------| +| **GPT-5.3-codex** | openai, github-copilot, opencode | Deep coding powerhouse. Required for Hephaestus. | +| **GPT-5.2** | openai, github-copilot, opencode | High intelligence. Default for Oracle. | +| **GPT-5-Nano** | opencode | Ultra-cheap, fast. Good for simple utility tasks. | + +### Different-Behavior Models + +These models have unique characteristics — don't assume they'll behave like Claude or GPT: + +| Model | Provider(s) | Notes | +|-------|-------------|-------| +| **Gemini 3 Pro** | google, github-copilot, opencode | Excels at visual/frontend tasks. Different reasoning style. | +| **Gemini 3 Flash** | google, github-copilot, opencode | Fast, good for doc search and light tasks. | +| **MiniMax M2.5** | venice | Fast and smart. Good for utility tasks. | +| **MiniMax M2.5 Free** | opencode | Free-tier MiniMax. Fast for search/retrieval. | + +### Speed-Focused Models + +| Model | Provider(s) | Speed | Notes | +|-------|-------------|-------|-------| +| **Grok Code Fast 1** | github-copilot, venice | Very fast | Optimized for code grep/search. Default for Explore. | +| **Claude Haiku 4.5** | anthropic, opencode | Fast | Good balance of speed and intelligence. | +| **MiniMax M2.5 (Free)** | opencode, venice | Fast | Smart for its speed class. | +| **GPT-5.3-codex-spark** | openai | Extremely fast | Blazing fast but compacts so aggressively that oh-my-opencode's context management doesn't work well with it. Not recommended for omo agents. | --- -## Design Philosophy: Intelligence Where It Matters, Speed Everywhere Else +## Agent Roles and Recommended Models -The model catalog follows a clear hierarchy: +### Claude-Optimized Agents -1. 
**Core agents get premium models** — Sisyphus (Claude Opus), Hephaestus (GPT-5.3-codex), Prometheus (Opus/GPT-5.2). These agents handle complex multi-step reasoning where model quality directly impacts output. +These agents have prompts tuned for Claude-family models. Use Claude > Kimi K2.5 > GLM 5 in that priority order. -2. **Utility agents get fast, free-tier models** — Explore (Grok Code Fast → MiniMax M2.5 Free), Librarian (MiniMax M2.5 Free → Gemini Flash → Big Pickle). These agents do search, grep, and doc retrieval where speed matters more than deep reasoning. +| Agent | Role | Default Chain | What It Does | +|-------|------|---------------|--------------| +| **Sisyphus** | Main ultraworker | Opus (max) → Kimi K2.5 → GLM 5 → Big Pickle | Primary coding agent. Orchestrates everything. **Never use GPT — no GPT prompt exists.** | +| **Metis** | Plan review | Opus (max) → Kimi K2.5 → GPT-5.2 → Gemini 3 Pro | Reviews Prometheus plans for gaps. | -3. **Orchestrator agents get balanced models** — Atlas (Kimi K2.5 → Sonnet), Metis (Opus → Kimi K2.5). These need good instruction-following but don't need maximum intelligence. +### Dual-Prompt Agents (Claude + GPT auto-switch) -4. **Free-tier models are first-class citizens** — MiniMax M2.5 Free, Big Pickle, GPT-5-Nano, and Kimi K2.5 Free appear throughout fallback chains. This means oh-my-opencode works well even with OpenCode Zen (free) as the only provider. +These agents detect your model family at runtime and switch to the appropriate prompt. If you have GPT access, these agents can use it effectively. + +Priority: **Claude > GPT > Claude-like models** + +| Agent | Role | Default Chain | GPT Prompt? 
| +|-------|------|---------------|-------------| +| **Prometheus** | Strategic planner | Opus (max) → **GPT-5.2 (high)** → Kimi K2.5 → Gemini 3 Pro | Yes — XML-tagged, principle-driven (~300 lines vs ~1,100 Claude) | +| **Atlas** | Todo orchestrator | **Kimi K2.5** → Sonnet → GPT-5.2 | Yes — GPT-optimized todo management | + +### GPT-Native Agents + +These agents are built for GPT. Don't override to Claude. + +| Agent | Role | Default Chain | Notes | +|-------|------|---------------|-------| +| **Hephaestus** | Deep autonomous worker | GPT-5.3-codex (medium) only | "Codex on steroids." No fallback. Requires GPT access. | +| **Oracle** | Architecture/debugging | GPT-5.2 (high) → Gemini 3 Pro → Opus | High-IQ strategic backup. GPT preferred. | +| **Momus** | High-accuracy reviewer | GPT-5.2 (medium) → Opus → Gemini 3 Pro | Verification agent. GPT preferred. | + +### Utility Agents (Speed > Intelligence) + +These agents do search, grep, and retrieval. They intentionally use fast, cheap models. **Don't "upgrade" them to Opus — it wastes tokens on simple tasks.** + +| Agent | Role | Default Chain | Design Rationale | +|-------|------|---------------|------------------| +| **Explore** | Fast codebase grep | MiniMax M2.5 Free → Grok Code Fast → MiniMax M2.5 → Haiku → GPT-5-Nano | Speed is everything. MiniMax Free leads the default chain; Grok is blazing fast for grep. | +| **Librarian** | Docs/code search | MiniMax M2.5 Free → Gemini Flash → Big Pickle | Entirely free-tier. Doc retrieval doesn't need deep reasoning. | +| **Multimodal Looker** | Vision/screenshots | Kimi K2.5 → Kimi Free → Gemini Flash → GPT-5.2 → GLM-4.6v | Kimi excels at multimodal understanding. | + +--- + +## Task Categories + +Categories control which model is used for `background_task` and `delegate_task`.
See the [Orchestration System Guide](./understanding-orchestration-system.md) for how agents dispatch tasks to categories. -### Core Agents - -| Agent | Role | Fallback Chain (in order) | Has GPT Prompt? | -|-------|------|---------------------------|-----------------| -| **Sisyphus** | Main ultraworker | Opus (max) → Kimi K2.5 → Kimi K2.5 Free → GLM 5 → Big Pickle | No — **never use GPT** | -| **Hephaestus** | Deep autonomous worker | GPT-5.3-codex (medium) — no fallback | N/A (GPT-native) | -| **Prometheus** | Strategic planner | Opus (max) → **GPT-5.2 (high)** → Kimi K2.5 → Kimi K2.5 Free → Gemini 3 Pro | **Yes** — auto-switches | -| **Atlas** | Todo orchestrator | **Kimi K2.5** → Kimi K2.5 Free → Sonnet → GPT-5.2 | **Yes** — auto-switches | -| **Oracle** | Architecture/debugging | GPT-5.2 (high) → Gemini 3 Pro (high) → Opus (max) | No | -| **Metis** | Plan review consultant | Opus (max) → Kimi K2.5 → Kimi K2.5 Free → GPT-5.2 (high) → Gemini 3 Pro (high) | No | -| **Momus** | High-accuracy reviewer | GPT-5.2 (medium) → Opus (max) → Gemini 3 Pro (high) | No | - -### Utility Agents - -| Agent | Role | Fallback Chain (in order) | Design Rationale | -|-------|------|---------------------------|------------------| -| **Explore** | Fast codebase grep | Grok Code Fast 1 → **MiniMax M2.5 Free** → Haiku → GPT-5-Nano | Speed over intelligence. Grok Code is fastest for grep-style work. MiniMax Free as cheap fallback. | -| **Librarian** | Docs/code search | **MiniMax M2.5 Free** → Gemini 3 Flash → Big Pickle | Entirely free-tier chain. Doc retrieval doesn't need Opus-level reasoning. | -| **Multimodal Looker** | Vision/screenshots | **Kimi K2.5** → Kimi K2.5 Free → Gemini 3 Flash → GPT-5.2 → GLM-4.6v | Kimi excels at multimodal. Gemini Flash as lightweight vision fallback. 
| - -### Task Categories - -Categories are used for `background_task` and `delegate_task` dispatching: - -| Category | Purpose | Fallback Chain | Notes | -|----------|---------|----------------|-------| -| `visual-engineering` | Frontend/UI work | Gemini 3 Pro (high) → GLM 5 → Opus (max) → Kimi K2.5 | Gemini excels at visual tasks | -| `ultrabrain` | Maximum intelligence | GPT-5.3-codex (xhigh) → Gemini 3 Pro (high) → Opus (max) | Highest reasoning variant | -| `deep` | Deep coding | GPT-5.3-codex (medium) → Opus (max) → Gemini 3 Pro (high) | Requires GPT availability | -| `artistry` | Creative/design | Gemini 3 Pro (high) → Opus (max) → GPT-5.2 | Requires Gemini availability | -| `quick` | Fast simple tasks | Haiku → Gemini 3 Flash → GPT-5-Nano | Cheapest, fastest | -| `unspecified-high` | General high-quality | Opus (max) → GPT-5.2 (high) → Gemini 3 Pro | Default for complex tasks | -| `unspecified-low` | General standard | Sonnet → GPT-5.3-codex (medium) → Gemini 3 Flash | Default for standard tasks | -| `writing` | Text/docs | **Kimi K2.5** → Gemini 3 Flash → Sonnet | Kimi produces best prose quality | +| Category | When Used | Recommended Models | Notes | +|----------|-----------|-------------------|-------| +| `visual-engineering` | Frontend, UI, CSS, design | Gemini 3 Pro (high) → GLM 5 → Opus → Kimi K2.5 | Gemini dominates visual tasks | +| `ultrabrain` | Maximum reasoning needed | GPT-5.3-codex (xhigh) → Gemini 3 Pro → Opus | Highest intelligence available | +| `deep` | Deep coding, complex logic | GPT-5.3-codex (medium) → Opus → Gemini 3 Pro | Requires GPT availability | +| `artistry` | Creative, novel approaches | Gemini 3 Pro (high) → Opus → GPT-5.2 | Requires Gemini availability | +| `quick` | Simple, fast tasks | Haiku → Gemini Flash → GPT-5-Nano | Cheapest and fastest | +| `unspecified-high` | General complex work | Opus (max) → GPT-5.2 (high) → Gemini 3 Pro | Default when no category fits | +| `unspecified-low` | General standard work | Sonnet → 
GPT-5.3-codex (medium) → Gemini Flash | Everyday tasks | +| `writing` | Text, docs, prose | Kimi K2.5 → Gemini Flash → Sonnet | Kimi produces best prose | --- -## Model-Specific Prompt Routing - -### Why Different Models Need Different Prompts +## Why Different Models Need Different Prompts Claude and GPT models have fundamentally different instruction-following behaviors: -- **Claude models** respond well to **mechanics-driven** prompts — detailed checklists, templates, step-by-step procedures, and explicit anti-patterns. More rules = more compliance. -- **GPT models** (especially 5.2+) have **stronger instruction adherence** and respond better to **principle-driven** prompts — concise principles, XML-tagged structure, explicit decision criteria. More rules = more contradiction surface area = more drift. +- **Claude models** respond well to **mechanics-driven** prompts — detailed checklists, templates, step-by-step procedures. More rules = more compliance. +- **GPT models** (especially 5.2+) respond better to **principle-driven** prompts — concise principles, XML-tagged structure, explicit decision criteria. More rules = more contradiction surface = more drift. -This insight comes from analyzing OpenAI's Codex Plan Mode prompt alongside the GPT-5.2 Prompting Guide: -- Codex Plan Mode uses 3 clean principles in ~121 lines to achieve what Prometheus's Claude prompt does in ~1,100 lines across 7 files -- GPT-5.2's "conservative grounding bias" and "more deliberate scaffolding" mean it builds clearer plans by default, but needs **explicit decision criteria** (it won't infer what you want) -- The key concept is **"Decision Complete"** — a plan must leave ZERO decisions to the implementer. 
GPT models follow this literally when stated as a principle, while Claude models need enforcement mechanisms +Key insight from Codex Plan Mode analysis: +- Codex Plan Mode achieves in ~121 lines, with 3 clean principles, what Prometheus's Claude prompt needs ~1,100 lines across 7 files to express +- The core concept is **"Decision Complete"** — a plan must leave ZERO decisions to the implementer +- GPT follows this literally when stated as a principle; Claude needs enforcement mechanisms -### How It Works - -Some agents detect the assigned model at runtime and switch prompts: - -```typescript -// From src/agents/prometheus/system-prompt.ts -export function getPrometheusPrompt(model?: string): string { - if (model && isGptModel(model)) return getGptPrometheusPrompt() // XML-tagged, principle-driven - return PROMETHEUS_SYSTEM_PROMPT // Claude-optimized, modular sections -} -``` - -**Agents with dual prompts:** -- **Prometheus**: Claude prompt (~1,100 lines, 7 files, mechanics-driven with checklists and templates) vs GPT prompt (~300 lines, single file, principle-driven with XML structure inspired by Codex Plan Mode) -- **Atlas**: Claude prompt vs GPT prompt (GPT-optimized todo orchestration with explicit scope constraints) - -**Why this matters for customization**: If you override Prometheus to use a GPT model, the GPT prompt activates automatically — and it's specifically designed for how GPT reasons. But if you override Sisyphus to use GPT — there is no GPT prompt, and performance will degrade significantly because Sisyphus's prompt is deeply tuned for Claude's reasoning style. - -### Model Family Detection - -`isGptModel()` matches: -- Any model starting with `openai/` or `github-copilot/gpt-` -- Model names starting with common GPT prefixes (`gpt-`, `o1-`, `o3-`, `o4-`, `codex-`) - -Everything else is treated as "Claude-like" (Claude, Kimi, GLM, Gemini).
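+For readers customizing models by hand, the detection rule can be sketched in a few lines. This is an illustrative sketch only — it mirrors the prefix rules documented in this guide, not the exact code in `src/shared` (function and constant names here are simplified):
+
+```typescript
+// Illustrative sketch of the isGptModel() routing described in this guide —
+// it mirrors the documented prefix rules, not the exact src/shared code.
+const GPT_NAME_PREFIXES = ["gpt-", "o1-", "o3-", "o4-", "codex-"];
+
+function isGptModel(model: string): boolean {
+  // Provider-qualified forms that always indicate a GPT model
+  if (model.startsWith("openai/") || model.startsWith("github-copilot/gpt-")) {
+    return true;
+  }
+  // Otherwise inspect the bare model name after the provider segment
+  const name = model.includes("/") ? model.slice(model.indexOf("/") + 1) : model;
+  return GPT_NAME_PREFIXES.some((prefix) => name.startsWith(prefix));
+}
+
+// Dual-prompt agents (Prometheus, Atlas) route on this check at runtime;
+// anything that fails it is treated as "Claude-like".
+function pickPromptFamily(model?: string): "gpt" | "claude" {
+  return model && isGptModel(model) ? "gpt" : "claude";
+}
+```
+
+Under these rules, `pickPromptFamily("openai/gpt-5.2")` resolves to the GPT prompt, while a model like `kimi-for-coding/k2p5` falls through to the Claude-style prompt.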
+This is why Prometheus and Atlas ship separate prompts per model family — they auto-detect and switch at runtime via `isGptModel()`. --- ## Customization Guide -### When to Customize - -Customize model assignments when: -- You have a specific provider subscription (e.g., only OpenAI, no Anthropic) -- You want to use a cheaper model for certain agents -- You're experimenting with new models - ### How to Customize -Override in `oh-my-opencode.json` (user: `~/.config/opencode/oh-my-opencode.json`, project: `.opencode/oh-my-opencode.json`): +Override in `oh-my-opencode.json`: ```jsonc { "agents": { "sisyphus": { "model": "kimi-for-coding/k2p5" }, - "atlas": { "model": "anthropic/claude-sonnet-4-6" }, - "prometheus": { "model": "openai/gpt-5.2" } // Will auto-switch to GPT prompt + "prometheus": { "model": "openai/gpt-5.2" } // Auto-switches to GPT prompt } } ``` -### Safe Substitutions (same model family) +### Selection Priority -These swaps are safe because they stay within the same prompt family: +When choosing models for Claude-optimized agents: -| Agent | Default | Safe Alternatives | -|-------|---------|-------------------| -| **Sisyphus** | Claude Opus | Claude Sonnet, Kimi K2.5, GLM 5 (any Claude-like) | -| **Hephaestus** | GPT-5.3-codex | No alternatives — GPT only | -| **Prometheus** | Claude Opus | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | -| **Atlas** | Kimi K2.5 | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | -| **Oracle** | GPT-5.2 | Gemini 3 Pro, Claude Opus | -| **Librarian** | MiniMax M2.5 Free | Gemini 3 Flash, Big Pickle, any lightweight model | -| **Explore** | Grok Code Fast 1 | MiniMax M2.5 Free, Haiku, GPT-5-Nano — speed is key | +``` +Claude (Opus/Sonnet) > GPT (if agent has dual prompt) > Claude-like (Kimi K2.5, GLM 5) +``` -### Dangerous Substitutions (cross-family without prompt support) +When choosing models for GPT-native agents: -| Agent | Dangerous Override | Why | 
-|-------|-------------------|-----| -| **Sisyphus** → GPT | No GPT-optimized prompt exists. Sisyphus is deeply tuned for Claude-style reasoning. Performance drops dramatically. | -| **Hephaestus** → Claude | Hephaestus is purpose-built for GPT's Codex capabilities. Claude cannot replicate this. | -| **Explore** → Opus | Massive overkill and cost waste. Explore needs speed, not intelligence. | -| **Librarian** → Opus | Same — doc retrieval is a search task, not a reasoning task. Opus is wasted here. | +``` +GPT (5.3-codex, 5.2) > Claude Opus (decent fallback) > Gemini (acceptable) +``` -### Explaining to Users +### Safe vs Dangerous Overrides -When a user asks about model configuration, explain: +**Safe** (same family): +- Sisyphus: Opus → Sonnet, Kimi K2.5, GLM 5 +- Prometheus: Opus → GPT-5.2 (auto-switches prompt) +- Atlas: Kimi K2.5 → Sonnet, GPT-5.2 (auto-switches) -1. **The default works out of the box** — the installer configures optimal models based on their subscriptions -2. **Each agent has a "home" model family** — Sisyphus is Claude-native, Hephaestus is GPT-native -3. **Some agents auto-adapt** — Prometheus and Atlas detect GPT models and switch to optimized prompts -4. **Cross-family overrides are risky** — unless the agent has a dedicated prompt for that family -5. **Cost optimization is valid** — swapping Opus → Sonnet or Kimi K2.5 for Sisyphus saves money with acceptable quality trade-off -6. **Utility agents are intentionally cheap** — Librarian and Explore use free-tier models by design. Don't "upgrade" them to Opus thinking it'll help — it just wastes tokens on simple search tasks -7. **Kimi K2.5 is a versatile workhorse** — it appears as primary for Atlas (orchestration), Multimodal Looker (vision), and writing tasks. It's consistently good across these roles without being expensive. -8. **Point to this guide** for the full fallback chains and rationale +**Dangerous** (no prompt support): +- Sisyphus → GPT: **No GPT prompt. 
Will degrade significantly.** +- Hephaestus → Claude: **Built for Codex. Claude can't replicate this.** +- Explore → Opus: **Massive cost waste. Explore needs speed, not intelligence.** +- Librarian → Opus: **Same. Doc search doesn't need Opus-level reasoning.** --- ## Provider Priority -When multiple providers are available, oh-my-opencode prefers: - ``` -Native (anthropic/, openai/, google/) > Kimi for Coding > GitHub Copilot > OpenCode Zen > Z.ai Coding Plan -``` - -Each fallback chain entry specifies which providers can serve that model. The system picks the first entry where at least one provider is connected. - -**Notable provider mappings:** -- `venice` — alternative provider for Grok Code Fast 1 (Explore agent) -- `opencode` — serves free-tier models (Kimi K2.5 Free, MiniMax M2.5 Free, Big Pickle, GPT-5-Nano) and premium models via OpenCode Zen -- `zai-coding-plan` — GLM 5 and GLM-4.6v models - ---- - -## Quick Decision Tree for Users - -``` -What subscriptions do you have? - -├── Claude (Anthropic) → Sisyphus works optimally. Prometheus/Metis use Claude prompts. -├── OpenAI/ChatGPT → Hephaestus unlocked. Oracle/Momus use GPT. Prometheus auto-switches. -├── Both Claude + OpenAI → Full agent roster. Best experience. -├── Gemini only → Visual-engineering category excels. Other agents use Gemini as fallback. -├── Kimi for Coding → Atlas, Multimodal Looker, writing tasks work great. Sisyphus usable. -├── GitHub Copilot only → Works as fallback provider for all model families. -├── OpenCode Zen only → Free-tier access. Librarian/Explore work perfectly. Core agents functional but rate-limited. -└── No subscription → Limited functionality. Consider OpenCode Zen (free). - -For each user scenario, the installer (`bunx oh-my-opencode install`) auto-configures the optimal assignment. 
+Native (anthropic/, openai/, google/) > Kimi for Coding > GitHub Copilot > Venice > OpenCode Zen > Z.ai Coding Plan ``` --- ## See Also -- [Installation Guide](./installation.md) — Setup with subscription-based model configuration -- [Configuration Reference](../configurations.md) — Full config options including agent overrides -- [Overview](./overview.md) — How the agent system works -- [`src/shared/model-requirements.ts`](../../src/shared/model-requirements.ts) — Source of truth for fallback chains +- [Installation Guide](./installation.md) — Setup and authentication +- [Orchestration System](./understanding-orchestration-system.md) — How agents dispatch tasks to categories +- [Configuration Reference](../configurations.md) — Full config options +- [`src/shared/model-requirements.ts`](../../src/shared/model-requirements.ts) — Source of truth for fallback chains \ No newline at end of file