From b1008510f8efc402606c0093d5feb47cbbed16c6 Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 14:17:53 +0900 Subject: [PATCH 1/6] docs: add agent-model matching guide for newcomers - Add docs/guide/agent-model-matching.md with TL;DR table, detailed breakdown per agent, configuration examples, decision tree, common pitfalls, and default fallback chains - Update README.md to reference the guide in TOC, Just Install This section, and Features overview --- README.md | 6 ++ docs/guide/agent-model-matching.md | 166 +++++++++++++++++++++++++++++ 2 files changed, 172 insertions(+) create mode 100644 docs/guide/agent-model-matching.md diff --git a/README.md b/README.md index aa134f808..5a1e7a617 100644 --- a/README.md +++ b/README.md @@ -115,6 +115,7 @@ Yes, technically possible. But I cannot recommend using it. - [🪄 The Magic Word: `ultrawork`](#-the-magic-word-ultrawork) - [For Those Who Want to Read: Meet Sisyphus](#for-those-who-want-to-read-meet-sisyphus) - [Just Install This](#just-install-this) + - [Which Model Should I Use?](#which-model-should-i-use) - [For Those Who Want Autonomy: Meet Hephaestus](#for-those-who-want-autonomy-meet-hephaestus) - [Installation](#installation) - [For Humans](#for-humans) @@ -222,6 +223,10 @@ Need to look something up? It scours official docs, your entire codebase history If you don't want all this, as mentioned, you can just pick and choose specific features. +#### Which Model Should I Use? + +New to oh-my-opencode and not sure which model to pair with which agent? Check the **[Agent-Model Matching Guide](docs/guide/agent-model-matching.md)** — a quick reference for newcomers covering recommended models, fallback chains, and common pitfalls for each agent. 
+
### For Those Who Want Autonomy: Meet Hephaestus

![Meet Hephaestus](.github/assets/hephaestus.png)

@@ -307,6 +312,7 @@ See the full [Features Documentation](docs/features.md) for detailed information
- **Built-in MCPs**: websearch (Exa), context7 (docs), grep_app (GitHub search)
- **Session Tools**: List, read, search, and analyze session history
- **Productivity Features**: Ralph Loop, Todo Enforcer, Comment Checker, Think Mode, and more
+- **[Agent-Model Matching Guide](docs/guide/agent-model-matching.md)**: Which model works best with which agent

## Configuration

diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md
new file mode 100644
index 000000000..c78aa0b91
--- /dev/null
+++ b/docs/guide/agent-model-matching.md
@@ -0,0 +1,166 @@
+# Agent-Model Matching Guide for Newcomers
+
+> **Quick Reference**: Which model to use with which agent for the best results
+
+This guide helps you match the right AI model to each oh-my-opencode agent based on real-world usage and testing.
+
+---
+
+## TL;DR
+
+| Agent | Best Models | Avoid |
+|-------|-------------|-------|
+| **Sisyphus** | Claude Opus, Sonnet, Kimi K2.5, GLM 5 | GPT ❌ |
+| **Hephaestus** | GPT-5.3-codex only | Non-GPT ❌ |
+| **Prometheus** | Claude Opus | GPT (untested) |
+| **Atlas** | Claude Opus, GPT-5.2+ | — |
+
+---
+
+## Detailed Breakdown
+
+### Sisyphus (ultraworker)
+**Purpose**: Primary orchestrator for complex multi-step tasks
+
+**Recommended Models** (in order of preference):
+1. **Claude Opus-4-6** — The best overall performance
+2. **Claude Sonnet-4-6** — Satisfactory, often better than pure Claude Code + Opus
+3. **Kimi K2.5** — Good for broad tasks, excellent cost-performance
+4. **GLM 5** — Good for various tasks, though not as capable as Kimi on broad ones
+5. **MiniMax** — Budget option when cost matters
+
+**⚠️ NEVER USE GPT** — Sisyphus is optimized for Claude-style models and performs poorly with GPT.
+ +**Configuration Example**: +```json +{ + "agent": { + "sisyphus": { + "model": "anthropic/claude-opus-4-6", + "variant": "max" + } + } +} +``` + +--- + +### Hephaestus (deep worker) +**Purpose**: Deep coding tasks requiring extensive reasoning + +**Required Model**: **GPT-5.3-codex** (always) + +Think of Hephaestus as "Codex on steroids" — it's specifically designed and tuned for GPT models. + +**⚠️ DO NOT USE** if you don't have GPT access. DeepSeek *might* work but is not officially supported. + +**Configuration Example**: +```json +{ + "agent": { + "hephaestus": { + "model": "openai/gpt-5.3-codex", + "variant": "medium" + } + } +} +``` + +--- + +### Prometheus (planner) +**Purpose**: Strategic planning and work plan generation + +**Recommended Model**: **Claude Opus-4-6** (strongly recommended) + +Prometheus is optimized for Claude's reasoning capabilities. GPT compatibility is not yet tested but may be evaluated in the future. + +**Configuration Example**: +```json +{ + "agent": { + "plan": { + "model": "anthropic/claude-opus-4-6", + "variant": "max" + } + } +} +``` + +--- + +### Atlas (orchestrator) +**Purpose**: Todo list orchestration and multi-agent coordination + +**Recommended Models**: +1. **Claude Opus-4-6** — Best performance (recommended) +2. **GPT-5.2+** — Good enough, has GPT-optimized prompt + +Atlas has model-specific prompt detection and will automatically use GPT-optimized instructions when running on GPT models. + +**Configuration Example**: +```json +{ + "agent": { + "atlas": { + "model": "anthropic/claude-opus-4-6" + } + } +} +``` + +--- + +## Quick Decision Tree + +``` +Do you have GPT access? +├── YES → Use Hephaestus for deep coding, Atlas for orchestration +└── NO → Use Sisyphus (Claude/Kimi/GLM) for all tasks + +Need planning/strategy? +├── YES → Use Prometheus (Claude Opus recommended) +└── NO → Skip Prometheus, use other agents directly + +Complex multi-step task? 
+├── YES → Use Sisyphus (Claude-family models) +└── NO → Use category-specific agents or Hephaestus +``` + +--- + +## Common Pitfalls to Avoid + +1. **Don't use GPT with Sisyphus** — Performance will be subpar +2. **Don't use non-GPT with Hephaestus** — It's specifically built for GPT +3. **Don't force Prometheus on GPT** — It's untested; use Claude for now +4. **Don't overthink Atlas** — It adapts automatically to your model + +--- + +## Model Fallback Chains (Default Behavior) + +The system will automatically fall back through these chains if your preferred model is unavailable: + +**Sisyphus**: Opus → Kimi K2.5 → GLM 4.7 → Big Pickle +**Hephaestus**: GPT-5.3-codex only (no fallback) +**Prometheus**: Opus → Kimi K2.5 → GLM 4.7 → GPT-5.2 → Gemini 3 Pro +**Atlas**: Kimi K2.5 → GLM 4.7 → Opus → GPT-5.2 → Gemini 3 Pro + +--- + +## Tips for Newcomers + +- **Start with Sisyphus + Claude Opus** for general tasks +- **Use Hephaestus when you need deep reasoning** (requires GPT) +- **Try GLM 5 or Kimi K2.5** for cost-effective alternatives to Claude +- **Check the model requirements** in your config to avoid mismatches +- **Use `variant: "max"` or `variant: "high"`** for best results on capable models + +--- + +## See Also + +- [AGENTS.md](../AGENTS.md) — Full agent documentation +- [configurations.md](./configurations.md) — Configuration reference +- [orchestration-guide.md](./orchestration-guide.md) — How orchestration works From 3b8846e956a4143453d93bec601eea2e064807e7 Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 14:24:01 +0900 Subject: [PATCH 2/6] fix: correct Atlas model recommendations Atlas primary model is Kimi K2.5, not Opus. Updated TL;DR table and detailed breakdown to reflect actual recommended order: Kimi K2.5 > Sonnet > GPT. 
--- docs/guide/agent-model-matching.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md index c78aa0b91..3d6e24943 100644 --- a/docs/guide/agent-model-matching.md +++ b/docs/guide/agent-model-matching.md @@ -13,7 +13,7 @@ This guide helps you match the right AI model to each oh-my-opencode agent based | **Sisyphus** | Claude Opus, Sonnet, Kimi K2.5, GLM 5 | GPT ❌ | | **Hephaestus** | GPT-5.3-codex only | Non-GPT ❌ | | **Prometheus** | Claude Opus | GPT (untested) | -| **Atlas** | Claude Opus, GPT-5.2+ | — | +| **Atlas** | Kimi K2.5, Claude Sonnet, GPT-5.2+ | — | --- @@ -92,9 +92,10 @@ Prometheus is optimized for Claude's reasoning capabilities. GPT compatibility i ### Atlas (orchestrator) **Purpose**: Todo list orchestration and multi-agent coordination -**Recommended Models**: -1. **Claude Opus-4-6** — Best performance (recommended) -2. **GPT-5.2+** — Good enough, has GPT-optimized prompt +**Recommended Models** (in order of preference): +1. **Kimi K2.5** — Best for Atlas orchestration +2. **Claude Sonnet-4-6** — Strong alternative +3. **GPT-5.2+** — Good enough, has GPT-optimized prompt Atlas has model-specific prompt detection and will automatically use GPT-optimized instructions when running on GPT models. 
@@ -103,7 +104,7 @@ Atlas has model-specific prompt detection and will automatically use GPT-optimiz { "agent": { "atlas": { - "model": "anthropic/claude-opus-4-6" + "model": "kimi/kimi-k2.5" } } } From d9ee0d9c0db8a5743968fe12acae57b47b7a93c3 Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 15:01:34 +0900 Subject: [PATCH 3/6] docs: rewrite agent-model matching as technical guide for agents Rewrite agent-model-matching.md as a technical reference that: - Documents actual fallback chains from model-requirements.ts - Explains model-specific prompt routing (Prometheus/Atlas GPT detection) - Covers safe vs dangerous model substitutions with rationale - Includes task categories (visual-engineering, deep, quick, etc.) - Guides agents on how to explain model choices to users - Adds provider priority chain Also update installation.md to reference the guide when users want custom model configuration, with explanation of what is safe to change and why. --- docs/guide/agent-model-matching.md | 262 +++++++++++++++-------------- docs/guide/installation.md | 12 ++ 2 files changed, 148 insertions(+), 126 deletions(-) diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md index 3d6e24943..27fce9b66 100644 --- a/docs/guide/agent-model-matching.md +++ b/docs/guide/agent-model-matching.md @@ -1,167 +1,177 @@ -# Agent-Model Matching Guide for Newcomers +# Agent-Model Matching Guide -> **Quick Reference**: Which model to use with which agent for the best results - -This guide helps you match the right AI model to each oh-my-opencode agent based on real-world usage and testing. +> **For agents and users**: This document explains the principles behind oh-my-opencode's agent-model assignments. Use it to understand why each agent uses a specific model, and how to customize them correctly. 
--- -## TL;DR +## Why Model Matching Matters -| Agent | Best Models | Avoid | -|-------|-------------|-------| -| **Sisyphus** | Claude Opus, Sonnet, Kimi K2.5, GLM 5 | GPT ❌ | -| **Hephaestus** | GPT-5.3-codex only | Non-GPT ❌ | -| **Prometheus** | Claude Opus | GPT (untested) | -| **Atlas** | Kimi K2.5, Claude Sonnet, GPT-5.2+ | — | +Each oh-my-opencode agent has a **dedicated system prompt** optimized for a specific model family. Some agents (Atlas, Prometheus) ship separate prompts for GPT vs Claude models, with automatic routing via `isGptModel()` detection. Assigning the wrong model family to an agent doesn't just degrade performance — the agent may receive instructions formatted for a completely different model's reasoning style. + +**Key principle**: Agents are tuned to model families, not individual models. A Claude-tuned agent works with Opus, Sonnet, or Haiku. A GPT-tuned agent works with GPT-5.2 or GPT-5.3-codex. Crossing families requires a model-specific prompt (which only some agents have). --- -## Detailed Breakdown +## Agent-Model Map (Source of Truth) -### Sisyphus (ultraworker) -**Purpose**: Primary orchestrator for complex multi-step tasks +This table reflects the actual fallback chains in `src/shared/model-requirements.ts`. The first available model in the chain is used. -**Recommended Models** (in order of preference): -1. **Claude Opus-4-6** — The best overall performance -2. **Claude Sonnet-4-6** — Satisfiable, often better than pure Claude Code + Opus -3. **Kimi K2.5** — Good for broad tasks, excellent cost-performance -4. **GLM 5** — Good for various tasks, not as capable on broad tasks as Kimi -5. **MiniMax** — Budget option when cost matters +### Core Agents -**⚠️ NEVER USE GPT** — Sisyphus is optimized for Claude-style models and performs poorly on GPT. +| Agent | Role | Primary Model Family | Fallback Chain | Has GPT Prompt? 
| +|-------|------|---------------------|----------------|-----------------| +| **Sisyphus** | Main ultraworker | Claude | Opus → Kimi K2.5 → GLM 5 → Big Pickle | No — **never use GPT** | +| **Hephaestus** | Deep autonomous worker | GPT (only) | GPT-5.3-codex (medium) | N/A (GPT-native) | +| **Prometheus** | Strategic planner | Claude (default), GPT (auto-detected) | Opus → GPT-5.2 → Kimi K2.5 → Gemini 3 Pro | **Yes** — `src/agents/prometheus/gpt.ts` | +| **Atlas** | Todo orchestrator | Kimi K2.5 (default), GPT (auto-detected) | Kimi K2.5 → Sonnet → GPT-5.2 | **Yes** — `src/agents/atlas/gpt.ts` | +| **Oracle** | Architecture/debugging | GPT | GPT-5.2 → Gemini 3 Pro → Opus | No | +| **Metis** | Plan review consultant | Claude | Opus → Kimi K2.5 → GPT-5.2 → Gemini 3 Pro | No | +| **Momus** | High-accuracy reviewer | GPT | GPT-5.2 → Opus → Gemini 3 Pro | No | -**Configuration Example**: -```json +### Utility Agents + +| Agent | Role | Primary Model Family | Fallback Chain | +|-------|------|---------------------|----------------| +| **Explore** | Fast codebase grep | Grok/lightweight | Grok Code Fast 1 → MiniMax M2.5 → Haiku → GPT-5-nano | +| **Librarian** | Docs/code search | Lightweight | MiniMax M2.5 → Gemini 3 Flash → Big Pickle | +| **Multimodal Looker** | Vision/screenshots | Kimi/multimodal | Kimi K2.5 → Gemini 3 Flash → GPT-5.2 → GLM-4.6v | + +### Task Categories + +Categories are used for `background_task` and `delegate_task` dispatching: + +| Category | Purpose | Primary Model | Notes | +|----------|---------|---------------|-------| +| `visual-engineering` | Frontend/UI work | Gemini 3 Pro | Gemini excels at visual tasks | +| `ultrabrain` | Maximum intelligence | GPT-5.3-codex (xhigh) | Highest reasoning variant | +| `deep` | Deep coding | GPT-5.3-codex (medium) | Requires GPT availability | +| `artistry` | Creative/design | Gemini 3 Pro | Requires Gemini availability | +| `quick` | Fast simple tasks | Claude Haiku | Cheapest, fastest | +| 
`unspecified-high` | General high-quality | Claude Opus | Default for complex tasks | +| `unspecified-low` | General standard | Claude Sonnet | Default for standard tasks | +| `writing` | Text/docs | Kimi K2.5 | Best prose quality | + +--- + +## Model-Specific Prompt Routing + +### How It Works + +Some agents detect the assigned model at runtime and switch prompts: + +```typescript +// From src/agents/prometheus/system-prompt.ts +export function getPrometheusPrompt(model?: string): string { + if (model && isGptModel(model)) return getGptPrometheusPrompt() // XML-tagged, principle-driven + return PROMETHEUS_SYSTEM_PROMPT // Claude-optimized, modular sections +} +``` + +**Agents with dual prompts:** +- **Prometheus**: Claude prompt (modular sections) vs GPT prompt (XML-tagged, Codex plan mode style with explicit decision criteria) +- **Atlas**: Claude prompt vs GPT prompt (GPT-optimized todo orchestration) + +**Why this matters for customization**: If you override Prometheus to use a GPT model, the GPT prompt activates automatically. But if you override Sisyphus to use GPT — there is no GPT prompt, and performance will degrade significantly. + +### Model Family Detection + +`isGptModel()` matches: +- Any model starting with `openai/` or `github-copilot/gpt-` +- Model names starting with common GPT prefixes (`gpt-`, `o1-`, `o3-`, `o4-`, `codex-`) + +Everything else is treated as "Claude-like" (Claude, Kimi, GLM, Gemini). 
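
Sketched in TypeScript, the detection rules above amount to the following. This is an illustrative reconstruction, not the actual `src/shared` implementation — the constant name and exact structure are assumptions:

```typescript
// Illustrative sketch of the model-family detection described above.
// Names and structure are assumptions; the real logic lives in the source tree.
const GPT_NAME_PREFIXES = ["gpt-", "o1-", "o3-", "o4-", "codex-"];

function isGptModel(model: string): boolean {
  // Provider-qualified matches
  if (model.startsWith("openai/") || model.startsWith("github-copilot/gpt-")) {
    return true;
  }
  // Bare model-name prefixes (strip any provider qualifier first)
  const name = model.split("/").pop() ?? model;
  return GPT_NAME_PREFIXES.some((prefix) => name.startsWith(prefix));
}

isGptModel("openai/gpt-5.2");            // true  → GPT prompt
isGptModel("anthropic/claude-opus-4-6"); // false → Claude-like prompt
isGptModel("kimi/kimi-k2.5");            // false → Claude-like prompt
```

Because anything that fails these checks falls into the "Claude-like" bucket, overriding an agent to Kimi, GLM, or Gemini keeps it on the Claude-family prompt.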
+ +--- + +## Customization Guide + +### When to Customize + +Customize model assignments when: +- You have a specific provider subscription (e.g., only OpenAI, no Anthropic) +- You want to use a cheaper model for certain agents +- You're experimenting with new models + +### How to Customize + +Override in `oh-my-opencode.json` (user: `~/.config/opencode/oh-my-opencode.json`, project: `.opencode/oh-my-opencode.json`): + +```jsonc { - "agent": { - "sisyphus": { - "model": "anthropic/claude-opus-4-6", - "variant": "max" - } + "agents": { + "sisyphus": { "model": "kimi-for-coding/k2p5" }, + "atlas": { "model": "anthropic/claude-sonnet-4-6" }, + "prometheus": { "model": "openai/gpt-5.2" } // Will auto-switch to GPT prompt } } ``` ---- +### Safe Substitutions (same model family) -### Hephaestus (deep worker) -**Purpose**: Deep coding tasks requiring extensive reasoning +These swaps are safe because they stay within the same prompt family: -**Required Model**: **GPT-5.3-codex** (always) +| Agent | Default | Safe Alternatives | +|-------|---------|-------------------| +| **Sisyphus** | Claude Opus | Claude Sonnet, Kimi K2.5, GLM 5 (any Claude-like) | +| **Hephaestus** | GPT-5.3-codex | No alternatives — GPT only | +| **Prometheus** | Claude Opus | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | +| **Atlas** | Kimi K2.5 | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | +| **Oracle** | GPT-5.2 | Gemini 3 Pro, Claude Opus | -Think of Hephaestus as "Codex on steroids" — it's specifically designed and tuned for GPT models. +### Dangerous Substitutions (cross-family without prompt support) -**⚠️ DO NOT USE** if you don't have GPT access. DeepSeek *might* work but is not officially supported. +| Agent | Dangerous Override | Why | +|-------|-------------------|-----| +| **Sisyphus** → GPT | No GPT-optimized prompt exists. Sisyphus is deeply tuned for Claude-style reasoning. Performance drops dramatically. 
| +| **Hephaestus** → Claude | Hephaestus is purpose-built for GPT's Codex capabilities. Claude cannot replicate this. | +| **Explore** → Opus | Massive overkill and cost waste. Explore needs speed, not intelligence. | -**Configuration Example**: -```json -{ - "agent": { - "hephaestus": { - "model": "openai/gpt-5.3-codex", - "variant": "medium" - } - } -} -``` +### Explaining to Users + +When a user asks about model configuration, explain: + +1. **The default works out of the box** — the installer configures optimal models based on their subscriptions +2. **Each agent has a "home" model family** — Sisyphus is Claude-native, Hephaestus is GPT-native +3. **Some agents auto-adapt** — Prometheus and Atlas detect GPT models and switch to optimized prompts +4. **Cross-family overrides are risky** — unless the agent has a dedicated prompt for that family +5. **Cost optimization is valid** — swapping Opus → Sonnet or Kimi K2.5 for Sisyphus is fine and saves money +6. **Point to this guide** for the full fallback chains and rationale --- -### Prometheus (planner) -**Purpose**: Strategic planning and work plan generation +## Provider Priority -**Recommended Model**: **Claude Opus-4-6** (strongly recommended) - -Prometheus is optimized for Claude's reasoning capabilities. GPT compatibility is not yet tested but may be evaluated in the future. - -**Configuration Example**: -```json -{ - "agent": { - "plan": { - "model": "anthropic/claude-opus-4-6", - "variant": "max" - } - } -} -``` - ---- - -### Atlas (orchestrator) -**Purpose**: Todo list orchestration and multi-agent coordination - -**Recommended Models** (in order of preference): -1. **Kimi K2.5** — Best for Atlas orchestration -2. **Claude Sonnet-4-6** — Strong alternative -3. **GPT-5.2+** — Good enough, has GPT-optimized prompt - -Atlas has model-specific prompt detection and will automatically use GPT-optimized instructions when running on GPT models. 
- -**Configuration Example**: -```json -{ - "agent": { - "atlas": { - "model": "kimi/kimi-k2.5" - } - } -} -``` - ---- - -## Quick Decision Tree +When multiple providers are available, oh-my-opencode prefers: ``` -Do you have GPT access? -├── YES → Use Hephaestus for deep coding, Atlas for orchestration -└── NO → Use Sisyphus (Claude/Kimi/GLM) for all tasks - -Need planning/strategy? -├── YES → Use Prometheus (Claude Opus recommended) -└── NO → Skip Prometheus, use other agents directly - -Complex multi-step task? -├── YES → Use Sisyphus (Claude-family models) -└── NO → Use category-specific agents or Hephaestus +Native (anthropic/, openai/, google/) > Kimi for Coding > GitHub Copilot > OpenCode Zen > Z.ai Coding Plan ``` ---- - -## Common Pitfalls to Avoid - -1. **Don't use GPT with Sisyphus** — Performance will be subpar -2. **Don't use non-GPT with Hephaestus** — It's specifically built for GPT -3. **Don't force Prometheus on GPT** — It's untested; use Claude for now -4. **Don't overthink Atlas** — It adapts automatically to your model +Each fallback chain entry specifies which providers can serve that model. The system picks the first entry where at least one provider is connected. --- -## Model Fallback Chains (Default Behavior) +## Quick Decision Tree for Users -The system will automatically fall back through these chains if your preferred model is unavailable: +``` +What subscriptions do you have? -**Sisyphus**: Opus → Kimi K2.5 → GLM 4.7 → Big Pickle -**Hephaestus**: GPT-5.3-codex only (no fallback) -**Prometheus**: Opus → Kimi K2.5 → GLM 4.7 → GPT-5.2 → Gemini 3 Pro -**Atlas**: Kimi K2.5 → GLM 4.7 → Opus → GPT-5.2 → Gemini 3 Pro +├── Claude (Anthropic) → Sisyphus works optimally. Prometheus/Metis use Claude prompts. +├── OpenAI/ChatGPT → Hephaestus unlocked. Oracle/Momus use GPT. Prometheus auto-switches. +├── Both Claude + OpenAI → Full agent roster. Best experience. +├── Gemini only → Visual-engineering category excels. 
Other agents use Gemini as fallback. +├── GitHub Copilot only → Works as fallback provider for all model families. +├── OpenCode Zen only → Free-tier access to multiple models. Functional but rate-limited. +└── No subscription → Limited functionality. Consider OpenCode Zen (free). ---- - -## Tips for Newcomers - -- **Start with Sisyphus + Claude Opus** for general tasks -- **Use Hephaestus when you need deep reasoning** (requires GPT) -- **Try GLM 5 or Kimi K2.5** for cost-effective alternatives to Claude -- **Check the model requirements** in your config to avoid mismatches -- **Use `variant: "max"` or `variant: "high"`** for best results on capable models +For each user scenario, the installer (`bunx oh-my-opencode install`) auto-configures the optimal assignment. +``` --- ## See Also -- [AGENTS.md](../AGENTS.md) — Full agent documentation -- [configurations.md](./configurations.md) — Configuration reference -- [orchestration-guide.md](./orchestration-guide.md) — How orchestration works +- [Installation Guide](./installation.md) — Setup with subscription-based model configuration +- [Configuration Reference](../configurations.md) — Full config options including agent overrides +- [Overview](./overview.md) — How the agent system works +- [`src/shared/model-requirements.ts`](../../src/shared/model-requirements.ts) — Source of truth for fallback chains diff --git a/docs/guide/installation.md b/docs/guide/installation.md index 051887c2d..e8d27cf54 100644 --- a/docs/guide/installation.md +++ b/docs/guide/installation.md @@ -259,6 +259,18 @@ opencode auth login The plugin works perfectly by default. Do not change settings or turn off features without an explicit request. +### Custom Model Configuration + +If the user wants to override which model an agent uses, refer to the **[Agent-Model Matching Guide](./agent-model-matching.md)** before making changes. 
That guide explains: + +- **Why each agent uses its default model** — prompt optimization, model family compatibility +- **Which substitutions are safe** — staying within the same model family (e.g., Opus → Sonnet for Sisyphus) +- **Which substitutions are dangerous** — crossing model families without prompt support (e.g., GPT for Sisyphus) +- **How auto-routing works** — Prometheus and Atlas detect GPT models and switch to GPT-optimized prompts automatically +- **Full fallback chains** — what happens when the preferred model is unavailable + +Always explain to the user *why* a model is assigned to an agent when making custom changes. The guide provides the rationale for every assignment. + ### Verify the setup Read this document again, think about you have done everything correctly. From 36432fe18ed43dfc02b8933ada98341c285991aa Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 15:04:57 +0900 Subject: [PATCH 4/6] docs: add prompt design rationale from Codex plan mode analysis Expand model-specific prompt routing section with insights from the actual Prometheus GPT prompt development session: - Why Claude vs GPT models need fundamentally different prompts - Principle-driven (GPT) vs mechanics-driven (Claude) approach - "Decision Complete" concept from Codex Plan Mode - Why more rules help Claude but hurt GPT (contradiction surface) - Concrete size comparison (1100 lines Claude vs 300 lines GPT) --- docs/guide/agent-model-matching.md | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md index 27fce9b66..889b19d4b 100644 --- a/docs/guide/agent-model-matching.md +++ b/docs/guide/agent-model-matching.md @@ -55,6 +55,18 @@ Categories are used for `background_task` and `delegate_task` dispatching: ## Model-Specific Prompt Routing +### Why Different Models Need Different Prompts + +Claude and GPT models have fundamentally different instruction-following 
behaviors: + +- **Claude models** respond well to **mechanics-driven** prompts — detailed checklists, templates, step-by-step procedures, and explicit anti-patterns. More rules = more compliance. +- **GPT models** (especially 5.2+) have **stronger instruction adherence** and respond better to **principle-driven** prompts — concise principles, XML-tagged structure, explicit decision criteria. More rules = more contradiction surface area = more drift. + +This insight comes from analyzing OpenAI's Codex Plan Mode prompt alongside the GPT-5.2 Prompting Guide: +- Codex Plan Mode uses 3 clean principles in ~121 lines to achieve what Prometheus's Claude prompt does in ~1,100 lines across 7 files +- GPT-5.2's "conservative grounding bias" and "more deliberate scaffolding" mean it builds clearer plans by default, but needs **explicit decision criteria** (it won't infer what you want) +- The key concept is **"Decision Complete"** — a plan must leave ZERO decisions to the implementer. GPT models follow this literally when stated as a principle, while Claude models need enforcement mechanisms + ### How It Works Some agents detect the assigned model at runtime and switch prompts: @@ -68,10 +80,10 @@ export function getPrometheusPrompt(model?: string): string { ``` **Agents with dual prompts:** -- **Prometheus**: Claude prompt (modular sections) vs GPT prompt (XML-tagged, Codex plan mode style with explicit decision criteria) -- **Atlas**: Claude prompt vs GPT prompt (GPT-optimized todo orchestration) +- **Prometheus**: Claude prompt (~1,100 lines, 7 files, mechanics-driven with checklists and templates) vs GPT prompt (~300 lines, single file, principle-driven with XML structure inspired by Codex Plan Mode) +- **Atlas**: Claude prompt vs GPT prompt (GPT-optimized todo orchestration with explicit scope constraints) -**Why this matters for customization**: If you override Prometheus to use a GPT model, the GPT prompt activates automatically. 
But if you override Sisyphus to use GPT — there is no GPT prompt, and performance will degrade significantly. +**Why this matters for customization**: If you override Prometheus to use a GPT model, the GPT prompt activates automatically — and it's specifically designed for how GPT reasons. But if you override Sisyphus to use GPT — there is no GPT prompt, and performance will degrade significantly because Sisyphus's prompt is deeply tuned for Claude's reasoning style. ### Model Family Detection From 98d39ceea0d0ac5dfa97348c05aac8772c25c0b6 Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 15:09:05 +0900 Subject: [PATCH 5/6] docs: sync agent-model guide with latest catalog changes Update all fallback chains to match current model-requirements.ts: - Librarian: now minimax-m2.5-free -> gemini-flash -> big-pickle (free-tier first) - Explore: add minimax-m2.5-free as #2 after grok-code-fast-1 - Multimodal Looker: reorder to kimi-first (k2p5 -> kimi-free -> flash -> gpt-5.2) - Atlas: remove gemini-3-pro, keep kimi k2.5 -> sonnet -> gpt-5.2 - GLM 4.7 -> GLM 5 everywhere - Add venice provider for grok, opencode provider for glm-5 Add design philosophy section explaining the intelligence hierarchy: premium models for core agents, free-tier for utility agents, balanced for orchestrators. Document why utility agents intentionally use cheap models and why Kimi K2.5 appears as primary for multiple agents. --- docs/guide/agent-model-matching.md | 79 ++++++++++++++++++++---------- 1 file changed, 52 insertions(+), 27 deletions(-) diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md index 889b19d4b..6b564d43f 100644 --- a/docs/guide/agent-model-matching.md +++ b/docs/guide/agent-model-matching.md @@ -12,44 +12,58 @@ Each oh-my-opencode agent has a **dedicated system prompt** optimized for a spec --- +## Design Philosophy: Intelligence Where It Matters, Speed Everywhere Else + +The model catalog follows a clear hierarchy: + +1. 
**Core agents get premium models** — Sisyphus (Claude Opus), Hephaestus (GPT-5.3-codex), Prometheus (Opus/GPT-5.2). These agents handle complex multi-step reasoning where model quality directly impacts output. + +2. **Utility agents get fast, free-tier models** — Explore (Grok Code Fast → MiniMax M2.5 Free), Librarian (MiniMax M2.5 Free → Gemini Flash → Big Pickle). These agents do search, grep, and doc retrieval where speed matters more than deep reasoning. + +3. **Orchestrator agents get balanced models** — Atlas (Kimi K2.5 → Sonnet), Metis (Opus → Kimi K2.5). These need good instruction-following but don't need maximum intelligence. + +4. **Free-tier models are first-class citizens** — MiniMax M2.5 Free, Big Pickle, GPT-5-Nano, and Kimi K2.5 Free appear throughout fallback chains. This means oh-my-opencode works well even with OpenCode Zen (free) as the only provider. + +--- + ## Agent-Model Map (Source of Truth) This table reflects the actual fallback chains in `src/shared/model-requirements.ts`. The first available model in the chain is used. ### Core Agents -| Agent | Role | Primary Model Family | Fallback Chain | Has GPT Prompt? 
| -|-------|------|---------------------|----------------|-----------------| -| **Sisyphus** | Main ultraworker | Claude | Opus → Kimi K2.5 → GLM 5 → Big Pickle | No — **never use GPT** | -| **Hephaestus** | Deep autonomous worker | GPT (only) | GPT-5.3-codex (medium) | N/A (GPT-native) | -| **Prometheus** | Strategic planner | Claude (default), GPT (auto-detected) | Opus → GPT-5.2 → Kimi K2.5 → Gemini 3 Pro | **Yes** — `src/agents/prometheus/gpt.ts` | -| **Atlas** | Todo orchestrator | Kimi K2.5 (default), GPT (auto-detected) | Kimi K2.5 → Sonnet → GPT-5.2 | **Yes** — `src/agents/atlas/gpt.ts` | -| **Oracle** | Architecture/debugging | GPT | GPT-5.2 → Gemini 3 Pro → Opus | No | -| **Metis** | Plan review consultant | Claude | Opus → Kimi K2.5 → GPT-5.2 → Gemini 3 Pro | No | -| **Momus** | High-accuracy reviewer | GPT | GPT-5.2 → Opus → Gemini 3 Pro | No | +| Agent | Role | Fallback Chain (in order) | Has GPT Prompt? | +|-------|------|---------------------------|-----------------| +| **Sisyphus** | Main ultraworker | Opus (max) → Kimi K2.5 → Kimi K2.5 Free → GLM 5 → Big Pickle | No — **never use GPT** | +| **Hephaestus** | Deep autonomous worker | GPT-5.3-codex (medium) — no fallback | N/A (GPT-native) | +| **Prometheus** | Strategic planner | Opus (max) → **GPT-5.2 (high)** → Kimi K2.5 → Kimi K2.5 Free → Gemini 3 Pro | **Yes** — auto-switches | +| **Atlas** | Todo orchestrator | **Kimi K2.5** → Kimi K2.5 Free → Sonnet → GPT-5.2 | **Yes** — auto-switches | +| **Oracle** | Architecture/debugging | GPT-5.2 (high) → Gemini 3 Pro (high) → Opus (max) | No | +| **Metis** | Plan review consultant | Opus (max) → Kimi K2.5 → Kimi K2.5 Free → GPT-5.2 (high) → Gemini 3 Pro (high) | No | +| **Momus** | High-accuracy reviewer | GPT-5.2 (medium) → Opus (max) → Gemini 3 Pro (high) | No | ### Utility Agents -| Agent | Role | Primary Model Family | Fallback Chain | -|-------|------|---------------------|----------------| -| **Explore** | Fast codebase grep | Grok/lightweight | 
Grok Code Fast 1 → MiniMax M2.5 → Haiku → GPT-5-nano | -| **Librarian** | Docs/code search | Lightweight | MiniMax M2.5 → Gemini 3 Flash → Big Pickle | -| **Multimodal Looker** | Vision/screenshots | Kimi/multimodal | Kimi K2.5 → Gemini 3 Flash → GPT-5.2 → GLM-4.6v | +| Agent | Role | Fallback Chain (in order) | Design Rationale | +|-------|------|---------------------------|------------------| +| **Explore** | Fast codebase grep | Grok Code Fast 1 → **MiniMax M2.5 Free** → Haiku → GPT-5-Nano | Speed over intelligence. Grok Code is fastest for grep-style work. MiniMax Free as cheap fallback. | +| **Librarian** | Docs/code search | **MiniMax M2.5 Free** → Gemini 3 Flash → Big Pickle | Entirely free-tier chain. Doc retrieval doesn't need Opus-level reasoning. | +| **Multimodal Looker** | Vision/screenshots | **Kimi K2.5** → Kimi K2.5 Free → Gemini 3 Flash → GPT-5.2 → GLM-4.6v | Kimi excels at multimodal. Gemini Flash as lightweight vision fallback. | ### Task Categories Categories are used for `background_task` and `delegate_task` dispatching: -| Category | Purpose | Primary Model | Notes | -|----------|---------|---------------|-------| -| `visual-engineering` | Frontend/UI work | Gemini 3 Pro | Gemini excels at visual tasks | -| `ultrabrain` | Maximum intelligence | GPT-5.3-codex (xhigh) | Highest reasoning variant | -| `deep` | Deep coding | GPT-5.3-codex (medium) | Requires GPT availability | -| `artistry` | Creative/design | Gemini 3 Pro | Requires Gemini availability | -| `quick` | Fast simple tasks | Claude Haiku | Cheapest, fastest | -| `unspecified-high` | General high-quality | Claude Opus | Default for complex tasks | -| `unspecified-low` | General standard | Claude Sonnet | Default for standard tasks | -| `writing` | Text/docs | Kimi K2.5 | Best prose quality | +| Category | Purpose | Fallback Chain | Notes | +|----------|---------|----------------|-------| +| `visual-engineering` | Frontend/UI work | Gemini 3 Pro (high) → GLM 5 → Opus (max) → Kimi K2.5 | 
Gemini excels at visual tasks | +| `ultrabrain` | Maximum intelligence | GPT-5.3-codex (xhigh) → Gemini 3 Pro (high) → Opus (max) | Highest reasoning variant | +| `deep` | Deep coding | GPT-5.3-codex (medium) → Opus (max) → Gemini 3 Pro (high) | Requires GPT availability | +| `artistry` | Creative/design | Gemini 3 Pro (high) → Opus (max) → GPT-5.2 | Requires Gemini availability | +| `quick` | Fast simple tasks | Haiku → Gemini 3 Flash → GPT-5-Nano | Cheapest, fastest | +| `unspecified-high` | General high-quality | Opus (max) → GPT-5.2 (high) → Gemini 3 Pro | Default for complex tasks | +| `unspecified-low` | General standard | Sonnet → GPT-5.3-codex (medium) → Gemini 3 Flash | Default for standard tasks | +| `writing` | Text/docs | **Kimi K2.5** → Gemini 3 Flash → Sonnet | Kimi produces best prose quality | --- @@ -129,6 +143,8 @@ These swaps are safe because they stay within the same prompt family: | **Prometheus** | Claude Opus | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | | **Atlas** | Kimi K2.5 | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | | **Oracle** | GPT-5.2 | Gemini 3 Pro, Claude Opus | +| **Librarian** | MiniMax M2.5 Free | Gemini 3 Flash, Big Pickle, any lightweight model | +| **Explore** | Grok Code Fast 1 | MiniMax M2.5 Free, Haiku, GPT-5-Nano — speed is key | ### Dangerous Substitutions (cross-family without prompt support) @@ -137,6 +153,7 @@ These swaps are safe because they stay within the same prompt family: | **Sisyphus** → GPT | No GPT-optimized prompt exists. Sisyphus is deeply tuned for Claude-style reasoning. Performance drops dramatically. | | **Hephaestus** → Claude | Hephaestus is purpose-built for GPT's Codex capabilities. Claude cannot replicate this. | | **Explore** → Opus | Massive overkill and cost waste. Explore needs speed, not intelligence. | +| **Librarian** → Opus | Same — doc retrieval is a search task, not a reasoning task. Opus is wasted here. 
| ### Explaining to Users @@ -146,8 +163,10 @@ When a user asks about model configuration, explain: 2. **Each agent has a "home" model family** — Sisyphus is Claude-native, Hephaestus is GPT-native 3. **Some agents auto-adapt** — Prometheus and Atlas detect GPT models and switch to optimized prompts 4. **Cross-family overrides are risky** — unless the agent has a dedicated prompt for that family -5. **Cost optimization is valid** — swapping Opus → Sonnet or Kimi K2.5 for Sisyphus is fine and saves money -6. **Point to this guide** for the full fallback chains and rationale +5. **Cost optimization is valid** — swapping Opus → Sonnet or Kimi K2.5 for Sisyphus saves money with acceptable quality trade-off +6. **Utility agents are intentionally cheap** — Librarian and Explore use free-tier models by design. Don't "upgrade" them to Opus thinking it'll help — it just wastes tokens on simple search tasks +7. **Kimi K2.5 is a versatile workhorse** — it appears as primary for Atlas (orchestration), Multimodal Looker (vision), and writing tasks. It's consistently good across these roles without being expensive. +8. **Point to this guide** for the full fallback chains and rationale --- @@ -161,6 +180,11 @@ Native (anthropic/, openai/, google/) > Kimi for Coding > GitHub Copilot > OpenC Each fallback chain entry specifies which providers can serve that model. The system picks the first entry where at least one provider is connected. +**Notable provider mappings:** +- `venice` — alternative provider for Grok Code Fast 1 (Explore agent) +- `opencode` — serves free-tier models (Kimi K2.5 Free, MiniMax M2.5 Free, Big Pickle, GPT-5-Nano) and premium models via OpenCode Zen +- `zai-coding-plan` — GLM 5 and GLM-4.6v models + --- ## Quick Decision Tree for Users @@ -172,8 +196,9 @@ What subscriptions do you have? ├── OpenAI/ChatGPT → Hephaestus unlocked. Oracle/Momus use GPT. Prometheus auto-switches. ├── Both Claude + OpenAI → Full agent roster. Best experience. 
├── Gemini only → Visual-engineering category excels. Other agents use Gemini as fallback. +├── Kimi for Coding → Atlas, Multimodal Looker, writing tasks work great. Sisyphus usable. ├── GitHub Copilot only → Works as fallback provider for all model families. -├── OpenCode Zen only → Free-tier access to multiple models. Functional but rate-limited. +├── OpenCode Zen only → Free-tier access. Librarian/Explore work perfectly. Core agents functional but rate-limited. └── No subscription → Limited functionality. Consider OpenCode Zen (free). For each user scenario, the installer (`bunx oh-my-opencode install`) auto-configures the optimal assignment. From 6909e5fb4c631002204b8dbad317dd36bfdc7b7b Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Thu, 19 Feb 2026 15:17:41 +0900 Subject: [PATCH 6/6] docs: restructure agent-model guide by model family and role Complete rewrite organized around model families, agent roles, task categories, and selection priority rules. - Model families: Claude-like (Kimi, GLM/Big Pickle), GPT, different-behavior (Gemini, MiniMax), speed-focused (Grok, Spark) - Agent roles: Claude-optimized, dual-prompt, GPT-native, utility - gpt-5.3-codex-spark: extremely fast but compacts too aggressively - Big Pickle = GLM 4.6 - Explicit guidance: do not upgrade utility agents to Opus - opencode models / opencode auth login references at top - Link to orchestration system guide for task categories --- docs/guide/agent-model-matching.md | 281 +++++++++++++---------------- 1 file changed, 130 insertions(+), 151 deletions(-) diff --git a/docs/guide/agent-model-matching.md b/docs/guide/agent-model-matching.md index 6b564d43f..0d74538ca 100644 --- a/docs/guide/agent-model-matching.md +++ b/docs/guide/agent-model-matching.md @@ -1,214 +1,193 @@ # Agent-Model Matching Guide -> **For agents and users**: This document explains the principles behind oh-my-opencode's agent-model assignments. 
Use it to understand why each agent uses a specific model, and how to customize them correctly. +> **For agents and users**: How to pick the right model for each agent. Read this before customizing model settings. + +Run `opencode models` to see all available models on your system, and `opencode auth login` to authenticate with providers. --- -## Why Model Matching Matters +## Model Families: Know Your Options -Each oh-my-opencode agent has a **dedicated system prompt** optimized for a specific model family. Some agents (Atlas, Prometheus) ship separate prompts for GPT vs Claude models, with automatic routing via `isGptModel()` detection. Assigning the wrong model family to an agent doesn't just degrade performance — the agent may receive instructions formatted for a completely different model's reasoning style. +Not all models behave the same way. Understanding which models are "similar" helps you make safe substitutions. -**Key principle**: Agents are tuned to model families, not individual models. A Claude-tuned agent works with Opus, Sonnet, or Haiku. A GPT-tuned agent works with GPT-5.2 or GPT-5.3-codex. Crossing families requires a model-specific prompt (which only some agents have). +### Claude-like Models (instruction-following, structured output) + +These models respond similarly to Claude and work well with oh-my-opencode's Claude-optimized prompts: + +| Model | Provider(s) | Notes | +|-------|-------------|-------| +| **Claude Opus 4.6** | anthropic, github-copilot, opencode | Best overall. Default for Sisyphus. | +| **Claude Sonnet 4.6** | anthropic, github-copilot, opencode | Faster, cheaper. Good balance. | +| **Claude Haiku 4.5** | anthropic, opencode | Fast and cheap. Good for quick tasks. | +| **Kimi K2.5** | kimi-for-coding | Behaves very similarly to Claude. Great all-rounder. Default for Atlas. | +| **Kimi K2.5 Free** | opencode | Free-tier Kimi. Rate-limited but functional. | +| **GLM 5** | zai-coding-plan, opencode | Claude-like behavior. 
Good for broad tasks. | +| **Big Pickle (GLM 4.6)** | opencode | Free-tier GLM. Decent fallback. | + +### GPT Models (explicit reasoning, principle-driven) + +GPT models need differently structured prompts. Some agents auto-detect GPT and switch prompts: + +| Model | Provider(s) | Notes | +|-------|-------------|-------| +| **GPT-5.3-codex** | openai, github-copilot, opencode | Deep coding powerhouse. Required for Hephaestus. | +| **GPT-5.2** | openai, github-copilot, opencode | High intelligence. Default for Oracle. | +| **GPT-5-Nano** | opencode | Ultra-cheap, fast. Good for simple utility tasks. | + +### Different-Behavior Models + +These models have unique characteristics — don't assume they'll behave like Claude or GPT: + +| Model | Provider(s) | Notes | +|-------|-------------|-------| +| **Gemini 3 Pro** | google, github-copilot, opencode | Excels at visual/frontend tasks. Different reasoning style. | +| **Gemini 3 Flash** | google, github-copilot, opencode | Fast, good for doc search and light tasks. | +| **MiniMax M2.5** | venice | Fast and smart. Good for utility tasks. | +| **MiniMax M2.5 Free** | opencode | Free-tier MiniMax. Fast for search/retrieval. | + +### Speed-Focused Models + +| Model | Provider(s) | Speed | Notes | +|-------|-------------|-------|-------| +| **Grok Code Fast 1** | github-copilot, venice | Very fast | Optimized for code grep/search. Default for Explore. | +| **Claude Haiku 4.5** | anthropic, opencode | Fast | Good balance of speed and intelligence. | +| **MiniMax M2.5 (Free)** | opencode, venice | Fast | Smart for its speed class. | +| **GPT-5.3-codex-spark** | openai | Extremely fast | Blazing fast but compacts so aggressively that oh-my-opencode's context management doesn't work well with it. Not recommended for omo agents. | --- -## Design Philosophy: Intelligence Where It Matters, Speed Everywhere Else +## Agent Roles and Recommended Models -The model catalog follows a clear hierarchy: +### Claude-Optimized Agents -1. 
**Core agents get premium models** — Sisyphus (Claude Opus), Hephaestus (GPT-5.3-codex), Prometheus (Opus/GPT-5.2). These agents handle complex multi-step reasoning where model quality directly impacts output. +These agents have prompts tuned for Claude-family models. Use Claude > Kimi K2.5 > GLM 5 in that priority order. -2. **Utility agents get fast, free-tier models** — Explore (Grok Code Fast → MiniMax M2.5 Free), Librarian (MiniMax M2.5 Free → Gemini Flash → Big Pickle). These agents do search, grep, and doc retrieval where speed matters more than deep reasoning. +| Agent | Role | Default Chain | What It Does | +|-------|------|---------------|--------------| +| **Sisyphus** | Main ultraworker | Opus (max) → Kimi K2.5 → GLM 5 → Big Pickle | Primary coding agent. Orchestrates everything. **Never use GPT — no GPT prompt exists.** | +| **Metis** | Plan review | Opus (max) → Kimi K2.5 → GPT-5.2 → Gemini 3 Pro | Reviews Prometheus plans for gaps. | -3. **Orchestrator agents get balanced models** — Atlas (Kimi K2.5 → Sonnet), Metis (Opus → Kimi K2.5). These need good instruction-following but don't need maximum intelligence. +### Dual-Prompt Agents (Claude + GPT auto-switch) -4. **Free-tier models are first-class citizens** — MiniMax M2.5 Free, Big Pickle, GPT-5-Nano, and Kimi K2.5 Free appear throughout fallback chains. This means oh-my-opencode works well even with OpenCode Zen (free) as the only provider. +These agents detect your model family at runtime and switch to the appropriate prompt. If you have GPT access, these agents can use it effectively. + +Priority: **Claude > GPT > Claude-like models** + +| Agent | Role | Default Chain | GPT Prompt? 
| +|-------|------|---------------|-------------| +| **Prometheus** | Strategic planner | Opus (max) → **GPT-5.2 (high)** → Kimi K2.5 → Gemini 3 Pro | Yes — XML-tagged, principle-driven (~300 lines vs ~1,100 Claude) | +| **Atlas** | Todo orchestrator | **Kimi K2.5** → Sonnet → GPT-5.2 | Yes — GPT-optimized todo management | + +### GPT-Native Agents + +These agents are built for GPT. Don't override to Claude. + +| Agent | Role | Default Chain | Notes | +|-------|------|---------------|-------| +| **Hephaestus** | Deep autonomous worker | GPT-5.3-codex (medium) only | "Codex on steroids." No fallback. Requires GPT access. | +| **Oracle** | Architecture/debugging | GPT-5.2 (high) → Gemini 3 Pro → Opus | High-IQ strategic backup. GPT preferred. | +| **Momus** | High-accuracy reviewer | GPT-5.2 (medium) → Opus → Gemini 3 Pro | Verification agent. GPT preferred. | + +### Utility Agents (Speed > Intelligence) + +These agents do search, grep, and retrieval. They intentionally use fast, cheap models. **Don't "upgrade" them to Opus — it wastes tokens on simple tasks.** + +| Agent | Role | Default Chain | Design Rationale | +|-------|------|---------------|------------------| +| **Explore** | Fast codebase grep | MiniMax M2.5 Free → Grok Code Fast → MiniMax M2.5 → Haiku → GPT-5-Nano | Speed is everything. MiniMax Free leads the default chain; Grok is blazing fast for grep. | +| **Librarian** | Docs/code search | MiniMax M2.5 Free → Gemini Flash → Big Pickle | Entirely free-tier. Doc retrieval doesn't need deep reasoning. | +| **Multimodal Looker** | Vision/screenshots | Kimi K2.5 → Kimi Free → Gemini Flash → GPT-5.2 → GLM-4.6v | Kimi excels at multimodal understanding. | + +--- + +## Task Categories + +Categories control which model is used for `background_task` and `delegate_task`.
See the [Orchestration System Guide](./understanding-orchestration-system.md) for how agents dispatch tasks to categories. -### Core Agents - -| Agent | Role | Fallback Chain (in order) | Has GPT Prompt? | -|-------|------|---------------------------|-----------------| -| **Sisyphus** | Main ultraworker | Opus (max) → Kimi K2.5 → Kimi K2.5 Free → GLM 5 → Big Pickle | No — **never use GPT** | -| **Hephaestus** | Deep autonomous worker | GPT-5.3-codex (medium) — no fallback | N/A (GPT-native) | -| **Prometheus** | Strategic planner | Opus (max) → **GPT-5.2 (high)** → Kimi K2.5 → Kimi K2.5 Free → Gemini 3 Pro | **Yes** — auto-switches | -| **Atlas** | Todo orchestrator | **Kimi K2.5** → Kimi K2.5 Free → Sonnet → GPT-5.2 | **Yes** — auto-switches | -| **Oracle** | Architecture/debugging | GPT-5.2 (high) → Gemini 3 Pro (high) → Opus (max) | No | -| **Metis** | Plan review consultant | Opus (max) → Kimi K2.5 → Kimi K2.5 Free → GPT-5.2 (high) → Gemini 3 Pro (high) | No | -| **Momus** | High-accuracy reviewer | GPT-5.2 (medium) → Opus (max) → Gemini 3 Pro (high) | No | - -### Utility Agents - -| Agent | Role | Fallback Chain (in order) | Design Rationale | -|-------|------|---------------------------|------------------| -| **Explore** | Fast codebase grep | Grok Code Fast 1 → **MiniMax M2.5 Free** → Haiku → GPT-5-Nano | Speed over intelligence. Grok Code is fastest for grep-style work. MiniMax Free as cheap fallback. | -| **Librarian** | Docs/code search | **MiniMax M2.5 Free** → Gemini 3 Flash → Big Pickle | Entirely free-tier chain. Doc retrieval doesn't need Opus-level reasoning. | -| **Multimodal Looker** | Vision/screenshots | **Kimi K2.5** → Kimi K2.5 Free → Gemini 3 Flash → GPT-5.2 → GLM-4.6v | Kimi excels at multimodal. Gemini Flash as lightweight vision fallback. 
| - -### Task Categories - -Categories are used for `background_task` and `delegate_task` dispatching: - -| Category | Purpose | Fallback Chain | Notes | -|----------|---------|----------------|-------| -| `visual-engineering` | Frontend/UI work | Gemini 3 Pro (high) → GLM 5 → Opus (max) → Kimi K2.5 | Gemini excels at visual tasks | -| `ultrabrain` | Maximum intelligence | GPT-5.3-codex (xhigh) → Gemini 3 Pro (high) → Opus (max) | Highest reasoning variant | -| `deep` | Deep coding | GPT-5.3-codex (medium) → Opus (max) → Gemini 3 Pro (high) | Requires GPT availability | -| `artistry` | Creative/design | Gemini 3 Pro (high) → Opus (max) → GPT-5.2 | Requires Gemini availability | -| `quick` | Fast simple tasks | Haiku → Gemini 3 Flash → GPT-5-Nano | Cheapest, fastest | -| `unspecified-high` | General high-quality | Opus (max) → GPT-5.2 (high) → Gemini 3 Pro | Default for complex tasks | -| `unspecified-low` | General standard | Sonnet → GPT-5.3-codex (medium) → Gemini 3 Flash | Default for standard tasks | -| `writing` | Text/docs | **Kimi K2.5** → Gemini 3 Flash → Sonnet | Kimi produces best prose quality | +| Category | When Used | Recommended Models | Notes | +|----------|-----------|-------------------|-------| +| `visual-engineering` | Frontend, UI, CSS, design | Gemini 3 Pro (high) → GLM 5 → Opus → Kimi K2.5 | Gemini dominates visual tasks | +| `ultrabrain` | Maximum reasoning needed | GPT-5.3-codex (xhigh) → Gemini 3 Pro → Opus | Highest intelligence available | +| `deep` | Deep coding, complex logic | GPT-5.3-codex (medium) → Opus → Gemini 3 Pro | Requires GPT availability | +| `artistry` | Creative, novel approaches | Gemini 3 Pro (high) → Opus → GPT-5.2 | Requires Gemini availability | +| `quick` | Simple, fast tasks | Haiku → Gemini Flash → GPT-5-Nano | Cheapest and fastest | +| `unspecified-high` | General complex work | Opus (max) → GPT-5.2 (high) → Gemini 3 Pro | Default when no category fits | +| `unspecified-low` | General standard work | Sonnet → 
GPT-5.3-codex (medium) → Gemini Flash | Everyday tasks | +| `writing` | Text, docs, prose | Kimi K2.5 → Gemini Flash → Sonnet | Kimi produces best prose | --- -## Model-Specific Prompt Routing - -### Why Different Models Need Different Prompts +## Why Different Models Need Different Prompts Claude and GPT models have fundamentally different instruction-following behaviors: -- **Claude models** respond well to **mechanics-driven** prompts — detailed checklists, templates, step-by-step procedures, and explicit anti-patterns. More rules = more compliance. -- **GPT models** (especially 5.2+) have **stronger instruction adherence** and respond better to **principle-driven** prompts — concise principles, XML-tagged structure, explicit decision criteria. More rules = more contradiction surface area = more drift. +- **Claude models** respond well to **mechanics-driven** prompts — detailed checklists, templates, step-by-step procedures. More rules = more compliance. +- **GPT models** (especially 5.2+) respond better to **principle-driven** prompts — concise principles, XML-tagged structure, explicit decision criteria. More rules = more contradiction surface = more drift. -This insight comes from analyzing OpenAI's Codex Plan Mode prompt alongside the GPT-5.2 Prompting Guide: -- Codex Plan Mode uses 3 clean principles in ~121 lines to achieve what Prometheus's Claude prompt does in ~1,100 lines across 7 files -- GPT-5.2's "conservative grounding bias" and "more deliberate scaffolding" mean it builds clearer plans by default, but needs **explicit decision criteria** (it won't infer what you want) -- The key concept is **"Decision Complete"** — a plan must leave ZERO decisions to the implementer. 
GPT models follow this literally when stated as a principle, while Claude models need enforcement mechanisms +Key insight from Codex Plan Mode analysis: +- Codex Plan Mode achieves in ~121 lines, with 3 clean principles, what Prometheus's Claude prompt needs ~1,100 lines across 7 files to express +- The core concept is **"Decision Complete"** — a plan must leave ZERO decisions to the implementer +- GPT follows this literally when stated as a principle; Claude needs enforcement mechanisms -### How It Works - -Some agents detect the assigned model at runtime and switch prompts: - -```typescript -// From src/agents/prometheus/system-prompt.ts -export function getPrometheusPrompt(model?: string): string { - if (model && isGptModel(model)) return getGptPrometheusPrompt() // XML-tagged, principle-driven - return PROMETHEUS_SYSTEM_PROMPT // Claude-optimized, modular sections -} -``` - -**Agents with dual prompts:** -- **Prometheus**: Claude prompt (~1,100 lines, 7 files, mechanics-driven with checklists and templates) vs GPT prompt (~300 lines, single file, principle-driven with XML structure inspired by Codex Plan Mode) -- **Atlas**: Claude prompt vs GPT prompt (GPT-optimized todo orchestration with explicit scope constraints) - -**Why this matters for customization**: If you override Prometheus to use a GPT model, the GPT prompt activates automatically — and it's specifically designed for how GPT reasons. But if you override Sisyphus to use GPT — there is no GPT prompt, and performance will degrade significantly because Sisyphus's prompt is deeply tuned for Claude's reasoning style. - -### Model Family Detection - -`isGptModel()` matches: -- Any model starting with `openai/` or `github-copilot/gpt-` -- Model names starting with common GPT prefixes (`gpt-`, `o1-`, `o3-`, `o4-`, `codex-`) - -Everything else is treated as "Claude-like" (Claude, Kimi, GLM, Gemini).
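+For readers customizing models by hand, the detection rule can be sketched in a few lines. This is an illustrative sketch only — it mirrors the prefix rules documented in this guide, not the exact code in `src/shared` (function and constant names here are simplified):
+
+```typescript
+// Illustrative sketch of the isGptModel() routing described in this guide —
+// it mirrors the documented prefix rules, not the exact src/shared code.
+const GPT_NAME_PREFIXES = ["gpt-", "o1-", "o3-", "o4-", "codex-"];
+
+function isGptModel(model: string): boolean {
+  // Provider-qualified forms that always indicate a GPT model
+  if (model.startsWith("openai/") || model.startsWith("github-copilot/gpt-")) {
+    return true;
+  }
+  // Otherwise inspect the bare model name after the provider segment
+  const name = model.includes("/") ? model.slice(model.indexOf("/") + 1) : model;
+  return GPT_NAME_PREFIXES.some((prefix) => name.startsWith(prefix));
+}
+
+// Dual-prompt agents (Prometheus, Atlas) route on this check at runtime;
+// anything that fails it is treated as "Claude-like".
+function pickPromptFamily(model?: string): "gpt" | "claude" {
+  return model && isGptModel(model) ? "gpt" : "claude";
+}
+```
+
+Under these rules, `pickPromptFamily("openai/gpt-5.2")` resolves to the GPT prompt, while a model like `kimi-for-coding/k2p5` falls through to the Claude-style prompt.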
+This is why Prometheus and Atlas ship separate prompts per model family — they auto-detect and switch at runtime via `isGptModel()`. --- ## Customization Guide -### When to Customize - -Customize model assignments when: -- You have a specific provider subscription (e.g., only OpenAI, no Anthropic) -- You want to use a cheaper model for certain agents -- You're experimenting with new models - ### How to Customize -Override in `oh-my-opencode.json` (user: `~/.config/opencode/oh-my-opencode.json`, project: `.opencode/oh-my-opencode.json`): +Override in `oh-my-opencode.json`: ```jsonc { "agents": { "sisyphus": { "model": "kimi-for-coding/k2p5" }, - "atlas": { "model": "anthropic/claude-sonnet-4-6" }, - "prometheus": { "model": "openai/gpt-5.2" } // Will auto-switch to GPT prompt + "prometheus": { "model": "openai/gpt-5.2" } // Auto-switches to GPT prompt } } ``` -### Safe Substitutions (same model family) +### Selection Priority -These swaps are safe because they stay within the same prompt family: +When choosing models for Claude-optimized agents: -| Agent | Default | Safe Alternatives | -|-------|---------|-------------------| -| **Sisyphus** | Claude Opus | Claude Sonnet, Kimi K2.5, GLM 5 (any Claude-like) | -| **Hephaestus** | GPT-5.3-codex | No alternatives — GPT only | -| **Prometheus** | Claude Opus | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | -| **Atlas** | Kimi K2.5 | Claude Sonnet (Claude prompt) OR GPT-5.2 (auto-switches to GPT prompt) | -| **Oracle** | GPT-5.2 | Gemini 3 Pro, Claude Opus | -| **Librarian** | MiniMax M2.5 Free | Gemini 3 Flash, Big Pickle, any lightweight model | -| **Explore** | Grok Code Fast 1 | MiniMax M2.5 Free, Haiku, GPT-5-Nano — speed is key | +``` +Claude (Opus/Sonnet) > GPT (if agent has dual prompt) > Claude-like (Kimi K2.5, GLM 5) +``` -### Dangerous Substitutions (cross-family without prompt support) +When choosing models for GPT-native agents: -| Agent | Dangerous Override | Why | 
-|-------|-------------------|-----| -| **Sisyphus** → GPT | No GPT-optimized prompt exists. Sisyphus is deeply tuned for Claude-style reasoning. Performance drops dramatically. | -| **Hephaestus** → Claude | Hephaestus is purpose-built for GPT's Codex capabilities. Claude cannot replicate this. | -| **Explore** → Opus | Massive overkill and cost waste. Explore needs speed, not intelligence. | -| **Librarian** → Opus | Same — doc retrieval is a search task, not a reasoning task. Opus is wasted here. | +``` +GPT (5.3-codex, 5.2) > Claude Opus (decent fallback) > Gemini (acceptable) +``` -### Explaining to Users +### Safe vs Dangerous Overrides -When a user asks about model configuration, explain: +**Safe** (same family): +- Sisyphus: Opus → Sonnet, Kimi K2.5, GLM 5 +- Prometheus: Opus → GPT-5.2 (auto-switches prompt) +- Atlas: Kimi K2.5 → Sonnet, GPT-5.2 (auto-switches) -1. **The default works out of the box** — the installer configures optimal models based on their subscriptions -2. **Each agent has a "home" model family** — Sisyphus is Claude-native, Hephaestus is GPT-native -3. **Some agents auto-adapt** — Prometheus and Atlas detect GPT models and switch to optimized prompts -4. **Cross-family overrides are risky** — unless the agent has a dedicated prompt for that family -5. **Cost optimization is valid** — swapping Opus → Sonnet or Kimi K2.5 for Sisyphus saves money with acceptable quality trade-off -6. **Utility agents are intentionally cheap** — Librarian and Explore use free-tier models by design. Don't "upgrade" them to Opus thinking it'll help — it just wastes tokens on simple search tasks -7. **Kimi K2.5 is a versatile workhorse** — it appears as primary for Atlas (orchestration), Multimodal Looker (vision), and writing tasks. It's consistently good across these roles without being expensive. -8. **Point to this guide** for the full fallback chains and rationale +**Dangerous** (no prompt support): +- Sisyphus → GPT: **No GPT prompt. 
Will degrade significantly.** +- Hephaestus → Claude: **Built for Codex. Claude can't replicate this.** +- Explore → Opus: **Massive cost waste. Explore needs speed, not intelligence.** +- Librarian → Opus: **Same. Doc search doesn't need Opus-level reasoning.** --- ## Provider Priority -When multiple providers are available, oh-my-opencode prefers: - ``` -Native (anthropic/, openai/, google/) > Kimi for Coding > GitHub Copilot > OpenCode Zen > Z.ai Coding Plan -``` - -Each fallback chain entry specifies which providers can serve that model. The system picks the first entry where at least one provider is connected. - -**Notable provider mappings:** -- `venice` — alternative provider for Grok Code Fast 1 (Explore agent) -- `opencode` — serves free-tier models (Kimi K2.5 Free, MiniMax M2.5 Free, Big Pickle, GPT-5-Nano) and premium models via OpenCode Zen -- `zai-coding-plan` — GLM 5 and GLM-4.6v models - ---- - -## Quick Decision Tree for Users - -``` -What subscriptions do you have? - -├── Claude (Anthropic) → Sisyphus works optimally. Prometheus/Metis use Claude prompts. -├── OpenAI/ChatGPT → Hephaestus unlocked. Oracle/Momus use GPT. Prometheus auto-switches. -├── Both Claude + OpenAI → Full agent roster. Best experience. -├── Gemini only → Visual-engineering category excels. Other agents use Gemini as fallback. -├── Kimi for Coding → Atlas, Multimodal Looker, writing tasks work great. Sisyphus usable. -├── GitHub Copilot only → Works as fallback provider for all model families. -├── OpenCode Zen only → Free-tier access. Librarian/Explore work perfectly. Core agents functional but rate-limited. -└── No subscription → Limited functionality. Consider OpenCode Zen (free). - -For each user scenario, the installer (`bunx oh-my-opencode install`) auto-configures the optimal assignment. 
+Native (anthropic/, openai/, google/) > Kimi for Coding > GitHub Copilot > Venice > OpenCode Zen > Z.ai Coding Plan ``` --- ## See Also -- [Installation Guide](./installation.md) — Setup with subscription-based model configuration -- [Configuration Reference](../configurations.md) — Full config options including agent overrides -- [Overview](./overview.md) — How the agent system works -- [`src/shared/model-requirements.ts`](../../src/shared/model-requirements.ts) — Source of truth for fallback chains +- [Installation Guide](./installation.md) — Setup and authentication +- [Orchestration System](./understanding-orchestration-system.md) — How agents dispatch tasks to categories +- [Configuration Reference](../configurations.md) — Full config options +- [`src/shared/model-requirements.ts`](../../src/shared/model-requirements.ts) — Source of truth for fallback chains \ No newline at end of file