
# Agent-Model Matching Guide

For agents and users: How to pick the right model for each agent. Read this before customizing model settings.

## Example Configuration

Here's a practical example configuration showing agent-model assignments:

```jsonc
{
  "$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-opencode/master/assets/oh-my-opencode.schema.json",

  "agents": {
    // Main orchestrator: Claude Opus or Kimi K2.5 work best
    "sisyphus": {
      "model": "kimi-for-coding/k2p5",
      "ultrawork": { "model": "anthropic/claude-opus-4-6", "variant": "max" }
    },

    // Research agents: cheaper models are fine
    "librarian": { "model": "zai-coding-plan/glm-4.7" },
    "explore":   { "model": "github-copilot/grok-code-fast-1" },

    // Architecture consultation: GPT or Claude Opus
    "oracle": { "model": "openai/gpt-5.2", "variant": "high" },

    // Prometheus inherits sisyphus model; just add prompt guidance
    "prometheus": { "prompt_append": "Leverage deep & quick agents heavily, always in parallel." }
  },

  "categories": {
    // quick — trivial tasks
    "quick": { "model": "opencode/gpt-5-nano" },

    // unspecified-low — moderate tasks
    "unspecified-low": { "model": "kimi-for-coding/k2p5" },

    // unspecified-high — complex work
    "unspecified-high": { "model": "anthropic/claude-sonnet-4-6", "variant": "max" },

    // visual-engineering — Gemini dominates visual tasks
    "visual-engineering": { "model": "google/gemini-3-pro", "variant": "high" },

    // writing — docs/prose
    "writing": { "model": "kimi-for-coding/k2p5" }
  },

  // Limit expensive providers; let cheap ones run freely
  "background_task": {
    "providerConcurrency": { "anthropic": 3, "openai": 3, "opencode": 10, "zai-coding-plan": 10 },
    "modelConcurrency": { "anthropic/claude-opus-4-6": 2, "opencode/gpt-5-nano": 20 }
  }
}
```

Run `opencode models` to see all available models on your system, and `opencode auth login` to authenticate with providers.
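The `providerConcurrency` idea above is a per-provider concurrency cap. As an illustration only (oh-my-opencode's actual scheduler is internal and may differ), the mechanism can be sketched as a small limiter:

```typescript
// Sketch of per-provider concurrency capping, the idea behind
// "providerConcurrency". Purely illustrative — not omo's real scheduler.
class Limiter {
  private active = 0;
  private waiters: (() => void)[] = [];

  constructor(private readonly max: number) {}

  async run<T>(task: () => Promise<T>): Promise<T> {
    // Wait until a slot frees up; re-check in a loop to avoid races.
    while (this.active >= this.max) {
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
    this.active++;
    try {
      return await task();
    } finally {
      this.active--;
      this.waiters.shift()?.(); // wake one queued task
    }
  }
}

// e.g. cap a provider's background tasks at 3 concurrent requests
const anthropicLimiter = new Limiter(3);
```

Expensive providers get small caps so a burst of background tasks queues instead of hammering the API; cheap or free tiers get large ones.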

## Model Families: Know Your Options

Not all models behave the same way. Understanding which models are "similar" helps you make safe substitutions.

### Claude-like Models (instruction-following, structured output)

These models respond similarly to Claude and work well with oh-my-opencode's Claude-optimized prompts:

| Model | Provider(s) | Notes |
| --- | --- | --- |
| Claude Opus 4.6 | anthropic, github-copilot, opencode | Best overall. Default for Sisyphus. |
| Claude Sonnet 4.6 | anthropic, github-copilot, opencode | Faster, cheaper. Good balance. |
| Claude Haiku 4.5 | anthropic, opencode | Fast and cheap. Good for quick tasks. |
| Kimi K2.5 | kimi-for-coding | Behaves very similarly to Claude. Great all-rounder. Default for Atlas. |
| Kimi K2.5 Free | opencode | Free-tier Kimi. Rate-limited but functional. |
| GLM 5 | zai-coding-plan, opencode | Claude-like behavior. Good for broad tasks. |
| Big Pickle (GLM 4.6) | opencode | Free-tier GLM. Decent fallback. |

### GPT Models (explicit reasoning, principle-driven)

GPT models need differently structured prompts. Some agents auto-detect GPT and switch prompts:

| Model | Provider(s) | Notes |
| --- | --- | --- |
| GPT-5.3-codex | openai, github-copilot, opencode | Deep coding powerhouse. Required for Hephaestus. |
| GPT-5.2 | openai, github-copilot, opencode | High intelligence. Default for Oracle. |
| GPT-5-Nano | opencode | Ultra-cheap, fast. Good for simple utility tasks. |

### Different-Behavior Models

These models have unique characteristics — don't assume they'll behave like Claude or GPT:

| Model | Provider(s) | Notes |
| --- | --- | --- |
| Gemini 3 Pro | google, github-copilot, opencode | Excels at visual/frontend tasks. Different reasoning style. |
| Gemini 3 Flash | google, github-copilot, opencode | Fast, good for doc search and light tasks. |
| MiniMax M2.5 | venice | Fast and smart. Good for utility tasks. |
| MiniMax M2.5 Free | opencode | Free-tier MiniMax. Fast for search/retrieval. |

### Speed-Focused Models

| Model | Provider(s) | Speed | Notes |
| --- | --- | --- | --- |
| Grok Code Fast 1 | github-copilot, venice | Very fast | Optimized for code grep/search. Default for Explore. |
| Claude Haiku 4.5 | anthropic, opencode | Fast | Good balance of speed and intelligence. |
| MiniMax M2.5 (Free) | opencode, venice | Fast | Smart for its speed class. |
| GPT-5.3-codex-spark | openai | Extremely fast | Blazing fast but compacts so aggressively that oh-my-opencode's context management doesn't work well with it. Not recommended for omo agents. |

## Claude-Optimized Agents

These agents have prompts tuned for Claude-family models. Use Claude > Kimi K2.5 > GLM 5 in that priority order.

| Agent | Role | Default Chain | What It Does |
| --- | --- | --- | --- |
| Sisyphus | Main ultraworker | Opus (max) → Kimi K2.5 → GLM 5 → Big Pickle | Primary coding agent. Orchestrates everything. Never use GPT — no GPT prompt exists. |
| Metis | Plan review | Opus (max) → Kimi K2.5 → GPT-5.2 → Gemini 3 Pro | Reviews Prometheus plans for gaps. |

## Dual-Prompt Agents (Claude + GPT auto-switch)

These agents detect your model family at runtime and switch to the appropriate prompt. If you have GPT access, these agents can use it effectively.

Priority: Claude > GPT > Claude-like models

| Agent | Role | Default Chain | GPT Prompt? |
| --- | --- | --- | --- |
| Prometheus | Strategic planner | Opus (max) → GPT-5.2 (high) → Kimi K2.5 → Gemini 3 Pro | Yes — XML-tagged, principle-driven (~300 lines vs ~1,100 for Claude) |
| Atlas | Todo orchestrator | Kimi K2.5 → Sonnet → GPT-5.2 | Yes — GPT-optimized todo management |

## GPT-Native Agents

These agents are built for GPT. Don't override to Claude.

| Agent | Role | Default Chain | Notes |
| --- | --- | --- | --- |
| Hephaestus | Deep autonomous worker | GPT-5.3-codex (medium) only | "Codex on steroids." No fallback. Requires GPT access. |
| Oracle | Architecture/debugging | GPT-5.2 (high) → Gemini 3 Pro → Opus | High-IQ strategic backup. GPT preferred. |
| Momus | High-accuracy reviewer | GPT-5.2 (medium) → Opus → Gemini 3 Pro | Verification agent. GPT preferred. |

## Utility Agents (Speed > Intelligence)

These agents do search, grep, and retrieval. They intentionally use fast, cheap models. Don't "upgrade" them to Opus — it wastes tokens on simple tasks.

| Agent | Role | Default Chain | Design Rationale |
| --- | --- | --- | --- |
| Explore | Fast codebase grep | MiniMax M2.5 Free → Grok Code Fast → MiniMax M2.5 → Haiku → GPT-5-Nano | Speed is everything. Grok is blazing fast for grep. |
| Librarian | Docs/code search | MiniMax M2.5 Free → Gemini Flash → Big Pickle | Entirely free-tier. Doc retrieval doesn't need deep reasoning. |
| Multimodal Looker | Vision/screenshots | Kimi K2.5 → Kimi Free → Gemini Flash → GPT-5.2 → GLM-4.6v | Kimi excels at multimodal understanding. |

## Task Categories

Categories control which model is used for `background_task` and `delegate_task`. See the Orchestration System Guide for how agents dispatch tasks to categories.

| Category | When Used | Recommended Models | Notes |
| --- | --- | --- | --- |
| `visual-engineering` | Frontend, UI, CSS, design | Gemini 3 Pro (high) → GLM 5 → Opus → Kimi K2.5 | Gemini dominates visual tasks |
| `ultrabrain` | Maximum reasoning needed | GPT-5.3-codex (xhigh) → Gemini 3 Pro → Opus | Highest intelligence available |
| `deep` | Deep coding, complex logic | GPT-5.3-codex (medium) → Opus → Gemini 3 Pro | Requires GPT availability |
| `artistry` | Creative, novel approaches | Gemini 3 Pro (high) → Opus → GPT-5.2 | Requires Gemini availability |
| `quick` | Simple, fast tasks | Haiku → Gemini Flash → GPT-5-Nano | Cheapest and fastest |
| `unspecified-high` | General complex work | Opus (max) → GPT-5.2 (high) → Gemini 3 Pro | Default when no category fits |
| `unspecified-low` | General standard work | Sonnet → GPT-5.3-codex (medium) → Gemini Flash | Everyday tasks |
| `writing` | Text, docs, prose | Kimi K2.5 → Gemini Flash → Sonnet | Kimi produces best prose |
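The "Recommended Models" column is a fallback chain: each entry is tried in order until one's provider is available on your machine. A minimal sketch of that resolution, assuming model IDs of the form `provider/name` (the chain contents and availability check here are illustrative, not omo's actual selection logic):

```typescript
// Illustrative fallback-chain resolution; the real logic in
// oh-my-opencode may also weigh variants, auth state, and rate limits.
const QUICK_CHAIN = [
  "anthropic/claude-haiku-4-5",
  "google/gemini-3-flash",
  "opencode/gpt-5-nano",
];

// Pick the first model whose provider is available on this machine.
function resolveModel(
  chain: string[],
  providers: Set<string>,
): string | undefined {
  return chain.find((id) => providers.has(id.split("/")[0]));
}
```

With only `opencode` authenticated, the `quick` chain above would land on `opencode/gpt-5-nano`; adding `anthropic` moves it to Haiku.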

## Why Different Models Need Different Prompts

Claude and GPT models have fundamentally different instruction-following behaviors:

- **Claude models** respond well to mechanics-driven prompts — detailed checklists, templates, step-by-step procedures. More rules = more compliance.
- **GPT models** (especially 5.2+) respond better to principle-driven prompts — concise principles, XML-tagged structure, explicit decision criteria. More rules = more contradiction surface = more drift.

Key insight from Codex Plan Mode analysis:

- Codex Plan Mode achieves with 3 principles in ~121 lines what Prometheus's Claude prompt needs ~1,100 lines across 7 files to accomplish
- The core concept is "Decision Complete" — a plan must leave ZERO decisions to the implementer
- GPT follows this literally when stated as a principle; Claude needs enforcement mechanisms

This is why Prometheus and Atlas ship separate prompts per model family — they auto-detect and switch at runtime via `isGptModel()`.
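A minimal sketch of that runtime switch — the shipped `isGptModel()` lives inside oh-my-opencode and may match more families than this illustrative version:

```typescript
// Illustrative model-family check; the real isGptModel() may differ.
function isGptModel(model: string): boolean {
  // Model IDs look like "provider/name", e.g. "openai/gpt-5.2".
  const name = model.split("/").pop() ?? "";
  return name.startsWith("gpt-") || name.includes("codex");
}

// Pick the prompt variant matching the configured model's family.
function selectPrompt(
  model: string,
  prompts: { claude: string; gpt: string },
): string {
  return isGptModel(model) ? prompts.gpt : prompts.claude;
}
```

Everything that isn't recognized as GPT falls back to the Claude prompt, which is why Claude-like models (Kimi, GLM) work without special handling.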


## Customization Guide

### How to Customize

Override in `oh-my-opencode.jsonc`:

```jsonc
{
  "agents": {
    "sisyphus": { "model": "kimi-for-coding/k2p5" },
    "prometheus": { "model": "openai/gpt-5.2" }  // Auto-switches to GPT prompt
  }
}
```

### Selection Priority

When choosing models for Claude-optimized agents:

Claude (Opus/Sonnet) > GPT (if agent has dual prompt) > Claude-like (Kimi K2.5, GLM 5)

When choosing models for GPT-native agents:

GPT (5.3-codex, 5.2) > Claude Opus (decent fallback) > Gemini (acceptable)

### Safe vs Dangerous Overrides

**Safe** (same family):

- Sisyphus: Opus → Sonnet, Kimi K2.5, GLM 5
- Prometheus: Opus → GPT-5.2 (auto-switches prompt)
- Atlas: Kimi K2.5 → Sonnet, GPT-5.2 (auto-switches)

**Dangerous** (no prompt support):

- Sisyphus → GPT: No GPT prompt. Will degrade significantly.
- Hephaestus → Claude: Built for Codex. Claude can't replicate this.
- Explore → Opus: Massive cost waste. Explore needs speed, not intelligence.
- Librarian → Opus: Same. Doc search doesn't need Opus-level reasoning.

### Provider Priority

Native (`anthropic/`, `openai/`, `google/`) > Kimi for Coding > GitHub Copilot > Venice > OpenCode Zen > Z.ai Coding Plan
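When the same model is offered by several providers, this ordering can be applied mechanically. A sketch, with provider IDs taken from the tables above (the ranking list itself is just this guide's recommendation, not a value omo exposes):

```typescript
// Provider preference from the priority list above; purely illustrative.
const PROVIDER_PRIORITY = [
  "anthropic", "openai", "google", // native APIs first
  "kimi-for-coding",
  "github-copilot",
  "venice",
  "opencode",                      // OpenCode Zen
  "zai-coding-plan",
];

// Choose the highest-priority provider among those offering the model.
function preferredProvider(offering: string[]): string | undefined {
  return PROVIDER_PRIORITY.find((p) => offering.includes(p));
}
```

For example, a model available from both `github-copilot` and `opencode` should be pulled through GitHub Copilot; native `anthropic/` access beats both.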

## See Also