
# Agent-Model Matching Guide

For agents and users: How to pick the right model for each agent. Read this before customizing model settings.

## Example Configuration

Here's a practical example configuration showing agent-model assignments:

```jsonc
{
  "$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-opencode/master/assets/oh-my-opencode.schema.json",

  "agents": {
    // Main orchestrator: Claude Opus or Kimi K2.5 work best
    "sisyphus": {
      "model": "kimi-for-coding/k2p5",
      "ultrawork": { "model": "anthropic/claude-opus-4-6", "variant": "max" }
    },

    // Research agents: cheaper models are fine
    "librarian": { "model": "zai-coding-plan/glm-4.7" },
    "explore":   { "model": "github-copilot/grok-code-fast-1" },

    // Architecture consultation: GPT or Claude Opus
    "oracle": { "model": "openai/gpt-5.2", "variant": "high" },

    // Prometheus inherits sisyphus model; just add prompt guidance
    "prometheus": { "prompt_append": "Leverage deep & quick agents heavily, always in parallel." }
  },

  "categories": {
    // quick — trivial tasks
    "quick": { "model": "opencode/gpt-5-nano" },

    // unspecified-low — moderate tasks
    "unspecified-low": { "model": "kimi-for-coding/k2p5" },

    // unspecified-high — complex work
    "unspecified-high": { "model": "anthropic/claude-sonnet-4-6", "variant": "max" },

    // visual-engineering — Gemini dominates visual tasks
    "visual-engineering": { "model": "google/gemini-3-pro", "variant": "high" },

    // writing — docs/prose
    "writing": { "model": "kimi-for-coding/k2p5" }
  },

  // Limit expensive providers; let cheap ones run freely
  "background_task": {
    "providerConcurrency": { "anthropic": 3, "openai": 3, "opencode": 10, "zai-coding-plan": 10 },
    "modelConcurrency": { "anthropic/claude-opus-4-6": 2, "opencode/gpt-5-nano": 20 }
  }
}
```

Run `opencode models` to see all available models on your system, and `opencode auth login` to authenticate with providers.
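The `providerConcurrency` idea above is a per-provider concurrency cap. As an illustration only (oh-my-opencode's actual scheduler is internal and may differ), the mechanism can be sketched as a small limiter:

```typescript
// Sketch of per-provider concurrency capping, the idea behind
// "providerConcurrency". Purely illustrative — not omo's real scheduler.
class Limiter {
  private active = 0;
  private waiters: (() => void)[] = [];

  constructor(private readonly max: number) {}

  async run<T>(task: () => Promise<T>): Promise<T> {
    // Wait until a slot frees up; re-check in a loop to avoid races.
    while (this.active >= this.max) {
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
    this.active++;
    try {
      return await task();
    } finally {
      this.active--;
      this.waiters.shift()?.(); // wake one queued task
    }
  }
}

// e.g. cap a provider's background tasks at 3 concurrent requests
const anthropicLimiter = new Limiter(3);
```

Expensive providers get small caps so a burst of background tasks queues instead of hammering the API; cheap or free tiers get large ones.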

## Model Families: Know Your Options

Not all models behave the same way. Understanding which models are "similar" helps you make safe substitutions.

### Claude-like Models (instruction-following, structured output)

These models respond similarly to Claude and work well with oh-my-opencode's Claude-optimized prompts:

| Model | Provider(s) | Notes |
| --- | --- | --- |
| Claude Opus 4.6 | anthropic, github-copilot, opencode | Best overall. Default for Sisyphus. |
| Claude Sonnet 4.6 | anthropic, github-copilot, opencode | Faster, cheaper. Good balance. |
| Claude Haiku 4.5 | anthropic, opencode | Fast and cheap. Good for quick tasks. |
| Kimi K2.5 | kimi-for-coding | Behaves very similarly to Claude. Great all-rounder. Default for Atlas. |
| Kimi K2.5 Free | opencode | Free-tier Kimi. Rate-limited but functional. |
| GLM 5 | zai-coding-plan, opencode | Claude-like behavior. Good for broad tasks. |
| Big Pickle (GLM 4.6) | opencode | Free-tier GLM. Decent fallback. |

### GPT Models (explicit reasoning, principle-driven)

GPT models need differently structured prompts. Some agents auto-detect GPT and switch prompts:

| Model | Provider(s) | Notes |
| --- | --- | --- |
| GPT-5.3-codex | openai, github-copilot, opencode | Deep coding powerhouse. Required for Hephaestus. |
| GPT-5.2 | openai, github-copilot, opencode | High intelligence. Default for Oracle. |
| GPT-5-Nano | opencode | Ultra-cheap, fast. Good for simple utility tasks. |

### Different-Behavior Models

These models have unique characteristics — don't assume they'll behave like Claude or GPT:

| Model | Provider(s) | Notes |
| --- | --- | --- |
| Gemini 3 Pro | google, github-copilot, opencode | Excels at visual/frontend tasks. Different reasoning style. |
| Gemini 3 Flash | google, github-copilot, opencode | Fast, good for doc search and light tasks. |
| MiniMax M2.5 | venice | Fast and smart. Good for utility tasks. |
| MiniMax M2.5 Free | opencode | Free-tier MiniMax. Fast for search/retrieval. |

### Speed-Focused Models

| Model | Provider(s) | Speed | Notes |
| --- | --- | --- | --- |
| Grok Code Fast 1 | github-copilot, venice | Very fast | Optimized for code grep/search. Default for Explore. |
| Claude Haiku 4.5 | anthropic, opencode | Fast | Good balance of speed and intelligence. |
| MiniMax M2.5 (Free) | opencode, venice | Fast | Smart for its speed class. |
| GPT-5.3-codex-spark | openai | Extremely fast | Blazing fast but compacts so aggressively that oh-my-opencode's context management doesn't work well with it. Not recommended for omo agents. |

## Claude-Optimized Agents

These agents have prompts tuned for Claude-family models. Use Claude > Kimi K2.5 > GLM 5 in that priority order.

| Agent | Role | Default Chain | What It Does |
| --- | --- | --- | --- |
| Sisyphus | Main ultraworker | Opus (max) → Kimi K2.5 → GLM 5 → Big Pickle | Primary coding agent. Orchestrates everything. Never use GPT — no GPT prompt exists. |
| Metis | Plan review | Opus (max) → Kimi K2.5 → GPT-5.2 → Gemini 3 Pro | Reviews Prometheus plans for gaps. |

## Dual-Prompt Agents (Claude + GPT auto-switch)

These agents detect your model family at runtime and switch to the appropriate prompt. If you have GPT access, these agents can use it effectively.

Priority: Claude > GPT > Claude-like models

| Agent | Role | Default Chain | GPT Prompt? |
| --- | --- | --- | --- |
| Prometheus | Strategic planner | Opus (max) → GPT-5.2 (high) → Kimi K2.5 → Gemini 3 Pro | Yes — XML-tagged, principle-driven (~300 lines vs ~1,100 for Claude) |
| Atlas | Todo orchestrator | Kimi K2.5 → Sonnet → GPT-5.2 | Yes — GPT-optimized todo management |

## GPT-Native Agents

These agents are built for GPT. Don't override to Claude.

| Agent | Role | Default Chain | Notes |
| --- | --- | --- | --- |
| Hephaestus | Deep autonomous worker | GPT-5.3-codex (medium) only | "Codex on steroids." No fallback. Requires GPT access. |
| Oracle | Architecture/debugging | GPT-5.2 (high) → Gemini 3 Pro → Opus | High-IQ strategic backup. GPT preferred. |
| Momus | High-accuracy reviewer | GPT-5.2 (medium) → Opus → Gemini 3 Pro | Verification agent. GPT preferred. |

## Utility Agents (Speed > Intelligence)

These agents do search, grep, and retrieval. They intentionally use fast, cheap models. Don't "upgrade" them to Opus — it wastes tokens on simple tasks.

| Agent | Role | Default Chain | Design Rationale |
| --- | --- | --- | --- |
| Explore | Fast codebase grep | MiniMax M2.5 Free → Grok Code Fast → MiniMax M2.5 → Haiku → GPT-5-Nano | Speed is everything. Grok is blazing fast for grep. |
| Librarian | Docs/code search | MiniMax M2.5 Free → Gemini Flash → Big Pickle | Entirely free-tier. Doc retrieval doesn't need deep reasoning. |
| Multimodal Looker | Vision/screenshots | Kimi K2.5 → Kimi Free → Gemini Flash → GPT-5.2 → GLM-4.6v | Kimi excels at multimodal understanding. |

## Task Categories

Categories control which model is used for `background_task` and `delegate_task`. See the Orchestration System Guide for how agents dispatch tasks to categories.

| Category | When Used | Recommended Models | Notes |
| --- | --- | --- | --- |
| `visual-engineering` | Frontend, UI, CSS, design | Gemini 3 Pro (high) → GLM 5 → Opus → Kimi K2.5 | Gemini dominates visual tasks |
| `ultrabrain` | Maximum reasoning needed | GPT-5.3-codex (xhigh) → Gemini 3 Pro → Opus | Highest intelligence available |
| `deep` | Deep coding, complex logic | GPT-5.3-codex (medium) → Opus → Gemini 3 Pro | Requires GPT availability |
| `artistry` | Creative, novel approaches | Gemini 3 Pro (high) → Opus → GPT-5.2 | Requires Gemini availability |
| `quick` | Simple, fast tasks | Haiku → Gemini Flash → GPT-5-Nano | Cheapest and fastest |
| `unspecified-high` | General complex work | Opus (max) → GPT-5.2 (high) → Gemini 3 Pro | Default when no category fits |
| `unspecified-low` | General standard work | Sonnet → GPT-5.3-codex (medium) → Gemini Flash | Everyday tasks |
| `writing` | Text, docs, prose | Kimi K2.5 → Gemini Flash → Sonnet | Kimi produces best prose |
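The "Recommended Models" column is a fallback chain: each entry is tried in order until one's provider is available on your machine. A minimal sketch of that resolution, assuming model IDs of the form `provider/name` (the chain contents and availability check here are illustrative, not omo's actual selection logic):

```typescript
// Illustrative fallback-chain resolution; the real logic in
// oh-my-opencode may also weigh variants, auth state, and rate limits.
const QUICK_CHAIN = [
  "anthropic/claude-haiku-4-5",
  "google/gemini-3-flash",
  "opencode/gpt-5-nano",
];

// Pick the first model whose provider is available on this machine.
function resolveModel(
  chain: string[],
  providers: Set<string>,
): string | undefined {
  return chain.find((id) => providers.has(id.split("/")[0]));
}
```

With only `opencode` authenticated, the `quick` chain above would land on `opencode/gpt-5-nano`; adding `anthropic` moves it to Haiku.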

## Why Different Models Need Different Prompts

Claude and GPT models have fundamentally different instruction-following behaviors:

- **Claude models** respond well to mechanics-driven prompts — detailed checklists, templates, step-by-step procedures. More rules = more compliance.
- **GPT models** (especially 5.2+) respond better to principle-driven prompts — concise principles, XML-tagged structure, explicit decision criteria. More rules = more contradiction surface = more drift.

Key insight from Codex Plan Mode analysis:

- Codex Plan Mode achieves with 3 principles in ~121 lines what Prometheus's Claude prompt needs ~1,100 lines across 7 files to accomplish
- The core concept is "Decision Complete" — a plan must leave ZERO decisions to the implementer
- GPT follows this literally when stated as a principle; Claude needs enforcement mechanisms

This is why Prometheus and Atlas ship separate prompts per model family — they auto-detect and switch at runtime via `isGptModel()`.
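A minimal sketch of that runtime switch — the shipped `isGptModel()` lives inside oh-my-opencode and may match more families than this illustrative version:

```typescript
// Illustrative model-family check; the real isGptModel() may differ.
function isGptModel(model: string): boolean {
  // Model IDs look like "provider/name", e.g. "openai/gpt-5.2".
  const name = model.split("/").pop() ?? "";
  return name.startsWith("gpt-") || name.includes("codex");
}

// Pick the prompt variant matching the configured model's family.
function selectPrompt(
  model: string,
  prompts: { claude: string; gpt: string },
): string {
  return isGptModel(model) ? prompts.gpt : prompts.claude;
}
```

Everything that isn't recognized as GPT falls back to the Claude prompt, which is why Claude-like models (Kimi, GLM) work without special handling.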


## Customization Guide

### How to Customize

Override in `oh-my-opencode.jsonc`:

```jsonc
{
  "agents": {
    "sisyphus": { "model": "kimi-for-coding/k2p5" },
    "prometheus": { "model": "openai/gpt-5.2" }  // Auto-switches to GPT prompt
  }
}
```

### Selection Priority

When choosing models for Claude-optimized agents:

Claude (Opus/Sonnet) > GPT (if agent has dual prompt) > Claude-like (Kimi K2.5, GLM 5)

When choosing models for GPT-native agents:

GPT (5.3-codex, 5.2) > Claude Opus (decent fallback) > Gemini (acceptable)

### Safe vs Dangerous Overrides

**Safe** (same family):

- Sisyphus: Opus → Sonnet, Kimi K2.5, GLM 5
- Prometheus: Opus → GPT-5.2 (auto-switches prompt)
- Atlas: Kimi K2.5 → Sonnet, GPT-5.2 (auto-switches)

**Dangerous** (no prompt support):

- Sisyphus → GPT: No GPT prompt. Will degrade significantly.
- Hephaestus → Claude: Built for Codex. Claude can't replicate this.
- Explore → Opus: Massive cost waste. Explore needs speed, not intelligence.
- Librarian → Opus: Same. Doc search doesn't need Opus-level reasoning.

### Provider Priority

Native (`anthropic/`, `openai/`, `google/`) > Kimi for Coding > GitHub Copilot > Venice > OpenCode Zen > Z.ai Coding Plan
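When the same model is offered by several providers, this ordering can be applied mechanically. A sketch, with provider IDs taken from the tables above (the ranking list itself is just this guide's recommendation, not a value omo exposes):

```typescript
// Provider preference from the priority list above; purely illustrative.
const PROVIDER_PRIORITY = [
  "anthropic", "openai", "google", // native APIs first
  "kimi-for-coding",
  "github-copilot",
  "venice",
  "opencode",                      // OpenCode Zen
  "zai-coding-plan",
];

// Choose the highest-priority provider among those offering the model.
function preferredProvider(offering: string[]): string | undefined {
  return PROVIDER_PRIORITY.find((p) => offering.includes(p));
}
```

For example, a model available from both `github-copilot` and `opencode` should be pulled through GitHub Copilot; native `anthropic/` access beats both.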

## See Also