From c25dbb94b299605feb7d0633103d1f178664edd3 Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Mon, 16 Mar 2026 21:08:07 +0900 Subject: [PATCH] docs: audit and update agent lists, models, and fallback chains - Update README.md to prioritize Primary Agents (Sisyphus, Hephaestus, Prometheus, Atlas, Junior) - Update overview.md and features.md to distinguish Primary Agents from Specialist Subagents - Update Librarian and Multimodal-Looker models in docs to match source code fallback chains - Ensure accuracy of agent descriptions and roles --- README.md | 7 ++++++- docs/guide/installation.md | 4 ++-- docs/guide/overview.md | 24 ++++++++++++++---------- docs/reference/features.md | 35 +++++++++++++++-------------------- 4 files changed, 37 insertions(+), 33 deletions(-) diff --git a/README.md b/README.md index ac2f9a93a..15ec9b619 100644 --- a/README.md +++ b/README.md @@ -158,6 +158,10 @@ Even only with following subscriptions, ultrawork will work well (this project i **Prometheus** (`claude-opus-4-6` / **`kimi-k2.5`** / **`glm-5`** ) is your strategic planner. Interview mode: it questions, identifies scope, and builds a detailed plan before a single line of code is touched. +**Atlas** (`claude-sonnet-4-6`) is the executor. He takes the plan from Prometheus and drives it to completion, managing the todo list and coordinating subagents. + +**Sisyphus-Junior** is the dedicated executor for category-based tasks. + Every agent is tuned to its model's specific strengths. No manual model-juggling. [Learn more →](docs/guide/overview.md) > Anthropic [blocked OpenCode because of us.](https://x.com/thdxr/status/2010149530486911014) That's why Hephaestus is called "The Legitimate Craftsman." The irony is intentional. @@ -296,7 +300,8 @@ Features you'll think should've always existed. Once you use them, you can't go See full [Features Documentation](docs/reference/features.md). **Quick Overview:** -- **Agents**: Sisyphus (the main agent), Prometheus (planner), Oracle (architecture/debugging), Librarian (docs/code search), Explore (fast codebase grep), Multimodal Looker +- **Primary Agents**: Sisyphus (the main agent), Hephaestus (deep worker), Prometheus (planner), Atlas (executor), Sisyphus-Junior (category executor) +- **Specialist Subagents**: Oracle (architecture/debugging), Librarian (docs/code search), Explore (fast codebase grep), Multimodal Looker (vision) - **Background Agents**: Run multiple agents in parallel like a real dev team - **LSP & AST Tools**: Refactoring, rename, diagnostics, AST-aware code search - **Hash-anchored Edit Tool**: `LINE#ID` references validate content before applying every change. Surgical edits, zero stale-line errors diff --git a/docs/guide/installation.md b/docs/guide/installation.md index 2c18c1a49..a1e653d2b 100644 --- a/docs/guide/installation.md +++ b/docs/guide/installation.md @@ -344,8 +344,8 @@ These agents do search, grep, and retrieval. They intentionally use fast, cheap | Agent | Role | Default Chain | Design Rationale | | --------------------- | ------------------ | ---------------------------------------------------------------------- | -------------------------------------------------------------- | | **Explore** | Fast codebase grep | MiniMax M2.5 Free → Grok Code Fast → MiniMax M2.5 → Haiku → GPT-5-Nano | Speed is everything. Grok is blazing fast for grep. | -| **Librarian** | Docs/code search | MiniMax M2.5 Free → Gemini Flash → Big Pickle | Entirely free-tier. Doc retrieval doesn't need deep reasoning. | -| **Multimodal Looker** | Vision/screenshots | Kimi K2.5 → Kimi Free → Gemini Flash → GPT-5.4 → GLM-4.6v | Kimi excels at multimodal understanding. | +| **Librarian** | Docs/code search | MiniMax M2.5 → MiniMax Free → Haiku → Nano | Fast, cheap models for search. | +| **Multimodal Looker** | Vision/screenshots | GPT-5.4 → Kimi K2.5 → GLM-4.6v → GPT-5-Nano | Strong vision capabilities. | #### Why Different Models Need Different Prompts diff --git a/docs/guide/overview.md b/docs/guide/overview.md index ef1d2aa9a..6c3da841c 100644 --- a/docs/guide/overview.md +++ b/docs/guide/overview.md @@ -60,10 +60,11 @@ User Request ↓ ├─→ [Prometheus] — Strategic planning (interview mode) ├─→ [Atlas] — Todo orchestration and execution - ├─→ [Oracle] — Architecture consultation - ├─→ [Librarian] — Documentation/code search - ├─→ [Explore] — Fast codebase grep - └─→ [Category-based agents] — Specialized by task type + ├─→ [Specialist Subagents] + │ ├─→ [Oracle] — Architecture consultation + │ ├─→ [Librarian] — Documentation/code search + │ └─→ [Explore] — Fast codebase grep + └─→ [Sisyphus-Junior] — Category-based executor ``` When Sisyphus delegates to a subagent, it doesn't pick a model name. It picks a **category** — `visual-engineering`, `ultrabrain`, `quick`, `deep`. The category automatically maps to the right model. You touch nothing. @@ -116,17 +117,20 @@ Atlas executes Prometheus plans. Distributes tasks to specialized subagents. Acc Run `/start-work` to activate Atlas on your latest plan. -### Oracle: The Consultant +### Sisyphus-Junior: The Specialist -Read-only high-IQ consultant for architecture decisions and complex debugging. Consult Oracle when facing unfamiliar patterns, security concerns, or multi-system tradeoffs. +When Sisyphus delegates a task via a specific **Category** (like `visual-engineering` or `deep`), **Sisyphus-Junior** is the agent that performs it. It is optimized for focused execution within a specific domain and cannot re-delegate, preventing infinite loops. -### Supporting Cast +### Specialist Subagents +These agents are primarily designed to be called by other agents or for specific queries, rather than managing a full workflow. + +- **Oracle** — Read-only high-IQ consultant for architecture decisions and complex debugging. +- **Librarian** — Documentation and OSS code search. Stays current on library APIs and best practices. +- **Explore** — Fast codebase grep. Uses speed-focused models for pattern discovery. +- **Multimodal Looker** — Vision and screenshot analysis. - **Metis** — Gap analyzer. Catches what Prometheus missed before plans are finalized. - **Momus** — Ruthless reviewer. Validates plans against clarity, verification, and context criteria. -- **Explore** — Fast codebase grep. Uses speed-focused models for pattern discovery. -- **Librarian** — Documentation and OSS code search. Stays current on library APIs and best practices. -- **Multimodal Looker** — Vision and screenshot analysis. --- diff --git a/docs/reference/features.md b/docs/reference/features.md index 9287f3522..5c143f2c3 100644 --- a/docs/reference/features.md +++ b/docs/reference/features.md @@ -4,31 +4,26 @@ Oh-My-OpenCode provides 11 specialized AI agents. Each has distinct expertise, optimized models, and tool permissions. -### Core Agents +### Primary Agents | Agent | Model | Purpose | | --------------------- | ------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| **Sisyphus** | `claude-opus-4-6` | The default orchestrator. Plans, delegates, and executes complex tasks using specialized subagents with aggressive parallel execution. Todo-driven workflow with extended thinking (32k budget). Fallback: `glm-5` → `big-pickle`. | -| **Hephaestus** | `gpt-5.3-codex` | The Legitimate Craftsman. Autonomous deep worker inspired by AmpCode's deep mode. Goal-oriented execution with thorough research before action. Explores codebase patterns, completes tasks end-to-end without premature stopping. Named after the Greek god of forge and craftsmanship. Fallback: `gpt-5.4` on GitHub Copilot. Requires a GPT-capable provider. | -| **Oracle** | `gpt-5.4` | Architecture decisions, code review, debugging. Read-only consultation with stellar logical reasoning and deep analysis. Inspired by AmpCode. Fallback: `gemini-3.1-pro` → `claude-opus-4-6`. | -| **Librarian** | `gemini-3-flash` | Multi-repo analysis, documentation lookup, OSS implementation examples. Deep codebase understanding with evidence-based answers. Fallback: `minimax-m2.5-free` → `big-pickle`. | -| **Explore** | `grok-code-fast-1` | Fast codebase exploration and contextual grep. Fallback: `minimax-m2.5-free` → `claude-haiku-4-5` → `gpt-5-nano`. | -| **Multimodal-Looker** | `gpt-5.3-codex` | Visual content specialist. Analyzes PDFs, images, diagrams to extract information. Fallback: `k2p5` → `gemini-3-flash` → `glm-4.6v` → `gpt-5-nano`. | +| **Sisyphus** | `claude-opus-4-6` | The default orchestrator. Plans, delegates, and executes complex tasks using specialized subagents with aggressive parallel execution. Todo-driven workflow with extended thinking (32k budget). Fallback: `kimi-k2.5` → `glm-5`. | +| **Hephaestus** | `gpt-5.3-codex` | The Legitimate Craftsman. Autonomous deep worker. Goal-oriented execution with thorough research before action. Explores codebase patterns, completes tasks end-to-end. Fallback: `gpt-5.4` on GitHub Copilot. Requires a GPT-capable provider. | +| **Prometheus** | `claude-opus-4-6` | Strategic planner with interview mode. Creates detailed work plans through iterative questioning. Fallback: `gpt-5.4` → `gemini-3.1-pro`. | +| **Atlas** | `claude-sonnet-4-6`| Executor. Takes the plan from Prometheus and drives it to completion, managing the todo list and coordinating subagents. Fallback: `gpt-5.4` (medium). | +| **Sisyphus-Junior** | _(category-dependent)_ | Category-spawned executor. Model is selected automatically based on the task category. Used when the main agent delegates work via the `task` tool. | -### Planning Agents +### Specialist Subagents -| Agent | Model | Purpose | -| -------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- | -| **Prometheus** | `claude-opus-4-6` | Strategic planner with interview mode. Creates detailed work plans through iterative questioning. Fallback: `gpt-5.4` → `gemini-3.1-pro`. | -| **Metis** | `claude-opus-4-6` | Plan consultant — pre-planning analysis. Identifies hidden intentions, ambiguities, and AI failure points. Fallback: `gpt-5.4` → `gemini-3.1-pro`. | -| **Momus** | `gpt-5.4` | Plan reviewer — validates plans against clarity, verifiability, and completeness standards. Fallback: `claude-opus-4-6` → `gemini-3.1-pro`. | - -### Orchestration Agents - -| Agent | Model | Purpose | -| ------------------- | ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| **Atlas** | `claude-sonnet-4-6` | Todo-list orchestrator. Executes planned tasks systematically, managing todo items and coordinating work. Fallback: `gpt-5.4` (medium). | -| **Sisyphus-Junior** | _(category-dependent)_ | Category-spawned executor. Model is selected automatically based on the task category (visual-engineering, quick, deep, etc.). Used when the main agent delegates work via the `task` tool. | +| Agent | Model | Purpose | +| --------------------- | ------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------- | +| **Oracle** | `gpt-5.4` | Architecture decisions, code review, debugging. Read-only consultation. Fallback: `gemini-3.1-pro` → `claude-opus-4-6`. | +| **Librarian** | `minimax-m2.5` | Multi-repo analysis, documentation lookup, OSS implementation examples. Fallback: `minimax-m2.5-free` → `claude-haiku-4-5` → `gpt-5-nano`. | +| **Explore** | `grok-code-fast-1` | Fast codebase exploration and contextual grep. Fallback: `minimax-m2.5` → `minimax-m2.5-free` → `claude-haiku-4-5`. | +| **Multimodal-Looker** | `gpt-5.4` | Visual content specialist. Analyzes PDFs, images, diagrams. Fallback: `kimi-k2.5` → `glm-4.6v` → `gpt-5-nano`. | +| **Metis** | `claude-opus-4-6` | Plan consultant — pre-planning analysis. Identifies hidden intentions, ambiguities, and AI failure points. Fallback: `gpt-5.4` → `gemini-3.1-pro`. | +| **Momus** | `gpt-5.4` | Plan reviewer — validates plans against clarity, verifiability, and completeness standards. Fallback: `claude-opus-4-6` → `gemini-3.1-pro`. | ### Invoking Agents