docs: update hephaestus default model references from gpt-5.3-codex to gpt-5.4

Updated across README (all locales), docs/guide/, docs/reference/, docs/examples/, AGENTS.md files, and test expectations/snapshots. The deep category and multimodal-looker still use gpt-5.3-codex, since they are configured separately from the hephaestus agent.
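
Anyone who wants to keep Hephaestus on the previous model can still set an explicit override. A minimal sketch, assuming agent overrides sit under a top-level `agents` key (that wrapper key does not appear in this diff and is an assumption); the `hephaestus` entry and the model id format follow docs/examples/:

```jsonc
{
  // Assumed wrapper key; only the "hephaestus" entry mirrors docs/examples/.
  "agents": {
    "hephaestus": {
      // Pin the pre-change model instead of the new gpt-5.4 default
      "model": "openai/gpt-5.3-codex"
    }
  }
}
```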
2026-03-26 19:25:26 +09:00
parent d57ed97386
commit d39891fcab
18 changed files with 44 additions and 45 deletions
--- a/docs/examples/coding-focused.jsonc
+++ b/docs/examples/coding-focused.jsonc
@@ -14,7 +14,7 @@

    // Heavy lifter: maximum autonomy for coding tasks
    "hephaestus": {
-      "model": "openai/gpt-5.3-codex",
+      "model": "openai/gpt-5.4",
      "prompt_append": "You are the primary implementation agent. Own the codebase. Explore, decide, execute. Use LSP and AST-grep aggressively.",
      "permission": { "edit": "allow", "bash": { "git": "allow", "test": "allow" } },
    },
--- a/docs/examples/default.jsonc
+++ b/docs/examples/default.jsonc
@@ -13,7 +13,7 @@

    // Deep autonomous worker: end-to-end implementation
    "hephaestus": {
-      "model": "openai/gpt-5.3-codex",
+      "model": "openai/gpt-5.4",
      "prompt_append": "Explore thoroughly, then implement. Prefer small, testable changes.",
    },

--- a/docs/examples/planning-focused.jsonc
+++ b/docs/examples/planning-focused.jsonc
@@ -14,7 +14,7 @@

    // Implementation: uses planning outputs
    "hephaestus": {
-      "model": "openai/gpt-5.3-codex",
+      "model": "openai/gpt-5.4",
      "prompt_append": "Follow established plans precisely. Ask for clarification when plans are ambiguous.",
    },

--- a/docs/guide/agent-model-matching.md
+++ b/docs/guide/agent-model-matching.md
@@ -27,7 +27,7 @@ Using Sisyphus with older GPT models would be like taking your best project mana

 Hephaestus is the developer who stays in their room coding all day. Doesn't talk much. Might seem socially awkward. But give them a hard technical problem and they'll emerge three hours later with a solution nobody else could have found.

-**This is why Hephaestus uses GPT-5.3 Codex.** Codex is built for exactly this:
+**This is why Hephaestus uses GPT-5.4.** GPT-5.4 is built for exactly this:

 - Deep, autonomous exploration without hand-holding
 - Multi-file reasoning across complex codebases
@@ -82,7 +82,7 @@ These agents are built for GPT's principle-driven style. Their prompts assume au

 | Agent          | Role                    | Fallback Chain                         | Notes                                            |
 | -------------- | ----------------------- | -------------------------------------- | ------------------------------------------------ |
-| **Hephaestus** | Autonomous deep worker  | GPT-5.3 Codex → GPT-5.4 (Copilot)     | Requires GPT access. GPT-5.4 via Copilot as fallback. The craftsman. |
+| **Hephaestus** | Autonomous deep worker  | GPT-5.4                               | Requires GPT access. The craftsman. |
 | **Oracle**     | Architecture consultant | GPT-5.4 → Gemini 3.1 Pro → Claude Opus → opencode-go/glm-5 | Read-only high-IQ consultation.                  |
 | **Momus**      | Ruthless reviewer       | GPT-5.4 → Claude Opus → Gemini 3.1 Pro → opencode-go/glm-5 | Verification and plan review. GPT-5.4 uses xhigh variant. |

@@ -119,7 +119,7 @@ Principle-driven, explicit reasoning, deep technical capability. Best for agents

 | Model             | Strengths                                                                                       |
 | ----------------- | ----------------------------------------------------------------------------------------------- |
-| **GPT-5.3 Codex** | Deep coding powerhouse. Autonomous exploration. Required for Hephaestus.                        |
+| **GPT-5.3 Codex** | Deep coding powerhouse. Autonomous exploration. Still available for deep category and explicit overrides. |
 | **GPT-5.4**       | High intelligence, strategic reasoning. Default for Oracle, Momus, and a key fallback for Prometheus / Atlas. Uses xhigh variant for Momus. |
 | **GPT-5.4 Mini**  | Fast + strong reasoning. Good for lightweight autonomous tasks. Default for quick category. |
 | **GPT-5-Nano**    | Ultra-cheap, fast. Good for simple utility tasks.                                               |
--- a/docs/guide/installation.md
+++ b/docs/guide/installation.md
@@ -285,7 +285,7 @@ Not all models behave the same way. Understanding which models are "similar" hel

 | Model             | Provider(s)                      | Notes                                             |
 | ----------------- | -------------------------------- | ------------------------------------------------- |
-| **GPT-5.3-codex** | openai, github-copilot, opencode | Deep coding powerhouse. Required for Hephaestus.  |
+| **GPT-5.3-codex** | openai, github-copilot, opencode | Deep coding powerhouse. Still available for deep category and explicit overrides. |
 | **GPT-5.4**       | openai, github-copilot, opencode | High intelligence. Default for Oracle.            |
 | **GPT-5.4 Mini**  | openai, github-copilot, opencode | Fast + strong reasoning. Default for quick category.     |
 | **GPT-5-Nano**    | opencode                         | Ultra-cheap, fast. Good for simple utility tasks. |
@@ -334,7 +334,7 @@ Priority: **Claude > GPT > Claude-like models**

 | Agent          | Role                   | Default Chain                          | Notes                                                  |
 | -------------- | ---------------------- | -------------------------------------- | ------------------------------------------------------ |
-| **Hephaestus** | Deep autonomous worker | GPT-5.3-codex (medium) only            | "Codex on steroids." No fallback. Requires GPT access. |
+| **Hephaestus** | Deep autonomous worker | GPT-5.4 (medium) only                  | "Codex on steroids." No fallback. Requires GPT access. |
 | **Oracle**     | Architecture/debugging | GPT-5.4 (high) → Gemini 3.1 Pro → Opus  | High-IQ strategic backup. GPT preferred.               |
 | **Momus**      | High-accuracy reviewer | GPT-5.4 (medium) → Opus → Gemini 3.1 Pro | Verification agent. GPT preferred.                     |

--- a/docs/guide/orchestration.md
+++ b/docs/guide/orchestration.md
@@ -420,7 +420,7 @@ Atlas is automatically activated when you run `/start-work`. You don't need to m

 | Aspect          | Hephaestus                                 | Sisyphus + `ulw` / `ultrawork`                       |
 | --------------- | ------------------------------------------ | ---------------------------------------------------- |
-| **Model**       | GPT-5.3 Codex (medium reasoning)           | Claude Opus 4.6 / GPT-5.4 / GLM 5 depending on setup |
+| **Model**       | GPT-5.4 (medium reasoning)                 | Claude Opus 4.6 / GPT-5.4 / GLM 5 depending on setup |
 | **Approach**    | Autonomous deep worker                     | Keyword-activated ultrawork mode                     |
 | **Best For**    | Complex architectural work, deep reasoning | General complex tasks, "just do it" scenarios        |
 | **Planning**    | Self-plans during execution                | Uses Prometheus plans if available                   |
@@ -443,8 +443,8 @@ Switch to Hephaestus (Tab → Select Hephaestus) when:
   - "Integrate our Rust core with the TypeScript frontend"
   - "Migrate from MongoDB to PostgreSQL with zero downtime"

-4. **You specifically want GPT-5.3 Codex reasoning**
-   - Some problems benefit from GPT-5.3 Codex's training characteristics
+4. **You specifically want GPT-5.4 reasoning**
+   - Some problems benefit from GPT-5.4's training characteristics

 **When to Use Sisyphus + `ulw`:**

@@ -469,7 +469,7 @@ Use the `ulw` keyword in Sisyphus when:
 **Recommendation:**

 - **For most users**: Use `ulw` keyword in Sisyphus. It's the default path and works excellently for 90% of complex tasks.
-- **For power users**: Switch to Hephaestus when you specifically need GPT-5.3 Codex's reasoning style or want the "AmpCode deep mode" experience of fully autonomous exploration and execution.
+- **For power users**: Switch to Hephaestus when you specifically need GPT-5.4's reasoning style or want the "AmpCode deep mode" experience of fully autonomous exploration and execution.

 ---

@@ -520,7 +520,7 @@ Type `exit` or start a new session. Atlas is primarily entered via `/start-work`

 **For most tasks**: Type `ulw` in Sisyphus.

-**Use Hephaestus when**: You specifically need GPT-5.3 Codex's reasoning style for deep architectural work or complex debugging.
+**Use Hephaestus when**: You specifically need GPT-5.4's reasoning style for deep architectural work or complex debugging.

 ---

--- a/docs/guide/overview.md
+++ b/docs/guide/overview.md
@@ -93,9 +93,9 @@ Sisyphus still works best on Claude-family models, Kimi, and GLM. GPT-5.4 now ha

 Named with intentional irony. Anthropic blocked OpenCode from using their API because of this project. So the team built an autonomous GPT-native agent instead.

-Hephaestus runs on GPT-5.3 Codex. Give him a goal, not a recipe. He explores the codebase, researches patterns, and executes end-to-end without hand-holding. He is the legitimate craftsman because he was born from necessity, not privilege.
+Hephaestus runs on GPT-5.4. Give him a goal, not a recipe. He explores the codebase, researches patterns, and executes end-to-end without hand-holding. He is the legitimate craftsman because he was born from necessity, not privilege.

-Use Hephaestus when you need deep architectural reasoning, complex debugging across many files, or cross-domain knowledge synthesis. Switch to him explicitly when the work demands GPT-5.3 Codex's particular strengths.
+Use Hephaestus when you need deep architectural reasoning, complex debugging across many files, or cross-domain knowledge synthesis. Switch to him explicitly when the work demands GPT-5.4's particular strengths.

 **Why this beats vanilla Codex CLI:**

@@ -214,8 +214,7 @@ You can override specific agents or categories in your config:

 **GPT models** (explicit reasoning, principle-driven):

-- GPT-5.3-codex — deep coding powerhouse, required for Hephaestus
-- GPT-5.4 — high intelligence, default for Oracle
+- GPT-5.4 — deep coding powerhouse, required for Hephaestus and default for Oracle
 - GPT-5-Nano — ultra-cheap, fast utility tasks

 **Different-behavior models**:
--- a/docs/reference/configuration.md
+++ b/docs/reference/configuration.md
@@ -268,7 +268,7 @@ Disable categories: `{ "disabled_categories": ["ultrabrain"] }`
 | Agent                 | Default Model       | Provider Priority                                                            |
 | --------------------- | ------------------- | ---------------------------------------------------------------------------- |
 | **Sisyphus**          | `claude-opus-4-6`   | `claude-opus-4-6` → `glm-5` → `big-pickle`                                   |
-| **Hephaestus**        | `gpt-5.3-codex`     | `gpt-5.3-codex` → `gpt-5.4` (GitHub Copilot fallback)                        |
+| **Hephaestus**        | `gpt-5.4`           | `gpt-5.4`                                                                    |
 | **oracle**            | `gpt-5.4`           | `gpt-5.4` → `gemini-3.1-pro` → `claude-opus-4-6`                             |
 | **librarian**         | `minimax-m2.7`      | `minimax-m2.7` → `minimax-m2.7-highspeed` → `claude-haiku-4-5` → `gpt-5-nano` |
 | **explore**           | `grok-code-fast-1`  | `grok-code-fast-1` → `minimax-m2.7-highspeed` → `minimax-m2.7` → `claude-haiku-4-5` → `gpt-5-nano` |
--- a/docs/reference/features.md
+++ b/docs/reference/features.md
@@ -9,7 +9,7 @@ Oh-My-OpenAgent provides 11 specialized AI agents. Each has distinct expertise,
 | Agent                 | Model              | Purpose                                                                                                                                                                                                                                                                                                                                                          |
 | --------------------- | ------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | **Sisyphus**          | `claude-opus-4-6`  | The default orchestrator. Plans, delegates, and executes complex tasks using specialized subagents with aggressive parallel execution. Todo-driven workflow with extended thinking (32k budget). Fallback: `glm-5` → `big-pickle`.                                                                                                                               |
-| **Hephaestus**        | `gpt-5.3-codex`    | The Legitimate Craftsman. Autonomous deep worker inspired by AmpCode's deep mode. Goal-oriented execution with thorough research before action. Explores codebase patterns, completes tasks end-to-end without premature stopping. Named after the Greek god of forge and craftsmanship. Fallback: `gpt-5.4` on GitHub Copilot. Requires a GPT-capable provider. |
+| **Hephaestus**        | `gpt-5.4`          | The Legitimate Craftsman. Autonomous deep worker inspired by AmpCode's deep mode. Goal-oriented execution with thorough research before action. Explores codebase patterns, completes tasks end-to-end without premature stopping. Named after the Greek god of forge and craftsmanship. Requires a GPT-capable provider. |
 | **Oracle**            | `gpt-5.4`          | Architecture decisions, code review, debugging. Read-only consultation with stellar logical reasoning and deep analysis. Inspired by AmpCode. Fallback: `gemini-3.1-pro` → `claude-opus-4-6`.                                                                                                                                                                    |
 | **Librarian**         | `minimax-m2.7`     | Multi-repo analysis, documentation lookup, OSS implementation examples. Deep codebase understanding with evidence-based answers. Fallback: `minimax-m2.7-highspeed` → `claude-haiku-4-5` → `gpt-5-nano`.                                                                                                                                                         |
 | **Explore**           | `grok-code-fast-1` | Fast codebase exploration and contextual grep. Fallback: `minimax-m2.7-highspeed` → `minimax-m2.7` → `claude-haiku-4-5` → `gpt-5-nano`.                                                                                                                                                                                                                          |