fix: convert executeSyncTask to async prompt + polling pattern

Oracle agent (and all sync subagent tasks) fails with a JSON Parse error in ACP environments because session.prompt() (blocking HTTP) returns empty or incomplete responses.

Replace promptSyncWithModelSuggestionRetry with promptWithModelSuggestionRetry (async, fire-and-forget) and add a polling loop that waits for response stability, matching the proven pattern from executeUnstableAgentTask.

Fixes #1681
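A minimal sketch of the fire-and-forget plus polling pattern described above. It is not the code from this change: `createStabilityTracker`, `promptAndWaitForStability`, `sendPrompt`, `countMessages`, and the option names are all hypothetical, and the stability rule used here (a non-zero message count that stays unchanged for several consecutive polls) is an assumption about what "response stability" means.

```typescript
// Hypothetical sketch: none of these names come from the plugin or the SDK.

/**
 * Returns an observer that reports true once the message count is non-zero
 * and has stayed unchanged for `requiredStablePolls` consecutive polls.
 */
function createStabilityTracker(requiredStablePolls: number): (messageCount: number) => boolean {
  let lastCount = -1
  let stablePolls = 0
  return (messageCount: number): boolean => {
    if (messageCount > 0 && messageCount === lastCount) {
      stablePolls += 1
    } else {
      // Count changed (or nothing has arrived yet): restart the stability window.
      stablePolls = 0
      lastCount = messageCount
    }
    return stablePolls >= requiredStablePolls
  }
}

interface PollOptions {
  intervalMs: number
  timeoutMs: number
  requiredStablePolls: number
}

/**
 * Sends the prompt without relying on its response body, then polls the
 * session until the output settles. Resolves to false on timeout.
 */
async function promptAndWaitForStability(
  sendPrompt: () => Promise<void>,      // assumed to resolve once the request is accepted
  countMessages: () => Promise<number>, // assumed to return the session's current message count
  options: PollOptions,
): Promise<boolean> {
  await sendPrompt()
  const observe = createStabilityTracker(options.requiredStablePolls)
  const deadline = Date.now() + options.timeoutMs
  while (Date.now() < deadline) {
    await new Promise<void>((resolve) => setTimeout(resolve, options.intervalMs))
    if (observe(await countMessages())) return true
  }
  return false
}
```

Keeping the stability check in a small pure function separates the timing logic from the decision logic, so the decision logic can be tested without timers or a live session.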
@mrm007 has signed the CLA in code-yeongyu/oh-my-opencode#1680
2026-02-09 10:03:54 +09:00 · 2026-02-08 21:41:45 +00:00 · 2026-02-08 17:12:45 +00:00 · 2026-02-08 16:02:43 +00:00 · 2026-02-08 15:44:17 +00:00 · 2026-02-08 20:00:52 +09:00
113 changed files with 6324 additions and 2283 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,7 +1,7 @@
 # PROJECT KNOWLEDGE BASE

-**Generated:** 2026-02-06T18:30:00+09:00
-**Commit:** c6c149e
+**Generated:** 2026-02-08T16:45:00+09:00
+**Commit:** edee865f
 **Branch:** dev

 ---
@@ -135,8 +135,8 @@ oh-my-opencode/
 │   ├── cli/              # CLI installer, doctor - see src/cli/AGENTS.md
 │   ├── mcp/              # Built-in MCPs - see src/mcp/AGENTS.md
 │   ├── config/           # Zod schema (schema.ts 455 lines), TypeScript types
-│   ├── plugin-handlers/  # Plugin config loading (config-handler.ts 501 lines)
-│   ├── index.ts          # Main plugin entry (924 lines)
+│   ├── plugin-handlers/  # Plugin config loading (config-handler.ts 562 lines)
+│   ├── index.ts          # Main plugin entry (999 lines)
 │   ├── plugin-config.ts  # Config loading orchestration
 │   └── plugin-state.ts   # Model cache state
 ├── script/               # build-schema.ts, build-binaries.ts, publish.ts
@@ -170,7 +170,7 @@ oh-my-opencode/
 **Rules:**
 - NEVER write implementation before test
 - NEVER delete failing tests - fix the code
- Test file: `*.test.ts` alongside source (100+ test files)
+- Test file: `*.test.ts` alongside source (163 test files)
 - BDD comments: `//#given`, `//#when`, `//#then`

 ## CONVENTIONS
@@ -180,7 +180,7 @@ oh-my-opencode/
 - **Build**: `bun build` (ESM) + `tsc --emitDeclarationOnly`
 - **Exports**: Barrel pattern via index.ts
 - **Naming**: kebab-case dirs, `createXXXHook`/`createXXXTool` factories
- **Testing**: BDD comments, 100+ test files
+- **Testing**: BDD comments, 163 test files
 - **Temperature**: 0.1 for code agents, max 0.3

 ## ANTI-PATTERNS
@@ -241,19 +241,22 @@ bun test               # 100+ test files

 | File | Lines | Description |
 |------|-------|-------------|
-| `src/features/background-agent/manager.ts` | 1556 | Task lifecycle, concurrency |
+| `src/features/background-agent/manager.ts` | 1642 | Task lifecycle, concurrency |
 | `src/features/builtin-skills/skills/git-master.ts` | 1107 | Git master skill definition |
-| `src/tools/delegate-task/executor.ts` | 983 | Category-based delegation executor |
-| `src/index.ts` | 924 | Main plugin entry |
-| `src/tools/lsp/client.ts` | 803 | LSP client operations |
-| `src/hooks/atlas/index.ts` | 770 | Orchestrator hook |
-| `src/tools/background-task/tools.ts` | 734 | Background task tools |
+| `src/index.ts` | 999 | Main plugin entry |
+| `src/tools/delegate-task/executor.ts` | 969 | Category-based delegation executor |
+| `src/tools/lsp/client.ts` | 851 | LSP client operations |
+| `src/tools/background-task/tools.ts` | 757 | Background task tools |
+| `src/hooks/atlas/index.ts` | 697 | Orchestrator hook |
 | `src/cli/config-manager.ts` | 667 | JSONC config parsing |
 | `src/features/skill-mcp-manager/manager.ts` | 640 | MCP client lifecycle |
 | `src/features/builtin-commands/templates/refactor.ts` | 619 | Refactor command template |
 | `src/agents/hephaestus.ts` | 618 | Autonomous deep worker agent |
+| `src/agents/utils.ts` | 571 | Agent creation, model fallback resolution |
+| `src/plugin-handlers/config-handler.ts` | 562 | Plugin config loading |
 | `src/tools/delegate-task/constants.ts` | 552 | Delegation constants |
 | `src/cli/install.ts` | 542 | Interactive CLI installer |
+| `src/hooks/task-continuation-enforcer.ts` | 530 | Task completion enforcement |
 | `src/agents/sisyphus.ts` | 530 | Main orchestrator agent |

 ## MCP ARCHITECTURE
--- a/bun.lock
+++ b/bun.lock
@@ -28,13 +28,13 @@
        "typescript": "^5.7.3",
      },
      "optionalDependencies": {
-        "oh-my-opencode-darwin-arm64": "3.3.0",
-        "oh-my-opencode-darwin-x64": "3.3.0",
-        "oh-my-opencode-linux-arm64": "3.3.0",
-        "oh-my-opencode-linux-arm64-musl": "3.3.0",
-        "oh-my-opencode-linux-x64": "3.3.0",
-        "oh-my-opencode-linux-x64-musl": "3.3.0",
-        "oh-my-opencode-windows-x64": "3.3.0",
+        "oh-my-opencode-darwin-arm64": "3.3.1",
+        "oh-my-opencode-darwin-x64": "3.3.1",
+        "oh-my-opencode-linux-arm64": "3.3.1",
+        "oh-my-opencode-linux-arm64-musl": "3.3.1",
+        "oh-my-opencode-linux-x64": "3.3.1",
+        "oh-my-opencode-linux-x64-musl": "3.3.1",
+        "oh-my-opencode-windows-x64": "3.3.1",
      },
    },
  },
@@ -226,19 +226,19 @@

    "object-inspect": ["object-inspect@1.13.4", "", {}, "sha512-W67iLl4J2EXEGTbfeHCffrjDfitvLANg0UlX3wFUUSTx92KXRFegMHUVgSqE+wvhAbi4WqjGg9czysTV2Epbew=="],

-    "oh-my-opencode-darwin-arm64": ["oh-my-opencode-darwin-arm64@3.3.0", "", { "os": "darwin", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-P2kZKJqZaA4j0qtGM3I8+ZeH204ai27ni/OXLjtFdOewRjJgrahxaC1XslgK7q/KU9fXz6BQfEqAjbvyPf/rgQ=="],
+    "oh-my-opencode-darwin-arm64": ["oh-my-opencode-darwin-arm64@3.3.1", "", { "os": "darwin", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-R+o42Km6bsIaW6D3I8uu2HCF3BjIWqa/fg38W5y4hJEOw4mL0Q7uV4R+0vtrXRHo9crXTK9ag0fqVQUm+Y6iAQ=="],

-    "oh-my-opencode-darwin-x64": ["oh-my-opencode-darwin-x64@3.3.0", "", { "os": "darwin", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-RopOorbW1WyhMQJ+ipuqiOA1GICS+3IkOwNyEe0KZlCLpoEDTyFopIL87HSns+gEQPMxnknroDp8lzxn1AKgjw=="],
+    "oh-my-opencode-darwin-x64": ["oh-my-opencode-darwin-x64@3.3.1", "", { "os": "darwin", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-7VTbpR1vH3OEkoJxBKtYuxFPX8M3IbJKoeHWME9iK6FpT11W1ASsjyuhvzB1jcxSeqF8ddMnjitlG5ub6h5EVw=="],

-    "oh-my-opencode-linux-arm64": ["oh-my-opencode-linux-arm64@3.3.0", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-297iEfuK+05g+q64crPW78Zbgm/j5PGjDDweSPkZ6rI6SEfHMvOIkGxMvN8gugM3zcH8FOCQXoY2nC8b6x3pwQ=="],
+    "oh-my-opencode-linux-arm64": ["oh-my-opencode-linux-arm64@3.3.1", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-BZ/r/CFlvbOxkdZZrRoT16xFOjibRZHuwQnaE4f0JvOzgK6/HWp3zJI1+2/aX/oK5GA6lZxNWRrJC/SKUi8LEg=="],

-    "oh-my-opencode-linux-arm64-musl": ["oh-my-opencode-linux-arm64-musl@3.3.0", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-oVxP0+yn66HQYfrl9QT6I7TumRzciuPB4z24+PwKEVcDjPbWXQqLY1gwOGHZAQBPLf0vwewv9ybEDVD42RRH4g=="],
+    "oh-my-opencode-linux-arm64-musl": ["oh-my-opencode-linux-arm64-musl@3.3.1", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-U90Wruf21h+CJbtcrS7MeTAc/5VOF6RI+5jr7qj/cCxjXNJtjhyJdz/maehArjtgf304+lYCM/Mh1i+G2D3YFQ=="],

-    "oh-my-opencode-linux-x64": ["oh-my-opencode-linux-x64@3.3.0", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-k9LoLkisLJwJNR1J0Bh1bjGtGBkl5D9WzFPSdZCAlyiT6TgG9w5erPTlXqtl2Lt0We5tYUVYlkEIHRMK/ugNsQ=="],
+    "oh-my-opencode-linux-x64": ["oh-my-opencode-linux-x64@3.3.1", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-sYzohSNdwsAhivbXcbhPdF1qqQi2CCI7FSgbmvvfBOMyZ8HAgqOFqYW2r3GPdmtywzkjOTvCzTG56FZwEjx15w=="],

-    "oh-my-opencode-linux-x64-musl": ["oh-my-opencode-linux-x64-musl@3.3.0", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-7asXCeae7wBxJrzoZ7J6Yo1oaOxwUN3bTO7jWurCTMs5TDHO+pEHysgv/nuF1jvj1T+r1vg1H5ZmopuKy1qvXg=="],
+    "oh-my-opencode-linux-x64-musl": ["oh-my-opencode-linux-x64-musl@3.3.1", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-aG5pZ4eWS0YSGUicOnjMkUPrIqQV4poYF+d9SIvrfvlaMcK6WlQn7jXzgNCwJsfGn5lyhSmjshZBEU+v79Ua3w=="],

-    "oh-my-opencode-windows-x64": ["oh-my-opencode-windows-x64@3.3.0", "", { "os": "win32", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode.exe" } }, "sha512-ABvwfaXb2xdrpbivzlPPJzIm5vXp+QlVakkaHEQf3TU6Mi/+fehH6Qhq/KMh66FDO2gq3xmxbH7nktHRQp9kNA=="],
+    "oh-my-opencode-windows-x64": ["oh-my-opencode-windows-x64@3.3.1", "", { "os": "win32", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode.exe" } }, "sha512-FGH7cnzBqNwjSkzCDglMsVttaq+MsykAxa7ehaFK+0dnBZArvllS3W13a3dGaANHMZzfK0vz8hNDUdVi7Z63cA=="],

    "on-finished": ["on-finished@2.4.1", "", { "dependencies": { "ee-first": "1.1.1" } }, "sha512-oVlzkg3ENAhCk2zdv7IJwd/QUD4z2RxRwpkcGY8psCVcCYZNq4wYnVWALHM+brtuJjePWiYF/ClmuDr8Ch5+kg=="],

--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
  "name": "oh-my-opencode",
-  "version": "3.3.2",
+  "version": "3.4.0",
  "description": "The Best AI Agent Harness - Batteries-Included OpenCode Plugin with Multi-Model Orchestration, Parallel Background Agents, and Crafted LSP/AST Tools",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",
@@ -74,13 +74,13 @@
    "typescript": "^5.7.3"
  },
  "optionalDependencies": {
-    "oh-my-opencode-darwin-arm64": "3.3.2",
-    "oh-my-opencode-darwin-x64": "3.3.2",
-    "oh-my-opencode-linux-arm64": "3.3.2",
-    "oh-my-opencode-linux-arm64-musl": "3.3.2",
-    "oh-my-opencode-linux-x64": "3.3.2",
-    "oh-my-opencode-linux-x64-musl": "3.3.2",
-    "oh-my-opencode-windows-x64": "3.3.2"
+    "oh-my-opencode-darwin-arm64": "3.4.0",
+    "oh-my-opencode-darwin-x64": "3.4.0",
+    "oh-my-opencode-linux-arm64": "3.4.0",
+    "oh-my-opencode-linux-arm64-musl": "3.4.0",
+    "oh-my-opencode-linux-x64": "3.4.0",
+    "oh-my-opencode-linux-x64-musl": "3.4.0",
+    "oh-my-opencode-windows-x64": "3.4.0"
  },
  "trustedDependencies": [
    "@ast-grep/cli",
--- a/packages/darwin-arm64/package.json
+++ b/packages/darwin-arm64/package.json
@@ -1,6 +1,6 @@
 {
  "name": "oh-my-opencode-darwin-arm64",
-  "version": "3.3.2",
+  "version": "3.4.0",
  "description": "Platform-specific binary for oh-my-opencode (darwin-arm64)",
  "license": "MIT",
  "repository": {
--- a/packages/darwin-x64/package.json
+++ b/packages/darwin-x64/package.json
@@ -1,6 +1,6 @@
 {
  "name": "oh-my-opencode-darwin-x64",
-  "version": "3.3.2",
+  "version": "3.4.0",
  "description": "Platform-specific binary for oh-my-opencode (darwin-x64)",
  "license": "MIT",
  "repository": {
--- a/packages/linux-arm64-musl/package.json
+++ b/packages/linux-arm64-musl/package.json
@@ -1,6 +1,6 @@
 {
  "name": "oh-my-opencode-linux-arm64-musl",
-  "version": "3.3.2",
+  "version": "3.4.0",
  "description": "Platform-specific binary for oh-my-opencode (linux-arm64-musl)",
  "license": "MIT",
  "repository": {
--- a/packages/linux-arm64/package.json
+++ b/packages/linux-arm64/package.json
@@ -1,6 +1,6 @@
 {
  "name": "oh-my-opencode-linux-arm64",
-  "version": "3.3.2",
+  "version": "3.4.0",
  "description": "Platform-specific binary for oh-my-opencode (linux-arm64)",
  "license": "MIT",
  "repository": {
--- a/packages/linux-x64-musl/package.json
+++ b/packages/linux-x64-musl/package.json
@@ -1,6 +1,6 @@
 {
  "name": "oh-my-opencode-linux-x64-musl",
-  "version": "3.3.2",
+  "version": "3.4.0",
  "description": "Platform-specific binary for oh-my-opencode (linux-x64-musl)",
  "license": "MIT",
  "repository": {
--- a/packages/linux-x64/package.json
+++ b/packages/linux-x64/package.json
@@ -1,6 +1,6 @@
 {
  "name": "oh-my-opencode-linux-x64",
-  "version": "3.3.2",
+  "version": "3.4.0",
  "description": "Platform-specific binary for oh-my-opencode (linux-x64)",
  "license": "MIT",
  "repository": {
--- a/packages/windows-x64/package.json
+++ b/packages/windows-x64/package.json
@@ -1,6 +1,6 @@
 {
  "name": "oh-my-opencode-windows-x64",
-  "version": "3.3.2",
+  "version": "3.4.0",
  "description": "Platform-specific binary for oh-my-opencode (windows-x64)",
  "license": "MIT",
  "repository": {
--- a/signatures/cla.json
+++ b/signatures/cla.json
@@ -1239,6 +1239,30 @@
      "created_at": "2026-02-08T02:34:46Z",
      "repoId": 1108837393,
      "pullRequestNo": 1641
+    },
+    {
+      "name": "JunyeongChoi0",
+      "id": 99778164,
+      "comment_id": 3867461224,
+      "created_at": "2026-02-08T16:02:31Z",
+      "repoId": 1108837393,
+      "pullRequestNo": 1674
+    },
+    {
+      "name": "aliozdenisik",
+      "id": 106994209,
+      "comment_id": 3867619266,
+      "created_at": "2026-02-08T17:12:34Z",
+      "repoId": 1108837393,
+      "pullRequestNo": 1676
+    },
+    {
+      "name": "mrm007",
+      "id": 3297808,
+      "comment_id": 3868350953,
+      "created_at": "2026-02-08T21:41:35Z",
+      "repoId": 1108837393,
+      "pullRequestNo": 1680
    }
  ]
 }
--- a/src/AGENTS.md
+++ b/src/AGENTS.md
@@ -0,0 +1,128 @@
+# AGENTS KNOWLEDGE BASE
+
+## OVERVIEW
+
+Main plugin entry point and orchestration layer. 1000+ lines of plugin initialization, hook registration, tool composition, and lifecycle management.
+
+**Core Responsibilities:**
+- Plugin initialization and configuration loading
+- 40+ lifecycle hooks orchestration  
+- 25+ tools composition and filtering
+- Background agent management
+- Session state coordination
+- MCP server lifecycle
+- Tmux integration
+- Claude Code compatibility layer
+
+## STRUCTURE
+```
+src/
+├── index.ts                          # Main plugin entry (1000 lines) - orchestration layer
+├── index.compaction-model-agnostic.static.test.ts  # Compaction hook tests
+├── agents/                           # 11 AI agents (16 files)
+├── cli/                              # CLI commands (9 files) 
+├── config/                           # Schema validation (3 files)
+├── features/                         # Background features (20+ files)
+├── hooks/                            # 40+ lifecycle hooks (14 files)
+├── mcp/                              # MCP server configs (7 files)
+├── plugin-handlers/                  # Config loading (3 files)
+├── shared/                           # Utilities (70 files)
+└── tools/                            # 25+ tools (15 files)
+```
+
+## KEY COMPONENTS
+
+**Plugin Initialization:**
+- `OhMyOpenCodePlugin()`: Main plugin factory (lines 124-841)
+- Configuration loading via `loadPluginConfig()`
+- Hook registration with safe creation patterns
+- Tool composition and disabled tool filtering
+
+**Lifecycle Management:**
+- 40+ hooks: session recovery, continuation enforcers, compaction, context injection
+- Background agent coordination via `BackgroundManager`
+- Tmux session management for multi-pane workflows
+- MCP server lifecycle via `SkillMcpManager`
+
+**Tool Ecosystem:**
+- 25+ tools: LSP, AST-grep, delegation, background tasks, skills
+- Tool filtering based on agent permissions and user config
+- Metadata restoration for tool outputs
+
+**Integration Points:**
+- Claude Code compatibility hooks and commands
+- OpenCode SDK client interactions
+- Session state persistence and recovery
+- Model variant resolution and application
+
+## HOOK REGISTRATION PATTERNS
+
+**Safe Hook Creation:**
+```typescript
+const hook = isHookEnabled("hook-name")
+  ? safeCreateHook("hook-name", () => createHookFactory(ctx), { enabled: safeHookEnabled })
+  : null;
+```
+
+**Hook Categories:**
+- **Session Management**: recovery, notification, compaction
+- **Continuation**: todo/task enforcers, stop guards
+- **Context**: injection, rules, directory content
+- **Tool Enhancement**: output truncation, error recovery, validation
+- **Agent Coordination**: usage reminders, babysitting, delegation
+
+## TOOL COMPOSITION
+
+**Core Tools:**
+```typescript
+const allTools: Record<string, ToolDefinition> = {
+  ...builtinTools,           // Basic file/session operations
+  ...createGrepTools(ctx),   // Content search
+  ...createAstGrepTools(ctx), // AST-aware refactoring
+  task: delegateTask,        // Agent delegation
+  skill: skillTool,          // Skill execution
+  // ... 20+ more tools
+};
+```
+
+**Tool Filtering:**
+- Agent permission-based restrictions
+- User-configured disabled tools
+- Dynamic tool availability based on session state
+
+## SESSION LIFECYCLE
+
+**Session Events:**
+- `session.created`: Initialize session state, tmux setup
+- `session.deleted`: Cleanup resources, clear caches
+- `message.updated`: Update agent assignments
+- `session.error`: Trigger recovery mechanisms
+
+**Continuation Flow:**
+1. User message triggers agent selection
+2. Model/variant resolution applied
+3. Tools execute with hook interception
+4. Continuation enforcers monitor completion
+5. Session compaction preserves context
+
+## CONFIGURATION INTEGRATION
+
+**Plugin Config Loading:**
+- Project + user config merging
+- Schema validation via Zod
+- Migration support for legacy configs
+- Dynamic feature enablement
+
+**Runtime Configuration:**
+- Hook enablement based on `disabled_hooks`
+- Tool filtering via `disabled_tools`
+- Agent overrides and category definitions
+- Experimental feature toggles
+
+## ANTI-PATTERNS
+
+- **Direct hook exports**: All hooks created via factories for testability
+- **Global state pollution**: Session-scoped state management
+- **Synchronous blocking**: Async-first architecture with background coordination
+- **Tight coupling**: Plugin components communicate via events, not direct calls
+- **Memory leaks**: Proper cleanup on session deletion and plugin unload
--- a/src/agents/AGENTS.md
+++ b/src/agents/AGENTS.md
@@ -2,7 +2,7 @@

 ## OVERVIEW

-11 AI agents for multi-model orchestration. Each agent has factory function + metadata + fallback chains.
+32 files containing AI agents and utilities for multi-model orchestration. Each agent has factory function + metadata + fallback chains.

 **Primary Agents** (respect UI model selection):
 - Sisyphus, Atlas, Prometheus
--- a/src/agents/atlas/default.ts
+++ b/src/agents/atlas/default.ts
@@ -274,13 +274,13 @@ ACCUMULATED WISDOM:

 **For exploration (explore/librarian)**: ALWAYS background
 \`\`\`typescript
-task(subagent_type="explore", run_in_background=true, ...)
-task(subagent_type="librarian", run_in_background=true, ...)
+task(subagent_type="explore", load_skills=[], run_in_background=true, ...)
+task(subagent_type="librarian", load_skills=[], run_in_background=true, ...)
 \`\`\`

 **For task execution**: NEVER background
 \`\`\`typescript
-task(category="...", run_in_background=false, ...)
+task(category="...", load_skills=[...], run_in_background=false, ...)
 \`\`\`

 **Parallel task groups**: Invoke multiple in ONE message
--- a/src/agents/atlas/gpt.ts
+++ b/src/agents/atlas/gpt.ts
@@ -231,12 +231,12 @@ ACCUMULATED WISDOM: [from notepad]
 <parallel_execution>
 **Exploration (explore/librarian)**: ALWAYS background
 \`\`\`typescript
-task(subagent_type="explore", run_in_background=true, ...)
+task(subagent_type="explore", load_skills=[], run_in_background=true, ...)
 \`\`\`

 **Task execution**: NEVER background
 \`\`\`typescript
-task(category="...", run_in_background=false, ...)
+task(category="...", load_skills=[...], run_in_background=false, ...)
 \`\`\`

 **Parallel task groups**: Invoke multiple in ONE message
--- a/src/agents/dynamic-agent-prompt-builder.ts
+++ b/src/agents/dynamic-agent-prompt-builder.ts
@@ -1,8 +1,8 @@
-import type { AgentPromptMetadata, BuiltinAgentName } from "./types"
+import type { AgentPromptMetadata } from "./types"
 import { truncateDescription } from "../shared/truncate-description"

 export interface AvailableAgent {
-  name: BuiltinAgentName
+  name: string
  description: string
  metadata: AgentPromptMetadata
 }
--- a/src/agents/prometheus/high-accuracy-mode.ts
+++ b/src/agents/prometheus/high-accuracy-mode.ts
@@ -17,6 +17,7 @@ export const PROMETHEUS_HIGH_ACCURACY_MODE = `# PHASE 3: PLAN GENERATION
 while (true) {
  const result = task(
    subagent_type="momus",
+    load_skills=[],
    prompt=".sisyphus/plans/{name}.md",
    run_in_background=false
  )
--- a/src/agents/prometheus/interview-mode.ts
+++ b/src/agents/prometheus/interview-mode.ts
@@ -66,8 +66,8 @@ Or should I just note down this single fix?"
 **Research First:**
 \`\`\`typescript
 // Prompt structure: CONTEXT (what I'm doing) + GOAL (what I'm trying to achieve) + QUESTION (what I need to know) + REQUEST (what to find)
-task(subagent_type="explore", prompt="I'm refactoring [target] and need to understand its impact scope before making changes. Find all usages via lsp_find_references - show calling code, patterns of use, and potential breaking points.", run_in_background=true)
-task(subagent_type="explore", prompt="I'm about to modify [affected code] and need to ensure behavior preservation. Find existing test coverage - which tests exercise this code, what assertions exist, and any gaps in coverage.", run_in_background=true)
+task(subagent_type="explore", load_skills=[], prompt="I'm refactoring [target] and need to understand its impact scope before making changes. Find all usages via lsp_find_references - show calling code, patterns of use, and potential breaking points.", run_in_background=true)
+task(subagent_type="explore", load_skills=[], prompt="I'm about to modify [affected code] and need to ensure behavior preservation. Find existing test coverage - which tests exercise this code, what assertions exist, and any gaps in coverage.", run_in_background=true)
 \`\`\`

 **Interview Focus:**
@@ -91,9 +91,9 @@ task(subagent_type="explore", prompt="I'm about to modify [affected code] and ne
 \`\`\`typescript
 // Launch BEFORE asking user questions
 // Prompt structure: CONTEXT + GOAL + QUESTION + REQUEST
-task(subagent_type="explore", prompt="I'm building a new [feature] and want to maintain codebase consistency. Find similar implementations in this project - their structure, patterns used, and conventions to follow.", run_in_background=true)
-task(subagent_type="explore", prompt="I'm adding [feature type] to the project and need to understand existing conventions. Find how similar features are organized - file structure, naming patterns, and architectural approach.", run_in_background=true)
-task(subagent_type="librarian", prompt="I'm implementing [technology] and want to follow established best practices. Find official documentation and community recommendations - setup patterns, common pitfalls, and production-ready examples.", run_in_background=true)
+task(subagent_type="explore", load_skills=[], prompt="I'm building a new [feature] and want to maintain codebase consistency. Find similar implementations in this project - their structure, patterns used, and conventions to follow.", run_in_background=true)
+task(subagent_type="explore", load_skills=[], prompt="I'm adding [feature type] to the project and need to understand existing conventions. Find how similar features are organized - file structure, naming patterns, and architectural approach.", run_in_background=true)
+task(subagent_type="librarian", load_skills=[], prompt="I'm implementing [technology] and want to follow established best practices. Find official documentation and community recommendations - setup patterns, common pitfalls, and production-ready examples.", run_in_background=true)
 \`\`\`

 **Interview Focus** (AFTER research):
@@ -132,7 +132,7 @@ Based on your stack, I'd recommend NextAuth.js - it integrates well with Next.js

 Run this check:
 \`\`\`typescript
-task(subagent_type="explore", prompt="I'm assessing this project's test setup before planning work that may require TDD. I need to understand what testing capabilities exist. Find test infrastructure: package.json test scripts, config files (jest.config, vitest.config, pytest.ini), and existing test files. Report: 1) Does test infra exist? 2) What framework? 3) Example test patterns.", run_in_background=true)
+task(subagent_type="explore", load_skills=[], prompt="I'm assessing this project's test setup before planning work that may require TDD. I need to understand what testing capabilities exist. Find test infrastructure: package.json test scripts, config files (jest.config, vitest.config, pytest.ini), and existing test files. Report: 1) Does test infra exist? 2) What framework? 3) Example test patterns.", run_in_background=true)
 \`\`\`

 #### Step 2: Ask the Test Question (MANDATORY)
@@ -230,13 +230,13 @@ Add to draft immediately:

 **Research First:**
 \`\`\`typescript
-task(subagent_type="explore", prompt="I'm planning architectural changes and need to understand the current system design. Find existing architecture: module boundaries, dependency patterns, data flow, and key abstractions used.", run_in_background=true)
-task(subagent_type="librarian", prompt="I'm designing architecture for [domain] and want to make informed decisions. Find architectural best practices - proven patterns, trade-offs, and lessons learned from similar systems.", run_in_background=true)
+task(subagent_type="explore", load_skills=[], prompt="I'm planning architectural changes and need to understand the current system design. Find existing architecture: module boundaries, dependency patterns, data flow, and key abstractions used.", run_in_background=true)
+task(subagent_type="librarian", load_skills=[], prompt="I'm designing architecture for [domain] and want to make informed decisions. Find architectural best practices - proven patterns, trade-offs, and lessons learned from similar systems.", run_in_background=true)
 \`\`\`

 **Oracle Consultation** (recommend when stakes are high):
 \`\`\`typescript
-task(subagent_type="oracle", prompt="Architecture consultation needed: [context]...", run_in_background=false)
+task(subagent_type="oracle", load_skills=[], prompt="Architecture consultation needed: [context]...", run_in_background=false)
 \`\`\`

 **Interview Focus:**
@@ -253,9 +253,9 @@ task(subagent_type="oracle", prompt="Architecture consultation needed: [context]

 **Parallel Investigation:**
 \`\`\`typescript
-task(subagent_type="explore", prompt="I'm researching how to implement [feature] and need to understand current approach. Find how X is currently handled in this codebase - implementation details, edge cases covered, and any known limitations.", run_in_background=true)
-task(subagent_type="librarian", prompt="I'm implementing Y and need authoritative guidance. Find official documentation - API reference, configuration options, and recommended usage patterns.", run_in_background=true)
-task(subagent_type="librarian", prompt="I'm looking for battle-tested implementations of Z. Find open source projects that solve this - focus on production-quality code, how they handle edge cases, and any gotchas documented.", run_in_background=true)
+task(subagent_type="explore", load_skills=[], prompt="I'm researching how to implement [feature] and need to understand current approach. Find how X is currently handled in this codebase - implementation details, edge cases covered, and any known limitations.", run_in_background=true)
+task(subagent_type="librarian", load_skills=[], prompt="I'm implementing Y and need authoritative guidance. Find official documentation - API reference, configuration options, and recommended usage patterns.", run_in_background=true)
+task(subagent_type="librarian", load_skills=[], prompt="I'm looking for battle-tested implementations of Z. Find open source projects that solve this - focus on production-quality code, how they handle edge cases, and any gotchas documented.", run_in_background=true)
 \`\`\`

 **Interview Focus:**
@@ -281,17 +281,17 @@ task(subagent_type="librarian", prompt="I'm looking for battle-tested implementa

 **For Understanding Codebase:**
 \`\`\`typescript
-task(subagent_type="explore", prompt="I'm working on [topic] and need to understand how it's organized in this project. Find all related files - show the structure, patterns used, and conventions I should follow.", run_in_background=true)
+task(subagent_type="explore", load_skills=[], prompt="I'm working on [topic] and need to understand how it's organized in this project. Find all related files - show the structure, patterns used, and conventions I should follow.", run_in_background=true)
 \`\`\`

 **For External Knowledge:**
 \`\`\`typescript
-task(subagent_type="librarian", prompt="I'm integrating [library] and need to understand [specific feature]. Find official documentation - API details, configuration options, and recommended best practices.", run_in_background=true)
+task(subagent_type="librarian", load_skills=[], prompt="I'm integrating [library] and need to understand [specific feature]. Find official documentation - API details, configuration options, and recommended best practices.", run_in_background=true)
 \`\`\`

 **For Implementation Examples:**
 \`\`\`typescript
-task(subagent_type="librarian", prompt="I'm implementing [feature] and want to learn from existing solutions. Find open source implementations - focus on production-quality code, architecture decisions, and common patterns.", run_in_background=true)
+task(subagent_type="librarian", load_skills=[], prompt="I'm implementing [feature] and want to learn from existing solutions. Find open source implementations - focus on production-quality code, architecture decisions, and common patterns.", run_in_background=true)
 \`\`\`

 ## Interview Mode Anti-Patterns
--- a/src/agents/prometheus/plan-generation.ts
+++ b/src/agents/prometheus/plan-generation.ts
@@ -61,6 +61,7 @@ todoWrite([
 \`\`\`typescript
 task(
  subagent_type="metis",
+  load_skills=[],
  prompt=\`Review this planning session before I generate the work plan:

  **User's Goal**: {summarize what user wants}
--- a/src/agents/utils.test.ts
+++ b/src/agents/utils.test.ts
@@ -249,6 +249,222 @@ describe("createBuiltinAgents with model overrides", () => {
    expect(agents.sisyphus.prompt).toContain("frontend-ui-ux")
    expect(agents.sisyphus.prompt).toContain("git-master")
  })
+
+  test("includes custom agents in orchestrator prompts when provided via config", async () => {
+    // #given
+    const fetchSpy = spyOn(shared, "fetchAvailableModels").mockResolvedValue(
+      new Set([
+        "anthropic/claude-opus-4-6",
+        "kimi-for-coding/k2p5",
+        "opencode/kimi-k2.5-free",
+        "zai-coding-plan/glm-4.7",
+        "opencode/glm-4.7-free",
+        "openai/gpt-5.2",
+      ])
+    )
+
+    const customAgentSummaries = [
+      {
+        name: "researcher",
+        description: "Research agent for deep analysis",
+        hidden: false,
+      },
+    ]
+
+    try {
+      // #when
+      const agents = await createBuiltinAgents(
+        [],
+        {},
+        undefined,
+        TEST_DEFAULT_MODEL,
+        undefined,
+        undefined,
+        [],
+        customAgentSummaries
+      )
+
+      // #then
+      expect(agents.sisyphus.prompt).toContain("researcher")
+      expect(agents.hephaestus.prompt).toContain("researcher")
+      expect(agents.atlas.prompt).toContain("researcher")
+    } finally {
+      fetchSpy.mockRestore()
+    }
+  })
+
+  test("excludes hidden custom agents from orchestrator prompts", async () => {
+    // #given
+    const fetchSpy = spyOn(shared, "fetchAvailableModels").mockResolvedValue(
+      new Set(["anthropic/claude-opus-4-6", "openai/gpt-5.2"])
+    )
+
+    const customAgentSummaries = [
+      {
+        name: "hidden-agent",
+        description: "Should never show",
+        hidden: true,
+      },
+    ]
+
+    try {
+      // #when
+      const agents = await createBuiltinAgents(
+        [],
+        {},
+        undefined,
+        TEST_DEFAULT_MODEL,
+        undefined,
+        undefined,
+        [],
+        customAgentSummaries
+      )
+
+      // #then
+      expect(agents.sisyphus.prompt).not.toContain("hidden-agent")
+      expect(agents.hephaestus.prompt).not.toContain("hidden-agent")
+      expect(agents.atlas.prompt).not.toContain("hidden-agent")
+    } finally {
+      fetchSpy.mockRestore()
+    }
+  })
+
+  test("excludes disabled custom agents from orchestrator prompts", async () => {
+    // #given
+    const fetchSpy = spyOn(shared, "fetchAvailableModels").mockResolvedValue(
+      new Set(["anthropic/claude-opus-4-6", "openai/gpt-5.2"])
+    )
+
+    const customAgentSummaries = [
+      {
+        name: "disabled-agent",
+        description: "Should never show",
+        disabled: true,
+      },
+    ]
+
+    try {
+      // #when
+      const agents = await createBuiltinAgents(
+        [],
+        {},
+        undefined,
+        TEST_DEFAULT_MODEL,
+        undefined,
+        undefined,
+        [],
+        customAgentSummaries
+      )
+
+      // #then
+      expect(agents.sisyphus.prompt).not.toContain("disabled-agent")
+      expect(agents.hephaestus.prompt).not.toContain("disabled-agent")
+      expect(agents.atlas.prompt).not.toContain("disabled-agent")
+    } finally {
+      fetchSpy.mockRestore()
+    }
+  })
+
+  test("excludes custom agents when disabledAgents contains their name (case-insensitive)", async () => {
+    // #given
+    const fetchSpy = spyOn(shared, "fetchAvailableModels").mockResolvedValue(
+      new Set(["anthropic/claude-opus-4-6", "openai/gpt-5.2"])
+    )
+
+    const disabledAgents = ["ReSeArChEr"]
+    const customAgentSummaries = [
+      {
+        name: "researcher",
+        description: "Should never show",
+      },
+    ]
+
+    try {
+      // #when
+      const agents = await createBuiltinAgents(
+        disabledAgents,
+        {},
+        undefined,
+        TEST_DEFAULT_MODEL,
+        undefined,
+        undefined,
+        [],
+        customAgentSummaries
+      )
+
+      // #then
+      expect(agents.sisyphus.prompt).not.toContain("researcher")
+      expect(agents.hephaestus.prompt).not.toContain("researcher")
+      expect(agents.atlas.prompt).not.toContain("researcher")
+    } finally {
+      fetchSpy.mockRestore()
+    }
+  })
+
+  test("deduplicates custom agents case-insensitively", async () => {
+    // #given
+    const fetchSpy = spyOn(shared, "fetchAvailableModels").mockResolvedValue(
+      new Set(["anthropic/claude-opus-4-6", "openai/gpt-5.2"])
+    )
+
+    const customAgentSummaries = [
+      { name: "Researcher", description: "First" },
+      { name: "researcher", description: "Second" },
+    ]
+
+    try {
+      // #when
+      const agents = await createBuiltinAgents(
+        [],
+        {},
+        undefined,
+        TEST_DEFAULT_MODEL,
+        undefined,
+        undefined,
+        [],
+        customAgentSummaries
+      )
+
+      // #then
+      const matches = agents.sisyphus.prompt.match(/Custom agent: researcher/gi) ?? []
+      expect(matches.length).toBe(1)
+    } finally {
+      fetchSpy.mockRestore()
+    }
+  })
+
+  test("sanitizes custom agent strings for markdown tables", async () => {
+    // #given
+    const fetchSpy = spyOn(shared, "fetchAvailableModels").mockResolvedValue(
+      new Set(["anthropic/claude-opus-4-6", "openai/gpt-5.2"])
+    )
+
+    const customAgentSummaries = [
+      {
+        name: "table-agent",
+        description: "Line1\nAlpha | Beta",
+      },
+    ]
+
+    try {
+      // #when
+      const agents = await createBuiltinAgents(
+        [],
+        {},
+        undefined,
+        TEST_DEFAULT_MODEL,
+        undefined,
+        undefined,
+        [],
+        customAgentSummaries
+      )
+
+      // #then
+      expect(agents.sisyphus.prompt).toContain("Line1 Alpha \\| Beta")
+    } finally {
+      fetchSpy.mockRestore()
+    }
+  })
 })

 describe("createBuiltinAgents without systemDefaultModel", () => {
@@ -991,4 +1207,29 @@ describe("Deadlock prevention - fetchAvailableModels must not receive client", (
     fetchSpy.mockRestore?.()
     cacheSpy.mockRestore?.()
   })
+  test("Hephaestus variant override respects user config over hardcoded default", async () => {
+    // #given - user provides variant in config
+    const overrides = {
+      hephaestus: { variant: "high" },
+    }
+
+    // #when
+    const agents = await createBuiltinAgents([], overrides, undefined, TEST_DEFAULT_MODEL)
+
+    // #then - user variant takes precedence over hardcoded "medium"
+    expect(agents.hephaestus).toBeDefined()
+    expect(agents.hephaestus.variant).toBe("high")
+  })
+
+  test("Hephaestus uses default variant when no user override provided", async () => {
+    // #given - no variant override in config
+    const overrides = {}
+
+    // #when
+    const agents = await createBuiltinAgents([], overrides, undefined, TEST_DEFAULT_MODEL)
+
+    // #then - default "medium" variant is applied
+    expect(agents.hephaestus).toBeDefined()
+    expect(agents.hephaestus.variant).toBe("medium")
+  })
 })
--- a/src/agents/utils.ts
+++ b/src/agents/utils.ts
@@ -11,7 +11,18 @@ import { createAtlasAgent, atlasPromptMetadata } from "./atlas"
 import { createMomusAgent, momusPromptMetadata } from "./momus"
 import { createHephaestusAgent } from "./hephaestus"
 import type { AvailableAgent, AvailableCategory, AvailableSkill } from "./dynamic-agent-prompt-builder"
-import { deepMerge, fetchAvailableModels, resolveModelPipeline, AGENT_MODEL_REQUIREMENTS, readConnectedProvidersCache, isModelAvailable, isAnyFallbackModelAvailable, isAnyProviderConnected, migrateAgentConfig } from "../shared"
+import {
+  deepMerge,
+  fetchAvailableModels,
+  resolveModelPipeline,
+  AGENT_MODEL_REQUIREMENTS,
+  readConnectedProvidersCache,
+  isModelAvailable,
+  isAnyFallbackModelAvailable,
+  isAnyProviderConnected,
+  migrateAgentConfig,
+  truncateDescription,
+} from "../shared"
 import { DEFAULT_CATEGORIES, CATEGORY_DESCRIPTIONS } from "../tools/delegate-task/constants"
 import { resolveMultipleSkills } from "../features/opencode-skill-loader/skill-content"
 import { createBuiltinSkills } from "../features/builtin-skills"
@@ -52,6 +63,64 @@ function isFactory(source: AgentSource): source is AgentFactory {
  return typeof source === "function"
 }

+type RegisteredAgentSummary = {
+  name: string
+  description: string
+}
+
+function sanitizeMarkdownTableCell(value: string): string {
+  return value
+    .replace(/\r?\n/g, " ")
+    .replace(/\|/g, "\\|")
+    .replace(/\s+/g, " ")
+    .trim()
+}
+
+function isRecord(value: unknown): value is Record<string, unknown> {
+  return typeof value === "object" && value !== null
+}
+
+function parseRegisteredAgentSummaries(input: unknown): RegisteredAgentSummary[] {
+  if (!Array.isArray(input)) return []
+
+  const result: RegisteredAgentSummary[] = []
+  for (const item of input) {
+    if (!isRecord(item)) continue
+
+    const name = typeof item.name === "string" ? item.name : undefined
+    if (!name) continue
+
+    const hidden = item.hidden
+    if (hidden === true) continue
+
+    const disabled = item.disabled
+    if (disabled === true) continue
+
+    const enabled = item.enabled
+    if (enabled === false) continue
+
+    const description = typeof item.description === "string" ? item.description : ""
+    result.push({ name, description: sanitizeMarkdownTableCell(description) })
+  }
+
+  return result
+}
+
+function buildCustomAgentMetadata(agentName: string, description: string): AgentPromptMetadata {
+  const shortDescription = sanitizeMarkdownTableCell(truncateDescription(description))
+  const safeAgentName = sanitizeMarkdownTableCell(agentName)
+  return {
+    category: "specialist",
+    cost: "CHEAP",
+    triggers: [
+      {
+        domain: `Custom agent: ${safeAgentName}`,
+        trigger: shortDescription || "Use when this agent's description matches the task",
+      },
+    ],
+  }
+}
+
 export function buildAgent(
  source: AgentSource,
  model: string,
@@ -233,13 +302,13 @@ export async function createBuiltinAgents(
  categories?: CategoriesConfig,
  gitMasterConfig?: GitMasterConfig,
  discoveredSkills: LoadedSkill[] = [],
-  client?: any,
+  customAgentSummaries?: unknown,
  browserProvider?: BrowserAutomationProvider,
  uiSelectedModel?: string,
  disabledSkills?: Set<string>
 ): Promise<Record<string, AgentConfig>> {
  const connectedProviders = readConnectedProvidersCache()
-  // IMPORTANT: Do NOT pass client to fetchAvailableModels during plugin initialization.
+  // IMPORTANT: Do NOT call OpenCode client APIs during plugin initialization.
  // This function is called from config handler, and calling client API causes deadlock.
  // See: https://github.com/code-yeongyu/oh-my-opencode/issues/1301
  const availableModels = await fetchAvailableModels(undefined, {
@@ -279,6 +348,10 @@ export async function createBuiltinAgents(

  const availableSkills: AvailableSkill[] = [...builtinAvailable, ...discoveredAvailable]

+  const registeredAgents = parseRegisteredAgentSummaries(customAgentSummaries)
+  const builtinAgentNames = new Set(Object.keys(agentSources).map((n) => n.toLowerCase()))
+  const disabledAgentNames = new Set(disabledAgents.map((n) => n.toLowerCase()))
+
  // Collect general agents first (for availableAgents), but don't add to result yet
  const pendingAgentConfigs: Map<string, AgentConfig> = new Map()

@@ -335,14 +408,27 @@ export async function createBuiltinAgents(
    // Store for later - will be added after sisyphus and hephaestus
    pendingAgentConfigs.set(name, config)

-    const metadata = agentMetadata[agentName]
-    if (metadata) {
-      availableAgents.push({
-        name: agentName,
-        description: config.description ?? "",
-        metadata,
-      })
-    }
+     const metadata = agentMetadata[agentName]
+     if (metadata) {
+       availableAgents.push({
+         name: agentName,
+         description: config.description ?? "",
+         metadata,
+       })
+     }
+   }
+
+  for (const agent of registeredAgents) {
+    const lowerName = agent.name.toLowerCase()
+    if (builtinAgentNames.has(lowerName)) continue
+    if (disabledAgentNames.has(lowerName)) continue
+    if (availableAgents.some((a) => a.name.toLowerCase() === lowerName)) continue
+
+    availableAgents.push({
+      name: agent.name,
+      description: agent.description,
+      metadata: buildCustomAgentMetadata(agent.name, agent.description),
+    })
  }

   const sisyphusOverride = agentOverrides["sisyphus"]
@@ -423,13 +509,13 @@ export async function createBuiltinAgents(
          availableCategories
        )

-        hephaestusConfig = { ...hephaestusConfig, variant: hephaestusResolvedVariant ?? "medium" }
-
+        if (!hephaestusOverride?.variant) {
+          hephaestusConfig = { ...hephaestusConfig, variant: hephaestusResolvedVariant ?? "medium" }
+        }
        const hepOverrideCategory = (hephaestusOverride as Record<string, unknown> | undefined)?.category as string | undefined
        if (hepOverrideCategory) {
          hephaestusConfig = applyCategoryOverride(hephaestusConfig, hepOverrideCategory, mergedCategories)
        }
-
        if (directory && hephaestusConfig.prompt) {
          const envContext = createEnvContext()
          hephaestusConfig = { ...hephaestusConfig, prompt: hephaestusConfig.prompt + envContext }
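
Note on the custom-agent handling in `src/agents/utils.ts` above: the rules the new tests exercise (skip hidden and disabled entries, match the disabled list case-insensitively, keep only the first of two names that differ by case, and make descriptions safe for a markdown table cell) can be sketched in isolation. This is a minimal sketch, not the code from the diff. The names `AgentSummary`, `sanitizeCell`, and `visibleAgents` are hypothetical, and the input is assumed to be already typed rather than `unknown`.

```typescript
// Hypothetical, already-typed shape; the diff accepts `unknown` and validates it.
type AgentSummary = {
  name: string
  description?: string
  hidden?: boolean
  disabled?: boolean
}

// Escape pipes and collapse all whitespace (including newlines) to single spaces,
// so the value cannot break out of a markdown table cell.
function sanitizeCell(value: string): string {
  return value.replace(/\|/g, "\\|").replace(/\s+/g, " ").trim()
}

// Return the summaries that should be advertised, first occurrence wins.
function visibleAgents(summaries: AgentSummary[], disabledNames: string[]): AgentSummary[] {
  const disabled = new Set(disabledNames.map((n) => n.toLowerCase()))
  const seen = new Set<string>()
  const result: AgentSummary[] = []

  for (const summary of summaries) {
    const key = summary.name.toLowerCase()
    if (summary.hidden === true || summary.disabled === true) continue
    if (disabled.has(key) || seen.has(key)) continue
    seen.add(key)
    result.push({ name: summary.name, description: sanitizeCell(summary.description ?? "") })
  }

  return result
}
```

Lowercasing once into a `Set` keeps both the disabled check and the duplicate check at constant time per entry, at the cost of treating names that differ only by case as the same agent.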
--- a/src/cli/AGENTS.md
+++ b/src/cli/AGENTS.md
@@ -2,7 +2,7 @@

 ## OVERVIEW

-CLI entry: `bunx oh-my-opencode`. 5 commands with Commander.js + @clack/prompts TUI.
+CLI entry: `bunx oh-my-opencode`. 70 CLI utilities and commands with Commander.js + @clack/prompts TUI.

 **Commands**: install (interactive setup), doctor (14 health checks), run (session launcher), get-local-version, mcp-oauth

--- a/src/cli/run/session-resolver.test.ts
+++ b/src/cli/run/session-resolver.test.ts
@@ -1,6 +1,8 @@
-import { describe, it, expect, beforeEach, mock, spyOn } from "bun:test"
-import { resolveSession } from "./session-resolver"
-import type { OpencodeClient } from "./types"
+/// <reference types="bun-types" />
+
+import { beforeEach, describe, expect, it, mock, spyOn } from "bun:test";
+import { resolveSession } from "./session-resolver";
+import type { OpencodeClient } from "./types";

 const createMockClient = (overrides: {
  getResult?: { error?: unknown; data?: { id: string } }
@@ -58,7 +60,9 @@ describe("resolveSession", () => {
    const result = resolveSession({ client: mockClient, sessionId })

    // then
-    await expect(result).rejects.toThrow(`Session not found: ${sessionId}`)
+    await Promise.resolve(
+      expect(result).rejects.toThrow(`Session not found: ${sessionId}`)
+    )
    expect(mockClient.session.get).toHaveBeenCalledWith({
      path: { id: sessionId },
    })
@@ -77,7 +81,12 @@ describe("resolveSession", () => {
    // then
    expect(result).toBe("new-session-id")
    expect(mockClient.session.create).toHaveBeenCalledWith({
-      body: { title: "oh-my-opencode run" },
+      body: {
+        title: "oh-my-opencode run",
+        permission: [
+          { permission: "question", action: "deny", pattern: "*" },
+        ],
+      },
    })
    expect(mockClient.session.get).not.toHaveBeenCalled()
  })
@@ -98,7 +107,12 @@ describe("resolveSession", () => {
    expect(result).toBe("retried-session-id")
    expect(mockClient.session.create).toHaveBeenCalledTimes(2)
    expect(mockClient.session.create).toHaveBeenCalledWith({
-      body: { title: "oh-my-opencode run" },
+      body: {
+        title: "oh-my-opencode run",
+        permission: [
+          { permission: "question", action: "deny", pattern: "*" },
+        ],
+      },
    })
  })

@@ -116,7 +130,9 @@ describe("resolveSession", () => {
    const result = resolveSession({ client: mockClient })

    // then
-    await expect(result).rejects.toThrow("Failed to create session after all retries")
+    await Promise.resolve(
+      expect(result).rejects.toThrow("Failed to create session after all retries")
+    )
    expect(mockClient.session.create).toHaveBeenCalledTimes(3)
  })

@@ -134,7 +150,9 @@ describe("resolveSession", () => {
    const result = resolveSession({ client: mockClient })

    // then
-    await expect(result).rejects.toThrow("Failed to create session after all retries")
+    await Promise.resolve(
+      expect(result).rejects.toThrow("Failed to create session after all retries")
+    )
    expect(mockClient.session.create).toHaveBeenCalledTimes(3)
  })
 })
--- a/src/cli/run/session-resolver.ts
+++ b/src/cli/run/session-resolver.ts
@@ -19,14 +19,18 @@ export async function resolveSession(options: {
    return sessionId
  }

-  let lastError: unknown
  for (let attempt = 1; attempt <= SESSION_CREATE_MAX_RETRIES; attempt++) {
    const res = await client.session.create({
-      body: { title: "oh-my-opencode run" },
+      body: {
+        title: "oh-my-opencode run",
+        // In CLI run mode there's no TUI to answer questions.
+        permission: [
+          { permission: "question", action: "deny" as const, pattern: "*" },
+        ],
+      } as any,
    })

    if (res.error) {
-      lastError = res.error
      console.error(
        pc.yellow(`Session create attempt ${attempt}/${SESSION_CREATE_MAX_RETRIES} failed:`)
      )
@@ -44,9 +48,6 @@ export async function resolveSession(options: {
      return res.data.id
    }

-    lastError = new Error(
-      `Unexpected response: ${JSON.stringify(res, null, 2)}`
-    )
    console.error(
      pc.yellow(
        `Session create attempt ${attempt}/${SESSION_CREATE_MAX_RETRIES}: No session ID returned`
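
Note on the `question` deny rule: sessions created without an interactive UI deny the `question` permission, and the background-agent hunk in this diff also carries over the parent session's other rules before forcing that deny. A minimal sketch of that rule-building step follows. `PermissionRule`, `isRule`, and `withQuestionDenied` are hypothetical names, the shape check is an added assumption (the diff filters without validating), and only the `allow` and `deny` actions seen in this diff are modeled.

```typescript
// Hypothetical rule shape, limited to the fields and actions visible in the diff.
type PermissionRule = {
  permission: string
  action: "allow" | "deny"
  pattern: string
}

// Assumption: validate each inherited entry before trusting it.
function isRule(value: unknown): value is PermissionRule {
  if (typeof value !== "object" || value === null) return false
  const r = value as Record<string, unknown>
  return (
    typeof r.permission === "string" &&
    (r.action === "allow" || r.action === "deny") &&
    typeof r.pattern === "string"
  )
}

// Keep every inherited rule except those for "question", then append a
// blanket deny so a headless session can never block on a user prompt.
function withQuestionDenied(inherited: unknown): PermissionRule[] {
  const rules: PermissionRule[] = Array.isArray(inherited)
    ? inherited.filter((r): r is PermissionRule => isRule(r) && r.permission !== "question")
    : []
  rules.push({ permission: "question", action: "deny", pattern: "*" })
  return rules
}
```

Dropping inherited `question` rules before appending the deny means a parent that allows questions cannot leak that permission into a child session.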
--- a/src/config/AGENTS.md
+++ b/src/config/AGENTS.md
@@ -0,0 +1,93 @@
+**Generated:** 2026-02-08T16:45:00+09:00
+**Commit:** f2b7b759
+**Branch:** dev
+
+## OVERVIEW
+
+Zod schema definitions for plugin configuration. 455+ lines of type-safe config validation with JSONC support, multi-level inheritance, and comprehensive agent/category overrides.
+
+## STRUCTURE
+```
+config/
+├── schema.ts              # Main Zod schema (455 lines) - agents, categories, experimental features
+├── schema.test.ts         # Schema validation tests (17909 lines)
+└── index.ts               # Barrel export
+```
+
+## SCHEMA COMPONENTS
+
+**Agent Configuration:**
+- `AgentOverrideConfigSchema`: Model, variant, temperature, permissions, tools
+- `AgentOverridesSchema`: Per-agent overrides (sisyphus, hephaestus, prometheus, etc.)
+- `AgentPermissionSchema`: Tool access control (edit, bash, webfetch, task)
+
+**Category Configuration:**
+- `CategoryConfigSchema`: Model defaults, thinking budgets, tool restrictions
+- `CategoriesConfigSchema`: Named categories (visual-engineering, ultrabrain, deep, etc.)
+
+**Experimental Features:**
+- `ExperimentalConfigSchema`: Dynamic context pruning, task system, plugin timeouts
+- `DynamicContextPruningConfigSchema`: Intelligent context management
+
+**Built-in Enums:**
+- `AgentNameSchema`: sisyphus, hephaestus, prometheus, oracle, librarian, explore, multimodal-looker, metis, momus, atlas
+- `HookNameSchema`: 100+ hook names for lifecycle management
+- `BuiltinCommandNameSchema`: init-deep, ralph-loop, refactor, start-work
+- `BuiltinSkillNameSchema`: playwright, agent-browser, git-master
+
+## CONFIGURATION HIERARCHY
+
+1. **Project config** (`.opencode/oh-my-opencode.json`)
+2. **User config** (`~/.config/opencode/oh-my-opencode.json`)
+3. **Defaults** (hardcoded fallbacks)
+
+**Multi-level inheritance:** Project → User → Defaults
+
+## VALIDATION FEATURES
+
+- **JSONC support**: Comments and trailing commas
+- **Type safety**: Full TypeScript inference
+- **Migration support**: Legacy config compatibility
+- **Schema versioning**: $schema field for validation
+
+## KEY SCHEMAS
+
+| Schema | Purpose | Lines |
+|--------|---------|-------|
+| `OhMyOpenCodeConfigSchema` | Root config schema | 400+ |
+| `AgentOverrideConfigSchema` | Agent customization | 50+ |
+| `CategoryConfigSchema` | Task category defaults | 30+ |
+| `ExperimentalConfigSchema` | Beta features | 40+ |
+
+## USAGE PATTERNS
+
+**Agent Override:**
+```typescript
+agents: {
+  sisyphus: {
+    model: "anthropic/claude-opus-4-6",
+    variant: "max",
+    temperature: 0.1
+  }
+}
+```
+
+**Category Definition:**
+```typescript
+categories: {
+  "visual-engineering": {
+    model: "google/gemini-3-pro",
+    variant: "high"
+  }
+}
+```
+
+**Experimental Features:**
+```typescript
+experimental: {
+  dynamic_context_pruning: {
+    enabled: true,
+    notification: "detailed"
+  }
+}
+```
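
Note on the configuration hierarchy described in `src/config/AGENTS.md` above (Project → User → Defaults): one way to read it is as a layered merge in which the project layer wins over the user layer, which wins over the defaults. The sketch below assumes a recursive merge of plain objects; whether the actual loader merges deeply or replaces whole sections is not stated in this diff. `ConfigLayer`, `mergeLayers`, and `resolveConfig` are hypothetical names.

```typescript
// Hypothetical, untyped config layer; the real config is validated by a Zod schema.
type ConfigLayer = { [key: string]: unknown }

function isPlainObject(value: unknown): value is ConfigLayer {
  return typeof value === "object" && value !== null && !Array.isArray(value)
}

// Merge `override` onto `base`: nested objects are merged key by key,
// everything else (scalars, arrays) is replaced by the override value.
function mergeLayers(base: ConfigLayer, override: ConfigLayer): ConfigLayer {
  const out: ConfigLayer = { ...base }
  for (const [key, value] of Object.entries(override)) {
    const existing = out[key]
    out[key] = isPlainObject(existing) && isPlainObject(value)
      ? mergeLayers(existing, value)
      : value
  }
  return out
}

// Lowest priority first: defaults, then user config, then project config.
function resolveConfig(defaults: ConfigLayer, user: ConfigLayer, project: ConfigLayer): ConfigLayer {
  return mergeLayers(mergeLayers(defaults, user), project)
}
```

With this reading, a project file that sets only `agents.sisyphus.model` would leave a user-level `variant` and a default `temperature` for the same agent in place.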
--- a/src/features/AGENTS.md
+++ b/src/features/AGENTS.md
@@ -2,61 +2,29 @@

 ## OVERVIEW

-17 feature modules: background agents, skill MCPs, builtin skills/commands, Claude Code compatibility layer, task management.
-
-**Feature Types**: Task orchestration, Skill definitions, Command templates, Claude Code loaders, Supporting utilities
+Feature modules: background agents, skills, Claude Code compatibility loaders, builtin commands, MCP managers, and supporting utilities.

 ## STRUCTURE

-```
 features/
-├── background-agent/           # Task lifecycle (1556 lines)
-│   ├── manager.ts              # Launch → poll → complete
-│   └── concurrency.ts          # Per-provider limits
-├── builtin-skills/             # Core skills
-│   └── skills/                 # playwright, agent-browser, frontend-ui-ux, git-master, dev-browser
-├── builtin-commands/           # ralph-loop, refactor, ulw-loop, init-deep, start-work, cancel-ralph, stop-continuation
-├── claude-code-agent-loader/   # ~/.claude/agents/*.md
-├── claude-code-command-loader/ # ~/.claude/commands/*.md
-├── claude-code-mcp-loader/     # .mcp.json with ${VAR} expansion
-├── claude-code-plugin-loader/  # installed_plugins.json (486 lines)
-├── claude-code-session-state/  # Session persistence
-├── opencode-skill-loader/      # Skills from 6 directories (loader.ts 311 lines)
-├── context-injector/           # AGENTS.md/README.md injection
-├── boulder-state/              # Todo state persistence
-├── hook-message-injector/      # Message injection
-├── task-toast-manager/         # Background task notifications
-├── skill-mcp-manager/          # MCP client lifecycle (640 lines)
-├── tmux-subagent/              # Tmux session management (472 lines)
-├── mcp-oauth/                  # MCP OAuth handling
-└── claude-tasks/               # Task schema/storage - see AGENTS.md
-```
+├── background-agent/                      # Task lifecycle, concurrency (manager.ts 1642 lines)
+├── builtin-skills/                       # Skills like git-master (1107 lines)
+├── builtin-commands/                     # Commands like refactor (619 lines)
+├── skill-mcp-manager/                    # MCP client lifecycle (640 lines)
+├── claude-code-plugin-loader/            # Plugin loading
+├── claude-code-mcp-loader/               # MCP loading
+├── claude-code-session-state/            # Session state
+├── claude-code-command-loader/           # Command loading
+├── claude-code-agent-loader/             # Agent loading
+├── context-injector/                     # Context injection
+├── hook-message-injector/                # Message injection
+├── task-toast-manager/                   # Task toasts
+├── boulder-state/                        # State management
+├── tmux-subagent/                        # Tmux subagent
+├── mcp-oauth/                            # OAuth for MCP
+├── opencode-skill-loader/                # Skill loading
+└── tool-metadata-store/                  # Tool metadata

-## LOADER PRIORITY
+## HOW TO ADD

-| Type | Priority (highest first) |
-|------|--------------------------|
-| Commands | `.opencode/command/` > `~/.config/opencode/command/` > `.claude/commands/` |
-| Skills | `.opencode/skills/` > `~/.config/opencode/skills/` > `.claude/skills/` |
-| MCPs | `.claude/.mcp.json` > `.mcp.json` > `~/.claude/.mcp.json` |
-
-## BACKGROUND AGENT
-
- **Lifecycle**: `launch` → `poll` (2s) → `complete`
- **Stability**: 3 consecutive polls = idle
- **Concurrency**: Per-provider/model limits via `ConcurrencyManager`
- **Cleanup**: 30m TTL, 3m stale timeout
- **State**: Per-session Maps, cleaned on `session.deleted`
-
-## SKILL MCP
-
- **Lazy**: Clients created on first call
- **Transports**: stdio, http (SSE/Streamable)
- **Lifecycle**: 5m idle cleanup
-
-## ANTI-PATTERNS
-
- **Sequential delegation**: Use `task` parallel
- **Trust self-reports**: ALWAYS verify
- **Main thread blocks**: No heavy I/O in loader init
- **Direct state mutation**: Use managers for boulder/session state
+Create a directory under `features/` with `index.ts`, `types.ts`, and any other modules the feature needs.
--- a/src/features/background-agent/manager.test.ts
+++ b/src/features/background-agent/manager.test.ts
@@ -1123,6 +1123,99 @@ describe("BackgroundManager.tryCompleteTask", () => {
    expect(task.status).toBe("completed")
    expect(getPendingByParent(manager).get(task.parentSessionID)).toBeUndefined()
  })
+
+  test("should avoid overlapping promptAsync calls when tasks complete concurrently", async () => {
+    // given
+    type PromptAsyncBody = Record<string, unknown> & { noReply?: boolean }
+
+    let resolveMessages: ((value: { data: unknown[] }) => void) | undefined
+    const messagesBarrier = new Promise<{ data: unknown[] }>((resolve) => {
+      resolveMessages = resolve
+    })
+
+    const promptBodies: PromptAsyncBody[] = []
+    let promptInFlight = false
+    let rejectedCount = 0
+    let promptCallCount = 0
+
+    let releaseFirstPrompt: (() => void) | undefined
+    let resolveFirstStarted: (() => void) | undefined
+    const firstStarted = new Promise<void>((resolve) => {
+      resolveFirstStarted = resolve
+    })
+
+    const client = {
+      session: {
+        prompt: async () => ({}),
+        abort: async () => ({}),
+        messages: async () => messagesBarrier,
+        promptAsync: async (args: { path: { id: string }; body: PromptAsyncBody }) => {
+          promptBodies.push(args.body)
+
+          if (!promptInFlight) {
+            promptCallCount += 1
+            if (promptCallCount === 1) {
+              promptInFlight = true
+              resolveFirstStarted?.()
+              return await new Promise((resolve) => {
+                releaseFirstPrompt = () => {
+                  promptInFlight = false
+                  resolve({})
+                }
+              })
+            }
+
+            return {}
+          }
+
+          rejectedCount += 1
+          throw new Error("BUSY")
+        },
+      },
+    }
+
+    manager.shutdown()
+    manager = new BackgroundManager({ client, directory: tmpdir() } as unknown as PluginInput)
+
+    const parentSessionID = "parent-session"
+    const taskA = createMockTask({
+      id: "task-a",
+      sessionID: "session-a",
+      parentSessionID,
+    })
+    const taskB = createMockTask({
+      id: "task-b",
+      sessionID: "session-b",
+      parentSessionID,
+    })
+
+    getTaskMap(manager).set(taskA.id, taskA)
+    getTaskMap(manager).set(taskB.id, taskB)
+    getPendingByParent(manager).set(parentSessionID, new Set([taskA.id, taskB.id]))
+
+    // when
+    const completionA = tryCompleteTaskForTest(manager, taskA)
+    const completionB = tryCompleteTaskForTest(manager, taskB)
+    resolveMessages?.({ data: [] })
+
+    await firstStarted
+
+    // Give the second completion a chance to attempt promptAsync while the first is in-flight.
+    // In the buggy implementation, this triggers an overlap and increments rejectedCount.
+    for (let i = 0; i < 20; i++) {
+      await Promise.resolve()
+      if (rejectedCount > 0) break
+      if (promptBodies.length >= 2) break
+    }
+
+    releaseFirstPrompt?.()
+    await Promise.all([completionA, completionB])
+
+    // then
+    expect(rejectedCount).toBe(0)
+    expect(promptBodies.length).toBe(2)
+    expect(promptBodies.some((b) => b.noReply === false)).toBe(true)
+  })
 })

 describe("BackgroundManager.trackTask", () => {
@@ -1319,14 +1412,14 @@ describe("BackgroundManager - Non-blocking Queue Integration", () => {
  let manager: BackgroundManager
  let mockClient: ReturnType<typeof createMockClient>

-   function createMockClient() {
-     return {
-       session: {
-         create: async () => ({ data: { id: `ses_${crypto.randomUUID()}` } }),
-         get: async () => ({ data: { directory: "/test/dir" } }),
-         prompt: async () => ({}),
-         promptAsync: async () => ({}),
-         messages: async () => ({ data: [] }),
+    function createMockClient() {
+      return {
+        session: {
+          create: async (_args?: any) => ({ data: { id: `ses_${crypto.randomUUID()}` } }),
+          get: async () => ({ data: { directory: "/test/dir" } }),
+          prompt: async () => ({}),
+          promptAsync: async () => ({}),
+          messages: async () => ({ data: [] }),
         todo: async () => ({ data: [] }),
         status: async () => ({ data: {} }),
         abort: async () => ({}),
@@ -1427,6 +1520,55 @@ describe("BackgroundManager - Non-blocking Queue Integration", () => {
  })

  describe("task transitions pending→running when slot available", () => {
+    test("should inherit parent session permission rules (and force deny question)", async () => {
+      // given
+      const createCalls: any[] = []
+      const parentPermission = [
+        { permission: "question", action: "allow" as const, pattern: "*" },
+        { permission: "plan_enter", action: "deny" as const, pattern: "*" },
+      ]
+
+      const customClient = {
+        session: {
+          create: async (args?: any) => {
+            createCalls.push(args)
+            return { data: { id: `ses_${crypto.randomUUID()}` } }
+          },
+          get: async () => ({ data: { directory: "/test/dir", permission: parentPermission } }),
+          prompt: async () => ({}),
+          promptAsync: async () => ({}),
+          messages: async () => ({ data: [] }),
+          todo: async () => ({ data: [] }),
+          status: async () => ({ data: {} }),
+          abort: async () => ({}),
+        },
+      }
+      manager.shutdown()
+      manager = new BackgroundManager({ client: customClient, directory: tmpdir() } as unknown as PluginInput, {
+        defaultConcurrency: 5,
+      })
+
+      const input = {
+        description: "Test task",
+        prompt: "Do something",
+        agent: "test-agent",
+        parentSessionID: "parent-session",
+        parentMessageID: "parent-message",
+      }
+
+      // when
+      await manager.launch(input)
+      await new Promise(resolve => setTimeout(resolve, 50))
+
+      // then
+      expect(createCalls).toHaveLength(1)
+      const permission = createCalls[0]?.body?.permission
+      expect(permission).toEqual([
+        { permission: "plan_enter", action: "deny", pattern: "*" },
+        { permission: "question", action: "deny", pattern: "*" },
+      ])
+    })
+
    test("should transition first task to running immediately", async () => {
      // given
      const config = { defaultConcurrency: 5 }
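
Note on the concurrency test in `manager.test.ts` above: it expects two completions for the same parent session never to have overlapping `promptAsync` calls. A common way to get that guarantee is a per-key promise chain, where each new operation is scheduled after the previous one for the same key settles. The class below is a generic sketch of that technique, not the body of `enqueueNotificationForParent`; `KeyedSerialQueue` and its members are hypothetical names.

```typescript
// Serializes async operations that share a key; different keys run independently.
class KeyedSerialQueue {
  private tails = new Map<string, Promise<void>>()

  enqueue(key: string, operation: () => Promise<void>): Promise<void> {
    const previous = this.tails.get(key) ?? Promise.resolve()

    // Start only after the previous operation settles, whether it succeeded or failed.
    const current = previous.catch(() => {}).then(operation)

    // Store a tail that never rejects, so one failure does not block later operations.
    const tail: Promise<void> = current.catch(() => {})
    this.tails.set(key, tail)

    // Remove the entry once nothing newer has been queued behind it.
    void tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key)
    })

    // The caller still sees the real outcome, including rejections.
    return current
  }
}
```

Returning `current` rather than `tail` lets callers keep their own `catch` handling, as the call sites in this diff do, while the queue itself stays usable after an error.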
--- a/src/features/background-agent/manager.ts
+++ b/src/features/background-agent/manager.ts
@@ -89,6 +89,7 @@ export class BackgroundManager {
  private processingKeys: Set<string> = new Set()
  private completionTimers: Map<string, ReturnType<typeof setTimeout>> = new Map()
  private idleDeferralTimers: Map<string, ReturnType<typeof setTimeout>> = new Map()
+  private notificationQueueByParent: Map<string, Promise<void>> = new Map()

  constructor(
    ctx: PluginInput,
@@ -235,13 +236,17 @@ export class BackgroundManager {
    const parentDirectory = parentSession?.data?.directory ?? this.directory
    log(`[background-agent] Parent dir: ${parentSession?.data?.directory}, using: ${parentDirectory}`)

+    const inheritedPermission = (parentSession as any)?.data?.permission
+    const permissionRules = Array.isArray(inheritedPermission)
+      ? inheritedPermission.filter((r: any) => r?.permission !== "question")
+      : []
+    permissionRules.push({ permission: "question", action: "deny" as const, pattern: "*" })
+
    const createResult = await this.client.session.create({
      body: {
        parentID: input.parentSessionID,
        title: `${input.description} (@${input.agent} subagent)`,
-        permission: [
-          { permission: "question", action: "deny" as const, pattern: "*" },
-        ],
+        permission: permissionRules,
      } as any,
      query: {
        directory: parentDirectory,
@@ -358,7 +363,7 @@ export class BackgroundManager {

        this.markForNotification(existingTask)
        this.cleanupPendingByParent(existingTask)
-        this.notifyParentSession(existingTask).catch(err => {
+        this.enqueueNotificationForParent(existingTask.parentSessionID, () => this.notifyParentSession(existingTask)).catch(err => {
          log("[background-agent] Failed to notify on error:", err)
        })
      }
@@ -615,7 +620,7 @@ export class BackgroundManager {

      this.markForNotification(existingTask)
      this.cleanupPendingByParent(existingTask)
-      this.notifyParentSession(existingTask).catch(err => {
+      this.enqueueNotificationForParent(existingTask.parentSessionID, () => this.notifyParentSession(existingTask)).catch(err => {
        log("[background-agent] Failed to notify on resume error:", err)
      })
    })
@@ -949,7 +954,7 @@ export class BackgroundManager {
    this.markForNotification(task)

    try {
-      await this.notifyParentSession(task)
+      await this.enqueueNotificationForParent(task.parentSessionID, () => this.notifyParentSession(task))
      log(`[background-agent] Task cancelled via ${source}:`, task.id)
    } catch (err) {
      log("[background-agent] Error in notifyParentSession for cancelled task:", { taskId: task.id, error: err })
@@ -1084,7 +1089,7 @@ export class BackgroundManager {
    }

    try {
-      await this.notifyParentSession(task)
+      await this.enqueueNotificationForParent(task.parentSessionID, () => this.notifyParentSession(task))
      log(`[background-agent] Task completed via ${source}:`, task.id)
    } catch (err) {
      log("[background-agent] Error in notifyParentSession:", { taskId: task.id, error: err })
@@ -1114,16 +1119,19 @@ export class BackgroundManager {

    // Update pending tracking and check if all tasks complete
    const pendingSet = this.pendingByParent.get(task.parentSessionID)
+    let allComplete = false
+    let remainingCount = 0
    if (pendingSet) {
      pendingSet.delete(task.id)
-      if (pendingSet.size === 0) {
+      remainingCount = pendingSet.size
+      allComplete = remainingCount === 0
+      if (allComplete) {
        this.pendingByParent.delete(task.parentSessionID)
      }
+    } else {
+      allComplete = true
    }

-    const allComplete = !pendingSet || pendingSet.size === 0
-    const remainingCount = pendingSet?.size ?? 0
-
    const statusText = task.status === "completed" ? "COMPLETED" : "CANCELLED"
    const errorInfo = task.error ? `\n**Error:** ${task.error}` : ""
    
@@ -1378,7 +1386,7 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
      log(`[background-agent] Task ${task.id} interrupted: stale timeout`)

      try {
-        await this.notifyParentSession(task)
+        await this.enqueueNotificationForParent(task.parentSessionID, () => this.notifyParentSession(task))
      } catch (err) {
        log("[background-agent] Error in notifyParentSession for stale task:", { taskId: task.id, error: err })
      }
@@ -1572,12 +1580,37 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
    this.tasks.clear()
    this.notifications.clear()
    this.pendingByParent.clear()
+    this.notificationQueueByParent.clear()
    this.queuesByKey.clear()
    this.processingKeys.clear()
    this.unregisterProcessCleanup()
    log("[background-agent] Shutdown complete")

  }
+
+  private enqueueNotificationForParent(
+    parentSessionID: string | undefined,
+    operation: () => Promise<void>
+  ): Promise<void> {
+    if (!parentSessionID) {
+      return operation()
+    }
+
+    const previous = this.notificationQueueByParent.get(parentSessionID) ?? Promise.resolve()
+    const current = previous
+      .catch(() => {})
+      .then(operation)
+
+    this.notificationQueueByParent.set(parentSessionID, current)
+
+    void current.finally(() => {
+      if (this.notificationQueueByParent.get(parentSessionID) === current) {
+        this.notificationQueueByParent.delete(parentSessionID)
+      }
+    }).catch(() => {})
+
+    return current
+  }
 }

 function registerProcessSignal(
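Review note: `enqueueNotificationForParent` above serializes notifications per parent session by chaining each operation onto the previous promise stored for that parent. A minimal stand-alone sketch of the same idea follows. `KeyedSerialQueue`, `sleep`, and `demo` are hypothetical names used only for illustration and are not part of this change.

```typescript
// Hypothetical stand-alone per-key promise queue (illustrative, not the plugin's code).
class KeyedSerialQueue {
  private readonly tails = new Map<string, Promise<void>>()

  enqueue(key: string, operation: () => Promise<void>): Promise<void> {
    const previous = this.tails.get(key) ?? Promise.resolve()
    // Swallow the previous failure so one rejected operation does not block the chain.
    const current = previous.catch(() => {}).then(operation)
    this.tails.set(key, current)
    // Drop the map entry once this operation is the last one queued for the key.
    void current
      .finally(() => {
        if (this.tails.get(key) === current) this.tails.delete(key)
      })
      .catch(() => {})
    return current
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Enqueues a slow, a failing, and a fast operation under the same key
// and records the order in which they actually start.
async function demo(): Promise<string[]> {
  const queue = new KeyedSerialQueue()
  const order: string[] = []
  const slow = queue.enqueue("ses_parent", async () => {
    await sleep(20)
    order.push("first")
  })
  const failing = queue.enqueue("ses_parent", async () => {
    order.push("second")
    throw new Error("notify failed")
  })
  const fast = queue.enqueue("ses_parent", async () => {
    order.push("third")
  })
  await Promise.allSettled([slow, failing, fast])
  return order
}
```

In this sketch the caller still sees the rejection of its own operation (as the `.catch(err => ...)` call sites in the diff expect), while later operations for the same key run regardless.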
--- a/src/features/background-agent/spawner.test.ts
+++ b/src/features/background-agent/spawner.test.ts
@@ -0,0 +1,65 @@
+import { describe, test, expect } from "bun:test"
+
+import { createTask, startTask } from "./spawner"
+
+describe("background-agent spawner.startTask", () => {
+  test("should inherit parent session permission rules (and force deny question)", async () => {
+    //#given
+    const createCalls: any[] = []
+    const parentPermission = [
+      { permission: "question", action: "allow" as const, pattern: "*" },
+      { permission: "plan_enter", action: "deny" as const, pattern: "*" },
+    ]
+
+    const client = {
+      session: {
+        get: async () => ({ data: { directory: "/parent/dir", permission: parentPermission } }),
+        create: async (args?: any) => {
+          createCalls.push(args)
+          return { data: { id: "ses_child" } }
+        },
+        promptAsync: async () => ({}),
+      },
+    }
+
+    const task = createTask({
+      description: "Test task",
+      prompt: "Do work",
+      agent: "explore",
+      parentSessionID: "ses_parent",
+      parentMessageID: "msg_parent",
+    })
+
+    const item = {
+      task,
+      input: {
+        description: task.description,
+        prompt: task.prompt,
+        agent: task.agent,
+        parentSessionID: task.parentSessionID,
+        parentMessageID: task.parentMessageID,
+        parentModel: task.parentModel,
+        parentAgent: task.parentAgent,
+        model: task.model,
+      },
+    }
+
+    const ctx = {
+      client,
+      directory: "/fallback",
+      concurrencyManager: { release: () => {} },
+      tmuxEnabled: false,
+      onTaskError: () => {},
+    }
+
+    //#when
+    await startTask(item as any, ctx as any)
+
+    //#then
+    expect(createCalls).toHaveLength(1)
+    expect(createCalls[0]?.body?.permission).toEqual([
+      { permission: "plan_enter", action: "deny", pattern: "*" },
+      { permission: "question", action: "deny", pattern: "*" },
+    ])
+  })
+})
--- a/src/features/background-agent/spawner.ts
+++ b/src/features/background-agent/spawner.ts
@@ -58,13 +58,17 @@ export async function startTask(
  const parentDirectory = parentSession?.data?.directory ?? directory
  log(`[background-agent] Parent dir: ${parentSession?.data?.directory}, using: ${parentDirectory}`)

+  const inheritedPermission = (parentSession as any)?.data?.permission
+  const permissionRules = Array.isArray(inheritedPermission)
+    ? inheritedPermission.filter((r: any) => r?.permission !== "question")
+    : []
+  permissionRules.push({ permission: "question", action: "deny" as const, pattern: "*" })
+
  const createResult = await client.session.create({
    body: {
      parentID: input.parentSessionID,
      title: `Background: ${input.description}`,
-      permission: [
-        { permission: "question", action: "deny" as const, pattern: "*" },
-      ],
+      permission: permissionRules,
    // eslint-disable-next-line @typescript-eslint/no-explicit-any
    } as any,
    query: {
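Review note: the hunk above inherits the parent session's permission rules, drops any existing `question` rule, and appends a forced deny. The same merge can be sketched as a pure function. `PermissionRule` and `buildChildPermissionRules` are hypothetical names, and the `action` union is an assumption based on the two values that appear in this diff.

```typescript
// Hypothetical rule shape; only "allow" and "deny" appear in the diff.
type PermissionRule = {
  permission: string
  action: "allow" | "deny"
  pattern: string
}

// Keep every inherited rule except "question", then force-deny "question".
function buildChildPermissionRules(inherited: unknown): PermissionRule[] {
  const kept = Array.isArray(inherited)
    ? (inherited as PermissionRule[]).filter((rule) => rule?.permission !== "question")
    : []
  return [...kept, { permission: "question", action: "deny", pattern: "*" }]
}

const parentRules: PermissionRule[] = [
  { permission: "question", action: "allow", pattern: "*" },
  { permission: "plan_enter", action: "deny", pattern: "*" },
]
const childRules = buildChildPermissionRules(parentRules)
const rulesWithoutParent = buildChildPermissionRules(undefined)
```

Returning a new array, rather than pushing onto the filtered one, is a stylistic choice of the sketch; the resulting rule order matches what `spawner.test.ts` asserts.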
--- a/src/features/background-agent/spawner/background-session-creator.ts
+++ b/src/features/background-agent/spawner/background-session-creator.ts
@@ -0,0 +1,46 @@
+import type { OpencodeClient } from "../constants"
+import type { ConcurrencyManager } from "../concurrency"
+import type { LaunchInput } from "../types"
+import { log } from "../../../shared"
+
+export async function createBackgroundSession(options: {
+  client: OpencodeClient
+  input: LaunchInput
+  parentDirectory: string
+  concurrencyManager: ConcurrencyManager
+  concurrencyKey: string
+}): Promise<string> {
+  const { client, input, parentDirectory, concurrencyManager, concurrencyKey } = options
+
+  const body = {
+    parentID: input.parentSessionID,
+    title: `Background: ${input.description}`,
+    permission: [{ permission: "question", action: "deny" as const, pattern: "*" }],
+  }
+
+  const createResult = await client.session
+    .create({
+      body,
+      query: {
+        directory: parentDirectory,
+      },
+    })
+    .catch((error) => {
+      concurrencyManager.release(concurrencyKey)
+      throw error
+    })
+
+  if (createResult.error) {
+    concurrencyManager.release(concurrencyKey)
+    throw new Error(`Failed to create background session: ${createResult.error}`)
+  }
+
+  if (!createResult.data?.id) {
+    concurrencyManager.release(concurrencyKey)
+    throw new Error("Failed to create background session: API returned no session ID")
+  }
+
+  const sessionID = createResult.data.id
+  log("[background-agent] Background session created", { sessionID })
+  return sessionID
+}
--- a/src/features/background-agent/spawner/concurrency-key-from-launch-input.ts
+++ b/src/features/background-agent/spawner/concurrency-key-from-launch-input.ts
@@ -0,0 +1,7 @@
+import type { LaunchInput } from "../types"
+
+export function getConcurrencyKeyFromLaunchInput(input: LaunchInput): string {
+  return input.model
+    ? `${input.model.providerID}/${input.model.modelID}`
+    : input.agent
+}
--- a/src/features/background-agent/spawner/parent-directory-resolver.ts
+++ b/src/features/background-agent/spawner/parent-directory-resolver.ts
@@ -0,0 +1,21 @@
+import type { OpencodeClient } from "../constants"
+import { log } from "../../../shared"
+
+export async function resolveParentDirectory(options: {
+  client: OpencodeClient
+  parentSessionID: string
+  defaultDirectory: string
+}): Promise<string> {
+  const { client, parentSessionID, defaultDirectory } = options
+
+  const parentSession = await client.session
+    .get({ path: { id: parentSessionID } })
+    .catch((error) => {
+      log(`[background-agent] Failed to get parent session: ${error}`)
+      return null
+    })
+
+  const parentDirectory = parentSession?.data?.directory ?? defaultDirectory
+  log(`[background-agent] Parent dir: ${parentSession?.data?.directory}, using: ${parentDirectory}`)
+  return parentDirectory
+}
--- a/src/features/background-agent/spawner/tmux-callback-invoker.ts
+++ b/src/features/background-agent/spawner/tmux-callback-invoker.ts
@@ -0,0 +1,39 @@
+import type { OnSubagentSessionCreated } from "../constants"
+import { TMUX_CALLBACK_DELAY_MS } from "../constants"
+import { log } from "../../../shared"
+import { isInsideTmux } from "../../../shared/tmux"
+
+export async function maybeInvokeTmuxCallback(options: {
+  onSubagentSessionCreated?: OnSubagentSessionCreated
+  tmuxEnabled: boolean
+  sessionID: string
+  parentID: string
+  title: string
+}): Promise<void> {
+  const { onSubagentSessionCreated, tmuxEnabled, sessionID, parentID, title } = options
+
+  log("[background-agent] tmux callback check", {
+    hasCallback: !!onSubagentSessionCreated,
+    tmuxEnabled,
+    isInsideTmux: isInsideTmux(),
+    sessionID,
+    parentID,
+  })
+
+  if (!onSubagentSessionCreated || !tmuxEnabled || !isInsideTmux()) {
+    log("[background-agent] SKIP tmux callback - conditions not met")
+    return
+  }
+
+  log("[background-agent] Invoking tmux callback NOW", { sessionID })
+  await onSubagentSessionCreated({
+    sessionID,
+    parentID,
+    title,
+  }).catch((error) => {
+    log("[background-agent] Failed to spawn tmux pane:", error)
+  })
+
+  log("[background-agent] tmux callback completed, waiting")
+  await new Promise<void>((resolve) => setTimeout(resolve, TMUX_CALLBACK_DELAY_MS))
+}
--- a/src/features/builtin-commands/commands.test.ts
+++ b/src/features/builtin-commands/commands.test.ts
@@ -0,0 +1,138 @@
+import { describe, test, expect } from "bun:test"
+import { loadBuiltinCommands } from "./commands"
+import { HANDOFF_TEMPLATE } from "./templates/handoff"
+import type { BuiltinCommandName } from "./types"
+
+describe("loadBuiltinCommands", () => {
+  test("should include handoff command in loaded commands", () => {
+    //#given
+    const disabledCommands: BuiltinCommandName[] = []
+
+    //#when
+    const commands = loadBuiltinCommands(disabledCommands)
+
+    //#then
+    expect(commands.handoff).toBeDefined()
+    expect(commands.handoff.name).toBe("handoff")
+  })
+
+  test("should exclude handoff when disabled", () => {
+    //#given
+    const disabledCommands: BuiltinCommandName[] = ["handoff"]
+
+    //#when
+    const commands = loadBuiltinCommands(disabledCommands)
+
+    //#then
+    expect(commands.handoff).toBeUndefined()
+  })
+
+  test("should include handoff template content in command template", () => {
+    //#given - no disabled commands
+
+    //#when
+    const commands = loadBuiltinCommands()
+
+    //#then
+    expect(commands.handoff.template).toContain(HANDOFF_TEMPLATE)
+  })
+
+  test("should include session context variables in handoff template", () => {
+    //#given - no disabled commands
+
+    //#when
+    const commands = loadBuiltinCommands()
+
+    //#then
+    expect(commands.handoff.template).toContain("$SESSION_ID")
+    expect(commands.handoff.template).toContain("$TIMESTAMP")
+    expect(commands.handoff.template).toContain("$ARGUMENTS")
+  })
+
+  test("should have correct description for handoff", () => {
+    //#given - no disabled commands
+
+    //#when
+    const commands = loadBuiltinCommands()
+
+    //#then
+    expect(commands.handoff.description).toContain("context summary")
+  })
+})
+
+describe("HANDOFF_TEMPLATE", () => {
+  test("should include session reading instruction", () => {
+    //#given - the template string
+
+    //#when / #then
+    expect(HANDOFF_TEMPLATE).toContain("session_read")
+  })
+
+  test("should include compaction-style sections in output format", () => {
+    //#given - the template string
+
+    //#when / #then
+    expect(HANDOFF_TEMPLATE).toContain("USER REQUESTS (AS-IS)")
+    expect(HANDOFF_TEMPLATE).toContain("EXPLICIT CONSTRAINTS")
+  })
+
+  test("should include programmatic context gathering instructions", () => {
+    //#given - the template string
+
+    //#when / #then
+    expect(HANDOFF_TEMPLATE).toContain("todoread")
+    expect(HANDOFF_TEMPLATE).toContain("git diff")
+    expect(HANDOFF_TEMPLATE).toContain("git status")
+  })
+
+  test("should include context extraction format", () => {
+    //#given - the template string
+
+    //#when / #then
+    expect(HANDOFF_TEMPLATE).toContain("WORK COMPLETED")
+    expect(HANDOFF_TEMPLATE).toContain("CURRENT STATE")
+    expect(HANDOFF_TEMPLATE).toContain("PENDING TASKS")
+    expect(HANDOFF_TEMPLATE).toContain("KEY FILES")
+    expect(HANDOFF_TEMPLATE).toContain("IMPORTANT DECISIONS")
+    expect(HANDOFF_TEMPLATE).toContain("CONTEXT FOR CONTINUATION")
+    expect(HANDOFF_TEMPLATE).toContain("GOAL")
+  })
+
+  test("should enforce first person perspective", () => {
+    //#given - the template string
+
+    //#when / #then
+    expect(HANDOFF_TEMPLATE).toContain("first person perspective")
+  })
+
+  test("should limit key files to 10", () => {
+    //#given - the template string
+
+    //#when / #then
+    expect(HANDOFF_TEMPLATE).toContain("Maximum 10 files")
+  })
+
+  test("should instruct plain text format without markdown", () => {
+    //#given - the template string
+
+    //#when / #then
+    expect(HANDOFF_TEMPLATE).toContain("Plain text with bullets")
+    expect(HANDOFF_TEMPLATE).toContain("No markdown headers")
+  })
+
+  test("should include user instructions for new session", () => {
+    //#given - the template string
+
+    //#when / #then
+    expect(HANDOFF_TEMPLATE).toContain("new session")
+    expect(HANDOFF_TEMPLATE).toContain("opencode")
+  })
+
+  test("should not contain emojis", () => {
+    //#given - the template string
+
+    //#when / #then
+    const emojiRegex = /[\u{1F600}-\u{1F64F}\u{1F300}-\u{1F5FF}\u{1F680}-\u{1F6FF}\u{1F1E0}-\u{1F1FF}\u{2702}-\u{27B0}\u{24C2}-\u{1F251}\u{1F900}-\u{1F9FF}\u{1FA00}-\u{1FA6F}\u{1FA70}-\u{1FAFF}\u{2600}-\u{26FF}\u{2700}-\u{27BF}]/u
+    expect(emojiRegex.test(HANDOFF_TEMPLATE)).toBe(false)
+  })
+})
--- a/src/features/builtin-commands/commands.ts
+++ b/src/features/builtin-commands/commands.ts
@@ -5,6 +5,7 @@ import { RALPH_LOOP_TEMPLATE, CANCEL_RALPH_TEMPLATE } from "./templates/ralph-lo
 import { STOP_CONTINUATION_TEMPLATE } from "./templates/stop-continuation"
 import { REFACTOR_TEMPLATE } from "./templates/refactor"
 import { START_WORK_TEMPLATE } from "./templates/start-work"
+import { HANDOFF_TEMPLATE } from "./templates/handoff"

 const BUILTIN_COMMAND_DEFINITIONS: Record<BuiltinCommandName, Omit<CommandDefinition, "name">> = {
  "init-deep": {
@@ -77,6 +78,22 @@ $ARGUMENTS
 ${STOP_CONTINUATION_TEMPLATE}
 </command-instruction>`,
  },
+  handoff: {
+    description: "(builtin) Create a detailed context summary for continuing work in a new session",
+    template: `<command-instruction>
+${HANDOFF_TEMPLATE}
+</command-instruction>
+
+<session-context>
+Session ID: $SESSION_ID
+Timestamp: $TIMESTAMP
+</session-context>
+
+<user-request>
+$ARGUMENTS
+</user-request>`,
+    argumentHint: "[goal]",
+  },
 }

 export function loadBuiltinCommands(
--- a/src/features/builtin-commands/templates/handoff.ts
+++ b/src/features/builtin-commands/templates/handoff.ts
@@ -0,0 +1,177 @@
+export const HANDOFF_TEMPLATE = `# Handoff Command
+
+## Purpose
+
+Use /handoff when:
+- The current session context is getting too long and quality is degrading
+- You want to start fresh while preserving essential context from this session
+- The context window is approaching capacity
+
+This creates a detailed context summary that can be used to continue work in a new session.
+
+---
+
+# PHASE 0: VALIDATE REQUEST
+
+Before proceeding, confirm:
+- [ ] There is meaningful work or context in this session to preserve
+- [ ] The user wants to create a handoff summary (not just asking about it)
+
+If the session is nearly empty or has no meaningful context, inform the user there is nothing substantial to hand off.
+
+---
+
+# PHASE 1: GATHER PROGRAMMATIC CONTEXT
+
+Execute these tools to gather concrete data:
+
+1. session_read({ session_id: "$SESSION_ID" }) — full session history
+2. todoread() — current task progress
+3. Bash({ command: "git diff --stat HEAD~10..HEAD" }) — recent file changes
+4. Bash({ command: "git status --porcelain" }) — uncommitted changes
+
+Suggested execution order:
+
+\`\`\`
+session_read({ session_id: "$SESSION_ID" })
+todoread()
+Bash({ command: "git diff --stat HEAD~10..HEAD" })
+Bash({ command: "git status --porcelain" })
+\`\`\`
+
+Analyze the gathered outputs to understand:
+- What the user asked for (exact wording)
+- What work was completed
+- What tasks remain incomplete (include todo state)
+- What decisions were made
+- What files were modified or discussed (include git diff/stat + status)
+- What patterns, constraints, or preferences were established
+
+---
+
+# PHASE 2: EXTRACT CONTEXT
+
+Write the context summary from first person perspective ("I did...", "I told you...").
+
+Focus on:
+- Capabilities and behavior, not file-by-file implementation details
+- What matters for continuing the work
+- Avoiding excessive implementation details (variable names, storage keys, constants) unless critical
+- USER REQUESTS (AS-IS) must be verbatim (do not paraphrase)
+- EXPLICIT CONSTRAINTS must be verbatim only (do not invent)
+
+Questions to consider when extracting:
+- What did I just do or implement?
+- What instructions did I already give which are still relevant (e.g. follow patterns in the codebase)?
+- What files did I tell you are important or that I am working on?
+- Did I provide a plan or spec that should be included?
+- What did I already tell you that is important (libraries, patterns, constraints, preferences)?
+- What important technical details did I discover (APIs, methods, patterns)?
+- What caveats, limitations, or open questions did I find?
+
+---
+
+# PHASE 3: FORMAT OUTPUT
+
+Generate a handoff summary using this exact format:
+
+\`\`\`
+HANDOFF CONTEXT
+===============
+
+USER REQUESTS (AS-IS)
+---------------------
+- [Exact verbatim user requests - NOT paraphrased]
+
+GOAL
+----
+[One sentence describing what should be done next]
+
+WORK COMPLETED
+--------------
+- [First person bullet points of what was done]
+- [Include specific file paths when relevant]
+- [Note key implementation decisions]
+
+CURRENT STATE
+-------------
+- [Current state of the codebase or task]
+- [Build/test status if applicable]
+- [Any environment or configuration state]
+
+PENDING TASKS
+-------------
+- [Tasks that were planned but not completed]
+- [Next logical steps to take]
+- [Any blockers or issues encountered]
+- [Include current todo state from todoread()]
+
+KEY FILES
+---------
+- [path/to/file1] - [brief role description]
+- [path/to/file2] - [brief role description]
+(Maximum 10 files, prioritized by importance)
+- (Include files from git diff/stat and git status)
+
+IMPORTANT DECISIONS
+-------------------
+- [Technical decisions that were made and why]
+- [Trade-offs that were considered]
+- [Patterns or conventions established]
+
+EXPLICIT CONSTRAINTS
+--------------------
+- [Verbatim constraints only - from user or existing AGENTS.md]
+- If none, write: None
+
+CONTEXT FOR CONTINUATION
+------------------------
+- [What the next session needs to know to continue]
+- [Warnings or gotchas to be aware of]
+- [References to documentation if relevant]
+\`\`\`
+
+Rules for the summary:
+- Plain text with bullets
+- No markdown headers with # (use the format above with dashes)
+- No bold, italic, or code fences within content
+- Use workspace-relative paths for files
+- Keep it focused - only include what matters for continuation
+- Pick an appropriate length based on complexity
+- USER REQUESTS (AS-IS) and EXPLICIT CONSTRAINTS must be verbatim only
+
+---
+
+# PHASE 4: PROVIDE INSTRUCTIONS
+
+After generating the summary, instruct the user:
+
+\`\`\`
+---
+
+TO CONTINUE IN A NEW SESSION:
+
+1. Press 'n' in OpenCode TUI to open a new session, or run 'opencode' in a new terminal
+2. Paste the HANDOFF CONTEXT above as your first message
+3. Add your request: "Continue from the handoff context above. [Your next task]"
+
+The new session will have all context needed to continue seamlessly.
+\`\`\`
+
+---
+
+# IMPORTANT CONSTRAINTS
+
+- DO NOT attempt to programmatically create new sessions (no API available to agents)
+- DO provide a self-contained summary that works without access to this session
+- DO include workspace-relative file paths
+- DO NOT include sensitive information (API keys, credentials, secrets)
+- DO NOT exceed 10 files in the KEY FILES section
+- DO keep the GOAL section to a single sentence or short paragraph
+
+---
+
+# EXECUTE NOW
+
+Begin by gathering programmatic context, then synthesize the handoff summary.
+`
--- a/src/features/builtin-commands/types.ts
+++ b/src/features/builtin-commands/types.ts
@@ -1,6 +1,6 @@
 import type { CommandDefinition } from "../claude-code-command-loader"

-export type BuiltinCommandName = "init-deep" | "ralph-loop" | "cancel-ralph" | "ulw-loop" | "refactor" | "start-work" | "stop-continuation"
+export type BuiltinCommandName = "init-deep" | "ralph-loop" | "cancel-ralph" | "ulw-loop" | "refactor" | "start-work" | "stop-continuation" | "handoff"

 export interface BuiltinCommandConfig {
  disabled_commands?: BuiltinCommandName[]
--- a/src/features/claude-code-mcp-loader/loader.test.ts
+++ b/src/features/claude-code-mcp-loader/loader.test.ts
@@ -8,6 +8,17 @@ const TEST_DIR = join(tmpdir(), "mcp-loader-test-" + Date.now())
 describe("getSystemMcpServerNames", () => {
  beforeEach(() => {
    mkdirSync(TEST_DIR, { recursive: true })
+
+    // Isolate tests from real user environment (e.g., ~/.claude.json).
+    // loader.ts reads user-level config via os.homedir() + getClaudeConfigDir().
+    mock.module("os", () => ({
+      homedir: () => TEST_DIR,
+      tmpdir,
+    }))
+
+    mock.module("../../shared", () => ({
+      getClaudeConfigDir: () => join(TEST_DIR, ".claude"),
+    }))
  })

  afterEach(() => {
--- a/src/features/claude-tasks/index.ts
+++ b/src/features/claude-tasks/index.ts
@@ -1,2 +1,3 @@
 export * from "./types"
 export * from "./storage"
+export * from "./session-storage"
--- a/src/features/claude-tasks/session-storage.test.ts
+++ b/src/features/claude-tasks/session-storage.test.ts
@@ -0,0 +1,204 @@
+import { describe, test, expect, beforeEach, afterEach } from "bun:test"
+import { existsSync, mkdirSync, rmSync, writeFileSync } from "fs"
+import { join } from "path"
+import type { OhMyOpenCodeConfig } from "../../config/schema"
+import {
+  getSessionTaskDir,
+  listSessionTaskFiles,
+  listAllSessionDirs,
+  findTaskAcrossSessions,
+} from "./session-storage"
+
+const TEST_DIR = ".test-session-storage"
+const TEST_DIR_ABS = join(process.cwd(), TEST_DIR)
+
+function makeConfig(storagePath: string): Partial<OhMyOpenCodeConfig> {
+  return {
+    sisyphus: {
+      tasks: { storage_path: storagePath, claude_code_compat: false },
+    },
+  }
+}
+
+describe("getSessionTaskDir", () => {
+  test("returns session-scoped subdirectory under base task dir", () => {
+    //#given
+    const config = makeConfig("/tmp/tasks")
+    const sessionID = "ses_abc123"
+
+    //#when
+    const result = getSessionTaskDir(config, sessionID)
+
+    //#then
+    expect(result).toBe("/tmp/tasks/ses_abc123")
+  })
+
+  test("uses relative storage path joined with cwd", () => {
+    //#given
+    const config = makeConfig(TEST_DIR)
+    const sessionID = "ses_xyz"
+
+    //#when
+    const result = getSessionTaskDir(config, sessionID)
+
+    //#then
+    expect(result).toBe(join(TEST_DIR_ABS, "ses_xyz"))
+  })
+})
+
+describe("listSessionTaskFiles", () => {
+  beforeEach(() => {
+    if (existsSync(TEST_DIR_ABS)) {
+      rmSync(TEST_DIR_ABS, { recursive: true, force: true })
+    }
+  })
+
+  afterEach(() => {
+    if (existsSync(TEST_DIR_ABS)) {
+      rmSync(TEST_DIR_ABS, { recursive: true, force: true })
+    }
+  })
+
+  test("returns empty array when session directory does not exist", () => {
+    //#given
+    const config = makeConfig(TEST_DIR)
+
+    //#when
+    const result = listSessionTaskFiles(config, "nonexistent-session")
+
+    //#then
+    expect(result).toEqual([])
+  })
+
+  test("lists only T-*.json files in the session directory", () => {
+    //#given
+    const config = makeConfig(TEST_DIR)
+    const sessionDir = join(TEST_DIR_ABS, "ses_001")
+    mkdirSync(sessionDir, { recursive: true })
+    writeFileSync(join(sessionDir, "T-aaa.json"), "{}", "utf-8")
+    writeFileSync(join(sessionDir, "T-bbb.json"), "{}", "utf-8")
+    writeFileSync(join(sessionDir, "other.txt"), "nope", "utf-8")
+
+    //#when
+    const result = listSessionTaskFiles(config, "ses_001")
+
+    //#then
+    expect(result).toHaveLength(2)
+    expect(result).toContain("T-aaa")
+    expect(result).toContain("T-bbb")
+  })
+
+  test("does not list tasks from other sessions", () => {
+    //#given
+    const config = makeConfig(TEST_DIR)
+    const session1Dir = join(TEST_DIR_ABS, "ses_001")
+    const session2Dir = join(TEST_DIR_ABS, "ses_002")
+    mkdirSync(session1Dir, { recursive: true })
+    mkdirSync(session2Dir, { recursive: true })
+    writeFileSync(join(session1Dir, "T-from-s1.json"), "{}", "utf-8")
+    writeFileSync(join(session2Dir, "T-from-s2.json"), "{}", "utf-8")
+
+    //#when
+    const result = listSessionTaskFiles(config, "ses_001")
+
+    //#then
+    expect(result).toEqual(["T-from-s1"])
+  })
+})
+
+describe("listAllSessionDirs", () => {
+  beforeEach(() => {
+    if (existsSync(TEST_DIR_ABS)) {
+      rmSync(TEST_DIR_ABS, { recursive: true, force: true })
+    }
+  })
+
+  afterEach(() => {
+    if (existsSync(TEST_DIR_ABS)) {
+      rmSync(TEST_DIR_ABS, { recursive: true, force: true })
+    }
+  })
+
+  test("returns empty array when base directory does not exist", () => {
+    //#given
+    const config = makeConfig(TEST_DIR)
+
+    //#when
+    const result = listAllSessionDirs(config)
+
+    //#then
+    expect(result).toEqual([])
+  })
+
+  test("returns only directory entries (not files)", () => {
+    //#given
+    const config = makeConfig(TEST_DIR)
+    mkdirSync(TEST_DIR_ABS, { recursive: true })
+    mkdirSync(join(TEST_DIR_ABS, "ses_001"), { recursive: true })
+    mkdirSync(join(TEST_DIR_ABS, "ses_002"), { recursive: true })
+    writeFileSync(join(TEST_DIR_ABS, ".lock"), "{}", "utf-8")
+    writeFileSync(join(TEST_DIR_ABS, "T-legacy.json"), "{}", "utf-8")
+
+    //#when
+    const result = listAllSessionDirs(config)
+
+    //#then
+    expect(result).toHaveLength(2)
+    expect(result).toContain("ses_001")
+    expect(result).toContain("ses_002")
+  })
+})
+
+describe("findTaskAcrossSessions", () => {
+  beforeEach(() => {
+    if (existsSync(TEST_DIR_ABS)) {
+      rmSync(TEST_DIR_ABS, { recursive: true, force: true })
+    }
+  })
+
+  afterEach(() => {
+    if (existsSync(TEST_DIR_ABS)) {
+      rmSync(TEST_DIR_ABS, { recursive: true, force: true })
+    }
+  })
+
+  test("returns null when task does not exist in any session", () => {
+    //#given
+    const config = makeConfig(TEST_DIR)
+    mkdirSync(join(TEST_DIR_ABS, "ses_001"), { recursive: true })
+
+    //#when
+    const result = findTaskAcrossSessions(config, "T-nonexistent")
+
+    //#then
+    expect(result).toBeNull()
+  })
+
+  test("finds task in the correct session directory", () => {
+    //#given
+    const config = makeConfig(TEST_DIR)
+    const session2Dir = join(TEST_DIR_ABS, "ses_002")
+    mkdirSync(join(TEST_DIR_ABS, "ses_001"), { recursive: true })
+    mkdirSync(session2Dir, { recursive: true })
+    writeFileSync(join(session2Dir, "T-target.json"), '{"id":"T-target"}', "utf-8")
+
+    //#when
+    const result = findTaskAcrossSessions(config, "T-target")
+
+    //#then
+    expect(result).not.toBeNull()
+    expect(result!.sessionID).toBe("ses_002")
+    expect(result!.path).toBe(join(session2Dir, "T-target.json"))
+  })
+
+  test("returns null when base directory does not exist", () => {
+    //#given
+    const config = makeConfig(TEST_DIR)
+
+    //#when
+    const result = findTaskAcrossSessions(config, "T-any")
+
+    //#then
+    expect(result).toBeNull()
+  })
+})
--- a/src/features/claude-tasks/session-storage.ts
+++ b/src/features/claude-tasks/session-storage.ts
@@ -0,0 +1,52 @@
+import { join } from "path"
+import { existsSync, readdirSync, statSync } from "fs"
+import { getTaskDir } from "./storage"
+import type { OhMyOpenCodeConfig } from "../../config/schema"
+
+export function getSessionTaskDir(
+  config: Partial<OhMyOpenCodeConfig>,
+  sessionID: string,
+): string {
+  return join(getTaskDir(config), sessionID)
+}
+
+export function listSessionTaskFiles(
+  config: Partial<OhMyOpenCodeConfig>,
+  sessionID: string,
+): string[] {
+  const dir = getSessionTaskDir(config, sessionID)
+  if (!existsSync(dir)) return []
+  return readdirSync(dir)
+    .filter((f) => f.endsWith(".json") && f.startsWith("T-"))
+    .map((f) => f.slice(0, -".json".length))
+}
+
+export function listAllSessionDirs(
+  config: Partial<OhMyOpenCodeConfig>,
+): string[] {
+  const baseDir = getTaskDir(config)
+  if (!existsSync(baseDir)) return []
+  return readdirSync(baseDir).filter((entry) => {
+    const fullPath = join(baseDir, entry)
+    return statSync(fullPath).isDirectory()
+  })
+}
+
+export interface TaskLocation {
+  path: string
+  sessionID: string
+}
+
+export function findTaskAcrossSessions(
+  config: Partial<OhMyOpenCodeConfig>,
+  taskId: string,
+): TaskLocation | null {
+  const sessionDirs = listAllSessionDirs(config)
+  for (const sessionID of sessionDirs) {
+    const taskPath = join(getSessionTaskDir(config, sessionID), `${taskId}.json`)
+    if (existsSync(taskPath)) {
+      return { path: taskPath, sessionID }
+    }
+  }
+  return null
+}
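Review note: `listSessionTaskFiles` reduces a directory listing to task IDs by keeping `T-*.json` entries and dropping the extension. A filesystem-free sketch of that filter is shown here under the assumption that only the trailing `.json` should be removed. `taskIdsFromFileNames` is a hypothetical helper and does not exist in the change.

```typescript
// Hypothetical pure helper mirroring the filename filter in listSessionTaskFiles.
function taskIdsFromFileNames(fileNames: string[]): string[] {
  const suffix = ".json"
  return fileNames
    .filter((name) => name.startsWith("T-") && name.endsWith(suffix))
    // Strip only the trailing extension, never an earlier ".json" inside the name.
    .map((name) => name.slice(0, -suffix.length))
}

const taskIds = taskIdsFromFileNames(["T-aaa.json", "T-bbb.json", "other.txt", ".lock"])
```

Keeping the filter pure makes it testable without the temporary directories that `session-storage.test.ts` sets up.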
--- a/src/features/claude-tasks/storage.test.ts
+++ b/src/features/claude-tasks/storage.test.ts
@@ -20,6 +20,7 @@ const TEST_DIR_ABS = join(process.cwd(), TEST_DIR)

 describe("getTaskDir", () => {
  const originalTaskListId = process.env.ULTRAWORK_TASK_LIST_ID
+  const originalClaudeTaskListId = process.env.CLAUDE_CODE_TASK_LIST_ID

  beforeEach(() => {
    if (originalTaskListId === undefined) {
@@ -27,6 +28,12 @@ describe("getTaskDir", () => {
    } else {
      process.env.ULTRAWORK_TASK_LIST_ID = originalTaskListId
    }
+
+    if (originalClaudeTaskListId === undefined) {
+      delete process.env.CLAUDE_CODE_TASK_LIST_ID
+    } else {
+      process.env.CLAUDE_CODE_TASK_LIST_ID = originalClaudeTaskListId
+    }
  })

  afterEach(() => {
@@ -35,6 +42,12 @@ describe("getTaskDir", () => {
    } else {
      process.env.ULTRAWORK_TASK_LIST_ID = originalTaskListId
    }
+
+    if (originalClaudeTaskListId === undefined) {
+      delete process.env.CLAUDE_CODE_TASK_LIST_ID
+    } else {
+      process.env.CLAUDE_CODE_TASK_LIST_ID = originalClaudeTaskListId
+    }
  })

  test("returns global config path for default config", () => {
@@ -62,6 +75,19 @@ describe("getTaskDir", () => {
    expect(result).toBe(join(configDir, "tasks", "custom-list-id"))
  })

+  test("respects CLAUDE_CODE_TASK_LIST_ID env var when ULTRAWORK_TASK_LIST_ID not set", () => {
+    //#given
+    delete process.env.ULTRAWORK_TASK_LIST_ID
+    process.env.CLAUDE_CODE_TASK_LIST_ID = "claude list/id"
+    const configDir = getOpenCodeConfigDir({ binary: "opencode" })
+
+    //#when
+    const result = getTaskDir()
+
+    //#then
+    expect(result).toBe(join(configDir, "tasks", "claude-list-id"))
+  })
+
  test("falls back to sanitized cwd basename when env var not set", () => {
    //#given
    delete process.env.ULTRAWORK_TASK_LIST_ID
@@ -114,6 +140,7 @@ describe("getTaskDir", () => {

 describe("resolveTaskListId", () => {
  const originalTaskListId = process.env.ULTRAWORK_TASK_LIST_ID
+  const originalClaudeTaskListId = process.env.CLAUDE_CODE_TASK_LIST_ID

  beforeEach(() => {
    if (originalTaskListId === undefined) {
@@ -121,6 +148,12 @@ describe("resolveTaskListId", () => {
    } else {
      process.env.ULTRAWORK_TASK_LIST_ID = originalTaskListId
    }
+
+    if (originalClaudeTaskListId === undefined) {
+      delete process.env.CLAUDE_CODE_TASK_LIST_ID
+    } else {
+      process.env.CLAUDE_CODE_TASK_LIST_ID = originalClaudeTaskListId
+    }
  })

  afterEach(() => {
@@ -129,6 +162,12 @@ describe("resolveTaskListId", () => {
    } else {
      process.env.ULTRAWORK_TASK_LIST_ID = originalTaskListId
    }
+
+    if (originalClaudeTaskListId === undefined) {
+      delete process.env.CLAUDE_CODE_TASK_LIST_ID
+    } else {
+      process.env.CLAUDE_CODE_TASK_LIST_ID = originalClaudeTaskListId
+    }
  })

  test("returns env var when set", () => {
@@ -142,6 +181,30 @@ describe("resolveTaskListId", () => {
    expect(result).toBe("custom-list")
  })

+  test("returns CLAUDE_CODE_TASK_LIST_ID when ULTRAWORK_TASK_LIST_ID not set", () => {
+    //#given
+    delete process.env.ULTRAWORK_TASK_LIST_ID
+    process.env.CLAUDE_CODE_TASK_LIST_ID = "claude-list"
+
+    //#when
+    const result = resolveTaskListId()
+
+    //#then
+    expect(result).toBe("claude-list")
+  })
+
+  test("sanitizes CLAUDE_CODE_TASK_LIST_ID special characters", () => {
+    //#given
+    delete process.env.ULTRAWORK_TASK_LIST_ID
+    process.env.CLAUDE_CODE_TASK_LIST_ID = "claude list/id"
+
+    //#when
+    const result = resolveTaskListId()
+
+    //#then
+    expect(result).toBe("claude-list-id")
+  })
+
  test("sanitizes special characters", () => {
    //#given
    process.env.ULTRAWORK_TASK_LIST_ID = "custom list/id"
--- a/src/features/claude-tasks/storage.ts
+++ b/src/features/claude-tasks/storage.ts
@@ -26,6 +26,9 @@ export function resolveTaskListId(config: Partial<OhMyOpenCodeConfig> = {}): str
  const envId = process.env.ULTRAWORK_TASK_LIST_ID?.trim()
  if (envId) return sanitizePathSegment(envId)

+  const claudeEnvId = process.env.CLAUDE_CODE_TASK_LIST_ID?.trim()
+  if (claudeEnvId) return sanitizePathSegment(claudeEnvId)
+
  const configId = config.sisyphus?.tasks?.task_list_id?.trim()
  if (configId) return sanitizePathSegment(configId)

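Review note: the hunk above adds `CLAUDE_CODE_TASK_LIST_ID` as a second-priority source for the task list ID, after `ULTRAWORK_TASK_LIST_ID` and before the config value. The precedence can be sketched roughly as below. This is a minimal standalone sketch, not the repository's implementation: `sanitizeSegment`, `resolveListId`, and the `fallback` parameter are hypothetical names, and the sanitizing regex is an assumption (the real `sanitizePathSegment` may apply different rules).

```typescript
type Env = Record<string, string | undefined>

// Hypothetical stand-in for sanitizePathSegment: collapse runs of
// characters outside [A-Za-z0-9_.-] into a single dash.
function sanitizeSegment(value: string): string {
  return value.replace(/[^a-zA-Z0-9_.-]+/g, "-")
}

// Return the first non-blank candidate in priority order:
// ULTRAWORK_TASK_LIST_ID, then CLAUDE_CODE_TASK_LIST_ID, then the config value.
// `fallback` is an assumption standing in for the cwd-basename fallback.
function resolveListId(env: Env, configId?: string, fallback = "default"): string {
  const candidates = [
    env["ULTRAWORK_TASK_LIST_ID"],
    env["CLAUDE_CODE_TASK_LIST_ID"],
    configId,
  ]
  for (const candidate of candidates) {
    const trimmed = candidate?.trim()
    if (trimmed) return sanitizeSegment(trimmed)
  }
  return sanitizeSegment(fallback)
}
```

Trimming before the truthiness check means a whitespace-only `ULTRAWORK_TASK_LIST_ID` does not shadow a valid `CLAUDE_CODE_TASK_LIST_ID`, which matches the `?.trim()` guards in the hunk.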
--- a/src/features/tmux-subagent/manager-cleanup.ts
+++ b/src/features/tmux-subagent/manager-cleanup.ts
@@ -0,0 +1,43 @@
+import type { TmuxConfig } from "../../config/schema"
+import type { TrackedSession } from "./types"
+import { log } from "../../shared"
+import { queryWindowState } from "./pane-state-querier"
+import { executeAction } from "./action-executor"
+import { TmuxPollingManager } from "./polling-manager"
+
+export class ManagerCleanup {
+  constructor(
+    private sessions: Map<string, TrackedSession>,
+    private sourcePaneId: string | undefined,
+    private pollingManager: TmuxPollingManager,
+    private tmuxConfig: TmuxConfig,
+    private serverUrl: string
+  ) {}
+
+  async cleanup(): Promise<void> {
+    this.pollingManager.stopPolling()
+
+    if (this.sessions.size > 0) {
+      log("[tmux-session-manager] closing all panes", { count: this.sessions.size })
+      const state = this.sourcePaneId ? await queryWindowState(this.sourcePaneId) : null
+      
+      if (state) {
+        const closePromises = Array.from(this.sessions.values()).map((s) =>
+          executeAction(
+            { type: "close", paneId: s.paneId, sessionId: s.sessionId },
+            { config: this.tmuxConfig, serverUrl: this.serverUrl, windowState: state }
+          ).catch((err) =>
+            log("[tmux-session-manager] cleanup error for pane", {
+              paneId: s.paneId,
+              error: String(err),
+            }),
+          ),
+        )
+        await Promise.all(closePromises)
+      }
+      this.sessions.clear()
+    }
+
+    log("[tmux-session-manager] cleanup complete")
+  }
+}
--- a/src/features/tmux-subagent/manager.ts
+++ b/src/features/tmux-subagent/manager.ts
@@ -13,7 +13,7 @@ import { log } from "../../shared"
 import { queryWindowState } from "./pane-state-querier"
 import { decideSpawnActions, decideCloseAction, type SessionMapping } from "./decision-engine"
 import { executeActions, executeAction } from "./action-executor"
-
+import { TmuxPollingManager } from "./polling-manager"
 type OpencodeClient = PluginInput["client"]

 interface SessionCreatedEvent {
@@ -57,9 +57,8 @@ export class TmuxSessionManager {
  private sourcePaneId: string | undefined
  private sessions = new Map<string, TrackedSession>()
  private pendingSessions = new Set<string>()
-  private pollInterval?: ReturnType<typeof setInterval>
  private deps: TmuxUtilDeps
-
+  private pollingManager: TmuxPollingManager
  constructor(ctx: PluginInput, tmuxConfig: TmuxConfig, deps: TmuxUtilDeps = defaultTmuxDeps) {
    this.client = ctx.client
    this.tmuxConfig = tmuxConfig
@@ -67,7 +66,11 @@ export class TmuxSessionManager {
    const defaultPort = process.env.OPENCODE_PORT ?? "4096"
    this.serverUrl = ctx.serverUrl?.toString() ?? `http://localhost:${defaultPort}`
    this.sourcePaneId = deps.getCurrentPaneId()
-
+    this.pollingManager = new TmuxPollingManager(
+      this.client,
+      this.sessions,
+      this.closeSessionById.bind(this)
+    )
    log("[tmux-session-manager] initialized", {
      configEnabled: this.tmuxConfig.enabled,
      tmuxConfig: this.tmuxConfig,
@@ -75,7 +78,6 @@ export class TmuxSessionManager {
      sourcePaneId: this.sourcePaneId,
    })
  }
-
  private isEnabled(): boolean {
    return this.tmuxConfig.enabled && this.deps.isInsideTmux()
  }
@@ -125,6 +127,12 @@ export class TmuxSessionManager {
    return false
  }

+  // NOTE: Exposed (via `as any`) for test stability checks.
+  // Actual polling is owned by TmuxPollingManager.
+  private async pollSessions(): Promise<void> {
+    await (this.pollingManager as any).pollSessions()
+  }
+
  async onSessionCreated(event: SessionCreatedEvent): Promise<void> {
    const enabled = this.isEnabled()
    log("[tmux-session-manager] onSessionCreated called", {
@@ -239,7 +247,7 @@ export class TmuxSessionManager {
          paneId: result.spawnedPaneId,
          sessionReady,
        })
-        this.startPolling()
+        this.pollingManager.startPolling()
      } else {
        log("[tmux-session-manager] spawn failed", {
          success: result.success,
@@ -278,140 +286,10 @@ export class TmuxSessionManager {
    this.sessions.delete(event.sessionID)

    if (this.sessions.size === 0) {
-      this.stopPolling()
+      this.pollingManager.stopPolling()
    }
  }

-  private startPolling(): void {
-    if (this.pollInterval) return
-
-    this.pollInterval = setInterval(
-      () => this.pollSessions(),
-      POLL_INTERVAL_BACKGROUND_MS,
-    )
-    log("[tmux-session-manager] polling started")
-  }
-
-  private stopPolling(): void {
-    if (this.pollInterval) {
-      clearInterval(this.pollInterval)
-      this.pollInterval = undefined
-      log("[tmux-session-manager] polling stopped")
-    }
-  }
-
-  private async pollSessions(): Promise<void> {
-    if (this.sessions.size === 0) {
-      this.stopPolling()
-      return
-    }
-
-    try {
-      const statusResult = await this.client.session.status({ path: undefined })
-      const allStatuses = (statusResult.data ?? {}) as Record<string, { type: string }>
-
-      log("[tmux-session-manager] pollSessions", {
-        trackedSessions: Array.from(this.sessions.keys()),
-        allStatusKeys: Object.keys(allStatuses),
-      })
-
-      const now = Date.now()
-      const sessionsToClose: string[] = []
-
-      for (const [sessionId, tracked] of this.sessions.entries()) {
-        const status = allStatuses[sessionId]
-        const isIdle = status?.type === "idle"
-
-        if (status) {
-          tracked.lastSeenAt = new Date(now)
-        }
-
-        const missingSince = !status ? now - tracked.lastSeenAt.getTime() : 0
-        const missingTooLong = missingSince >= SESSION_MISSING_GRACE_MS
-        const isTimedOut = now - tracked.createdAt.getTime() > SESSION_TIMEOUT_MS
-        const elapsedMs = now - tracked.createdAt.getTime()
-
-        // Stability detection: Don't close immediately on idle
-        // Wait for STABLE_POLLS_REQUIRED consecutive polls with same message count
-        let shouldCloseViaStability = false
-
-        if (isIdle && elapsedMs >= MIN_STABILITY_TIME_MS) {
-          // Fetch message count to detect if agent is still producing output
-          try {
-            const messagesResult = await this.client.session.messages({ 
-              path: { id: sessionId } 
-            })
-            const currentMsgCount = Array.isArray(messagesResult.data) 
-              ? messagesResult.data.length 
-              : 0
-
-            if (tracked.lastMessageCount === currentMsgCount) {
-              // Message count unchanged - increment stable polls
-              tracked.stableIdlePolls = (tracked.stableIdlePolls ?? 0) + 1
-              
-              if (tracked.stableIdlePolls >= STABLE_POLLS_REQUIRED) {
-                // Double-check status before closing
-                const recheckResult = await this.client.session.status({ path: undefined })
-                const recheckStatuses = (recheckResult.data ?? {}) as Record<string, { type: string }>
-                const recheckStatus = recheckStatuses[sessionId]
-                
-                if (recheckStatus?.type === "idle") {
-                  shouldCloseViaStability = true
-                } else {
-                  // Status changed - reset stability counter
-                  tracked.stableIdlePolls = 0
-                  log("[tmux-session-manager] stability reached but session not idle on recheck, resetting", {
-                    sessionId,
-                    recheckStatus: recheckStatus?.type,
-                  })
-                }
-              }
-            } else {
-              // New messages - agent is still working, reset stability counter
-              tracked.stableIdlePolls = 0
-            }
-            
-            tracked.lastMessageCount = currentMsgCount
-          } catch (msgErr) {
-            log("[tmux-session-manager] failed to fetch messages for stability check", {
-              sessionId,
-              error: String(msgErr),
-            })
-            // On error, don't close - be conservative
-          }
-        } else if (!isIdle) {
-          // Not idle - reset stability counter
-          tracked.stableIdlePolls = 0
-        }
-
-        log("[tmux-session-manager] session check", {
-          sessionId,
-          statusType: status?.type,
-          isIdle,
-          elapsedMs,
-          stableIdlePolls: tracked.stableIdlePolls,
-          lastMessageCount: tracked.lastMessageCount,
-          missingSince,
-          missingTooLong,
-          isTimedOut,
-          shouldCloseViaStability,
-        })
-
-        // Close if: stability detection confirmed OR missing too long OR timed out
-        // Note: We no longer close immediately on idle - stability detection handles that
-        if (shouldCloseViaStability || missingTooLong || isTimedOut) {
-          sessionsToClose.push(sessionId)
-        }
-      }
-
-      for (const sessionId of sessionsToClose) {
-        log("[tmux-session-manager] closing session due to poll", { sessionId })
-        await this.closeSessionById(sessionId)
-      }
-    } catch (err) {
-      log("[tmux-session-manager] poll error", { error: String(err) })
-    }
-  }

  private async closeSessionById(sessionId: string): Promise<void> {
    const tracked = this.sessions.get(sessionId)
@@ -433,7 +311,7 @@ export class TmuxSessionManager {
    this.sessions.delete(sessionId)

    if (this.sessions.size === 0) {
-      this.stopPolling()
+      this.pollingManager.stopPolling()
    }
  }

@@ -444,7 +322,7 @@ export class TmuxSessionManager {
  }

  async cleanup(): Promise<void> {
-    this.stopPolling()
+    this.pollingManager.stopPolling()

    if (this.sessions.size > 0) {
      log("[tmux-session-manager] closing all panes", { count: this.sessions.size })
--- a/src/features/tmux-subagent/polling-manager.ts
+++ b/src/features/tmux-subagent/polling-manager.ts
@@ -0,0 +1,138 @@
+import type { OpencodeClient } from "../../tools/delegate-task/types"
+import type { TrackedSession } from "./types"
+import { POLL_INTERVAL_BACKGROUND_MS, SESSION_MISSING_GRACE_MS } from "../../shared/tmux"
+import { log } from "../../shared"
+
+const SESSION_TIMEOUT_MS = 10 * 60 * 1000
+const MIN_STABILITY_TIME_MS = 10 * 1000
+const STABLE_POLLS_REQUIRED = 3
+
+export class TmuxPollingManager {
+  private pollInterval?: ReturnType<typeof setInterval>
+
+  constructor(
+    private client: OpencodeClient,
+    private sessions: Map<string, TrackedSession>,
+    private closeSessionById: (sessionId: string) => Promise<void>
+  ) {}
+
+  startPolling(): void {
+    if (this.pollInterval) return
+
+    this.pollInterval = setInterval(
+      () => this.pollSessions(),
+      POLL_INTERVAL_BACKGROUND_MS,
+    )
+    log("[tmux-session-manager] polling started")
+  }
+
+  stopPolling(): void {
+    if (this.pollInterval) {
+      clearInterval(this.pollInterval)
+      this.pollInterval = undefined
+      log("[tmux-session-manager] polling stopped")
+    }
+  }
+
+  private async pollSessions(): Promise<void> {
+    if (this.sessions.size === 0) {
+      this.stopPolling()
+      return
+    }
+
+    try {
+      const statusResult = await this.client.session.status({ path: undefined })
+      const allStatuses = (statusResult.data ?? {}) as Record<string, { type: string }>
+
+      log("[tmux-session-manager] pollSessions", {
+        trackedSessions: Array.from(this.sessions.keys()),
+        allStatusKeys: Object.keys(allStatuses),
+      })
+
+      const now = Date.now()
+      const sessionsToClose: string[] = []
+
+      for (const [sessionId, tracked] of this.sessions.entries()) {
+        const status = allStatuses[sessionId]
+        const isIdle = status?.type === "idle"
+
+        if (status) {
+          tracked.lastSeenAt = new Date(now)
+        }
+
+        const missingSince = !status ? now - tracked.lastSeenAt.getTime() : 0
+        const missingTooLong = missingSince >= SESSION_MISSING_GRACE_MS
+        const isTimedOut = now - tracked.createdAt.getTime() > SESSION_TIMEOUT_MS
+        const elapsedMs = now - tracked.createdAt.getTime()
+
+        let shouldCloseViaStability = false
+
+        if (isIdle && elapsedMs >= MIN_STABILITY_TIME_MS) {
+          try {
+            const messagesResult = await this.client.session.messages({ 
+              path: { id: sessionId } 
+            })
+            const currentMsgCount = Array.isArray(messagesResult.data) 
+              ? messagesResult.data.length 
+              : 0
+
+            if (tracked.lastMessageCount === currentMsgCount) {
+              tracked.stableIdlePolls = (tracked.stableIdlePolls ?? 0) + 1
+              
+              if (tracked.stableIdlePolls >= STABLE_POLLS_REQUIRED) {
+                const recheckResult = await this.client.session.status({ path: undefined })
+                const recheckStatuses = (recheckResult.data ?? {}) as Record<string, { type: string }>
+                const recheckStatus = recheckStatuses[sessionId]
+                
+                if (recheckStatus?.type === "idle") {
+                  shouldCloseViaStability = true
+                } else {
+                  tracked.stableIdlePolls = 0
+                  log("[tmux-session-manager] stability reached but session not idle on recheck, resetting", {
+                    sessionId,
+                    recheckStatus: recheckStatus?.type,
+                  })
+                }
+              }
+            } else {
+              tracked.stableIdlePolls = 0
+            }
+            
+            tracked.lastMessageCount = currentMsgCount
+          } catch (msgErr) {
+            log("[tmux-session-manager] failed to fetch messages for stability check", {
+              sessionId,
+              error: String(msgErr),
+            })
+          }
+        } else if (!isIdle) {
+          tracked.stableIdlePolls = 0
+        }
+
+        log("[tmux-session-manager] session check", {
+          sessionId,
+          statusType: status?.type,
+          isIdle,
+          elapsedMs,
+          stableIdlePolls: tracked.stableIdlePolls,
+          lastMessageCount: tracked.lastMessageCount,
+          missingSince,
+          missingTooLong,
+          isTimedOut,
+          shouldCloseViaStability,
+        })
+
+        if (shouldCloseViaStability || missingTooLong || isTimedOut) {
+          sessionsToClose.push(sessionId)
+        }
+      }
+
+      for (const sessionId of sessionsToClose) {
+        log("[tmux-session-manager] closing session due to poll", { sessionId })
+        await this.closeSessionById(sessionId)
+      }
+    } catch (err) {
+      log("[tmux-session-manager] poll error", { error: String(err) })
+    }
+  }
+}
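Review note: the core of `pollSessions` is the stability counter, which only treats an idle session as finished after its message count stays unchanged for `STABLE_POLLS_REQUIRED` consecutive polls. A reduced sketch of that counter as a pure function follows. It is an illustration under assumptions: `Tracked` and `recordPoll` are hypothetical names, and the sketch deliberately omits the `MIN_STABILITY_TIME_MS` gate, the status recheck, and the missing/timeout conditions from the diff.

```typescript
// Hypothetical subset of TrackedSession holding only the stability fields.
interface Tracked {
  lastMessageCount?: number
  stableIdlePolls?: number
}

const STABLE_POLLS_REQUIRED = 3

// Record one poll result and report whether the session now looks stable.
function recordPoll(tracked: Tracked, isIdle: boolean, messageCount: number): boolean {
  if (!isIdle) {
    // A busy session resets the counter; the message count is left untouched.
    tracked.stableIdlePolls = 0
    return false
  }
  if (tracked.lastMessageCount === messageCount) {
    // Same count as the previous idle poll: one more stable poll.
    tracked.stableIdlePolls = (tracked.stableIdlePolls ?? 0) + 1
  } else {
    // New messages arrived (or this is the first idle poll): start over.
    tracked.stableIdlePolls = 0
  }
  tracked.lastMessageCount = messageCount
  return (tracked.stableIdlePolls ?? 0) >= STABLE_POLLS_REQUIRED
}
```

Because the first idle poll only establishes the baseline count, this sketch reports stability on the fourth consecutive idle poll with an unchanged count, not the third.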
--- a/src/features/tmux-subagent/session-cleaner.ts
+++ b/src/features/tmux-subagent/session-cleaner.ts
@@ -0,0 +1,80 @@
+import type { TmuxConfig } from "../../config/schema"
+import type { TrackedSession } from "./types"
+import type { SessionMapping } from "./decision-engine"
+import { log } from "../../shared"
+import { queryWindowState } from "./pane-state-querier"
+import { decideCloseAction } from "./decision-engine"
+import { executeAction } from "./action-executor"
+import { TmuxPollingManager } from "./polling-manager"
+
+export interface TmuxUtilDeps {
+  isInsideTmux: () => boolean
+  getCurrentPaneId: () => string | undefined
+}
+
+export class SessionCleaner {
+  constructor(
+    private tmuxConfig: TmuxConfig,
+    private deps: TmuxUtilDeps,
+    private sessions: Map<string, TrackedSession>,
+    private sourcePaneId: string | undefined,
+    private getSessionMappings: () => SessionMapping[],
+    private pollingManager: TmuxPollingManager,
+    private serverUrl: string
+  ) {}
+
+  private isEnabled(): boolean {
+    return this.tmuxConfig.enabled && this.deps.isInsideTmux()
+  }
+
+  async onSessionDeleted(event: { sessionID: string }): Promise<void> {
+    if (!this.isEnabled()) return
+    if (!this.sourcePaneId) return
+
+    const tracked = this.sessions.get(event.sessionID)
+    if (!tracked) return
+
+    log("[tmux-session-manager] onSessionDeleted", { sessionId: event.sessionID })
+
+    const state = await queryWindowState(this.sourcePaneId)
+    if (!state) {
+      this.sessions.delete(event.sessionID)
+      return
+    }
+
+    const closeAction = decideCloseAction(state, event.sessionID, this.getSessionMappings())
+    if (closeAction) {
+      await executeAction(closeAction, { config: this.tmuxConfig, serverUrl: this.serverUrl, windowState: state })
+    }
+
+    this.sessions.delete(event.sessionID)
+
+    if (this.sessions.size === 0) {
+      this.pollingManager.stopPolling()
+    }
+  }
+
+  async closeSessionById(sessionId: string): Promise<void> {
+    const tracked = this.sessions.get(sessionId)
+    if (!tracked) return
+
+    log("[tmux-session-manager] closing session pane", {
+      sessionId,
+      paneId: tracked.paneId,
+    })
+
+    const state = this.sourcePaneId ? await queryWindowState(this.sourcePaneId) : null
+    if (state) {
+      await executeAction(
+        { type: "close", paneId: tracked.paneId, sessionId },
+        { config: this.tmuxConfig, serverUrl: this.serverUrl, windowState: state }
+      )
+    }
+
+    this.sessions.delete(sessionId)
+
+    if (this.sessions.size === 0) {
+      this.pollingManager.stopPolling()
+    }
+  }
+}
--- a/src/features/tmux-subagent/session-spawner.ts
+++ b/src/features/tmux-subagent/session-spawner.ts
@@ -0,0 +1,166 @@
+import type { TmuxConfig } from "../../config/schema"
+import type { TrackedSession, CapacityConfig } from "./types"
+import { log } from "../../shared"
+import { queryWindowState } from "./pane-state-querier"
+import { decideSpawnActions, type SessionMapping } from "./decision-engine"
+import { executeActions } from "./action-executor"
+import { TmuxPollingManager } from "./polling-manager"
+
+interface SessionCreatedEvent {
+  type: string
+  properties?: { info?: { id?: string; parentID?: string; title?: string } }
+}
+
+export interface TmuxUtilDeps {
+  isInsideTmux: () => boolean
+  getCurrentPaneId: () => string | undefined
+}
+
+export class SessionSpawner {
+  constructor(
+    private tmuxConfig: TmuxConfig,
+    private deps: TmuxUtilDeps,
+    private sessions: Map<string, TrackedSession>,
+    private pendingSessions: Set<string>,
+    private sourcePaneId: string | undefined,
+    private getCapacityConfig: () => CapacityConfig,
+    private getSessionMappings: () => SessionMapping[],
+    private waitForSessionReady: (sessionId: string) => Promise<boolean>,
+    private pollingManager: TmuxPollingManager,
+    private serverUrl: string
+  ) {}
+
+  private isEnabled(): boolean {
+    return this.tmuxConfig.enabled && this.deps.isInsideTmux()
+  }
+
+  async onSessionCreated(event: SessionCreatedEvent): Promise<void> {
+    const enabled = this.isEnabled()
+    log("[tmux-session-manager] onSessionCreated called", {
+      enabled,
+      tmuxConfigEnabled: this.tmuxConfig.enabled,
+      isInsideTmux: this.deps.isInsideTmux(),
+      eventType: event.type,
+      infoId: event.properties?.info?.id,
+      infoParentID: event.properties?.info?.parentID,
+    })
+
+    if (!enabled) return
+    if (event.type !== "session.created") return
+
+    const info = event.properties?.info
+    if (!info?.id || !info?.parentID) return
+
+    const sessionId = info.id
+    const title = info.title ?? "Subagent"
+
+    if (this.sessions.has(sessionId) || this.pendingSessions.has(sessionId)) {
+      log("[tmux-session-manager] session already tracked or pending", { sessionId })
+      return
+    }
+
+    if (!this.sourcePaneId) {
+      log("[tmux-session-manager] no source pane id")
+      return
+    }
+
+    this.pendingSessions.add(sessionId)
+
+    try {
+      const state = await queryWindowState(this.sourcePaneId)
+      if (!state) {
+        log("[tmux-session-manager] failed to query window state")
+        return
+      }
+
+      log("[tmux-session-manager] window state queried", {
+        windowWidth: state.windowWidth,
+        mainPane: state.mainPane?.paneId,
+        agentPaneCount: state.agentPanes.length,
+        agentPanes: state.agentPanes.map((p) => p.paneId),
+      })
+
+      const decision = decideSpawnActions(
+        state,
+        sessionId,
+        title,
+        this.getCapacityConfig(),
+        this.getSessionMappings()
+      )
+
+      log("[tmux-session-manager] spawn decision", {
+        canSpawn: decision.canSpawn,
+        reason: decision.reason,
+        actionCount: decision.actions.length,
+        actions: decision.actions.map((a) => {
+          if (a.type === "close") return { type: "close", paneId: a.paneId }
+          if (a.type === "replace") return { type: "replace", paneId: a.paneId, newSessionId: a.newSessionId }
+          return { type: "spawn", sessionId: a.sessionId }
+        }),
+      })
+
+      if (!decision.canSpawn) {
+        log("[tmux-session-manager] cannot spawn", { reason: decision.reason })
+        return
+      }
+
+      const result = await executeActions(
+        decision.actions,
+        { config: this.tmuxConfig, serverUrl: this.serverUrl, windowState: state }
+      )
+
+      for (const { action, result: actionResult } of result.results) {
+        if (action.type === "close" && actionResult.success) {
+          this.sessions.delete(action.sessionId)
+          log("[tmux-session-manager] removed closed session from cache", {
+            sessionId: action.sessionId,
+          })
+        }
+        if (action.type === "replace" && actionResult.success) {
+          this.sessions.delete(action.oldSessionId)
+          log("[tmux-session-manager] removed replaced session from cache", {
+            oldSessionId: action.oldSessionId,
+            newSessionId: action.newSessionId,
+          })
+        }
+      }
+
+      if (result.success && result.spawnedPaneId) {
+        const sessionReady = await this.waitForSessionReady(sessionId)
+        
+        if (!sessionReady) {
+          log("[tmux-session-manager] session not ready after timeout, tracking anyway", {
+            sessionId,
+            paneId: result.spawnedPaneId,
+          })
+        }
+        
+        const now = Date.now()
+        this.sessions.set(sessionId, {
+          sessionId,
+          paneId: result.spawnedPaneId,
+          description: title,
+          createdAt: new Date(now),
+          lastSeenAt: new Date(now),
+        })
+        log("[tmux-session-manager] pane spawned and tracked", {
+          sessionId,
+          paneId: result.spawnedPaneId,
+          sessionReady,
+        })
+        this.pollingManager.startPolling()
+      } else {
+        log("[tmux-session-manager] spawn failed", {
+          success: result.success,
+          results: result.results.map((r) => ({
+            type: r.action.type,
+            success: r.result.success,
+            error: r.result.error,
+          })),
+        })
+      }
+    } finally {
+      this.pendingSessions.delete(sessionId)
+    }
+  }
+}
--- a/src/hooks/AGENTS.md
+++ b/src/hooks/AGENTS.md
@@ -2,7 +2,7 @@

 ## OVERVIEW

-40+ lifecycle hooks intercepting/modifying agent behavior across 5 events.
+163 lifecycle hooks intercepting/modifying agent behavior across 5 events.

 **Event Types**:
 - `UserPromptSubmit` (`chat.message`) - Can block
--- a/src/hooks/atlas/index.ts
+++ b/src/hooks/atlas/index.ts
@@ -1,5 +1,4 @@
 import type { PluginInput } from "@opencode-ai/plugin"
-import { execSync } from "node:child_process"
 import { existsSync, readdirSync } from "node:fs"
 import { join } from "node:path"
 import {
@@ -12,6 +11,7 @@ import { findNearestMessageWithFields, MESSAGE_STORAGE } from "../../features/ho
 import { log } from "../../shared/logger"
 import { createSystemDirective, SYSTEM_DIRECTIVE_PREFIX, SystemDirectiveTypes } from "../../shared/system-directive"
 import { isCallerOrchestrator, getMessageDir } from "../../shared/session-utils"
+import { collectGitDiffStats, formatFileChanges } from "../../shared/git-worktree"
 import type { BackgroundManager } from "../../features/background-agent"

 export const HOOK_NAME = "atlas"
@@ -269,113 +269,6 @@ function extractSessionIdFromOutput(output: string): string {
  return match?.[1] ?? "<session_id>"
 }

-interface GitFileStat {
-  path: string
-  added: number
-  removed: number
-  status: "modified" | "added" | "deleted"
-}
-
-function getGitDiffStats(directory: string): GitFileStat[] {
-  try {
-    const output = execSync("git diff --numstat HEAD", {
-      cwd: directory,
-      encoding: "utf-8",
-      timeout: 5000,
-      stdio: ["pipe", "pipe", "pipe"],
-    }).trim()
-
-    if (!output) return []
-
-    const statusOutput = execSync("git status --porcelain", {
-      cwd: directory,
-      encoding: "utf-8",
-      timeout: 5000,
-      stdio: ["pipe", "pipe", "pipe"],
-    }).trim()
-
-    const statusMap = new Map<string, "modified" | "added" | "deleted">()
-    for (const line of statusOutput.split("\n")) {
-      if (!line) continue
-      const status = line.substring(0, 2).trim()
-      const filePath = line.substring(3)
-      if (status === "A" || status === "??") {
-        statusMap.set(filePath, "added")
-      } else if (status === "D") {
-        statusMap.set(filePath, "deleted")
-      } else {
-        statusMap.set(filePath, "modified")
-      }
-    }
-
-    const stats: GitFileStat[] = []
-    for (const line of output.split("\n")) {
-      const parts = line.split("\t")
-      if (parts.length < 3) continue
-
-      const [addedStr, removedStr, path] = parts
-      const added = addedStr === "-" ? 0 : parseInt(addedStr, 10)
-      const removed = removedStr === "-" ? 0 : parseInt(removedStr, 10)
-
-      stats.push({
-        path,
-        added,
-        removed,
-        status: statusMap.get(path) ?? "modified",
-      })
-    }
-
-    return stats
-  } catch {
-    return []
-  }
-}
-
-function formatFileChanges(stats: GitFileStat[], notepadPath?: string): string {
-  if (stats.length === 0) return "[FILE CHANGES SUMMARY]\nNo file changes detected.\n"
-
-  const modified = stats.filter((s) => s.status === "modified")
-  const added = stats.filter((s) => s.status === "added")
-  const deleted = stats.filter((s) => s.status === "deleted")
-
-  const lines: string[] = ["[FILE CHANGES SUMMARY]"]
-
-  if (modified.length > 0) {
-    lines.push("Modified files:")
-    for (const f of modified) {
-      lines.push(`  ${f.path}  (+${f.added}, -${f.removed})`)
-    }
-    lines.push("")
-  }
-
-  if (added.length > 0) {
-    lines.push("Created files:")
-    for (const f of added) {
-      lines.push(`  ${f.path}  (+${f.added})`)
-    }
-    lines.push("")
-  }
-
-  if (deleted.length > 0) {
-    lines.push("Deleted files:")
-    for (const f of deleted) {
-      lines.push(`  ${f.path}  (-${f.removed})`)
-    }
-    lines.push("")
-  }
-
-  if (notepadPath) {
-    const notepadStat = stats.find((s) => s.path.includes("notepad") || s.path.includes(".sisyphus"))
-    if (notepadStat) {
-      lines.push("[NOTEPAD UPDATED]")
-      lines.push(`  ${notepadStat.path}  (+${notepadStat.added})`)
-      lines.push("")
-    }
-  }
-
-  return lines.join("\n")
-}
-
 interface ToolExecuteAfterInput {
  tool: string
  sessionID?: string
@@ -750,8 +643,8 @@ export function createAtlasHook(
      }
      
      if (output.output && typeof output.output === "string") {
-        const gitStats = getGitDiffStats(ctx.directory)
-        const fileChanges = formatFileChanges(gitStats)
+        const gitStats = collectGitDiffStats(ctx.directory)
+        const fileChanges = formatFileChanges(gitStats)
        const subagentSessionId = extractSessionIdFromOutput(output.output)

        const boulderState = readBoulderState(ctx.directory)
--- a/src/hooks/interactive-bash-session/hook.ts
+++ b/src/hooks/interactive-bash-session/hook.ts
@@ -0,0 +1,125 @@
+import type { PluginInput } from "@opencode-ai/plugin";
+import { loadInteractiveBashSessionState, saveInteractiveBashSessionState, clearInteractiveBashSessionState } from "./storage";
+import { buildSessionReminderMessage } from "./constants";
+import type { InteractiveBashSessionState } from "./types";
+import { tokenizeCommand, findSubcommand, extractSessionNameFromTokens } from "./parser";
+import { getOrCreateState, isOmoSession, killAllTrackedSessions } from "./state-manager";
+import { subagentSessions } from "../../features/claude-code-session-state";
+
+interface ToolExecuteInput {
+  tool: string;
+  sessionID: string;
+  callID: string;
+  args?: Record<string, unknown>;
+}
+
+interface ToolExecuteOutput {
+  title: string;
+  output: string;
+  metadata: unknown;
+}
+
+interface EventInput {
+  event: {
+    type: string;
+    properties?: unknown;
+  };
+}
+
+export function createInteractiveBashSessionHook(ctx: PluginInput) {
+  const sessionStates = new Map<string, InteractiveBashSessionState>();
+
+  function getOrCreateStateLocal(sessionID: string): InteractiveBashSessionState {
+    return getOrCreateState(sessionID, sessionStates);
+  }
+
+  async function killAllTrackedSessionsLocal(
+    state: InteractiveBashSessionState,
+  ): Promise<void> {
+    await killAllTrackedSessions(state);
+    
+    for (const sessionId of subagentSessions) {
+      ctx.client.session.abort({ path: { id: sessionId } }).catch(() => {});
+    }
+  }
+
+  const toolExecuteAfter = async (
+    input: ToolExecuteInput,
+    output: ToolExecuteOutput,
+  ) => {
+    const { tool, sessionID, args } = input;
+    const toolLower = tool.toLowerCase();
+
+    if (toolLower !== "interactive_bash") {
+      return;
+    }
+
+    if (typeof args?.tmux_command !== "string") {
+      return;
+    }
+
+    const tmuxCommand = args.tmux_command;
+    const tokens = tokenizeCommand(tmuxCommand);
+    const subCommand = findSubcommand(tokens);
+    const state = getOrCreateStateLocal(sessionID);
+    let stateChanged = false;
+
+    const toolOutput = output?.output ?? "";
+    if (toolOutput.startsWith("Error:")) {
+      return;
+    }
+
+    const isNewSession = subCommand === "new-session";
+    const isKillSession = subCommand === "kill-session";
+    const isKillServer = subCommand === "kill-server";
+
+    const sessionName = extractSessionNameFromTokens(tokens, subCommand);
+
+    if (isNewSession && isOmoSession(sessionName)) {
+      state.tmuxSessions.add(sessionName!);
+      stateChanged = true;
+    } else if (isKillSession && isOmoSession(sessionName)) {
+      state.tmuxSessions.delete(sessionName!);
+      stateChanged = true;
+    } else if (isKillServer) {
+      state.tmuxSessions.clear();
+      stateChanged = true;
+    }
+
+    if (stateChanged) {
+      state.updatedAt = Date.now();
+      saveInteractiveBashSessionState(state);
+    }
+
+    const isSessionOperation = isNewSession || isKillSession || isKillServer;
+    if (isSessionOperation) {
+      const reminder = buildSessionReminderMessage(
+        Array.from(state.tmuxSessions),
+      );
+      if (reminder) {
+        output.output += reminder;
+      }
+    }
+  };
+
+  const eventHandler = async ({ event }: EventInput) => {
+    const props = event.properties as Record<string, unknown> | undefined;
+
+    if (event.type === "session.deleted") {
+      const sessionInfo = props?.info as { id?: string } | undefined;
+      const sessionID = sessionInfo?.id;
+
+      if (sessionID) {
+        const state = getOrCreateStateLocal(sessionID);
+        await killAllTrackedSessionsLocal(state);
+        sessionStates.delete(sessionID);
+        clearInteractiveBashSessionState(sessionID);
+      }
+    }
+  };
+
+  return {
+    "tool.execute.after": toolExecuteAfter,
+    event: eventHandler,
+  };
+}
--- a/src/hooks/interactive-bash-session/index.ts
+++ b/src/hooks/interactive-bash-session/index.ts
@@ -1,267 +1,4 @@
-import type { PluginInput } from "@opencode-ai/plugin";
-import {
-  loadInteractiveBashSessionState,
-  saveInteractiveBashSessionState,
-  clearInteractiveBashSessionState,
-} from "./storage";
-import { OMO_SESSION_PREFIX, buildSessionReminderMessage } from "./constants";
-import type { InteractiveBashSessionState } from "./types";
-import { subagentSessions } from "../../features/claude-code-session-state";
-
-interface ToolExecuteInput {
-  tool: string;
-  sessionID: string;
-  callID: string;
-  args?: Record<string, unknown>;
-}
-
-interface ToolExecuteOutput {
-  title: string;
-  output: string;
-  metadata: unknown;
-}
-
-interface EventInput {
-  event: {
-    type: string;
-    properties?: unknown;
-  };
-}
-
-/**
- * Quote-aware command tokenizer with escape handling
- * Handles single/double quotes and backslash escapes
- */
-function tokenizeCommand(cmd: string): string[] {
-  const tokens: string[] = []
-  let current = ""
-  let inQuote = false
-  let quoteChar = ""
-  let escaped = false
-
-  for (let i = 0; i < cmd.length; i++) {
-    const char = cmd[i]
-
-    if (escaped) {
-      current += char
-      escaped = false
-      continue
-    }
-
-    if (char === "\\") {
-      escaped = true
-      continue
-    }
-
-    if ((char === "'" || char === '"') && !inQuote) {
-      inQuote = true
-      quoteChar = char
-    } else if (char === quoteChar && inQuote) {
-      inQuote = false
-      quoteChar = ""
-    } else if (char === " " && !inQuote) {
-      if (current) {
-        tokens.push(current)
-        current = ""
-      }
-    } else {
-      current += char
-    }
-  }
-
-  if (current) tokens.push(current)
-  return tokens
-}
-
-/**
- * Normalize session name by stripping :window and .pane suffixes
- * e.g., "omo-x:1" -> "omo-x", "omo-x:1.2" -> "omo-x"
- */
-function normalizeSessionName(name: string): string {
-  return name.split(":")[0].split(".")[0]
-}
-
-function findFlagValue(tokens: string[], flag: string): string | null {
-  for (let i = 0; i < tokens.length - 1; i++) {
-    if (tokens[i] === flag) return tokens[i + 1]
-  }
-  return null
-}
-
-/**
- * Extract session name from tokens, considering the subCommand
- * For new-session: prioritize -s over -t
- * For other commands: use -t
- */
-function extractSessionNameFromTokens(tokens: string[], subCommand: string): string | null {
-  if (subCommand === "new-session") {
-    const sFlag = findFlagValue(tokens, "-s")
-    if (sFlag) return normalizeSessionName(sFlag)
-    const tFlag = findFlagValue(tokens, "-t")
-    if (tFlag) return normalizeSessionName(tFlag)
-  } else {
-    const tFlag = findFlagValue(tokens, "-t")
-    if (tFlag) return normalizeSessionName(tFlag)
-  }
-  return null
-}
-
-/**
- * Find the tmux subcommand from tokens, skipping global options.
- * tmux allows global options before the subcommand:
- * e.g., `tmux -L socket-name new-session -s omo-x`
- * Global options with args: -L, -S, -f, -c, -T
- * Standalone flags: -C, -v, -V, etc.
- * Special: -- (end of options marker)
- */
-function findSubcommand(tokens: string[]): string {
-  // Options that require an argument: -L, -S, -f, -c, -T
-  const globalOptionsWithArgs = new Set(["-L", "-S", "-f", "-c", "-T"])
-
-  let i = 0
-  while (i < tokens.length) {
-    const token = tokens[i]
-
-    // Handle end of options marker
-    if (token === "--") {
-      // Next token is the subcommand
-      return tokens[i + 1] ?? ""
-    }
-
-    if (globalOptionsWithArgs.has(token)) {
-      // Skip the option and its argument
-      i += 2
-      continue
-    }
-
-    if (token.startsWith("-")) {
-      // Skip standalone flags like -C, -v, -V
-      i++
-      continue
-    }
-
-    // Found the subcommand
-    return token
-  }
-
-  return ""
-}
-
-export function createInteractiveBashSessionHook(ctx: PluginInput) {
-  const sessionStates = new Map<string, InteractiveBashSessionState>();
-
-  function getOrCreateState(sessionID: string): InteractiveBashSessionState {
-    if (!sessionStates.has(sessionID)) {
-      const persisted = loadInteractiveBashSessionState(sessionID);
-      const state: InteractiveBashSessionState = persisted ?? {
-        sessionID,
-        tmuxSessions: new Set<string>(),
-        updatedAt: Date.now(),
-      };
-      sessionStates.set(sessionID, state);
-    }
-    return sessionStates.get(sessionID)!;
-  }
-
-  function isOmoSession(sessionName: string | null): boolean {
-    return sessionName !== null && sessionName.startsWith(OMO_SESSION_PREFIX);
-  }
-
-  async function killAllTrackedSessions(
-    state: InteractiveBashSessionState,
-  ): Promise<void> {
-    for (const sessionName of state.tmuxSessions) {
-      try {
-        const proc = Bun.spawn(["tmux", "kill-session", "-t", sessionName], {
-          stdout: "ignore",
-          stderr: "ignore",
-        });
-        await proc.exited;
-      } catch {}
-    }
-
-    for (const sessionId of subagentSessions) {
-      ctx.client.session.abort({ path: { id: sessionId } }).catch(() => {})
-    }
-  }
-
-  const toolExecuteAfter = async (
-    input: ToolExecuteInput,
-    output: ToolExecuteOutput,
-  ) => {
-    const { tool, sessionID, args } = input;
-    const toolLower = tool.toLowerCase();
-
-    if (toolLower !== "interactive_bash") {
-      return;
-    }
-
-    if (typeof args?.tmux_command !== "string") {
-      return;
-    }
-
-    const tmuxCommand = args.tmux_command;
-    const tokens = tokenizeCommand(tmuxCommand);
-    const subCommand = findSubcommand(tokens);
-    const state = getOrCreateState(sessionID);
-    let stateChanged = false;
-
-    const toolOutput = output?.output ?? ""
-    if (toolOutput.startsWith("Error:")) {
-      return
-    }
-
-    const isNewSession = subCommand === "new-session";
-    const isKillSession = subCommand === "kill-session";
-    const isKillServer = subCommand === "kill-server";
-
-    const sessionName = extractSessionNameFromTokens(tokens, subCommand);
-
-    if (isNewSession && isOmoSession(sessionName)) {
-      state.tmuxSessions.add(sessionName!);
-      stateChanged = true;
-    } else if (isKillSession && isOmoSession(sessionName)) {
-      state.tmuxSessions.delete(sessionName!);
-      stateChanged = true;
-    } else if (isKillServer) {
-      state.tmuxSessions.clear();
-      stateChanged = true;
-    }
-
-    if (stateChanged) {
-      state.updatedAt = Date.now();
-      saveInteractiveBashSessionState(state);
-    }
-
-    const isSessionOperation = isNewSession || isKillSession || isKillServer;
-    if (isSessionOperation) {
-      const reminder = buildSessionReminderMessage(
-        Array.from(state.tmuxSessions),
-      );
-      if (reminder) {
-        output.output += reminder;
-      }
-    }
-  };
-
-  const eventHandler = async ({ event }: EventInput) => {
-    const props = event.properties as Record<string, unknown> | undefined;
-
-    if (event.type === "session.deleted") {
-      const sessionInfo = props?.info as { id?: string } | undefined;
-      const sessionID = sessionInfo?.id;
-
-      if (sessionID) {
-        const state = getOrCreateState(sessionID);
-        await killAllTrackedSessions(state);
-        sessionStates.delete(sessionID);
-        clearInteractiveBashSessionState(sessionID);
-      }
-    }
-  };
-
-  return {
-    "tool.execute.after": toolExecuteAfter,
-    event: eventHandler,
-  };
-}
+export { createInteractiveBashSessionHook } from "./hook";
+export * from "./types";
+export * from "./constants";
+export * from "./storage";
--- a/src/hooks/interactive-bash-session/parser.ts
+++ b/src/hooks/interactive-bash-session/parser.ts
@@ -0,0 +1,118 @@
+/**
+ * Quote-aware command tokenizer with escape handling
+ * Handles single/double quotes and backslash escapes
+ */
+export function tokenizeCommand(cmd: string): string[] {
+  const tokens: string[] = []
+  let current = ""
+  let inQuote = false
+  let quoteChar = ""
+  let escaped = false
+
+  for (let i = 0; i < cmd.length; i++) {
+    const char = cmd[i]
+
+    if (escaped) {
+      current += char
+      escaped = false
+      continue
+    }
+
+    if (char === "\\") {
+      escaped = true
+      continue
+    }
+
+    if ((char === "'" || char === '"') && !inQuote) {
+      inQuote = true
+      quoteChar = char
+    } else if (char === quoteChar && inQuote) {
+      inQuote = false
+      quoteChar = ""
+    } else if (char === " " && !inQuote) {
+      if (current) {
+        tokens.push(current)
+        current = ""
+      }
+    } else {
+      current += char
+    }
+  }
+
+  if (current) tokens.push(current)
+  return tokens
+}
+
+/**
+ * Normalize session name by stripping :window and .pane suffixes
+ * e.g., "omo-x:1" -> "omo-x", "omo-x:1.2" -> "omo-x"
+ */
+export function normalizeSessionName(name: string): string {
+  return name.split(":")[0].split(".")[0]
+}
+
+export function findFlagValue(tokens: string[], flag: string): string | null {
+  for (let i = 0; i < tokens.length - 1; i++) {
+    if (tokens[i] === flag) return tokens[i + 1]
+  }
+  return null
+}
+
+/**
+ * Extract session name from tokens, considering the subCommand
+ * For new-session: prioritize -s over -t
+ * For other commands: use -t
+ */
+export function extractSessionNameFromTokens(tokens: string[], subCommand: string): string | null {
+  if (subCommand === "new-session") {
+    const sFlag = findFlagValue(tokens, "-s")
+    if (sFlag) return normalizeSessionName(sFlag)
+    const tFlag = findFlagValue(tokens, "-t")
+    if (tFlag) return normalizeSessionName(tFlag)
+  } else {
+    const tFlag = findFlagValue(tokens, "-t")
+    if (tFlag) return normalizeSessionName(tFlag)
+  }
+  return null
+}
+
+/**
+ * Find the tmux subcommand from tokens, skipping global options.
+ * tmux allows global options before the subcommand:
+ * e.g., `tmux -L socket-name new-session -s omo-x`
+ * Global options with args: -L, -S, -f, -c, -T
+ * Standalone flags: -C, -v, -V, etc.
+ * Special: -- (end of options marker)
+ */
+export function findSubcommand(tokens: string[]): string {
+  // Options that require an argument: -L, -S, -f, -c, -T
+  const globalOptionsWithArgs = new Set(["-L", "-S", "-f", "-c", "-T"])
+
+  let i = 0
+  while (i < tokens.length) {
+    const token = tokens[i]
+
+    // Handle end of options marker
+    if (token === "--") {
+      // Next token is the subcommand
+      return tokens[i + 1] ?? ""
+    }
+
+    if (globalOptionsWithArgs.has(token)) {
+      // Skip the option and its argument
+      i += 2
+      continue
+    }
+
+    if (token.startsWith("-")) {
+      // Skip standalone flags like -C, -v, -V
+      i++
+      continue
+    }
+
+    // Found the subcommand
+    return token
+  }
+
+  return ""
+}
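
The option-skipping rules documented for `findSubcommand` can be sketched standalone as follows. This is a minimal illustration, not the module itself: `findSubcommandSketch` and the sample token lists are hypothetical names, and the inputs assume the tokens do not start with a literal `tmux` word (the diff does not show how `tmux_command` is populated).

```typescript
// Global tmux options that consume the following token as their argument.
const GLOBAL_OPTIONS_WITH_ARGS = new Set(["-L", "-S", "-f", "-c", "-T"]);

// Hypothetical standalone copy of the skipping logic, for illustration only.
function findSubcommandSketch(tokens: string[]): string {
  let i = 0;
  while (i < tokens.length) {
    const token = tokens[i] ?? "";
    // "--" ends option parsing; the next token (if any) is the subcommand.
    if (token === "--") return tokens[i + 1] ?? "";
    // Skip an option together with its argument.
    if (GLOBAL_OPTIONS_WITH_ARGS.has(token)) {
      i += 2;
      continue;
    }
    // Skip a standalone flag such as -C or -v.
    if (token.startsWith("-")) {
      i += 1;
      continue;
    }
    return token;
  }
  return "";
}

// Sample inputs (assumed shapes, already tokenized).
const sampleTokenLists: string[][] = [
  ["-L", "socket-name", "new-session", "-s", "omo-x"],
  ["-v", "kill-session", "-t", "omo-x:1.2"],
  ["--", "kill-server"],
  ["-L"], // option with its argument missing: no subcommand found
];

const subcommands = sampleTokenLists.map(findSubcommandSketch);
console.log(subcommands);
```

The last sample shows the fallback: when an option that expects an argument is the final token, the loop runs off the end and the empty string is returned, which the hook then treats as "not a session operation".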
--- a/src/hooks/interactive-bash-session/state-manager.ts
+++ b/src/hooks/interactive-bash-session/state-manager.ts
@@ -0,0 +1,40 @@
+import type { InteractiveBashSessionState } from "./types";
+import { loadInteractiveBashSessionState } from "./storage";
+import { OMO_SESSION_PREFIX } from "./constants";
+
+export function getOrCreateState(sessionID: string, sessionStates: Map<string, InteractiveBashSessionState>): InteractiveBashSessionState {
+  if (!sessionStates.has(sessionID)) {
+    const persisted = loadInteractiveBashSessionState(sessionID);
+    const state: InteractiveBashSessionState = persisted ?? {
+      sessionID,
+      tmuxSessions: new Set<string>(),
+      updatedAt: Date.now(),
+    };
+    sessionStates.set(sessionID, state);
+  }
+  return sessionStates.get(sessionID)!;
+}
+
+export function isOmoSession(sessionName: string | null): boolean {
+  return sessionName !== null && sessionName.startsWith(OMO_SESSION_PREFIX);
+}
+
+/**
+ * Kill every tmux session tracked in `state`.
+ *
+ * Subagent sessions are not aborted here because that requires `ctx.client`,
+ * which only the hook has; see `killAllTrackedSessionsLocal` in hook.ts.
+ */
+export async function killAllTrackedSessions(
+  state: InteractiveBashSessionState,
+): Promise<void> {
+  for (const sessionName of state.tmuxSessions) {
+    try {
+      const proc = Bun.spawn(["tmux", "kill-session", "-t", sessionName], {
+        stdout: "ignore",
+        stderr: "ignore",
+      });
+      await proc.exited;
+    } catch {}
+  }
+}
--- a/src/hooks/keyword-detector/ultrawork/default.ts
+++ b/src/hooks/keyword-detector/ultrawork/default.ts
@@ -104,7 +104,7 @@ TELL THE USER WHAT AGENTS YOU WILL LEVERAGE NOW TO SATISFY USER'S REQUEST.
 | Architecture decision needed | MUST call plan agent |

 \`\`\`
-task(subagent_type="plan", prompt="<gathered context + user request>")
+task(subagent_type="plan", load_skills=[], prompt="<gathered context + user request>")
 \`\`\`

 **WHY PLAN AGENT IS MANDATORY:**
@@ -119,9 +119,9 @@ task(subagent_type="plan", prompt="<gathered context + user request>")

 | Scenario | Action |
 |----------|--------|
-| Plan agent asks clarifying questions | \`task(session_id="{returned_session_id}", prompt="<your answer>")\` |
-| Need to refine the plan | \`task(session_id="{returned_session_id}", prompt="Please adjust: <feedback>")\` |
-| Plan needs more detail | \`task(session_id="{returned_session_id}", prompt="Add more detail to Task N")\` |
+| Plan agent asks clarifying questions | \`task(session_id="{returned_session_id}", load_skills=[], prompt="<your answer>")\` |
+| Need to refine the plan | \`task(session_id="{returned_session_id}", load_skills=[], prompt="Please adjust: <feedback>")\` |
+| Plan needs more detail | \`task(session_id="{returned_session_id}", load_skills=[], prompt="Add more detail to Task N")\` |

 **WHY SESSION_ID IS CRITICAL:**
 - Plan agent retains FULL conversation context
@@ -131,10 +131,10 @@ task(subagent_type="plan", prompt="<gathered context + user request>")

 \`\`\`
 // WRONG: Starting fresh loses all context
-task(subagent_type="plan", prompt="Here's more info...")
+task(subagent_type="plan", load_skills=[], prompt="Here's more info...")

 // CORRECT: Resume preserves everything
-task(session_id="ses_abc123", prompt="Here's my answer to your question: ...")
+task(session_id="ses_abc123", load_skills=[], prompt="Here's my answer to your question: ...")
 \`\`\`

 **FAILURE TO CALL PLAN AGENT = INCOMPLETE WORK.**
@@ -147,10 +147,10 @@ task(session_id="ses_abc123", prompt="Here's my answer to your question: ...")

 | Task Type | Action | Why |
 |-----------|--------|-----|
-| Codebase exploration | task(subagent_type="explore", run_in_background=true) | Parallel, context-efficient |
-| Documentation lookup | task(subagent_type="librarian", run_in_background=true) | Specialized knowledge |
-| Planning | task(subagent_type="plan") | Parallel task graph + structured TODO list |
-| Hard problem (conventional) | task(subagent_type="oracle") | Architecture, debugging, complex logic |
+| Codebase exploration | task(subagent_type="explore", load_skills=[], run_in_background=true) | Parallel, context-efficient |
+| Documentation lookup | task(subagent_type="librarian", load_skills=[], run_in_background=true) | Specialized knowledge |
+| Planning | task(subagent_type="plan", load_skills=[]) | Parallel task graph + structured TODO list |
+| Hard problem (conventional) | task(subagent_type="oracle", load_skills=[]) | Architecture, debugging, complex logic |
 | Hard problem (non-conventional) | task(category="artistry", load_skills=[...]) | Different approach needed |
 | Implementation | task(category="...", load_skills=[...]) | Domain-optimized models |

--- a/src/hooks/keyword-detector/ultrawork/gpt5.2.ts
+++ b/src/hooks/keyword-detector/ultrawork/gpt5.2.ts
@@ -73,10 +73,10 @@ Use these when they provide clear value based on the decision framework above:

 | Resource | When to Use | How to Use |
 |----------|-------------|------------|
-| explore agent | Need codebase patterns you don't have | \`task(subagent_type="explore", run_in_background=true, ...)\` |
-| librarian agent | External library docs, OSS examples | \`task(subagent_type="librarian", run_in_background=true, ...)\` |
-| oracle agent | Stuck on architecture/debugging after 2+ attempts | \`task(subagent_type="oracle", ...)\` |
-| plan agent | Complex multi-step with dependencies (5+ steps) | \`task(subagent_type="plan", ...)\` |
+| explore agent | Need codebase patterns you don't have | \`task(subagent_type="explore", load_skills=[], run_in_background=true, ...)\` |
+| librarian agent | External library docs, OSS examples | \`task(subagent_type="librarian", load_skills=[], run_in_background=true, ...)\` |
+| oracle agent | Stuck on architecture/debugging after 2+ attempts | \`task(subagent_type="oracle", load_skills=[], ...)\` |
+| plan agent | Complex multi-step with dependencies (5+ steps) | \`task(subagent_type="plan", load_skills=[], ...)\` |
 | task category | Specialized work matching a category | \`task(category="...", load_skills=[...])\` |

 <tool_usage_rules>
--- a/src/hooks/keyword-detector/ultrawork/planner.ts
+++ b/src/hooks/keyword-detector/ultrawork/planner.ts
@@ -38,9 +38,9 @@ You ARE the planner. Your job: create bulletproof work plans.
 ### Research Protocol
 1. **Fire parallel background agents** for comprehensive context:
   \`\`\`
-   task(agent="explore", prompt="Find existing patterns for [topic] in codebase", background=true)
-   task(agent="explore", prompt="Find test infrastructure and conventions", background=true)
-   task(agent="librarian", prompt="Find official docs and best practices for [technology]", background=true)
+   task(subagent_type="explore", load_skills=[], prompt="Find existing patterns for [topic] in codebase", run_in_background=true)
+   task(subagent_type="explore", load_skills=[], prompt="Find test infrastructure and conventions", run_in_background=true)
+   task(subagent_type="librarian", load_skills=[], prompt="Find official docs and best practices for [technology]", run_in_background=true)
   \`\`\`
 2. **Wait for results** before planning - rushed plans fail
 3. **Synthesize findings** into informed requirements
--- a/src/hooks/ralph-loop/index.test.ts
+++ b/src/hooks/ralph-loop/index.test.ts
@@ -511,6 +511,38 @@ describe("ralph-loop", () => {
      expect(messagesCalls[0].sessionID).toBe("session-123")
    })

+    test("should detect completion promise in reasoning part via session messages API", async () => {
+      //#given - active loop with assistant reasoning containing completion promise
+      mockSessionMessages = [
+        { info: { role: "user" }, parts: [{ type: "text", text: "Build something" }] },
+        {
+          info: { role: "assistant" },
+          parts: [
+            { type: "reasoning", text: "I am done now. <promise>REASONING_DONE</promise>" },
+          ],
+        },
+      ]
+      const hook = createRalphLoopHook(createMockPluginInput(), {
+        getTranscriptPath: () => join(TEST_DIR, "nonexistent.jsonl"),
+      })
+      hook.startLoop("session-123", "Build something", {
+        completionPromise: "REASONING_DONE",
+      })
+
+      //#when - session goes idle
+      await hook.event({
+        event: {
+          type: "session.idle",
+          properties: { sessionID: "session-123" },
+        },
+      })
+
+      //#then - loop completed via API detection, no continuation
+      expect(promptCalls.length).toBe(0)
+      expect(toastCalls.some((t) => t.title === "Ralph Loop Complete!")).toBe(true)
+      expect(hook.getState()).toBeNull()
+    })
+
    test("should handle multiple iterations correctly", async () => {
      // given - active loop
      const hook = createRalphLoopHook(createMockPluginInput())
@@ -596,13 +628,14 @@ describe("ralph-loop", () => {
      expect(promptCalls.length).toBe(1)
    })

-    test("should only check LAST assistant message for completion", async () => {
-      // given - multiple assistant messages, only first has completion promise
+    test("should check last 3 assistant messages for completion", async () => {
+      // given - multiple assistant messages, promise in recent (not last) assistant message
      mockSessionMessages = [
        { info: { role: "user" }, parts: [{ type: "text", text: "Start task" }] },
-        { info: { role: "assistant" }, parts: [{ type: "text", text: "I'll work on it. <promise>DONE</promise>" }] },
+        { info: { role: "assistant" }, parts: [{ type: "text", text: "Working on it." }] },
        { info: { role: "user" }, parts: [{ type: "text", text: "Continue" }] },
-        { info: { role: "assistant" }, parts: [{ type: "text", text: "Working on more features..." }] },
+        { info: { role: "assistant" }, parts: [{ type: "text", text: "Nearly there... <promise>DONE</promise>" }] },
+        { info: { role: "assistant" }, parts: [{ type: "text", text: "(extra output after promise)" }] },
      ]
      const hook = createRalphLoopHook(createMockPluginInput(), {
        getTranscriptPath: () => join(TEST_DIR, "nonexistent.jsonl"),
@@ -614,35 +647,36 @@ describe("ralph-loop", () => {
        event: { type: "session.idle", properties: { sessionID: "session-123" } },
      })

-      // then - loop should continue (last message has no completion promise)
-      expect(promptCalls.length).toBe(1)
-      expect(hook.getState()?.iteration).toBe(2)
-    })
-
-    test("should detect completion only in LAST assistant message", async () => {
-      // given - last assistant message has completion promise
-      mockSessionMessages = [
-        { info: { role: "user" }, parts: [{ type: "text", text: "Start task" }] },
-        { info: { role: "assistant" }, parts: [{ type: "text", text: "Starting work..." }] },
-        { info: { role: "user" }, parts: [{ type: "text", text: "Continue" }] },
-        { info: { role: "assistant" }, parts: [{ type: "text", text: "Task complete! <promise>DONE</promise>" }] },
-      ]
-      const hook = createRalphLoopHook(createMockPluginInput(), {
-        getTranscriptPath: () => join(TEST_DIR, "nonexistent.jsonl"),
-      })
-      hook.startLoop("session-123", "Build something", { completionPromise: "DONE" })
-
-      // when - session goes idle
-      await hook.event({
-        event: { type: "session.idle", properties: { sessionID: "session-123" } },
-      })
-
-      // then - loop should complete (last message has completion promise)
+      // then - loop should complete (promise found within last 3 assistant messages)
      expect(promptCalls.length).toBe(0)
      expect(toastCalls.some((t) => t.title === "Ralph Loop Complete!")).toBe(true)
      expect(hook.getState()).toBeNull()
    })

+    test("should NOT detect completion if promise is older than last 3 assistant messages", async () => {
+      //#given - promise appears in an assistant message older than last 3
+      mockSessionMessages = [
+        { info: { role: "user" }, parts: [{ type: "text", text: "Start task" }] },
+        { info: { role: "assistant" }, parts: [{ type: "text", text: "Promise early <promise>DONE</promise>" }] },
+        { info: { role: "assistant" }, parts: [{ type: "text", text: "More work 1" }] },
+        { info: { role: "assistant" }, parts: [{ type: "text", text: "More work 2" }] },
+        { info: { role: "assistant" }, parts: [{ type: "text", text: "More work 3" }] },
+      ]
+      const hook = createRalphLoopHook(createMockPluginInput(), {
+        getTranscriptPath: () => join(TEST_DIR, "nonexistent.jsonl"),
+      })
+      hook.startLoop("session-123", "Build something", { completionPromise: "DONE" })
+
+      //#when - session goes idle
+      await hook.event({
+        event: { type: "session.idle", properties: { sessionID: "session-123" } },
+      })
+
+      //#then - loop should continue (promise is older than last 3 assistant messages)
+      expect(promptCalls.length).toBe(1)
+      expect(hook.getState()?.iteration).toBe(2)
+    })
+
    test("should allow starting new loop while previous loop is active (different session)", async () => {
      // given - active loop in session A
      const hook = createRalphLoopHook(createMockPluginInput())
@@ -928,7 +962,7 @@ Original task: Build something`
      const elapsed = Date.now() - startTime

      // then - should complete quickly (not hang for 10s)
-      expect(elapsed).toBeLessThan(2000)
+      expect(elapsed).toBeLessThan(6000)
      // then - loop should continue (API error = no completion detected)
      expect(promptCalls.length).toBe(1)
      expect(apiCallCount).toBeGreaterThan(0)
--- a/src/hooks/ralph-loop/index.ts
+++ b/src/hooks/ralph-loop/index.ts
@@ -67,7 +67,7 @@ export interface RalphLoopHook {
  getState: () => RalphLoopState | null
 }

-const DEFAULT_API_TIMEOUT = 3000
+const DEFAULT_API_TIMEOUT = 5000

 export function createRalphLoopHook(
  ctx: PluginInput,
@@ -80,6 +80,23 @@ export function createRalphLoopHook(
  const apiTimeout = options?.apiTimeout ?? DEFAULT_API_TIMEOUT
  const checkSessionExists = options?.checkSessionExists

+  async function withTimeout<TData>(promise: Promise<TData>, timeoutMs: number): Promise<TData> {
+    let timeoutId: ReturnType<typeof setTimeout> | undefined
+    const timeoutPromise = new Promise<never>((_, reject) => {
+      timeoutId = setTimeout(() => {
+        reject(new Error("API timeout"))
+      }, timeoutMs)
+    })
+
+    try {
+      return await Promise.race([promise, timeoutPromise])
+    } finally {
+      if (timeoutId !== undefined) {
+        clearTimeout(timeoutId)
+      }
+    }
+  }
+
  function getSessionState(sessionID: string): SessionState {
    let state = sessions.get(sessionID)
    if (!state) {
@@ -126,34 +143,44 @@ export function createRalphLoopHook(
    promise: string
  ): Promise<boolean> {
    try {
-      const response = await Promise.race([
+      const response = await withTimeout(
        ctx.client.session.messages({
          path: { id: sessionID },
          query: { directory: ctx.directory },
        }),
-        new Promise<never>((_, reject) =>
-          setTimeout(() => reject(new Error("API timeout")), apiTimeout)
-        ),
-      ])
+        apiTimeout
+      )

      const messages = (response as { data?: unknown[] }).data ?? []
      if (!Array.isArray(messages)) return false

-      const assistantMessages = (messages as OpenCodeSessionMessage[]).filter(
-        (msg) => msg.info?.role === "assistant"
-      )
-      const lastAssistant = assistantMessages[assistantMessages.length - 1]
-      if (!lastAssistant?.parts) return false
+      const assistantMessages = (messages as OpenCodeSessionMessage[]).filter((msg) => msg.info?.role === "assistant")
+      if (assistantMessages.length === 0) return false

      const pattern = new RegExp(`<promise>\\s*${escapeRegex(promise)}\\s*</promise>`, "is")
-      const responseText = lastAssistant.parts
-        .filter((p) => p.type === "text")
-        .map((p) => p.text ?? "")
-        .join("\n")

-      return pattern.test(responseText)
+      const recentAssistants = assistantMessages.slice(-3)
+      for (const assistant of recentAssistants) {
+        if (!assistant.parts) continue
+
+        const responseText = assistant.parts
+          .filter((p) => p.type === "text" || p.type === "reasoning")
+          .map((p) => p.text ?? "")
+          .join("\n")
+
+        if (pattern.test(responseText)) {
+          return true
+        }
+      }
+
+      return false
    } catch (err) {
-      log(`[${HOOK_NAME}] Session messages check failed`, { sessionID, error: String(err) })
+      setTimeout(() => {
+        log(`[${HOOK_NAME}] Session messages check failed`, {
+          sessionID,
+          error: String(err),
+        })
+      }, 0)
      return false
    }
  }
@@ -343,7 +370,10 @@ export function createRalphLoopHook(
        let model: { providerID: string; modelID: string } | undefined

        try {
-          const messagesResp = await ctx.client.session.messages({ path: { id: sessionID } })
+          const messagesResp = await withTimeout(
+            ctx.client.session.messages({ path: { id: sessionID } }),
+            apiTimeout
+          )
          const messages = (messagesResp.data ?? []) as Array<{
            info?: { agent?: string; model?: { providerID: string; modelID: string }; modelID?: string; providerID?: string }
          }>
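
The revised completion check in `src/hooks/ralph-loop/index.ts` (scan the last three assistant messages, reading both `text` and `reasoning` parts) can be sketched in isolation roughly as below. This is an illustrative reduction under stated assumptions, not the hook's code: `hasCompletionPromise`, `escapeRegexSketch`, the two interfaces, and the `windowSize` parameter are hypothetical names, and the real `escapeRegex` helper is not shown in this diff.

```typescript
interface MessagePart {
  type: string;
  text?: string;
}

interface SessionMessage {
  info?: { role?: string };
  parts?: MessagePart[];
}

// Hypothetical stand-in for the hook's escapeRegex helper.
function escapeRegexSketch(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns true when one of the most recent assistant messages contains
// <promise>...</promise> in a text or reasoning part.
function hasCompletionPromise(
  messages: SessionMessage[],
  promise: string,
  windowSize = 3, // assumed default, mirroring slice(-3) in the diff
): boolean {
  const pattern = new RegExp(
    `<promise>\\s*${escapeRegexSketch(promise)}\\s*</promise>`,
    "is",
  );
  const recentAssistants = messages
    .filter((msg) => msg.info?.role === "assistant")
    .slice(-windowSize);

  return recentAssistants.some((msg) => {
    const responseText = (msg.parts ?? [])
      .filter((part) => part.type === "text" || part.type === "reasoning")
      .map((part) => part.text ?? "")
      .join("\n");
    return pattern.test(responseText);
  });
}

const assistant = (type: string, text: string): SessionMessage => ({
  info: { role: "assistant" },
  parts: [{ type, text }],
});

const verdicts = {
  // Promise appears only in a reasoning part of the latest message.
  reasoning: hasCompletionPromise(
    [assistant("reasoning", "Done. <promise>DONE</promise>")],
    "DONE",
  ),
  // Promise is followed by three newer assistant messages, so it is out of the window.
  stale: hasCompletionPromise(
    [
      assistant("text", "<promise>DONE</promise>"),
      assistant("text", "More work 1"),
      assistant("text", "More work 2"),
      assistant("text", "More work 3"),
    ],
    "DONE",
  ),
  // Parts of other types are ignored.
  toolPart: hasCompletionPromise(
    [assistant("tool", "<promise>DONE</promise>")],
    "DONE",
  ),
};
console.log(verdicts);
```

Bounding the scan to a small window keeps an old promise from ending a loop that has since continued, while still tolerating a little trailing output after the promise, which is the trade-off the updated tests exercise.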
--- a/src/hooks/session-recovery/index.test.ts
+++ b/src/hooks/session-recovery/index.test.ts
@@ -129,6 +129,63 @@ describe("detectErrorType", () => {
    })
  })

+  describe("assistant_prefill_unsupported errors", () => {
+    it("should detect assistant message prefill error from direct message", () => {
+      //#given an error about assistant message prefill not being supported
+      const error = {
+        message: "This model does not support assistant message prefill. The conversation must end with a user message.",
+      }
+
+      //#when detectErrorType is called
+      const result = detectErrorType(error)
+
+      //#then should return assistant_prefill_unsupported
+      expect(result).toBe("assistant_prefill_unsupported")
+    })
+
+    it("should detect assistant message prefill error from nested error object", () => {
+      //#given an Anthropic API error with nested structure matching the real error format
+      const error = {
+        error: {
+          type: "invalid_request_error",
+          message: "This model does not support assistant message prefill. The conversation must end with a user message.",
+        },
+      }
+
+      //#when detectErrorType is called
+      const result = detectErrorType(error)
+
+      //#then should return assistant_prefill_unsupported
+      expect(result).toBe("assistant_prefill_unsupported")
+    })
+
+    it("should detect error with only 'conversation must end with a user message' fragment", () => {
+      //#given an error containing only the user message requirement
+      const error = {
+        message: "The conversation must end with a user message.",
+      }
+
+      //#when detectErrorType is called
+      const result = detectErrorType(error)
+
+      //#then should return assistant_prefill_unsupported
+      expect(result).toBe("assistant_prefill_unsupported")
+    })
+
+    it("should detect error with only 'assistant message prefill' fragment", () => {
+      //#given an error containing only the prefill mention
+      const error = {
+        message: "This model does not support assistant message prefill.",
+      }
+
+      //#when detectErrorType is called
+      const result = detectErrorType(error)
+
+      //#then should return assistant_prefill_unsupported
+      expect(result).toBe("assistant_prefill_unsupported")
+    })
+  })
+
  describe("unrecognized errors", () => {
    it("should return null for unrecognized error patterns", () => {
      // given an unrelated error
--- a/src/hooks/session-recovery/index.ts
+++ b/src/hooks/session-recovery/index.ts
@@ -28,6 +28,7 @@ type RecoveryErrorType =
  | "tool_result_missing"
  | "thinking_block_order"
  | "thinking_disabled_violation"
+  | "assistant_prefill_unsupported"
  | null

 interface MessageInfo {
@@ -126,6 +127,13 @@ function extractMessageIndex(error: unknown): number | null {
 export function detectErrorType(error: unknown): RecoveryErrorType {
  const message = getErrorMessage(error)

+  if (
+    message.includes("assistant message prefill") ||
+    message.includes("conversation must end with a user message")
+  ) {
+    return "assistant_prefill_unsupported"
+  }
+
  // IMPORTANT: Check thinking_block_order BEFORE tool_result_missing
  // because Anthropic's extended thinking error messages contain "tool_use" and "tool_result"
  // in the documentation URL, which would incorrectly match tool_result_missing
@@ -375,11 +383,13 @@ export function createSessionRecoveryHook(ctx: PluginInput, options?: SessionRec
        tool_result_missing: "Tool Crash Recovery",
        thinking_block_order: "Thinking Block Recovery",
        thinking_disabled_violation: "Thinking Strip Recovery",
+        assistant_prefill_unsupported: "Prefill Error Recovery",
      }
      const toastMessages: Record<RecoveryErrorType & string, string> = {
        tool_result_missing: "Injecting cancelled tool results...",
        thinking_block_order: "Fixing message structure...",
        thinking_disabled_violation: "Stripping thinking blocks...",
+        assistant_prefill_unsupported: "Sending 'Continue' to recover...",
      }

      await ctx.client.tui
@@ -411,6 +421,8 @@ export function createSessionRecoveryHook(ctx: PluginInput, options?: SessionRec
          const resumeConfig = extractResumeConfig(lastUser, sessionID)
          await resumeSession(ctx.client, resumeConfig)
        }
+      } else if (errorType === "assistant_prefill_unsupported") {
+        success = true
      }

      return success
--- a/src/hooks/unstable-agent-babysitter/index.test.ts
+++ b/src/hooks/unstable-agent-babysitter/index.test.ts
@@ -1,8 +1,9 @@
+import { afterEach, describe, expect, test } from "bun:test"
 import { _resetForTesting, setMainSession } from "../../features/claude-code-session-state"
 import type { BackgroundTask } from "../../features/background-agent"
 import { createUnstableAgentBabysitterHook } from "./index"

-const projectDir = "/Users/yeongyu/local-workspaces/oh-my-opencode"
+const projectDir = process.cwd()

 type BabysitterContext = Parameters<typeof createUnstableAgentBabysitterHook>[0]

@@ -21,6 +22,9 @@ function createMockPluginInput(options: {
        prompt: async (input: unknown) => {
          promptCalls.push({ input })
        },
+        promptAsync: async (input: unknown) => {
+          promptCalls.push({ input })
+        },
      },
    },
  }
--- a/src/index.ts
+++ b/src/index.ts
@@ -403,12 +403,7 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
      ), { enabled: safeHookEnabled })
    : null;

-  if (sessionRecovery && todoContinuationEnforcer) {
-    sessionRecovery.setOnAbortCallback(todoContinuationEnforcer.markRecovering);
-    sessionRecovery.setOnRecoveryCompleteCallback(
-      todoContinuationEnforcer.markRecoveryComplete,
-    );
-  }
+  // sessionRecovery callbacks are setters, so each can hold only one handler; they are registered once further below.

  const backgroundNotificationHook = isHookEnabled("background-notification")
    ? safeCreateHook("background-notification", () => createBackgroundNotificationHook(backgroundManager), { enabled: safeHookEnabled })
@@ -488,6 +483,7 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
    disabledSkills,
    availableCategories,
    availableSkills,
+    agentOverrides: pluginConfig.agents,
    onSyncSessionCreated: async (event) => {
      log("[index] onSyncSessionCreated callback", {
        sessionID: event.sessionID,
@@ -543,6 +539,16 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
  });

  const taskSystemEnabled = pluginConfig.experimental?.task_system ?? false;
+
+  if (sessionRecovery && todoContinuationEnforcer) {
+    sessionRecovery.setOnAbortCallback((sessionID) => {
+      todoContinuationEnforcer?.markRecovering(sessionID);
+    });
+    sessionRecovery.setOnRecoveryCompleteCallback((sessionID) => {
+      todoContinuationEnforcer?.markRecoveryComplete(sessionID);
+    });
+  }
+
  const taskToolsRecord: Record<string, ToolDefinition> = taskSystemEnabled
    ? {
        task_create: createTaskCreateTool(pluginConfig, ctx),
--- a/src/mcp/AGENTS.md
+++ b/src/mcp/AGENTS.md
@@ -2,7 +2,7 @@

 ## OVERVIEW

-Tier 1 of three-tier MCP system: 3 built-in remote HTTP MCPs.
+Tier 1 of three-tier MCP system: 3 built-in remote HTTP MCPs (websearch, context7, grep_app).

 **Three-Tier System**:
 1. **Built-in** (this directory): websearch, context7, grep_app
--- a/src/mcp/websearch.test.ts
+++ b/src/mcp/websearch.test.ts
@@ -1,16 +1,30 @@
-import { describe, expect, test, beforeEach, afterEach } from "bun:test"
+import { afterEach, beforeEach, describe, expect, test } from "bun:test"
 import { createWebsearchConfig } from "./websearch"

 describe("websearch MCP provider configuration", () => {
-  const originalEnv = { ...process.env }
+  let originalExaApiKey: string | undefined
+  let originalTavilyApiKey: string | undefined

  beforeEach(() => {
+    originalExaApiKey = process.env.EXA_API_KEY
+    originalTavilyApiKey = process.env.TAVILY_API_KEY
+
    delete process.env.EXA_API_KEY
    delete process.env.TAVILY_API_KEY
  })

  afterEach(() => {
-    process.env = { ...originalEnv }
+    if (originalExaApiKey === undefined) {
+      delete process.env.EXA_API_KEY
+    } else {
+      process.env.EXA_API_KEY = originalExaApiKey
+    }
+
+    if (originalTavilyApiKey === undefined) {
+      delete process.env.TAVILY_API_KEY
+    } else {
+      process.env.TAVILY_API_KEY = originalTavilyApiKey
+    }
  })

  test("returns Exa config when no config provided", () => {
@@ -21,6 +35,7 @@ describe("websearch MCP provider configuration", () => {

    //#then
    expect(result.url).toContain("mcp.exa.ai")
+    expect(result.url).toContain("tools=web_search_exa")
    expect(result.type).toBe("remote")
    expect(result.enabled).toBe(true)
  })
@@ -34,10 +49,11 @@ describe("websearch MCP provider configuration", () => {

    //#then
    expect(result.url).toContain("mcp.exa.ai")
+    expect(result.url).toContain("tools=web_search_exa")
    expect(result.type).toBe("remote")
  })

-  test("includes x-api-key header when EXA_API_KEY is set", () => {
+  test("appends exaApiKey query param when EXA_API_KEY is set", () => {
    //#given
    const apiKey = "test-exa-key-12345"
    process.env.EXA_API_KEY = apiKey
@@ -46,7 +62,30 @@ describe("websearch MCP provider configuration", () => {
    const result = createWebsearchConfig()

    //#then
-    expect(result.headers).toEqual({ "x-api-key": apiKey })
+    expect(result.url).toContain(`exaApiKey=${encodeURIComponent(apiKey)}`)
+  })
+
+  test("does not set x-api-key header when EXA_API_KEY is set", () => {
+    //#given
+    process.env.EXA_API_KEY = "test-exa-key-12345"
+
+    //#when
+    const result = createWebsearchConfig()
+
+    //#then
+    expect(result.headers).toBeUndefined()
+  })
+
+  test("URL-encodes EXA_API_KEY when it contains special characters", () => {
+    //#given an EXA_API_KEY with special characters (+ & =)
+    const apiKey = "a+b&c=d"
+    process.env.EXA_API_KEY = apiKey
+
+    //#when createWebsearchConfig is called
+    const result = createWebsearchConfig()
+
+    //#then the URL contains the properly encoded key via encodeURIComponent
+    expect(result.url).toContain(`exaApiKey=${encodeURIComponent(apiKey)}`)
  })

  test("returns Tavily config when provider is 'tavily' and TAVILY_API_KEY set", () => {
@@ -77,7 +116,8 @@ describe("websearch MCP provider configuration", () => {

  test("returns Exa when both keys present but no explicit provider", () => {
    //#given
-    process.env.EXA_API_KEY = "test-exa-key"
+    const exaKey = "test-exa-key"
+    process.env.EXA_API_KEY = exaKey
    process.env.TAVILY_API_KEY = "test-tavily-key"

    //#when
@@ -85,7 +125,8 @@ describe("websearch MCP provider configuration", () => {

    //#then
    expect(result.url).toContain("mcp.exa.ai")
-    expect(result.headers).toEqual({ "x-api-key": "test-exa-key" })
+    expect(result.url).toContain(`exaApiKey=${encodeURIComponent(exaKey)}`)
+    expect(result.headers).toBeUndefined()
  })

  test("Tavily config uses Authorization Bearer header format", () => {
@@ -111,6 +152,8 @@ describe("websearch MCP provider configuration", () => {

    //#then
    expect(result.url).toContain("mcp.exa.ai")
+    expect(result.url).toContain("tools=web_search_exa")
+    expect(result.url).not.toContain("exaApiKey=")
    expect(result.headers).toBeUndefined()
  })
 })
--- a/src/mcp/websearch.ts
+++ b/src/mcp/websearch.ts
@@ -31,11 +31,10 @@ export function createWebsearchConfig(config?: WebsearchConfig): RemoteMcpConfig
  // Default to Exa
  return {
    type: "remote" as const,
-    url: "https://mcp.exa.ai/mcp?tools=web_search_exa",
+    url: process.env.EXA_API_KEY
+      ? `https://mcp.exa.ai/mcp?tools=web_search_exa&exaApiKey=${encodeURIComponent(process.env.EXA_API_KEY)}`
+      : "https://mcp.exa.ai/mcp?tools=web_search_exa",
    enabled: true,
-    headers: process.env.EXA_API_KEY
-      ? { "x-api-key": process.env.EXA_API_KEY }
-      : undefined,
    oauth: false as const,
  }
 }
--- a/src/plugin-handlers/AGENTS.md
+++ b/src/plugin-handlers/AGENTS.md
@@ -0,0 +1,96 @@
+**Generated:** 2026-02-08T16:45:00+09:00
+**Commit:** f2b7b759
+**Branch:** dev
+
+## OVERVIEW
+
+Plugin component loading and configuration orchestration. 500+ lines of config merging, migration, and component discovery for Claude Code compatibility.
+
+## STRUCTURE
+```
+plugin-handlers/
+├── config-handler.ts       # Main config orchestrator (563 lines) - agent/skill/command loading
+├── config-handler.test.ts  # Config handler tests (~1060 lines)
+├── plan-model-inheritance.ts # Plan agent model inheritance logic (657 lines)
+├── plan-model-inheritance.test.ts # Inheritance tests (118 lines)
+└── index.ts               # Barrel export
+```
+
+## CORE FUNCTIONS
+
+**Config Handler (`createConfigHandler`):**
+- Loads all plugin components (agents, skills, commands, MCPs)
+- Applies permission migrations for compatibility
+- Merges user/project/global configurations
+- Handles Claude Code plugin integration
+
+**Plan Model Inheritance:**
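+- Demotes the plan agent to `subagent` mode when the planner is enabled and `replace_plan` is true
+- Inherits model settings (model, variant, temperature, etc.) from the resolved prometheus config, but not prompt, description, or color
+- Lets user overrides in `agents.plan` take priority over inherited settings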
+
+## LOADING PHASES
+
+1. **Plugin Discovery**: Load Claude Code plugins with timeout protection
+2. **Component Loading**: Parallel loading of agents, skills, commands
+3. **Config Merging**: User → Project → Global → Defaults
+4. **Migration**: Legacy config format compatibility
+5. **Permission Application**: Tool access control per agent
+
+## KEY FEATURES
+
+**Parallel Loading:**
+- Concurrent discovery of user/project/global components
+- Timeout protection for plugin loading (default: 10s)
+- Error isolation (failed plugins don't break others)
+
+**Migration Support:**
+- Agent name mapping (old → new names)
+- Permission format conversion
+- Config structure updates
+
+**Claude Code Integration:**
+- Plugin component loading
+- MCP server discovery
+- Agent/skill/command compatibility
+
+## CONFIGURATION FLOW
+
+```
+User Config → Migration → Merging → Validation → Agent Creation → Permission Application
+```
+
+## TESTING COVERAGE
+
+- **Config Handler**: ~1060 lines of tests
+- **Plan Inheritance**: 118 lines of tests
+- **Migration Logic**: Legacy compatibility verification
+- **Parallel Loading**: Timeout and error handling
+
+## USAGE PATTERNS
+
+**Config Handler Creation:**
+```typescript
+const handler = createConfigHandler({
+  ctx: { directory: projectDir },
+  pluginConfig: userConfig,
+  modelCacheState: cache
+});
+```
+
+**Plan Demotion:**
+```typescript
+const demotedPlan = buildPlanDemoteConfig(
+  prometheusConfig,
+  userPlanOverrides
+);
+```
+
+**Component Loading:**
+```typescript
+const [agents, skills, commands] = await Promise.all([
+  loadUserAgents(),
+  loadProjectSkills(),
+  loadGlobalCommands()
+]);
+```
--- a/src/plugin-handlers/config-handler.test.ts
+++ b/src/plugin-handlers/config-handler.test.ts
@@ -1,3 +1,5 @@
+/// <reference types="bun-types" />
+
 import { describe, test, expect, spyOn, beforeEach, afterEach } from "bun:test"
 import { resolveCategoryConfig, createConfigHandler } from "./config-handler"
 import type { CategoryConfig } from "../config/schema"
@@ -600,6 +602,187 @@ describe("Prometheus direct override priority over category", () => {
  })
 })

+describe("Plan agent model inheritance from prometheus", () => {
+  test("plan agent inherits all model-related settings from resolved prometheus config", async () => {
+    //#given - prometheus resolves to claude-opus-4-6 with model settings
+    spyOn(shared, "resolveModelPipeline" as any).mockReturnValue({
+      model: "anthropic/claude-opus-4-6",
+      provenance: "provider-fallback",
+      variant: "max",
+    })
+    const pluginConfig: OhMyOpenCodeConfig = {
+      sisyphus_agent: {
+        planner_enabled: true,
+        replace_plan: true,
+      },
+    }
+    const config: Record<string, unknown> = {
+      model: "anthropic/claude-opus-4-6",
+      agent: {
+        plan: {
+          name: "plan",
+          mode: "primary",
+          prompt: "original plan prompt",
+        },
+      },
+    }
+    const handler = createConfigHandler({
+      ctx: { directory: "/tmp" },
+      pluginConfig,
+      modelCacheState: {
+        anthropicContext1MEnabled: false,
+        modelContextLimitsCache: new Map(),
+      },
+    })
+
+    //#when
+    await handler(config)
+
+    //#then - plan inherits model and variant from prometheus, but NOT prompt
+    const agents = config.agent as Record<string, { mode?: string; model?: string; variant?: string; prompt?: string }>
+    expect(agents.plan).toBeDefined()
+    expect(agents.plan.mode).toBe("subagent")
+    expect(agents.plan.model).toBe("anthropic/claude-opus-4-6")
+    expect(agents.plan.variant).toBe("max")
+    expect(agents.plan.prompt).toBeUndefined()
+  })
+
+  test("plan agent inherits temperature, reasoningEffort, and other model settings from prometheus", async () => {
+    //#given - prometheus configured via a direct agent override with temperature, reasoningEffort, and other model settings
+    spyOn(shared, "resolveModelPipeline" as any).mockReturnValue({
+      model: "openai/gpt-5.2",
+      provenance: "override",
+      variant: "high",
+    })
+    const pluginConfig: OhMyOpenCodeConfig = {
+      sisyphus_agent: {
+        planner_enabled: true,
+        replace_plan: true,
+      },
+      agents: {
+        prometheus: {
+          model: "openai/gpt-5.2",
+          variant: "high",
+          temperature: 0.3,
+          top_p: 0.9,
+          maxTokens: 16000,
+          reasoningEffort: "high",
+          textVerbosity: "medium",
+          thinking: { type: "enabled", budgetTokens: 8000 },
+        },
+      },
+    }
+    const config: Record<string, unknown> = {
+      model: "anthropic/claude-opus-4-6",
+      agent: {},
+    }
+    const handler = createConfigHandler({
+      ctx: { directory: "/tmp" },
+      pluginConfig,
+      modelCacheState: {
+        anthropicContext1MEnabled: false,
+        modelContextLimitsCache: new Map(),
+      },
+    })
+
+    //#when
+    await handler(config)
+
+    //#then - plan inherits ALL model-related settings from resolved prometheus
+    const agents = config.agent as Record<string, Record<string, unknown>>
+    expect(agents.plan).toBeDefined()
+    expect(agents.plan.mode).toBe("subagent")
+    expect(agents.plan.model).toBe("openai/gpt-5.2")
+    expect(agents.plan.variant).toBe("high")
+    expect(agents.plan.temperature).toBe(0.3)
+    expect(agents.plan.top_p).toBe(0.9)
+    expect(agents.plan.maxTokens).toBe(16000)
+    expect(agents.plan.reasoningEffort).toBe("high")
+    expect(agents.plan.textVerbosity).toBe("medium")
+    expect(agents.plan.thinking).toEqual({ type: "enabled", budgetTokens: 8000 })
+  })
+
+  test("plan agent user override takes priority over prometheus inherited settings", async () => {
+    //#given - prometheus resolves to opus, but user has plan override for gpt-5.2
+    spyOn(shared, "resolveModelPipeline" as any).mockReturnValue({
+      model: "anthropic/claude-opus-4-6",
+      provenance: "provider-fallback",
+      variant: "max",
+    })
+    const pluginConfig: OhMyOpenCodeConfig = {
+      sisyphus_agent: {
+        planner_enabled: true,
+        replace_plan: true,
+      },
+      agents: {
+        plan: {
+          model: "openai/gpt-5.2",
+          variant: "high",
+          temperature: 0.5,
+        },
+      },
+    }
+    const config: Record<string, unknown> = {
+      model: "anthropic/claude-opus-4-6",
+      agent: {},
+    }
+    const handler = createConfigHandler({
+      ctx: { directory: "/tmp" },
+      pluginConfig,
+      modelCacheState: {
+        anthropicContext1MEnabled: false,
+        modelContextLimitsCache: new Map(),
+      },
+    })
+
+    //#when
+    await handler(config)
+
+    //#then - plan uses its own override, not prometheus settings
+    const agents = config.agent as Record<string, Record<string, unknown>>
+    expect(agents.plan.model).toBe("openai/gpt-5.2")
+    expect(agents.plan.variant).toBe("high")
+    expect(agents.plan.temperature).toBe(0.5)
+  })
+
+  test("plan agent does NOT inherit prompt, description, or color from prometheus", async () => {
+    //#given
+    spyOn(shared, "resolveModelPipeline" as any).mockReturnValue({
+      model: "anthropic/claude-opus-4-6",
+      provenance: "provider-fallback",
+      variant: "max",
+    })
+    const pluginConfig: OhMyOpenCodeConfig = {
+      sisyphus_agent: {
+        planner_enabled: true,
+        replace_plan: true,
+      },
+    }
+    const config: Record<string, unknown> = {
+      model: "anthropic/claude-opus-4-6",
+      agent: {},
+    }
+    const handler = createConfigHandler({
+      ctx: { directory: "/tmp" },
+      pluginConfig,
+      modelCacheState: {
+        anthropicContext1MEnabled: false,
+        modelContextLimitsCache: new Map(),
+      },
+    })
+
+    //#when
+    await handler(config)
+
+    //#then - plan has model settings but NOT prompt/description/color
+    const agents = config.agent as Record<string, Record<string, unknown>>
+    expect(agents.plan.model).toBe("anthropic/claude-opus-4-6")
+    expect(agents.plan.prompt).toBeUndefined()
+    expect(agents.plan.description).toBeUndefined()
+    expect(agents.plan.color).toBeUndefined()
+  })
+})
+
 describe("Deadlock prevention - fetchAvailableModels must not receive client", () => {
  test("fetchAvailableModels should be called with undefined client to prevent deadlock during plugin init", async () => {
    // given - This test ensures we don't regress on issue #1301
@@ -762,3 +945,117 @@ describe("config-handler plugin loading error boundary (#1559)", () => {
    expect(commands["test-cmd"]).toBeDefined()
  })
 })
+
+describe("per-agent todowrite/todoread deny when task_system enabled", () => {
+  const PRIMARY_AGENTS = ["sisyphus", "hephaestus", "atlas", "prometheus", "sisyphus-junior"]
+
+  test("denies todowrite and todoread for primary agents when task_system is enabled", async () => {
+    //#given
+    const createBuiltinAgentsMock = agents.createBuiltinAgents as unknown as {
+      mockResolvedValue: (value: Record<string, unknown>) => void
+    }
+    createBuiltinAgentsMock.mockResolvedValue({
+      sisyphus: { name: "sisyphus", prompt: "test", mode: "primary" },
+      hephaestus: { name: "hephaestus", prompt: "test", mode: "primary" },
+      atlas: { name: "atlas", prompt: "test", mode: "primary" },
+      prometheus: { name: "prometheus", prompt: "test", mode: "primary" },
+      "sisyphus-junior": { name: "sisyphus-junior", prompt: "test", mode: "subagent" },
+      oracle: { name: "oracle", prompt: "test", mode: "subagent" },
+    })
+
+    const pluginConfig: OhMyOpenCodeConfig = {
+      experimental: { task_system: true },
+    }
+    const config: Record<string, unknown> = {
+      model: "anthropic/claude-opus-4-6",
+      agent: {},
+    }
+    const handler = createConfigHandler({
+      ctx: { directory: "/tmp" },
+      pluginConfig,
+      modelCacheState: {
+        anthropicContext1MEnabled: false,
+        modelContextLimitsCache: new Map(),
+      },
+    })
+
+    //#when
+    await handler(config)
+
+    //#then
+    const agentResult = config.agent as Record<string, { permission?: Record<string, unknown> }>
+    for (const agentName of PRIMARY_AGENTS) {
+      expect(agentResult[agentName]?.permission?.todowrite).toBe("deny")
+      expect(agentResult[agentName]?.permission?.todoread).toBe("deny")
+    }
+  })
+
+  test("does not deny todowrite/todoread when task_system is disabled", async () => {
+    //#given
+    const createBuiltinAgentsMock = agents.createBuiltinAgents as unknown as {
+      mockResolvedValue: (value: Record<string, unknown>) => void
+    }
+    createBuiltinAgentsMock.mockResolvedValue({
+      sisyphus: { name: "sisyphus", prompt: "test", mode: "primary" },
+      hephaestus: { name: "hephaestus", prompt: "test", mode: "primary" },
+    })
+
+    const pluginConfig: OhMyOpenCodeConfig = {
+      experimental: { task_system: false },
+    }
+    const config: Record<string, unknown> = {
+      model: "anthropic/claude-opus-4-6",
+      agent: {},
+    }
+    const handler = createConfigHandler({
+      ctx: { directory: "/tmp" },
+      pluginConfig,
+      modelCacheState: {
+        anthropicContext1MEnabled: false,
+        modelContextLimitsCache: new Map(),
+      },
+    })
+
+    //#when
+    await handler(config)
+
+    //#then
+    const agentResult = config.agent as Record<string, { permission?: Record<string, unknown> }>
+    expect(agentResult.sisyphus?.permission?.todowrite).toBeUndefined()
+    expect(agentResult.sisyphus?.permission?.todoread).toBeUndefined()
+    expect(agentResult.hephaestus?.permission?.todowrite).toBeUndefined()
+    expect(agentResult.hephaestus?.permission?.todoread).toBeUndefined()
+  })
+
+  test("does not deny todowrite/todoread when task_system is undefined", async () => {
+    //#given
+    const createBuiltinAgentsMock = agents.createBuiltinAgents as unknown as {
+      mockResolvedValue: (value: Record<string, unknown>) => void
+    }
+    createBuiltinAgentsMock.mockResolvedValue({
+      sisyphus: { name: "sisyphus", prompt: "test", mode: "primary" },
+    })
+
+    const pluginConfig: OhMyOpenCodeConfig = {}
+    const config: Record<string, unknown> = {
+      model: "anthropic/claude-opus-4-6",
+      agent: {},
+    }
+    const handler = createConfigHandler({
+      ctx: { directory: "/tmp" },
+      pluginConfig,
+      modelCacheState: {
+        anthropicContext1MEnabled: false,
+        modelContextLimitsCache: new Map(),
+      },
+    })
+
+    //#when
+    await handler(config)
+
+    //#then
+    const agentResult = config.agent as Record<string, { permission?: Record<string, unknown> }>
+    expect(agentResult.sisyphus?.permission?.todowrite).toBeUndefined()
+    expect(agentResult.sisyphus?.permission?.todoread).toBeUndefined()
+  })
+})
--- a/src/plugin-handlers/config-handler.ts
+++ b/src/plugin-handlers/config-handler.ts
@@ -32,6 +32,7 @@ import { AGENT_NAME_MAP } from "../shared/migration";
 import { AGENT_MODEL_REQUIREMENTS } from "../shared/model-requirements";
 import { PROMETHEUS_SYSTEM_PROMPT, PROMETHEUS_PERMISSION } from "../agents/prometheus";
 import { DEFAULT_CATEGORIES } from "../tools/delegate-task/constants";
+import { buildPlanDemoteConfig } from "./plan-model-inheritance";
 import type { ModelCacheState } from "../plugin-state";
 import type { CategoryConfig } from "../config/schema";

@@ -183,19 +184,40 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
    // Pass it as uiSelectedModel so it takes highest priority in model resolution
    const currentModel = config.model as string | undefined;
    const disabledSkills = new Set<string>(pluginConfig.disabled_skills ?? []);
-    const builtinAgents = await createBuiltinAgents(
-      migratedDisabledAgents,
-      pluginConfig.agents,
-      ctx.directory,
-      undefined, // systemDefaultModel - let fallback chain handle this
-      pluginConfig.categories,
-      pluginConfig.git_master,
-      allDiscoveredSkills,
-      ctx.client,
-      browserProvider,
-      currentModel, // uiSelectedModel - takes highest priority
-      disabledSkills
-    );
+
+    type AgentConfig = Record<
+      string,
+      Record<string, unknown> | undefined
+    > & {
+      build?: Record<string, unknown>;
+      plan?: Record<string, unknown>;
+      explore?: { tools?: Record<string, unknown> };
+      librarian?: { tools?: Record<string, unknown> };
+      "multimodal-looker"?: { tools?: Record<string, unknown> };
+      atlas?: { tools?: Record<string, unknown> };
+      sisyphus?: { tools?: Record<string, unknown> };
+    };
+    const configAgent = config.agent as AgentConfig | undefined;
+
+    function isRecord(value: unknown): value is Record<string, unknown> {
+      return typeof value === "object" && value !== null;
+    }
+
+    function buildCustomAgentSummaryInput(agents: Record<string, unknown> | undefined): unknown[] {
+      if (!agents) return [];
+
+      const result: unknown[] = [];
+      for (const [name, value] of Object.entries(agents)) {
+        if (!isRecord(value)) continue;
+
+        const description = typeof value.description === "string" ? value.description : "";
+        const hidden = value.hidden === true;
+        const disabled = value.disabled === true || value.enabled === false;
+        result.push({ name, description, hidden, disabled });
+      }
+
+      return result;
+    }

    // Claude Code agents: Do NOT apply permission migration
    // Claude Code uses whitelist-based tools format which is semantically different
@@ -216,6 +238,27 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
      ])
    );

+    const customAgentSummaries = [
+      ...buildCustomAgentSummaryInput(configAgent),
+      ...buildCustomAgentSummaryInput(userAgents),
+      ...buildCustomAgentSummaryInput(projectAgents),
+      ...buildCustomAgentSummaryInput(pluginAgents),
+    ];
+
+    const builtinAgents = await createBuiltinAgents(
+      migratedDisabledAgents,
+      pluginConfig.agents,
+      ctx.directory,
+      undefined, // systemDefaultModel - let fallback chain handle this
+      pluginConfig.categories,
+      pluginConfig.git_master,
+      allDiscoveredSkills,
+      customAgentSummaries,
+      browserProvider,
+      currentModel, // uiSelectedModel - takes highest priority
+      disabledSkills
+    );
+
    const isSisyphusEnabled = pluginConfig.sisyphus_agent?.disabled !== true;
    const builderEnabled =
      pluginConfig.sisyphus_agent?.default_builder_enabled ?? false;
@@ -224,20 +267,6 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
    const replacePlan = pluginConfig.sisyphus_agent?.replace_plan ?? true;
    const shouldDemotePlan = plannerEnabled && replacePlan;

-    type AgentConfig = Record<
-      string,
-      Record<string, unknown> | undefined
-    > & {
-      build?: Record<string, unknown>;
-      plan?: Record<string, unknown>;
-      explore?: { tools?: Record<string, unknown> };
-      librarian?: { tools?: Record<string, unknown> };
-      "multimodal-looker"?: { tools?: Record<string, unknown> };
-      atlas?: { tools?: Record<string, unknown> };
-      sisyphus?: { tools?: Record<string, unknown> };
-    };
-    const configAgent = config.agent as AgentConfig | undefined;
-
    if (isSisyphusEnabled && builtinAgents.sisyphus) {
      (config as { default_agent?: string }).default_agent = "sisyphus";

@@ -385,8 +414,10 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
        : {};

      const planDemoteConfig = shouldDemotePlan
-           ? { mode: "subagent" as const
-          }
+        ? buildPlanDemoteConfig(
+            agentConfig["prometheus"] as Record<string, unknown> | undefined,
+            pluginConfig.agents?.plan as Record<string, unknown> | undefined,
+          )
        : undefined;

      config.agent = {
@@ -433,6 +464,11 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
    // In CLI run mode, deny Question tool for all agents (no TUI to answer questions)
    const isCliRunMode = process.env.OPENCODE_CLI_RUN_MODE === "true";
    const questionPermission = isCliRunMode ? "deny" : "allow";
+
+    // When task system is enabled, deny todowrite/todoread per-agent so models never see them
+    const todoPermission = pluginConfig.experimental?.task_system
+      ? { todowrite: "deny" as const, todoread: "deny" as const }
+      : {};
    
    if (agentResult.librarian) {
      const agent = agentResult.librarian as AgentWithPermission;
@@ -444,23 +480,23 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
    }
    if (agentResult["atlas"]) {
      const agent = agentResult["atlas"] as AgentWithPermission;
-      agent.permission = { ...agent.permission, task: "allow", call_omo_agent: "deny", "task_*": "allow", teammate: "allow" };
+      agent.permission = { ...agent.permission, ...todoPermission, task: "allow", call_omo_agent: "deny", "task_*": "allow", teammate: "allow" };
    }
    if (agentResult.sisyphus) {
      const agent = agentResult.sisyphus as AgentWithPermission;
-      agent.permission = { ...agent.permission, call_omo_agent: "deny", task: "allow", question: questionPermission, "task_*": "allow", teammate: "allow" };
+      agent.permission = { ...agent.permission, ...todoPermission, call_omo_agent: "deny", task: "allow", question: questionPermission, "task_*": "allow", teammate: "allow" };
    }
    if (agentResult.hephaestus) {
      const agent = agentResult.hephaestus as AgentWithPermission;
-      agent.permission = { ...agent.permission, call_omo_agent: "deny", task: "allow", question: questionPermission };
+      agent.permission = { ...agent.permission, ...todoPermission, call_omo_agent: "deny", task: "allow", question: questionPermission };
    }
    if (agentResult["prometheus"]) {
      const agent = agentResult["prometheus"] as AgentWithPermission;
-      agent.permission = { ...agent.permission, call_omo_agent: "deny", task: "allow", question: questionPermission, "task_*": "allow", teammate: "allow" };
+      agent.permission = { ...agent.permission, ...todoPermission, call_omo_agent: "deny", task: "allow", question: questionPermission, "task_*": "allow", teammate: "allow" };
    }
    if (agentResult["sisyphus-junior"]) {
      const agent = agentResult["sisyphus-junior"] as AgentWithPermission;
-      agent.permission = { ...agent.permission, task: "allow", "task_*": "allow", teammate: "allow" };
+      agent.permission = { ...agent.permission, ...todoPermission, task: "allow", "task_*": "allow", teammate: "allow" };
    }

    config.permission = {
--- a/src/plugin-handlers/plan-model-inheritance.test.ts
+++ b/src/plugin-handlers/plan-model-inheritance.test.ts
@@ -0,0 +1,118 @@
+import { describe, test, expect } from "bun:test"
+import { buildPlanDemoteConfig } from "./plan-model-inheritance"
+
+describe("buildPlanDemoteConfig", () => {
+  test("returns only mode when prometheus and plan override are both undefined", () => {
+    //#given
+    const prometheusConfig = undefined
+    const planOverride = undefined
+
+    //#when
+    const result = buildPlanDemoteConfig(prometheusConfig, planOverride)
+
+    //#then
+    expect(result).toEqual({ mode: "subagent" })
+  })
+
+  test("extracts all model settings from prometheus config", () => {
+    //#given
+    const prometheusConfig = {
+      name: "prometheus",
+      model: "anthropic/claude-opus-4-6",
+      variant: "max",
+      mode: "all",
+      prompt: "You are Prometheus...",
+      permission: { edit: "allow" },
+      description: "Plan agent (Prometheus)",
+      color: "#FF5722",
+      temperature: 0.1,
+      top_p: 0.95,
+      maxTokens: 32000,
+      thinking: { type: "enabled", budgetTokens: 10000 },
+      reasoningEffort: "high",
+      textVerbosity: "medium",
+      providerOptions: { key: "value" },
+    }
+
+    //#when
+    const result = buildPlanDemoteConfig(prometheusConfig, undefined)
+
+    //#then - picks model settings, NOT prompt/permission/description/color/name/mode
+    expect(result.mode).toBe("subagent")
+    expect(result.model).toBe("anthropic/claude-opus-4-6")
+    expect(result.variant).toBe("max")
+    expect(result.temperature).toBe(0.1)
+    expect(result.top_p).toBe(0.95)
+    expect(result.maxTokens).toBe(32000)
+    expect(result.thinking).toEqual({ type: "enabled", budgetTokens: 10000 })
+    expect(result.reasoningEffort).toBe("high")
+    expect(result.textVerbosity).toBe("medium")
+    expect(result.providerOptions).toEqual({ key: "value" })
+    expect(result.prompt).toBeUndefined()
+    expect(result.permission).toBeUndefined()
+    expect(result.description).toBeUndefined()
+    expect(result.color).toBeUndefined()
+    expect(result.name).toBeUndefined()
+  })
+
+  test("plan override takes priority over prometheus for all model settings", () => {
+    //#given
+    const prometheusConfig = {
+      model: "anthropic/claude-opus-4-6",
+      variant: "max",
+      temperature: 0.1,
+      reasoningEffort: "high",
+    }
+    const planOverride = {
+      model: "openai/gpt-5.2",
+      variant: "high",
+      temperature: 0.5,
+      reasoningEffort: "low",
+    }
+
+    //#when
+    const result = buildPlanDemoteConfig(prometheusConfig, planOverride)
+
+    //#then
+    expect(result.model).toBe("openai/gpt-5.2")
+    expect(result.variant).toBe("high")
+    expect(result.temperature).toBe(0.5)
+    expect(result.reasoningEffort).toBe("low")
+  })
+
+  test("falls back to prometheus when plan override has partial settings", () => {
+    //#given
+    const prometheusConfig = {
+      model: "anthropic/claude-opus-4-6",
+      variant: "max",
+      temperature: 0.1,
+      reasoningEffort: "high",
+    }
+    const planOverride = {
+      model: "openai/gpt-5.2",
+    }
+
+    //#when
+    const result = buildPlanDemoteConfig(prometheusConfig, planOverride)
+
+    //#then - plan model wins, rest inherits from prometheus
+    expect(result.model).toBe("openai/gpt-5.2")
+    expect(result.variant).toBe("max")
+    expect(result.temperature).toBe(0.1)
+    expect(result.reasoningEffort).toBe("high")
+  })
+
+  test("skips undefined values from both sources", () => {
+    //#given
+    const prometheusConfig = {
+      model: "anthropic/claude-opus-4-6",
+    }
+
+    //#when
+    const result = buildPlanDemoteConfig(prometheusConfig, undefined)
+
+    //#then
+    expect(result).toEqual({ mode: "subagent", model: "anthropic/claude-opus-4-6" })
+    expect(Object.keys(result)).toEqual(["mode", "model"])
+  })
+})
--- a/src/plugin-handlers/plan-model-inheritance.ts
+++ b/src/plugin-handlers/plan-model-inheritance.ts
@@ -0,0 +1,27 @@
+const MODEL_SETTINGS_KEYS = [
+  "model",
+  "variant",
+  "temperature",
+  "top_p",
+  "maxTokens",
+  "thinking",
+  "reasoningEffort",
+  "textVerbosity",
+  "providerOptions",
+] as const
+
+export function buildPlanDemoteConfig(
+  prometheusConfig: Record<string, unknown> | undefined,
+  planOverride: Record<string, unknown> | undefined,
+): Record<string, unknown> {
+  const modelSettings: Record<string, unknown> = {}
+
+  for (const key of MODEL_SETTINGS_KEYS) {
+    const value = planOverride?.[key] ?? prometheusConfig?.[key]
+    if (value !== undefined) {
+      modelSettings[key] = value
+    }
+  }
+
+  return { mode: "subagent" as const, ...modelSettings }
+}
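
Review note on `buildPlanDemoteConfig`: the merge rule relies on `??`, which falls through on `null` as well as `undefined`. The sketch below is a reduced copy of that rule for illustration only. `pickSettings` and its two-key list are hypothetical names, not part of this change. It suggests that a plan override set to `null` would inherit the Prometheus value rather than clear it.

```typescript
// Illustrative, reduced copy of the merge rule; not the shipped function.
const KEYS = ["model", "variant"] as const

function pickSettings(
  base: Record<string, unknown> | undefined,
  override: Record<string, unknown> | undefined,
): Record<string, unknown> {
  const out: Record<string, unknown> = {}
  for (const key of KEYS) {
    // `??` treats both undefined and null on the override as "not set"
    const value = override?.[key] ?? base?.[key]
    if (value !== undefined) out[key] = value
  }
  return { mode: "subagent", ...out }
}

// The override wins for `model`; the null `variant` falls back to the base value.
const inherited = pickSettings(
  { model: "anthropic/claude-opus-4-6", variant: "max" },
  { model: "openai/gpt-5.2", variant: null },
)
```

If clearing an inherited setting is ever needed, the rule would have to distinguish `null` from `undefined` explicitly.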
--- a/src/shared/AGENTS.md
+++ b/src/shared/AGENTS.md
@@ -2,7 +2,7 @@

 ## OVERVIEW

-66 cross-cutting utilities. Import via barrel pattern: `import { log, deepMerge } from "../../shared"`
+88 cross-cutting utilities. Import via barrel pattern: `import { log, deepMerge } from "../../shared"`

 **Categories**: Path resolution, Token truncation, Config parsing, Model resolution, System directives, Tool restrictions

--- a/src/shared/git-worktree/collect-git-diff-stats.test.ts
+++ b/src/shared/git-worktree/collect-git-diff-stats.test.ts
@@ -0,0 +1,66 @@
+/// <reference types="bun-types" />
+
+import { describe, expect, mock, test } from "bun:test"
+
+const execSyncMock = mock(() => {
+  throw new Error("execSync should not be called")
+})
+
+const execFileSyncMock = mock((file: string, args: string[], _opts: { cwd?: string }) => {
+  if (file !== "git") throw new Error(`unexpected file: ${file}`)
+  const subcommand = args[0]
+
+  if (subcommand === "diff") {
+    return "1\t2\tfile.ts\n"
+  }
+
+  if (subcommand === "status") {
+    return " M file.ts\n"
+  }
+
+  throw new Error(`unexpected args: ${args.join(" ")}`)
+})
+
+mock.module("node:child_process", () => ({
+  execSync: execSyncMock,
+  execFileSync: execFileSyncMock,
+}))
+
+const { collectGitDiffStats } = await import("./collect-git-diff-stats")
+
+describe("collectGitDiffStats", () => {
+  test("uses execFileSync with arg arrays (no shell injection)", () => {
+    //#given
+    const directory = "/tmp/safe-repo;touch /tmp/pwn"
+
+    //#when
+    const result = collectGitDiffStats(directory)
+
+    //#then
+    expect(execSyncMock).not.toHaveBeenCalled()
+    expect(execFileSyncMock).toHaveBeenCalledTimes(2)
+
+    const [firstCallFile, firstCallArgs, firstCallOpts] = execFileSyncMock.mock
+      .calls[0]! as unknown as [string, string[], { cwd?: string }]
+    expect(firstCallFile).toBe("git")
+    expect(firstCallArgs).toEqual(["diff", "--numstat", "HEAD"])
+    expect(firstCallOpts.cwd).toBe(directory)
+    expect(firstCallArgs.join(" ")).not.toContain(directory)
+
+    const [secondCallFile, secondCallArgs, secondCallOpts] = execFileSyncMock.mock
+      .calls[1]! as unknown as [string, string[], { cwd?: string }]
+    expect(secondCallFile).toBe("git")
+    expect(secondCallArgs).toEqual(["status", "--porcelain"])
+    expect(secondCallOpts.cwd).toBe(directory)
+    expect(secondCallArgs.join(" ")).not.toContain(directory)
+
+    expect(result).toEqual([
+      {
+        path: "file.ts",
+        added: 1,
+        removed: 2,
+        status: "modified",
+      },
+    ])
+  })
+})
--- a/src/shared/git-worktree/collect-git-diff-stats.ts
+++ b/src/shared/git-worktree/collect-git-diff-stats.ts
@@ -0,0 +1,29 @@
+import { execFileSync } from "node:child_process"
+import { parseGitStatusPorcelain } from "./parse-status-porcelain"
+import { parseGitDiffNumstat } from "./parse-diff-numstat"
+import type { GitFileStat } from "./types"
+
+export function collectGitDiffStats(directory: string): GitFileStat[] {
+  try {
+    const diffOutput = execFileSync("git", ["diff", "--numstat", "HEAD"], {
+      cwd: directory,
+      encoding: "utf-8",
+      timeout: 5000,
+      stdio: ["pipe", "pipe", "pipe"],
+    }).trim()
+
+    if (!diffOutput) return []
+
+    const statusOutput = execFileSync("git", ["status", "--porcelain"], {
+      cwd: directory,
+      encoding: "utf-8",
+      timeout: 5000,
+      stdio: ["pipe", "pipe", "pipe"],
+    }).trim()
+
+    const statusMap = parseGitStatusPorcelain(statusOutput)
+    return parseGitDiffNumstat(diffOutput, statusMap)
+  } catch {
+    return []
+  }
+}
--- a/src/shared/git-worktree/format-file-changes.ts
+++ b/src/shared/git-worktree/format-file-changes.ts
@@ -0,0 +1,46 @@
+import type { GitFileStat } from "./types"
+
+export function formatFileChanges(stats: GitFileStat[], notepadPath?: string): string {
+  if (stats.length === 0) return "[FILE CHANGES SUMMARY]\nNo file changes detected.\n"
+
+  const modified = stats.filter((s) => s.status === "modified")
+  const added = stats.filter((s) => s.status === "added")
+  const deleted = stats.filter((s) => s.status === "deleted")
+
+  const lines: string[] = ["[FILE CHANGES SUMMARY]"]
+
+  if (modified.length > 0) {
+    lines.push("Modified files:")
+    for (const f of modified) {
+      lines.push(`  ${f.path}  (+${f.added}, -${f.removed})`)
+    }
+    lines.push("")
+  }
+
+  if (added.length > 0) {
+    lines.push("Created files:")
+    for (const f of added) {
+      lines.push(`  ${f.path}  (+${f.added})`)
+    }
+    lines.push("")
+  }
+
+  if (deleted.length > 0) {
+    lines.push("Deleted files:")
+    for (const f of deleted) {
+      lines.push(`  ${f.path}  (-${f.removed})`)
+    }
+    lines.push("")
+  }
+
+  if (notepadPath) {
+    const notepadStat = stats.find((s) => s.path.includes("notepad") || s.path.includes(".sisyphus"))
+    if (notepadStat) {
+      lines.push("[NOTEPAD UPDATED]")
+      lines.push(`  ${notepadStat.path}  (+${notepadStat.added})`)
+      lines.push("")
+    }
+  }
+
+  return lines.join("\n")
+}
--- a/src/shared/git-worktree/git-worktree.test.ts
+++ b/src/shared/git-worktree/git-worktree.test.ts
@@ -0,0 +1,51 @@
+/// <reference types="bun-types" />
+
+import { describe, expect, test } from "bun:test"
+import { formatFileChanges, parseGitDiffNumstat, parseGitStatusPorcelain } from "./index"
+
+describe("git-worktree", () => {
+  test("#given status porcelain output #when parsing #then maps paths to statuses", () => {
+    const porcelain = [
+      " M src/a.ts",
+      "A  src/b.ts",
+      "?? src/c.ts",
+      "D  src/d.ts",
+    ].join("\n")
+
+    const map = parseGitStatusPorcelain(porcelain)
+    expect(map.get("src/a.ts")).toBe("modified")
+    expect(map.get("src/b.ts")).toBe("added")
+    expect(map.get("src/c.ts")).toBe("added")
+    expect(map.get("src/d.ts")).toBe("deleted")
+  })
+
+  test("#given diff numstat and status map #when parsing #then returns typed stats", () => {
+    const porcelain = [" M src/a.ts", "A  src/b.ts"].join("\n")
+    const statusMap = parseGitStatusPorcelain(porcelain)
+
+    const numstat = ["1\t2\tsrc/a.ts", "3\t0\tsrc/b.ts", "-\t-\tbin.dat"].join("\n")
+    const stats = parseGitDiffNumstat(numstat, statusMap)
+
+    expect(stats).toEqual([
+      { path: "src/a.ts", added: 1, removed: 2, status: "modified" },
+      { path: "src/b.ts", added: 3, removed: 0, status: "added" },
+      { path: "bin.dat", added: 0, removed: 0, status: "modified" },
+    ])
+  })
+
+  test("#given git file stats #when formatting #then produces grouped summary", () => {
+    const summary = formatFileChanges([
+      { path: "src/a.ts", added: 1, removed: 2, status: "modified" },
+      { path: "src/b.ts", added: 3, removed: 0, status: "added" },
+      { path: "src/c.ts", added: 0, removed: 4, status: "deleted" },
+    ])
+
+    expect(summary).toContain("[FILE CHANGES SUMMARY]")
+    expect(summary).toContain("Modified files:")
+    expect(summary).toContain("Created files:")
+    expect(summary).toContain("Deleted files:")
+    expect(summary).toContain("src/a.ts")
+    expect(summary).toContain("src/b.ts")
+    expect(summary).toContain("src/c.ts")
+  })
+})
--- a/src/shared/git-worktree/index.ts
+++ b/src/shared/git-worktree/index.ts
@@ -0,0 +1,5 @@
+export type { GitFileStatus, GitFileStat } from "./types"
+export { parseGitStatusPorcelain } from "./parse-status-porcelain"
+export { parseGitDiffNumstat } from "./parse-diff-numstat"
+export { collectGitDiffStats } from "./collect-git-diff-stats"
+export { formatFileChanges } from "./format-file-changes"
--- a/src/shared/git-worktree/parse-diff-numstat.ts
+++ b/src/shared/git-worktree/parse-diff-numstat.ts
@@ -0,0 +1,27 @@
+import type { GitFileStat, GitFileStatus } from "./types"
+
+export function parseGitDiffNumstat(
+  output: string,
+  statusMap: Map<string, GitFileStatus>
+): GitFileStat[] {
+  if (!output) return []
+
+  const stats: GitFileStat[] = []
+  for (const line of output.split("\n")) {
+    const parts = line.split("\t")
+    if (parts.length < 3) continue
+
+    const [addedStr, removedStr, path] = parts
+    const added = addedStr === "-" ? 0 : parseInt(addedStr, 10)
+    const removed = removedStr === "-" ? 0 : parseInt(removedStr, 10)
+
+    stats.push({
+      path,
+      added,
+      removed,
+      status: statusMap.get(path) ?? "modified",
+    })
+  }
+
+  return stats
+}
--- a/src/shared/git-worktree/parse-status-porcelain.ts
+++ b/src/shared/git-worktree/parse-status-porcelain.ts
@@ -0,0 +1,25 @@
+import type { GitFileStatus } from "./types"
+
+export function parseGitStatusPorcelain(output: string): Map<string, GitFileStatus> {
+  const map = new Map<string, GitFileStatus>()
+  if (!output) return map
+
+  for (const line of output.split("\n")) {
+    if (!line) continue
+
+    const status = line.substring(0, 2).trim()
+    const filePath = line.substring(3)
+
+    if (!filePath) continue
+
+    if (status === "A" || status === "??") {
+      map.set(filePath, "added")
+    } else if (status === "D") {
+      map.set(filePath, "deleted")
+    } else {
+      map.set(filePath, "modified")
+    }
+  }
+
+  return map
+}
--- a/src/shared/git-worktree/types.ts
+++ b/src/shared/git-worktree/types.ts
@@ -0,0 +1,8 @@
+export type GitFileStatus = "modified" | "added" | "deleted"
+
+export interface GitFileStat {
+  path: string
+  added: number
+  removed: number
+  status: GitFileStatus
+}
--- a/src/shared/index.ts
+++ b/src/shared/index.ts
@@ -41,5 +41,6 @@ export * from "./tmux"
 export * from "./model-suggestion-retry"
 export * from "./opencode-server-auth"
 export * from "./port-utils"
+export * from "./git-worktree"
 export * from "./safe-create-hook"
 export * from "./truncate-description"
--- a/src/shared/migration/config-migration.ts
+++ b/src/shared/migration/config-migration.ts
@@ -8,30 +8,32 @@ export function migrateConfigFile(
  configPath: string,
  rawConfig: Record<string, unknown>
 ): boolean {
+  // Work on a deep copy — only apply changes to rawConfig if file write succeeds
+  const copy = structuredClone(rawConfig)
  let needsWrite = false

  // Load previously applied migrations
-  const existingMigrations = Array.isArray(rawConfig._migrations)
-    ? new Set(rawConfig._migrations as string[])
+  const existingMigrations = Array.isArray(copy._migrations)
+    ? new Set(copy._migrations as string[])
    : new Set<string>()
  const allNewMigrations: string[] = []

-  if (rawConfig.agents && typeof rawConfig.agents === "object") {
-    const { migrated, changed } = migrateAgentNames(rawConfig.agents as Record<string, unknown>)
+  if (copy.agents && typeof copy.agents === "object") {
+    const { migrated, changed } = migrateAgentNames(copy.agents as Record<string, unknown>)
    if (changed) {
-      rawConfig.agents = migrated
+      copy.agents = migrated
      needsWrite = true
    }
  }

  // Migrate model versions in agents (skip already-applied migrations)
-  if (rawConfig.agents && typeof rawConfig.agents === "object") {
+  if (copy.agents && typeof copy.agents === "object") {
    const { migrated, changed, newMigrations } = migrateModelVersions(
-      rawConfig.agents as Record<string, unknown>,
+      copy.agents as Record<string, unknown>,
      existingMigrations
    )
    if (changed) {
-      rawConfig.agents = migrated
+      copy.agents = migrated
      needsWrite = true
      log("Migrated model versions in agents config")
    }
@@ -39,13 +41,13 @@ export function migrateConfigFile(
  }

  // Migrate model versions in categories (skip already-applied migrations)
-  if (rawConfig.categories && typeof rawConfig.categories === "object") {
+  if (copy.categories && typeof copy.categories === "object") {
    const { migrated, changed, newMigrations } = migrateModelVersions(
-      rawConfig.categories as Record<string, unknown>,
+      copy.categories as Record<string, unknown>,
      existingMigrations
    )
    if (changed) {
-      rawConfig.categories = migrated
+      copy.categories = migrated
      needsWrite = true
      log("Migrated model versions in categories config")
    }
@@ -56,20 +58,20 @@ export function migrateConfigFile(
  if (allNewMigrations.length > 0) {
    const updatedMigrations = Array.from(existingMigrations)
    updatedMigrations.push(...allNewMigrations)
-    rawConfig._migrations = updatedMigrations
+    copy._migrations = updatedMigrations
    needsWrite = true
  }

-  if (rawConfig.omo_agent) {
-    rawConfig.sisyphus_agent = rawConfig.omo_agent
-    delete rawConfig.omo_agent
+  if (copy.omo_agent) {
+    copy.sisyphus_agent = copy.omo_agent
+    delete copy.omo_agent
    needsWrite = true
  }

-  if (rawConfig.disabled_agents && Array.isArray(rawConfig.disabled_agents)) {
+  if (copy.disabled_agents && Array.isArray(copy.disabled_agents)) {
    const migrated: string[] = []
    let changed = false
-    for (const agent of rawConfig.disabled_agents as string[]) {
+    for (const agent of copy.disabled_agents as string[]) {
      const newAgent = AGENT_NAME_MAP[agent.toLowerCase()] ?? AGENT_NAME_MAP[agent] ?? agent
      if (newAgent !== agent) {
        changed = true
@@ -77,15 +79,15 @@ export function migrateConfigFile(
      migrated.push(newAgent)
    }
    if (changed) {
-      rawConfig.disabled_agents = migrated
+      copy.disabled_agents = migrated
      needsWrite = true
    }
  }

-  if (rawConfig.disabled_hooks && Array.isArray(rawConfig.disabled_hooks)) {
-    const { migrated, changed, removed } = migrateHookNames(rawConfig.disabled_hooks as string[])
+  if (copy.disabled_hooks && Array.isArray(copy.disabled_hooks)) {
+    const { migrated, changed, removed } = migrateHookNames(copy.disabled_hooks as string[])
    if (changed) {
-      rawConfig.disabled_hooks = migrated
+      copy.disabled_hooks = migrated
      needsWrite = true
    }
    if (removed.length > 0) {
@@ -99,13 +101,25 @@ export function migrateConfigFile(
    try {
      const timestamp = new Date().toISOString().replace(/[:.]/g, "-")
      const backupPath = `${configPath}.bak.${timestamp}`
-      fs.copyFileSync(configPath, backupPath)
+      try {
+        fs.copyFileSync(configPath, backupPath)
+      } catch {
+        // Original file may not exist yet — skip backup
+      }

-      fs.writeFileSync(configPath, JSON.stringify(rawConfig, null, 2) + "\n", "utf-8")
+      fs.writeFileSync(configPath, JSON.stringify(copy, null, 2) + "\n", "utf-8")
      log(`Migrated config file: ${configPath} (backup: ${backupPath})`)
    } catch (err) {
      log(`Failed to write migrated config to ${configPath}:`, err)
+      // File write failed — rawConfig is untouched, preserving user's original values
+      return false
    }
+
+    // File write succeeded — apply changes to the original rawConfig
+    for (const key of Object.keys(rawConfig)) {
+      delete rawConfig[key]
+    }
+    Object.assign(rawConfig, copy)
  }

  return needsWrite
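
Review note on the copy-then-commit change in `migrateConfigFile`: the pattern is to mutate a deep copy, attempt the write, and only then move the result back into the caller's object. The sketch below isolates that pattern under stated assumptions. `commitOnSuccess`, `renameLegacyKey`, and the `persist` callback are hypothetical stand-ins; `persist` takes the place of the `fs.writeFileSync` call.

```typescript
// Hypothetical helper for illustration; `persist` stands in for the file write.
function commitOnSuccess(
  target: Record<string, unknown>,
  migrate: (draft: Record<string, unknown>) => void,
  persist: (next: Record<string, unknown>) => void,
): boolean {
  const draft = structuredClone(target) // all edits happen on the copy
  migrate(draft)
  try {
    persist(draft)
  } catch {
    return false // write failed: target still holds the original values
  }
  // Write succeeded: replace target's contents while keeping the same object
  for (const key of Object.keys(target)) delete target[key]
  Object.assign(target, draft)
  return true
}

const renameLegacyKey = (draft: Record<string, unknown>): void => {
  draft["sisyphus_agent"] = draft["omo_agent"]
  delete draft["omo_agent"]
}

// Failed write: the caller's object keeps its legacy key.
const failedConfig: Record<string, unknown> = { omo_agent: { disabled: false } }
const failedResult = commitOnSuccess(failedConfig, renameLegacyKey, () => {
  throw new Error("simulated write failure")
})

// Successful write: the caller's object now carries the migrated key.
const savedConfig: Record<string, unknown> = { omo_agent: { disabled: false } }
const savedResult = commitOnSuccess(savedConfig, renameLegacyKey, () => {})
```

Deleting the keys before `Object.assign` matters because a migration can remove keys, and `Object.assign` alone would leave them behind.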
--- a/src/shared/model-availability.test.ts
+++ b/src/shared/model-availability.test.ts
@@ -1,26 +1,43 @@
-import { describe, it, expect, beforeEach, afterEach } from "bun:test"
+declare const require: (name: string) => any
+const { describe, it, expect, beforeEach, afterEach, beforeAll } = require("bun:test")
 import { mkdtempSync, writeFileSync, rmSync } from "fs"
 import { tmpdir } from "os"
 import { join } from "path"
-import { fetchAvailableModels, fuzzyMatchModel, getConnectedProviders, __resetModelCache, isModelAvailable } from "./model-availability"
+
+let __resetModelCache: () => void
+let fetchAvailableModels: (client?: unknown, options?: { connectedProviders?: string[] | null }) => Promise<Set<string>>
+let fuzzyMatchModel: (target: string, available: Set<string>, providers?: string[]) => string | null
+let isModelAvailable: (targetModel: string, availableModels: Set<string>) => boolean
+let getConnectedProviders: (client: unknown) => Promise<string[]>
+
+beforeAll(async () => {
+  ;({
+    __resetModelCache,
+    fetchAvailableModels,
+    fuzzyMatchModel,
+    isModelAvailable,
+    getConnectedProviders,
+  } = await import("./model-availability"))
+})

 describe("fetchAvailableModels", () => {
  let tempDir: string
-  let originalXdgCache: string | undefined
+	let originalXdgCache: string | undefined
+

  beforeEach(() => {
    __resetModelCache()
    tempDir = mkdtempSync(join(tmpdir(), "opencode-test-"))
-    originalXdgCache = process.env.XDG_CACHE_HOME
-    process.env.XDG_CACHE_HOME = tempDir
+		originalXdgCache = process.env.XDG_CACHE_HOME
+		process.env.XDG_CACHE_HOME = tempDir
  })

  afterEach(() => {
    if (originalXdgCache !== undefined) {
-      process.env.XDG_CACHE_HOME = originalXdgCache
-    } else {
-      delete process.env.XDG_CACHE_HOME
-    }
+			process.env.XDG_CACHE_HOME = originalXdgCache
+		} else {
+			delete process.env.XDG_CACHE_HOME
+		}
    rmSync(tempDir, { recursive: true, force: true })
  })

--- a/src/shared/model-availability.ts
+++ b/src/shared/model-availability.ts
@@ -2,7 +2,7 @@ import { existsSync, readFileSync } from "fs"
 import { join } from "path"
 import { log } from "./logger"
 import { getOpenCodeCacheDir } from "./data-path"
-import { readProviderModelsCache, hasProviderModelsCache, readConnectedProvidersCache } from "./connected-providers-cache"
+import * as connectedProvidersCache from "./connected-providers-cache"

 /**
 * Fuzzy match a target model name against available models
@@ -181,7 +181,7 @@ export async function fetchAvailableModels(
 	const connectedSet = new Set(connectedProvidersList)
 	const modelSet = new Set<string>()

-	const providerModelsCache = readProviderModelsCache()
+	const providerModelsCache = connectedProvidersCache.readProviderModelsCache()
 	if (providerModelsCache) {
 		const providerCount = Object.keys(providerModelsCache.models).length
 		if (providerCount === 0) {
@@ -189,7 +189,8 @@ export async function fetchAvailableModels(
 		} else {
 		log("[fetchAvailableModels] using provider-models cache (whitelist-filtered)")
 		
-		for (const [providerId, modelIds] of Object.entries(providerModelsCache.models)) {
+		const modelsByProvider = providerModelsCache.models as Record<string, Array<string | { id?: string }>>
+		for (const [providerId, modelIds] of Object.entries(modelsByProvider)) {
 			if (!connectedSet.has(providerId)) {
 				continue
 			}
@@ -300,7 +301,7 @@ export function isAnyFallbackModelAvailable(
 	// Fallback: check if any provider in the chain is connected
 	// This handles race conditions where availableModels is empty or incomplete
 	// but we know the provider is connected.
-	const connectedProviders = readConnectedProvidersCache()
+	const connectedProviders = connectedProvidersCache.readConnectedProvidersCache()
 	if (connectedProviders) {
 		const connectedSet = new Set(connectedProviders)
 		for (const entry of fallbackChain) {
@@ -332,7 +333,7 @@ export function isAnyProviderConnected(
 		}
 	}

-	const connectedProviders = readConnectedProvidersCache()
+	const connectedProviders = connectedProvidersCache.readConnectedProvidersCache()
 	if (connectedProviders) {
 		const connectedSet = new Set(connectedProviders)
 		for (const provider of providers) {
@@ -349,7 +350,7 @@ export function isAnyProviderConnected(
 export function __resetModelCache(): void {}

 export function isModelCacheAvailable(): boolean {
-	if (hasProviderModelsCache()) {
+	if (connectedProvidersCache.hasProviderModelsCache()) {
 		return true
 	}
 	const cacheFile = join(getOpenCodeCacheDir(), "models.json")
--- a/src/shared/model-resolution-pipeline.ts
+++ b/src/shared/model-resolution-pipeline.ts
@@ -1,5 +1,5 @@
 import { log } from "./logger"
-import { readConnectedProvidersCache } from "./connected-providers-cache"
+import * as connectedProvidersCache from "./connected-providers-cache"
 import { fuzzyMatchModel } from "./model-availability"
 import type { FallbackEntry } from "./model-requirements"

@@ -11,6 +11,7 @@ export type ModelResolutionRequest = {
  }
  constraints: {
    availableModels: Set<string>
+    connectedProviders?: string[] | null
  }
  policy?: {
    fallbackChain?: FallbackEntry[]
@@ -73,7 +74,7 @@ export function resolveModelPipeline(
        return { model: match, provenance: "category-default", attempted }
      }
    } else {
-      const connectedProviders = readConnectedProvidersCache()
+      const connectedProviders = constraints.connectedProviders ?? connectedProvidersCache.readConnectedProvidersCache()
      if (connectedProviders === null) {
        log("Model resolved via category default (no cache, first run)", {
          model: normalizedCategoryDefault,
@@ -98,7 +99,7 @@ export function resolveModelPipeline(

  if (fallbackChain && fallbackChain.length > 0) {
    if (availableModels.size === 0) {
-      const connectedProviders = readConnectedProvidersCache()
+      const connectedProviders = constraints.connectedProviders ?? connectedProvidersCache.readConnectedProvidersCache()
      const connectedSet = connectedProviders ? new Set(connectedProviders) : null

      if (connectedSet === null) {
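
Review note on the new `constraints.connectedProviders` field: callers can now inject the provider list instead of having the pipeline read the cache. A minimal sketch of that lookup follows, assuming a hypothetical `resolveConnectedProviders` helper and a `readCache` callback standing in for `readConnectedProvidersCache`. Because the lookup uses `??`, an explicit `null` behaves the same as omitting the field and still triggers the cache read.

```typescript
type Constraints = { connectedProviders?: string[] | null }

// Hypothetical helper mirroring the `??` lookup used in the pipeline.
function resolveConnectedProviders(
  constraints: Constraints,
  readCache: () => string[] | null,
): string[] | null {
  return constraints.connectedProviders ?? readCache()
}

// Stand-in cache reader that counts how often it is consulted.
let cacheReads = 0
const readCache = (): string[] | null => {
  cacheReads += 1
  return ["openai"]
}

// An injected list short-circuits the cache.
const injected = resolveConnectedProviders({ connectedProviders: ["anthropic"] }, readCache)
const readsAfterInjected: number = cacheReads

// An explicit null falls through to the cache, same as leaving the field out.
const fromNull = resolveConnectedProviders({ connectedProviders: null }, readCache)
const readsAfterNull: number = cacheReads
```

This makes the pipeline easier to test in isolation, though a caller cannot use `null` to force the "no cache, first run" branch when a cache file exists.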
--- a/src/tools/AGENTS.md
+++ b/src/tools/AGENTS.md
@@ -2,7 +2,7 @@

 ## OVERVIEW

-25+ tools across 8 categories. Two patterns: Direct ToolDefinition (static) and Factory Function (context-dependent).
+113 tools across 9 categories. Two patterns: Direct ToolDefinition (static) and Factory Function (context-dependent).

 **Categories**: LSP (6), AST-Grep (2), Search (2), Session (4), Task (4), Agent delegation (2), Background (2), Skill (3), System (2)

--- a/src/tools/background-task/index.ts
+++ b/src/tools/background-task/index.ts
@@ -1,8 +1,8 @@
 export {
+  createBackgroundTask,
  createBackgroundOutput,
  createBackgroundCancel,
 } from "./tools"

 export type * from "./types"
 export * from "./constants"
-export type { BackgroundOutputClient, BackgroundOutputManager, BackgroundCancelClient } from "./tools"
--- a/src/tools/background-task/modules/background-cancel.ts
+++ b/src/tools/background-task/modules/background-cancel.ts
@@ -0,0 +1,116 @@
+import { tool, type ToolDefinition } from "@opencode-ai/plugin"
+import type { BackgroundCancelClient } from "../types"
+import type { BackgroundManager } from "../../../features/background-agent"
+import type { BackgroundCancelArgs } from "../types"
+import { BACKGROUND_CANCEL_DESCRIPTION } from "../constants"
+
+export function createBackgroundCancel(manager: BackgroundManager, client: BackgroundCancelClient): ToolDefinition {
+  return tool({
+    description: BACKGROUND_CANCEL_DESCRIPTION,
+    args: {
+      taskId: tool.schema.string().optional().describe("Task ID to cancel (required if all=false)"),
+      all: tool.schema.boolean().optional().describe("Cancel all running background tasks (default: false)"),
+    },
+    async execute(args: BackgroundCancelArgs, toolContext) {
+      try {
+        const cancelAll = args.all === true
+
+        if (!cancelAll && !args.taskId) {
+          return `[ERROR] Invalid arguments: Either provide a taskId or set all=true to cancel all running tasks.`
+        }
+
+        if (cancelAll) {
+          const tasks = manager.getAllDescendantTasks(toolContext.sessionID)
+          const cancellableTasks = tasks.filter((t: any) => t.status === "running" || t.status === "pending")
+
+          if (cancellableTasks.length === 0) {
+            return `No running or pending background tasks to cancel.`
+          }
+
+          const cancelledInfo: Array<{
+            id: string
+            description: string
+            status: string
+            sessionID?: string
+          }> = []
+
+          for (const task of cancellableTasks) {
+            const originalStatus = task.status
+            const cancelled = await manager.cancelTask(task.id, {
+              source: "background_cancel",
+              abortSession: originalStatus === "running",
+              skipNotification: true,
+            })
+            if (!cancelled) continue
+            cancelledInfo.push({
+              id: task.id,
+              description: task.description,
+              status: originalStatus === "pending" ? "pending" : "running",
+              sessionID: task.sessionID,
+            })
+          }
+
+          const tableRows = cancelledInfo
+            .map(t => `| \`${t.id}\` | ${t.description} | ${t.status} | ${t.sessionID ? `\`${t.sessionID}\`` : "(not started)"} |`)
+            .join("\n")
+
+          const resumableTasks = cancelledInfo.filter(t => t.sessionID)
+          const resumeSection = resumableTasks.length > 0
+            ? `\n## Continue Instructions
+
+To continue a cancelled task, use:
+\`\`\`
+task(session_id="<session_id>", prompt="Continue: <your follow-up>")
+\`\`\`
+
+Continuable sessions:
+${resumableTasks.map(t => `- \`${t.sessionID}\` (${t.description})`).join("\n")}`
+            : ""
+
+          return `Cancelled ${cancelledInfo.length} background task(s):
+
+| Task ID | Description | Status | Session ID |
+|---------|-------------|--------|------------|
+${tableRows}
+${resumeSection}`
+        }
+
+        const task = manager.getTask(args.taskId!)
+        if (!task) {
+          return `[ERROR] Task not found: ${args.taskId}`
+        }
+
+        if (task.status !== "running" && task.status !== "pending") {
+          return `[ERROR] Cannot cancel task: current status is "${task.status}".
+Only running or pending tasks can be cancelled.`
+        }
+
+        const cancelled = await manager.cancelTask(task.id, {
+          source: "background_cancel",
+          abortSession: task.status === "running",
+          skipNotification: true,
+        })
+        if (!cancelled) {
+          return `[ERROR] Failed to cancel task: ${task.id}`
+        }
+
+        if (task.status === "pending") {
+          return `Pending task cancelled successfully
+
+Task ID: ${task.id}
+Description: ${task.description}
+Status: ${task.status}`
+        }
+
+        return `Task cancelled successfully
+
+Task ID: ${task.id}
+Description: ${task.description}
+Session ID: ${task.sessionID}
+Status: ${task.status}`
+      } catch (error) {
+        return `[ERROR] Error cancelling task: ${error instanceof Error ? error.message : String(error)}`
+      }
+    },
+  })
+}
--- a/src/tools/background-task/modules/background-output.ts
+++ b/src/tools/background-task/modules/background-output.ts
@@ -0,0 +1,137 @@
+import { tool, type ToolDefinition } from "@opencode-ai/plugin"
+import type { BackgroundOutputManager, BackgroundOutputClient } from "../types"
+import type { BackgroundOutputArgs } from "../types"
+import { BACKGROUND_OUTPUT_DESCRIPTION } from "../constants"
+import { formatTaskStatus, formatTaskResult, formatFullSession } from "./formatters"
+import { delay } from "./utils"
+import { storeToolMetadata } from "../../../features/tool-metadata-store"
+import type { BackgroundTask } from "../../../features/background-agent"
+import type { ToolContextWithMetadata } from "./utils"
+
+const SISYPHUS_JUNIOR_AGENT = "sisyphus-junior"
+
+type ToolContextWithCallId = ToolContextWithMetadata & {
+  callID?: string
+  callId?: string
+  call_id?: string
+}
+
+function resolveToolCallID(ctx: ToolContextWithCallId): string | undefined {
+  if (typeof ctx.callID === "string" && ctx.callID.trim() !== "") {
+    return ctx.callID
+  }
+  if (typeof ctx.callId === "string" && ctx.callId.trim() !== "") {
+    return ctx.callId
+  }
+  if (typeof ctx.call_id === "string" && ctx.call_id.trim() !== "") {
+    return ctx.call_id
+  }
+  return undefined
+}
+
+function formatResolvedTitle(task: BackgroundTask): string {
+  const label = task.agent === SISYPHUS_JUNIOR_AGENT && task.category
+    ? task.category
+    : task.agent
+  return `${label} - ${task.description}`
+}
+
+export function createBackgroundOutput(manager: BackgroundOutputManager, client: BackgroundOutputClient): ToolDefinition {
+  return tool({
+    description: BACKGROUND_OUTPUT_DESCRIPTION,
+    args: {
+      task_id: tool.schema.string().describe("Task ID to get output from"),
+      block: tool.schema.boolean().optional().describe("Wait for completion (default: false). System notifies when done, so blocking is rarely needed."),
+      timeout: tool.schema.number().optional().describe("Max wait time in ms (default: 60000, max: 600000)"),
+      full_session: tool.schema.boolean().optional().describe("Return full session messages with filters (default: false)"),
+      include_thinking: tool.schema.boolean().optional().describe("Include thinking/reasoning parts in full_session output (default: false)"),
+      message_limit: tool.schema.number().optional().describe("Max messages to return (capped at 100)"),
+      since_message_id: tool.schema.string().optional().describe("Return messages after this message ID (exclusive)"),
+      include_tool_results: tool.schema.boolean().optional().describe("Include tool results in full_session output (default: false)"),
+      thinking_max_chars: tool.schema.number().optional().describe("Max characters for thinking content (default: 2000)"),
+    },
+    async execute(args: BackgroundOutputArgs, toolContext) {
+      try {
+        const ctx = toolContext as ToolContextWithCallId
+        const task = manager.getTask(args.task_id)
+        if (!task) {
+          return `Task not found: ${args.task_id}`
+        }
+
+        const resolvedTitle = formatResolvedTitle(task)
+        const meta = {
+          title: resolvedTitle,
+          metadata: {
+            task_id: task.id,
+            agent: task.agent,
+            category: task.category,
+            description: task.description,
+            sessionId: task.sessionID ?? "pending",
+          } as Record<string, unknown>,
+        }
+        await ctx.metadata?.(meta)
+        const callID = resolveToolCallID(ctx)
+        if (callID) {
+          storeToolMetadata(ctx.sessionID, callID, meta)
+        }
+
+        if (args.full_session === true) {
+          return await formatFullSession(task, client, {
+            includeThinking: args.include_thinking === true,
+            messageLimit: args.message_limit,
+            sinceMessageId: args.since_message_id,
+            includeToolResults: args.include_tool_results === true,
+            thinkingMaxChars: args.thinking_max_chars,
+          })
+        }
+
+        const shouldBlock = args.block === true
+        const timeoutMs = Math.min(args.timeout ?? 60000, 600000)
+
+        // Already completed: return result immediately (regardless of block flag)
+        if (task.status === "completed") {
+          return await formatTaskResult(task, client)
+        }
+
+        // Error or cancelled: return status immediately
+        if (task.status === "error" || task.status === "cancelled") {
+          return formatTaskStatus(task)
+        }
+
+        // Non-blocking and still running: return status
+        if (!shouldBlock) {
+          return formatTaskStatus(task)
+        }
+
+        // Blocking: poll until completion or timeout
+        const startTime = Date.now()
+
+        while (Date.now() - startTime < timeoutMs) {
+          await delay(1000)
+
+          const currentTask = manager.getTask(args.task_id)
+          if (!currentTask) {
+            return `Task was deleted: ${args.task_id}`
+          }
+
+          if (currentTask.status === "completed") {
+            return await formatTaskResult(currentTask, client)
+          }
+
+          if (currentTask.status === "error" || currentTask.status === "cancelled") {
+            return formatTaskStatus(currentTask)
+          }
+        }
+
+        // Timeout exceeded: return current status
+        const finalTask = manager.getTask(args.task_id)
+        if (!finalTask) {
+          return `Task was deleted: ${args.task_id}`
+        }
+        return `Timeout exceeded (${timeoutMs}ms). Task still ${finalTask.status}.\n\n${formatTaskStatus(finalTask)}`
+      } catch (error) {
+        return `Error getting output: ${error instanceof Error ? error.message : String(error)}`
+      }
+    },
+  })
+}
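Note on the blocking branch of `background_output` above: the loop sleeps, re-reads the task, and exits on a terminal status, on deletion, or when the timeout elapses. The following is a minimal standalone sketch of that pattern, not the code in this change. `pollUntilSettled`, `PollOptions`, `PollResult`, and the injectable `sleep`/`now` hooks are hypothetical names introduced here for illustration.

```typescript
type TaskStatus = "pending" | "running" | "completed" | "error" | "cancelled"

type PollOptions = {
  timeoutMs: number
  intervalMs: number
  // Hypothetical hooks so the loop can be driven by a fake clock in tests.
  sleep?: (ms: number) => Promise<void>
  now?: () => number
}

type PollResult =
  | { kind: "settled"; status: TaskStatus }
  | { kind: "deleted" }
  | { kind: "timeout" }

const TERMINAL: ReadonlySet<TaskStatus> = new Set<TaskStatus>(["completed", "error", "cancelled"])

async function pollUntilSettled(
  getStatus: () => TaskStatus | undefined,
  options: PollOptions,
): Promise<PollResult> {
  const sleep = options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)))
  const now = options.now ?? (() => Date.now())
  const startedAt = now()

  while (now() - startedAt < options.timeoutMs) {
    // Sleep first, then re-read: the caller has already checked the initial status.
    await sleep(options.intervalMs)
    const status = getStatus()
    if (status === undefined) return { kind: "deleted" }
    if (TERMINAL.has(status)) return { kind: "settled", status }
  }
  return { kind: "timeout" }
}
```

Injecting `sleep` and `now` is a design choice of this sketch only; it lets a test advance a fake clock instead of waiting on real timers.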
--- a/src/tools/background-task/modules/background-task.ts
+++ b/src/tools/background-task/modules/background-task.ts
@@ -0,0 +1,105 @@
+import { tool, type ToolDefinition } from "@opencode-ai/plugin"
+import type { BackgroundManager } from "../../../features/background-agent"
+import type { BackgroundTaskArgs } from "../types"
+import { BACKGROUND_TASK_DESCRIPTION } from "../constants"
+import { findNearestMessageWithFields, findFirstMessageWithAgent } from "../../../features/hook-message-injector"
+import { getSessionAgent } from "../../../features/claude-code-session-state"
+import { log } from "../../../shared/logger"
+import { storeToolMetadata } from "../../../features/tool-metadata-store"
+import { getMessageDir, delay, type ToolContextWithMetadata } from "./utils"
+
+export function createBackgroundTask(manager: BackgroundManager): ToolDefinition {
+  return tool({
+    description: BACKGROUND_TASK_DESCRIPTION,
+    args: {
+      description: tool.schema.string().describe("Short task description (shown in status)"),
+      prompt: tool.schema.string().describe("Full detailed prompt for the agent"),
+      agent: tool.schema.string().describe("Agent type to use (any registered agent)"),
+    },
+    async execute(args: BackgroundTaskArgs, toolContext) {
+      const ctx = toolContext as ToolContextWithMetadata
+
+      if (!args.agent || args.agent.trim() === "") {
+        return `[ERROR] Agent parameter is required. Please specify which agent to use (e.g., "explore", "librarian", "build").`
+      }
+
+      try {
+        const messageDir = getMessageDir(ctx.sessionID)
+        const prevMessage = messageDir ? findNearestMessageWithFields(messageDir) : null
+        const firstMessageAgent = messageDir ? findFirstMessageWithAgent(messageDir) : null
+        const sessionAgent = getSessionAgent(ctx.sessionID)
+        const parentAgent = ctx.agent ?? sessionAgent ?? firstMessageAgent ?? prevMessage?.agent
+        
+        log("[background_task] parentAgent resolution", {
+          sessionID: ctx.sessionID,
+          ctxAgent: ctx.agent,
+          sessionAgent,
+          firstMessageAgent,
+          prevMessageAgent: prevMessage?.agent,
+          resolvedParentAgent: parentAgent,
+        })
+        
+        const parentModel = prevMessage?.model?.providerID && prevMessage?.model?.modelID
+          ? { 
+              providerID: prevMessage.model.providerID, 
+              modelID: prevMessage.model.modelID,
+              ...(prevMessage.model.variant ? { variant: prevMessage.model.variant } : {})
+            }
+          : undefined
+
+        const task = await manager.launch({
+          description: args.description,
+          prompt: args.prompt,
+          agent: args.agent.trim(),
+          parentSessionID: ctx.sessionID,
+          parentMessageID: ctx.messageID,
+          parentModel,
+          parentAgent,
+        })
+
+        const WAIT_FOR_SESSION_INTERVAL_MS = 50
+        const WAIT_FOR_SESSION_TIMEOUT_MS = 30000
+        const waitStart = Date.now()
+        let sessionId = task.sessionID
+        while (!sessionId && Date.now() - waitStart < WAIT_FOR_SESSION_TIMEOUT_MS) {
+          if (ctx.abort?.aborted) {
+            await manager.cancelTask(task.id)
+            return `Task aborted and cancelled while waiting for session to start.\n\nTask ID: ${task.id}`
+          }
+          await delay(WAIT_FOR_SESSION_INTERVAL_MS)
+          const updated = manager.getTask(task.id)
+          if (!updated || updated.status === "error") {
+            return `Task ${!updated ? "was deleted" : "entered error state"}.\n\nTask ID: ${task.id}`
+          }
+          sessionId = updated.sessionID
+        }
+
+        const bgMeta = {
+          title: args.description,
+          metadata: { sessionId: sessionId ?? "pending" } as Record<string, unknown>,
+        }
+        await ctx.metadata?.(bgMeta)
+        const callID = (ctx as any).callID as string | undefined
+        if (callID) {
+          storeToolMetadata(ctx.sessionID, callID, bgMeta)
+        }
+
+        return `Background task launched successfully.
+
+Task ID: ${task.id}
+Session ID: ${sessionId ?? "pending"}
+Description: ${task.description}
+Agent: ${task.agent}
+Status: ${task.status}
+
+The system will notify you when the task completes.
+Use \`background_output\` tool with task_id="${task.id}" to check progress:
+- block=false (default): Check status immediately - returns full status info
+- block=true: Wait for completion (rarely needed since system notifies)`
+      } catch (error) {
+        const message = error instanceof Error ? error.message : String(error)
+        return `[ERROR] Failed to launch background task: ${message}`
+      }
+    },
+  })
+}
--- a/src/tools/background-task/modules/formatters.ts
+++ b/src/tools/background-task/modules/formatters.ts
@@ -0,0 +1,311 @@
+import type { BackgroundTask } from "../../../features/background-agent"
+import type { BackgroundOutputClient } from "../types"
+import { formatDuration, truncateText, formatMessageTime } from "./utils"
+import { extractMessages, getErrorMessage, type BackgroundOutputMessagesResult, type FullSessionMessage, extractToolResultText } from "./message-processing"
+import { consumeNewMessages } from "../../../shared/session-cursor"
+
+const MAX_MESSAGE_LIMIT = 100
+const THINKING_MAX_CHARS = 2000
+
+export function formatTaskStatus(task: BackgroundTask): string {
+  let duration: string
+  if (task.status === "pending" && task.queuedAt) {
+    duration = formatDuration(task.queuedAt, undefined)
+  } else if (task.startedAt) {
+    duration = formatDuration(task.startedAt, task.completedAt)
+  } else {
+    duration = "N/A"
+  }
+  const promptPreview = truncateText(task.prompt, 500)
+  
+  let progressSection = ""
+  if (task.progress?.lastTool) {
+    progressSection = `\n| Last tool | ${task.progress.lastTool} |`
+  }
+
+  let lastMessageSection = ""
+  if (task.progress?.lastMessage) {
+    const truncated = truncateText(task.progress.lastMessage, 500)
+    const messageTime = task.progress.lastMessageAt 
+      ? task.progress.lastMessageAt.toISOString()
+      : "N/A"
+    lastMessageSection = `
+
+## Last Message (${messageTime})
+
+\`\`\`
+${truncated}
+\`\`\``
+  }
+
+  let statusNote = ""
+  if (task.status === "pending") {
+    statusNote = `
+
+> **Queued**: Task is waiting for a concurrency slot to become available.`
+  } else if (task.status === "running") {
+    statusNote = `
+
+> **Note**: No need to wait explicitly - the system will notify you when this task completes.`
+  } else if (task.status === "error") {
+    statusNote = `
+
+> **Failed**: The task encountered an error. Check the last message for details.`
+  }
+
+  const durationLabel = task.status === "pending" ? "Queued for" : "Duration"
+
+  return `# Task Status
+
+| Field | Value |
+|-------|-------|
+| Task ID | \`${task.id}\` |
+| Description | ${task.description} |
+| Agent | ${task.agent} |
+| Status | **${task.status}** |
+| ${durationLabel} | ${duration} |
+| Session ID | \`${task.sessionID ?? "pending"}\` |${progressSection}
+${statusNote}
+## Original Prompt
+
+\`\`\`
+${promptPreview}
+\`\`\`${lastMessageSection}`
+}
+
+export async function formatTaskResult(task: BackgroundTask, client: BackgroundOutputClient): Promise<string> {
+  if (!task.sessionID) {
+    return `Error: Task has no sessionID`
+  }
+  
+  const messagesResult: BackgroundOutputMessagesResult = await client.session.messages({
+    path: { id: task.sessionID },
+  })
+
+  const errorMessage = getErrorMessage(messagesResult)
+  if (errorMessage) {
+    return `Error fetching messages: ${errorMessage}`
+  }
+
+  const messages = extractMessages(messagesResult)
+
+  if (!Array.isArray(messages) || messages.length === 0) {
+    return `Task Result
+
+Task ID: ${task.id}
+Description: ${task.description}
+Duration: ${formatDuration(task.startedAt ?? new Date(), task.completedAt)}
+Session ID: ${task.sessionID}
+
+---
+
+(No messages found)`
+  }
+
+  // Include both assistant messages AND tool messages
+  // Tool results (grep, glob, bash output) come from role "tool"
+  const relevantMessages = messages.filter(
+    (m) => m.info?.role === "assistant" || m.info?.role === "tool"
+  )
+
+  if (relevantMessages.length === 0) {
+    return `Task Result
+
+Task ID: ${task.id}
+Description: ${task.description}
+Duration: ${formatDuration(task.startedAt ?? new Date(), task.completedAt)}
+Session ID: ${task.sessionID}
+
+---
+
+(No assistant or tool response found)`
+  }
+
+  // Sort by time ascending (oldest first) to process messages in order
+  const sortedMessages = [...relevantMessages].sort((a, b) => {
+    const timeA = String((a as { info?: { time?: string } }).info?.time ?? "")
+    const timeB = String((b as { info?: { time?: string } }).info?.time ?? "")
+    return timeA.localeCompare(timeB)
+  })
+  
+  const newMessages = consumeNewMessages(task.sessionID, sortedMessages)
+  if (newMessages.length === 0) {
+    const duration = formatDuration(task.startedAt ?? new Date(), task.completedAt)
+    return `Task Result
+
+Task ID: ${task.id}
+Description: ${task.description}
+Duration: ${duration}
+Session ID: ${task.sessionID}
+
+---
+
+(No new output since last check)`
+  }
+
+  // Extract content from ALL messages, not just the last one
+  // Tool results may be in earlier messages while the final message is empty
+  const extractedContent: string[] = []
+  
+  for (const message of newMessages) {
+    for (const part of message.parts ?? []) {
+      // Handle both "text" and "reasoning" parts (thinking models use "reasoning")
+      if ((part.type === "text" || part.type === "reasoning") && part.text) {
+        extractedContent.push(part.text)
+      } else if (part.type === "tool_result") {
+        // Tool results contain the actual output from tool calls
+        const toolResult = part as { content?: string | Array<{ type: string; text?: string }> }
+        if (typeof toolResult.content === "string" && toolResult.content) {
+          extractedContent.push(toolResult.content)
+        } else if (Array.isArray(toolResult.content)) {
+          // Handle array of content blocks
+          for (const block of toolResult.content) {
+            // Handle both "text" and "reasoning" parts (thinking models use "reasoning")
+            if ((block.type === "text" || block.type === "reasoning") && block.text) {
+              extractedContent.push(block.text)
+            }
+          }
+        }
+      }
+    }
+  }
+  
+  const textContent = extractedContent
+    .filter((text) => text.length > 0)
+    .join("\n\n")
+
+  const duration = formatDuration(task.startedAt ?? new Date(), task.completedAt)
+
+  return `Task Result
+
+Task ID: ${task.id}
+Description: ${task.description}
+Duration: ${duration}
+Session ID: ${task.sessionID}
+
+---
+
+${textContent || "(No text output)"}`
+}
+
+export async function formatFullSession(
+  task: BackgroundTask,
+  client: BackgroundOutputClient,
+  options: {
+    includeThinking: boolean
+    messageLimit?: number
+    sinceMessageId?: string
+    includeToolResults: boolean
+    thinkingMaxChars?: number
+  }
+): Promise<string> {
+  if (!task.sessionID) {
+    return formatTaskStatus(task)
+  }
+
+  const messagesResult: BackgroundOutputMessagesResult = await client.session.messages({
+    path: { id: task.sessionID },
+  })
+
+  const errorMessage = getErrorMessage(messagesResult)
+  if (errorMessage) {
+    return `Error fetching messages: ${errorMessage}`
+  }
+
+  const rawMessages = extractMessages(messagesResult)
+  if (!Array.isArray(rawMessages)) {
+    return "Error fetching messages: invalid response"
+  }
+
+  const sortedMessages = [...(rawMessages as FullSessionMessage[])].sort((a, b) => {
+    const timeA = String(a.info?.time ?? "")
+    const timeB = String(b.info?.time ?? "")
+    return timeA.localeCompare(timeB)
+  })
+
+  let filteredMessages = sortedMessages
+
+  if (options.sinceMessageId) {
+    const index = filteredMessages.findIndex((message) => message.id === options.sinceMessageId)
+    if (index === -1) {
+      return `Error: since_message_id not found: ${options.sinceMessageId}`
+    }
+    filteredMessages = filteredMessages.slice(index + 1)
+  }
+
+  const includeThinking = options.includeThinking
+  const includeToolResults = options.includeToolResults
+  const thinkingMaxChars = options.thinkingMaxChars ?? THINKING_MAX_CHARS
+
+  const normalizedMessages: FullSessionMessage[] = []
+  for (const message of filteredMessages) {
+    const parts = (message.parts ?? []).filter((part) => {
+      if (part.type === "thinking" || part.type === "reasoning") {
+        return includeThinking
+      }
+      if (part.type === "tool_result") {
+        return includeToolResults
+      }
+      return part.type === "text"
+    })
+
+    if (parts.length === 0) {
+      continue
+    }
+
+    normalizedMessages.push({ ...message, parts })
+  }
+
+  const limit = typeof options.messageLimit === "number"
+    ? Math.min(options.messageLimit, MAX_MESSAGE_LIMIT)
+    : undefined
+  const hasMore = limit !== undefined && normalizedMessages.length > limit
+  const visibleMessages = limit !== undefined
+    ? normalizedMessages.slice(0, limit)
+    : normalizedMessages
+
+  const lines: string[] = []
+  lines.push("# Full Session Output")
+  lines.push("")
+  lines.push(`Task ID: ${task.id}`)
+  lines.push(`Description: ${task.description}`)
+  lines.push(`Status: ${task.status}`)
+  lines.push(`Session ID: ${task.sessionID}`)
+  lines.push(`Total messages: ${normalizedMessages.length}`)
+  lines.push(`Returned: ${visibleMessages.length}`)
+  lines.push(`Has more: ${hasMore ? "true" : "false"}`)
+  lines.push("")
+  lines.push("## Messages")
+
+  if (visibleMessages.length === 0) {
+    lines.push("")
+    lines.push("(No messages found)")
+    return lines.join("\n")
+  }
+
+  for (const message of visibleMessages) {
+    const role = message.info?.role ?? "unknown"
+    const agent = message.info?.agent ? ` (${message.info.agent})` : ""
+    const time = formatMessageTime(message.info?.time)
+    const idLabel = message.id ? ` id=${message.id}` : ""
+    lines.push("")
+    lines.push(`[${role}${agent}] ${time}${idLabel}`)
+
+    for (const part of message.parts ?? []) {
+      if (part.type === "text" && part.text) {
+        lines.push(part.text.trim())
+      } else if (part.type === "thinking" && part.thinking) {
+        lines.push(`[thinking] ${truncateText(part.thinking, thinkingMaxChars)}`)
+      } else if (part.type === "reasoning" && part.text) {
+        lines.push(`[thinking] ${truncateText(part.text, thinkingMaxChars)}`)
+      } else if (part.type === "tool_result") {
+        const toolTexts = extractToolResultText(part)
+        for (const toolText of toolTexts) {
+          lines.push(`[tool result] ${toolText}`)
+        }
+      }
+    }
+  }
+
+  return lines.join("\n")
+}
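Note on ordering: both sort comparators in `formatters.ts` compare `String(info.time)`. The `BackgroundOutputMessage` type in `message-processing.ts` also allows `time` to be an object of the form `{ created?: number }`, and `String()` of such an object is `"[object Object]"`, so those messages would all compare as equal. A possible follow-up is sketched below, assuming `created` is epoch milliseconds (which is how `formatMessageTime` treats it). `MessageTime`, `toSortKey`, and `byTimeAscending` are hypothetical helpers and not part of this diff.

```typescript
type MessageTime = string | { created?: number } | undefined

// Map either time shape to epoch milliseconds; unparseable or missing values sort first.
function toSortKey(time: MessageTime): number {
  if (typeof time === "string") {
    const parsed = Date.parse(time)
    return Number.isNaN(parsed) ? 0 : parsed
  }
  if (typeof time === "object" && time !== null && typeof time.created === "number") {
    return time.created
  }
  return 0
}

// Comparator usable with Array.prototype.sort for any message-like object.
function byTimeAscending<T extends { info?: { time?: MessageTime } }>(a: T, b: T): number {
  return toSortKey(a.info?.time) - toSortKey(b.info?.time)
}

const sorted = [
  { id: "b", info: { time: { created: 2000 } } },
  { id: "a", info: { time: { created: 1000 } } },
  { id: "c", info: { time: "1970-01-01T00:00:03.000Z" } },
].sort(byTimeAscending)
```

Here `sorted` ends up ordered `a`, `b`, `c`, because the ISO string parses to 3000 ms and is compared numerically with the two `created` values.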
--- a/src/tools/background-task/modules/message-processing.ts
+++ b/src/tools/background-task/modules/message-processing.ts
@@ -0,0 +1,75 @@
+export type BackgroundOutputMessage = {
+  info?: { role?: string; time?: string | { created?: number }; agent?: string }
+  parts?: Array<{
+    type?: string
+    text?: string
+    content?: string | Array<{ type: string; text?: string }>
+    name?: string
+  }>
+}
+
+export type BackgroundOutputMessagesResult =
+  | { data?: BackgroundOutputMessage[]; error?: unknown }
+  | BackgroundOutputMessage[]
+
+export type FullSessionMessagePart = {
+  type?: string
+  text?: string
+  thinking?: string
+  content?: string | Array<{ type?: string; text?: string }>
+  output?: string
+}
+
+export type FullSessionMessage = {
+  id?: string
+  info?: { role?: string; time?: string; agent?: string }
+  parts?: FullSessionMessagePart[]
+}
+
+export function getErrorMessage(value: BackgroundOutputMessagesResult): string | null {
+  if (Array.isArray(value)) return null
+  if (value.error === undefined || value.error === null) return null
+  if (typeof value.error === "string" && value.error.length > 0) return value.error
+  return String(value.error)
+}
+
+export function isSessionMessage(value: unknown): value is {
+  info?: { role?: string; time?: string }
+  parts?: Array<{
+    type?: string
+    text?: string
+    content?: string | Array<{ type: string; text?: string }>
+    name?: string
+  }>
+} {
+  return typeof value === "object" && value !== null
+}
+
+export function extractMessages(value: BackgroundOutputMessagesResult): BackgroundOutputMessage[] {
+  if (Array.isArray(value)) {
+    return value.filter(isSessionMessage)
+  }
+  if (Array.isArray(value.data)) {
+    return value.data.filter(isSessionMessage)
+  }
+  return []
+}
+
+export function extractToolResultText(part: FullSessionMessagePart): string[] {
+  if (typeof part.content === "string" && part.content.length > 0) {
+    return [part.content]
+  }
+
+  if (Array.isArray(part.content)) {
+    const blocks = part.content
+      .filter((block) => (block.type === "text" || block.type === "reasoning") && block.text)
+      .map((block) => block.text as string)
+    if (blocks.length > 0) return blocks
+  }
+
+  if (part.output && part.output.length > 0) {
+    return [part.output]
+  }
+
+  return []
+}
--- a/src/tools/background-task/modules/utils.ts
+++ b/src/tools/background-task/modules/utils.ts
@@ -0,0 +1,65 @@
+import { existsSync, readdirSync } from "node:fs"
+import { join } from "node:path"
+import { MESSAGE_STORAGE } from "../../../features/hook-message-injector"
+
+export function getMessageDir(sessionID: string): string | null {
+  if (!existsSync(MESSAGE_STORAGE)) return null
+
+  const directPath = join(MESSAGE_STORAGE, sessionID)
+  if (existsSync(directPath)) return directPath
+
+  for (const dir of readdirSync(MESSAGE_STORAGE)) {
+    const sessionPath = join(MESSAGE_STORAGE, dir, sessionID)
+    if (existsSync(sessionPath)) return sessionPath
+  }
+
+  return null
+}
+
+export function formatDuration(start: Date, end?: Date): string {
+  const duration = (end ?? new Date()).getTime() - start.getTime()
+  const seconds = Math.floor(duration / 1000)
+  const minutes = Math.floor(seconds / 60)
+  const hours = Math.floor(minutes / 60)
+
+  if (hours > 0) {
+    return `${hours}h ${minutes % 60}m ${seconds % 60}s`
+  } else if (minutes > 0) {
+    return `${minutes}m ${seconds % 60}s`
+  } else {
+    return `${seconds}s`
+  }
+}
+
+export function truncateText(text: string, maxLength: number): string {
+  if (text.length <= maxLength) return text
+  return text.slice(0, maxLength) + "..."
+}
+
+export function delay(ms: number): Promise<void> {
+  return new Promise(resolve => setTimeout(resolve, ms))
+}
+
+export function formatMessageTime(value: unknown): string {
+  if (typeof value === "string") {
+    const date = new Date(value)
+    return Number.isNaN(date.getTime()) ? value : date.toISOString()
+  }
+  if (typeof value === "object" && value !== null) {
+    if ("created" in value) {
+      const created = (value as { created?: number }).created
+      if (typeof created === "number") {
+        return new Date(created).toISOString()
+      }
+    }
+  }
+  return "Unknown time"
+}
+
+export type ToolContextWithMetadata = {
+  sessionID: string
+  messageID: string
+  agent: string
+  abort: AbortSignal
+  metadata?: (input: { title?: string; metadata?: Record<string, unknown> }) => void
+}
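For reference, `formatDuration` floors the elapsed time to whole seconds and derives minutes and hours from that value. The sketch below restates the same arithmetic as a standalone function over epoch milliseconds, with a few worked values. `formatElapsed` is a hypothetical name used only in this example.

```typescript
function formatElapsed(startMs: number, endMs: number): string {
  const seconds = Math.floor((endMs - startMs) / 1000)
  const minutes = Math.floor(seconds / 60)
  const hours = Math.floor(minutes / 60)

  if (hours > 0) return `${hours}h ${minutes % 60}m ${seconds % 60}s`
  if (minutes > 0) return `${minutes}m ${seconds % 60}s`
  return `${seconds}s`
}

const examples = [
  formatElapsed(0, 999),       // "0s": sub-second durations floor to zero
  formatElapsed(0, 61_000),    // "1m 1s"
  formatElapsed(0, 3_725_000), // "1h 2m 5s"
]
```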
--- a/src/tools/background-task/tools.test.ts
+++ b/src/tools/background-task/tools.test.ts
@@ -1,7 +1,11 @@
+/// <reference types="bun-types" />
+
+import { describe, test, expect } from "bun:test"
 import { createBackgroundCancel, createBackgroundOutput } from "./tools"
 import type { BackgroundManager, BackgroundTask } from "../../features/background-agent"
 import type { ToolContext } from "@opencode-ai/plugin/tool"
 import type { BackgroundCancelClient, BackgroundOutputManager, BackgroundOutputClient } from "./tools"
+import { consumeToolMetadata, clearPendingStore } from "../../features/tool-metadata-store"

 const projectDir = "/Users/yeongyu/local-workspaces/oh-my-opencode"

@@ -49,6 +53,59 @@ function createTask(overrides: Partial<BackgroundTask> = {}): BackgroundTask {
 }

 describe("background_output full_session", () => {
+  test("resolves task_id into title metadata", async () => {
+    // #given
+    clearPendingStore()
+
+    const task = createTask({
+      id: "task-1",
+      agent: "explore",
+      description: "Find how task output is rendered",
+      status: "running",
+    })
+    const manager = createMockManager(task)
+    const client = createMockClient({})
+    const tool = createBackgroundOutput(manager, client)
+    const ctxWithCallId = {
+      ...mockContext,
+      callID: "call-1",
+    } as unknown as ToolContext
+
+    // #when
+    await tool.execute({ task_id: "task-1" }, ctxWithCallId)
+
+    // #then
+    const restored = consumeToolMetadata("test-session", "call-1")
+    expect(restored?.title).toBe("explore - Find how task output is rendered")
+  })
+
+  test("shows category instead of agent for sisyphus-junior", async () => {
+    // #given
+    clearPendingStore()
+
+    const task = createTask({
+      id: "task-1",
+      agent: "sisyphus-junior",
+      category: "quick",
+      description: "Fix flaky test",
+      status: "running",
+    })
+    const manager = createMockManager(task)
+    const client = createMockClient({})
+    const tool = createBackgroundOutput(manager, client)
+    const ctxWithCallId = {
+      ...mockContext,
+      callID: "call-1",
+    } as unknown as ToolContext
+
+    // #when
+    await tool.execute({ task_id: "task-1" }, ctxWithCallId)
+
+    // #then
+    const restored = consumeToolMetadata("test-session", "call-1")
+    expect(restored?.title).toBe("quick - Fix flaky test")
+  })
+
  test("includes thinking and tool results when enabled", async () => {
    // #given
    const task = createTask()
--- a/src/tools/background-task/tools.ts
+++ b/src/tools/background-task/tools.ts
@@ -1,7 +1,5 @@
 import { tool, type ToolDefinition } from "@opencode-ai/plugin"
-import { existsSync, readdirSync } from "node:fs"
-import { join } from "node:path"
-import type { BackgroundManager, BackgroundTask } from "../../features/background-agent"
+import type { BackgroundManager } from "../../features/background-agent"
 import type { BackgroundTaskArgs, BackgroundOutputArgs, BackgroundCancelArgs } from "./types"
 import { BACKGROUND_TASK_DESCRIPTION, BACKGROUND_OUTPUT_DESCRIPTION, BACKGROUND_CANCEL_DESCRIPTION } from "./constants"
 import { findNearestMessageWithFields, findFirstMessageWithAgent, MESSAGE_STORAGE } from "../../features/hook-message-injector"
@@ -10,748 +8,22 @@ import { log } from "../../shared/logger"
 import { consumeNewMessages } from "../../shared/session-cursor"
 import { storeToolMetadata } from "../../features/tool-metadata-store"

-type BackgroundOutputMessage = {
-  info?: { role?: string; time?: string | { created?: number }; agent?: string }
-  parts?: Array<{
-    type?: string
-    text?: string
-    content?: string | Array<{ type: string; text?: string }>
-    name?: string
-  }>
-}
-
-type BackgroundOutputMessagesResult =
-  | { data?: BackgroundOutputMessage[]; error?: unknown }
-  | BackgroundOutputMessage[]
-
-export type BackgroundOutputClient = {
-  session: {
-    messages: (args: { path: { id: string } }) => Promise<BackgroundOutputMessagesResult>
-  }
-}
-
-export type BackgroundCancelClient = {
-  session: {
-    abort: (args: { path: { id: string } }) => Promise<unknown>
-  }
-}
-
-export type BackgroundOutputManager = Pick<BackgroundManager, "getTask">
-
-const MAX_MESSAGE_LIMIT = 100
-const THINKING_MAX_CHARS = 2000
-
-type FullSessionMessagePart = {
-  type?: string
-  text?: string
-  thinking?: string
-  content?: string | Array<{ type?: string; text?: string }>
-  output?: string
-}
-
-type FullSessionMessage = {
-  id?: string
-  info?: { role?: string; time?: string; agent?: string }
-  parts?: FullSessionMessagePart[]
-}
-
-function getMessageDir(sessionID: string): string | null {
-  if (!existsSync(MESSAGE_STORAGE)) return null
-
-  const directPath = join(MESSAGE_STORAGE, sessionID)
-  if (existsSync(directPath)) return directPath
-
-  for (const dir of readdirSync(MESSAGE_STORAGE)) {
-    const sessionPath = join(MESSAGE_STORAGE, dir, sessionID)
-    if (existsSync(sessionPath)) return sessionPath
-  }
-
-  return null
-}
-
-function formatDuration(start: Date, end?: Date): string {
-  const duration = (end ?? new Date()).getTime() - start.getTime()
-  const seconds = Math.floor(duration / 1000)
-  const minutes = Math.floor(seconds / 60)
-  const hours = Math.floor(minutes / 60)
-
-  if (hours > 0) {
-    return `${hours}h ${minutes % 60}m ${seconds % 60}s`
-  } else if (minutes > 0) {
-    return `${minutes}m ${seconds % 60}s`
-  } else {
-    return `${seconds}s`
-  }
-}
-
-type ToolContextWithMetadata = {
-  sessionID: string
-  messageID: string
-  agent: string
-  abort: AbortSignal
-  metadata?: (input: { title?: string; metadata?: Record<string, unknown> }) => void
-}
-
-export function createBackgroundTask(manager: BackgroundManager): ToolDefinition {
-  return tool({
-    description: BACKGROUND_TASK_DESCRIPTION,
-    args: {
-      description: tool.schema.string().describe("Short task description (shown in status)"),
-      prompt: tool.schema.string().describe("Full detailed prompt for the agent"),
-      agent: tool.schema.string().describe("Agent type to use (any registered agent)"),
-    },
-    async execute(args: BackgroundTaskArgs, toolContext) {
-      const ctx = toolContext as ToolContextWithMetadata
-
-      if (!args.agent || args.agent.trim() === "") {
-        return `[ERROR] Agent parameter is required. Please specify which agent to use (e.g., "explore", "librarian", "build", etc.)`
-      }
-
-      try {
-        const messageDir = getMessageDir(ctx.sessionID)
-        const prevMessage = messageDir ? findNearestMessageWithFields(messageDir) : null
-        const firstMessageAgent = messageDir ? findFirstMessageWithAgent(messageDir) : null
-        const sessionAgent = getSessionAgent(ctx.sessionID)
-        const parentAgent = ctx.agent ?? sessionAgent ?? firstMessageAgent ?? prevMessage?.agent
-        
-        log("[background_task] parentAgent resolution", {
-          sessionID: ctx.sessionID,
-          ctxAgent: ctx.agent,
-          sessionAgent,
-          firstMessageAgent,
-          prevMessageAgent: prevMessage?.agent,
-          resolvedParentAgent: parentAgent,
-        })
-        
-        const parentModel = prevMessage?.model?.providerID && prevMessage?.model?.modelID
-          ? { 
-              providerID: prevMessage.model.providerID, 
-              modelID: prevMessage.model.modelID,
-              ...(prevMessage.model.variant ? { variant: prevMessage.model.variant } : {})
-            }
-          : undefined
-
-        const task = await manager.launch({
-          description: args.description,
-          prompt: args.prompt,
-          agent: args.agent.trim(),
-          parentSessionID: ctx.sessionID,
-          parentMessageID: ctx.messageID,
-          parentModel,
-          parentAgent,
-        })
-
-        const WAIT_FOR_SESSION_INTERVAL_MS = 50
-        const WAIT_FOR_SESSION_TIMEOUT_MS = 30000
-        const waitStart = Date.now()
-        let sessionId = task.sessionID
-        while (!sessionId && Date.now() - waitStart < WAIT_FOR_SESSION_TIMEOUT_MS) {
-          if (ctx.abort?.aborted) {
-            await manager.cancelTask(task.id)
-            return `Task aborted and cancelled while waiting for session to start.\n\nTask ID: ${task.id}`
-          }
-          await delay(WAIT_FOR_SESSION_INTERVAL_MS)
-          const updated = manager.getTask(task.id)
-          if (!updated || updated.status === "error") {
-            return `Task ${!updated ? "was deleted" : `entered error state`}.\n\nTask ID: ${task.id}`
-          }
-          sessionId = updated?.sessionID
-        }
-
-        const bgMeta = {
-          title: args.description,
-          metadata: { sessionId: sessionId ?? "pending" } as Record<string, unknown>,
-        }
-        await ctx.metadata?.(bgMeta)
-        const callID = (ctx as any).callID as string | undefined
-        if (callID) {
-          storeToolMetadata(ctx.sessionID, callID, bgMeta)
-        }
-
-        return `Background task launched successfully.
-
-Task ID: ${task.id}
-Session ID: ${sessionId ?? "pending"}
-Description: ${task.description}
-Agent: ${task.agent}
-Status: ${task.status}
-
-The system will notify you when the task completes.
-Use \`background_output\` tool with task_id="${task.id}" to check progress:
-- block=false (default): Check status immediately - returns full status info
-- block=true: Wait for completion (rarely needed since system notifies)`
-      } catch (error) {
-        const message = error instanceof Error ? error.message : String(error)
-        return `[ERROR] Failed to launch background task: ${message}`
-      }
-    },
-  })
-}
-
-function delay(ms: number): Promise<void> {
-  return new Promise(resolve => setTimeout(resolve, ms))
-}
-
-function truncateText(text: string, maxLength: number): string {
-  if (text.length <= maxLength) return text
-  return text.slice(0, maxLength) + "..."
-}
-
-function formatTaskStatus(task: BackgroundTask): string {
-  let duration: string
-  if (task.status === "pending" && task.queuedAt) {
-    duration = formatDuration(task.queuedAt, undefined)
-  } else if (task.startedAt) {
-    duration = formatDuration(task.startedAt, task.completedAt)
-  } else {
-    duration = "N/A"
-  }
-  const promptPreview = truncateText(task.prompt, 500)
-  
-  let progressSection = ""
-  if (task.progress?.lastTool) {
-    progressSection = `\n| Last tool | ${task.progress.lastTool} |`
-  }
-
-  let lastMessageSection = ""
-  if (task.progress?.lastMessage) {
-    const truncated = truncateText(task.progress.lastMessage, 500)
-    const messageTime = task.progress.lastMessageAt 
-      ? task.progress.lastMessageAt.toISOString()
-      : "N/A"
-    lastMessageSection = `
-
-## Last Message (${messageTime})
-
-\`\`\`
-${truncated}
-\`\`\``
-  }
-
-  let statusNote = ""
-  if (task.status === "pending") {
-    statusNote = `
-
-> **Queued**: Task is waiting for a concurrency slot to become available.`
-  } else if (task.status === "running") {
-    statusNote = `
-
-> **Note**: No need to wait explicitly - the system will notify you when this task completes.`
-  } else if (task.status === "error") {
-    statusNote = `
-
-> **Failed**: The task encountered an error. Check the last message for details.`
-  }
-
-  const durationLabel = task.status === "pending" ? "Queued for" : "Duration"
-
-  return `# Task Status
-
-| Field | Value |
-|-------|-------|
-| Task ID | \`${task.id}\` |
-| Description | ${task.description} |
-| Agent | ${task.agent} |
-| Status | **${task.status}** |
-| ${durationLabel} | ${duration} |
-| Session ID | \`${task.sessionID}\` |${progressSection}
-${statusNote}
-## Original Prompt
-
-\`\`\`
-${promptPreview}
-\`\`\`${lastMessageSection}`
-}
-
-function getErrorMessage(value: BackgroundOutputMessagesResult): string | null {
-  if (Array.isArray(value)) return null
-  if (value.error === undefined || value.error === null) return null
-  if (typeof value.error === "string" && value.error.length > 0) return value.error
-  return String(value.error)
-}
-
-function isSessionMessage(value: unknown): value is {
-  info?: { role?: string; time?: string }
-  parts?: Array<{
-    type?: string
-    text?: string
-    content?: string | Array<{ type: string; text?: string }>
-    name?: string
-  }>
-} {
-  return typeof value === "object" && value !== null
-}
-
-function extractMessages(value: BackgroundOutputMessagesResult): BackgroundOutputMessage[] {
-  if (Array.isArray(value)) {
-    return value.filter(isSessionMessage)
-  }
-  if (Array.isArray(value.data)) {
-    return value.data.filter(isSessionMessage)
-  }
-  return []
-}
-
-async function formatTaskResult(task: BackgroundTask, client: BackgroundOutputClient): Promise<string> {
-  if (!task.sessionID) {
-    return `Error: Task has no sessionID`
-  }
-  
-  const messagesResult: BackgroundOutputMessagesResult = await client.session.messages({
-    path: { id: task.sessionID },
-  })
-
-  const errorMessage = getErrorMessage(messagesResult)
-  if (errorMessage) {
-    return `Error fetching messages: ${errorMessage}`
-  }
-
-  const messages = extractMessages(messagesResult)
-
-  if (!Array.isArray(messages) || messages.length === 0) {
-    return `Task Result
-
-Task ID: ${task.id}
-Description: ${task.description}
-Duration: ${formatDuration(task.startedAt ?? new Date(), task.completedAt)}
-Session ID: ${task.sessionID}
-
---
-
-(No messages found)`
-  }
-
-  // Include both assistant messages AND tool messages
-  // Tool results (grep, glob, bash output) come from role "tool"
-  const relevantMessages = messages.filter(
-    (m) => m.info?.role === "assistant" || m.info?.role === "tool"
-  )
-
-  if (relevantMessages.length === 0) {
-    return `Task Result
-
-Task ID: ${task.id}
-Description: ${task.description}
-Duration: ${formatDuration(task.startedAt ?? new Date(), task.completedAt)}
-Session ID: ${task.sessionID}
-
---
-
-(No assistant or tool response found)`
-  }
-
-  // Sort by time ascending (oldest first) to process messages in order
-  const sortedMessages = [...relevantMessages].sort((a, b) => {
-    const timeA = String((a as { info?: { time?: string } }).info?.time ?? "")
-    const timeB = String((b as { info?: { time?: string } }).info?.time ?? "")
-    return timeA.localeCompare(timeB)
-  })
-  
-  const newMessages = consumeNewMessages(task.sessionID, sortedMessages)
-  if (newMessages.length === 0) {
-    const duration = formatDuration(task.startedAt ?? new Date(), task.completedAt)
-    return `Task Result
-
-Task ID: ${task.id}
-Description: ${task.description}
-Duration: ${duration}
-Session ID: ${task.sessionID}
-
---
-
-(No new output since last check)`
-  }
-
-  // Extract content from ALL messages, not just the last one
-  // Tool results may be in earlier messages while the final message is empty
-  const extractedContent: string[] = []
-  
-  for (const message of newMessages) {
-    for (const part of message.parts ?? []) {
-      // Handle both "text" and "reasoning" parts (thinking models use "reasoning")
-      if ((part.type === "text" || part.type === "reasoning") && part.text) {
-        extractedContent.push(part.text)
-      } else if (part.type === "tool_result") {
-        // Tool results contain the actual output from tool calls
-        const toolResult = part as { content?: string | Array<{ type: string; text?: string }> }
-        if (typeof toolResult.content === "string" && toolResult.content) {
-          extractedContent.push(toolResult.content)
-        } else if (Array.isArray(toolResult.content)) {
-          // Handle array of content blocks
-          for (const block of toolResult.content) {
-            // Handle both "text" and "reasoning" parts (thinking models use "reasoning")
-            if ((block.type === "text" || block.type === "reasoning") && block.text) {
-              extractedContent.push(block.text)
-            }
-          }
-        }
-      }
-    }
-  }
-  
-  const textContent = extractedContent
-    .filter((text) => text.length > 0)
-    .join("\n\n")
-
-  const duration = formatDuration(task.startedAt ?? new Date(), task.completedAt)
-
-  return `Task Result
-
-Task ID: ${task.id}
-Description: ${task.description}
-Duration: ${duration}
-Session ID: ${task.sessionID}
-
---
-
-${textContent || "(No text output)"}`
-}
-
-function extractToolResultText(part: FullSessionMessagePart): string[] {
-  if (typeof part.content === "string" && part.content.length > 0) {
-    return [part.content]
-  }
-
-  if (Array.isArray(part.content)) {
-    const blocks = part.content
-      .filter((block) => (block.type === "text" || block.type === "reasoning") && block.text)
-      .map((block) => block.text as string)
-    if (blocks.length > 0) return blocks
-  }
-
-  if (part.output && part.output.length > 0) {
-    return [part.output]
-  }
-
-  return []
-}
-
-async function formatFullSession(
-  task: BackgroundTask,
-  client: BackgroundOutputClient,
-  options: {
-    includeThinking: boolean
-    messageLimit?: number
-    sinceMessageId?: string
-    includeToolResults: boolean
-    thinkingMaxChars?: number
-  }
-): Promise<string> {
-  if (!task.sessionID) {
-    return formatTaskStatus(task)
-  }
-
-  const messagesResult: BackgroundOutputMessagesResult = await client.session.messages({
-    path: { id: task.sessionID },
-  })
-
-  const errorMessage = getErrorMessage(messagesResult)
-  if (errorMessage) {
-    return `Error fetching messages: ${errorMessage}`
-  }
-
-  const rawMessages = extractMessages(messagesResult)
-  if (!Array.isArray(rawMessages)) {
-    return "Error fetching messages: invalid response"
-  }
-
-  const sortedMessages = [...(rawMessages as FullSessionMessage[])].sort((a, b) => {
-    const timeA = String(a.info?.time ?? "")
-    const timeB = String(b.info?.time ?? "")
-    return timeA.localeCompare(timeB)
-  })
-
-  let filteredMessages = sortedMessages
-
-  if (options.sinceMessageId) {
-    const index = filteredMessages.findIndex((message) => message.id === options.sinceMessageId)
-    if (index === -1) {
-      return `Error: since_message_id not found: ${options.sinceMessageId}`
-    }
-    filteredMessages = filteredMessages.slice(index + 1)
-  }
-
-  const includeThinking = options.includeThinking
-  const includeToolResults = options.includeToolResults
-  const thinkingMaxChars = options.thinkingMaxChars ?? THINKING_MAX_CHARS
-
-  const normalizedMessages: FullSessionMessage[] = []
-  for (const message of filteredMessages) {
-    const parts = (message.parts ?? []).filter((part) => {
-      if (part.type === "thinking" || part.type === "reasoning") {
-        return includeThinking
-      }
-      if (part.type === "tool_result") {
-        return includeToolResults
-      }
-      return part.type === "text"
-    })
-
-    if (parts.length === 0) {
-      continue
-    }
-
-    normalizedMessages.push({ ...message, parts })
-  }
-
-  const limit = typeof options.messageLimit === "number"
-    ? Math.min(options.messageLimit, MAX_MESSAGE_LIMIT)
-    : undefined
-  const hasMore = limit !== undefined && normalizedMessages.length > limit
-  const visibleMessages = limit !== undefined
-    ? normalizedMessages.slice(0, limit)
-    : normalizedMessages
-
-  const lines: string[] = []
-  lines.push("# Full Session Output")
-  lines.push("")
-  lines.push(`Task ID: ${task.id}`)
-  lines.push(`Description: ${task.description}`)
-  lines.push(`Status: ${task.status}`)
-  lines.push(`Session ID: ${task.sessionID}`)
-  lines.push(`Total messages: ${normalizedMessages.length}`)
-  lines.push(`Returned: ${visibleMessages.length}`)
-  lines.push(`Has more: ${hasMore ? "true" : "false"}`)
-  lines.push("")
-  lines.push("## Messages")
-
-  if (visibleMessages.length === 0) {
-    lines.push("")
-    lines.push("(No messages found)")
-    return lines.join("\n")
-  }
-
-  for (const message of visibleMessages) {
-    const role = message.info?.role ?? "unknown"
-    const agent = message.info?.agent ? ` (${message.info.agent})` : ""
-    const time = formatMessageTime(message.info?.time)
-    const idLabel = message.id ? ` id=${message.id}` : ""
-    lines.push("")
-    lines.push(`[${role}${agent}] ${time}${idLabel}`)
-
-    for (const part of message.parts ?? []) {
-      if (part.type === "text" && part.text) {
-        lines.push(part.text.trim())
-      } else if (part.type === "thinking" && part.thinking) {
-        lines.push(`[thinking] ${truncateText(part.thinking, thinkingMaxChars)}`)
-      } else if (part.type === "reasoning" && part.text) {
-        lines.push(`[thinking] ${truncateText(part.text, thinkingMaxChars)}`)
-      } else if (part.type === "tool_result") {
-        const toolTexts = extractToolResultText(part)
-        for (const toolText of toolTexts) {
-          lines.push(`[tool result] ${toolText}`)
-        }
-      }
-    }
-  }
-
-  return lines.join("\n")
-}
-
-export function createBackgroundOutput(manager: BackgroundOutputManager, client: BackgroundOutputClient): ToolDefinition {
-  return tool({
-    description: BACKGROUND_OUTPUT_DESCRIPTION,
-    args: {
-      task_id: tool.schema.string().describe("Task ID to get output from"),
-      block: tool.schema.boolean().optional().describe("Wait for completion (default: false). System notifies when done, so blocking is rarely needed."),
-      timeout: tool.schema.number().optional().describe("Max wait time in ms (default: 60000, max: 600000)"),
-      full_session: tool.schema.boolean().optional().describe("Return full session messages with filters (default: false)"),
-      include_thinking: tool.schema.boolean().optional().describe("Include thinking/reasoning parts in full_session output (default: false)"),
-      message_limit: tool.schema.number().optional().describe("Max messages to return (capped at 100)"),
-      since_message_id: tool.schema.string().optional().describe("Return messages after this message ID (exclusive)"),
-      include_tool_results: tool.schema.boolean().optional().describe("Include tool results in full_session output (default: false)"),
-      thinking_max_chars: tool.schema.number().optional().describe("Max characters for thinking content (default: 2000)"),
-    },
-    async execute(args: BackgroundOutputArgs) {
-      try {
-        const task = manager.getTask(args.task_id)
-        if (!task) {
-          return `Task not found: ${args.task_id}`
-        }
-
-        if (args.full_session === true) {
-          return await formatFullSession(task, client, {
-            includeThinking: args.include_thinking === true,
-            messageLimit: args.message_limit,
-            sinceMessageId: args.since_message_id,
-            includeToolResults: args.include_tool_results === true,
-            thinkingMaxChars: args.thinking_max_chars,
-          })
-        }
-
-        const shouldBlock = args.block === true
-        const timeoutMs = Math.min(args.timeout ?? 60000, 600000)
-
-        // Already completed: return result immediately (regardless of block flag)
-        if (task.status === "completed") {
-          return await formatTaskResult(task, client)
-        }
-
-        // Error or cancelled: return status immediately
-        if (task.status === "error" || task.status === "cancelled") {
-          return formatTaskStatus(task)
-        }
-
-        // Non-blocking and still running: return status
-        if (!shouldBlock) {
-          return formatTaskStatus(task)
-        }
-
-        // Blocking: poll until completion or timeout
-        const startTime = Date.now()
-
-        while (Date.now() - startTime < timeoutMs) {
-          await delay(1000)
-
-          const currentTask = manager.getTask(args.task_id)
-          if (!currentTask) {
-            return `Task was deleted: ${args.task_id}`
-          }
-
-          if (currentTask.status === "completed") {
-            return await formatTaskResult(currentTask, client)
-          }
-
-          if (currentTask.status === "error" || currentTask.status === "cancelled") {
-            return formatTaskStatus(currentTask)
-          }
-        }
-
-        // Timeout exceeded: return current status
-        const finalTask = manager.getTask(args.task_id)
-        if (!finalTask) {
-          return `Task was deleted: ${args.task_id}`
-        }
-        return `Timeout exceeded (${timeoutMs}ms). Task still ${finalTask.status}.\n\n${formatTaskStatus(finalTask)}`
-      } catch (error) {
-        return `Error getting output: ${error instanceof Error ? error.message : String(error)}`
-      }
-    },
-  })
-}
-
-export function createBackgroundCancel(manager: BackgroundManager, client: BackgroundCancelClient): ToolDefinition {
-  return tool({
-    description: BACKGROUND_CANCEL_DESCRIPTION,
-    args: {
-      taskId: tool.schema.string().optional().describe("Task ID to cancel (required if all=false)"),
-      all: tool.schema.boolean().optional().describe("Cancel all running background tasks (default: false)"),
-    },
-    async execute(args: BackgroundCancelArgs, toolContext) {
-      try {
-        const cancelAll = args.all === true
-
-        if (!cancelAll && !args.taskId) {
-          return `[ERROR] Invalid arguments: Either provide a taskId or set all=true to cancel all running tasks.`
-        }
-
-        if (cancelAll) {
-          const tasks = manager.getAllDescendantTasks(toolContext.sessionID)
-          const cancellableTasks = tasks.filter(t => t.status === "running" || t.status === "pending")
-
-          if (cancellableTasks.length === 0) {
-            return `No running or pending background tasks to cancel.`
-          }
-
-          const cancelledInfo: Array<{
-            id: string
-            description: string
-            status: string
-            sessionID?: string
-          }> = []
-
-          for (const task of cancellableTasks) {
-            const originalStatus = task.status
-            const cancelled = await manager.cancelTask(task.id, {
-              source: "background_cancel",
-              abortSession: originalStatus === "running",
-              skipNotification: true,
-            })
-            if (!cancelled) continue
-            cancelledInfo.push({
-              id: task.id,
-              description: task.description,
-              status: originalStatus === "pending" ? "pending" : "running",
-              sessionID: task.sessionID,
-            })
-          }
-
-          const tableRows = cancelledInfo
-            .map(t => `| \`${t.id}\` | ${t.description} | ${t.status} | ${t.sessionID ? `\`${t.sessionID}\`` : "(not started)"} |`)
-            .join("\n")
-
-           const resumableTasks = cancelledInfo.filter(t => t.sessionID)
-           const resumeSection = resumableTasks.length > 0
-             ? `\n## Continue Instructions
-
-To continue a cancelled task, use:
-\`\`\`
-task(session_id="<session_id>", prompt="Continue: <your follow-up>")
-\`\`\`
-
-Continuable sessions:
-${resumableTasks.map(t => `- \`${t.sessionID}\` (${t.description})`).join("\n")}`
-             : ""
-
-          return `Cancelled ${cancelledInfo.length} background task(s):
-
-| Task ID | Description | Status | Session ID |
-|---------|-------------|--------|------------|
-${tableRows}
-${resumeSection}`
-        }
-
-        const task = manager.getTask(args.taskId!)
-        if (!task) {
-          return `[ERROR] Task not found: ${args.taskId}`
-        }
-
-        if (task.status !== "running" && task.status !== "pending") {
-          return `[ERROR] Cannot cancel task: current status is "${task.status}".
-Only running or pending tasks can be cancelled.`
-        }
-
-        const cancelled = await manager.cancelTask(task.id, {
-          source: "background_cancel",
-          abortSession: task.status === "running",
-          skipNotification: true,
-        })
-        if (!cancelled) {
-          return `[ERROR] Failed to cancel task: ${task.id}`
-        }
-
-        if (task.status === "pending") {
-          return `Pending task cancelled successfully
-
-Task ID: ${task.id}
-Description: ${task.description}
-Status: ${task.status}`
-        }
-
-        return `Task cancelled successfully
-
-Task ID: ${task.id}
-Description: ${task.description}
-Session ID: ${task.sessionID}
-Status: ${task.status}`
-      } catch (error) {
-        return `[ERROR] Error cancelling task: ${error instanceof Error ? error.message : String(error)}`
-      }
-    },
-  })
-}
-function formatMessageTime(value: unknown): string {
-  if (typeof value === "string") {
-    const date = new Date(value)
-    return Number.isNaN(date.getTime()) ? value : date.toISOString()
-  }
-  if (typeof value === "object" && value !== null) {
-    if ("created" in value) {
-      const created = (value as { created?: number }).created
-      if (typeof created === "number") {
-        return new Date(created).toISOString()
-      }
-    }
-  }
-  return "Unknown time"
-}
+// Re-export types and functions from modules
+export { createBackgroundTask } from "./modules/background-task"
+export { createBackgroundOutput } from "./modules/background-output"
+export { createBackgroundCancel } from "./modules/background-cancel"
+export type {
+  BackgroundOutputMessage,
+  BackgroundOutputMessagesResult,
+  BackgroundOutputClient,
+  BackgroundCancelClient,
+  BackgroundOutputManager,
+  FullSessionMessagePart,
+  FullSessionMessage,
+  ToolContextWithMetadata,
+} from "./types"
+
+// Legacy exports for backward compatibility - these will be removed once all imports are updated
+export { formatDuration, truncateText, delay, formatMessageTime } from "./modules/utils"
+export { getErrorMessage, isSessionMessage, extractMessages, extractToolResultText } from "./modules/message-processing"
+export { formatTaskStatus, formatTaskResult, formatFullSession } from "./modules/formatters"
--- a/src/tools/background-task/types.ts
+++ b/src/tools/background-task/types.ts
@@ -20,3 +20,53 @@ export interface BackgroundCancelArgs {
  taskId?: string
  all?: boolean
 }
+
+export type BackgroundOutputMessage = {
+  info?: { role?: string; time?: string | { created?: number }; agent?: string }
+  parts?: Array<{
+    type?: string
+    text?: string
+    content?: string | Array<{ type: string; text?: string }>
+    name?: string
+  }>
+}
+
+export type BackgroundOutputMessagesResult =
+  | { data?: BackgroundOutputMessage[]; error?: unknown }
+  | BackgroundOutputMessage[]
+
+export type BackgroundOutputClient = {
+  session: {
+    messages: (args: { path: { id: string } }) => Promise<BackgroundOutputMessagesResult>
+  }
+}
+
+export type BackgroundCancelClient = {
+  session: {
+    abort: (args: { path: { id: string } }) => Promise<unknown>
+  }
+}
+
+export type BackgroundOutputManager = Pick<import("../../features/background-agent").BackgroundManager, "getTask">
+
+export type FullSessionMessagePart = {
+  type?: string
+  text?: string
+  thinking?: string
+  content?: string | Array<{ type?: string; text?: string }>
+  output?: string
+}
+
+export type FullSessionMessage = {
+  id?: string
+  info?: { role?: string; time?: string; agent?: string }
+  parts?: FullSessionMessagePart[]
+}
+
+export type ToolContextWithMetadata = {
+  sessionID: string
+  messageID: string
+  agent: string
+  abort: AbortSignal
+  metadata?: (input: { title?: string; metadata?: Record<string, unknown> }) => void
+}
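Note on `BackgroundOutputMessagesResult`: the type is a union of a bare message array and a `{ data, error }` wrapper, so callers have to normalize it before iterating. The sketch below shows one way to do that, modelled on the `getErrorMessage` / `extractMessages` helpers that this change moves out of `tools.ts`. The names `readError` and `toMessages` and the trimmed `Message` type are illustrative assumptions, not part of the codebase.

```typescript
// Trimmed local stand-ins for the types declared in types.ts (assumption:
// only the fields needed for this example are reproduced).
type Message = {
  info?: { role?: string }
  parts?: Array<{ type?: string; text?: string }>
}
type MessagesResult = { data?: Message[]; error?: unknown } | Message[]

// Hypothetical helper: returns the error as a string, or null when the
// result is a bare array or carries no error.
function readError(result: MessagesResult): string | null {
  if (Array.isArray(result)) return null
  if (result.error === undefined || result.error === null) return null
  return typeof result.error === "string" ? result.error : String(result.error)
}

// Hypothetical helper: collapses both shapes of the union into a plain array.
function toMessages(result: MessagesResult): Message[] {
  if (Array.isArray(result)) return result
  return Array.isArray(result.data) ? result.data : []
}

// The three shapes a caller may receive.
const wrapped: MessagesResult = {
  data: [{ info: { role: "assistant" }, parts: [{ type: "text", text: "done" }] }],
}
const bare: MessagesResult = [{ info: { role: "tool" } }]
const failed: MessagesResult = { error: "session not found" }

console.log(toMessages(wrapped).length, toMessages(bare).length, readError(failed))
```

Checking for an error first and only then extracting messages keeps the two concerns separate, which is the order `formatTaskResult` and `formatFullSession` used before the split.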
--- a/src/tools/call-omo-agent/background-executor.ts
+++ b/src/tools/call-omo-agent/background-executor.ts
@@ -0,0 +1,83 @@
+import type { CallOmoAgentArgs } from "./types"
+import type { BackgroundManager } from "../../features/background-agent"
+import { log } from "../../shared"
+import { consumeNewMessages } from "../../shared/session-cursor"
+import { findFirstMessageWithAgent, findNearestMessageWithFields } from "../../features/hook-message-injector"
+import { getSessionAgent } from "../../features/claude-code-session-state"
+import { getMessageDir } from "./message-dir"
+
+export async function executeBackground(
+  args: CallOmoAgentArgs,
+  toolContext: {
+    sessionID: string
+    messageID: string
+    agent: string
+    abort: AbortSignal
+    metadata?: (input: { title?: string; metadata?: Record<string, unknown> }) => void
+  },
+  manager: BackgroundManager
+): Promise<string> {
+  try {
+    const messageDir = getMessageDir(toolContext.sessionID)
+    const prevMessage = messageDir ? findNearestMessageWithFields(messageDir) : null
+    const firstMessageAgent = messageDir ? findFirstMessageWithAgent(messageDir) : null
+    const sessionAgent = getSessionAgent(toolContext.sessionID)
+    const parentAgent = toolContext.agent ?? sessionAgent ?? firstMessageAgent ?? prevMessage?.agent
+    
+    log("[call_omo_agent] parentAgent resolution", {
+      sessionID: toolContext.sessionID,
+      messageDir,
+      ctxAgent: toolContext.agent,
+      sessionAgent,
+      firstMessageAgent,
+      prevMessageAgent: prevMessage?.agent,
+      resolvedParentAgent: parentAgent,
+    })
+
+    const task = await manager.launch({
+      description: args.description,
+      prompt: args.prompt,
+      agent: args.subagent_type,
+      parentSessionID: toolContext.sessionID,
+      parentMessageID: toolContext.messageID,
+      parentAgent,
+    })
+
+    const WAIT_FOR_SESSION_INTERVAL_MS = 50
+    const WAIT_FOR_SESSION_TIMEOUT_MS = 30000
+    const waitStart = Date.now()
+    let sessionId = task.sessionID
+    while (!sessionId && Date.now() - waitStart < WAIT_FOR_SESSION_TIMEOUT_MS) {
+      if (toolContext.abort?.aborted) {
+        return `Task aborted while waiting for session to start.\n\nTask ID: ${task.id}`
+      }
+      const updated = manager.getTask(task.id)
+      if (updated?.status === "error" || updated?.status === "cancelled") {
+        return `Task failed to start (status: ${updated.status}).\n\nTask ID: ${task.id}`
+      }
+      await new Promise(resolve => setTimeout(resolve, WAIT_FOR_SESSION_INTERVAL_MS))
+      sessionId = manager.getTask(task.id)?.sessionID
+    }
+
+    await toolContext.metadata?.({
+      title: args.description,
+      metadata: { sessionId: sessionId ?? "pending" },
+    })
+
+    return `Background agent task launched successfully.
+
+Task ID: ${task.id}
+Session ID: ${sessionId ?? "pending"}
+Description: ${task.description}
+Agent: ${task.agent} (subagent)
+Status: ${task.status}
+
+The system will notify you when the task completes.
+Use \`background_output\` tool with task_id="${task.id}" to check progress:
+- block=false (default): Check status immediately - returns full status info
+- block=true: Wait for completion (rarely needed since system notifies)`
+  } catch (error) {
+    const message = error instanceof Error ? error.message : String(error)
+    return `Failed to launch background agent task: ${message}`
+  }
+}
--- a/src/tools/call-omo-agent/completion-poller.ts
+++ b/src/tools/call-omo-agent/completion-poller.ts
@@ -0,0 +1,67 @@
+import type { PluginInput } from "@opencode-ai/plugin"
+import { log } from "../../shared"
+
+export async function waitForCompletion(
+  sessionID: string,
+  toolContext: {
+    sessionID: string
+    messageID: string
+    agent: string
+    abort: AbortSignal
+    metadata?: (input: { title?: string; metadata?: Record<string, unknown> }) => void
+  },
+  ctx: PluginInput
+): Promise<void> {
+  log(`[call_omo_agent] Polling for completion...`)
+
+  // Poll for session completion
+  const POLL_INTERVAL_MS = 500
+  const MAX_POLL_TIME_MS = 5 * 60 * 1000 // 5 minutes max
+  const pollStart = Date.now()
+  let lastMsgCount = 0
+  let stablePolls = 0
+  const STABILITY_REQUIRED = 3
+
+  while (Date.now() - pollStart < MAX_POLL_TIME_MS) {
+    // Check if aborted
+    if (toolContext.abort?.aborted) {
+      log(`[call_omo_agent] Aborted by user`)
+      throw new Error("Task aborted.")
+    }
+
+    await new Promise(resolve => setTimeout(resolve, POLL_INTERVAL_MS))
+
+    // Check session status
+    const statusResult = await ctx.client.session.status()
+    const allStatuses = (statusResult.data ?? {}) as Record<string, { type: string }>
+    const sessionStatus = allStatuses[sessionID]
+
+    // If session is actively running, reset stability counter
+    if (sessionStatus && sessionStatus.type !== "idle") {
+      stablePolls = 0
+      lastMsgCount = 0
+      continue
+    }
+
+    // Session is idle - check message stability
+    const messagesCheck = await ctx.client.session.messages({ path: { id: sessionID } })
+    const msgs = ((messagesCheck as { data?: unknown }).data ?? messagesCheck) as Array<unknown>
+    const currentMsgCount = msgs.length
+
+    if (currentMsgCount > 0 && currentMsgCount === lastMsgCount) {
+      stablePolls++
+      if (stablePolls >= STABILITY_REQUIRED) {
+        log(`[call_omo_agent] Session complete, ${currentMsgCount} messages`)
+        return // completed; skip the timeout check after the loop
+      }
+    } else {
+      stablePolls = 0
+      lastMsgCount = currentMsgCount
+    }
+  }
+
+  if (Date.now() - pollStart >= MAX_POLL_TIME_MS) {
+    log(`[call_omo_agent] Timeout reached`)
+    throw new Error("Agent task timed out after 5 minutes.")
+  }
+}
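Note on the stability check in `waitForCompletion`: the loop treats a session as finished only after it has been idle with an unchanged, non-empty message count for `STABILITY_REQUIRED` consecutive polls. The sketch below models just that counter logic synchronously so it can be reasoned about without a client. `StabilityTracker` and `observe` are hypothetical names introduced for illustration; each `observe` call stands in for one poll.

```typescript
// Hypothetical, synchronous model of the counter logic in waitForCompletion.
class StabilityTracker {
  private lastCount = 0
  private stablePolls = 0
  private readonly required: number

  constructor(required: number) {
    this.required = required
  }

  // Returns true once the session has been idle with an unchanged, non-empty
  // message count for `required` consecutive polls after the baseline poll.
  observe(idle: boolean, messageCount: number): boolean {
    if (!idle) {
      // Session is busy: discard any stability progress.
      this.stablePolls = 0
      this.lastCount = 0
      return false
    }
    if (messageCount > 0 && messageCount === this.lastCount) {
      this.stablePolls++
      return this.stablePolls >= this.required
    }
    // Count changed (or is zero): record a new baseline and start over.
    this.stablePolls = 0
    this.lastCount = messageCount
    return false
  }
}

const tracker = new StabilityTracker(3)
const results = [
  tracker.observe(false, 1), // busy
  tracker.observe(true, 2), // idle, baseline recorded
  tracker.observe(true, 2), // stable poll 1
  tracker.observe(true, 2), // stable poll 2
  tracker.observe(true, 2), // stable poll 3, complete
]
console.log(results)
```

Under this model a session needs one baseline poll plus three matching polls, so with the 500 ms interval used above the earliest completion is roughly two seconds after the session goes idle.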
--- a/src/tools/call-omo-agent/message-dir.ts
+++ b/src/tools/call-omo-agent/message-dir.ts
@@ -0,0 +1,18 @@
+import { existsSync, readdirSync } from "node:fs"
+import { join } from "node:path"
+import { MESSAGE_STORAGE } from "../../features/hook-message-injector"
+
+export function getMessageDir(sessionID: string): string | null {
+  if (!sessionID.startsWith("ses_")) return null
+  if (!existsSync(MESSAGE_STORAGE)) return null
+
+  const directPath = join(MESSAGE_STORAGE, sessionID)
+  if (existsSync(directPath)) return directPath
+
+  for (const dir of readdirSync(MESSAGE_STORAGE)) {
+    const sessionPath = join(MESSAGE_STORAGE, dir, sessionID)
+    if (existsSync(sessionPath)) return sessionPath
+  }
+
+  return null
+}
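Note on the lookup order in `getMessageDir`: it first tries `<storage>/<sessionID>` and then scans one level of subdirectories for `<storage>/<dir>/<sessionID>`. The sketch below exercises the same order against a temporary directory. `findMessageDir` is a hypothetical variant that takes the storage root as a parameter (the real function reads `MESSAGE_STORAGE`), and the directory names are made up for the example.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readdirSync, rmSync } from "node:fs"
import { tmpdir } from "node:os"
import { join } from "node:path"

// Hypothetical variant of getMessageDir with an injectable storage root.
function findMessageDir(storageRoot: string, sessionID: string): string | null {
  if (!sessionID.startsWith("ses_")) return null
  if (!existsSync(storageRoot)) return null

  // 1. Session directory directly under the storage root.
  const direct = join(storageRoot, sessionID)
  if (existsSync(direct)) return direct

  // 2. Session directory nested one level down.
  for (const dir of readdirSync(storageRoot)) {
    const nested = join(storageRoot, dir, sessionID)
    if (existsSync(nested)) return nested
  }
  return null
}

// Build a throwaway layout with one direct and one nested session.
const root = mkdtempSync(join(tmpdir(), "msgdir-"))
mkdirSync(join(root, "ses_direct"))
mkdirSync(join(root, "project-a", "ses_nested"), { recursive: true })

const directHit = findMessageDir(root, "ses_direct")
const nestedHit = findMessageDir(root, "ses_nested")
const badPrefix = findMessageDir(root, "direct")
const missing = findMessageDir(root, "ses_missing")
const expectedDirect = join(root, "ses_direct")
const expectedNested = join(root, "project-a", "ses_nested")

// Clean up the temporary directory once the lookups are done.
rmSync(root, { recursive: true, force: true })
```

Injecting the root is only a testing convenience here; the lookup order itself follows the added file.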
Author	SHA1	Message	Date
YeonGyu-Kim	a56a8bb241	fix: convert executeSyncTask to async prompt + polling pattern Oracle agent (and all sync subagent tasks) fails with JSON Parse error in ACP environments because session.prompt() (blocking HTTP) returns empty/incomplete responses. Replace promptSyncWithModelSuggestionRetry with promptWithModelSuggestionRetry (async, fire-and-forget) and add polling loop to wait for response stability, matching the proven pattern from executeUnstableAgentTask. Fixes #1681	2026-02-09 10:03:54 +09:00
github-actions[bot]	f07e364171	@mrm007 has signed the CLA in code-yeongyu/oh-my-opencode#1680	2026-02-08 21:41:45 +00:00
github-actions[bot]	e26c355c76	@aliozdenisik has signed the CLA in code-yeongyu/oh-my-opencode#1676	2026-02-08 17:12:45 +00:00
github-actions[bot]	5f9c3262a2	@JunyeongChoi0 has signed the CLA in code-yeongyu/oh-my-opencode#1674	2026-02-08 16:02:43 +00:00
github-actions[bot]	9d726d91fc	release: v3.4.0	2026-02-08 15:44:17 +00:00
YeonGyu-Kim	34e5eddb49	Merge pull request #1670 from code-yeongyu/fix/migration-once-only-v2 fix: ensure model migration respects intentional downgrades (#1660)	2026-02-08 20:00:52 +09:00
YeonGyu-Kim	441fda9177	fix: migrate config on deep copy, apply to rawConfig only on successful file write (#1660) Previously, migrateConfigFile() mutated rawConfig directly. If the file write failed (e.g. read-only file, permissions), the in-memory config was already changed to the migrated values, causing the plugin to use migrated models even though the user's file was untouched. On the next run, the migration would fire again since _migrations was never persisted. Now all mutations happen on a structuredClone copy. The original rawConfig is only updated after the file write succeeds. If the write fails, rawConfig stays untouched and the function returns false.	2026-02-08 19:33:26 +09:00
YeonGyu-Kim	006e6ade02	test(delegate-task): reset Bun mocks per test	2026-02-08 18:50:16 +09:00
YeonGyu-Kim	aa447765cb	feat(shared/git-worktree, features): add git diff stats utility and infrastructure improvements - Add collect-git-diff-stats utility for git worktree operations - Add comprehensive test coverage for git diff stats collection - Enhance claude-tasks storage module - Improve tmux subagent manager initialization - Support better git-based task tracking and analysis 🤖 Generated with assistance of OhMyOpenCode	2026-02-08 18:41:45 +09:00
YeonGyu-Kim	bdaa8fc6c1	refactor(tools/delegate-task): enhance skill resolution and type safety - Add improved type definitions for skill resolution - Enhance executor with better type safety for delegation flows - Add comprehensive test coverage for delegation tool behavior - Improve code organization for skill resolver integration 🤖 Generated with assistance of OhMyOpenCode	2026-02-08 18:41:39 +09:00
YeonGyu-Kim	7788ba3d8a	refactor(shared): improve model availability and resolution module structure - Use namespace import for connected-providers-cache for better clarity - Add explicit type annotation for modelsByProvider to improve type safety - Update tests to reflect refactored module organization - Improve code organization while maintaining functionality 🤖 Generated with assistance of OhMyOpenCode	2026-02-08 18:41:35 +09:00
YeonGyu-Kim	1324fee30f	feat(cli/run, background-agent): manage session permissions for CLI and background tasks - Deny question prompts in CLI run mode since there's no TUI to answer them - Inherit parent session permission rules in background task sessions - Force deny questions while preserving other parent permission settings - Add test coverage for permission inheritance behavior 🤖 Generated with assistance of OhMyOpenCode	2026-02-08 18:41:26 +09:00
YeonGyu-Kim	e663d7b335	refactor(shared): update model-availability tests to use split modules Migrate imports from monolithic `model-availability` to split modules (`model-name-matcher`, `available-models-fetcher`, `model-cache-availability`). Replace XDG_CACHE_HOME env var manipulation with `mock.module` for `data-path`, ensuring test isolation without polluting process env. 🤖 Generated with assistance of [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)	2026-02-08 18:00:19 +09:00
YeonGyu-Kim	e257bff31c	fix(plugin-handlers): remove `as any` type assertions in config-handler tests Replace unsafe `as any` casts on `createBuiltinAgents` spy with properly typed `as unknown as { mockResolvedValue: ... }` pattern. Adds bun-types reference directive. 🤖 Generated with assistance of [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)	2026-02-08 18:00:12 +09:00
YeonGyu-Kim	23bca2b4d5	feat(tools/background-task): resolve background_output task_id title	2026-02-08 17:54:59 +09:00
YeonGyu-Kim	83a05630cd	feat(tools/delegate-task): add skill-resolver module - Add skill-resolver.ts for resolving skill configurations - Handles skill loading and configuration resolution - Part of modular delegate-task refactoring effort 🤖 Generated with assistance of OhMyOpenCode	2026-02-08 17:52:34 +09:00
YeonGyu-Kim	6717349e5b	feat(claude-tasks): add CLAUDE_CODE_TASK_LIST_ID env var support - Export session-storage from claude-tasks/index.ts - Add CLAUDE_CODE_TASK_LIST_ID fallback support in storage.ts - Add comprehensive tests for CLAUDE_CODE_TASK_LIST_ID handling - Prefer ULTRAWORK_TASK_LIST_ID, fall back to CLAUDE_CODE_TASK_LIST_ID - Both env vars are properly sanitized for path safety 🤖 Generated with assistance of OhMyOpenCode	2026-02-08 17:52:16 +09:00
YeonGyu-Kim	ee72c45552	refactor(tools/background-task): split tools.ts into focused modules under 200 LOC - Create modules/ directory with 6 focused modules: - background-task.ts: task creation logic - background-output.ts: output retrieval logic - background-cancel.ts: cancellation logic - formatters.ts: message formatting utilities - message-processing.ts: message extraction utilities - utils.ts: shared utility functions - Reduce tools.ts from ~798 to ~30 lines (barrel pattern) - Add new types to types.ts for module interfaces - Update index.ts for clean re-exports - Follow modular code architecture (200 LOC limit) 🤖 Generated with assistance of OhMyOpenCode	2026-02-08 17:52:00 +09:00
YeonGyu-Kim	9377c7eba9	refactor(hooks/interactive-bash-session): split monolithic hook into modules - Convert index.ts to clean barrel export - Extract hook implementation to hook.ts - Extract terminal parsing to parser.ts - Extract state management to state-manager.ts - Reduce index.ts from ~276 to ~5 lines - Follow modular code architecture principles 🤖 Generated with assistance of OhMyOpenCode	2026-02-08 17:51:48 +09:00
YeonGyu-Kim	f1316bc800	refactor(tmux-subagent): split manager.ts into focused modules - Extract polling logic to polling-manager.ts - Extract session cleanup to session-cleaner.ts - Extract session spawning to session-spawner.ts - Extract cleanup logic to manager-cleanup.ts - Reduce manager.ts from ~495 to ~345 lines - Follow modular code architecture (200 LOC limit) 🤖 Generated with assistance of OhMyOpenCode	2026-02-08 17:51:38 +09:00
YeonGyu-Kim	1f8f7b592b	docs(AGENTS): update line counts and stats across all AGENTS.md files - Update main AGENTS.md with current file sizes - Update complexity hotspot line counts - Update agent count from 11 to 32 files - Update CLI utility count to 70 - Update test file count from 100+ to 163 🤖 Generated with assistance of OhMyOpenCode	2026-02-08 17:51:30 +09:00
YeonGyu-Kim	c6fafd6624	fix: remove task-continuation-enforcer and restore task tool titles	2026-02-08 17:49:22 +09:00
YeonGyu-Kim	42dbc8f39c	Fix Issue #1428: Deny bash permission for Prometheus agent - Change PROMETHEUS_PERMISSION bash from 'allow' to 'deny' to prevent unrestricted bash execution - Prometheus is a read-only planner and should not execute bash commands - The prometheus-md-only hook provides additional blocking as backup	2026-02-08 17:37:44 +09:00
YeonGyu-Kim	6bb9a3b7bc	refactor(tools/call-omo-agent): split tools.ts into focused modules under 200 LOC - Extract getMessageDir to message-dir.ts - Extract executeBackground to background-executor.ts - Extract session creation logic to session-creator.ts - Extract polling logic to completion-poller.ts - Extract message processing to message-processor.ts - Create sync-executor.ts to orchestrate sync execution - Add ToolContextWithMetadata type to types.ts - tools.ts now <200 LOC and focused on tool definition	2026-02-08 17:37:44 +09:00
YeonGyu-Kim	984da95f15	Merge pull request #1664 from code-yeongyu/fix/prometheus-plan-family fix: add isPlanFamily() for prometheus↔plan mutual blocking and task permission	2026-02-08 16:49:45 +09:00
YeonGyu-Kim	bb86523240	fix: add isPlanFamily for prometheus↔plan mutual blocking and task permission - PLAN_AGENT_NAMES = ['plan'] (system prompt only) - PLAN_FAMILY_NAMES = ['plan', 'prometheus'] (blocking + task permission) - prometheus↔plan mutual delegation blocked via isPlanFamily() - prometheus gets task tool permission via isPlanFamily() - prompt-builder unchanged: prometheus does NOT get plan system prompt	2026-02-08 16:48:52 +09:00
YeonGyu-Kim	f2b7b759c8	Merge pull request #1173 from code-yeongyu/feature/handoff feat(commands): add /handoff builtin command for context continuation	2026-02-08 16:44:25 +09:00
YeonGyu-Kim	a5af7e95c0	Merge pull request #1536 from code-yeongyu/feat/task-continuation-enforcer feat(hooks): implement task-continuation-enforcer	2026-02-08 16:43:42 +09:00
justsisyphus	a5489718f9	feat(commands): add /handoff builtin command with programmatic context synthesis Port handoff concept from ampcode as a builtin command that extracts detailed context summary from current session for seamless continuation in a new session. Enhanced with programmatic context gathering: - Add HANDOFF_TEMPLATE with phased extraction (gather programmatic context via session_read/todoread/git, extract context, format, instruct) - Gather concrete data: session history, todo state, git diff/status - Include compaction-style sections: USER REQUESTS (AS-IS) verbatim, EXPLICIT CONSTRAINTS verbatim, plus all original handoff sections - Register handoff in BuiltinCommandName type and command definitions - Include session context variables (SESSION_ID, TIMESTAMP, ARGUMENTS) - Add 14 tests covering registration, template content, programmatic gathering, compaction-style sections, and emoji-free constraint	2026-02-08 16:38:53 +09:00
YeonGyu-Kim	cd5485a472	Merge pull request #1663 from code-yeongyu/fix/revert-load-skills-default fix: revert load_skills default and enforce via prompts instead	2026-02-08 16:36:53 +09:00
YeonGyu-Kim	582e0ead27	fix: revert load_skills default and enforce via prompts instead Revert .default([]) on load_skills schema back to required, restore the runtime error for missing load_skills, and add explicit load_skills=[] to all task() examples in agent prompts that were missing it. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-02-08 16:31:02 +09:00
YeonGyu-Kim	0743855b40	Merge pull request #1652 from code-yeongyu/fix-1623-v2 fix(agents): include custom agents in orchestrator delegation prompt (#1623)	2026-02-08 16:02:09 +09:00
YeonGyu-Kim	2588f33075	Merge pull request #1643 from code-yeongyu/fix/exa-api-key-1627 fix(mcp): append EXA_API_KEY to Exa MCP URL when env var is set (#1627)	2026-02-08 16:01:59 +09:00
YeonGyu-Kim	32193dc10d	Merge pull request #1658 from code-yeongyu/fix-1233 fix: detect completion tags in ralph/ULW loop (#1233)	2026-02-08 15:51:16 +09:00
YeonGyu-Kim	321b319b58	fix(agents): use config data instead of client API to avoid init deadlock (#1623)	2026-02-08 15:34:47 +09:00
YeonGyu-Kim	a3dd1dbaf9	test(mcp): restore Tavily tests and add encoding edge case (#1627)	2026-02-08 15:28:31 +09:00
YeonGyu-Kim	4c1e369176	Merge pull request #1657 from code-yeongyu/fix-1366-lsp-unblock fix(lsp): reset safety block on server restart (#1366)	2026-02-08 15:13:30 +09:00
YeonGyu-Kim	06611a7645	fix(mcp): remove duplicate x-api-key header, add test (#1627)	2026-02-08 14:56:43 +09:00
YeonGyu-Kim	676ff513fa	fix: detect completion tags in ralph/ULW loop to stop iteration (#1233)	2026-02-08 14:50:36 +09:00
YeonGyu-Kim	4738379ad7	fix(lsp): reset safety block on server restart to prevent permanent blocks (#1366)	2026-02-08 14:34:11 +09:00
YeonGyu-Kim	44415e3f59	fix(mcp): remove duplicate x-api-key header from Exa config (#1627)	2026-02-08 14:19:50 +09:00
YeonGyu-Kim	870a2a54f7	Merge pull request #1647 from code-yeongyu/fix/subagent-type-respect-model-config-1357 fix(delegate-task): resolve user agent model config in subagent_type path (#1357)	2026-02-08 14:12:21 +09:00
YeonGyu-Kim	cfd63482d7	Merge pull request #1646 from code-yeongyu/fix/background-task-race-condition-1582 fix(background-agent): serialize parent notifications (#1582)	2026-02-08 14:12:14 +09:00
YeonGyu-Kim	5845604a01	Merge pull request #1656 from code-yeongyu/fix/deny-todo-tools-for-task-system fix: deny todowrite/todoread per-agent when task_system is enabled	2026-02-08 14:09:29 +09:00
YeonGyu-Kim	74a1d70f57	Merge pull request #1648 from code-yeongyu/fix/category-delegation-respect-agent-model-1295 test: add regression tests for sisyphus-junior model override in category delegation (#1295)	2026-02-08 14:07:15 +09:00
YeonGyu-Kim	89e251da72	Merge pull request #1645 from code-yeongyu/fix/load-skills-default-1493 fix: add default value for load_skills parameter in task tool (#1493)	2026-02-08 14:07:08 +09:00
YeonGyu-Kim	e7f4f6dd13	fix: deny todowrite/todoread per-agent when task_system is enabled When experimental.task_system is enabled, add todowrite: deny and todoread: deny to per-agent permissions for all primary agents (sisyphus, hephaestus, atlas, prometheus, sisyphus-junior). This ensures the model never sees these tools in its tool list, complementing the existing global tools config and runtime hook.	2026-02-08 14:05:53 +09:00
YeonGyu-Kim	d8e7e4f170	refactor: extract git worktree parser from atlas hook	2026-02-08 14:01:31 +09:00
YeonGyu-Kim	2db9accfc7	Merge pull request #1655 from code-yeongyu/fix/sync-continuation-variant-loss fix: preserve variant in sync continuation to maintain thinking budget	2026-02-08 14:00:56 +09:00
YeonGyu-Kim	6b4e149881	test: assert variant forwarded in sync continuation	2026-02-08 13:57:13 +09:00
YeonGyu-Kim	7f4338b6ed	fix: preserve variant in sync continuation to maintain thinking budget	2026-02-08 13:55:35 +09:00
YeonGyu-Kim	24a013b867	Merge pull request #1653 from code-yeongyu/fix/plan-prometheus-decoupling fix(delegation): decouple plan from prometheus and fix sync task responses	2026-02-08 13:46:40 +09:00
YeonGyu-Kim	d769b95869	fix(delegation): use blocking prompt for sync tasks instead of polling Replace promptAsync + manual polling loop with promptSyncWithModelSuggestionRetry (session.prompt) which blocks until the LLM response completes. This matches OpenCode's native task tool behavior and fixes empty/broken responses that occurred when polling declared stability prematurely. Applied to both executeSyncTask and executeSyncContinuation paths.	2026-02-08 13:42:23 +09:00
YeonGyu-Kim	72cf908738	fix(delegation): decouple plan agent from prometheus - remove aliasing Remove 'prometheus' from PLAN_AGENT_NAMES so isPlanAgent() no longer matches prometheus. The only remaining connection is model inheritance via buildPlanDemoteConfig() in plan-model-inheritance.ts. - Remove 'prometheus' from PLAN_AGENT_NAMES array - Update self-delegation error message to say 'plan agent' not 'prometheus' - Update tests: prometheus is no longer treated as a plan agent - Update task permission: only plan agents get task tool, not prometheus	2026-02-08 13:42:15 +09:00
YeonGyu-Kim	f035be842d	fix(agents): include custom agents in orchestrator delegation prompt (#1623)	2026-02-08 13:34:47 +09:00
YeonGyu-Kim	6ce482668b	refactor: extract git worktree parser from atlas hook	2026-02-08 13:30:00 +09:00
YeonGyu-Kim	a85da59358	fix: encode EXA_API_KEY before appending to URL query parameter	2026-02-08 13:28:08 +09:00
YeonGyu-Kim	b88a868173	fix(config): plan agent inherits model settings from prometheus when not explicitly configured Previously, demoted plan agent only received { mode: 'subagent' } with no model settings, causing fallback to step-3.5-flash. Now inherits all model-related settings (model, variant, temperature, top_p, maxTokens, thinking, reasoningEffort, textVerbosity, providerOptions) from the resolved prometheus config. User overrides via agents.plan.* take priority. Prompt, permission, description, and color are intentionally NOT inherited.	2026-02-08 13:22:56 +09:00
YeonGyu-Kim	d0bdf521c3	Merge pull request #1649 from code-yeongyu/feat/anthropic-prefill-recovery feat: auto-recover from Anthropic assistant message prefill errors	2026-02-08 13:19:38 +09:00
YeonGyu-Kim	7abefcca1f	feat: auto-recover from Anthropic assistant message prefill errors When Anthropic models reject requests with 'This model does not support assistant message prefill', detect this as a recoverable error type and automatically send 'Continue' once to resume the conversation. Extends session-recovery hook with new 'assistant_prefill_unsupported' error type. The existing session.error handler in index.ts already sends 'continue' after successful recovery, so no additional logic needed.	2026-02-08 13:16:16 +09:00
YeonGyu-Kim	a06364081b	fix(delegate-task): resolve user agent model config in subagent_type path (#1357)	2026-02-08 13:14:11 +09:00
YeonGyu-Kim	104b9fbb39	test: add regression tests for sisyphus-junior model override in category delegation (#1295) Add targeted regression tests for the exact reproduction scenario from issue #1295: - quick category with sisyphusJuniorModel override (the reported scenario) - user-defined custom category with sisyphusJuniorModel fallback The underlying fix was already applied in PRs #1470 and #1556. These tests ensure the fix does not regress. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-02-08 13:13:47 +09:00
YeonGyu-Kim	f6fc30ada5	fix: add default value for load_skills parameter in task tool (#1493) Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-02-08 13:09:58 +09:00
YeonGyu-Kim	f1fcc26aaa	fix(background-agent): serialize parent notifications (#1582)	2026-02-08 13:05:06 +09:00
YeonGyu-Kim	09999587f5	fix(mcp): append EXA_API_KEY to Exa MCP URL when env var is set (#1627)	2026-02-08 12:38:42 +09:00
YeonGyu-Kim	01594a67af	fix(hooks): compose session recovery callbacks for continuation enforcers Cubic found that registering task-continuation-enforcer recovery callbacks overrode the todo-continuation-enforcer callbacks. Compose the callbacks so both enforcers receive abort/recovery notifications.	2026-02-06 11:41:31 +09:00
YeonGyu-Kim	551dbc95f2	feat(hooks): register task-continuation-enforcer in plugin lifecycle Integrates at 4 points: creation (gated by task_system), session recovery callbacks, event handler, and stop-continuation command.	2026-02-06 11:21:53 +09:00
YeonGyu-Kim	f4a9d0c3aa	feat(hooks): implement task-continuation-enforcer with TDD Mirrors todo-continuation-enforcer but reads from file-based task storage instead of OpenCode's todo API. Includes 19 tests covering all skip conditions, abort detection, countdown, and recovery scenarios.	2026-02-06 11:21:45 +09:00
YeonGyu-Kim	f796fdbe0a	feat(hooks): add TASK_CONTINUATION system directive and hook name	2026-02-06 11:21:37 +09:00