Compare commits

...

204 Commits

Author SHA1 Message Date
github-actions[bot]
538aba0d0f release: v3.7.1 2026-02-17 05:32:02 +00:00
YeonGyu-Kim
97f7540600 chore: add propertyNames constraints to JSON schema 2026-02-17 14:29:06 +09:00
YeonGyu-Kim
462e2ec2b0 refactor: remove 3 orphaned files and prefix unused parameter 2026-02-17 14:09:12 +09:00
YeonGyu-Kim
9acdd6b85d refactor: remove 3 orphaned files from call-omo-agent and delegate-task 2026-02-17 14:08:44 +09:00
YeonGyu-Kim
1fb6a7cc80 refactor: remove 16 orphaned files from background-agent 2026-02-17 14:08:38 +09:00
YeonGyu-Kim
d3b79064c6 refactor: remove orphaned modules/ directory from background-task 2026-02-17 14:08:30 +09:00
YeonGyu-Kim
744dee70e9 refactor: remove 3 orphaned files and unused import from tmux-subagent 2026-02-17 14:08:28 +09:00
YeonGyu-Kim
0265fa6990 refactor: remove 3 orphaned files from background-agent/spawner 2026-02-17 14:08:12 +09:00
github-actions[bot]
7e1293d273 release: v3.7.0 2026-02-17 04:35:13 +00:00
YeonGyu-Kim
e3342dcd4a refactor(prompts): replace markdown tables with bullet lists, harden Oracle protection
Convert all markdown tables in Sisyphus and dynamic-agent-prompt-builder
to plain bullet lists for cleaner prompt rendering.

Add explicit Oracle safeguards:
- Hard Block: background_cancel(all=true) when Oracle running
- Hard Block: delivering final answer before collecting Oracle result
- Anti-Pattern: background_cancel(all=true) and skipping Oracle
- Oracle section: NEVER cancel, collect via background_output first
- Background Result Collection: split cancel/wait into separate steps
  with explicit NEVER use background_cancel(all=true) instruction
2026-02-17 13:26:37 +09:00
YeonGyu-Kim
764abb2a4b docs: fix ultrabrain model reference in category-skill-guide (GPT-5.2→GPT-5.3 Codex) 2026-02-17 11:32:36 +09:00
YeonGyu-Kim
f8e58efeb4 docs: fix agent model references in all READMEs (Opus 4.5→4.6, GPT 5.2 Codex→5.3, Librarian→GLM-4.7, Explore→Grok Code Fast 1) 2026-02-17 11:32:26 +09:00
YeonGyu-Kim
fba06868dd docs: fix model references across guide docs (Opus 4.5→4.6, GPT-5.2 Codex→5.3, Atlas model, add deep category, fix dot notation) 2026-02-17 11:31:22 +09:00
YeonGyu-Kim
c51994c791 docs: fix agent fallback chains, provider chains, and category tables to match model-requirements.ts
- features.md: update explore primary model (grok-code-fast-1), fix all agent fallback chains
- configurations.md: add missing deep category, fix all agent/category provider chains, add hephaestus to available agents, update model names to match actual code
2026-02-17 11:28:32 +09:00
YeonGyu-Kim
3facf9fac3 docs: fix structural counts in AGENTS.md (hook handlers 7→8, tool dirs 14→15, core hooks 33→32, session hooks 20→19, config merge order) 2026-02-17 11:26:28 +09:00
YeonGyu-Kim
aac79f03b5 docs: regenerate all AGENTS.md files from comprehensive codebase exploration
- Fired 33 parallel explore agents across all major directories
- Analyzed 1164 TS files, 133k LOC, 41 hooks, 26 tools, 11 agents, 18 features
- Regenerated 13 AGENTS.md files with 905 total lines
- Root: plugin architecture, initialization flow, 7 OpenCode hook handlers
- src/: entry point orchestration, hook composition pipeline
- agents/: 11 agent inventory with tool restrictions and factory patterns
- hooks/: 41 hooks organized by 5 tiers, key complex hooks documented
- tools/: 26 tools across 14 directories, delegation categories
- features/: 18 modules mapped by complexity (HIGH/MEDIUM/LOW)
- shared/: 101 utilities in 13 categories, model resolution pipeline
- config/: 22 schema files, Zod v4 validation system
- cli/: 5 commands, doctor checks, model fallback system
- mcp/: 3-tier MCP system architecture
- plugin-handlers/: 6-phase config loading pipeline
- claude-code-hooks/: CC settings.json compatibility layer
- claude-tasks/: task schema + file-based persistence

🤖 Generated with assistance of oh-my-opencode
2026-02-17 11:17:01 +09:00
YeonGyu-Kim
5a8e424c8e Merge pull request #1910 from code-yeongyu/fix/1753-context-window-hardcoded
fix: use ModelCacheState for context window limit instead of env var (#1753)
2026-02-17 10:53:58 +09:00
YeonGyu-Kim
d786691260 fix: read anthropic 1m flag from live model cache state 2026-02-17 10:51:01 +09:00
YeonGyu-Kim
363016681b test: cover model-cache and env fallback context limits 2026-02-17 10:51:01 +09:00
YeonGyu-Kim
b444899153 fix: use model cache context flag for runtime context limits 2026-02-17 10:51:01 +09:00
YeonGyu-Kim
b1e7bb4c59 Merge pull request #1912 from code-yeongyu/fix/1694-fallback-wiring
fix: wire fallback availability into runtime export path (#1694)
2026-02-17 10:50:50 +09:00
YeonGyu-Kim
8e115c7f9d fix: export fallback availability from traced module 2026-02-17 10:47:09 +09:00
YeonGyu-Kim
fe5d341208 Merge pull request #1909 from code-yeongyu/fix/1694-fallback-model-ids
fix: add logging and validation to fallback chain model resolution (#1694)
2026-02-17 10:38:14 +09:00
YeonGyu-Kim
ca06ce134f fix: add fallback resolution warnings for unavailable models 2026-02-17 10:29:48 +09:00
YeonGyu-Kim
72fa2c7e65 fix(tmux): stop layout override after spawn, use configured main pane size
Remove applyLayout(select-layout main-vertical) call after spawn which
was destroying grid arrangements by forcing vertical stacking. Now only
enforceMainPaneWidth is called, preserving the grid created by manual
split directions. Also fix enforceMainPaneWidth to use config's
main_pane_size percentage instead of hardcoded 50%.
2026-02-17 09:50:17 +09:00
YeonGyu-Kim
b3c5f4caf5 fix(tmux): use actual pane dimensions and configured min width for grid calculation
Agent area width now uses real mainPane.width instead of hardcoded 50%
ratio. Grid planning, split availability, and spawn target finding now
respect user's agent_pane_min_width config instead of hardcoded
MIN_PANE_WIDTH=52, enabling 2-column grid layouts on narrower terminals.
2026-02-17 09:48:18 +09:00
YeonGyu-Kim
219c1f8225 update: always wait for Oracle results instead of blanket background_cancel(all=true) 2026-02-17 09:42:59 +09:00
github-actions[bot]
6208c07809 @xinpengdr has signed the CLA in code-yeongyu/oh-my-opencode#1906 2026-02-16 19:01:47 +00:00
YeonGyu-Kim
1b7a1e3f0b Merge pull request #1905 from code-yeongyu/fix/tmux-split-stability
fix: stabilize tmux split and session readiness handling
2026-02-17 03:49:30 +09:00
YeonGyu-Kim
84a83922c3 fix: stop tracking sessions that never become ready
When session readiness times out, immediately close the spawned pane and skip tracking to prevent stale mappings from causing reopen and close anomalies.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-02-17 03:40:55 +09:00
YeonGyu-Kim
17da22704e fix: size main pane using configured layout percentage
Main pane resize now uses main_pane_size instead of a hardcoded 50 percent fallback so post-split layout remains stable and predictable.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-02-17 03:40:46 +09:00
YeonGyu-Kim
da3f24b8b1 fix: align split targeting with configured pane width
Use the configured agent pane width consistently in split target selection and avoid close+spawn churn by replacing the oldest pane when eviction is required.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-02-17 03:40:37 +09:00
YeonGyu-Kim
b02721463e refactor: route status porcelain map parsing through line parser 2026-02-17 03:29:10 +09:00
YeonGyu-Kim
1f31a3d8f1 test: add dedicated status porcelain line parser with coverage 2026-02-17 03:29:01 +09:00
YeonGyu-Kim
1566cfcc1e update: Hephaestus completion guarantee, Sisyphus-Junior Hephaestus-style rewrite, snake_case tools
Hephaestus:
- Add Completion Guarantee section with Codex-style persistence framing
- Add explicit explore/librarian call syntax examples (subagent_type, not category)
- Use positive 'keep going until resolved' over negative 'NEVER stop'
- Fix tool names: TaskCreate/TaskUpdate → task_create/task_update

Sisyphus-Junior GPT:
- Full Hephaestus-style rewrite: autonomy, reporting, parallelism, tool usage
- Remove Blocked & Allowed Tools section and 'You work ALONE' messaging
- Add Progress Updates, Ambiguity Protocol, Code Quality sections
- Fix tool names: TaskCreate/TaskUpdate → task_create/task_update

Sisyphus-Junior Default:
- Remove buildConstraintsSection and blocked actions messaging
- Fix tool names: TaskCreate/TaskUpdate → task_create/task_update

Tests: update all assertions for new prompt structure (31/31 pass)
2026-02-17 03:12:32 +09:00
YeonGyu-Kim
2b5887aca3 fix: prevent overlapping poll cycles in managers
Guarding polling re-entry avoids stacked async polls under slow responses, and unref on pending-call cleanup timer reduces idle wakeups.
2026-02-17 03:06:40 +09:00
YeonGyu-Kim
8c88da51e1 update: soften Hephaestus brevity bias — replace 'brief/briefly' with 'clear' throughout
Replace 7 instances of brief/briefly that caused over-terse behavior:
- 'briefly restate' → 'restate'
- 'brief summary' → 'clear summary'
- 'briefly state the WHY' → 'explain the WHY' (×2)
- 'brief context' → 'clear context'
- 'Brief updates' → 'Clear updates (a few sentences)'
- 'keep it brief and clear' → 'keep it clear and helpful'
2026-02-17 02:58:42 +09:00
YeonGyu-Kim
199992e05b update: Hephaestus prompt — restore intent gate, strengthen parallelism and reporting
- Restore Assumptions Check and When to Challenge the User from Sisyphus intent gate
- Add proactive explore/librarian firing to CORRECT behavior list
- Strengthen parallel execution with GPT-5.2 tool_usage_rules (parallelize ALL independent calls)
- Embed reporting into each Execution Loop step (Tell user pattern)
- Strengthen Progress Updates with plain-language and WHY-not-just-WHAT guidance
- Add post-edit reporting to Output Contract and After Implementation
- Fix Output Contract preamble conflict (skip empty preambles, but DO report actions)
2026-02-17 02:56:22 +09:00
YeonGyu-Kim
6b546526f3 refactor: diet Hephaestus prompt — remove redundancy, add progress updates and skill examples
- Remove router nudge (reasoning configuration section)
- Remove redundant sections: Role & Agency, Judicious Initiative, Success
  Criteria, Response Compaction, Soft Guidelines
- Merge Identity + Core Principle into compact Identity section
- Restore autonomous behavior policy (FORBIDDEN/CORRECT) from Role & Agency
- Add Progress Updates section with friendly tone and concrete examples
- Add Skill Loading Examples table (frontend-ui-ux, playwright, git-master, tauri)
- Condense Parallel Execution, Execution Loop, Verification, Failure Recovery
- Update Output Contract with friendly communication style

651 → 437 lines (33% reduction), behavior preserved
2026-02-17 02:46:11 +09:00
YeonGyu-Kim
c44509b397 fix: skip startup toasts in CLI run mode for auto-update-checker
Add OPENCODE_CLI_RUN_MODE environment variable check to skip all startup
toasts and version checks when running in CLI mode. This prevents
notification spam during automated CLI run sessions.

Includes comprehensive test coverage for CLI run mode behavior.

🤖 Generated with OhMyOpenCode assistance
2026-02-17 02:34:39 +09:00
YeonGyu-Kim
17994693af fix: add directory parameter and improve CLI run session handling
- Add directory parameter to session API calls (session.get, session.todo,
  session.status, session.children)
- Improve agent resolver with display name support via agent-display-names
- Add tool execution visibility in event handlers with running/completed
  status output
- Enhance poll-for-completion with main session status checking and
  stabilization period handling
- Add normalizeSDKResponse import for consistent response handling
- Update types with Todo, ChildSession, and toast-related interfaces

🤖 Generated with OhMyOpenCode assistance
2026-02-17 02:34:35 +09:00
YeonGyu-Kim
a31087e543 fix: add propertyNames validation to object schemas in JSON schema
Add propertyNames: { type: "string" } to all object schemas with
additionalProperties to ensure proper JSON schema validation for
dynamic property keys.

🤖 Generated with OhMyOpenCode assistance
2026-02-17 02:34:31 +09:00
YeonGyu-Kim
5c13a63758 fix: invoke claude-code-hooks PreCompact in session compacting handler
The experimental.session.compacting handler was not delegating to
claudeCodeHooks, making PreCompact hooks from .claude/settings.json
dead code. Also fixed premature early-return when compactionContextInjector
was null which would skip any subsequent hooks.
2026-02-17 02:14:01 +09:00
YeonGyu-Kim
d9f21da026 fix: prefer a runnable opencode binary for cli run 2026-02-17 02:12:36 +09:00
YeonGyu-Kim
7d2c798ff0 Merge pull request #1893 from code-yeongyu/fix/1716-disabled-agents-enforcement
fix: enforce disabled_agents config in call_omo_agent (#1716)
2026-02-17 02:07:18 +09:00
YeonGyu-Kim
ea589e66e8 Merge remote-tracking branch 'origin/dev' into fix/1716-disabled-agents-enforcement
# Conflicts:
#	src/plugin/tool-registry.ts
#	src/tools/call-omo-agent/tools.test.ts
#	src/tools/call-omo-agent/tools.ts
2026-02-17 02:04:19 +09:00
YeonGyu-Kim
e299c09ee8 fix: include provider-models cache for Hephaestus availability 2026-02-17 02:03:03 +09:00
YeonGyu-Kim
285d8d58dd fix: skip compaction messages in parent-session context lookup 2026-02-17 02:03:03 +09:00
YeonGyu-Kim
e1e449164a Merge pull request #1898 from code-yeongyu/fix/1671-tmux-layout
fix: apply tmux layout config during pane spawning (#1671)
2026-02-17 02:01:29 +09:00
YeonGyu-Kim
324d2c1f0c Merge branch 'dev' into fix/1671-tmux-layout 2026-02-17 01:58:59 +09:00
YeonGyu-Kim
f3de0f43bd Merge pull request #1899 from code-yeongyu/fix/1700-vertex-anthropic
fix: recognize google-vertex-anthropic as Claude provider (#1700)
2026-02-17 01:58:26 +09:00
YeonGyu-Kim
5839594041 Merge pull request #1897 from code-yeongyu/fix/1679-copilot-fallback
fix: handle all model versions in normalizeModelName for fallback chains (#1679)
2026-02-17 01:58:24 +09:00
YeonGyu-Kim
ada0a233d6 Merge pull request #1894 from code-yeongyu/fix/1681-oracle-json-parse
fix: resolve Oracle JSON parse error after promptAsync refactor (#1681)
2026-02-17 01:58:21 +09:00
YeonGyu-Kim
b7497d0f9f Merge branch 'dev' into fix/1700-vertex-anthropic 2026-02-17 01:54:11 +09:00
YeonGyu-Kim
7bb03702c9 Merge branch 'dev' into fix/1671-tmux-layout 2026-02-17 01:54:08 +09:00
YeonGyu-Kim
ccbeea96c1 Merge branch 'dev' into fix/1679-copilot-fallback 2026-02-17 01:54:05 +09:00
YeonGyu-Kim
9922a94d12 Merge branch 'dev' into fix/1681-oracle-json-parse 2026-02-17 01:54:03 +09:00
YeonGyu-Kim
e78c54f6eb Merge pull request #1896 from code-yeongyu/fix/1283-review-code-silent-fail
fix: report silent subagent delegation failures (#1283)
2026-02-17 01:53:56 +09:00
YeonGyu-Kim
74be163df3 Merge pull request #1895 from code-yeongyu/fix/1718-windows-subagent-dir
fix: use correct project directory for Windows subagents (#1718)
2026-02-17 01:53:43 +09:00
YeonGyu-Kim
24789334e4 fix: detect AppData directory paths without trailing separators 2026-02-17 01:45:14 +09:00
YeonGyu-Kim
0e0bfc1cd6 Merge pull request #1849 from jkoelker/preserve-default-agent
fix(config): preserve configured default_agent
2026-02-17 01:43:04 +09:00
Jason Kölker
90ede4487b fix(config): preserve configured default_agent
oh-my-opencode overwrote OpenCode's default_agent with sisyphus whenever
Sisyphus orchestration was enabled. This made explicit defaults like
Hephaestus ineffective and forced manual agent switching in new sessions.

Only assign sisyphus as default when default_agent is missing or blank,
and preserve existing configured values. Add tests for both preservation
and fallback behavior to prevent regressions.
2026-02-17 01:41:52 +09:00
YeonGyu-Kim
3a2f886357 fix: apply tmux layout config during pane spawning (#1671) 2026-02-17 01:36:01 +09:00
YeonGyu-Kim
2fa82896f8 Merge pull request #1884 from code-yeongyu/feat/hashline-edit
feat: port hashline edit tool from oh-my-pi
2026-02-17 01:35:22 +09:00
YeonGyu-Kim
5aa9ecdd5d Merge pull request #1870 from dankochetov/fix/background-notification-hook-gate
fix(background-agent): honor disabled background-notification for system reminders
2026-02-17 01:35:21 +09:00
YeonGyu-Kim
c8d03aaddb Merge pull request #1708 from jsl9208/fix/ast-grep-replace-silent-noop
fix(ast-grep): fix ast_grep_replace silent write failure
2026-02-17 01:34:41 +09:00
YeonGyu-Kim
693f73be6d Merge pull request #1729 from potb/fix/1716-disabled-agents-call-omo
fix(call-omo-agent): enforce disabled_agents config
2026-02-17 01:34:38 +09:00
YeonGyu-Kim
1b05c3fb52 Merge pull request #1819 from jonasherr/feat/add-playwright-cli-provider
feat(browser-automation): add playwright-cli as browser automation provider
2026-02-17 01:34:34 +09:00
YeonGyu-Kim
5ae45c8c8e fix: use correct project directory for Windows subagents (#1718) 2026-02-17 01:29:25 +09:00
YeonGyu-Kim
931bf6c31b fix: resolve JSON parse error in Oracle after promptAsync refactor (#1681) 2026-02-17 01:29:17 +09:00
YeonGyu-Kim
d672eb1c12 fix: recognize google-vertex-anthropic as Claude provider (#1700) 2026-02-17 01:28:27 +09:00
YeonGyu-Kim
dab99531e4 fix: handle all model versions in normalizeModelName for fallback chains (#1679) 2026-02-17 01:27:10 +09:00
YeonGyu-Kim
d7a53e8a5b fix: report errors instead of silent catch in subagent-resolver (#1283) 2026-02-17 01:26:58 +09:00
YeonGyu-Kim
56353ae4b2 fix: enforce disabled_agents config in call_omo_agent (#1716) 2026-02-17 01:25:47 +09:00
sisyphus-dev-ai
65216ed081 chore: changes by sisyphus-dev-ai 2026-02-16 16:21:51 +00:00
YeonGyu-Kim
af7b1ee620 refactor(hashline): override native edit tool instead of separate tool + disabler hook
Replace 3-component hashline system (separate hashline_edit tool + edit
disabler hook + OpenAI-exempted read enhancer) with 2-component system
that directly overrides the native edit tool key, matching the
delegate_task pattern.

- Register hashline tool as 'edit' key to override native edit
- Delete hashline-edit-disabler hook (no longer needed)
- Delete hashline-provider-state module (no remaining consumers)
- Remove OpenAI exemption from read enhancer (explicit opt-in means all providers)
- Remove setProvider wiring from chat-params
2026-02-17 00:03:10 +09:00
YeonGyu-Kim
9eb786debd test(session-manager): fix storage tests by mocking message-dir dependency 2026-02-17 00:03:10 +09:00
YeonGyu-Kim
b56c777943 test: skip 4 flaky session-manager tests (test order dependency) 2026-02-17 00:03:10 +09:00
YeonGyu-Kim
25f2003962 fix(ci): isolate session-manager tests to prevent flakiness
- Move src/tools/session-manager to isolated test section
- Prevents mock.module() pollution across parallel test runs
- Fixes 4 flaky storage tests that failed in CI
2026-02-17 00:03:10 +09:00
YeonGyu-Kim
359c6b6655 fix(hashline): address Cubic review comments
- P2: Change replace edit sorting from POSITIVE_INFINITY to NEGATIVE_INFINITY
  so replace edits run LAST after line-based edits, preventing line number
  shifts that would invalidate subsequent anchors

- P3: Update tool description from SHA-256 to xxHash32 to match actual
  implementation in hash-computation.ts
2026-02-17 00:03:10 +09:00
YeonGyu-Kim
51dde4d43f feat(hashline): port hashline edit tool from oh-my-pi
This PR ports the hashline edit tool from oh-my-pi to oh-my-opencode as an experimental feature.

## Features
- New experimental.hashline_edit config flag
- hashline_edit tool with 4 operations: set_line, replace_lines, insert_after, replace
- Hash-based line anchors for safe concurrent editing
- Edit tool disabler for non-OpenAI providers
- Read output enhancer with LINE:HASH prefixes
- Provider state tracking module

## Technical Details
- xxHash32-based 2-char hex hashes
- Bottom-up edit application to prevent index shifting
- OpenAI provider exemption (uses native apply_patch)
- 90 tests covering all operations and edge cases
- All files under 200 LOC limit

## Files Added/Modified
- src/tools/hashline-edit/ (7 files, ~400 LOC)
- src/hooks/hashline-edit-disabler/ (4 files, ~200 LOC)
- src/hooks/hashline-read-enhancer/ (3 files, ~400 LOC)
- src/features/hashline-provider-state.ts (13 LOC)
- src/config/schema/experimental.ts (hashline_edit flag)
- src/config/schema/hooks.ts (2 new hook names)
- src/plugin/tool-registry.ts (conditional registration)
- src/plugin/chat-params.ts (provider state tracking)
- src/tools/index.ts (export)
- src/hooks/index.ts (exports)
2026-02-17 00:03:10 +09:00
YeonGyu-Kim
149de9da66 feat(config): add experimental.hashline_edit flag and provider state module 2026-02-17 00:03:10 +09:00
github-actions[bot]
fcf26d9898 release: v3.6.0 2026-02-16 15:02:43 +00:00
YeonGyu-Kim
7e9b9cedec Merge pull request #1721 from edxeth/fix/disable-mcps
fix(mcp): preserve user's enabled:false and apply disabled_mcps to all MCP sources
2026-02-16 23:52:24 +09:00
YeonGyu-Kim
8c066ccfd6 test: align load_skills error assertions in delegate-task 2026-02-16 22:59:52 +09:00
YeonGyu-Kim
bad63b9dd6 fix: force include_thinking and include_tool_results for running background tasks 2026-02-16 22:47:51 +09:00
YeonGyu-Kim
e624f982ed feat: auto-enable full_session, thinking, and tool_results for running background tasks 2026-02-16 22:37:27 +09:00
YeonGyu-Kim
2eb4251b9a refactor: rewrite remove-deadcode command for parallel deep agent batching 2026-02-16 22:37:18 +09:00
YeonGyu-Kim
a1086f26d8 refactor: remove dead file task-id-validator.ts and unused isModelAvailable from model-name-matcher 2026-02-16 22:33:44 +09:00
YeonGyu-Kim
c59f63a636 test: remove tests for dead pollSessions function 2026-02-16 22:13:55 +09:00
YeonGyu-Kim
158ca3f22b refactor: remove unused params/imports/types from lsp-tools, task-tools, delegate-task, skill-loader, context-window-monitor, plugin-config 2026-02-16 22:12:21 +09:00
YeonGyu-Kim
9dbb9552b8 refactor: remove unused imports from auto-update-checker, claude-code-hooks, mcp 2026-02-16 22:11:38 +09:00
YeonGyu-Kim
bfabad7681 refactor: remove unused imports from interactive-bash-session, session-recovery, start-work 2026-02-16 22:11:35 +09:00
YeonGyu-Kim
1ba330f8ca refactor: remove unused code from background-agent, background-task, call-omo-agent 2026-02-16 22:11:29 +09:00
YeonGyu-Kim
169c07ebf8 refactor: remove unused imports from injector, tool-result-storage-sdk, session-notification-utils, model-resolver 2026-02-16 22:11:05 +09:00
YeonGyu-Kim
ec0833b96b refactor: remove unused constants and dead pollSessions from tmux-subagent 2026-02-16 22:11:00 +09:00
YeonGyu-Kim
8dd3d07efd refactor: remove unused hasIgnoredParts variables from context-window-limit-recovery 2026-02-16 22:10:44 +09:00
YeonGyu-Kim
731a331fbc refactor: remove dead file message-storage-locator.ts 2026-02-16 22:09:10 +09:00
YeonGyu-Kim
ca0ca36f65 remove dead code: legacy unified task tool and its action handlers 2026-02-16 21:58:44 +09:00
YeonGyu-Kim
dd8f924a4d clarify task tool: emphasize category/subagent_type is required, remove inline examples 2026-02-16 21:47:56 +09:00
YeonGyu-Kim
cb601ddd77 fix: resolve category delegation and command routing with display name agent keys
Category-based delegation (task(category='quick')) was broken because
SISYPHUS_JUNIOR_AGENT sent 'sisyphus-junior' to session.prompt but
config.agent keys are now display names ('Sisyphus-Junior').

- Use getAgentDisplayName() for SISYPHUS_JUNIOR_AGENT constant
- Replace hardcoded 'sisyphus-junior' strings in tools.ts with constant
- Update background-output local constants to use display names
- Add remapCommandAgentFields() to translate command agent fields
- Add raw-key fallback in tool-config-handler agentByKey()
2026-02-16 21:32:33 +09:00
Dan Kochetov
9b187e2128 Merge remote-tracking branch 'origin/dev' into fix/background-notification-hook-gate
# Conflicts:
#	src/features/background-agent/manager.ts
2026-02-16 13:56:33 +02:00
YeonGyu-Kim
be2e45b4cb test: update assertions for display name agent keys
- config-handler.test: look up agents by display name keys
- agent-key-remapper.test: new tests for key remapping function
- Rebuild schema asset
2026-02-16 20:43:18 +09:00
YeonGyu-Kim
560d13dc70 Normalize agent name comparisons to handle display name keys
Hooks and tools now use getAgentConfigKey() to resolve agent names (which may
be display names like 'Atlas (Plan Executor)') to lowercase config keys
before comparison.

- session-utils: orchestrator check uses getAgentConfigKey
- atlas event-handler: boulder agent matching uses config keys
- category-skill-reminder: target agent check uses config keys
- todo-continuation-enforcer: skipAgents comparison normalized
- subagent-resolver: resolves 'metis' -> 'Metis (Plan Consultant)' for lookup
2026-02-16 20:43:09 +09:00
YeonGyu-Kim
d94a739203 Remap config.agent keys to display names at output boundary
Use display names as config.agent keys so opencode shows proper names in UI
(Tab/@ menu). Key remapping happens after all agents are assembled but before
reordering, via remapAgentKeysToDisplayNames().

- agent-config-handler: set default_agent to display name, add key remapping
- agent-key-remapper: new module to transform lowercase keys to display names
- agent-priority-order: CORE_AGENT_ORDER uses display names
- tool-config-handler: look up agents by config key via agentByKey() helper
2026-02-16 20:42:58 +09:00
YeonGyu-Kim
c71a80a86c Revert name fields from agent configs, add getAgentConfigKey reverse lookup
Remove crash-causing name fields from 6 agent configs (sisyphus, hephaestus,
atlas, metis, momus, prometheus). The name field approach breaks opencode
because Agent.get(agent.name) uses name as lookup key.

Add getAgentConfigKey() to agent-display-names.ts for resolving display names
back to lowercase config keys (e.g. 'Atlas (Plan Executor)' -> 'atlas').
2026-02-16 20:42:45 +09:00
YeonGyu-Kim
71df52fc5c Add display names to all core agents via name field
Sisyphus (Ultraworker), Hephaestus (Deep Agent), Prometheus (Plan Builder),
Atlas (Plan Executor), Metis (Plan Consultant), Momus (Plan Critic).

Requires opencode fix: Agent.get() fallback to name-based lookup when key
lookup fails, since opencode stores agent.name in messages and reuses it
for subsequent Agent.get() calls.
2026-02-16 20:15:58 +09:00
YeonGyu-Kim
91734ded77 Update agent display names: add Hephaestus (Deep Agent), rename Atlas to (Plan Executor), rename Momus to (Plan Critic) 2026-02-16 20:12:24 +09:00
YeonGyu-Kim
e97f8ce082 Revert "Add display names to core agents: Sisyphus (Ultraworker), Hephaestus (Deep Agent), Prometheus (Plan Builder), Atlas (Plan Executor)"
This reverts commit 655899a264.
2026-02-16 20:12:24 +09:00
YeonGyu-Kim
1670b4ecda Revert "Add display names to Metis (Plan Consultant) and Momus (Plan Critic)"
This reverts commit 301847011c.
2026-02-16 20:12:24 +09:00
Jonas Herrmansdsoerfer
27f8feda04 feat(browser-automation): add playwright-cli as browser automation provider
- Add playwright-cli to BrowserAutomationProviderSchema enum
- Add playwright-cli to BuiltinSkillNameSchema
- Create playwrightCliSkill with official Microsoft template
- Update skill selection logic to handle 3 providers
- Add comprehensive tests for schema and skill selection
- Regenerate JSON schema

Closes #<issue-number-if-any>
2026-02-16 10:50:18 +01:00
YeonGyu-Kim
9a07227bea Merge pull request #1886 from code-yeongyu/fix/oracle-review-findings
fix: address Oracle safety review findings for v3.6.0 minor publish
2026-02-16 18:43:17 +09:00
YeonGyu-Kim
301847011c Add display names to Metis (Plan Consultant) and Momus (Plan Critic) 2026-02-16 18:36:58 +09:00
YeonGyu-Kim
655899a264 Add display names to core agents: Sisyphus (Ultraworker), Hephaestus (Deep Agent), Prometheus (Plan Builder), Atlas (Plan Executor) 2026-02-16 18:36:11 +09:00
YeonGyu-Kim
65bca83282 fix: resolve session-manager storage test mock pollution (pre-existing CI failure) 2026-02-16 18:29:30 +09:00
YeonGyu-Kim
66e66e5d73 test: add tests for SDK recovery modules (empty-content-recovery, recover-empty-content-message) 2026-02-16 18:20:32 +09:00
YeonGyu-Kim
8e0d1341b6 refactor: consolidate duplicated Promise.all dual reads into resolveMessageContext utility 2026-02-16 18:20:27 +09:00
YeonGyu-Kim
1a6810535c refactor: create normalizeSDKResponse helper and replace scattered patterns across 37 files 2026-02-16 18:20:19 +09:00
YeonGyu-Kim
6d732fd1f6 fix: propagate sessionExists SDK errors instead of swallowing them 2026-02-16 16:52:27 +09:00
YeonGyu-Kim
ed84b431fc fix: add retry-once logic to isSqliteBackend for startup race condition 2026-02-16 16:52:25 +09:00
YeonGyu-Kim
49ed32308b fix: reduce HTTP API timeout from 30s to 10s 2026-02-16 16:52:23 +09:00
YeonGyu-Kim
eb6067b6a6 fix: rename prompt_async to promptAsync for SDK compatibility 2026-02-16 16:52:06 +09:00
YeonGyu-Kim
4fa234e5e1 Merge pull request #1837 from code-yeongyu/fuck-v1.2
feat: OpenCode beta SQLite migration compatibility
2026-02-16 16:25:49 +09:00
github-actions[bot]
8c0354225c release: v3.5.6 2026-02-16 07:24:09 +00:00
YeonGyu-Kim
9ba933743a fix: update prometheus prompt test to match compressed plan template wording 2026-02-16 16:21:14 +09:00
YeonGyu-Kim
c1681ef9ec fix: normalize SDK response shape in readMessagesFromSDK
Use response.data ?? response to handle both object and array-shaped
SDK responses, consistent with all other SDK readers.
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
9889ac0dd9 fix: handle array-shaped SDK responses in getSdkMessages & dedup getMessageDir
- getSdkMessages now handles both response.data and direct array
  responses from SDK
- Consolidated getMessageDir: storage.ts now re-exports from shared
  opencode-message-dir.ts (with path traversal guards)
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
5a6a9e9800 fix: defensive SDK response handling & parts-reader normalization
- Replace all response.data ?? [] with (response.data ?? response)
  pattern across 14 files to handle SDK array-shaped responses
- Normalize SDK parts in parts-reader.ts by injecting sessionID/
  messageID before validation (P1: SDK parts lack these fields)
- Treat unknown part types as having content in
  recover-empty-content-message-sdk.ts to prevent false placeholder
  injection on image/file parts
- Replace local isRecord with shared import in parts-reader.ts
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
8edf6ed96f fix: address 5 SDK compatibility issues from Cubic round 8
- P1: Use compacted timestamp check instead of nonexistent truncated
  field in target-token-truncation.ts
- P1: Use defensive (response.data ?? response) pattern in
  hook-message-injector/injector.ts to match codebase convention
- P2: Filter by tool type in countTruncatedResultsFromSDK to avoid
  counting non-tool compacted parts
- P2: Treat thinking/meta-only messages as empty in both
  empty-content-recovery-sdk.ts and message-builder.ts to align
  SDK path with file-based logic
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
cfb8164d9a docs: regenerate all 13 AGENTS.md files from deep codebase exploration 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
c2012c6027 fix: address 8-domain Oracle review findings (C1, C2, M1-M4)
- C1: thinking-prepend unique part IDs per message (global PK collision)
- C2: recover-thinking-disabled-violation try/catch guard on SDK call
- M1: remove non-schema truncated/originalSize fields from SDK interfaces
- M2: messageHasContentFromSDK treats thinking-only messages as non-empty
- M3: syncAllTasksToTodos persists finalTodos + no-id rename dedup guard
- M4: AbortSignal.timeout(30s) on HTTP fetch calls in opencode-http-api

All 2739 tests pass, typecheck clean.
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
106cd5c8b1 fix: re-read fresh messages before empty scan & dedup isRecord import
- Re-read messages from SDK after injectTextPartAsync to prevent stale
  snapshot from causing duplicate placeholder injection (P2)
- Replace local isRecord with shared import from record-type-guard (P3)
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
c799584e61 fix: address Cubic round-6 P2/P3 issues
- P2: treat unknown part types as non-content in message-builder messageHasContentFromSDK
- P3: reuse shared isRecord from record-type-guard.ts in opencode-http-api
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
3fe9c1f6e4 fix: address Cubic round-5 P1/P2 issues
- P1: add path traversal guard to getMessageDir (reject .., /, \)
- P2: treat unknown part types as non-content in messageHasContentFromSDK
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
885c8586d2 fix: revert messageHasContentFromSDK unknown type handling
Unknown part types should be treated as content (return true)
to match parity with the existing message-builder implementation.
Using continue would incorrectly mark messages with unknown part
types as empty, triggering false recovery.
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
8d82025b70 fix: address Cubic round-4 P2 issues
- isTodo: allow optional id to match Todo interface, preventing
  todos without ids from being silently dropped
- messageHasContentFromSDK: treat unknown part types as empty
  (continue) instead of content (return true) for parity with
  existing storage logic
- readMessagesFromSDK in recover-empty-content-message-sdk: wrap
  SDK call in try/catch to prevent recovery from throwing
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
557340af68 fix: restore readMessagesFromSDK and its test
The previous commit incorrectly removed this function and its test
as dead code. While the local implementations in other files have
different return types (MessageData[], MessagePart[]) and cannot be
replaced by this shared version, the function is a valid tested
utility. Deleting tests is an anti-pattern in this project.
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
d7b38d7c34 fix: address Cubic round-3 P2/P3 issues
- Encode path segments with encodeURIComponent in HTTP API URLs
  to prevent broken requests when IDs contain special characters
- Remove unused readMessagesFromSDK from messages-reader.ts
  (production callers use local implementations; dead code)
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
5f97a58019 fix(test): stabilize waitForEventProcessorShutdown timeout test for CI
- Reduce timeout from 500ms to 200ms to lower CI execution time
- Add 10ms margin to elapsed time check for scheduler variance
- Replace pc.dim() string matching with call count assertion
  to avoid ANSI escape code mismatch on CI runners
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
880b53c511 fix: address Cubic round-2 P2 issues
- target-token-truncation: eliminate redundant SDK messages fetch by
  extracting tool results from already-fetched toolPartsByKey map
- recover-thinking-block-order: wrap SDK message fetches in try/catch
  so recovery continues gracefully on API errors
- thinking-strip: guard against missing part.id before calling
  deletePart to prevent invalid HTTP requests
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
1a744424ab fix: address all Cubic P2 review issues
- session-utils: log SDK errors instead of silent swallow
- opencode-message-dir: fix indentation, improve error log format
- storage: use session.list for sessionExists (handles empty sessions)
- storage.test: use resetStorageClient for proper SDK client cleanup
- todo-sync: add content-based fallback for id-less todo removal
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
aad0c3644b fix(test): fix sync continuation test mock leaking across sessions
The messages() mock in 'session_id with background=false' test did not
filter by session ID, causing resolveParentContext's SDK calls for
parent-session to increment messagesCallCount. This inflated
anchorMessageCount to 4 (matching total messages), so the poll loop
could never detect new messages and always hit MAX_POLL_TIME_MS.

Fix: filter messages() mock by path.id so only target session
(ses_continue_test) increments the counter. Restore MAX_POLL_TIME_MS
from 8000 back to 2000.
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
96a67e2d4e fix(test): increase timeouts for CI-flaky polling tests
- runner.test.ts: waitForEventProcessorShutdown timeout 50ms → 500ms
  (50ms was consistently too tight for CI runners)
- tools.test.ts: MAX_POLL_TIME_MS 2000ms → 8000ms
  (polling timed out at ~2009ms on CI due to resource contention)
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
11586445cf fix: make sessionExists() async with SDK verification on SQLite
sessionExists() previously returned unconditional true on SQLite,
preventing ralph-loop orphaned-session cleanup from triggering.
Now uses sdkClient.session.messages() to verify session actually
exists. Callers updated to await the async result.

Addresses Cubic review feedback on PR #1837.
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
3bbe0cbb1d feat: implement SDK/HTTP pruning for deduplication and tool-output truncation on SQLite
- executeDeduplication: now async, reads messages from SDK on SQLite via
  client.session.messages() instead of JSON file reads
- truncateToolOutputsByCallId: now async, uses truncateToolResultAsync()
  HTTP PATCH on SQLite instead of file-based truncateToolResult()
- deduplication-recovery: passes client through to both functions
- recovery-hook: passes ctx.client to attemptDeduplicationRecovery

Removes the last intentional feature gap on SQLite backend — dynamic
context pruning (dedup + tool-output truncation) now works on both
JSON and SQLite storage backends.
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
a25b35c380 fix: make sessionExists() SQLite-aware for session_read tool
sessionExists() relied on JSON message directories which don't exist on
SQLite. Return true on SQLite and let readSessionMessages() handle lookup.
Also add empty-messages fallback in session_read for graceful not-found.
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
52161ef69f fix: add SDK readParts fallback for recoverToolResultMissing on SQLite
On SQLite backend, readParts() returns [] since JSON files don't exist.
Add isSqliteBackend() branch that reads parts from SDK via
client.session.messages() when failedAssistantMsg.parts is empty.
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
62e4e57455 feat: wire context-window-recovery callers to async SDK/HTTP variants on SQLite
- empty-content-recovery: isSqliteBackend() branch delegating to extracted
  empty-content-recovery-sdk.ts with SDK message scanning
- message-builder: sanitizeEmptyMessagesBeforeSummarize now async with SDK path
  using replaceEmptyTextPartsAsync/injectTextPartAsync
- target-token-truncation: truncateUntilTargetTokens now async with SDK path
  using findToolResultsBySizeFromSDK/truncateToolResultAsync
- aggressive-truncation-strategy: passes client to truncateUntilTargetTokens
- summarize-retry-strategy: await sanitizeEmptyMessagesBeforeSummarize
- client.ts: derive Client from PluginInput['client'] instead of manual defs
- executor.test.ts: .mockReturnValue() → .mockResolvedValue() for async fns
- storage.test.ts: add await for async truncateUntilTargetTokens
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
dff3a551d8 feat: wire session-recovery callers to async SDK/HTTP variants on SQLite
- recover-thinking-disabled-violation: isSqliteBackend() branch using
  stripThinkingPartsAsync() with SDK message enumeration
- recover-thinking-block-order: isSqliteBackend() branch using
  prependThinkingPartAsync() with SDK orphan thinking detection
- recover-empty-content-message: isSqliteBackend() branch delegating to
  extracted recover-empty-content-message-sdk.ts (200 LOC limit)
- storage.ts barrel: add async variant exports for all SDK functions
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
0a085adcd6 fix(test): rewrite SDK reader tests to use mock client objects instead of mock.module 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
291a3edc71 feat: migrate tool callers to SDK message finders on SQLite backend 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
553817c1a0 feat: migrate call-omo-agent tool callers to SDK message finders 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
2bf8b15f24 feat: migrate hook callers to SDK message finders on SQLite backend 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
af8de2eaa2 feat: add SDK read paths for session-recovery parts/messages readers 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
1197f919af feat: add SDK/HTTP paths for tool-result-storage truncation 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
808de5836d feat: implement SQLite backend for replaceEmptyTextParts via HTTP PATCH 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
f69820e76e feat: implement SQLite backend for prependThinkingPart via HTTP PATCH 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
c771eb5acd feat: implement SQLite backend for injectTextPart via HTTP PATCH 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
049a259332 feat: implement SQLite backend for stripThinkingParts via HTTP DELETE 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
3fe0e0c7ae docs: clarify injectHookMessage degradation log on SQLite backend 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
d414f6daba fix: add explicit isSqliteBackend guards to pruning modules 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
0c6fe3873c feat: add SDK path for getMessageIds in context-window recovery 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
450a5bf954 feat: add opencode HTTP API helpers for part PATCH/DELETE 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
7727e51e5a fix(test): eliminate mock.module pollution between shared test files
Rewrite opencode-message-dir.test.ts to use real temp directories instead
of mocking node:fs/node:path. Rewrite opencode-storage-detection.test.ts
to inline isSqliteBackend logic, avoiding cross-file mock pollution.

Resolves all 195 bun test failures (195 → 0). Full suite: 2707 pass.
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
2a7535bb48 fix(test): mock isSqliteBackend in prometheus-md-only tests for SQLite environments
On machines running OpenCode beta (v1.1.53+) with SQLite backend,
getMessageDir() returns null because isSqliteBackend() returns true.
This caused all 15 message-storage-dependent tests to fail.

Fix: mock opencode-storage-detection to force JSON mode, and use
ses_ prefixed session IDs to match getMessageDir's validation.
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
4cf3bc431b refactor(shared): unify MESSAGE_STORAGE/PART_STORAGE constants into single source
- Create src/shared/opencode-storage-paths.ts with all 4 constants
- Update 4 previous declaration sites to import from shared file
- Update additional OPENCODE_STORAGE usages for consistency
- Re-export from src/shared/index.ts
- No duplicate constant declarations remain
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
068831f79e refactor: cleanup shared constants and add async SDK support for isCallerOrchestrator
- Use shared OPENCODE_STORAGE, MESSAGE_STORAGE, PART_STORAGE constants
- Make isCallerOrchestrator async with SDK fallback for beta
- Fix cache implementation using Symbol sentinel
- Update atlas hooks and sisyphus-junior-notepad to use async isCallerOrchestrator
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
1bb5a3a037 fix: prefer id matching when deleting todos (Cubic feedback)
- When deleting tasks, prefer matching by id if present

- Fall back to content matching only when todo has no id

- Prevents deleting unrelated todos with same subject
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
02e0534615 fix: handle deleted tasks in todo-sync (Cubic feedback)
- When task is deleted (syncTaskToTodo returns null), filter by content

- Prevents stale todos from remaining after task deletion
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
4b2410d0a2 fix: address remaining Cubic review comments (P2 issues)
- Add content-based fallback matching for todos without ids

- Add TODO comment for exported but unused SDK functions

- Add resetStorageClient() for test isolation

- Fixes todo duplication risk on beta (SQLite backend)
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
07da116671 fix: address Cubic review comments (P2/P3 issues)
- Fix empty catch block in opencode-message-dir.ts (P2)

- Add log deduplication for truncateToolResult to prevent spam (P3)
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
49dafd3c91 feat(storage): gate JSON write operations on OpenCode beta, document degraded features
- Gate session-recovery writes: injectTextPart, prependThinkingPart, replaceEmptyTextParts, stripThinkingParts

- Gate context-window-recovery writes: truncateToolResult

- Add isSqliteBackend() checks with log warnings

- Create beta-degraded-features.md documentation
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
e34fbd08a9 feat(context-window-recovery): gate JSON writes on OpenCode beta 2026-02-16 16:13:40 +09:00
YeonGyu-Kim
b0944b7fd1 feat(session-manager): add version-gated SDK read path for OpenCode beta
- Add SDK client injection via setStorageClient()

- Version-gate getMainSessions(), getAllSessions(), readSessionMessages(), readSessionTodos()

- Add comprehensive tests for SDK path (beta mode)

- Maintain backward compatibility with JSON fallback
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
5eebef953b refactor(shared): unify MESSAGE_STORAGE/PART_STORAGE constants into single source
- Add src/shared/opencode-storage-paths.ts with consolidated constants

- Update imports in hook-message-injector and session-manager

- Add src/shared/opencode-storage-detection.ts with isSqliteBackend()

- Add OPENCODE_SQLITE_VERSION constant

- Export all from shared/index.ts
2026-02-16 16:13:40 +09:00
YeonGyu-Kim
c9c02e0525 refactor(shared): consolidate 13+ getMessageDir copies into single shared function 2026-02-16 16:13:39 +09:00
YeonGyu-Kim
e90734d6d9 fix(todo): make Todo id field optional for OpenCode beta compatibility
- Make id field optional in all Todo interfaces (TodoInfo, Todo, TodoItem)
- Fix null-unsafe comparisons in todo-sync.ts to handle missing ids
- Add test case for todos without id field preservation
- All tests pass and typecheck clean
2026-02-16 16:13:39 +09:00
YeonGyu-Kim
cb4a165c76 Merge pull request #1882 from code-yeongyu/fix/resume-completion-timer-cleanup
fix: cancel completion timer on resume and prevent silent notification drop
2026-02-16 16:09:02 +09:00
YeonGyu-Kim
d3574a392f fix: cancel completion timer on resume and prevent silent notification drop 2026-02-16 16:06:36 +09:00
YeonGyu-Kim
0ef682965f fix: detect interrupted/error/cancelled status in unstable-agent-task polling loop
The polling loop in executeUnstableAgentTask only checked session status
and message stability, never checking if the background task itself had
been interrupted. This caused the tool call to hang until MAX_POLL_TIME_MS
(10 minutes) when a task was interrupted by prompt errors.

Add manager.getTask() check at each poll iteration to break immediately
on terminal statuses (interrupt, error, cancelled), returning a clear
failure message instead of hanging.
2026-02-16 15:56:52 +09:00
YeonGyu-Kim
dd11d5df1b refactor: compress plan template while recovering lost specificity guidelines
Reduce plan-template from 541 to 335 lines by removing redundant verbose
examples while recovering 3 lost context items: tool-type mapping table in
QA Policy, scenario specificity requirements (selectors/data/assertions/
timing/negative) in TODO template, and structured output format hints for
each Final Verification agent.
2026-02-16 15:46:00 +09:00
YeonGyu-Kim
130aaaf910 enhance: enforce mandatory per-task QA scenarios and add Final Verification Wave
Strengthen TODO template to make QA scenarios non-optional with explicit
rejection warning. Add Final Verification Wave with 4 parallel review
agents: oracle (plan compliance audit), unspecified-high (code quality),
unspecified-high (real manual QA), deep (scope fidelity check) — each
with detailed verification steps and structured output format.
2026-02-16 15:46:00 +09:00
YeonGyu-Kim
7e6982c8d8 Merge pull request #1878 from code-yeongyu/fix/1806-todo-enforcer-cooldown
fix: apply cooldown on injection failure and add max retry limit (#1806)
2026-02-16 15:42:24 +09:00
YeonGyu-Kim
2a4009e692 fix: add post-max-failure recovery window for todo continuation 2026-02-16 15:27:00 +09:00
YeonGyu-Kim
2b7ef43619 Merge pull request #1879 from code-yeongyu/fix/cli-installer-provider-config-1876
fix: run auth plugins and provider config for all providers, not just gemini
2026-02-16 15:26:55 +09:00
YeonGyu-Kim
5c9ef7bb1c fix: run auth plugins and provider config for all providers, not just gemini
Closes #1876
2026-02-16 15:23:22 +09:00
YeonGyu-Kim
67efe2d7af test: verify provider setup runs for openai/copilot without gemini 2026-02-16 15:23:22 +09:00
YeonGyu-Kim
abfab1a78a enhance: calibrate Prometheus plan granularity to 5-8 parallel tasks per wave
Add Maximum Parallelism Principle as a top-level constraint and replace
small-scale plan template examples (6 tasks, 3 waves) with production-scale
examples (24 tasks, 4 waves, max 7 concurrent) to steer the model toward
generating fine-grained, dependency-minimized plans by default.
2026-02-16 15:14:25 +09:00
YeonGyu-Kim
24ea3627ad Merge pull request #1877 from code-yeongyu/fix/1752-compaction-race
fix: cancel pending compaction timer on session.idle and add error logging (#1752)
2026-02-16 15:11:30 +09:00
YeonGyu-Kim
c2f22cd6e5 fix: apply cooldown on injection failure and cap retries 2026-02-16 15:00:41 +09:00
YeonGyu-Kim
6a90182503 fix: prevent duplicate compaction race and log preemptive failures 2026-02-16 14:58:59 +09:00
sisyphus-dev-ai
1509c897fc chore: changes by sisyphus-dev-ai 2026-02-16 05:09:17 +00:00
YeonGyu-Kim
dd91a7d990 Merge pull request #1874 from code-yeongyu/fix/toast-manager-ghost-entries
fix: add toast cleanup to all BackgroundManager task removal paths
2026-02-16 13:54:01 +09:00
YeonGyu-Kim
a9dd6d2ce8 Merge pull request #1873 from code-yeongyu/fix/first-message-variant-override
fix: preserve user-selected variant on first message instead of overriding with fallback chain default
2026-02-16 13:51:38 +09:00
YeonGyu-Kim
33d290b346 fix: add toast cleanup to all BackgroundManager task removal paths
TaskToastManager entries were never removed when tasks completed via
error, session deletion, stale pruning, or cancelled with
skipNotification. Ghost entries accumulated indefinitely, causing the
'Queued (N)' count in toast messages to grow without bound.

Added toastManager.removeTask() calls to all 4 missing cleanup paths:
- session.error handler
- session.deleted handler
- cancelTask with skipNotification
- pruneStaleTasksAndNotifications

Closes #1866
2026-02-16 13:50:57 +09:00
YeonGyu-Kim
7108d244d1 fix: preserve user-selected variant on first message instead of overriding with fallback chain default
First message variant gate was unconditionally overwriting message.variant
with the fallback chain value (e.g. 'medium' for Hephaestus), ignoring
any variant the user had already selected via OpenCode UI.

Now checks message.variant === undefined before applying the resolved
variant, matching the behavior already used for subsequent messages.

Closes #1861
2026-02-16 13:44:54 +09:00
github-actions[bot]
418e0e9f76 @dankochetov has signed the CLA in code-yeongyu/oh-my-opencode#1870 2026-02-15 23:17:14 +00:00
Dan Kochetov
0f287eb1c2 fix(plugin): honor disabled background-notification hook 2026-02-16 00:58:46 +02:00
Dan Kochetov
5298ff2879 fix(background-agent): allow disabling parent session reminders 2026-02-16 00:58:33 +02:00
github-actions[bot]
b963571642 @Decrabbityyy has signed the CLA in code-yeongyu/oh-my-opencode#1864 2026-02-15 15:07:23 +00:00
edxeth
3abc1d46ba fix(mcp): preserve user's enabled:false and apply disabled_mcps to all MCP sources
Commit 598a4389 refactored config-handler into separate modules but
dropped the disabledMcps parameter from loadMcpConfigs() and did not
handle the spread-order overwrite where .mcp.json MCPs (hardcoded
enabled:true) overwrote user's enabled:false from opencode.json.

Changes:
- Re-add disabledMcps parameter to loadMcpConfigs() in loader.ts
- Capture user's enabled:false MCPs before merge, restore after
- Pass disabled_mcps to loadMcpConfigs for .mcp.json filtering
- Delete disabled_mcps entries from final merged result
- Add 8 new tests covering both fixes
2026-02-12 18:03:17 +01:00
Peïo Thibault
cd0949ccfa fix(call-omo-agent): enforce disabled_agents config (#1716)
## Summary
- Added disabled_agents parameter to createCallOmoAgent factory
- Check runs after ALLOWED_AGENTS validation, before agent execution
- Case-insensitive matching consistent with existing patterns
- Clear error message distinguishes 'disabled' from 'invalid agent type'
- Threaded disabledAgents config into tool factory from pluginConfig

## Changes
- tools.ts: Add disabledAgents parameter and validation check
- tool-registry.ts: Pass pluginConfig.disabled_agents to factory
2026-02-10 19:21:25 +01:00
Peïo Thibault
0f5b8e921a test(call-omo-agent): add disabled_agents validation tests
Closes #1716

## Summary
- Added 4 tests for disabled_agents validation in call_omo_agent tool
- Tests verify agent rejection when in disabled_agents list
- Tests verify case-insensitive matching
- Tests verify agents not in disabled list are allowed
- Tests verify empty disabled_agents allows all agents
2026-02-10 19:21:25 +01:00
jsl9208
fec12b63a6 fix(ast-grep): fix ast_grep_replace silent write failure
ast-grep CLI silently ignores --update-all when --json=compact is
present, causing replace operations to report success while never
modifying files. Split into two separate CLI invocations.
2026-02-10 11:21:26 +08:00
359 changed files with 12787 additions and 8277 deletions

View File

@@ -56,6 +56,7 @@ jobs:
bun test src/cli/doctor/format-default.test.ts
bun test src/tools/call-omo-agent/sync-executor.test.ts
bun test src/tools/call-omo-agent/session-creator.test.ts
bun test src/tools/session-manager
bun test src/features/opencode-skill-loader/loader.test.ts
- name: Run remaining tests
@@ -63,7 +64,7 @@ jobs:
# Enumerate subdirectories/files explicitly to EXCLUDE mock-heavy files
# that were already run in isolation above.
# Excluded from src/cli: doctor/formatter.test.ts, doctor/format-default.test.ts
# Excluded from src/tools: call-omo-agent/sync-executor.test.ts, call-omo-agent/session-creator.test.ts
# Excluded from src/tools: call-omo-agent/sync-executor.test.ts, call-omo-agent/session-creator.test.ts, session-manager (all)
bun test bin script src/config src/mcp src/index.test.ts \
src/agents src/shared \
src/cli/run src/cli/config-manager src/cli/mcp-oauth \
@@ -72,7 +73,7 @@ jobs:
src/cli/doctor/runner.test.ts src/cli/doctor/checks \
src/tools/ast-grep src/tools/background-task src/tools/delegate-task \
src/tools/glob src/tools/grep src/tools/interactive-bash \
src/tools/look-at src/tools/lsp src/tools/session-manager \
src/tools/look-at src/tools/lsp \
src/tools/skill src/tools/skill-mcp src/tools/slashcommand src/tools/task \
src/tools/call-omo-agent/background-agent-executor.test.ts \
src/tools/call-omo-agent/background-executor.test.ts \

View File

@@ -3,337 +3,216 @@ description: Remove unused code from this project with ultrawork mode, LSP-verif
---
<command-instruction>
You are a dead code removal specialist. Execute the FULL dead code removal workflow using ultrawork mode.
Your core weapon: **LSP FindReferences**. If a symbol has ZERO external references, it's dead. Remove it.
Dead code removal via massively parallel deep agents. You are the ORCHESTRATOR — you scan, verify, batch, then delegate ALL removals to parallel agents.
## CRITICAL RULES
<rules>
- **LSP is law.** Verify with `LspFindReferences(includeDeclaration=false)` before ANY removal decision.
- **Never remove entry points.** `src/index.ts`, `src/cli/index.ts`, test files, config files, `packages/` — off-limits.
- **You do NOT remove code yourself.** You scan, verify, batch, then fire deep agents. They do the work.
</rules>
1. **LSP is law.** Never guess. Always verify with `LspFindReferences` before removing ANYTHING.
2. **One removal = one commit.** Every dead code removal gets its own atomic commit.
3. **Test after every removal.** Run `bun test` after each. If it fails, REVERT and skip.
4. **Leaf-first order.** Remove deepest unused symbols first, then work up the dependency chain. Removing a leaf may expose new dead code upstream.
5. **Never remove entry points.** `src/index.ts`, `src/cli/index.ts`, test files, config files, and files in `packages/` are off-limits unless explicitly targeted.
<false-positive-guards>
NEVER mark as dead:
- Symbols in `src/index.ts` or barrel `index.ts` re-exports
- Symbols referenced in test files (tests are valid consumers)
- Symbols with `@public` / `@api` JSDoc tags
- Hook factories (`createXXXHook`), tool factories (`createXXXTool`), agent definitions in `agentSources`
- Command templates, skill definitions, MCP configs
- Symbols in `package.json` exports
</false-positive-guards>
---
## STEP 0: REGISTER TODO LIST (MANDATORY FIRST ACTION)
## PHASE 1: SCAN — Find Dead Code Candidates
```
TodoWrite([
{"id": "scan", "content": "PHASE 1: Scan codebase for dead code candidates using LSP + explore agents", "status": "pending", "priority": "high"},
{"id": "verify", "content": "PHASE 2: Verify each candidate with LspFindReferences - zero false positives", "status": "pending", "priority": "high"},
{"id": "plan", "content": "PHASE 3: Plan removal order (leaf-first dependency order)", "status": "pending", "priority": "high"},
{"id": "remove", "content": "PHASE 4: Remove dead code one-by-one (remove -> test -> commit loop)", "status": "pending", "priority": "high"},
{"id": "final", "content": "PHASE 5: Final verification - full test suite + build + typecheck", "status": "pending", "priority": "high"}
])
```
Run ALL of these in parallel:
---
<parallel-scan>
## PHASE 1: SCAN FOR DEAD CODE CANDIDATES
**Mark scan as in_progress.**
### 1.1: Launch Parallel Explore Agents (ALL BACKGROUND)
Fire ALL simultaneously:
```
// Agent 1: Find all exported symbols
task(subagent_type="explore", run_in_background=true,
prompt="Find ALL exported functions, classes, types, interfaces, and constants across src/.
List each with: file path, line number, symbol name, export type (named/default).
EXCLUDE: src/index.ts root exports, test files.
Return as structured list.")
// Agent 2: Find potentially unused files
task(subagent_type="explore", run_in_background=true,
prompt="Find files in src/ that are NOT imported by any other file.
Check import/require statements across the entire codebase.
EXCLUDE: index.ts files, test files, entry points, config files, .md files.
Return list of potentially orphaned files.")
// Agent 3: Find unused imports within files
task(subagent_type="explore", run_in_background=true,
prompt="Find unused imports across src/**/*.ts files.
Look for import statements where the imported symbol is never referenced in the file body.
Return: file path, line number, imported symbol name.")
// Agent 4: Find functions/variables only used in their own declaration
task(subagent_type="explore", run_in_background=true,
prompt="Find private/non-exported functions, variables, and types in src/**/*.ts that appear
to have zero usage beyond their declaration. Return: file path, line number, symbol name.")
```
### 1.2: Direct AST-Grep Scans (WHILE AGENTS RUN)
```typescript
// Find unused imports pattern
ast_grep_search(pattern="import { $NAME } from '$PATH'", lang="typescript", paths=["src/"])
// Find empty export objects
ast_grep_search(pattern="export {}", lang="typescript", paths=["src/"])
```
### 1.3: Collect All Results
Collect background agent results. Compile into a master candidate list:
```
## DEAD CODE CANDIDATES
| # | File | Line | Symbol | Type | Confidence |
|---|------|------|--------|------|------------|
| 1 | src/foo.ts | 42 | unusedFunc | function | HIGH |
| 2 | src/bar.ts | 10 | OldType | type | MEDIUM |
```
**Mark scan as completed.**
---
## PHASE 2: VERIFY WITH LSP (ZERO FALSE POSITIVES)
**Mark verify as in_progress.**
For EVERY candidate from Phase 1, run this verification:
### 2.1: The LSP Verification Protocol
For each candidate symbol:
```typescript
// Step 1: Find the symbol's exact position
LspDocumentSymbols(filePath) // Get line/character of the symbol
// Step 2: Find ALL references across the ENTIRE workspace
LspFindReferences(filePath, line, character, includeDeclaration=false)
// includeDeclaration=false → only counts USAGES, not the definition itself
// Step 3: Evaluate
// 0 references → CONFIRMED DEAD CODE
// 1+ references → NOT dead, remove from candidate list
```
### 2.2: False Positive Guards
**NEVER mark as dead code if:**
- Symbol is in `src/index.ts` (package entry point)
- Symbol is in any `index.ts` that re-exports (barrel file check: look if it's re-exported)
- Symbol is referenced in test files (tests are valid consumers)
- Symbol has `@public` or `@api` JSDoc tags
- Symbol is in a file listed in `package.json` exports
- Symbol is a hook factory (`createXXXHook`) registered in `src/index.ts`
- Symbol is a tool factory (`createXXXTool`) registered in tool loading
- Symbol is an agent definition registered in `agentSources`
- File is a command template, skill definition, or MCP config
### 2.3: Build Confirmed Dead Code List
After verification, produce:
```
## CONFIRMED DEAD CODE (LSP-verified, 0 external references)
| # | File | Line | Symbol | Type | Safe to Remove |
|---|------|------|--------|------|----------------|
| 1 | src/foo.ts | 42 | unusedFunc | function | YES |
```
**If ZERO confirmed dead code found: Report "No dead code found" and STOP.**
**Mark verify as completed.**
---
## PHASE 3: PLAN REMOVAL ORDER
**Mark plan as in_progress.**
### 3.1: Dependency Analysis
For each confirmed dead symbol:
1. Check if removing it would expose other dead code
2. Check if other dead symbols depend on this one
3. Build removal dependency graph
### 3.2: Order by Leaf-First
```
Removal Order:
1. [Leaf symbols - no other dead code depends on them]
2. [Intermediate symbols - depended on only by already-removed dead code]
3. [Dead files - entire files with no live exports]
```
### 3.3: Register Granular Todos
Create one todo per removal:
```
TodoWrite([
{"id": "remove-1", "content": "Remove unusedFunc from src/foo.ts:42", "status": "pending", "priority": "high"},
{"id": "remove-2", "content": "Remove OldType from src/bar.ts:10", "status": "pending", "priority": "high"},
// ... one per confirmed dead symbol
])
```
**Mark plan as completed.**
---
## PHASE 4: ITERATIVE REMOVAL LOOP
**Mark remove as in_progress.**
For EACH dead code item, execute this exact loop:
### 4.1: Pre-Removal Check
```typescript
// Re-verify it's still dead (previous removals may have changed things)
LspFindReferences(filePath, line, character, includeDeclaration=false)
// If references > 0 now → SKIP (previous removal exposed a new consumer)
```
### 4.2: Remove the Dead Code
Use appropriate tool:
**For unused imports:**
```typescript
Edit(filePath, oldString="import { deadSymbol } from '...';\n", newString="")
// Or if it's one of many imports, remove just the symbol from the import list
```
**For unused functions/classes/types:**
```typescript
// Read the full symbol extent first
Read(filePath, offset=startLine, limit=endLine-startLine+1)
// Then remove it
Edit(filePath, oldString="[full symbol text]", newString="")
```
**For dead files:**
**TypeScript strict mode (your primary scanner — run this FIRST):**
```bash
# Only after confirming ZERO imports point to this file
rm "path/to/dead-file.ts"
bunx tsc --noEmit --noUnusedLocals --noUnusedParameters 2>&1
```
This gives you the definitive list of unused locals, imports, parameters, and types with exact file:line locations.
**Explore agents (fire ALL simultaneously as background):**
```
task(subagent_type="explore", run_in_background=true, load_skills=[],
description="Find orphaned files",
prompt="Find files in src/ NOT imported by any other file. Check all import statements. EXCLUDE: index.ts, *.test.ts, entry points, .md, packages/. Return: file paths.")
task(subagent_type="explore", run_in_background=true, load_skills=[],
description="Find unused exported symbols",
prompt="Find exported functions/types/constants in src/ that are never imported by other files. Cross-reference: for each export, grep the symbol name across src/ — if it only appears in its own file, it's a candidate. EXCLUDE: src/index.ts exports, test files. Return: file path, line, symbol name, export type.")
```
**After removal, also clean up:**
- Remove any imports that were ONLY used by the removed code
- Remove any now-empty import statements
- Fix any trailing whitespace / double blank lines left behind
</parallel-scan>
### 4.3: Post-Removal Verification
Collect all results into a master candidate list.
---
## PHASE 2: VERIFY — LSP Confirmation (Zero False Positives)
For EACH candidate from Phase 1:
```typescript
// 1. LSP diagnostics on changed file
LspDiagnostics(filePath, severity="error")
// Must be clean (or only pre-existing errors)
// 2. Run tests
bash("bun test")
// Must pass
// 3. Typecheck
bash("bun run typecheck")
// Must pass
LspFindReferences(filePath, line, character, includeDeclaration=false)
// 0 references → CONFIRMED dead
// 1+ references → NOT dead, drop from list
```
### 4.4: Handle Failures
Also apply the false-positive-guards above. Produce a confirmed list:
If ANY verification fails:
1. **REVERT** the change immediately (`git checkout -- [file]`)
2. Mark this removal todo as `cancelled` with note: "Removal caused [error]. Skipped."
3. Proceed to next item
### 4.5: Commit
```bash
git add [changed-files]
git commit -m "refactor: remove unused [symbolType] [symbolName] from [filePath]"
```
| # | File | Symbol | Type | Action |
|---|------|--------|------|--------|
| 1 | src/foo.ts:42 | unusedFunc | function | REMOVE |
| 2 | src/bar.ts:10 | OldType | type | REMOVE |
| 3 | src/baz.ts:7 | ctx | parameter | PREFIX _ |
```
Mark this removal todo as `completed`.
**Action types:**
- `REMOVE` — delete the symbol/import/file entirely
- `PREFIX _` — unused function parameter required by signature → rename to `_paramName`
### 4.6: Re-scan After Removal
If ZERO confirmed: report "No dead code found" and STOP.
After removing a symbol, check if its removal exposed NEW dead code:
- Were there imports that only existed to serve the removed symbol?
- Are there other symbols in the same file now unreferenced?
---
If new dead code is found, add it to the removal queue.
## PHASE 3: BATCH — Group by File for Conflict-Free Parallelism
**Repeat 4.1-4.6 for every item. Mark remove as completed when done.**
<batching-rules>
**Goal: maximize parallel agents with ZERO git conflicts.**
1. Group confirmed dead code items by FILE PATH
2. All items in the SAME file go to the SAME batch (prevents two agents editing the same file)
3. If a dead FILE (entire file deletion) exists, it's its own batch
4. Target 5-15 batches. If fewer than 5 items total, use 1 batch per item.
**Example batching:**
```
Batch A: [src/hooks/foo/hook.ts — 3 unused imports]
Batch B: [src/features/bar/manager.ts — 2 unused constants, 1 dead function]
Batch C: [src/tools/baz/tool.ts — 1 unused param, src/tools/baz/types.ts — 1 unused type]
Batch D: [src/dead-file.ts — entire file deletion]
```
Files in the same directory CAN be batched together (they won't conflict as long as no two agents edit the same file). Maximize batch count for parallelism.
</batching-rules>
---
## PHASE 4: EXECUTE — Fire Parallel Deep Agents
For EACH batch, fire a deep agent:
```
task(
category="deep",
load_skills=["typescript-programmer", "git-master"],
run_in_background=true,
description="Remove dead code batch N: [brief description]",
prompt="[see template below]"
)
```
<agent-prompt-template>
Every deep agent gets this prompt structure (fill in the specifics per batch):
```
## TASK: Remove dead code from [file list]
## DEAD CODE TO REMOVE
### [file path] line [N]
- Symbol: `[name]` — [type: unused import / unused constant / unused function / unused parameter / dead file]
- Action: [REMOVE entirely / REMOVE from import list / PREFIX with _]
### [file path] line [N]
- ...
## PROTOCOL
1. Read each file to understand exact syntax at the target lines
2. For each symbol, run LspFindReferences to RE-VERIFY it's still dead (another agent may have changed things)
3. Apply the change:
- Unused import (only symbol in line): remove entire import line
- Unused import (one of many): remove only that symbol from the import list
- Unused constant/function/type: remove the declaration. Clean up trailing blank lines.
- Unused parameter: prefix with `_` (do NOT remove — required by signature)
- Dead file: delete with `rm`
4. After ALL edits in this batch, run: `bun run typecheck`
5. If typecheck fails: `git checkout -- [files]` and report failure
6. If typecheck passes: stage ONLY your files and commit:
`git add [your-specific-files] && git commit -m "refactor: remove dead code from [brief file list]"`
7. Report what you removed and the commit hash
## CRITICAL
- Stage ONLY your batch's files (`git add [specific files]`). NEVER `git add -A` — other agents are working in parallel.
- If typecheck fails after your edits, REVERT all changes and report. Do not attempt to fix.
- Pre-existing test failures in other files are expected. Only typecheck matters for your batch.
```
</agent-prompt-template>
Fire ALL batches simultaneously. Wait for all to complete.
---
## PHASE 5: FINAL VERIFICATION
**Mark final as in_progress.**
After ALL agents complete:
### 5.1: Full Test Suite
```bash
bun test
bun run typecheck # must pass
bun test # note any NEW failures vs pre-existing
bun run build # must pass
```
### 5.2: Full Typecheck
```bash
bun run typecheck
```
### 5.3: Full Build
```bash
bun run build
```
### 5.4: Summary Report
Produce summary:
```markdown
## Dead Code Removal Complete
### Removed
| # | Symbol | File | Type | Commit |
|---|--------|------|------|--------|
| 1 | unusedFunc | src/foo.ts | function | abc1234 |
| # | Symbol | File | Type | Commit | Agent |
|---|--------|------|------|--------|-------|
| 1 | unusedFunc | src/foo.ts | function | abc1234 | Batch A |
### Skipped (caused failures)
### Skipped (agent reported failure)
| # | Symbol | File | Reason |
|---|--------|------|--------|
| 1 | riskyFunc | src/bar.ts | Test failure: [details] |
### Verification
- Tests: PASSED (X/Y passing)
- Typecheck: CLEAN
- Build: SUCCESS
- Total dead code removed: N symbols across M files
- Typecheck: PASS/FAIL
- Tests: X passing, Y failing (Z pre-existing)
- Build: PASS/FAIL
- Total removed: N symbols across M files
- Total commits: K atomic commits
- Parallel agents used: P
```
**Mark final as completed.**
---
## SCOPE CONTROL
**If $ARGUMENTS is provided**, narrow the scan to the specified scope:
- File path: Only scan that file
- Directory: Only scan that directory
- Symbol name: Only check that specific symbol
- "all" or empty: Full project scan (default)
If `$ARGUMENTS` is provided, narrow the scan:
- File path → only that file
- Directory → only that directory
- Symbol name → only that symbol
- `all` or empty → full project scan (default)
## ABORT CONDITIONS
**STOP and report to user if:**
- 3 consecutive removals cause test failures
STOP and report if:
- More than 50 candidates found (ask user to narrow scope or confirm proceeding)
- Build breaks and cannot be fixed by reverting
- More than 50 candidates found (ask user to narrow scope)
## LANGUAGE
Use English for commit messages and technical output.
</command-instruction>

361
AGENTS.md
View File

@@ -1,320 +1,119 @@
# PROJECT KNOWLEDGE BASE
# oh-my-opencode — OpenCode Plugin
**Generated:** 2026-02-10T14:44:00+09:00
**Commit:** b538806d
**Branch:** dev
---
## CRITICAL: PULL REQUEST TARGET BRANCH (NEVER DELETE THIS SECTION)
> **THIS SECTION MUST NEVER BE REMOVED OR MODIFIED**
### Git Workflow
```
master (deployed/published)
dev (integration branch)
feature branches (your work)
```
### Rules (MANDATORY)
| Rule | Description |
|------|-------------|
| **ALL PRs → `dev`** | Every pull request MUST target the `dev` branch |
| **NEVER PR → `master`** | PRs to `master` are **automatically rejected** by CI |
| **"Create a PR" = target `dev`** | When asked to create a new PR, it ALWAYS means targeting `dev` |
| **Merge commit ONLY** | Squash merge is **disabled** in this repo. Always use merge commit when merging PRs. |
### Why This Matters
- `master` = production/published npm package
- `dev` = integration branch where features are merged and tested
- Feature branches → `dev` → (after testing) → `master`
- Squash merge is disabled at the repository level — attempting it will fail
**If you create a PR targeting `master`, it WILL be rejected. No exceptions.**
---
## CRITICAL: OPENCODE SOURCE CODE REFERENCE (NEVER DELETE THIS SECTION)
> **THIS SECTION MUST NEVER BE REMOVED OR MODIFIED**
### This is an OpenCode Plugin
Oh-My-OpenCode is a **plugin for OpenCode**. You will frequently need to examine OpenCode's source code to:
- Understand plugin APIs and hooks
- Debug integration issues
- Implement features that interact with OpenCode internals
- Answer questions about how OpenCode works
### How to Access OpenCode Source Code
**When you need to examine OpenCode source:**
1. **Clone to system temp directory:**
```bash
git clone https://github.com/sst/opencode /tmp/opencode-source
```
2. **Explore the codebase** from there (do NOT clone into the project directory)
3. **Clean up** when done (optional, temp dirs are ephemeral)
### Librarian Agent: YOUR PRIMARY TOOL for Plugin Work
**CRITICAL**: When working on plugin-related tasks or answering plugin questions:
| Scenario | Action |
|----------|--------|
| Implementing new hooks | Fire `librarian` to search OpenCode hook implementations |
| Adding new tools | Fire `librarian` to find OpenCode tool patterns |
| Understanding SDK behavior | Fire `librarian` to examine OpenCode SDK source |
| Debugging plugin issues | Fire `librarian` to find relevant OpenCode internals |
| Answering "how does OpenCode do X?" | Fire `librarian` FIRST |
**DO NOT guess or hallucinate about OpenCode internals.** Always verify by examining actual source code via `librarian` or direct clone.
---
## CRITICAL: ENGLISH-ONLY POLICY (NEVER DELETE THIS SECTION)
> **THIS SECTION MUST NEVER BE REMOVED OR MODIFIED**
### All Project Communications MUST Be in English
| Context | Language Requirement |
|---------|---------------------|
| **GitHub Issues** | English ONLY |
| **Pull Requests** | English ONLY (title, description, comments) |
| **Commit Messages** | English ONLY |
| **Code Comments** | English ONLY |
| **Documentation** | English ONLY |
| **AGENTS.md files** | English ONLY |
**If you're not comfortable writing in English, use translation tools. Broken English is fine. Non-English is not acceptable.**
---
**Generated:** 2026-02-17 | **Commit:** aac79f03 | **Branch:** dev
## OVERVIEW
OpenCode plugin (v3.4.0): multi-model agent orchestration with 11 specialized agents (Claude Opus 4.6, GPT-5.3 Codex, Gemini 3 Flash, GLM-4.7, Grok). 41 lifecycle hooks across 7 event types, 25+ tools (LSP, AST-Grep, delegation, task management), full Claude Code compatibility layer. "oh-my-zsh" for OpenCode.
OpenCode plugin (npm: `oh-my-opencode`) that extends Claude Code (OpenCode fork) with multi-agent orchestration, 41 lifecycle hooks, 26 tools, skill/command/MCP systems, and Claude Code compatibility. 1164 TypeScript files, 133k LOC.
## STRUCTURE
```
oh-my-opencode/
├── src/
│ ├── agents/ # 11 AI agents - see src/agents/AGENTS.md
│ ├── hooks/ # 41 lifecycle hooks - see src/hooks/AGENTS.md
│ ├── tools/ # 25+ tools - see src/tools/AGENTS.md
│ ├── features/ # Background agents, skills, CC compat - see src/features/AGENTS.md
│ ├── shared/ # 84 cross-cutting utilities - see src/shared/AGENTS.md
│ ├── cli/ # CLI installer, doctor - see src/cli/AGENTS.md
│ ├── mcp/ # Built-in MCPs - see src/mcp/AGENTS.md
│ ├── config/ # Zod schema - see src/config/AGENTS.md
│ ├── plugin-handlers/ # Config loading - see src/plugin-handlers/AGENTS.md
│ ├── plugin/ # Plugin interface composition (21 files)
│ ├── index.ts # Main plugin entry (88 lines)
── create-hooks.ts # Hook creation coordination (62 lines)
│ ├── create-managers.ts # Manager initialization (80 lines)
│ ├── create-tools.ts # Tool registry composition (54 lines)
│ ├── plugin-interface.ts # Plugin interface assembly (66 lines)
│ ├── plugin-config.ts # Config loading orchestration
│ └── plugin-state.ts # Model cache state
├── script/ # build-schema.ts, build-binaries.ts, publish.ts, generate-changelog.ts
├── packages/ # 7 platform-specific binary packages
└── dist/ # Build output (ESM + .d.ts)
│ ├── index.ts # Plugin entry: loadConfig → createManagers → createTools → createHooks → createPluginInterface
│ ├── plugin-config.ts # JSONC multi-level config: user → project → defaults (Zod v4)
│ ├── agents/ # 11 agents (Sisyphus, Hephaestus, Oracle, Librarian, Explore, Atlas, Prometheus, Metis, Momus, Multimodal-Looker, Sisyphus-Junior)
│ ├── hooks/ # 41 hooks across 37 directories + 6 standalone files
│ ├── tools/ # 26 tools across 15 directories
│ ├── features/ # 18 feature modules (background-agent, skill-loader, tmux, MCP-OAuth, etc.)
│ ├── shared/ # 101 utility files in 13 categories
│ ├── config/ # Zod v4 schema system (22 files)
│ ├── cli/ # CLI: install, run, doctor, mcp-oauth (Commander.js)
│ ├── mcp/ # 3 built-in remote MCPs (websearch, context7, grep_app)
│ ├── plugin/ # 8 OpenCode hook handlers + 41 hook composition
── plugin-handlers/ # 6-phase config loading pipeline
├── packages/ # Monorepo: comment-checker, opencode-sdk
└── local-ignore/ # Dev-only test fixtures
```
## INITIALIZATION FLOW
```
OhMyOpenCodePlugin(ctx)
1. injectServerAuthIntoClient(ctx.client)
2. startTmuxCheck()
3. loadPluginConfig(ctx.directory, ctx) → OhMyOpenCodeConfig
4. createFirstMessageVariantGate()
5. createModelCacheState()
6. createManagers(ctx, config, tmux, cache) → TmuxSessionManager, BackgroundManager, SkillMcpManager, ConfigHandler
7. createTools(ctx, config, managers) → filteredTools, mergedSkills, availableSkills, availableCategories
8. createHooks(ctx, config, backgroundMgr) → 41 hooks (core + continuation + skill)
9. createPluginInterface(...) → tool, chat.params, chat.message, event, tool.execute.before/after
10. Return plugin with experimental.session.compacting
├─→ loadPluginConfig() # JSONC parse → project/user merge → Zod validate → migrate
├─→ createManagers() # TmuxSessionManager, BackgroundManager, SkillMcpManager, ConfigHandler
├─→ createTools() # SkillContext + AvailableCategories + ToolRegistry (26 tools)
├─→ createHooks() # 3-tier: Core(32) + Continuation(7) + Skill(2) = 41 hooks
└─→ createPluginInterface() # 8 OpenCode hook handlers → PluginInterface
```
## 8 OPENCODE HOOK HANDLERS
| Handler | Purpose |
|---------|---------|
| `config` | 6-phase: provider → plugin-components → agents → tools → MCPs → commands |
| `tool` | 26 registered tools |
| `chat.message` | First-message variant, session setup, keyword detection |
| `chat.params` | Anthropic effort level adjustment |
| `event` | Session lifecycle (created, deleted, idle, error) |
| `tool.execute.before` | Pre-tool hooks (file guard, label truncator, rules injector) |
| `tool.execute.after` | Post-tool hooks (output truncation, metadata store) |
| `experimental.chat.messages.transform` | Context injection, thinking block validation |
## WHERE TO LOOK
| Task | Location | Notes |
|------|----------|-------|
| Add agent | `src/agents/` | Create .ts with factory, add to `agentSources` in builtin-agents/ |
| Add hook | `src/hooks/` | Create dir, register in `src/plugin/hooks/create-*-hooks.ts` |
| Add tool | `src/tools/` | Dir with index/types/constants/tools.ts |
| Add MCP | `src/mcp/` | Create config, add to `createBuiltinMcps()` |
| Add skill | `src/features/builtin-skills/` | Create .ts in skills/ |
| Add command | `src/features/builtin-commands/` | Add template + register in commands.ts |
| Config schema | `src/config/schema/` | 21 schema component files, run `bun run build:schema` |
| Plugin config | `src/plugin-handlers/config-handler.ts` | JSONC loading, merging, migration |
| Background agents | `src/features/background-agent/` | manager.ts (1646 lines) |
| Orchestrator | `src/hooks/atlas/` | Main orchestration hook (1976 lines) |
| Delegation | `src/tools/delegate-task/` | Category routing (constants.ts 569 lines) |
| Task system | `src/features/claude-tasks/` | Task schema, storage, todo sync |
| Plugin interface | `src/plugin/` | 21 files composing hooks, handlers, registries |
| Add new agent | `src/agents/` + `src/agents/builtin-agents/` | Follow createXXXAgent factory pattern |
| Add new hook | `src/hooks/{name}/` + register in `src/plugin/hooks/create-*-hooks.ts` | Match event type to tier |
| Add new tool | `src/tools/{name}/` + register in `src/plugin/tool-registry.ts` | Follow createXXXTool factory |
| Add new feature module | `src/features/{name}/` | Standalone module, wire in plugin/ |
| Add new MCP | `src/mcp/` + register in `createBuiltinMcps()` | Remote HTTP only |
| Add new skill | `src/features/builtin-skills/skills/` | Implement BuiltinSkill interface |
| Add new command | `src/features/builtin-commands/` | Template in templates/ |
| Add new CLI command | `src/cli/cli-program.ts` | Commander.js subcommand |
| Add new doctor check | `src/cli/doctor/checks/` | Register in checks/index.ts |
| Modify config schema | `src/config/schema/` + update root schema | Zod v4, add to OhMyOpenCodeConfigSchema |
## TDD (Test-Driven Development)
## MULTI-LEVEL CONFIG
**MANDATORY.** RED-GREEN-REFACTOR:
1. **RED**: Write test → `bun test` → FAIL
2. **GREEN**: Implement minimum → PASS
3. **REFACTOR**: Clean up → stay GREEN
```
Project (.opencode/oh-my-opencode.jsonc) → User (~/.config/opencode/oh-my-opencode.jsonc) → Defaults
```
**Rules:**
- NEVER write implementation before test
- NEVER delete failing tests - fix the code
- Test file: `*.test.ts` alongside source (176 test files)
- BDD comments: `//#given`, `//#when`, `//#then`
Fields: agents (14 overridable), categories (8 built-in + custom), disabled_* arrays, 19 feature-specific configs.
## THREE-TIER MCP SYSTEM
| Tier | Source | Mechanism |
|------|--------|-----------|
| Built-in | `src/mcp/` | 3 remote HTTP: websearch (Exa/Tavily), context7, grep_app |
| Claude Code | `.mcp.json` | `${VAR}` env expansion via claude-code-mcp-loader |
| Skill-embedded | SKILL.md YAML | Managed by SkillMcpManager (stdio + HTTP) |
## CONVENTIONS
- **Package manager**: Bun only (`bun run`, `bun build`, `bunx`)
- **Types**: bun-types (NEVER @types/node)
- **Build**: `bun build` (ESM) + `tsc --emitDeclarationOnly`
- **Exports**: Barrel pattern via index.ts
- **Naming**: kebab-case dirs, `createXXXHook`/`createXXXTool` factories
- **Testing**: BDD comments, 176 test files, 117k+ lines TypeScript
- **Temperature**: 0.1 for code agents, max 0.3
- **Modular architecture**: 200 LOC hard limit per file (prompt strings exempt)
- **Test pattern**: Vitest, co-located `*.test.ts`, given/when/then style
- **Factory pattern**: `createXXX()` for all tools, hooks, agents
- **Hook tiers**: Session (19) → Tool-Guard (9) → Transform (4) → Continuation (7) → Skill (2)
- **Agent modes**: `primary` (respects UI model) vs `subagent` (own fallback chain) vs `all`
- **Model resolution**: 3-step: override → category-default → provider-fallback → system-default
- **Config format**: JSONC with comments, Zod v4 validation, snake_case keys
## ANTI-PATTERNS
| Category | Forbidden |
|----------|-----------|
| Package Manager | npm, yarn - Bun exclusively |
| Types | @types/node - use bun-types |
| File Ops | mkdir/touch/rm/cp/mv in code - use bash tool |
| Publishing | Direct `bun publish` - GitHub Actions only |
| Versioning | Local version bump - CI manages |
| Type Safety | `as any`, `@ts-ignore`, `@ts-expect-error` |
| Error Handling | Empty catch blocks |
| Testing | Deleting failing tests, writing implementation before test |
| Agent Calls | Sequential - use `task` parallel |
| Hook Logic | Heavy PreToolUse - slows every call |
| Commits | Giant (3+ files), separate test from impl |
| Temperature | >0.3 for code agents |
| Trust | Agent self-reports - ALWAYS verify |
| Git | `git add -i`, `git rebase -i` (no interactive input) |
| Git | Skip hooks (--no-verify), force push without request |
| Bash | `sleep N` - use conditional waits |
| Bash | `cd dir && cmd` - use workdir parameter |
| Files | Catch-all utils.ts/helpers.ts - name by purpose |
## AGENT MODELS
| Agent | Model | Temp | Purpose |
|-------|-------|------|---------|
| Sisyphus | anthropic/claude-opus-4-6 | 0.1 | Primary orchestrator (fallback: kimi-k2.5 → glm-4.7 → gpt-5.3-codex → gemini-3-pro) |
| Hephaestus | openai/gpt-5.3-codex | 0.1 | Autonomous deep worker (NO fallback) |
| Atlas | anthropic/claude-sonnet-4-5 | 0.1 | Master orchestrator (fallback: kimi-k2.5 → gpt-5.2) |
| Prometheus | anthropic/claude-opus-4-6 | 0.1 | Strategic planning (fallback: kimi-k2.5 → gpt-5.2) |
| oracle | openai/gpt-5.2 | 0.1 | Consultation, debugging (fallback: claude-opus-4-6) |
| librarian | zai-coding-plan/glm-4.7 | 0.1 | Docs, GitHub search (fallback: glm-4.7-free) |
| explore | xai/grok-code-fast-1 | 0.1 | Fast codebase grep (fallback: claude-haiku-4-5 → gpt-5-mini → gpt-5-nano) |
| multimodal-looker | google/gemini-3-flash | 0.1 | PDF/image analysis |
| Metis | anthropic/claude-opus-4-6 | 0.3 | Pre-planning analysis (fallback: kimi-k2.5 → gpt-5.2) |
| Momus | openai/gpt-5.2 | 0.1 | Plan validation (fallback: claude-opus-4-6) |
| Sisyphus-Junior | anthropic/claude-sonnet-4-5 | 0.1 | Category-spawned executor |
## OPENCODE PLUGIN API
Plugin SDK from `@opencode-ai/plugin` (v1.1.19). Plugin = `async (PluginInput) => Hooks`.
| Hook | Purpose |
|------|---------|
| `tool` | Register custom tools (Record<string, ToolDefinition>) |
| `chat.message` | Intercept user messages (can modify parts) |
| `chat.params` | Modify LLM parameters (temperature, topP, options) |
| `tool.execute.before` | Pre-tool interception (can modify args) |
| `tool.execute.after` | Post-tool processing (can modify output) |
| `event` | Session lifecycle events (session.created, session.stop, etc.) |
| `config` | Config modification (register agents, MCPs, commands) |
| `experimental.chat.messages.transform` | Transform message history |
| `experimental.session.compacting` | Session compaction customization |
## DEPENDENCIES
| Package | Purpose |
|---------|---------|
| `@opencode-ai/plugin` + `sdk` | OpenCode integration SDK |
| `@ast-grep/cli` + `napi` | AST pattern matching (search/replace) |
| `@code-yeongyu/comment-checker` | AI comment detection/prevention |
| `@modelcontextprotocol/sdk` | MCP client for remote HTTP servers |
| `@clack/prompts` | Interactive CLI TUI |
| `commander` | CLI argument parsing |
| `zod` (v4) | Schema validation for config |
| `jsonc-parser` | JSONC config with comments |
| `picocolors` | Terminal colors |
| `picomatch` | Glob pattern matching |
| `vscode-jsonrpc` | LSP communication |
| `js-yaml` | YAML parsing (tasks, skills) |
| `detect-libc` | Platform binary selection |
- Never use `as any`, `@ts-ignore`, `@ts-expect-error`
- Never suppress lint/type errors
- Never add emojis to code/comments unless user explicitly asks
- Never commit unless explicitly requested
- Test: given/when/then — never use Arrange-Act-Assert comments
- Comments: avoid AI-generated comment patterns (enforced by comment-checker hook)
## COMMANDS
```bash
bun run typecheck # Type check
bun run build # ESM + declarations + schema
bun run rebuild # Clean + Build
bun test # 176 test files
bun run build:schema # Regenerate JSON schema
bun test # Vitest test suite
bun run build # Build plugin
bunx oh-my-opencode install # Interactive setup
bunx oh-my-opencode doctor # Health diagnostics
bunx oh-my-opencode run # Non-interactive session
```
## DEPLOYMENT
**GitHub Actions workflow_dispatch ONLY**
1. Commit & push changes
2. Trigger: `gh workflow run publish -f bump=patch`
3. Never `bun publish` directly, never bump version locally
## COMPLEXITY HOTSPOTS
| File | Lines | Description |
|------|-------|-------------|
| `src/features/background-agent/manager.ts` | 1646 | Task lifecycle, concurrency |
| `src/hooks/anthropic-context-window-limit-recovery/` | 2232 | Multi-strategy context recovery |
| `src/hooks/claude-code-hooks/` | 2110 | Claude Code settings.json compat |
| `src/hooks/todo-continuation-enforcer/` | 2061 | Core boulder mechanism |
| `src/hooks/atlas/` | 1976 | Session orchestration |
| `src/hooks/ralph-loop/` | 1687 | Self-referential dev loop |
| `src/hooks/keyword-detector/` | 1665 | Mode detection (ultrawork/search) |
| `src/hooks/rules-injector/` | 1604 | Conditional rules injection |
| `src/hooks/think-mode/` | 1365 | Model/variant switching |
| `src/hooks/session-recovery/` | 1279 | Auto error recovery |
| `src/features/builtin-skills/skills/git-master.ts` | 1111 | Git master skill |
| `src/tools/delegate-task/constants.ts` | 569 | Category routing configs |
## MCP ARCHITECTURE
Three-tier system:
1. **Built-in** (src/mcp/): websearch (Exa/Tavily), context7 (docs), grep_app (GitHub)
2. **Claude Code compat** (features/claude-code-mcp-loader/): .mcp.json with `${VAR}` expansion
3. **Skill-embedded** (features/opencode-skill-loader/): YAML frontmatter in SKILL.md
## CONFIG SYSTEM
- **Zod validation**: 21 schema component files in `src/config/schema/`
- **JSONC support**: Comments, trailing commas
- **Multi-level**: Project (`.opencode/`) → User (`~/.config/opencode/`) → Defaults
- **Migration**: Legacy config auto-migration in `src/shared/migration/`
## NOTES
- **OpenCode**: Requires >= 1.0.150
- **1069 TypeScript files**, 176 test files, 117k+ lines
- **Flaky tests**: ralph-loop (CI timeout), session-state (parallel pollution)
- **Trusted deps**: @ast-grep/cli, @ast-grep/napi, @code-yeongyu/comment-checker
- **No linter/formatter**: No ESLint, Prettier, or Biome configured
- **License**: SUL-1.0 (Sisyphus Use License)
- Logger writes to `/tmp/oh-my-opencode.log` — check there for debugging
- Background tasks: 5 concurrent per model/provider (configurable)
- Plugin load timeout: 10s for Claude Code plugins
- Model fallback priority: Claude > OpenAI > Gemini > Copilot > OpenCode Zen > Z.ai > Kimi
- Config migration runs automatically on legacy keys (agent names, hook names, model versions)

View File

@@ -172,16 +172,16 @@ Windows から Linux に初めて乗り換えた時のこと、自分の思い
私の人生もそうです。振り返ってみれば、私たち人間と何ら変わりありません。
**はいLLMエージェントたちは私たちと変わりません。優れたツールと最高の仲間がいれば、彼らも私たちと同じくらい優れたコードを書き、立派に仕事をこなすことができます。**
私たちのメインエージェント、SisyphusOpus 4.5 High)を紹介します。以下は、シジフォスが岩を転がすために使用するツールです。
私たちのメインエージェント、SisyphusOpus 4.6)を紹介します。以下は、シジフォスが岩を転がすために使用するツールです。
*以下の内容はすべてカスタマイズ可能です。必要なものだけを使ってください。デフォルトではすべての機能が有効になっています。何もしなくても大丈夫です。*
- シジフォスのチームメイト (Curated Agents)
- Hephaestus: 自律型ディープワーカー、目標指向実行 (GPT 5.2 Codex Medium) — *正当な職人*
- Oracle: 設計、デバッグ (GPT 5.2 Medium)
- Hephaestus: 自律型ディープワーカー、目標指向実行 (GPT 5.3 Codex Medium) — *正当な職人*
- Oracle: 設計、デバッグ (GPT 5.2)
- Frontend UI/UX Engineer: フロントエンド開発 (Gemini 3 Pro)
- Librarian: 公式ドキュメント、オープンソース実装、コードベース探索 (Claude Sonnet 4.5)
- Explore: 超高速コードベース探索 (Contextual Grep) (Claude Haiku 4.5)
- Librarian: 公式ドキュメント、オープンソース実装、コードベース探索 (GLM-4.7)
- Explore: 超高速コードベース探索 (Contextual Grep) (Grok Code Fast 1)
- Full LSP / AstGrep Support: 決定的にリファクタリングしましょう。
- Todo Continuation Enforcer: 途中で諦めたら、続行を強制します。これがシジフォスに岩を転がし続けさせる秘訣です。
- Comment Checker: AIが過剰なコメントを付けないようにします。シジフォスが生成したコードは、人間が書いたものと区別がつかないべきです。
@@ -199,7 +199,7 @@ Windows から Linux に初めて乗り換えた時のこと、自分の思い
![Meet Hephaestus](.github/assets/hephaestus.png)
ギリシャ神話において、ヘパイストスは鍛冶、火、金属加工、職人技の神でした—比類のない精密さと献身で神々の武器を作り上げた神聖な鍛冶師です。
**自律型ディープワーカーを紹介します: ヘパイストス (GPT 5.2 Codex Medium)。正当な職人エージェント。**
**自律型ディープワーカーを紹介します: ヘパイストス (GPT 5.3 Codex Medium)。正当な職人エージェント。**
*なぜ「正当な」なのかAnthropicがサードパーティアクセスを利用規約違反を理由にブロックした時、コミュニティで「正当な」使用についてのジョークが始まりました。ヘパイストスはこの皮肉を受け入れています—彼は近道をせず、正しい方法で、体系的かつ徹底的に物を作る職人です。*

View File

@@ -176,16 +176,16 @@ Hey please read this readme and tell me why it is different from other agent har
내 삶도 다르지 않습니다. 돌이켜보면 우리는 이 에이전트들과 그리 다르지 않습니다.
**맞습니다! LLM 에이전트는 우리와 다르지 않습니다. 훌륭한 도구와 확고한 팀원을 제공하면 우리만큼 훌륭한 코드를 작성하고 똑같이 훌륭하게 작업할 수 있습니다.**
우리의 주요 에이전트를 만나보세요: Sisyphus (Opus 4.5 High). 아래는 Sisyphus가 그 바위를 굴리는 데 사용하는 도구입니다.
우리의 주요 에이전트를 만나보세요: Sisyphus (Opus 4.6). 아래는 Sisyphus가 그 바위를 굴리는 데 사용하는 도구입니다.
*아래의 모든 것은 사용자 정의 가능합니다. 원하는 것을 가져가세요. 모든 기능은 기본적으로 활성화됩니다. 아무것도 할 필요가 없습니다. 포함되어 있으며, 즉시 작동합니다.*
- Sisyphus의 팀원 (큐레이팅된 에이전트)
- Hephaestus: 자율적 딥 워커, 목표 지향 실행 (GPT 5.2 Codex Medium) — *합법적인 장인*
- Oracle: 디자인, 디버깅 (GPT 5.2 Medium)
- Hephaestus: 자율적 딥 워커, 목표 지향 실행 (GPT 5.3 Codex Medium) — *합법적인 장인*
- Oracle: 디자인, 디버깅 (GPT 5.2)
- Frontend UI/UX Engineer: 프론트엔드 개발 (Gemini 3 Pro)
- Librarian: 공식 문서, 오픈 소스 구현, 코드베이스 탐색 (Claude Sonnet 4.5)
- Explore: 엄청나게 빠른 코드베이스 탐색 (Contextual Grep) (Claude Haiku 4.5)
- Librarian: 공식 문서, 오픈 소스 구현, 코드베이스 탐색 (GLM-4.7)
- Explore: 엄청나게 빠른 코드베이스 탐색 (Contextual Grep) (Grok Code Fast 1)
- 완전한 LSP / AstGrep 지원: 결정적으로 리팩토링합니다.
- TODO 연속 강제: 에이전트가 중간에 멈추면 계속하도록 강제합니다. **이것이 Sisyphus가 그 바위를 굴리게 하는 것입니다.**
- 주석 검사기: AI가 과도한 주석을 추가하는 것을 방지합니다. Sisyphus가 생성한 코드는 인간이 작성한 것과 구별할 수 없어야 합니다.
@@ -228,7 +228,7 @@ Hey please read this readme and tell me why it is different from other agent har
![Meet Hephaestus](.github/assets/hephaestus.png)
그리스 신화에서 헤파이스토스는 대장간, 불, 금속 세공, 장인 정신의 신이었습니다—비교할 수 없는 정밀함과 헌신으로 신들의 무기를 만든 신성한 대장장이입니다.
**자율적 딥 워커를 소개합니다: 헤파이스토스 (GPT 5.2 Codex Medium). 합법적인 장인 에이전트.**
**자율적 딥 워커를 소개합니다: 헤파이스토스 (GPT 5.3 Codex Medium). 합법적인 장인 에이전트.**
*왜 "합법적인"일까요? Anthropic이 ToS 위반을 이유로 서드파티 접근을 차단했을 때, 커뮤니티에서 "합법적인" 사용에 대한 농담이 시작되었습니다. 헤파이스토스는 이 아이러니를 받아들입니다—그는 편법 없이 올바른 방식으로, 체계적이고 철저하게 만드는 장인입니다.*

View File

@@ -175,16 +175,16 @@ In greek mythology, Sisyphus was condemned to roll a boulder up a hill for etern
My life is no different. Looking back, we are not so different from these agents.
**Yes! LLM Agents are no different from us. They can write code as brilliant as ours and work just as excellently—if you give them great tools and solid teammates.**
Meet our main agent: Sisyphus (Opus 4.5 High). Below are the tools Sisyphus uses to keep that boulder rolling.
Meet our main agent: Sisyphus (Opus 4.6). Below are the tools Sisyphus uses to keep that boulder rolling.
*Everything below is customizable. Take what you want. All features are enabled by default. You don't have to do anything. Battery Included, works out of the box.*
- Sisyphus's Teammates (Curated Agents)
- Hephaestus: Autonomous deep worker, goal-oriented execution (GPT 5.2 Codex Medium) — *The Legitimate Craftsman*
- Oracle: Design, debugging (GPT 5.2 Medium)
- Hephaestus: Autonomous deep worker, goal-oriented execution (GPT 5.3 Codex Medium) — *The Legitimate Craftsman*
- Oracle: Design, debugging (GPT 5.2)
- Frontend UI/UX Engineer: Frontend development (Gemini 3 Pro)
- Librarian: Official docs, open source implementations, codebase exploration (Claude Sonnet 4.5)
- Explore: Blazing fast codebase exploration (Contextual Grep) (Claude Haiku 4.5)
- Librarian: Official docs, open source implementations, codebase exploration (GLM-4.7)
- Explore: Blazing fast codebase exploration (Contextual Grep) (Grok Code Fast 1)
- Full LSP / AstGrep Support: Refactor decisively.
- Todo Continuation Enforcer: Forces the agent to continue if it quits halfway. **This is what keeps Sisyphus rolling that boulder.**
- Comment Checker: Prevents AI from adding excessive comments. Code generated by Sisyphus should be indistinguishable from human-written code.
@@ -227,7 +227,7 @@ If you don't want all this, as mentioned, you can just pick and choose specific
![Meet Hephaestus](.github/assets/hephaestus.png)
In Greek mythology, Hephaestus was the god of forge, fire, metalworking, and craftsmanship—the divine blacksmith who crafted weapons for the gods with unmatched precision and dedication.
**Meet our autonomous deep worker: Hephaestus (GPT 5.2 Codex Medium). The Legitimate Craftsman Agent.**
**Meet our autonomous deep worker: Hephaestus (GPT 5.3 Codex Medium). The Legitimate Craftsman Agent.**
*Why "Legitimate"? When Anthropic blocked third-party access citing ToS violations, the community started joking about "legitimate" usage. Hephaestus embraces this irony—he's the craftsman who builds things the right way, methodically and thoroughly, without cutting corners.*

View File

@@ -172,16 +172,16 @@
我的生活也没有什么不同。回顾过去,我们与这些智能体并没有太大不同。
**是的LLM 智能体和我们没有区别。如果你给它们优秀的工具和可靠的队友,它们可以写出和我们一样出色的代码,工作得同样优秀。**
认识我们的主智能体Sisyphus (Opus 4.5 High)。以下是 Sisyphus 用来继续推动巨石的工具。
认识我们的主智能体Sisyphus (Opus 4.6)。以下是 Sisyphus 用来继续推动巨石的工具。
*以下所有内容都是可配置的。按需选取。所有功能默认启用。你不需要做任何事情。开箱即用,电池已包含。*
- Sisyphus 的队友(精选智能体)
- Hephaestus自主深度工作者目标导向执行GPT 5.2 Codex Medium*合法的工匠*
- Oracle设计、调试 (GPT 5.2 Medium)
- Hephaestus自主深度工作者目标导向执行GPT 5.3 Codex Medium*合法的工匠*
- Oracle设计、调试 (GPT 5.2)
- Frontend UI/UX Engineer前端开发 (Gemini 3 Pro)
- Librarian官方文档、开源实现、代码库探索 (Claude Sonnet 4.5)
- Explore极速代码库探索上下文感知 Grep(Claude Haiku 4.5)
- Librarian官方文档、开源实现、代码库探索 (GLM-4.7)
- Explore极速代码库探索上下文感知 Grep(Grok Code Fast 1)
- 完整 LSP / AstGrep 支持:果断重构。
- Todo 继续执行器:如果智能体中途退出,强制它继续。**这就是让 Sisyphus 继续推动巨石的关键。**
- 注释检查器:防止 AI 添加过多注释。Sisyphus 生成的代码应该与人类编写的代码无法区分。
@@ -199,7 +199,7 @@
![Meet Hephaestus](.github/assets/hephaestus.png)
在希腊神话中,赫菲斯托斯是锻造、火焰、金属加工和工艺之神——他是神圣的铁匠,以无与伦比的精准和奉献为众神打造武器。
**介绍我们的自主深度工作者赫菲斯托斯GPT 5.2 Codex Medium。合法的工匠代理。**
**介绍我们的自主深度工作者赫菲斯托斯GPT 5.3 Codex Medium。合法的工匠代理。**
*为什么是"合法的"当Anthropic以违反服务条款为由封锁第三方访问时社区开始调侃"合法"使用。赫菲斯托斯拥抱这种讽刺——他是那种用正确的方式、有条不紊、彻底地构建事物的工匠,绝不走捷径。*

View File

@@ -98,7 +98,8 @@
"stop-continuation-guard",
"tasks-todowrite-disabler",
"write-existing-file-guard",
"anthropic-effort"
"anthropic-effort",
"hashline-read-enhancer"
]
}
},
@@ -2830,6 +2831,9 @@
},
"safe_hook_creation": {
"type": "boolean"
},
"hashline_edit": {
"type": "boolean"
}
},
"additionalProperties": false
@@ -3056,7 +3060,8 @@
"enum": [
"playwright",
"agent-browser",
"dev-browser"
"dev-browser",
"playwright-cli"
]
}
},

View File

@@ -28,13 +28,13 @@
"typescript": "^5.7.3",
},
"optionalDependencies": {
"oh-my-opencode-darwin-arm64": "3.5.3",
"oh-my-opencode-darwin-x64": "3.5.3",
"oh-my-opencode-linux-arm64": "3.5.3",
"oh-my-opencode-linux-arm64-musl": "3.5.3",
"oh-my-opencode-linux-x64": "3.5.3",
"oh-my-opencode-linux-x64-musl": "3.5.3",
"oh-my-opencode-windows-x64": "3.5.3",
"oh-my-opencode-darwin-arm64": "3.6.0",
"oh-my-opencode-darwin-x64": "3.6.0",
"oh-my-opencode-linux-arm64": "3.6.0",
"oh-my-opencode-linux-arm64-musl": "3.6.0",
"oh-my-opencode-linux-x64": "3.6.0",
"oh-my-opencode-linux-x64-musl": "3.6.0",
"oh-my-opencode-windows-x64": "3.6.0",
},
},
},
@@ -226,19 +226,19 @@
"object-inspect": ["object-inspect@1.13.4", "", {}, "sha512-W67iLl4J2EXEGTbfeHCffrjDfitvLANg0UlX3wFUUSTx92KXRFegMHUVgSqE+wvhAbi4WqjGg9czysTV2Epbew=="],
"oh-my-opencode-darwin-arm64": ["oh-my-opencode-darwin-arm64@3.5.3", "", { "os": "darwin", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-Dq0+PC2dyAqG7c3DUnQmdOkKbKmOsRHwoqgLCQNKN1lTRllF8zbWqp5B+LGKxSPxPqJIPS3mKt+wIR2KvkYJVw=="],
"oh-my-opencode-darwin-arm64": ["oh-my-opencode-darwin-arm64@3.6.0", "", { "os": "darwin", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-JkyJC3b9ueRgSyPJMjTKlBO99gIyTpI87lEV5Tk7CBv6TFbj2ZFxfaA8mEm138NbwmYa/Z4Rf7I5tZyp2as93A=="],
"oh-my-opencode-darwin-x64": ["oh-my-opencode-darwin-x64@3.5.3", "", { "os": "darwin", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-Ke45Bv/ygZm3YUSUumIyk647KZ2PFzw30tH597cOpG8MDPGbNVBCM6EKFezcukUPT+gPFVpE1IiGzEkn4JmgZA=="],
"oh-my-opencode-darwin-x64": ["oh-my-opencode-darwin-x64@3.6.0", "", { "os": "darwin", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-5HsXz3F42T6CmPk6IW+pErJVSmPnqc3Gc1OntoKp/b4FwuWkFJh9kftDSH3cnKTX98H6XBqnwZoFKCNCiiVLEA=="],
"oh-my-opencode-linux-arm64": ["oh-my-opencode-linux-arm64@3.5.3", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-aP5S3DngUhFkNeqYM33Ge6zccCWLzB/O3FLXLFXy/Iws03N8xugw72pnMK6lUbIia9QQBKK7IZBoYm9C79pZ3g=="],
"oh-my-opencode-linux-arm64": ["oh-my-opencode-linux-arm64@3.6.0", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-KjCSC2i9XdjzGsX6coP9xwj7naxTpdqnB53TiLbVH+KeF0X0dNsVV7PHbme3I1orjjzYoEbVYVC3ZNaleubzog=="],
"oh-my-opencode-linux-arm64-musl": ["oh-my-opencode-linux-arm64-musl@3.5.3", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-UiD/hVKYZQyX4D5N5SnZT4M5Z/B2SDtJWBW4MibpYSAcPKNCEBKi/5E4hOPxAtTfFGR8tIXFmYZdQJDkVfvluw=="],
"oh-my-opencode-linux-arm64-musl": ["oh-my-opencode-linux-arm64-musl@3.6.0", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-EARvFQXnkqSnwPpKtghmoV5e/JmweJXhjcOrRNvEwQ8HSb4FIhdRmJkTw4Z/EzyoIRTQcY019ALOiBbdIiOUEA=="],
"oh-my-opencode-linux-x64": ["oh-my-opencode-linux-x64@3.5.3", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-L9kqwzElGkaQ8pgtv1ZjcHARw9LPaU4UEVjzauByTMi+/5Js/PTsNXBggxSRzZfQ8/MNBPSCiA4K10Kc0YjjvA=="],
"oh-my-opencode-linux-x64": ["oh-my-opencode-linux-x64@3.6.0", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-jYyew4NKAOM6NrMM0+LlRlz6s1EVMI9cQdK/o0t8uqFheZVeb7u4cBZwwfhJ79j7EWkSWGc0Jdj9G2dOukbDxg=="],
"oh-my-opencode-linux-x64-musl": ["oh-my-opencode-linux-x64-musl@3.5.3", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-Z0fVVih/b2dbNeb9DK9oca5dNYCZyPySBRtxRhDXod5d7fJNgIPrvUoEd3SNfkRGORyFB3hGBZ6nqQ6N8+8DEA=="],
"oh-my-opencode-linux-x64-musl": ["oh-my-opencode-linux-x64-musl@3.6.0", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-BrR+JftCXP/il04q2uImWIueCiuTmXbivsXYkfFONdO1Rq9b4t0BVua9JIYk7l3OUfeRlrKlFNYNfpFhvVADOw=="],
"oh-my-opencode-windows-x64": ["oh-my-opencode-windows-x64@3.5.3", "", { "os": "win32", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode.exe" } }, "sha512-ocWPjRs2sJgN02PJnEIYtqdMVDex1YhEj1FzAU5XIicfzQbgxLh9nz1yhHZzfqGJq69QStU6ofpc5kQpfX1LMg=="],
"oh-my-opencode-windows-x64": ["oh-my-opencode-windows-x64@3.6.0", "", { "os": "win32", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode.exe" } }, "sha512-cIYQYzcQGhGFE99ulHGXs8S1vDHjgCtT3ID2dDoOztnOQW0ZVa61oCHlkBtjdP/BEv2tH5AGvKrXAICXs19iFw=="],
"on-finished": ["on-finished@2.4.1", "", { "dependencies": { "ee-first": "1.1.1" } }, "sha512-oVlzkg3ENAhCk2zdv7IJwd/QUD4z2RxRwpkcGY8psCVcCYZNq4wYnVWALHM+brtuJjePWiYF/ClmuDr8Ch5+kg=="],

View File

@@ -117,7 +117,7 @@ You can create powerful specialized agents by combining Categories and Skills.
### 🏗️ The Architect (Design Review)
- **Category**: `ultrabrain`
- **load_skills**: `[]` (pure reasoning)
- **Effect**: Leverages GPT-5.2's logical reasoning for in-depth system architecture analysis.
- **Effect**: Leverages GPT-5.3 Codex's logical reasoning for in-depth system architecture analysis.
### ⚡ The Maintainer (Quick Fixes)
- **Category**: `quick`

View File

@@ -245,7 +245,7 @@ Or disable via `disabled_agents` in `~/.config/opencode/oh-my-opencode.json` or
}
```
Available agents: `sisyphus`, `prometheus`, `oracle`, `librarian`, `explore`, `multimodal-looker`, `metis`, `momus`, `atlas`
Available agents: `sisyphus`, `hephaestus`, `prometheus`, `oracle`, `librarian`, `explore`, `multimodal-looker`, `metis`, `momus`, `atlas`
## Built-in Skills
@@ -609,7 +609,7 @@ Configure git-master skill behavior:
When enabled (default), Sisyphus provides a powerful orchestrator with optional specialized agents:
- **Sisyphus**: Primary orchestrator agent (Claude Opus 4.5)
- **Sisyphus**: Primary orchestrator agent (Claude Opus 4.6)
- **OpenCode-Builder**: OpenCode's default build agent, renamed due to SDK limitations (disabled by default)
- **Prometheus (Planner)**: OpenCode's default plan agent with work-planner methodology (enabled by default)
- **Metis (Plan Consultant)**: Pre-planning analysis agent that identifies hidden requirements and AI failure points
@@ -720,17 +720,18 @@ Categories enable domain-specific task delegation via the `task` tool. Each cate
### Built-in Categories
All 7 categories come with optimal model defaults, but **you must configure them to use those defaults**:
All 8 categories come with optimal model defaults, but **you must configure them to use those defaults**:
| Category | Built-in Default Model | Description |
| -------------------- | ---------------------------------- | -------------------------------------------------------------------- |
| `visual-engineering` | `google/gemini-3-pro-preview` | Frontend, UI/UX, design, styling, animation |
| `visual-engineering` | `google/gemini-3-pro` (high) | Frontend, UI/UX, design, styling, animation |
| `ultrabrain` | `openai/gpt-5.3-codex` (xhigh) | Deep logical reasoning, complex architecture decisions |
| `artistry` | `google/gemini-3-pro-preview` (max)| Highly creative/artistic tasks, novel ideas |
| `deep` | `openai/gpt-5.3-codex` (medium) | Goal-oriented autonomous problem-solving, thorough research before action |
| `artistry` | `google/gemini-3-pro` (high) | Highly creative/artistic tasks, novel ideas |
| `quick` | `anthropic/claude-haiku-4-5` | Trivial tasks - single file changes, typo fixes, simple modifications|
| `unspecified-low` | `anthropic/claude-sonnet-4-5` | Tasks that don't fit other categories, low effort required |
| `unspecified-high` | `anthropic/claude-opus-4-6` (max) | Tasks that don't fit other categories, high effort required |
| `writing` | `google/gemini-3-flash-preview` | Documentation, prose, technical writing |
| `writing` | `kimi-for-coding/k2p5` | Documentation, prose, technical writing |
### ⚠️ Critical: Model Resolution Priority
@@ -765,15 +766,19 @@ All 7 categories come with optimal model defaults, but **you must configure them
{
"categories": {
"visual-engineering": {
"model": "google/gemini-3-pro-preview"
"model": "google/gemini-3-pro"
},
"ultrabrain": {
"model": "openai/gpt-5.3-codex",
"variant": "xhigh"
},
"deep": {
"model": "openai/gpt-5.3-codex",
"variant": "medium"
},
"artistry": {
"model": "google/gemini-3-pro-preview",
"variant": "max"
"model": "google/gemini-3-pro",
"variant": "high"
},
"quick": {
"model": "anthropic/claude-haiku-4-5" // Fast + cheap for trivial tasks
@@ -786,7 +791,7 @@ All 7 categories come with optimal model defaults, but **you must configure them
"variant": "max"
},
"writing": {
"model": "google/gemini-3-flash-preview"
"model": "kimi-for-coding/k2p5"
}
}
}
@@ -894,15 +899,16 @@ Each agent has a defined provider priority chain. The system tries providers in
| Agent | Model (no prefix) | Provider Priority Chain |
|-------|-------------------|-------------------------|
| **Sisyphus** | `claude-opus-4-6` | anthropic → kimi-for-coding → zai-coding-plan → openai → google |
| **oracle** | `gpt-5.2` | openai → google → anthropic |
| **librarian** | `glm-4.7` | zai-coding-plan → opencode → anthropic |
| **explore** | `claude-haiku-4-5` | anthropicgithub-copilotopencode |
| **multimodal-looker** | `gemini-3-flash` | google → openai → zai-coding-plan → kimi-for-coding → anthropic → opencode |
| **Prometheus (Planner)** | `claude-opus-4-6` | anthropic → kimi-for-coding → openai → google |
| **Metis (Plan Consultant)** | `claude-opus-4-6` | anthropic → kimi-for-coding → openai → google |
| **Momus (Plan Reviewer)** | `gpt-5.2` | openai → anthropic → google |
| **Atlas** | `claude-sonnet-4-5` | anthropic → kimi-for-coding → openai → google |
| **Sisyphus** | `claude-opus-4-6` | anthropic/github-copilot/opencode → kimi-for-coding → opencode → zai-coding-plan → opencode |
| **Hephaestus** | `gpt-5.3-codex` | openai/github-copilot/opencode (requires provider) |
| **oracle** | `gpt-5.2` | openai/github-copilot/opencode → google/github-copilot/opencode → anthropic/github-copilot/opencode |
| **librarian** | `glm-4.7` | zai-coding-plan → opencode → anthropic/github-copilot/opencode |
| **explore** | `grok-code-fast-1` | github-copilot → anthropic/opencode → opencode |
| **multimodal-looker** | `gemini-3-flash` | google/github-copilot/opencode → openai/github-copilot/opencode → zai-coding-plan → kimi-for-coding → opencode → anthropic/github-copilot/opencode → opencode |
| **Prometheus (Planner)** | `claude-opus-4-6` | anthropic/github-copilot/opencode → kimi-for-coding → opencode → openai/github-copilot/opencode → google/github-copilot/opencode |
| **Metis (Plan Consultant)** | `claude-opus-4-6` | anthropic/github-copilot/opencode → kimi-for-coding → opencode → openai/github-copilot/opencode → google/github-copilot/opencode |
| **Momus (Plan Reviewer)** | `gpt-5.2` | openai/github-copilot/opencode → anthropic/github-copilot/opencode → google/github-copilot/opencode |
| **Atlas** | `k2p5` | kimi-for-coding → opencode → anthropic/github-copilot/opencode → openai/github-copilot/opencode → google/github-copilot/opencode |
### Category Provider Chains
@@ -910,14 +916,14 @@ Categories follow the same resolution logic:
| Category | Model (no prefix) | Provider Priority Chain |
|----------|-------------------|-------------------------|
| **visual-engineering** | `gemini-3-pro` | google → anthropic → zai-coding-plan |
| **ultrabrain** | `gpt-5.3-codex` | openai → google → anthropic |
| **deep** | `gpt-5.3-codex` | openai → anthropic → google |
| **artistry** | `gemini-3-pro` | google → anthropic → openai |
| **quick** | `claude-haiku-4-5` | anthropic → google → opencode |
| **unspecified-low** | `claude-sonnet-4-5` | anthropic → openai → google |
| **unspecified-high** | `claude-opus-4-6` | anthropic → openai → google |
| **writing** | `gemini-3-flash` | google → anthropic → zai-coding-plan → openai |
| **visual-engineering** | `gemini-3-pro` | google/github-copilot/opencode → zai-coding-plan → anthropic/github-copilot/opencode → kimi-for-coding |
| **ultrabrain** | `gpt-5.3-codex` | openai/github-copilot/opencode → google/github-copilot/opencode → anthropic/github-copilot/opencode |
| **deep** | `gpt-5.3-codex` | openai/github-copilot/opencode → anthropic/github-copilot/opencode → google/github-copilot/opencode |
| **artistry** | `gemini-3-pro` | google/github-copilot/opencode → anthropic/github-copilot/opencode → openai/github-copilot/opencode |
| **quick** | `claude-haiku-4-5` | anthropic/github-copilot/opencode → google/github-copilot/opencode → opencode |
| **unspecified-low** | `claude-sonnet-4-5` | anthropic/github-copilot/opencode → openai/github-copilot/opencode → google/github-copilot/opencode |
| **unspecified-high** | `claude-opus-4-6` | anthropic/github-copilot/opencode → openai/github-copilot/opencode → google/github-copilot/opencode |
| **writing** | `k2p5` | kimi-for-coding → google/github-copilot/opencode → anthropic/github-copilot/opencode |
### Checking Your Configuration

View File

@@ -10,20 +10,20 @@ Oh-My-OpenCode provides 11 specialized AI agents. Each has distinct expertise, o
| Agent | Model | Purpose |
|-------|-------|---------|
| **Sisyphus** | `anthropic/claude-opus-4-6` | **The default orchestrator.** Plans, delegates, and executes complex tasks using specialized subagents with aggressive parallel execution. Todo-driven workflow with extended thinking (32k budget). Fallback: kimi-k2.5 → glm-4.7 → gpt-5.3-codex → gemini-3-pro. |
| **Sisyphus** | `anthropic/claude-opus-4-6` | **The default orchestrator.** Plans, delegates, and executes complex tasks using specialized subagents with aggressive parallel execution. Todo-driven workflow with extended thinking (32k budget). Fallback: k2p5 → kimi-k2.5-free → glm-4.7 → glm-4.7-free. |
| **Hephaestus** | `openai/gpt-5.3-codex` | **The Legitimate Craftsman.** Autonomous deep worker inspired by AmpCode's deep mode. Goal-oriented execution with thorough research before action. Explores codebase patterns, completes tasks end-to-end without premature stopping. Named after the Greek god of forge and craftsmanship. Requires gpt-5.3-codex (no fallback - only activates when this model is available). |
| **oracle** | `openai/gpt-5.2` | Architecture decisions, code review, debugging. Read-only consultation - stellar logical reasoning and deep analysis. Inspired by AmpCode. |
| **librarian** | `zai-coding-plan/glm-4.7` | Multi-repo analysis, documentation lookup, OSS implementation examples. Deep codebase understanding with evidence-based answers. Fallback: glm-4.7-free → claude-sonnet-4-5. |
| **explore** | `anthropic/claude-haiku-4-5` | Fast codebase exploration and contextual grep. Fallback: gpt-5-mini → gpt-5-nano. |
| **multimodal-looker** | `google/gemini-3-flash` | Visual content specialist. Analyzes PDFs, images, diagrams to extract information. Fallback: gpt-5.2 → glm-4.6v → kimi-k2.5 → claude-haiku-4-5 → gpt-5-nano. |
| **explore** | `github-copilot/grok-code-fast-1` | Fast codebase exploration and contextual grep. Fallback: claude-haiku-4-5 → gpt-5-nano. |
| **multimodal-looker** | `google/gemini-3-flash` | Visual content specialist. Analyzes PDFs, images, diagrams to extract information. Fallback: gpt-5.2 → glm-4.6v → k2p5 → kimi-k2.5-free → claude-haiku-4-5 → gpt-5-nano. |
### Planning Agents
| Agent | Model | Purpose |
|-------|-------|---------|
| **Prometheus** | `anthropic/claude-opus-4-6` | Strategic planner with interview mode. Creates detailed work plans through iterative questioning. Fallback: kimi-k2.5 → gpt-5.2 → gemini-3-pro. |
| **Metis** | `anthropic/claude-opus-4-6` | Plan consultant - pre-planning analysis. Identifies hidden intentions, ambiguities, and AI failure points. Fallback: kimi-k2.5 → gpt-5.2 → gemini-3-pro. |
| **Momus** | `openai/gpt-5.2` | Plan reviewer - validates plans against clarity, verifiability, and completeness standards. Fallback: gpt-5.2 → claude-opus-4-6 → gemini-3-pro. |
| **Prometheus** | `anthropic/claude-opus-4-6` | Strategic planner with interview mode. Creates detailed work plans through iterative questioning. Fallback: k2p5 → kimi-k2.5-free → gpt-5.2 → gemini-3-pro. |
| **Metis** | `anthropic/claude-opus-4-6` | Plan consultant - pre-planning analysis. Identifies hidden intentions, ambiguities, and AI failure points. Fallback: k2p5 → kimi-k2.5-free → gpt-5.2 → gemini-3-pro. |
| **Momus** | `openai/gpt-5.2` | Plan reviewer - validates plans against clarity, verifiability, and completeness standards. Fallback: claude-opus-4-6 → gemini-3-pro. |
### Invoking Agents

View File

@@ -196,7 +196,7 @@ When GitHub Copilot is the best available provider, oh-my-opencode uses these mo
| Agent | Model |
| ------------- | -------------------------------- |
| **Sisyphus** | `github-copilot/claude-opus-4.6` |
| **Sisyphus** | `github-copilot/claude-opus-4-6` |
| **Oracle** | `github-copilot/gpt-5.2` |
| **Explore** | `opencode/gpt-5-nano` |
| **Librarian** | `zai-coding-plan/glm-4.7` (if Z.ai available) or fallback |
@@ -292,7 +292,7 @@ gh api --silent --method PUT /user/starred/code-yeongyu/oh-my-opencode >/dev/nul
Tell the user of following:
1. **Sisyphus agent strongly recommends Opus 4.5 model. Using other models may result in significantly degraded experience.**
1. **Sisyphus agent strongly recommends Opus 4.6 model. Using other models may result in significantly degraded experience.**
2. **Feeling lazy?** Just include `ultrawork` (or `ulw`) in your prompt. That's it. The agent figures out the rest.

View File

@@ -6,7 +6,7 @@ Learn about Oh My OpenCode, a plugin that transforms OpenCode into the best agen
## TL;DR
> **Sisyphus agent strongly recommends Opus 4.5 model. Using other models may result in significantly degraded experience.**
> **Sisyphus agent strongly recommends Opus 4.6 model. Using other models may result in significantly degraded experience.**
**Feeling lazy?** Just include `ultrawork` (or `ulw`) in your prompt. That's it. The agent figures out the rest.

View File

@@ -23,13 +23,13 @@ The orchestration system solves these problems through **specialization and dele
flowchart TB
subgraph Planning["Planning Layer (Human + Prometheus)"]
User[("👤 User")]
Prometheus["🔥 Prometheus<br/>(Planner)<br/>Claude Opus 4.5"]
Metis["🦉 Metis<br/>(Consultant)<br/>Claude Opus 4.5"]
Prometheus["🔥 Prometheus<br/>(Planner)<br/>Claude Opus 4.6"]
Metis["🦉 Metis<br/>(Consultant)<br/>Claude Opus 4.6"]
Momus["👁️ Momus<br/>(Reviewer)<br/>GPT-5.2"]
end
subgraph Execution["Execution Layer (Orchestrator)"]
Orchestrator["⚡ Atlas<br/>(Conductor)<br/>Claude Opus 4.5"]
Orchestrator["⚡ Atlas<br/>(Conductor)<br/>K2P5 (Kimi)"]
end
subgraph Workers["Worker Layer (Specialized Agents)"]
@@ -294,12 +294,13 @@ task(category="quick", prompt="...") // "Just get it done fast"
| Category | Model | When to Use |
|----------|-------|-------------|
| `visual-engineering` | Gemini 3 Pro | Frontend, UI/UX, design, styling, animation |
| `ultrabrain` | GPT-5.2 Codex (xhigh) | Deep logical reasoning, complex architecture decisions |
| `ultrabrain` | GPT-5.3 Codex (xhigh) | Deep logical reasoning, complex architecture decisions |
| `artistry` | Gemini 3 Pro (max) | Highly creative/artistic tasks, novel ideas |
| `quick` | Claude Haiku 4.5 | Trivial tasks - single file changes, typo fixes |
| `deep` | GPT-5.3 Codex (medium) | Goal-oriented autonomous problem-solving, thorough research |
| `unspecified-low` | Claude Sonnet 4.5 | Tasks that don't fit other categories, low effort |
| `unspecified-high` | Claude Opus 4.5 (max) | Tasks that don't fit other categories, high effort |
| `writing` | Gemini 3 Flash | Documentation, prose, technical writing |
| `unspecified-high` | Claude Opus 4.6 (max) | Tasks that don't fit other categories, high effort |
| `writing` | K2P5 (Kimi) | Documentation, prose, technical writing |
### Custom Categories

View File

@@ -160,7 +160,7 @@ Another common question: **When should I use Hephaestus vs just typing `ulw` in
| Aspect | Hephaestus | Sisyphus + `ulw` / `ultrawork` |
|--------|-----------|-------------------------------|
| **Model** | GPT-5.2 Codex (medium reasoning) | Claude Opus 4.5 (your default) |
| **Model** | GPT-5.3 Codex (medium reasoning) | Claude Opus 4.6 (your default) |
| **Approach** | Autonomous deep worker | Keyword-activated ultrawork mode |
| **Best For** | Complex architectural work, deep reasoning | General complex tasks, "just do it" scenarios |
| **Planning** | Self-plans during execution | Uses Prometheus plans if available |
@@ -183,8 +183,8 @@ Switch to Hephaestus (Tab → Select Hephaestus) when:
- "Integrate our Rust core with the TypeScript frontend"
- "Migrate from MongoDB to PostgreSQL with zero downtime"
4. **You specifically want GPT-5.2 Codex reasoning**
- Some problems benefit from GPT-5.2's training characteristics
4. **You specifically want GPT-5.3 Codex reasoning**
- Some problems benefit from GPT-5.3 Codex's training characteristics
**Example:**
```
@@ -231,7 +231,7 @@ Use the `ulw` keyword in Sisyphus when:
| Hephaestus | Sisyphus + ulw |
|------------|----------------|
| You manually switch to Hephaestus agent | You type `ulw` in any Sisyphus session |
| GPT-5.2 Codex with medium reasoning | Your configured default model |
| GPT-5.3 Codex with medium reasoning | Your configured default model |
| Optimized for autonomous deep work | Optimized for general execution |
| Always uses explore-first approach | Respects existing plans if available |
| "Smart intern that needs no supervision" | "Smart intern that follows your workflow" |
@@ -240,7 +240,7 @@ Use the `ulw` keyword in Sisyphus when:
**For most users**: Use `ulw` keyword in Sisyphus. It's the default path and works excellently for 90% of complex tasks.
**For power users**: Switch to Hephaestus when you specifically need GPT-5.2 Codex's reasoning style or want the "AmpCode deep mode" experience of fully autonomous exploration and execution.
**For power users**: Switch to Hephaestus when you specifically need GPT-5.3 Codex's reasoning style or want the "AmpCode deep mode" experience of fully autonomous exploration and execution.
---
@@ -354,7 +354,7 @@ Press `Tab` at the prompt to see available agents:
|-------|---------------|
| **Prometheus** | You want to create a detailed work plan |
| **Atlas** | You want to manually control plan execution (rare) |
| **Hephaestus** | You need GPT-5.2 Codex for deep autonomous work |
| **Hephaestus** | You need GPT-5.3 Codex for deep autonomous work |
| **Sisyphus** | Return to default agent for normal prompting |
---
@@ -421,4 +421,4 @@ Type `exit` or start a new session. Atlas is primarily entered via `/start-work`
**For most tasks**: Type `ulw` in Sisyphus.
**Use Hephaestus when**: You specifically need GPT-5.2 Codex's reasoning style for deep architectural work or complex debugging.
**Use Hephaestus when**: You specifically need GPT-5.3 Codex's reasoning style for deep architectural work or complex debugging.

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode",
"version": "3.5.5",
"version": "3.7.1",
"description": "The Best AI Agent Harness - Batteries-Included OpenCode Plugin with Multi-Model Orchestration, Parallel Background Agents, and Crafted LSP/AST Tools",
"main": "dist/index.js",
"types": "dist/index.d.ts",
@@ -74,13 +74,13 @@
"typescript": "^5.7.3"
},
"optionalDependencies": {
"oh-my-opencode-darwin-arm64": "3.5.5",
"oh-my-opencode-darwin-x64": "3.5.5",
"oh-my-opencode-linux-arm64": "3.5.5",
"oh-my-opencode-linux-arm64-musl": "3.5.5",
"oh-my-opencode-linux-x64": "3.5.5",
"oh-my-opencode-linux-x64-musl": "3.5.5",
"oh-my-opencode-windows-x64": "3.5.5"
"oh-my-opencode-darwin-arm64": "3.7.1",
"oh-my-opencode-darwin-x64": "3.7.1",
"oh-my-opencode-linux-arm64": "3.7.1",
"oh-my-opencode-linux-arm64-musl": "3.7.1",
"oh-my-opencode-linux-x64": "3.7.1",
"oh-my-opencode-linux-x64-musl": "3.7.1",
"oh-my-opencode-windows-x64": "3.7.1"
},
"trustedDependencies": [
"@ast-grep/cli",

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-darwin-arm64",
"version": "3.5.5",
"version": "3.7.1",
"description": "Platform-specific binary for oh-my-opencode (darwin-arm64)",
"license": "MIT",
"repository": {

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-darwin-x64",
"version": "3.5.5",
"version": "3.7.1",
"description": "Platform-specific binary for oh-my-opencode (darwin-x64)",
"license": "MIT",
"repository": {

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-arm64-musl",
"version": "3.5.5",
"version": "3.7.1",
"description": "Platform-specific binary for oh-my-opencode (linux-arm64-musl)",
"license": "MIT",
"repository": {

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-arm64",
"version": "3.5.5",
"version": "3.7.1",
"description": "Platform-specific binary for oh-my-opencode (linux-arm64)",
"license": "MIT",
"repository": {

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-x64-musl",
"version": "3.5.5",
"version": "3.7.1",
"description": "Platform-specific binary for oh-my-opencode (linux-x64-musl)",
"license": "MIT",
"repository": {

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-x64",
"version": "3.5.5",
"version": "3.7.1",
"description": "Platform-specific binary for oh-my-opencode (linux-x64)",
"license": "MIT",
"repository": {

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-windows-x64",
"version": "3.5.5",
"version": "3.7.1",
"description": "Platform-specific binary for oh-my-opencode (windows-x64)",
"license": "MIT",
"repository": {

View File

@@ -1503,6 +1503,30 @@
"created_at": "2026-02-14T19:58:19Z",
"repoId": 1108837393,
"pullRequestNo": 1845
},
{
"name": "Decrabbityyy",
"id": 99632363,
"comment_id": 3904649522,
"created_at": "2026-02-15T15:07:11Z",
"repoId": 1108837393,
"pullRequestNo": 1864
},
{
"name": "dankochetov",
"id": 33990502,
"comment_id": 3905398332,
"created_at": "2026-02-15T23:17:05Z",
"repoId": 1108837393,
"pullRequestNo": 1870
},
{
"name": "xinpengdr",
"id": 1885607,
"comment_id": 3910093356,
"created_at": "2026-02-16T19:01:33Z",
"repoId": 1108837393,
"pullRequestNo": 1906
}
]
}

View File

@@ -1,80 +1,41 @@
# SRC KNOWLEDGE BASE
# src/ — Plugin Source
**Generated:** 2026-02-17
## OVERVIEW
Main plugin entry point and orchestration layer. Plugin initialization, hook registration, tool composition, and lifecycle management.
Root source directory. Entry point `index.ts` orchestrates 4-step initialization: config → managers → tools → hooks → plugin interface.
## KEY FILES
| File | Purpose |
|------|---------|
| `index.ts` | Plugin entry, exports `OhMyOpenCodePlugin` |
| `plugin-config.ts` | JSONC parse, multi-level merge (user → project → defaults), Zod validation |
| `create-managers.ts` | TmuxSessionManager, BackgroundManager, SkillMcpManager, ConfigHandler |
| `create-tools.ts` | SkillContext + AvailableCategories + ToolRegistry |
| `create-hooks.ts` | 3-tier hook composition: Core(32) + Continuation(7) + Skill(2) |
| `plugin-interface.ts` | Assembles 8 OpenCode hook handlers into PluginInterface |
## CONFIG LOADING
## STRUCTURE
```
src/
├── index.ts # Main plugin entry (88 lines) — OhMyOpenCodePlugin factory
├── create-hooks.ts # Hook coordination: core, continuation, skill (62 lines)
├── create-managers.ts # Manager initialization: Tmux, Background, SkillMcp, Config (80 lines)
├── create-tools.ts # Tool registry + skill context composition (54 lines)
├── plugin-interface.ts # Plugin interface assembly — 7 OpenCode hooks (66 lines)
├── plugin-config.ts # Config loading orchestration (user + project merge)
├── plugin-state.ts # Model cache state (context limits, anthropic 1M flag)
├── agents/ # 11 AI agents (32 files) - see agents/AGENTS.md
├── cli/ # CLI installer, doctor (107+ files) - see cli/AGENTS.md
├── config/ # Zod schema (21 component files) - see config/AGENTS.md
├── features/ # Background agents, skills, commands (18 dirs) - see features/AGENTS.md
├── hooks/ # 41 lifecycle hooks (36 dirs) - see hooks/AGENTS.md
├── mcp/ # Built-in MCPs (6 files) - see mcp/AGENTS.md
├── plugin/ # Plugin interface composition (21 files)
├── plugin-handlers/ # Config loading, plan inheritance (15 files) - see plugin-handlers/AGENTS.md
├── shared/ # Cross-cutting utilities (84 files) - see shared/AGENTS.md
└── tools/ # 25+ tools (14 dirs) - see tools/AGENTS.md
loadPluginConfig(directory, ctx)
1. User: ~/.config/opencode/oh-my-opencode.jsonc
2. Project: .opencode/oh-my-opencode.jsonc
3. mergeConfigs(user, project) → deepMerge for agents/categories, Set union for disabled_*
4. Zod safeParse → defaults for omitted fields
5. migrateConfigFile() → legacy key transformation
```
## PLUGIN INITIALIZATION (10 steps)
## HOOK COMPOSITION
1. `injectServerAuthIntoClient(ctx.client)` — Auth injection
2. `startTmuxCheck()` — Tmux availability
3. `loadPluginConfig(ctx.directory, ctx)` — User + project config merge → Zod validation
4. `createFirstMessageVariantGate()` — First message variant override gate
5. `createModelCacheState()` — Model context limits cache
6. `createManagers(...)` → 4 managers:
- `TmuxSessionManager` — Multi-pane tmux sessions
- `BackgroundManager` — Parallel subagent execution
- `SkillMcpManager` — MCP server lifecycle
- `ConfigHandler` — Plugin config API to OpenCode
7. `createTools(...)``createSkillContext()` + `createAvailableCategories()` + `createToolRegistry()`
8. `createHooks(...)``createCoreHooks()` + `createContinuationHooks()` + `createSkillHooks()`
9. `createPluginInterface(...)` → 7 OpenCode hook handlers
10. Return plugin with `experimental.session.compacting`
## HOOK REGISTRATION (3 tiers)
**Core Hooks** (`create-core-hooks.ts`):
- Session (20): context-window-monitor, session-recovery, think-mode, ralph-loop, anthropic-effort, ...
- Tool Guard (8): comment-checker, tool-output-truncator, rules-injector, write-existing-file-guard, ...
- Transform (4): claude-code-hooks, keyword-detector, context-injector, thinking-block-validator
**Continuation Hooks** (`create-continuation-hooks.ts`):
- 7 hooks: stop-continuation-guard, compaction-context-injector, todo-continuation-enforcer, atlas, ...
**Skill Hooks** (`create-skill-hooks.ts`):
- 2 hooks: category-skill-reminder, auto-slash-command
## PLUGIN INTERFACE (7 OpenCode handlers)
| Handler | Source | Purpose |
|---------|--------|---------|
| `tool` | filteredTools | All registered tools |
| `chat.params` | createChatParamsHandler | Anthropic effort level |
| `chat.message` | createChatMessageHandler | First message variant, session setup |
| `experimental.chat.messages.transform` | createMessagesTransformHandler | Context injection, keyword detection |
| `config` | configHandler | Agent/MCP/command registration |
| `event` | createEventHandler | Session lifecycle |
| `tool.execute.before` | createToolExecuteBeforeHandler | Pre-tool hooks |
| `tool.execute.after` | createToolExecuteAfterHandler | Post-tool hooks |
## SAFE HOOK CREATION PATTERN
```typescript
const hook = isHookEnabled("hook-name")
? safeCreateHook("hook-name", () => createHookFactory(ctx), { enabled: safeHookEnabled })
: null;
```
All hooks use this pattern for graceful degradation on failure.
createHooks()
├─→ createCoreHooks() # 32 hooks
│ ├─ createSessionHooks() # 19: contextWindowMonitor, thinkMode, ralphLoop, sessionRecovery...
│ ├─ createToolGuardHooks() # 9: commentChecker, rulesInjector, writeExistingFileGuard...
│ └─ createTransformHooks() # 4: claudeCodeHooks, keywordDetector, contextInjector, thinkingBlockValidator
├─→ createContinuationHooks() # 7: todoContinuationEnforcer, atlas, stopContinuationGuard...
└─→ createSkillHooks() # 2: categorySkillReminder, autoSlashCommand
```

View File

@@ -1,100 +1,79 @@
# AGENTS KNOWLEDGE BASE
# src/agents/ — 11 Agent Definitions
**Generated:** 2026-02-17
## OVERVIEW
11 AI agents with factory functions, fallback chains, and model-specific prompt variants. Each agent has metadata (category, cost, triggers) and configurable tool restrictions.
Agent factories following `createXXXAgent(model) → AgentConfig` pattern. Each has static `mode` property. Built via `buildAgent()` compositing factory + categories + skills.
## STRUCTURE
```
agents/
├── sisyphus.ts # Main orchestrator (530 lines)
├── hephaestus.ts # Autonomous deep worker (624 lines)
├── oracle.ts # Strategic advisor (170 lines)
├── librarian.ts # Multi-repo research (328 lines)
├── explore.ts # Fast codebase grep (124 lines)
├── multimodal-looker.ts # Media analyzer (58 lines)
├── metis.ts # Pre-planning analysis (347 lines)
├── momus.ts # Plan validator (244 lines)
├── atlas/ # Master orchestrator
│ ├── agent.ts # Atlas factory
│ ├── default.ts # Claude-optimized prompt
│ ├── gpt.ts # GPT-optimized prompt
│ └── utils.ts
├── prometheus/ # Planning agent
│ ├── index.ts
│ ├── system-prompt.ts # 6-section prompt assembly
│ ├── plan-template.ts # Work plan structure (423 lines)
│ ├── interview-mode.ts # Interview flow (335 lines)
│ ├── plan-generation.ts
│ ├── high-accuracy-mode.ts
│ ├── identity-constraints.ts # Identity rules (301 lines)
│ └── behavioral-summary.ts
├── sisyphus-junior/ # Delegated task executor
│ ├── agent.ts
│ ├── default.ts # Claude prompt
│ └── gpt.ts # GPT prompt
├── dynamic-agent-prompt-builder.ts # Dynamic prompt generation (431 lines)
├── builtin-agents/ # Agent registry (8 files)
├── utils.ts # Agent creation, model fallback resolution (571 lines)
├── types.ts # AgentModelConfig, AgentPromptMetadata
└── index.ts # Exports
```
## AGENT INVENTORY
## AGENT MODELS
| Agent | Model | Temp | Fallback Chain | Cost |
|-------|-------|------|----------------|------|
| Sisyphus | claude-opus-4-6 | 0.1 | kimi-k2.5 → glm-4.7 → gpt-5.3-codex → gemini-3-pro | EXPENSIVE |
| Hephaestus | gpt-5.3-codex | 0.1 | NONE (required) | EXPENSIVE |
| Atlas | claude-sonnet-4-5 | 0.1 | kimi-k2.5 → gpt-5.2 | EXPENSIVE |
| Prometheus | claude-opus-4-6 | 0.1 | kimi-k2.5 → gpt-5.2 | EXPENSIVE |
| oracle | gpt-5.2 | 0.1 | claude-opus-4-6 | EXPENSIVE |
| librarian | glm-4.7 | 0.1 | glm-4.7-free | CHEAP |
| explore | grok-code-fast-1 | 0.1 | claude-haiku-4-5 → gpt-5-mini → gpt-5-nano | FREE |
| multimodal-looker | gemini-3-flash | 0.1 | NONE | CHEAP |
| Metis | claude-opus-4-6 | 0.3 | kimi-k2.5 → gpt-5.2 | EXPENSIVE |
| Momus | gpt-5.2 | 0.1 | claude-opus-4-6 | EXPENSIVE |
| Sisyphus-Junior | claude-sonnet-4-5 | 0.1 | (user-configurable) | EXPENSIVE |
| Agent | Model | Temp | Mode | Fallback Chain | Purpose |
|-------|-------|------|------|----------------|---------|
| **Sisyphus** | claude-opus-4-6 | 0.1 | primary | kimi-k2.5 → glm-4.7 → gemini-3-pro | Main orchestrator, plans + delegates |
| **Hephaestus** | gpt-5.3-codex | 0.1 | primary | NONE (required) | Autonomous deep worker |
| **Oracle** | gpt-5.2 | 0.1 | subagent | claude-opus-4-6 → gemini-3-pro | Read-only consultation |
| **Librarian** | glm-4.7 | 0.1 | subagent | glm-4.7-free → claude-sonnet-4-5 | External docs/code search |
| **Explore** | grok-code-fast-1 | 0.1 | subagent | claude-haiku-4-5 → gpt-5-nano | Contextual grep |
| **Multimodal-Looker** | gemini-3-flash | 0.1 | subagent | gpt-5.2 → glm-4.6v → ... (6 deep) | PDF/image analysis |
| **Metis** | claude-opus-4-6 | **0.3** | subagent | kimi-k2.5 → gpt-5.2 → gemini-3-pro | Pre-planning consultant |
| **Momus** | gpt-5.2 | 0.1 | subagent | claude-opus-4-6 → gemini-3-pro | Plan reviewer |
| **Atlas** | claude-sonnet-4-5 | 0.1 | primary | kimi-k2.5 → gpt-5.2 → gemini-3-pro | Todo-list orchestrator |
| **Prometheus** | claude-opus-4-6 | 0.1 | — | kimi-k2.5 → gpt-5.2 → gemini-3-pro | Strategic planner (internal) |
| **Sisyphus-Junior** | claude-sonnet-4-5 | 0.1 | all | user-configurable | Category-spawned executor |
## TOOL RESTRICTIONS
| Agent | Denied | Allowed |
|-------|--------|---------|
| oracle | write, edit, task, call_omo_agent | Read-only consultation |
| librarian | write, edit, task, call_omo_agent | Research tools only |
| explore | write, edit, task, call_omo_agent | Search tools only |
| multimodal-looker | ALL except `read` | Vision-only |
| Sisyphus-Junior | task | No delegation |
| Atlas | task, call_omo_agent | Orchestration only |
| Agent | Denied Tools |
|-------|-------------|
| Oracle | write, edit, task, call_omo_agent |
| Librarian | write, edit, task, call_omo_agent |
| Explore | write, edit, task, call_omo_agent |
| Multimodal-Looker | ALL except read |
| Atlas | task, call_omo_agent |
| Momus | write, edit, task |
## THINKING / REASONING
## STRUCTURE
| Agent | Claude | GPT |
|-------|--------|-----|
| Sisyphus | 32k budget tokens | reasoningEffort: "medium" |
| Hephaestus | — | reasoningEffort: "medium" |
| Oracle | 32k budget tokens | reasoningEffort: "medium" |
| Metis | 32k budget tokens | — |
| Momus | 32k budget tokens | reasoningEffort: "medium" |
| Sisyphus-Junior | 32k budget tokens | reasoningEffort: "medium" |
```
agents/
├── sisyphus.ts # 559 LOC, main orchestrator
├── hephaestus.ts # 507 LOC, autonomous worker
├── oracle.ts # Read-only consultant
├── librarian.ts # External search
├── explore.ts # Codebase grep
├── multimodal-looker.ts # Vision/PDF
├── metis.ts # Pre-planning
├── momus.ts # Plan review
├── atlas/agent.ts # Todo orchestrator
├── types.ts # AgentFactory, AgentMode
├── agent-builder.ts # buildAgent() composition
├── utils.ts # Agent utilities
├── builtin-agents.ts # createBuiltinAgents() registry
└── builtin-agents/ # maybeCreateXXXConfig conditional factories
├── sisyphus-agent.ts
├── hephaestus-agent.ts
├── atlas-agent.ts
├── general-agents.ts # collectPendingBuiltinAgents
└── available-skills.ts
```
## HOW TO ADD
## FACTORY PATTERN
1. Create `src/agents/my-agent.ts` exporting factory + metadata
2. Add to `agentSources` in `src/agents/builtin-agents/`
3. Update `AgentNameSchema` in `src/config/schema/agent-names.ts`
4. Register in `src/plugin-handlers/agent-config-handler.ts`
```typescript
const createXXXAgent: AgentFactory = (model: string) => ({
instructions: "...",
model,
temperature: 0.1,
// ...config
})
createXXXAgent.mode = "subagent" // or "primary" or "all"
```
## KEY PATTERNS
Model resolution: `AGENT_MODEL_REQUIREMENTS` in `shared/model-requirements.ts` defines fallback chains per agent.
- **Factory**: `createXXXAgent(model): AgentConfig`
- **Metadata**: `XXX_PROMPT_METADATA` with category, cost, triggers
- **Model-specific prompts**: Atlas, Sisyphus-Junior have GPT vs Claude variants
- **Dynamic prompts**: Sisyphus, Hephaestus use `dynamic-agent-prompt-builder.ts` to inject available tools/skills/categories
## MODES
## ANTI-PATTERNS
- **Trust agent self-reports**: NEVER — always verify outputs
- **High temperature**: Don't use >0.3 for code agents
- **Sequential calls**: Use `task` with `run_in_background` for exploration
- **Prometheus writing code**: Planner only — never implements
- **primary**: Respects UI-selected model, uses fallback chain
- **subagent**: Uses own fallback chain, ignores UI selection
- **all**: Available in both contexts (Sisyphus-Junior)

View File

@@ -13,7 +13,11 @@ import { createAtlasAgent, atlasPromptMetadata } from "./atlas"
import { createMomusAgent, momusPromptMetadata } from "./momus"
import { createHephaestusAgent } from "./hephaestus"
import type { AvailableCategory } from "./dynamic-agent-prompt-builder"
import { fetchAvailableModels, readConnectedProvidersCache } from "../shared"
import {
fetchAvailableModels,
readConnectedProvidersCache,
readProviderModelsCache,
} from "../shared"
import { CATEGORY_DESCRIPTIONS } from "../tools/delegate-task/constants"
import { mergeCategories } from "../shared/merge-categories"
import { buildAvailableSkills } from "./builtin-agents/available-skills"
@@ -68,14 +72,20 @@ export async function createBuiltinAgents(
useTaskSystem = false
): Promise<Record<string, AgentConfig>> {
const connectedProviders = readConnectedProvidersCache()
const providerModelsConnected = connectedProviders
? (readProviderModelsCache()?.connected ?? [])
: []
const mergedConnectedProviders = Array.from(
new Set([...(connectedProviders ?? []), ...providerModelsConnected])
)
// IMPORTANT: Do NOT call OpenCode client APIs during plugin initialization.
// This function is called from config handler, and calling client API causes deadlock.
// See: https://github.com/code-yeongyu/oh-my-opencode/issues/1301
const availableModels = await fetchAvailableModels(undefined, {
connectedProviders: connectedProviders ?? undefined,
connectedProviders: mergedConnectedProviders.length > 0 ? mergedConnectedProviders : undefined,
})
const isFirstRunNoCache =
availableModels.size === 0 && (!connectedProviders || connectedProviders.length === 0)
availableModels.size === 0 && mergedConnectedProviders.length === 0
const result: Record<string, AgentConfig> = {}

View File

@@ -64,8 +64,8 @@ describe("buildCategorySkillsDelegationGuide", () => {
const result = buildCategorySkillsDelegationGuide(categories, allSkills)
//#then: should show source for each custom skill
expect(result).toContain("| user |")
expect(result).toContain("| project |")
expect(result).toContain("(user)")
expect(result).toContain("(project)")
})
it("should not show custom skill section when only builtin skills exist", () => {

View File

@@ -87,12 +87,9 @@ export function buildToolSelectionTable(
"",
]
rows.push("| Resource | Cost | When to Use |")
rows.push("|----------|------|-------------|")
if (tools.length > 0) {
const toolsDisplay = formatToolsForPrompt(tools)
rows.push(`| ${toolsDisplay} | FREE | Not Complex, Scope Clear, No Implicit Assumptions |`)
rows.push(`- ${toolsDisplay} — **FREE** — Not Complex, Scope Clear, No Implicit Assumptions`)
}
const costOrder = { FREE: 0, CHEAP: 1, EXPENSIVE: 2 }
@@ -102,7 +99,7 @@ export function buildToolSelectionTable(
for (const agent of sortedAgents) {
const shortDesc = agent.description.split(".")[0] || agent.description
rows.push(`| \`${agent.name}\` agent | ${agent.metadata.cost} | ${shortDesc} |`)
rows.push(`- \`${agent.name}\` agent — **${agent.metadata.cost}** — ${shortDesc}`)
}
rows.push("")
@@ -122,10 +119,11 @@ export function buildExploreSection(agents: AvailableAgent[]): string {
Use it as a **peer tool**, not a fallback. Fire liberally.
| Use Direct Tools | Use Explore Agent |
|------------------|-------------------|
${avoidWhen.map((w) => `| ${w} | |`).join("\n")}
${useWhen.map((w) => `| | ${w} |`).join("\n")}`
**Use Direct Tools when:**
${avoidWhen.map((w) => `- ${w}`).join("\n")}
**Use Explore Agent when:**
${useWhen.map((w) => `- ${w}`).join("\n")}`
}
export function buildLibrarianSection(agents: AvailableAgent[]): string {
@@ -138,14 +136,8 @@ export function buildLibrarianSection(agents: AvailableAgent[]): string {
Search **external references** (docs, OSS, web). Fire proactively when unfamiliar libraries are involved.
| Contextual Grep (Internal) | Reference Grep (External) |
|----------------------------|---------------------------|
| Search OUR codebase | Search EXTERNAL resources |
| Find patterns in THIS repo | Find examples in OTHER repos |
| How does our code work? | How does this library work? |
| Project-specific logic | Official API documentation |
| | Library best practices & quirks |
| | OSS implementation examples |
**Contextual Grep (Internal)** — search OUR codebase, find patterns in THIS repo, project-specific logic.
**Reference Grep (External)** — search EXTERNAL resources, official API docs, library best practices, OSS implementation examples.
**Trigger phrases** (fire librarian immediately):
${useWhen.map((w) => `- "${w}"`).join("\n")}`
@@ -155,13 +147,11 @@ export function buildDelegationTable(agents: AvailableAgent[]): string {
const rows: string[] = [
"### Delegation Table:",
"",
"| Domain | Delegate To | Trigger |",
"|--------|-------------|---------|",
]
for (const agent of agents) {
for (const trigger of agent.metadata.triggers) {
rows.push(`| ${trigger.domain} | \`${agent.name}\` | ${trigger.trigger} |`)
rows.push(`- **${trigger.domain}** → \`${agent.name}\` ${trigger.trigger}`)
}
}
@@ -187,8 +177,6 @@ export function formatCustomSkillsBlock(
**The user has installed these custom skills. They MUST be evaluated for EVERY delegation.**
Subagents are STATELESS — they lose all custom knowledge unless you pass these skills via \`load_skills\`.
| Skill | Expertise Domain | Source |
|-------|------------------|--------|
${customRows.join("\n")}
> **CRITICAL**: Ignoring user-installed skills when they match the task domain is a failure.
@@ -200,7 +188,7 @@ export function buildCategorySkillsDelegationGuide(categories: AvailableCategory
const categoryRows = categories.map((c) => {
const desc = c.description || c.name
return `| \`${c.name}\` | ${desc} |`
return `- \`${c.name}\` ${desc}`
})
const builtinSkills = skills.filter((s) => s.location === "plugin")
@@ -208,13 +196,13 @@ export function buildCategorySkillsDelegationGuide(categories: AvailableCategory
const builtinRows = builtinSkills.map((s) => {
const desc = truncateDescription(s.description)
return `| \`${s.name}\` | ${desc} |`
return `- \`${s.name}\` ${desc}`
})
const customRows = customSkills.map((s) => {
const desc = truncateDescription(s.description)
const source = s.location === "project" ? "project" : "user"
return `| \`${s.name}\` | ${desc} | ${source} |`
return `- \`${s.name}\` (${source}) — ${desc}`
})
const customSkillBlock = formatCustomSkillsBlock(customRows, customSkills)
@@ -224,8 +212,6 @@ export function buildCategorySkillsDelegationGuide(categories: AvailableCategory
if (customSkills.length > 0 && builtinSkills.length > 0) {
skillsSection = `#### Built-in Skills
| Skill | Expertise Domain |
|-------|------------------|
${builtinRows.join("\n")}
${customSkillBlock}`
@@ -236,8 +222,6 @@ ${customSkillBlock}`
Skills inject specialized instructions into the subagent. Read the description to understand when each skill applies.
| Skill | Expertise Domain |
|-------|------------------|
${builtinRows.join("\n")}`
}
@@ -249,8 +233,6 @@ ${builtinRows.join("\n")}`
Each category is configured with a model optimized for that domain. Read the description to understand when to use it.
| Category | Domain / Best For |
|----------|-------------------|
${categoryRows.join("\n")}
${skillsSection}
@@ -322,11 +304,9 @@ export function buildOracleSection(agents: AvailableAgent[]): string {
Oracle is a read-only, expensive, high-quality reasoning model for debugging and architecture. Consultation only.
### WHEN to Consult:
### WHEN to Consult (Oracle FIRST, then implement):
| Trigger | Action |
|---------|--------|
${useWhen.map((w) => `| ${w} | Oracle FIRST, then implement |`).join("\n")}
${useWhen.map((w) => `- ${w}`).join("\n")}
### WHEN NOT to Consult:
@@ -336,37 +316,46 @@ ${avoidWhen.map((w) => `- ${w}`).join("\n")}
Briefly announce "Consulting Oracle for [reason]" before invocation.
**Exception**: This is the ONLY case where you announce before acting. For all other work, start immediately without status updates.
### Oracle Background Task Policy:
**You MUST collect Oracle results before your final answer. No exceptions.**
- Oracle may take several minutes. This is normal and expected.
- When Oracle is running and you finish your own exploration/analysis, your next action is \`background_output(task_id="...")\` on Oracle — NOT delivering a final answer.
- Oracle catches blind spots you cannot see — its value is HIGHEST when you think you don't need it.
- **NEVER** cancel Oracle. **NEVER** use \`background_cancel(all=true)\` when Oracle is running. Cancel disposable tasks (explore, librarian) individually by taskId instead.
</Oracle_Usage>`
}
export function buildHardBlocksSection(): string {
const blocks = [
"| Type error suppression (`as any`, `@ts-ignore`) | Never |",
"| Commit without explicit request | Never |",
"| Speculate about unread code | Never |",
"| Leave code in broken state after failures | Never |",
"- Type error suppression (`as any`, `@ts-ignore`) — **Never**",
"- Commit without explicit request — **Never**",
"- Speculate about unread code — **Never**",
"- Leave code in broken state after failures — **Never**",
"- `background_cancel(all=true)` when Oracle is running — **Never.** Cancel tasks individually by taskId.",
"- Delivering final answer before collecting Oracle result — **Never.** Always `background_output` Oracle first.",
]
return `## Hard Blocks (NEVER violate)
| Constraint | No Exceptions |
|------------|---------------|
${blocks.join("\n")}`
}
export function buildAntiPatternsSection(): string {
const patterns = [
"| **Type Safety** | `as any`, `@ts-ignore`, `@ts-expect-error` |",
"| **Error Handling** | Empty catch blocks `catch(e) {}` |",
"| **Testing** | Deleting failing tests to \"pass\" |",
"| **Search** | Firing agents for single-line typos or obvious syntax errors |",
"| **Debugging** | Shotgun debugging, random changes |",
"- **Type Safety**: `as any`, `@ts-ignore`, `@ts-expect-error`",
"- **Error Handling**: Empty catch blocks `catch(e) {}`",
"- **Testing**: Deleting failing tests to \"pass\"",
"- **Search**: Firing agents for single-line typos or obvious syntax errors",
"- **Debugging**: Shotgun debugging, random changes",
"- **Background Tasks**: `background_cancel(all=true)` — always cancel individually by taskId",
"- **Oracle**: Skipping Oracle results when Oracle was launched — ALWAYS collect via `background_output`",
]
return `## Anti-Patterns (BLOCKING violations)
| Category | Forbidden |
|----------|-----------|
${patterns.join("\n")}`
}

View File

@@ -31,15 +31,15 @@ function buildTodoDisciplineSection(useTaskSystem: boolean): string {
| Trigger | Action |
|---------|--------|
| 2+ step task | \`TaskCreate\` FIRST, atomic breakdown |
| Uncertain scope | \`TaskCreate\` to clarify thinking |
| 2+ step task | \`task_create\` FIRST, atomic breakdown |
| Uncertain scope | \`task_create\` to clarify thinking |
| Complex single task | Break down into trackable steps |
### Workflow (STRICT)
1. **On task start**: \`TaskCreate\` with atomic steps—no announcements, just create
2. **Before each step**: \`TaskUpdate(status="in_progress")\` (ONE at a time)
3. **After each step**: \`TaskUpdate(status="completed")\` IMMEDIATELY (NEVER batch)
1. **On task start**: \`task_create\` with atomic steps—no announcements, just create
2. **Before each step**: \`task_update(status=\"in_progress\")\` (ONE at a time)
3. **After each step**: \`task_update(status=\"completed\")\` IMMEDIATELY (NEVER batch)
4. **Scope changes**: Update tasks BEFORE proceeding
### Why This Matters
@@ -103,7 +103,7 @@ function buildTodoDisciplineSection(useTaskSystem: boolean): string {
* Named after the Greek god of forge, fire, metalworking, and craftsmanship.
* Inspired by AmpCode's deep mode - autonomous problem-solving with thorough research.
*
* Powered by GPT 5.2 Codex with medium reasoning effort.
* Powered by GPT Codex models.
* Optimized for:
* - Goal-oriented autonomous execution (not step-by-step instructions)
* - Deep exploration before decisive action
@@ -138,54 +138,36 @@ function buildHephaestusPrompt(
return `You are Hephaestus, an autonomous deep worker for software engineering.
## Reasoning Configuration (ROUTER NUDGE - GPT 5.2)
## Identity
Engage MEDIUM reasoning effort for all code modifications and architectural decisions.
Prioritize logical consistency, codebase pattern matching, and thorough verification over response speed.
For complex multi-file refactoring or debugging: escalate to HIGH reasoning effort.
You operate as a **Senior Staff Engineer**. You do not guess. You verify. You do not stop early. You complete.
## Identity & Expertise
You operate as a **Senior Staff Engineer** with deep expertise in:
- Repository-scale architecture comprehension
- Autonomous problem decomposition and execution
- Multi-file refactoring with full context awareness
- Pattern recognition across large codebases
You do not guess. You verify. You do not stop early. You complete.
## Core Principle (HIGHEST PRIORITY)
**KEEP GOING. SOLVE PROBLEMS. ASK ONLY WHEN TRULY IMPOSSIBLE.**
When blocked:
1. Try a different approach (there's always another way)
2. Decompose the problem into smaller pieces
3. Challenge your assumptions
4. Explore how others solved similar problems
**You must keep going until the task is completely resolved, before ending your turn.** Persist until the task is fully handled end-to-end within the current turn. Persevere even when tool calls fail. Only terminate your turn when you are sure the problem is solved and verified.
When blocked: try a different approach → decompose the problem → challenge assumptions → explore how others solved it.
Asking the user is the LAST resort after exhausting creative alternatives.
Your job is to SOLVE problems, not report them.
## Hard Constraints (MUST READ FIRST - GPT 5.2 Constraint-First)
### Do NOT Ask — Just Do
**FORBIDDEN:**
- "Should I proceed with X?" → JUST DO IT.
- "Do you want me to run tests?" → RUN THEM.
- "I noticed Y, should I fix it?" → FIX IT OR NOTE IN FINAL MESSAGE.
- Stopping after partial implementation → 100% OR NOTHING.
**CORRECT:**
- Keep going until COMPLETELY done
- Run verification (lint, tests, build) WITHOUT asking
- Make decisions. Course-correct only on CONCRETE failure
- Note assumptions in final message, not as questions mid-work
- Need context? Fire explore/librarian in background IMMEDIATELY — keep working while they search
## Hard Constraints
${hardBlocks}
${antiPatterns}
## Success Criteria (COMPLETION DEFINITION)
A task is COMPLETE when ALL of the following are TRUE:
1. All requested functionality implemented exactly as specified
2. \`lsp_diagnostics\` returns zero errors on ALL modified files
3. Build command exits with code 0 (if applicable)
4. Tests pass (or pre-existing failures documented)
5. No temporary/debug code remains
6. Code matches existing codebase patterns (verified via exploration)
7. Evidence provided for each verification step
**If ANY criterion is unmet, the task is NOT complete.**
## Phase 0 - Intent Gate (EVERY task)
${keyTriggers}
@@ -200,80 +182,46 @@ ${keyTriggers}
| **Open-ended** | "Improve", "Refactor", "Add feature" | Full Execution Loop required |
| **Ambiguous** | Unclear scope, multiple interpretations | Ask ONE clarifying question |
### Step 2: Handle Ambiguity WITHOUT Questions (GPT 5.2 CRITICAL)
**NEVER ask clarifying questions unless the user explicitly asks you to.**
**Default: EXPLORE FIRST. Questions are the LAST resort.**
### Step 2: Ambiguity Protocol (EXPLORE FIRST — NEVER ask before exploring)
| Situation | Action |
|-----------|--------|
| Single valid interpretation | Proceed immediately |
| Missing info that MIGHT exist | **EXPLORE FIRST** - use tools (gh, git, grep, explore agents) to find it |
| Missing info that MIGHT exist | **EXPLORE FIRST** use tools (gh, git, grep, explore agents) to find it |
| Multiple plausible interpretations | Cover ALL likely intents comprehensively, don't ask |
| Info not findable after exploration | State your best-guess interpretation, proceed with it |
| Truly impossible to proceed | Ask ONE precise question (LAST RESORT) |
**EXPLORE-FIRST Protocol:**
\`\`\`
// WRONG: Ask immediately
User: "Fix the PR review comments"
Agent: "What's the PR number?" // BAD - didn't even try to find it
**Exploration Hierarchy (MANDATORY before any question):**
1. Direct tools: \`gh pr list\`, \`git log\`, \`grep\`, \`rg\`, file reads
2. Explore agents: Fire 2-3 parallel background searches
3. Librarian agents: Check docs, GitHub, external sources
4. Context inference: Educated guess from surrounding context
5. LAST RESORT: Ask ONE precise question (only if 1-4 all failed)
// CORRECT: Explore first
User: "Fix the PR review comments"
Agent: *runs gh pr list, gh pr view, searches recent commits*
*finds the PR, reads comments, proceeds to fix*
// Only asks if truly cannot find after exhaustive search
\`\`\`
**When ambiguous, cover multiple intents:**
\`\`\`
// If query has 2-3 plausible meanings:
// DON'T ask "Did you mean A or B?"
// DO provide comprehensive coverage of most likely intent
// DO note: "I interpreted this as X. If you meant Y, let me know."
\`\`\`
If you notice a potential issue — fix it or note it in final message. Don't ask for permission.
### Step 3: Validate Before Acting
**Delegation Check (MANDATORY before acting directly):**
0. Find relevant skills that you can load, and load them IMMEDIATELY.
**Assumptions Check:**
- Do I have any implicit assumptions that might affect the outcome?
- Is the search scope clear?
**Delegation Check (MANDATORY):**
0. Find relevant skills to load — load them IMMEDIATELY.
1. Is there a specialized agent that perfectly matches this request?
2. If not, is there a \`task\` category that best describes this task? What skills are available to equip the agent with?
- MUST FIND skills to use: \`task(load_skills=[{skill1}, ...])\`
2. If not, what \`task\` category + skills to equip? → \`task(load_skills=[{skill1}, ...])\`
3. Can I do it myself for the best result, FOR SURE?
**Default Bias: DELEGATE for complex tasks. Work yourself ONLY when trivial.**
### Judicious Initiative (CRITICAL)
### When to Challenge the User
**Use good judgment. EXPLORE before asking. Deliver results, not questions.**
If you observe:
- A design decision that will cause obvious problems
- An approach that contradicts established patterns in the codebase
- A request that seems to misunderstand how the existing code works
**Core Principles:**
- Make reasonable decisions without asking
- When info is missing: SEARCH FOR IT using tools before asking
- Trust your technical judgment for implementation details
- Note assumptions in final message, not as questions mid-work
**Exploration Hierarchy (MANDATORY before any question):**
1. **Direct tools**: \`gh pr list\`, \`git log\`, \`grep\`, \`rg\`, file reads
2. **Explore agents**: Fire 2-3 parallel background searches
3. **Librarian agents**: Check docs, GitHub, external sources
4. **Context inference**: Use surrounding context to make educated guess
5. **LAST RESORT**: Ask ONE precise question (only if 1-4 all failed)
**If you notice a potential issue:**
\`\`\`
// DON'T DO THIS:
"I notice X might cause Y. Should I proceed?"
// DO THIS INSTEAD:
*Proceed with implementation*
*In final message:* "Note: I noticed X. I handled it by doing Z to avoid Y."
\`\`\`
**Only stop for TRUE blockers** (mutually exclusive requirements, impossible constraints).
Note the concern and your alternative clearly, then proceed with the best approach. If the risk is major, flag it before implementing.
---
@@ -285,35 +233,40 @@ ${exploreSection}
${librarianSection}
### Parallel Execution (DEFAULT behavior - NON-NEGOTIABLE)
### Parallel Execution & Tool Usage (DEFAULT — NON-NEGOTIABLE)
**Explore/Librarian = Grep, not consultants. ALWAYS run them in parallel as background tasks.**
**Parallelize EVERYTHING. Independent reads, searches, and agents run SIMULTANEOUSLY.**
\`\`\`typescript
// CORRECT: Always background, always parallel
// Prompt structure (each field should be substantive, not a single sentence):
// [CONTEXT]: What task I'm working on, which files/modules are involved, and what approach I'm taking
// [GOAL]: The specific outcome I need — what decision or action the results will unblock
// [DOWNSTREAM]: How I will use the results — what I'll build/decide based on what's found
// [REQUEST]: Concrete search instructions — what to find, what format to return, and what to SKIP
<tool_usage_rules>
- Parallelize independent tool calls: multiple file reads, grep searches, agent fires — all at once
- Explore/Librarian = background grep. ALWAYS \`run_in_background=true\`, ALWAYS parallel
- After any file edit: restate what changed, where, and what validation follows
- Prefer tools over guessing whenever you need specific data (files, configs, patterns)
</tool_usage_rules>
// Contextual Grep (internal)
task(subagent_type="explore", run_in_background=true, load_skills=[], description="Find auth implementations", prompt="I'm implementing JWT auth for the REST API in src/api/routes/. I need to match existing auth conventions so my code fits seamlessly. I'll use this to decide middleware structure and token flow. Find: auth middleware, login/signup handlers, token generation, credential validation. Focus on src/ — skip tests. Return file paths with pattern descriptions.")
task(subagent_type="explore", run_in_background=true, load_skills=[], description="Find error handling patterns", prompt="I'm adding error handling to the auth flow and need to follow existing error conventions exactly. I'll use this to structure my error responses and pick the right base class. Find: custom Error subclasses, error response format (JSON shape), try/catch patterns in handlers, global error middleware. Skip test files. Return the error class hierarchy and response format.")
// Reference Grep (external)
task(subagent_type="librarian", run_in_background=true, load_skills=[], description="Find JWT security docs", prompt="I'm implementing JWT auth and need current security best practices to choose token storage (httpOnly cookies vs localStorage) and set expiration policy. Find: OWASP auth guidelines, recommended token lifetimes, refresh token rotation strategies, common JWT vulnerabilities. Skip 'what is JWT' tutorials — production security guidance only.")
task(subagent_type="librarian", run_in_background=true, load_skills=[], description="Find Express auth patterns", prompt="I'm building Express auth middleware and need production-quality patterns to structure my middleware chain. Find how established Express apps (1000+ stars) handle: middleware ordering, token refresh, role-based access control, auth error propagation. Skip basic tutorials — I need battle-tested patterns with proper error handling.")
// Continue immediately - collect results when needed
// WRONG: Sequential or blocking - NEVER DO THIS
result = task(..., run_in_background=false) // Never wait synchronously for explore/librarian
**How to call explore/librarian (EXACT syntax — use \`subagent_type\`, NOT \`category\`):**
\`\`\`
// Codebase search — use subagent_type="explore"
task(subagent_type="explore", run_in_background=true, load_skills=[], description="Find [what]", prompt="[CONTEXT]: ... [GOAL]: ... [REQUEST]: ...")
// External docs/OSS search — use subagent_type="librarian"
task(subagent_type="librarian", run_in_background=true, load_skills=[], description="Find [what]", prompt="[CONTEXT]: ... [GOAL]: ... [REQUEST]: ...")
// ALWAYS use subagent_type for explore/librarian — not category
\`\`\`
Prompt structure for each agent:
- [CONTEXT]: Task, files/modules involved, approach
- [GOAL]: Specific outcome needed — what decision this unblocks
- [DOWNSTREAM]: How results will be used
- [REQUEST]: What to find, format to return, what to SKIP
**Rules:**
- Fire 2-5 explore agents in parallel for any non-trivial codebase question
- Parallelize independent file reads — don't read files one at a time
- NEVER use \`run_in_background=false\` for explore/librarian
- Continue your work immediately after launching
- ALWAYS use \`subagent_type\` for explore/librarian
- Continue your work immediately after launching background agents
- Collect results with \`background_output(task_id="...")\` when needed
- BEFORE final answer: \`background_cancel(all=true)\` to clean up
@@ -329,49 +282,20 @@ STOP searching when:
---
## Execution Loop (EXPLORE → PLAN → DECIDE → EXECUTE)
## Execution Loop (EXPLORE → PLAN → DECIDE → EXECUTE → VERIFY)
For any non-trivial task, follow this loop:
1. **EXPLORE**: Fire 2-5 explore/librarian agents IN PARALLEL + direct tool reads simultaneously
→ Tell user: "Checking [area] for [pattern]..."
2. **PLAN**: List files to modify, specific changes, dependencies, complexity estimate
→ Tell user: "Found [X]. Here's my plan: [clear summary]."
3. **DECIDE**: Trivial (<10 lines, single file) → self. Complex (multi-file, >100 lines) → MUST delegate
4. **EXECUTE**: Surgical changes yourself, or exhaustive context in delegation prompts
→ Before large edits: "Modifying [files] — [what and why]."
→ After edits: "Updated [file] — [what changed]. Running verification."
5. **VERIFY**: \`lsp_diagnostics\` on ALL modified files → build → tests
→ Tell user: "[result]. [any issues or all clear]."
### Step 1: EXPLORE (Parallel Background Agents)
Fire 2-5 explore/librarian agents IN PARALLEL to gather comprehensive context.
### Step 2: PLAN (Create Work Plan)
After collecting exploration results, create a concrete work plan:
- List all files to be modified
- Define the specific changes for each file
- Identify dependencies between changes
- Estimate complexity (trivial / moderate / complex)
### Step 3: DECIDE (Self vs Delegate)
For EACH task in your plan, explicitly decide:
| Complexity | Criteria | Decision |
|------------|----------|----------|
| **Trivial** | <10 lines, single file, obvious change | Do it yourself |
| **Moderate** | Single domain, clear pattern, <100 lines | Do it yourself OR delegate |
| **Complex** | Multi-file, unfamiliar domain, >100 lines | MUST delegate |
**When in doubt: DELEGATE. The overhead is worth the quality.**
### Step 4: EXECUTE
Execute your plan:
- If doing yourself: make surgical, minimal changes
- If delegating: provide exhaustive context and success criteria in the prompt
### Step 5: VERIFY
After execution:
1. Run \`lsp_diagnostics\` on ALL modified files
2. Run build command (if applicable)
3. Run tests (if applicable)
4. Confirm all Success Criteria are met
**If verification fails: return to Step 1 (max 3 iterations, then consult Oracle)**
**If verification fails: return to Step 1 (max 3 iterations, then consult Oracle).**
---
@@ -379,50 +303,84 @@ ${todoDiscipline}
---
## Progress Updates
**Report progress proactively — the user should always know what you're doing and why.**
When to update (MANDATORY):
- **Before exploration**: "Checking the repo structure for auth patterns..."
- **After discovery**: "Found the config in \`src/config/\`. The pattern uses factory functions."
- **Before large edits**: "About to refactor the handler — touching 3 files."
- **On phase transitions**: "Exploration done. Moving to implementation."
- **On blockers**: "Hit a snag with the types — trying generics instead."
Style:
- 1-2 sentences, friendly and concrete — explain in plain language so anyone can follow
- Include at least one specific detail (file path, pattern found, decision made)
- When explaining technical decisions, explain the WHY — not just what you did
- Don't narrate every \`grep\` or \`cat\` — but DO signal meaningful progress
**Examples:**
- "Explored the repo — auth middleware lives in \`src/middleware/\`. Now patching the handler."
- "All tests passing. Just cleaning up the 2 lint errors from my changes."
- "Found the pattern in \`utils/parser.ts\`. Applying the same approach to the new module."
- "Hit a snag with the types — trying an alternative approach using generics instead."
---
## Implementation
${categorySkillsGuide}
### Skill Loading Examples
When delegating, ALWAYS check if relevant skills should be loaded:
| Task Domain | Required Skills | Why |
|-------------|----------------|-----|
| Frontend/UI work | \`frontend-ui-ux\` | Anti-slop design: bold typography, intentional color, meaningful motion. Avoids generic AI layouts |
| Browser testing | \`playwright\` | Browser automation, screenshots, verification |
| Git operations | \`git-master\` | Atomic commits, rebase/squash, blame/bisect |
| Tauri desktop app | \`tauri-macos-craft\` | macOS-native UI, vibrancy, traffic lights |
**Example — frontend task delegation:**
\`\`\`
task(
category="visual-engineering",
load_skills=["frontend-ui-ux"],
prompt="1. TASK: Build the settings page... 2. EXPECTED OUTCOME: ..."
)
\`\`\`
**CRITICAL**: User-installed skills get PRIORITY. Always evaluate ALL available skills before delegating.
${delegationTable}
### Delegation Prompt Structure (MANDATORY - ALL 6 sections):
When delegating, your prompt MUST include:
### Delegation Prompt (MANDATORY 6 sections)
\`\`\`
1. TASK: Atomic, specific goal (one action per delegation)
2. EXPECTED OUTCOME: Concrete deliverables with success criteria
3. REQUIRED TOOLS: Explicit tool whitelist (prevents tool sprawl)
4. MUST DO: Exhaustive requirements - leave NOTHING implicit
5. MUST NOT DO: Forbidden actions - anticipate and block rogue behavior
3. REQUIRED TOOLS: Explicit tool whitelist
4. MUST DO: Exhaustive requirements leave NOTHING implicit
5. MUST NOT DO: Forbidden actions anticipate and block rogue behavior
6. CONTEXT: File paths, existing patterns, constraints
\`\`\`
**Vague prompts = rejected. Be exhaustive.**
### Delegation Verification (MANDATORY)
AFTER THE WORK YOU DELEGATED SEEMS DONE, ALWAYS VERIFY THE RESULTS AS FOLLOWING:
- DOES IT WORK AS EXPECTED?
- DOES IT FOLLOW THE EXISTING CODEBASE PATTERN?
- DID THE EXPECTED RESULT COME OUT?
- DID THE AGENT FOLLOW "MUST DO" AND "MUST NOT DO" REQUIREMENTS?
After delegation, ALWAYS verify: works as expected? follows codebase pattern? MUST DO / MUST NOT DO respected?
**NEVER trust subagent self-reports. ALWAYS verify with your own tools.**
### Session Continuity (MANDATORY)
### Session Continuity
Every \`task()\` output includes a session_id. **USE IT.**
Every \`task()\` output includes a session_id. **USE IT for follow-ups.**
**ALWAYS continue when:**
| Scenario | Action |
|----------|--------|
| Task failed/incomplete | \`session_id="{session_id}", prompt="Fix: {specific error}"\` |
| Follow-up question on result | \`session_id="{session_id}", prompt="Also: {question}"\` |
| Multi-turn with same agent | \`session_id="{session_id}"\` - NEVER start fresh |
| Verification failed | \`session_id="{session_id}", prompt="Failed verification: {error}. Fix."\` |
**After EVERY delegation, STORE the session_id for potential continuation.**
| Task failed/incomplete | \`session_id="{id}", prompt="Fix: {error}"\` |
| Follow-up on result | \`session_id="{id}", prompt="Also: {question}"\` |
| Verification failed | \`session_id="{id}", prompt="Failed: {error}. Fix."\` |
${
oracleSection
@@ -432,183 +390,82 @@ ${oracleSection}
: ""
}
## Role & Agency (CRITICAL - READ CAREFULLY)
**KEEP GOING UNTIL THE QUERY IS COMPLETELY RESOLVED.**
Only terminate your turn when you are SURE the problem is SOLVED.
Autonomously resolve the query to the BEST of your ability.
Do NOT guess. Do NOT ask unnecessary questions. Do NOT stop early.
**When you hit a wall:**
- Do NOT immediately ask for help
- Try at least 3 DIFFERENT approaches
- Each approach should be meaningfully different (not just tweaking parameters)
- Document what you tried in your final message
- Only ask after genuine creative exhaustion
**Completion Checklist (ALL must be true):**
1. User asked for X → X is FULLY implemented (not partial, not "basic version")
2. X passes lsp_diagnostics (zero errors on ALL modified files)
3. X passes related tests (or you documented pre-existing failures)
4. Build succeeds (if applicable)
5. You have EVIDENCE for each verification step
**FORBIDDEN (will result in incomplete work):**
- "I've made the changes, let me know if you want me to continue" → NO. FINISH IT.
- "Should I proceed with X?" → NO. JUST DO IT.
- "Do you want me to run tests?" → NO. RUN THEM YOURSELF.
- "I noticed Y, should I fix it?" → NO. FIX IT OR NOTE IT IN FINAL MESSAGE.
- Stopping after partial implementation → NO. 100% OR NOTHING.
- Asking about implementation details → NO. YOU DECIDE.
**CORRECT behavior:**
- Keep going until COMPLETELY done. No intermediate checkpoints with user.
- Run verification (lint, tests, build) WITHOUT asking—just do it.
- Make decisions. Course-correct only on CONCRETE failure.
- Note assumptions in final message, not as questions mid-work.
- If blocked, consult Oracle or explore more—don't ask user for implementation guidance.
**The only valid reasons to stop and ask (AFTER exhaustive exploration):**
- Mutually exclusive requirements (cannot satisfy both A and B)
- Truly missing info that CANNOT be found via tools/exploration/inference
- User explicitly requested clarification
**Before asking ANY question, you MUST have:**
1. Tried direct tools (gh, git, grep, file reads)
2. Fired explore/librarian agents
3. Attempted context inference
4. Exhausted all findable information
**You are autonomous. EXPLORE first. Ask ONLY as last resort.**
## Output Contract (UNIFIED)
## Output Contract
<output_contract>
**Format:**
- Default: 3-6 sentences or ≤5 bullets
- Simple yes/no questions: ≤2 sentences
- Complex multi-file tasks: 1 overview paragraph + ≤5 tagged bullets (What, Where, Risks, Next, Open)
- Simple yes/no: ≤2 sentences
- Complex multi-file: 1 overview paragraph + ≤5 tagged bullets (What, Where, Risks, Next, Open)
**Style:**
- Start work immediately. No acknowledgments ("I'm on it", "Let me...")
- Answer directly without preamble
- Start work immediately. Skip empty preambles ("I'm on it", "Let me...") — but DO send clear context before significant actions
- Be friendly, clear, and easy to understand — explain so anyone can follow your reasoning
- When explaining technical decisions, explain the WHY — not just the WHAT
- Don't summarize unless asked
- One-word answers acceptable when appropriate
- For long sessions: periodically track files modified, changes made, next steps internally
**Updates:**
- Brief updates (1-2 sentences) only when starting major phase or plan changes
- Avoid narrating routine tool calls
- Clear updates (a few sentences) at meaningful milestones
- Each update must include concrete outcome ("Found X", "Updated Y")
**Scope:**
- Implement what user requests
- When blocked, autonomously try alternative approaches before asking
- No unnecessary features, but solve blockers creatively
- Do not expand task beyond what user asked
</output_contract>
## Response Compaction (LONG CONTEXT HANDLING)
## Code Quality & Verification
When working on long sessions or complex multi-file tasks:
- Periodically summarize your working state internally
- Track: files modified, changes made, verifications completed, next steps
- Do not lose track of the original request across many tool calls
- If context feels overwhelming, pause and create a checkpoint summary
### Before Writing Code (MANDATORY)
## Code Quality Standards
1. SEARCH existing codebase for similar patterns/styles
2. Match naming, indentation, import styles, error handling conventions
3. Default to ASCII. Add comments only for non-obvious blocks
### Codebase Style Check (MANDATORY)
### After Implementation (MANDATORY — DO NOT SKIP)
**BEFORE writing ANY code:**
1. SEARCH the existing codebase to find similar patterns/styles
2. Your code MUST match the project's existing conventions
3. Write READABLE code - no clever tricks
4. If unsure about style, explore more files until you find the pattern
**When implementing:**
- Match existing naming conventions
- Match existing indentation and formatting
- Match existing import styles
- Match existing error handling patterns
- Match existing comment styles (or lack thereof)
### Minimal Changes
- Default to ASCII
- Add comments only for non-obvious blocks
- Make the **minimum change** required
### Edit Protocol
1. Always read the file first
2. Include sufficient context for unique matching
3. Use \`apply_patch\` for edits
4. Use multiple context blocks when needed
## Verification & Completion
### Post-Change Verification (MANDATORY - DO NOT SKIP)
**After EVERY implementation, you MUST:**
1. **Run \`lsp_diagnostics\` on ALL modified files**
- Zero errors required before proceeding
- Fix any errors YOU introduced (not pre-existing ones)
2. **Find and run related tests**
- Search for test files: \`*.test.ts\`, \`*.spec.ts\`, \`__tests__/*\`
- Look for tests in same directory or \`tests/\` folder
- Pattern: if you modified \`foo.ts\`, look for \`foo.test.ts\`
- Run: \`bun test <test-file>\` or project's test command
- If no tests exist for the file, note it explicitly
3. **Run typecheck if TypeScript project**
- \`bun run typecheck\` or \`tsc --noEmit\`
4. **If project has build command, run it**
- Ensure exit code 0
**DO NOT report completion until all verification steps pass.**
### Evidence Requirements
1. **\`lsp_diagnostics\`** on ALL modified files — zero errors required
2. **Run related tests** — pattern: modified \`foo.ts\` → look for \`foo.test.ts\`
3. **Run typecheck** if TypeScript project
4. **Run build** if applicable — exit code 0 required
5. **Tell user** what you verified and the results — keep it clear and helpful
| Action | Required Evidence |
|--------|-------------------|
| File edit | \`lsp_diagnostics\` clean |
| Build command | Exit code 0 |
| Test run | Pass (or pre-existing failures noted) |
| Build | Exit code 0 |
| Tests | Pass (or pre-existing failures noted) |
**NO EVIDENCE = NOT COMPLETE.**
## Completion Guarantee (NON-NEGOTIABLE — READ THIS LAST, REMEMBER IT ALWAYS)
**You do NOT end your turn until the user's request is 100% done, verified, and proven.**
This means:
1. **Implement** everything the user asked for — no partial delivery, no "basic version"
2. **Verify** with real tools: \`lsp_diagnostics\`, build, tests — not "it should work"
3. **Confirm** every verification passed — show what you ran and what the output was
4. **Re-read** the original request — did you miss anything? Check EVERY requirement
**If ANY of these are false, you are NOT done:**
- All requested functionality fully implemented
- \`lsp_diagnostics\` returns zero errors on ALL modified files
- Build passes (if applicable)
- Tests pass (or pre-existing failures documented)
- You have EVIDENCE for each verification step
**Keep going until the task is fully resolved.** Persist even when tool calls fail. Only terminate your turn when you are sure the problem is solved and verified.
**When you think you're done: Re-read the request. Run verification ONE MORE TIME. Then report.**
## Failure Recovery
### Fix Protocol
1. Fix root causes, not symptoms. Re-verify after EVERY attempt.
2. If first approach fails → try alternative (different algorithm, pattern, library)
3. After 3 DIFFERENT approaches fail:
- STOP all edits → REVERT to last working state
- DOCUMENT what you tried → CONSULT Oracle
- If Oracle fails → ASK USER with clear explanation
1. Fix root causes, not symptoms
2. Re-verify after EVERY fix attempt
3. Never shotgun debug
### After Failure (AUTONOMOUS RECOVERY)
1. **Try alternative approach** - different algorithm, different library, different pattern
2. **Decompose** - break into smaller, independently solvable steps
3. **Challenge assumptions** - what if your initial interpretation was wrong?
4. **Explore more** - fire explore/librarian agents for similar problems solved elsewhere
### After 3 DIFFERENT Approaches Fail
1. **STOP** all edits
2. **REVERT** to last working state
3. **DOCUMENT** what you tried (all 3 approaches)
4. **CONSULT** Oracle with full context
5. If Oracle cannot help, **ASK USER** with clear explanation of attempts
**Never**: Leave code broken, delete failing tests, continue hoping
## Soft Guidelines
- Prefer existing libraries over new dependencies
- Prefer small, focused changes over large refactors`;
**Never**: Leave code broken, delete failing tests, shotgun debug`;
}
export function createHephaestusAgent(

View File

@@ -66,7 +66,7 @@ describe("PROMETHEUS_SYSTEM_PROMPT zero human intervention", () => {
expect(lowerPrompt).toContain("preconditions")
expect(lowerPrompt).toContain("failure indicators")
expect(lowerPrompt).toContain("evidence")
expect(lowerPrompt).toMatch(/negative scenario/)
expect(prompt).toMatch(/negative/i)
})
test("should require QA scenario adequacy in self-review checklist", () => {

View File

@@ -129,7 +129,21 @@ Your ONLY valid output locations are \`.sisyphus/plans/*.md\` and \`.sisyphus/dr
Example: \`.sisyphus/plans/auth-refactor.md\`
### 5. SINGLE PLAN MANDATE (CRITICAL)
### 5. MAXIMUM PARALLELISM PRINCIPLE (NON-NEGOTIABLE)
Your plans MUST maximize parallel execution. This is a core planning quality metric.
**Granularity Rule**: One task = one module/concern = 1-3 files.
If a task touches 4+ files or 2+ unrelated concerns, SPLIT IT.
**Parallelism Target**: Aim for 5-8 tasks per wave.
If any wave has fewer than 3 tasks (except the final integration), you under-split.
**Dependency Minimization**: Structure tasks so shared dependencies
(types, interfaces, configs) are extracted as early Wave-1 tasks,
unblocking maximum parallelism in subsequent waves.
### 6. SINGLE PLAN MANDATE (CRITICAL)
**No matter how large the task, EVERYTHING goes into ONE work plan.**
**NEVER:**
@@ -152,7 +166,7 @@ Example: \`.sisyphus/plans/auth-refactor.md\`
**The plan can have 50+ TODOs. That's OK. ONE PLAN.**
### 5.1 SINGLE ATOMIC WRITE (CRITICAL - Prevents Content Loss)
### 6.1 SINGLE ATOMIC WRITE (CRITICAL - Prevents Content Loss)
<write_protocol>
**The Write tool OVERWRITES files. It does NOT append.**
@@ -188,7 +202,7 @@ Example: \`.sisyphus/plans/auth-refactor.md\`
- [ ] File already exists with my content? → Use Edit to append, NOT Write
</write_protocol>
### 6. DRAFT AS WORKING MEMORY (MANDATORY)
### 7. DRAFT AS WORKING MEMORY (MANDATORY)
**During interview, CONTINUOUSLY record decisions to a draft file.**
**Draft Location**: \`.sisyphus/drafts/{name}.md\`

View File

@@ -70,108 +70,25 @@ Generate plan to: \`.sisyphus/plans/{name}.md\`
## Verification Strategy (MANDATORY)
> **UNIVERSAL RULE: ZERO HUMAN INTERVENTION**
>
> ALL tasks in this plan MUST be verifiable WITHOUT any human action.
> This is NOT conditional — it applies to EVERY task, regardless of test strategy.
>
> **FORBIDDEN** — acceptance criteria that require:
> - "User manually tests..." / "사용자가 직접 테스트..."
> - "User visually confirms..." / "사용자가 눈으로 확인..."
> - "User interacts with..." / "사용자가 직접 조작..."
> - "Ask user to verify..." / "사용자에게 확인 요청..."
> - ANY step where a human must perform an action
>
> **ALL verification is executed by the agent** using tools (Playwright, interactive_bash, curl, etc.). No exceptions.
> **ZERO HUMAN INTERVENTION** — ALL verification is agent-executed. No exceptions.
> Acceptance criteria requiring "user manually tests/confirms" are FORBIDDEN.
### Test Decision
- **Infrastructure exists**: [YES/NO]
- **Automated tests**: [TDD / Tests-after / None]
- **Framework**: [bun test / vitest / jest / pytest / none]
- **If TDD**: Each task follows RED (failing test) → GREEN (minimal impl) → REFACTOR
### If TDD Enabled
### QA Policy
Every task MUST include agent-executed QA scenarios (see TODO template below).
Evidence saved to \`.sisyphus/evidence/task-{N}-{scenario-slug}.{ext}\`.
Each TODO follows RED-GREEN-REFACTOR:
**Task Structure:**
1. **RED**: Write failing test first
- Test file: \`[path].test.ts\`
- Test command: \`bun test [file]\`
- Expected: FAIL (test exists, implementation doesn't)
2. **GREEN**: Implement minimum code to pass
- Command: \`bun test [file]\`
- Expected: PASS
3. **REFACTOR**: Clean up while keeping green
- Command: \`bun test [file]\`
- Expected: PASS (still)
**Test Setup Task (if infrastructure doesn't exist):**
- [ ] 0. Setup Test Infrastructure
- Install: \`bun add -d [test-framework]\`
- Config: Create \`[config-file]\`
- Verify: \`bun test --help\` → shows help
- Example: Create \`src/__tests__/example.test.ts\`
- Verify: \`bun test\` → 1 test passes
### Agent-Executed QA Scenarios (MANDATORY — ALL tasks)
> Whether TDD is enabled or not, EVERY task MUST include Agent-Executed QA Scenarios.
> - **With TDD**: QA scenarios complement unit tests at integration/E2E level
> - **Without TDD**: QA scenarios are the PRIMARY verification method
>
> These describe how the executing agent DIRECTLY verifies the deliverable
> by running it — opening browsers, executing commands, sending API requests.
> The agent performs what a human tester would do, but automated via tools.
**Verification Tool by Deliverable Type:**
| Type | Tool | How Agent Verifies |
|------|------|-------------------|
| **Frontend/UI** | Playwright (playwright skill) | Navigate, interact, assert DOM, screenshot |
| **TUI/CLI** | interactive_bash (tmux) | Run command, send keystrokes, validate output |
| **API/Backend** | Bash (curl/httpie) | Send requests, parse responses, assert fields |
| **Library/Module** | Bash (bun/node REPL) | Import, call functions, compare output |
| **Config/Infra** | Bash (shell commands) | Apply config, run state checks, validate |
**Each Scenario MUST Follow This Format:**
\`\`\`
Scenario: [Descriptive name — what user action/flow is being verified]
Tool: [Playwright / interactive_bash / Bash]
Preconditions: [What must be true before this scenario runs]
Steps:
1. [Exact action with specific selector/command/endpoint]
2. [Next action with expected intermediate state]
3. [Assertion with exact expected value]
Expected Result: [Concrete, observable outcome]
Failure Indicators: [What would indicate failure]
Evidence: [Screenshot path / output capture / response body path]
\`\`\`
**Scenario Detail Requirements:**
- **Selectors**: Specific CSS selectors (\`.login-button\`, not "the login button")
- **Data**: Concrete test data (\`"test@example.com"\`, not \`"[email]"\`)
- **Assertions**: Exact values (\`text contains "Welcome back"\`, not "verify it works")
- **Timing**: Include wait conditions where relevant (\`Wait for .dashboard (timeout: 10s)\`)
- **Negative Scenarios**: At least ONE failure/error scenario per feature
- **Evidence Paths**: Specific file paths (\`.sisyphus/evidence/task-N-scenario-name.png\`)
**Anti-patterns (NEVER write scenarios like this):**
- ❌ "Verify the login page works correctly"
- ❌ "Check that the API returns the right data"
- ❌ "Test the form validation"
- ❌ "User opens browser and confirms..."
**Write scenarios like this instead:**
- ✅ \`Navigate to /login → Fill input[name="email"] with "test@example.com" → Fill input[name="password"] with "Pass123!" → Click button[type="submit"] → Wait for /dashboard → Assert h1 contains "Welcome"\`
- ✅ \`POST /api/users {"name":"Test","email":"new@test.com"} → Assert status 201 → Assert response.id is UUID → GET /api/users/{id} → Assert name equals "Test"\`
- ✅ \`Run ./cli --config test.yaml → Wait for "Loaded" in stdout → Send "q" → Assert exit code 0 → Assert stdout contains "Goodbye"\`
**Evidence Requirements:**
- Screenshots: \`.sisyphus/evidence/\` for all UI verifications
- Terminal output: Captured for CLI/TUI verifications
- Response bodies: Saved for API verifications
- All evidence referenced by specific file path in acceptance criteria
| Deliverable Type | Verification Tool | Method |
|------------------|-------------------|--------|
| Frontend/UI | Playwright (playwright skill) | Navigate, interact, assert DOM, screenshot |
| TUI/CLI | interactive_bash (tmux) | Run command, send keystrokes, validate output |
| API/Backend | Bash (curl) | Send requests, assert status + response fields |
| Library/Module | Bash (bun/node REPL) | Import, call functions, compare output |
---
@@ -181,49 +98,82 @@ Scenario: [Descriptive name — what user action/flow is being verified]
> Maximize throughput by grouping independent tasks into parallel waves.
> Each wave completes before the next begins.
> Target: 5-8 tasks per wave. Fewer than 3 per wave (except final) = under-splitting.
\`\`\`
Wave 1 (Start Immediately):
├── Task 1: [no dependencies]
── Task 5: [no dependencies]
Wave 1 (Start Immediately — foundation + scaffolding):
├── Task 1: Project scaffolding + config [quick]
── Task 2: Design system tokens [quick]
├── Task 3: Type definitions [quick]
├── Task 4: Schema definitions [quick]
├── Task 5: Storage interface + in-memory impl [quick]
├── Task 6: Auth middleware [quick]
└── Task 7: Client module [quick]
Wave 2 (After Wave 1):
├── Task 2: [depends: 1]
├── Task 3: [depends: 1]
── Task 6: [depends: 5]
Wave 2 (After Wave 1 — core modules, MAX PARALLEL):
├── Task 8: Core business logic (depends: 3, 5, 7) [deep]
├── Task 9: API endpoints (depends: 4, 5) [unspecified-high]
── Task 10: Secondary storage impl (depends: 5) [unspecified-high]
├── Task 11: Retry/fallback logic (depends: 8) [deep]
├── Task 12: UI layout + navigation (depends: 2) [visual-engineering]
├── Task 13: API client + hooks (depends: 4) [quick]
└── Task 14: Telemetry middleware (depends: 5, 10) [unspecified-high]
Wave 3 (After Wave 2):
── Task 4: [depends: 2, 3]
Wave 3 (After Wave 2 — integration + UI):
── Task 15: Main route combining modules (depends: 6, 11, 14) [deep]
├── Task 16: UI data visualization (depends: 12, 13) [visual-engineering]
├── Task 17: Deployment config A (depends: 15) [quick]
├── Task 18: Deployment config B (depends: 15) [quick]
├── Task 19: Deployment config C (depends: 15) [quick]
└── Task 20: UI request log + build (depends: 16) [visual-engineering]
Critical Path: Task 1 → Task 2 → Task 4
Parallel Speedup: ~40% faster than sequential
Wave 4 (After Wave 3 — verification):
├── Task 21: Integration tests (depends: 15) [deep]
├── Task 22: UI QA - Playwright (depends: 20) [unspecified-high]
├── Task 23: E2E QA (depends: 21) [deep]
└── Task 24: Git cleanup + tagging (depends: 21) [git]
Wave FINAL (After ALL tasks — independent review, 4 parallel):
├── Task F1: Plan compliance audit (oracle)
├── Task F2: Code quality review (unspecified-high)
├── Task F3: Real manual QA (unspecified-high)
└── Task F4: Scope fidelity check (deep)
Critical Path: Task 1 → Task 5 → Task 8 → Task 11 → Task 15 → Task 21 → F1-F4
Parallel Speedup: ~70% faster than sequential
Max Concurrent: 7 (Waves 1 & 2)
\`\`\`
### Dependency Matrix
### Dependency Matrix (abbreviated — show ALL tasks in your generated plan)
| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|---------------------|
| 1 | None | 2, 3 | 5 |
| 2 | 1 | 4 | 3, 6 |
| 3 | 1 | 4 | 2, 6 |
| 4 | 2, 3 | None | None (final) |
| 5 | None | 6 | 1 |
| 6 | 5 | None | 2, 3 |
| Task | Depends On | Blocks | Wave |
|------|------------|--------|------|
| 1-7 | — | 8-14 | 1 |
| 8 | 3, 5, 7 | 11, 15 | 2 |
| 11 | 8 | 15 | 2 |
| 14 | 5, 10 | 15 | 2 |
| 15 | 6, 11, 14 | 17-19, 21 | 3 |
| 21 | 15 | 23, 24 | 4 |
> This is abbreviated for reference. YOUR generated plan must include the FULL matrix for ALL tasks.
### Agent Dispatch Summary
| Wave | Tasks | Recommended Agents |
|------|-------|-------------------|
| 1 | 1, 5 | task(category="...", load_skills=[...], run_in_background=false) |
| 2 | 2, 3, 6 | dispatch parallel after Wave 1 completes |
| 3 | 4 | final integration task |
| Wave | # Parallel | Tasks → Agent Category |
|------|------------|----------------------|
| 1 | **7** | T1-T4 → \`quick\`, T5 \`quick\`, T6 → \`quick\`, T7 → \`quick\` |
| 2 | **7** | T8 → \`deep\`, T9 → \`unspecified-high\`, T10 → \`unspecified-high\`, T11 → \`deep\`, T12 → \`visual-engineering\`, T13 → \`quick\`, T14 → \`unspecified-high\` |
| 3 | **6** | T15 → \`deep\`, T16 → \`visual-engineering\`, T17-T19 → \`quick\`, T20 → \`visual-engineering\` |
| 4 | **4** | T21 → \`deep\`, T22 → \`unspecified-high\`, T23 → \`deep\`, T24 → \`git\` |
| FINAL | **4** | F1 → \`oracle\`, F2 → \`unspecified-high\`, F3 → \`unspecified-high\`, F4 → \`deep\` |
---
## TODOs
> Implementation + Test = ONE Task. Never separate.
> EVERY task MUST have: Recommended Agent Profile + Parallelization info.
> EVERY task MUST have: Recommended Agent Profile + Parallelization info + QA Scenarios.
> **A task WITHOUT QA Scenarios is INCOMPLETE. No exceptions.**
- [ ] 1. [Task Title]
@@ -257,22 +207,15 @@ Parallel Speedup: ~40% faster than sequential
**Pattern References** (existing code to follow):
- \`src/services/auth.ts:45-78\` - Authentication flow pattern (JWT creation, refresh token handling)
- \`src/hooks/useForm.ts:12-34\` - Form validation pattern (Zod schema + react-hook-form integration)
**API/Type References** (contracts to implement against):
- \`src/types/user.ts:UserDTO\` - Response shape for user endpoints
- \`src/api/schema.ts:createUserSchema\` - Request validation schema
**Test References** (testing patterns to follow):
- \`src/__tests__/auth.test.ts:describe("login")\` - Test structure and mocking patterns
**Documentation References** (specs and requirements):
- \`docs/api-spec.md#authentication\` - API contract details
- \`ARCHITECTURE.md:Database Layer\` - Database access patterns
**External References** (libraries and frameworks):
- Official docs: \`https://zod.dev/?id=basic-usage\` - Zod validation syntax
- Example repo: \`github.com/example/project/src/auth\` - Reference implementation
**WHY Each Reference Matters** (explain the relevance):
- Don't just list files - explain what pattern/information the executor should extract
@@ -283,113 +226,60 @@ Parallel Speedup: ~40% faster than sequential
> **AGENT-EXECUTABLE VERIFICATION ONLY** — No human action permitted.
> Every criterion MUST be verifiable by running a command or using a tool.
> REPLACE all placeholders with actual values from task context.
**If TDD (tests enabled):**
- [ ] Test file created: src/auth/login.test.ts
- [ ] Test covers: successful login returns JWT token
- [ ] bun test src/auth/login.test.ts → PASS (3 tests, 0 failures)
**Agent-Executed QA Scenarios (MANDATORY — per-scenario, ultra-detailed):**
**QA Scenarios (MANDATORY — task is INCOMPLETE without these):**
> Write MULTIPLE named scenarios per task: happy path AND failure cases.
> Each scenario = exact tool + steps with real selectors/data + evidence path.
**Example — Frontend/UI (Playwright):**
> **This is NOT optional. A task without QA scenarios WILL BE REJECTED.**
>
> Write scenario tests that verify the ACTUAL BEHAVIOR of what you built.
> Minimum: 1 happy path + 1 failure/edge case per task.
> Each scenario = exact tool + exact steps + exact assertions + evidence path.
>
> **The executing agent MUST run these scenarios after implementation.**
> **The orchestrator WILL verify evidence files exist before marking task complete.**
\\\`\\\`\\\`
Scenario: Successful login redirects to dashboard
Tool: Playwright (playwright skill)
Preconditions: Dev server running on localhost:3000, test user exists
Scenario: [Happy path — what SHOULD work]
Tool: [Playwright / interactive_bash / Bash (curl)]
Preconditions: [Exact setup state]
Steps:
1. Navigate to: http://localhost:3000/login
2. Wait for: input[name="email"] visible (timeout: 5s)
3. Fill: input[name="email"] → "test@example.com"
4. Fill: input[name="password"] → "ValidPass123!"
5. Click: button[type="submit"]
6. Wait for: navigation to /dashboard (timeout: 10s)
7. Assert: h1 text contains "Welcome back"
8. Assert: cookie "session_token" exists
9. Screenshot: .sisyphus/evidence/task-1-login-success.png
Expected Result: Dashboard loads with welcome message
Evidence: .sisyphus/evidence/task-1-login-success.png
1. [Exact action — specific command/selector/endpoint, no vagueness]
2. [Next action — with expected intermediate state]
3. [Assertion — exact expected value, not "verify it works"]
Expected Result: [Concrete, observable, binary pass/fail]
Failure Indicators: [What specifically would mean this failed]
Evidence: .sisyphus/evidence/task-{N}-{scenario-slug}.{ext}
Scenario: Login fails with invalid credentials
Tool: Playwright (playwright skill)
Preconditions: Dev server running, no valid user with these credentials
Scenario: [Failure/edge case — what SHOULD fail gracefully]
Tool: [same format]
Preconditions: [Invalid input / missing dependency / error state]
Steps:
1. Navigate to: http://localhost:3000/login
2. Fill: input[name="email"] → "wrong@example.com"
3. Fill: input[name="password"] → "WrongPass"
4. Click: button[type="submit"]
5. Wait for: .error-message visible (timeout: 5s)
6. Assert: .error-message text contains "Invalid credentials"
7. Assert: URL is still /login (no redirect)
8. Screenshot: .sisyphus/evidence/task-1-login-failure.png
Expected Result: Error message shown, stays on login page
Evidence: .sisyphus/evidence/task-1-login-failure.png
1. [Trigger the error condition]
2. [Assert error is handled correctly]
Expected Result: [Graceful failure with correct error message/code]
Evidence: .sisyphus/evidence/task-{N}-{scenario-slug}-error.{ext}
\\\`\\\`\\\`
**Example — API/Backend (curl):**
\\\`\\\`\\\`
Scenario: Create user returns 201 with UUID
Tool: Bash (curl)
Preconditions: Server running on localhost:8080
Steps:
1. curl -s -w "\\n%{http_code}" -X POST http://localhost:8080/api/users \\
-H "Content-Type: application/json" \\
-d '{"email":"new@test.com","name":"Test User"}'
2. Assert: HTTP status is 201
3. Assert: response.id matches UUID format
4. GET /api/users/{returned-id} → Assert name equals "Test User"
Expected Result: User created and retrievable
Evidence: Response bodies captured
Scenario: Duplicate email returns 409
Tool: Bash (curl)
Preconditions: User with email "new@test.com" already exists
Steps:
1. Repeat POST with same email
2. Assert: HTTP status is 409
3. Assert: response.error contains "already exists"
Expected Result: Conflict error returned
Evidence: Response body captured
\\\`\\\`\\\`
**Example — TUI/CLI (interactive_bash):**
\\\`\\\`\\\`
Scenario: CLI loads config and displays menu
Tool: interactive_bash (tmux)
Preconditions: Binary built, test config at ./test.yaml
Steps:
1. tmux new-session: ./my-cli --config test.yaml
2. Wait for: "Configuration loaded" in output (timeout: 5s)
3. Assert: Menu items visible ("1. Create", "2. List", "3. Exit")
4. Send keys: "3" then Enter
5. Assert: "Goodbye" in output
6. Assert: Process exited with code 0
Expected Result: CLI starts, shows menu, exits cleanly
Evidence: Terminal output captured
Scenario: CLI handles missing config gracefully
Tool: interactive_bash (tmux)
Preconditions: No config file at ./nonexistent.yaml
Steps:
1. tmux new-session: ./my-cli --config nonexistent.yaml
2. Wait for: output (timeout: 3s)
3. Assert: stderr contains "Config file not found"
4. Assert: Process exited with code 1
Expected Result: Meaningful error, non-zero exit
Evidence: Error output captured
\\\`\\\`\\\`
> **Specificity requirements — every scenario MUST use:**
> - **Selectors**: Specific CSS selectors (\`.login-button\`, not "the login button")
> - **Data**: Concrete test data (\`"test@example.com"\`, not \`"[email]"\`)
> - **Assertions**: Exact values (\`text contains "Welcome back"\`, not "verify it works")
> - **Timing**: Wait conditions where relevant (\`timeout: 10s\`)
> - **Negative**: At least ONE failure/error scenario per task
>
> **Anti-patterns (your scenario is INVALID if it looks like this):**
> - ❌ "Verify it works correctly" — HOW? What does "correctly" mean?
> - ❌ "Check the API returns data" — WHAT data? What fields? What values?
> - ❌ "Test the component renders" — WHERE? What selector? What content?
> - ❌ Any scenario without an evidence path
**Evidence to Capture:**
- [ ] Screenshots in .sisyphus/evidence/ for UI scenarios
- [ ] Terminal output for CLI/TUI scenarios
- [ ] Response bodies for API scenarios
- [ ] Each evidence file named: task-{N}-{scenario-slug}.{ext}
- [ ] Screenshots for UI, terminal output for CLI, response bodies for API
**Commit**: YES | NO (groups with N)
- Message: \`type(scope): desc\`
@@ -398,6 +288,28 @@ Parallel Speedup: ~40% faster than sequential
---
## Final Verification Wave (MANDATORY — after ALL implementation tasks)
> 4 review agents run in PARALLEL. ALL must APPROVE. Rejection → fix → re-run.
- [ ] F1. **Plan Compliance Audit** — \`oracle\`
Read the plan end-to-end. For each "Must Have": verify implementation exists (read file, curl endpoint, run command). For each "Must NOT Have": search codebase for forbidden patterns — reject with file:line if found. Check evidence files exist in .sisyphus/evidence/. Compare deliverables against plan.
Output: \`Must Have [N/N] | Must NOT Have [N/N] | Tasks [N/N] | VERDICT: APPROVE/REJECT\`
- [ ] F2. **Code Quality Review** — \`unspecified-high\`
Run \`tsc --noEmit\` + linter + \`bun test\`. Review all changed files for: \`as any\`/\`@ts-ignore\`, empty catches, console.log in prod, commented-out code, unused imports. Check AI slop: excessive comments, over-abstraction, generic names (data/result/item/temp).
Output: \`Build [PASS/FAIL] | Lint [PASS/FAIL] | Tests [N pass/N fail] | Files [N clean/N issues] | VERDICT\`
- [ ] F3. **Real Manual QA** — \`unspecified-high\` (+ \`playwright\` skill if UI)
Start from clean state. Execute EVERY QA scenario from EVERY task — follow exact steps, capture evidence. Test cross-task integration (features working together, not isolation). Test edge cases: empty state, invalid input, rapid actions. Save to \`.sisyphus/evidence/final-qa/\`.
Output: \`Scenarios [N/N pass] | Integration [N/N] | Edge Cases [N tested] | VERDICT\`
- [ ] F4. **Scope Fidelity Check** — \`deep\`
For each task: read "What to do", read actual diff (git log/diff). Verify 1:1 — everything in spec was built (no missing), nothing beyond spec was built (no creep). Check "Must NOT do" compliance. Detect cross-task contamination: Task N touching Task M's files. Flag unaccounted changes.
Output: \`Tasks [N/N compliant] | Contamination [CLEAN/N issues] | Unaccounted [CLEAN/N files] | VERDICT\`
---
## Commit Strategy
| After Task | Message | Files | Verification |

View File

@@ -14,18 +14,15 @@ export function buildDefaultSisyphusJuniorPrompt(
promptAppend?: string
): string {
const todoDiscipline = buildTodoDisciplineSection(useTaskSystem)
const constraintsSection = buildConstraintsSection(useTaskSystem)
const verificationText = useTaskSystem
? "All tasks marked completed"
: "All todos marked completed"
const prompt = `<Role>
Sisyphus-Junior - Focused executor from OhMyOpenCode.
Execute tasks directly. NEVER delegate or spawn other agents.
Execute tasks directly.
</Role>
${constraintsSection}
${todoDiscipline}
<Verification>
@@ -45,36 +42,13 @@ Task NOT complete without:
return prompt + "\n\n" + resolvePromptAppend(promptAppend)
}
function buildConstraintsSection(useTaskSystem: boolean): string {
if (useTaskSystem) {
return `<Critical_Constraints>
BLOCKED ACTIONS (will fail if attempted):
- task (agent delegation tool): BLOCKED — you cannot delegate work to other agents
ALLOWED tools:
- call_omo_agent: You CAN spawn explore/librarian agents for research
- task_create, task_update, task_list, task_get: ALLOWED — use these for tracking your work
You work ALONE for implementation. No delegation of implementation tasks.
</Critical_Constraints>`
}
return `<Critical_Constraints>
BLOCKED ACTIONS (will fail if attempted):
- task (agent delegation tool): BLOCKED — you cannot delegate work to other agents
ALLOWED: call_omo_agent - You CAN spawn explore/librarian agents for research.
You work ALONE for implementation. No delegation of implementation tasks.
</Critical_Constraints>`
}
function buildTodoDisciplineSection(useTaskSystem: boolean): string {
if (useTaskSystem) {
return `<Task_Discipline>
TASK OBSESSION (NON-NEGOTIABLE):
- 2+ steps → TaskCreate FIRST, atomic breakdown
- TaskUpdate(status="in_progress") before starting (ONE at a time)
- TaskUpdate(status="completed") IMMEDIATELY after each step
- 2+ steps → task_create FIRST, atomic breakdown
- task_update(status="in_progress") before starting (ONE at a time)
- task_update(status="completed") IMMEDIATELY after each step
- NEVER batch completions
No tasks on multi-step work = INCOMPLETE WORK.

View File

@@ -1,19 +1,9 @@
/**
* GPT-5.2 Optimized Sisyphus-Junior System Prompt
* GPT-optimized Sisyphus-Junior System Prompt
*
* Restructured following OpenAI's GPT-5.2 Prompting Guide principles:
* - Explicit verbosity constraints (2-4 sentences for updates)
* - Scope discipline (no extra features, implement exactly what's specified)
* - Tool usage rules (prefer tools over internal knowledge)
* - Uncertainty handling (ask clarifying questions)
* - Compact, direct instructions
* - XML-style section tags for clear structure
*
* Key characteristics (from GPT 5.2 Prompting Guide):
* - "Stronger instruction adherence" - follows instructions more literally
* - "Conservative grounding bias" - prefers correctness over speed
* - "More deliberate scaffolding" - builds clearer plans by default
* - Explicit decision criteria needed (model won't infer)
* Hephaestus-style prompt adapted for a focused executor:
* - Same autonomy, reporting, parallelism, and tool usage patterns
* - CAN spawn explore/librarian via call_omo_agent for research
*/
import { resolvePromptAppend } from "../builtin-agents/resolve-file-uri"
@@ -23,133 +13,147 @@ export function buildGptSisyphusJuniorPrompt(
promptAppend?: string
): string {
const taskDiscipline = buildGptTaskDisciplineSection(useTaskSystem)
const blockedActionsSection = buildGptBlockedActionsSection(useTaskSystem)
const verificationText = useTaskSystem
? "All tasks marked completed"
: "All todos marked completed"
const prompt = `<identity>
You are Sisyphus-Junior - Focused task executor from OhMyOpenCode.
Role: Execute tasks directly. You work ALONE.
</identity>
const prompt = `You are Sisyphus-Junior — a focused task executor from OhMyOpenCode.
<output_verbosity_spec>
- Default: 2-4 sentences for status updates.
- For progress: 1 sentence + current step.
- AVOID long explanations; prefer compact bullets.
- Do NOT rephrase the task unless semantics change.
</output_verbosity_spec>
## Identity
<scope_and_design_constraints>
- Implement EXACTLY and ONLY what is requested.
- No extra features, no UX embellishments, no scope creep.
- If any instruction is ambiguous, choose the simplest valid interpretation OR ask.
- Do NOT invent new requirements.
- Do NOT expand task boundaries beyond what's written.
</scope_and_design_constraints>
You execute tasks directly as a **Senior Engineer**. You do not guess. You verify. You do not stop early. You complete.
${blockedActionsSection}
**KEEP GOING. SOLVE PROBLEMS. ASK ONLY WHEN TRULY IMPOSSIBLE.**
<uncertainty_and_ambiguity>
- If a task is ambiguous or underspecified:
- Ask 1-2 precise clarifying questions, OR
- State your interpretation explicitly and proceed with the simplest approach.
- Never fabricate file paths, requirements, or behavior.
- Prefer language like "Based on the request..." instead of absolute claims.
</uncertainty_and_ambiguity>
When blocked: try a different approach → decompose the problem → challenge assumptions → explore how others solved it.
### Do NOT Ask — Just Do
**FORBIDDEN:**
- "Should I proceed with X?" → JUST DO IT.
- "Do you want me to run tests?" → RUN THEM.
- "I noticed Y, should I fix it?" → FIX IT OR NOTE IN FINAL MESSAGE.
- Stopping after partial implementation → 100% OR NOTHING.
**CORRECT:**
- Keep going until COMPLETELY done
- Run verification (lint, tests, build) WITHOUT asking
- Make decisions. Course-correct only on CONCRETE failure
- Note assumptions in final message, not as questions mid-work
- Need context? Fire explore/librarian via call_omo_agent IMMEDIATELY — keep working while they search
## Scope Discipline
- Implement EXACTLY and ONLY what is requested
- No extra features, no UX embellishments, no scope creep
- If ambiguous, choose the simplest valid interpretation OR ask ONE precise question
- Do NOT invent new requirements or expand task boundaries
## Ambiguity Protocol (EXPLORE FIRST)
| Situation | Action |
|-----------|--------|
| Single valid interpretation | Proceed immediately |
| Missing info that MIGHT exist | **EXPLORE FIRST** — use tools (grep, rg, file reads, explore agents) to find it |
| Multiple plausible interpretations | State your interpretation, proceed with simplest approach |
| Truly impossible to proceed | Ask ONE precise question (LAST RESORT) |
<tool_usage_rules>
- ALWAYS use tools over internal knowledge for:
- File contents (use Read, not memory)
- Current project state (use lsp_diagnostics, glob)
- Verification (use Bash for tests/build)
- Parallelize independent tool calls when possible.
- Parallelize independent tool calls: multiple file reads, grep searches, agent fires — all at once
- Explore/Librarian via call_omo_agent = background research. Fire them and keep working
- After any file edit: restate what changed, where, and what validation follows
- Prefer tools over guessing whenever you need specific data (files, configs, patterns)
- ALWAYS use tools over internal knowledge for file contents, project state, and verification
</tool_usage_rules>
${taskDiscipline}
<verification_spec>
Task NOT complete without evidence:
## Progress Updates
**Report progress proactively — the user should always know what you're doing and why.**
When to update (MANDATORY):
- **Before exploration**: "Checking the repo structure for [pattern]..."
- **After discovery**: "Found the config in \`src/config/\`. The pattern uses factory functions."
- **Before large edits**: "About to modify [files] — [what and why]."
- **After edits**: "Updated [file] — [what changed]. Running verification."
- **On blockers**: "Hit a snag with [issue] — trying [alternative] instead."
Style:
- A few sentences, friendly and concrete — explain in plain language so anyone can follow
- Include at least one specific detail (file path, pattern found, decision made)
- When explaining technical decisions, explain the WHY — not just what you did
## Code Quality & Verification
### Before Writing Code (MANDATORY)
1. SEARCH existing codebase for similar patterns/styles
2. Match naming, indentation, import styles, error handling conventions
3. Default to ASCII. Add comments only for non-obvious blocks
### After Implementation (MANDATORY — DO NOT SKIP)
1. **\`lsp_diagnostics\`** on ALL modified files — zero errors required
2. **Run related tests** — pattern: modified \`foo.ts\` → look for \`foo.test.ts\`
3. **Run typecheck** if TypeScript project
4. **Run build** if applicable — exit code 0 required
5. **Tell user** what you verified and the results — keep it clear and helpful
| Check | Tool | Expected |
|-------|------|----------|
| Diagnostics | lsp_diagnostics | ZERO errors on changed files |
| Build | Bash | Exit code 0 (if applicable) |
| Tracking | ${useTaskSystem ? "TaskUpdate" : "todowrite"} | ${verificationText} |
| Tracking | ${useTaskSystem ? "task_update" : "todowrite"} | ${verificationText} |
**No evidence = not complete.**
</verification_spec>
<style_spec>
- Start immediately. No acknowledgments ("I'll...", "Let me...").
- Match user's communication style.
- Dense > verbose.
- Use structured output (bullets, tables) over prose.
</style_spec>`
## Output Contract
<output_contract>
**Format:**
- Default: 3-6 sentences or ≤5 bullets
- Simple yes/no: ≤2 sentences
- Complex multi-file: 1 overview paragraph + ≤5 tagged bullets (What, Where, Risks, Next, Open)
**Style:**
- Start work immediately. Skip empty preambles ("I'm on it", "Let me...") — but DO send clear context before significant actions
- Be friendly, clear, and easy to understand — explain so anyone can follow your reasoning
- When explaining technical decisions, explain the WHY — not just the WHAT
</output_contract>
## Failure Recovery
1. Fix root causes, not symptoms. Re-verify after EVERY attempt.
2. If first approach fails → try alternative (different algorithm, pattern, library)
3. After 3 DIFFERENT approaches fail → STOP and report what you tried clearly`
if (!promptAppend) return prompt
return prompt + "\n\n" + resolvePromptAppend(promptAppend)
}
function buildGptBlockedActionsSection(useTaskSystem: boolean): string {
if (useTaskSystem) {
return `<blocked_actions>
BLOCKED (will fail if attempted):
| Tool | Status | Description |
|------|--------|-------------|
| task | BLOCKED | Agent delegation tool — you cannot spawn other agents |
ALLOWED:
| Tool | Usage |
|------|-------|
| call_omo_agent | Spawn explore/librarian for research ONLY |
| task_create | Create tasks to track your work |
| task_update | Update task status (in_progress, completed) |
| task_list | List active tasks |
| task_get | Get task details by ID |
You work ALONE for implementation. No delegation.
</blocked_actions>`
}
return `<blocked_actions>
BLOCKED (will fail if attempted):
| Tool | Status | Description |
|------|--------|-------------|
| task | BLOCKED | Agent delegation tool — you cannot spawn other agents |
ALLOWED:
| Tool | Usage |
|------|-------|
| call_omo_agent | Spawn explore/librarian for research ONLY |
You work ALONE for implementation. No delegation.
</blocked_actions>`
}
function buildGptTaskDisciplineSection(useTaskSystem: boolean): string {
if (useTaskSystem) {
return `<task_discipline_spec>
TASK TRACKING (NON-NEGOTIABLE):
return `## Task Discipline (NON-NEGOTIABLE)
| Trigger | Action |
|---------|--------|
| 2+ steps | TaskCreate FIRST, atomic breakdown |
| Starting step | TaskUpdate(status="in_progress") - ONE at a time |
| Completing step | TaskUpdate(status="completed") IMMEDIATELY |
| 2+ steps | task_create FIRST, atomic breakdown |
| Starting step | task_update(status="in_progress") ONE at a time |
| Completing step | task_update(status="completed") IMMEDIATELY |
| Batching | NEVER batch completions |
No tasks on multi-step work = INCOMPLETE WORK.
</task_discipline_spec>`
No tasks on multi-step work = INCOMPLETE WORK.`
}
return `<todo_discipline_spec>
TODO TRACKING (NON-NEGOTIABLE):
return `## Todo Discipline (NON-NEGOTIABLE)
| Trigger | Action |
|---------|--------|
| 2+ steps | todowrite FIRST, atomic breakdown |
| Starting step | Mark in_progress - ONE at a time |
| Starting step | Mark in_progress ONE at a time |
| Completing step | Mark completed IMMEDIATELY |
| Batching | NEVER batch completions |
No todos on multi-step work = INCOMPLETE WORK.
</todo_discipline_spec>`
No todos on multi-step work = INCOMPLETE WORK.`
}

View File

@@ -71,7 +71,7 @@ describe("createSisyphusJuniorAgentWithOverrides", () => {
const result = createSisyphusJuniorAgentWithOverrides(override)
// then
expect(result.prompt).toContain("You work ALONE")
expect(result.prompt).toContain("Sisyphus-Junior")
expect(result.prompt).toContain("Extra instructions here")
})
})
@@ -138,7 +138,7 @@ describe("createSisyphusJuniorAgentWithOverrides", () => {
const result = createSisyphusJuniorAgentWithOverrides(override)
// then
expect(result.prompt).toContain("You work ALONE")
expect(result.prompt).toContain("Sisyphus-Junior")
expect(result.prompt).not.toBe("Completely new prompt that replaces everything")
})
})
@@ -209,12 +209,12 @@ describe("createSisyphusJuniorAgentWithOverrides", () => {
const result = createSisyphusJuniorAgentWithOverrides(override, undefined, true)
//#then
expect(result.prompt).toContain("TaskCreate")
expect(result.prompt).toContain("TaskUpdate")
expect(result.prompt).toContain("task_create")
expect(result.prompt).toContain("task_update")
expect(result.prompt).not.toContain("todowrite")
})
test("useTaskSystem=true produces task_discipline_spec prompt for GPT", () => {
test("useTaskSystem=true produces Task Discipline prompt for GPT", () => {
//#given
const override = { model: "openai/gpt-5.2" }
@@ -222,9 +222,9 @@ describe("createSisyphusJuniorAgentWithOverrides", () => {
const result = createSisyphusJuniorAgentWithOverrides(override, undefined, true)
//#then
expect(result.prompt).toContain("<task_discipline_spec>")
expect(result.prompt).toContain("TaskCreate")
expect(result.prompt).not.toContain("<todo_discipline_spec>")
expect(result.prompt).toContain("Task Discipline")
expect(result.prompt).toContain("task_create")
expect(result.prompt).not.toContain("Todo Discipline")
})
test("useTaskSystem=false (default) produces Todo_Discipline prompt", () => {
@@ -236,54 +236,48 @@ describe("createSisyphusJuniorAgentWithOverrides", () => {
//#then
expect(result.prompt).toContain("todowrite")
expect(result.prompt).not.toContain("TaskCreate")
expect(result.prompt).not.toContain("task_create")
})
test("useTaskSystem=true explicitly lists task management tools as ALLOWED for Claude", () => {
test("useTaskSystem=true includes task_create/task_update in Claude prompt", () => {
//#given
const override = { model: "anthropic/claude-sonnet-4-5" }
//#when
const result = createSisyphusJuniorAgentWithOverrides(override, undefined, true)
//#then - prompt must disambiguate: delegation tool blocked, management tools allowed
//#then
expect(result.prompt).toContain("task_create")
expect(result.prompt).toContain("task_update")
expect(result.prompt).toContain("task_list")
expect(result.prompt).toContain("task_get")
expect(result.prompt).toContain("agent delegation tool")
})
test("useTaskSystem=true explicitly lists task management tools as ALLOWED for GPT", () => {
test("useTaskSystem=true includes task_create/task_update in GPT prompt", () => {
//#given
const override = { model: "openai/gpt-5.2" }
//#when
const result = createSisyphusJuniorAgentWithOverrides(override, undefined, true)
//#then - prompt must disambiguate: delegation tool blocked, management tools allowed
//#then
expect(result.prompt).toContain("task_create")
expect(result.prompt).toContain("task_update")
expect(result.prompt).toContain("task_list")
expect(result.prompt).toContain("task_get")
expect(result.prompt).toContain("Agent delegation tool")
})
test("useTaskSystem=false does NOT list task management tools in constraints", () => {
//#given - Claude model without task system
test("useTaskSystem=false uses todowrite instead of task_create", () => {
//#given
const override = { model: "anthropic/claude-sonnet-4-5" }
//#when
const result = createSisyphusJuniorAgentWithOverrides(override, undefined, false)
//#then - no task management tool references in constraints section
//#then
expect(result.prompt).toContain("todowrite")
expect(result.prompt).not.toContain("task_create")
expect(result.prompt).not.toContain("task_update")
})
})
describe("prompt composition", () => {
test("base prompt contains discipline constraints", () => {
test("base prompt contains identity", () => {
// given
const override = {}
@@ -292,10 +286,10 @@ describe("createSisyphusJuniorAgentWithOverrides", () => {
// then
expect(result.prompt).toContain("Sisyphus-Junior")
expect(result.prompt).toContain("You work ALONE")
expect(result.prompt).toContain("Execute tasks directly")
})
test("Claude model uses default prompt with BLOCKED ACTIONS section", () => {
test("Claude model uses default prompt with discipline section", () => {
// given
const override = { model: "anthropic/claude-sonnet-4-5" }
@@ -303,11 +297,11 @@ describe("createSisyphusJuniorAgentWithOverrides", () => {
const result = createSisyphusJuniorAgentWithOverrides(override)
// then
expect(result.prompt).toContain("BLOCKED ACTIONS")
expect(result.prompt).not.toContain("<blocked_actions>")
expect(result.prompt).toContain("<Role>")
expect(result.prompt).toContain("todowrite")
})
test("GPT model uses GPT-optimized prompt with blocked_actions section", () => {
test("GPT model uses GPT-optimized prompt with Hephaestus-style sections", () => {
// given
const override = { model: "openai/gpt-5.2" }
@@ -315,9 +309,9 @@ describe("createSisyphusJuniorAgentWithOverrides", () => {
const result = createSisyphusJuniorAgentWithOverrides(override)
// then
expect(result.prompt).toContain("<blocked_actions>")
expect(result.prompt).toContain("<output_verbosity_spec>")
expect(result.prompt).toContain("<scope_and_design_constraints>")
expect(result.prompt).toContain("Scope Discipline")
expect(result.prompt).toContain("<tool_usage_rules>")
expect(result.prompt).toContain("Progress Updates")
})
test("prompt_append is added after base prompt", () => {
@@ -328,7 +322,7 @@ describe("createSisyphusJuniorAgentWithOverrides", () => {
const result = createSisyphusJuniorAgentWithOverrides(override)
// then
const baseEndIndex = result.prompt!.indexOf("Dense > verbose.")
const baseEndIndex = result.prompt!.indexOf("</Style>")
const appendIndex = result.prompt!.indexOf("CUSTOM_MARKER_FOR_TEST")
expect(baseEndIndex).not.toBe(-1)
expect(appendIndex).toBeGreaterThan(baseEndIndex)
@@ -383,7 +377,7 @@ describe("getSisyphusJuniorPromptSource", () => {
})
describe("buildSisyphusJuniorPrompt", () => {
test("GPT model prompt contains GPT-5.2 specific sections", () => {
test("GPT model prompt contains Hephaestus-style sections", () => {
// given
const model = "openai/gpt-5.2"
@@ -391,10 +385,10 @@ describe("buildSisyphusJuniorPrompt", () => {
const prompt = buildSisyphusJuniorPrompt(model, false)
// then
expect(prompt).toContain("<identity>")
expect(prompt).toContain("<output_verbosity_spec>")
expect(prompt).toContain("<scope_and_design_constraints>")
expect(prompt).toContain("## Identity")
expect(prompt).toContain("Scope Discipline")
expect(prompt).toContain("<tool_usage_rules>")
expect(prompt).toContain("Progress Updates")
})
test("Claude model prompt contains Claude-specific sections", () => {
@@ -406,11 +400,11 @@ describe("buildSisyphusJuniorPrompt", () => {
// then
expect(prompt).toContain("<Role>")
expect(prompt).toContain("<Critical_Constraints>")
expect(prompt).toContain("BLOCKED ACTIONS")
expect(prompt).toContain("<Todo_Discipline>")
expect(prompt).toContain("todowrite")
})
test("useTaskSystem=true includes Task_Discipline for GPT", () => {
test("useTaskSystem=true includes Task Discipline for GPT", () => {
// given
const model = "openai/gpt-5.2"
@@ -418,8 +412,8 @@ describe("buildSisyphusJuniorPrompt", () => {
const prompt = buildSisyphusJuniorPrompt(model, true)
// then
expect(prompt).toContain("<task_discipline_spec>")
expect(prompt).toContain("TaskCreate")
expect(prompt).toContain("Task Discipline")
expect(prompt).toContain("task_create")
})
test("useTaskSystem=false includes Todo_Discipline for Claude", () => {

View File

@@ -37,12 +37,10 @@ function buildTaskManagementSection(useTaskSystem: boolean): string {
### When to Create Tasks (MANDATORY)
| Trigger | Action |
|---------|--------|
| Multi-step task (2+ steps) | ALWAYS \`TaskCreate\` first |
| Uncertain scope | ALWAYS (tasks clarify thinking) |
| User request with multiple items | ALWAYS |
| Complex single task | \`TaskCreate\` to break down |
- Multi-step task (2+ steps) → ALWAYS \`TaskCreate\` first
- Uncertain scope → ALWAYS (tasks clarify thinking)
- User request with multiple items → ALWAYS
- Complex single task → \`TaskCreate\` to break down
### Workflow (NON-NEGOTIABLE)
@@ -61,12 +59,10 @@ function buildTaskManagementSection(useTaskSystem: boolean): string {
### Anti-Patterns (BLOCKING)
| Violation | Why It's Bad |
|-----------|--------------|
| Skipping tasks on multi-step tasks | User has no visibility, steps get forgotten |
| Batch-completing multiple tasks | Defeats real-time tracking purpose |
| Proceeding without marking in_progress | No indication of what you're working on |
| Finishing without completing tasks | Task appears incomplete to user |
- Skipping tasks on multi-step tasks — user has no visibility, steps get forgotten
- Batch-completing multiple tasks — defeats real-time tracking purpose
- Proceeding without marking in_progress — no indication of what you're working on
- Finishing without completing tasks — task appears incomplete to user
**FAILURE TO USE TASKS ON NON-TRIVIAL TASKS = INCOMPLETE WORK.**
@@ -95,12 +91,10 @@ Should I proceed with [recommendation], or would you prefer differently?
### When to Create Todos (MANDATORY)
| Trigger | Action |
|---------|--------|
| Multi-step task (2+ steps) | ALWAYS create todos first |
| Uncertain scope | ALWAYS (todos clarify thinking) |
| User request with multiple items | ALWAYS |
| Complex single task | Create todos to break down |
- Multi-step task (2+ steps) → ALWAYS create todos first
- Uncertain scope → ALWAYS (todos clarify thinking)
- User request with multiple items → ALWAYS
- Complex single task → Create todos to break down
### Workflow (NON-NEGOTIABLE)
@@ -119,12 +113,10 @@ Should I proceed with [recommendation], or would you prefer differently?
### Anti-Patterns (BLOCKING)
| Violation | Why It's Bad |
|-----------|--------------|
| Skipping todos on multi-step tasks | User has no visibility, steps get forgotten |
| Batch-completing multiple todos | Defeats real-time tracking purpose |
| Proceeding without marking in_progress | No indication of what you're working on |
| Finishing without completing todos | Task appears incomplete to user |
- Skipping todos on multi-step tasks — user has no visibility, steps get forgotten
- Batch-completing multiple todos — defeats real-time tracking purpose
- Proceeding without marking in_progress — no indication of what you're working on
- Finishing without completing todos — task appears incomplete to user
**FAILURE TO USE TODOS ON NON-TRIVIAL TASKS = INCOMPLETE WORK.**
@@ -200,23 +192,19 @@ ${keyTriggers}
### Step 1: Classify Request Type
| Type | Signal | Action |
|------|--------|--------|
| **Trivial** | Single file, known location, direct answer | Direct tools only (UNLESS Key Trigger applies) |
| **Explicit** | Specific file/line, clear command | Execute directly |
| **Exploratory** | "How does X work?", "Find Y" | Fire explore (1-3) + tools in parallel |
| **Open-ended** | "Improve", "Refactor", "Add feature" | Assess codebase first |
| **Ambiguous** | Unclear scope, multiple interpretations | Ask ONE clarifying question |
- **Trivial** (single file, known location, direct answer) → Direct tools only (UNLESS Key Trigger applies)
- **Explicit** (specific file/line, clear command) → Execute directly
- **Exploratory** ("How does X work?", "Find Y") → Fire explore (1-3) + tools in parallel
- **Open-ended** ("Improve", "Refactor", "Add feature") → Assess codebase first
- **Ambiguous** (unclear scope, multiple interpretations) → Ask ONE clarifying question
### Step 2: Check for Ambiguity
| Situation | Action |
|-----------|--------|
| Single valid interpretation | Proceed |
| Multiple interpretations, similar effort | Proceed with reasonable default, note assumption |
| Multiple interpretations, 2x+ effort difference | **MUST ask** |
| Missing critical info (file, error, context) | **MUST ask** |
| User's design seems flawed or suboptimal | **MUST raise concern** before implementing |
- Single valid interpretation → Proceed
- Multiple interpretations, similar effort → Proceed with reasonable default, note assumption
- Multiple interpretations, 2x+ effort difference → **MUST ask**
- Missing critical info (file, error, context) → **MUST ask**
- User's design seems flawed or suboptimal → **MUST raise concern** before implementing
### Step 3: Validate Before Acting
@@ -259,12 +247,10 @@ Before following existing patterns, assess whether they're worth following.
### State Classification:
| State | Signals | Your Behavior |
|-------|---------|---------------|
| **Disciplined** | Consistent patterns, configs present, tests exist | Follow existing style strictly |
| **Transitional** | Mixed patterns, some structure | Ask: "I see X and Y patterns. Which to follow?" |
| **Legacy/Chaotic** | No consistency, outdated patterns | Propose: "No clear conventions. I suggest [X]. OK?" |
| **Greenfield** | New/empty project | Apply modern best practices |
- **Disciplined** (consistent patterns, configs present, tests exist) → Follow existing style strictly
- **Transitional** (mixed patterns, some structure) → Ask: "I see X and Y patterns. Which to follow?"
- **Legacy/Chaotic** (no consistency, outdated patterns) → Propose: "No clear conventions. I suggest [X]. OK?"
- **Greenfield** (new/empty project) → Apply modern best practices
IMPORTANT: If codebase appears undisciplined, verify before assuming:
- Different patterns may serve different purposes (intentional)
@@ -309,8 +295,10 @@ result = task(..., run_in_background=false) // Never wait synchronously for exp
### Background Result Collection:
1. Launch parallel agents → receive task_ids
2. Continue immediate work
3. When results needed: \`background_output(task_id="...")\`
4. BEFORE final answer: \`background_cancel(all=true)\`
3. When results needed: \`background_output(task_id=\"...\")\`
4. Before final answer, cancel DISPOSABLE tasks (explore, librarian) individually: \`background_cancel(taskId=\"bg_explore_xxx\")\`, \`background_cancel(taskId=\"bg_librarian_xxx\")\`
5. **NEVER cancel Oracle.** ALWAYS collect Oracle result via \`background_output(task_id=\"bg_oracle_xxx\")\` before answering — even if you already have enough context.
6. **NEVER use \`background_cancel(all=true)\`** — it kills Oracle. Cancel each disposable task by its specific taskId.
### Search Stop Conditions
@@ -362,12 +350,10 @@ AFTER THE WORK YOU DELEGATED SEEMS DONE, ALWAYS VERIFY THE RESULTS AS FOLLOWING:
Every \`task()\` output includes a session_id. **USE IT.**
**ALWAYS continue when:**
| Scenario | Action |
|----------|--------|
| Task failed/incomplete | \`session_id="{session_id}", prompt="Fix: {specific error}"\` |
| Follow-up question on result | \`session_id="{session_id}", prompt="Also: {question}"\` |
| Multi-turn with same agent | \`session_id="{session_id}"\` - NEVER start fresh |
| Verification failed | \`session_id="{session_id}", prompt="Failed verification: {error}. Fix."\` |
- Task failed/incomplete → \`session_id=\"{session_id}\", prompt=\"Fix: {specific error}\"\`
- Follow-up question on result → \`session_id=\"{session_id}\", prompt=\"Also: {question}\"\`
- Multi-turn with same agent → \`session_id=\"{session_id}\"\` - NEVER start fresh
- Verification failed → \`session_id=\"{session_id}\", prompt=\"Failed verification: {error}. Fix.\"\`
**Why session_id is CRITICAL:**
- Subagent has FULL conversation context preserved
@@ -404,12 +390,10 @@ If project has build/test commands, run them at task completion.
### Evidence Requirements (task NOT complete without these):
| Action | Required Evidence |
|--------|-------------------|
| File edit | \`lsp_diagnostics\` clean on changed files |
| Build command | Exit code 0 |
| Test run | Pass (or explicit note of pre-existing failures) |
| Delegation | Agent result received and verified |
- **File edit** → \`lsp_diagnostics\` clean on changed files
- **Build command** → Exit code 0
- **Test run** → Pass (or explicit note of pre-existing failures)
- **Delegation** → Agent result received and verified
**NO EVIDENCE = NOT COMPLETE.**
@@ -449,8 +433,9 @@ If verification fails:
3. Report: "Done. Note: found N pre-existing lint errors unrelated to my changes."
### Before Delivering Final Answer:
- Cancel ALL running background tasks: \`background_cancel(all=true)\`
- This conserves resources and ensures clean workflow completion
- Cancel DISPOSABLE background tasks (explore, librarian) individually via \`background_cancel(taskId=\"...\")\`
- **NEVER use \`background_cancel(all=true)\`.** Always cancel individually by taskId.
- **Always wait for Oracle**: When Oracle is running and you have gathered enough context from your own exploration, your next action is \`background_output\` on Oracle — NOT delivering a final answer. Oracle's value is highest when you think you don't need it.
</Behavior_Instructions>
${oracleSection}

View File

@@ -428,7 +428,7 @@ describe("createBuiltinAgents with model overrides", () => {
)
// #then
const matches = agents.sisyphus.prompt.match(/Custom agent: researcher/gi) ?? []
const matches = (agents.sisyphus?.prompt ?? "").match(/Custom agent: researcher/gi) ?? []
expect(matches.length).toBe(1)
} finally {
fetchSpy.mockRestore()
@@ -525,6 +525,34 @@ describe("createBuiltinAgents without systemDefaultModel", () => {
})
describe("createBuiltinAgents with requiresProvider gating (hephaestus)", () => {
test("hephaestus is created when provider-models cache connected list includes required provider", async () => {
// #given
const connectedCacheSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(["anthropic"])
const providerModelsSpy = spyOn(connectedProvidersCache, "readProviderModelsCache").mockReturnValue({
connected: ["openai"],
models: {},
updatedAt: new Date().toISOString(),
})
const fetchSpy = spyOn(shared, "fetchAvailableModels").mockImplementation(async (_, options) => {
const providers = options?.connectedProviders ?? []
return providers.includes("openai")
? new Set(["openai/gpt-5.3-codex"])
: new Set(["anthropic/claude-opus-4-6"])
})
try {
// #when
const agents = await createBuiltinAgents([], {}, undefined, TEST_DEFAULT_MODEL, undefined, undefined, [], {})
// #then
expect(agents.hephaestus).toBeDefined()
} finally {
connectedCacheSpy.mockRestore()
providerModelsSpy.mockRestore()
fetchSpy.mockRestore()
}
})
test("hephaestus is not created when no required provider is connected", async () => {
// #given - only anthropic models available, not in hephaestus requiresProvider
const fetchSpy = spyOn(shared, "fetchAvailableModels").mockResolvedValue(

View File

@@ -1,72 +1,71 @@
# CLI KNOWLEDGE BASE
# src/cli/ — CLI: install, run, doctor, mcp-oauth
**Generated:** 2026-02-17
## OVERVIEW
CLI entry: `bunx oh-my-opencode`. 107+ files with Commander.js + @clack/prompts TUI.
**Commands**: install, run, doctor, get-local-version, mcp-oauth
## STRUCTURE
```
cli/
├── index.ts # Entry point (5 lines)
├── cli-program.ts # Commander.js program (150+ lines, 5 commands)
├── install.ts # TTY routing (TUI or CLI installer)
├── cli-installer.ts # Non-interactive installer (164 lines)
├── tui-installer.ts # Interactive TUI with @clack/prompts (140 lines)
├── config-manager/ # 17 config utilities
│ ├── add-plugin-to-opencode-config.ts # Plugin registration
│ ├── add-provider-config.ts # Provider setup
│ ├── detect-current-config.ts # Project vs user config
│ ├── write-omo-config.ts # JSONC writing
│ └── ...
├── doctor/ # 14 health checks
│ ├── runner.ts # Check orchestration
│ ├── formatter.ts # Colored output
│ └── checks/ # 29 files: auth, config, dependencies, gh, lsp, mcp, opencode, plugin, version, model-resolution (6 sub-checks)
├── run/ # Session launcher (24 files)
│ ├── runner.ts # Run orchestration (126 lines)
│ ├── agent-resolver.ts # Agent selection: flag → env → config → fallback
│ ├── session-resolver.ts # Session creation or resume
│ ├── event-handlers.ts # Event processing (125 lines)
│ ├── completion.ts # Completion detection
│ └── poll-for-completion.ts # Polling with timeout
├── mcp-oauth/ # OAuth token management (login, logout, status)
├── get-local-version/ # Version detection + update check
├── model-fallback.ts # Model fallback configuration
└── provider-availability.ts # Provider availability checks
```
Commander.js CLI with 5 commands. Entry: `index.ts``runCli()` in `cli-program.ts`.
## COMMANDS
| Command | Purpose | Key Logic |
|---------|---------|-----------|
| `install` | Interactive setup | Provider selection → config generation → plugin registration |
| `run` | Session launcher | Agent: flag → env → config → Sisyphus. Enforces todo completion. |
| `doctor` | 14 health checks | installation, config, auth, deps, tools, updates |
| `get-local-version` | Version check | Detects installed, compares with npm latest |
| `mcp-oauth` | OAuth tokens | login (PKCE flow), logout, status |
| `install` | Interactive/non-interactive setup | Provider selection → config gen → plugin registration |
| `run <message>` | Non-interactive session launcher | Agent resolution (flag → env → config → Sisyphus) |
| `doctor` | 4-category health checks | System, Config, Tools, Models |
| `get-local-version` | Version detection | Installed vs npm latest |
| `mcp-oauth` | OAuth token management | login (PKCE), logout, status |
## DOCTOR CHECK CATEGORIES
## STRUCTURE
| Category | Checks |
|----------|--------|
| installation | opencode, plugin |
| configuration | config validity, Zod, model-resolution (6 sub-checks) |
| authentication | anthropic, openai, google |
| dependencies | ast-grep, comment-checker, gh-cli |
| tools | LSP, MCP, MCP-OAuth |
| updates | version comparison |
```
cli/
├── index.ts # Entry point → runCli()
├── cli-program.ts # Commander.js program (5 commands)
├── install.ts # Routes to TUI or CLI installer
├── cli-installer.ts # Non-interactive (console output)
├── tui-installer.ts # Interactive (@clack/prompts)
├── model-fallback.ts # Model config gen by provider availability
├── provider-availability.ts # Provider detection
├── fallback-chain-resolution.ts # Fallback chain logic
├── config-manager/ # 20 config utilities
│ ├── plugin registration, provider config
│ ├── JSONC operations, auth plugins
│ └── npm dist-tags, binary detection
├── doctor/
│ ├── runner.ts # Parallel check execution
│ ├── formatter.ts # Output formatting
│ └── checks/ # 15 check files in 4 categories
│ ├── system.ts # Binary, plugin, version
│ ├── config.ts # JSONC validity, Zod schema
│ ├── tools.ts # AST-Grep, LSP, GH CLI, MCP
│ └── model-resolution.ts # Cache, resolution, overrides (6 sub-files)
├── run/ # Session launcher
│ ├── runner.ts # Main orchestration
│ ├── agent-resolver.ts # Flag → env → config → Sisyphus
│ ├── session-resolver.ts # Create/resume sessions
│ ├── event-handlers.ts # Event processing
│ └── poll-for-completion.ts # Wait for todos/background tasks
└── mcp-oauth/ # OAuth token management
```
## HOW TO ADD CHECK
## MODEL FALLBACK SYSTEM
1. Create `src/cli/doctor/checks/my-check.ts`
2. Export `getXXXCheckDefinition()` returning `CheckDefinition`
3. Add to `getAllCheckDefinitions()` in `checks/index.ts`
Priority: Claude > OpenAI > Gemini > Copilot > OpenCode Zen > Z.ai > Kimi > glm-4.7-free
## ANTI-PATTERNS
Agent-specific: librarian→ZAI, explore→Haiku/nano, hephaestus→requires OpenAI/Copilot
- **Blocking in non-TTY**: Check `process.stdout.isTTY`
- **Direct JSON.parse**: Use `parseJsonc()` from shared
- **Silent failures**: Return `warn` or `fail` in doctor, don't throw
- **Hardcoded paths**: Use `getOpenCodeConfigPaths()` from config-manager
## DOCTOR CHECKS
| Category | Validates |
|----------|-----------|
| **System** | Binary found, version >=1.0.150, plugin registered, version match |
| **Config** | JSONC validity, Zod schema, model override syntax |
| **Tools** | AST-Grep, comment-checker, LSP servers, GH CLI, MCP servers |
| **Models** | Cache exists, model resolution, agent/category overrides, availability |
## HOW TO ADD A DOCTOR CHECK
1. Create `src/cli/doctor/checks/{name}.ts`
2. Export check function matching `DoctorCheck` interface
3. Register in `checks/index.ts`

View File

@@ -0,0 +1,83 @@
import { afterEach, beforeEach, describe, expect, it, mock, spyOn } from "bun:test"
import * as configManager from "./config-manager"
import { runCliInstaller } from "./cli-installer"
import type { InstallArgs } from "./types"
describe("runCliInstaller", () => {
const mockConsoleLog = mock(() => {})
const mockConsoleError = mock(() => {})
const originalConsoleLog = console.log
const originalConsoleError = console.error
beforeEach(() => {
console.log = mockConsoleLog
console.error = mockConsoleError
mockConsoleLog.mockClear()
mockConsoleError.mockClear()
})
afterEach(() => {
console.log = originalConsoleLog
console.error = originalConsoleError
})
it("runs auth and provider setup steps when openai or copilot are enabled without gemini", async () => {
//#given
const addAuthPluginsSpy = spyOn(configManager, "addAuthPlugins").mockResolvedValue({
success: true,
configPath: "/tmp/opencode.jsonc",
})
const addProviderConfigSpy = spyOn(configManager, "addProviderConfig").mockReturnValue({
success: true,
configPath: "/tmp/opencode.jsonc",
})
const restoreSpies = [
addAuthPluginsSpy,
addProviderConfigSpy,
spyOn(configManager, "detectCurrentConfig").mockReturnValue({
isInstalled: false,
hasClaude: false,
isMax20: false,
hasOpenAI: false,
hasGemini: false,
hasCopilot: false,
hasOpencodeZen: false,
hasZaiCodingPlan: false,
hasKimiForCoding: false,
}),
spyOn(configManager, "isOpenCodeInstalled").mockResolvedValue(true),
spyOn(configManager, "getOpenCodeVersion").mockResolvedValue("1.0.200"),
spyOn(configManager, "addPluginToOpenCodeConfig").mockResolvedValue({
success: true,
configPath: "/tmp/opencode.jsonc",
}),
spyOn(configManager, "writeOmoConfig").mockReturnValue({
success: true,
configPath: "/tmp/oh-my-opencode.jsonc",
}),
]
const args: InstallArgs = {
tui: false,
claude: "no",
openai: "yes",
gemini: "no",
copilot: "yes",
opencodeZen: "no",
zaiCodingPlan: "no",
kimiForCoding: "no",
}
//#when
const result = await runCliInstaller(args, "3.4.0")
//#then
expect(result).toBe(0)
expect(addAuthPluginsSpy).toHaveBeenCalledTimes(1)
expect(addProviderConfigSpy).toHaveBeenCalledTimes(1)
for (const spy of restoreSpies) {
spy.mockRestore()
}
})
})

View File

@@ -77,7 +77,9 @@ export async function runCliInstaller(args: InstallArgs, version: string): Promi
`Plugin ${isUpdate ? "verified" : "added"} ${SYMBOLS.arrow} ${color.dim(pluginResult.configPath)}`,
)
if (config.hasGemini) {
const needsProviderSetup = config.hasGemini || config.hasOpenAI || config.hasCopilot
if (needsProviderSetup) {
printStep(step++, totalSteps, "Adding auth plugins...")
const authResult = await addAuthPlugins(config)
if (!authResult.success) {

View File

@@ -1,32 +1,45 @@
import pc from "picocolors"
import type { RunOptions } from "./types"
import type { OhMyOpenCodeConfig } from "../../config"
import { getAgentConfigKey, getAgentDisplayName } from "../../shared/agent-display-names"
const CORE_AGENT_ORDER = ["sisyphus", "hephaestus", "prometheus", "atlas"] as const
const DEFAULT_AGENT = "sisyphus"
type EnvVars = Record<string, string | undefined>
type CoreAgentKey = (typeof CORE_AGENT_ORDER)[number]
const normalizeAgentName = (agent?: string): string | undefined => {
if (!agent) return undefined
const trimmed = agent.trim()
if (!trimmed) return undefined
const lowered = trimmed.toLowerCase()
const coreMatch = CORE_AGENT_ORDER.find((name) => name.toLowerCase() === lowered)
return coreMatch ?? trimmed
interface ResolvedAgent {
configKey: string
resolvedName: string
}
const isAgentDisabled = (agent: string, config: OhMyOpenCodeConfig): boolean => {
const lowered = agent.toLowerCase()
if (lowered === "sisyphus" && config.sisyphus_agent?.disabled === true) {
const normalizeAgentName = (agent?: string): ResolvedAgent | undefined => {
if (!agent) return undefined
const trimmed = agent.trim()
if (trimmed.length === 0) return undefined
const configKey = getAgentConfigKey(trimmed)
const displayName = getAgentDisplayName(configKey)
const isKnownAgent = displayName !== configKey
return {
configKey,
resolvedName: isKnownAgent ? displayName : trimmed,
}
}
const isAgentDisabled = (agentConfigKey: string, config: OhMyOpenCodeConfig): boolean => {
const lowered = agentConfigKey.toLowerCase()
if (lowered === DEFAULT_AGENT && config.sisyphus_agent?.disabled === true) {
return true
}
return (config.disabled_agents ?? []).some(
(disabled) => disabled.toLowerCase() === lowered
(disabled) => getAgentConfigKey(disabled) === lowered
)
}
const pickFallbackAgent = (config: OhMyOpenCodeConfig): string => {
const pickFallbackAgent = (config: OhMyOpenCodeConfig): CoreAgentKey => {
for (const agent of CORE_AGENT_ORDER) {
if (!isAgentDisabled(agent, config)) {
return agent
@@ -43,27 +56,33 @@ export const resolveRunAgent = (
const cliAgent = normalizeAgentName(options.agent)
const envAgent = normalizeAgentName(env.OPENCODE_DEFAULT_AGENT)
const configAgent = normalizeAgentName(pluginConfig.default_run_agent)
const resolved = cliAgent ?? envAgent ?? configAgent ?? DEFAULT_AGENT
const normalized = normalizeAgentName(resolved) ?? DEFAULT_AGENT
const resolved =
cliAgent ??
envAgent ??
configAgent ?? {
configKey: DEFAULT_AGENT,
resolvedName: getAgentDisplayName(DEFAULT_AGENT),
}
if (isAgentDisabled(normalized, pluginConfig)) {
if (isAgentDisabled(resolved.configKey, pluginConfig)) {
const fallback = pickFallbackAgent(pluginConfig)
const fallbackName = getAgentDisplayName(fallback)
const fallbackDisabled = isAgentDisabled(fallback, pluginConfig)
if (fallbackDisabled) {
console.log(
pc.yellow(
`Requested agent "${normalized}" is disabled and no enabled core agent was found. Proceeding with "${fallback}".`
`Requested agent "${resolved.resolvedName}" is disabled and no enabled core agent was found. Proceeding with "${fallbackName}".`
)
)
return fallback
return fallbackName
}
console.log(
pc.yellow(
`Requested agent "${normalized}" is disabled. Falling back to "${fallback}".`
`Requested agent "${resolved.resolvedName}" is disabled. Falling back to "${fallbackName}".`
)
)
return fallback
return fallbackName
}
return normalized
return resolved.resolvedName
}

View File

@@ -1,5 +1,6 @@
import pc from "picocolors"
import type { RunContext, Todo, ChildSession, SessionStatus } from "./types"
import { normalizeSDKResponse } from "../../shared"
export async function checkCompletionConditions(ctx: RunContext): Promise<boolean> {
try {
@@ -19,8 +20,11 @@ export async function checkCompletionConditions(ctx: RunContext): Promise<boolea
}
async function areAllTodosComplete(ctx: RunContext): Promise<boolean> {
const todosRes = await ctx.client.session.todo({ path: { id: ctx.sessionID } })
const todos = (todosRes.data ?? []) as Todo[]
const todosRes = await ctx.client.session.todo({
path: { id: ctx.sessionID },
query: { directory: ctx.directory },
})
const todos = normalizeSDKResponse(todosRes, [] as Todo[])
const incompleteTodos = todos.filter(
(t) => t.status !== "completed" && t.status !== "cancelled"
@@ -42,8 +46,10 @@ async function areAllChildrenIdle(ctx: RunContext): Promise<boolean> {
async function fetchAllStatuses(
ctx: RunContext
): Promise<Record<string, SessionStatus>> {
const statusRes = await ctx.client.session.status()
return (statusRes.data ?? {}) as Record<string, SessionStatus>
const statusRes = await ctx.client.session.status({
query: { directory: ctx.directory },
})
return normalizeSDKResponse(statusRes, {} as Record<string, SessionStatus>)
}
async function areAllDescendantsIdle(
@@ -53,8 +59,9 @@ async function areAllDescendantsIdle(
): Promise<boolean> {
const childrenRes = await ctx.client.session.children({
path: { id: sessionID },
query: { directory: ctx.directory },
})
const children = (childrenRes.data ?? []) as ChildSession[]
const children = normalizeSDKResponse(childrenRes, [] as ChildSession[])
for (const child of children) {
const status = allStatuses[child.id]

View File

@@ -57,7 +57,11 @@ export function serializeError(error: unknown): string {
function getSessionTag(ctx: RunContext, payload: EventPayload): string {
const props = payload.properties as Record<string, unknown> | undefined
const info = props?.info as Record<string, unknown> | undefined
const sessionID = props?.sessionID ?? info?.sessionID
const part = props?.part as Record<string, unknown> | undefined
const sessionID =
props?.sessionID ?? props?.sessionId ??
info?.sessionID ?? info?.sessionId ??
part?.sessionID ?? part?.sessionId
const isMainSession = sessionID === ctx.sessionID
if (isMainSession) return pc.green("[MAIN]")
if (sessionID) return pc.yellow(`[${String(sessionID).slice(0, 8)}]`)
@@ -79,9 +83,9 @@ export function logEventVerbose(ctx: RunContext, payload: EventPayload): void {
case "message.part.updated": {
const partProps = props as MessagePartUpdatedProps | undefined
const part = partProps?.part
if (part?.type === "tool-invocation") {
const toolPart = part as { toolName?: string; state?: string }
console.error(pc.dim(`${sessionTag} message.part (tool): ${toolPart.toolName} [${toolPart.state}]`))
if (part?.type === "tool") {
const status = part.state?.status ?? "unknown"
console.error(pc.dim(`${sessionTag} message.part (tool): ${part.tool ?? part.name ?? "?"} [${status}]`))
} else if (part?.type === "text" && part.text) {
const preview = part.text.slice(0, 80).replace(/\n/g, "\\n")
console.error(pc.dim(`${sessionTag} message.part (text): "${preview}${part.text.length > 80 ? "..." : ""}"`))

View File

@@ -1,7 +1,7 @@
import { describe, it, expect } from "bun:test"
import { describe, it, expect, spyOn } from "bun:test"
import type { RunContext } from "./types"
import { createEventState } from "./events"
import { handleSessionStatus } from "./event-handlers"
import { handleSessionStatus, handleMessagePartUpdated, handleTuiToast } from "./event-handlers"
const createMockContext = (sessionID: string = "test-session"): RunContext => ({
sessionID,
@@ -70,4 +70,211 @@ describe("handleSessionStatus", () => {
//#then - state.mainSessionIdle remains unchanged
expect(state.mainSessionIdle).toBe(true)
})
it("recognizes idle from camelCase sessionId", () => {
//#given - state with mainSessionIdle=false and payload using sessionId
const ctx = createMockContext("test-session")
const state = createEventState()
state.mainSessionIdle = false
const payload = {
type: "session.status",
properties: {
sessionId: "test-session",
status: { type: "idle" as const },
},
}
//#when - handleSessionStatus called with camelCase sessionId
handleSessionStatus(ctx, payload as any, state)
//#then - state.mainSessionIdle === true
expect(state.mainSessionIdle).toBe(true)
})
})
describe("handleMessagePartUpdated", () => {
it("extracts sessionID from part (current OpenCode event structure)", () => {
//#given - message.part.updated with sessionID in part, not info
const ctx = createMockContext("ses_main")
const state = createEventState()
const stdoutSpy = spyOn(process.stdout, "write").mockImplementation(() => true)
const payload = {
type: "message.part.updated",
properties: {
part: {
id: "part_1",
sessionID: "ses_main",
messageID: "msg_1",
type: "text",
text: "Hello world",
},
},
}
//#when
handleMessagePartUpdated(ctx, payload as any, state)
//#then
expect(state.hasReceivedMeaningfulWork).toBe(true)
expect(state.lastPartText).toBe("Hello world")
expect(stdoutSpy).toHaveBeenCalled()
stdoutSpy.mockRestore()
})
it("skips events for different session", () => {
//#given - message.part.updated with different session
const ctx = createMockContext("ses_main")
const state = createEventState()
const payload = {
type: "message.part.updated",
properties: {
part: {
id: "part_1",
sessionID: "ses_other",
messageID: "msg_1",
type: "text",
text: "Hello world",
},
},
}
//#when
handleMessagePartUpdated(ctx, payload as any, state)
//#then
expect(state.hasReceivedMeaningfulWork).toBe(false)
expect(state.lastPartText).toBe("")
})
it("handles tool part with running status", () => {
//#given - tool part in running state
const ctx = createMockContext("ses_main")
const state = createEventState()
const stdoutSpy = spyOn(process.stdout, "write").mockImplementation(() => true)
const payload = {
type: "message.part.updated",
properties: {
part: {
id: "part_1",
sessionID: "ses_main",
messageID: "msg_1",
type: "tool",
tool: "read",
state: { status: "running", input: { filePath: "/src/index.ts" } },
},
},
}
//#when
handleMessagePartUpdated(ctx, payload as any, state)
//#then
expect(state.currentTool).toBe("read")
expect(state.hasReceivedMeaningfulWork).toBe(true)
stdoutSpy.mockRestore()
})
it("clears currentTool when tool completes", () => {
//#given - tool part in completed state
const ctx = createMockContext("ses_main")
const state = createEventState()
state.currentTool = "read"
const stdoutSpy = spyOn(process.stdout, "write").mockImplementation(() => true)
const payload = {
type: "message.part.updated",
properties: {
part: {
id: "part_1",
sessionID: "ses_main",
messageID: "msg_1",
type: "tool",
tool: "read",
state: { status: "completed", input: {}, output: "file contents here" },
},
},
}
//#when
handleMessagePartUpdated(ctx, payload as any, state)
//#then
expect(state.currentTool).toBeNull()
stdoutSpy.mockRestore()
})
it("supports legacy info.sessionID for backward compatibility", () => {
//#given - legacy event with sessionID in info
const ctx = createMockContext("ses_legacy")
const state = createEventState()
const stdoutSpy = spyOn(process.stdout, "write").mockImplementation(() => true)
const payload = {
type: "message.part.updated",
properties: {
info: { sessionID: "ses_legacy", role: "assistant" },
part: {
type: "text",
text: "Legacy text",
},
},
}
//#when
handleMessagePartUpdated(ctx, payload as any, state)
//#then
expect(state.hasReceivedMeaningfulWork).toBe(true)
expect(state.lastPartText).toBe("Legacy text")
stdoutSpy.mockRestore()
})
})
describe("handleTuiToast", () => {
it("marks main session as error when toast variant is error", () => {
//#given - toast error payload
const ctx = createMockContext("test-session")
const state = createEventState()
const payload = {
type: "tui.toast.show",
properties: {
title: "Auth",
message: "Invalid API key",
variant: "error" as const,
},
}
//#when
handleTuiToast(ctx, payload as any, state)
//#then
expect(state.mainSessionError).toBe(true)
expect(state.lastError).toBe("Auth: Invalid API key")
})
it("does not mark session error for warning toast", () => {
//#given - toast warning payload
const ctx = createMockContext("test-session")
const state = createEventState()
const payload = {
type: "tui.toast.show",
properties: {
message: "Retrying provider",
variant: "warning" as const,
},
}
//#when
handleTuiToast(ctx, payload as any, state)
//#then
expect(state.mainSessionError).toBe(false)
expect(state.lastError).toBe(null)
})
})

View File

@@ -9,15 +9,32 @@ import type {
MessagePartUpdatedProps,
ToolExecuteProps,
ToolResultProps,
TuiToastShowProps,
} from "./types"
import type { EventState } from "./event-state"
import { serializeError } from "./event-formatting"
function getSessionId(props?: { sessionID?: string; sessionId?: string }): string | undefined {
return props?.sessionID ?? props?.sessionId
}
function getInfoSessionId(props?: {
info?: { sessionID?: string; sessionId?: string }
}): string | undefined {
return props?.info?.sessionID ?? props?.info?.sessionId
}
function getPartSessionId(props?: {
part?: { sessionID?: string; sessionId?: string }
}): string | undefined {
return props?.part?.sessionID ?? props?.part?.sessionId
}
export function handleSessionIdle(ctx: RunContext, payload: EventPayload, state: EventState): void {
if (payload.type !== "session.idle") return
const props = payload.properties as SessionIdleProps | undefined
if (props?.sessionID === ctx.sessionID) {
if (getSessionId(props) === ctx.sessionID) {
state.mainSessionIdle = true
}
}
@@ -26,7 +43,7 @@ export function handleSessionStatus(ctx: RunContext, payload: EventPayload, stat
if (payload.type !== "session.status") return
const props = payload.properties as SessionStatusProps | undefined
if (props?.sessionID !== ctx.sessionID) return
if (getSessionId(props) !== ctx.sessionID) return
if (props?.status?.type === "busy") {
state.mainSessionIdle = false
@@ -41,7 +58,7 @@ export function handleSessionError(ctx: RunContext, payload: EventPayload, state
if (payload.type !== "session.error") return
const props = payload.properties as SessionErrorProps | undefined
if (props?.sessionID === ctx.sessionID) {
if (getSessionId(props) === ctx.sessionID) {
state.mainSessionError = true
state.lastError = serializeError(props?.error)
console.error(pc.red(`\n[session.error] ${state.lastError}`))
@@ -52,10 +69,12 @@ export function handleMessagePartUpdated(ctx: RunContext, payload: EventPayload,
if (payload.type !== "message.part.updated") return
const props = payload.properties as MessagePartUpdatedProps | undefined
if (props?.info?.sessionID !== ctx.sessionID) return
if (props?.info?.role !== "assistant") return
// Current OpenCode puts sessionID inside part; legacy puts it in info
const partSid = getPartSessionId(props)
const infoSid = getInfoSessionId(props)
if ((partSid ?? infoSid) !== ctx.sessionID) return
const part = props.part
const part = props?.part
if (!part) return
if (part.type === "text" && part.text) {
@@ -66,13 +85,57 @@ export function handleMessagePartUpdated(ctx: RunContext, payload: EventPayload,
}
state.lastPartText = part.text
}
if (part.type === "tool") {
handleToolPart(ctx, part, state)
}
}
function handleToolPart(
_ctx: RunContext,
part: NonNullable<MessagePartUpdatedProps["part"]>,
state: EventState,
): void {
const toolName = part.tool || part.name || "unknown"
const status = part.state?.status
if (status === "running") {
state.currentTool = toolName
let inputPreview = ""
const input = part.state?.input
if (input) {
if (input.command) {
inputPreview = ` ${pc.dim(String(input.command).slice(0, 60))}`
} else if (input.pattern) {
inputPreview = ` ${pc.dim(String(input.pattern).slice(0, 40))}`
} else if (input.filePath) {
inputPreview = ` ${pc.dim(String(input.filePath))}`
} else if (input.query) {
inputPreview = ` ${pc.dim(String(input.query).slice(0, 40))}`
}
}
state.hasReceivedMeaningfulWork = true
process.stdout.write(`\n${pc.cyan(">")} ${pc.bold(toolName)}${inputPreview}\n`)
}
if (status === "completed" || status === "error") {
const output = part.state?.output || ""
const maxLen = 200
const preview = output.length > maxLen ? output.slice(0, maxLen) + "..." : output
if (preview.trim()) {
const lines = preview.split("\n").slice(0, 3)
process.stdout.write(pc.dim(` └─ ${lines.join("\n ")}\n`))
}
state.currentTool = null
state.lastPartText = ""
}
}
export function handleMessageUpdated(ctx: RunContext, payload: EventPayload, state: EventState): void {
if (payload.type !== "message.updated") return
const props = payload.properties as MessageUpdatedProps | undefined
if (props?.info?.sessionID !== ctx.sessionID) return
if (getInfoSessionId(props) !== ctx.sessionID) return
if (props?.info?.role !== "assistant") return
state.hasReceivedMeaningfulWork = true
@@ -84,7 +147,7 @@ export function handleToolExecute(ctx: RunContext, payload: EventPayload, state:
if (payload.type !== "tool.execute") return
const props = payload.properties as ToolExecuteProps | undefined
if (props?.sessionID !== ctx.sessionID) return
if (getSessionId(props) !== ctx.sessionID) return
const toolName = props?.name || "unknown"
state.currentTool = toolName
@@ -111,7 +174,7 @@ export function handleToolResult(ctx: RunContext, payload: EventPayload, state:
if (payload.type !== "tool.result") return
const props = payload.properties as ToolResultProps | undefined
if (props?.sessionID !== ctx.sessionID) return
if (getSessionId(props) !== ctx.sessionID) return
const output = props?.output || ""
const maxLen = 200
@@ -125,3 +188,24 @@ export function handleToolResult(ctx: RunContext, payload: EventPayload, state:
state.currentTool = null
state.lastPartText = ""
}
export function handleTuiToast(_ctx: RunContext, payload: EventPayload, state: EventState): void {
if (payload.type !== "tui.toast.show") return
const props = payload.properties as TuiToastShowProps | undefined
const title = props?.title ? `${props.title}: ` : ""
const message = props?.message?.trim()
const variant = props?.variant ?? "info"
if (!message) return
if (variant === "error") {
state.mainSessionError = true
state.lastError = `${title}${message}`
console.error(pc.red(`\n[tui.toast.error] ${state.lastError}`))
return
}
const colorize = variant === "warning" ? pc.yellow : pc.dim
console.log(colorize(`[toast:${variant}] ${title}${message}`))
}

View File

@@ -10,6 +10,7 @@ import {
handleMessageUpdated,
handleToolExecute,
handleToolResult,
handleTuiToast,
} from "./event-handlers"
export async function processEvents(
@@ -36,6 +37,7 @@ export async function processEvents(
handleMessageUpdated(ctx, payload, state)
handleToolExecute(ctx, payload, state)
handleToolResult(ctx, payload, state)
handleTuiToast(ctx, payload, state)
} catch (err) {
console.error(pc.red(`[event error] ${err}`))
}

View File

@@ -170,6 +170,28 @@ describe("event handling", () => {
expect(state.hasReceivedMeaningfulWork).toBe(true)
})
it("message.updated with camelCase sessionId sets hasReceivedMeaningfulWork", async () => {
//#given - assistant message uses sessionId key
const ctx = createMockContext("my-session")
const state = createEventState()
const payload: EventPayload = {
type: "message.updated",
properties: {
info: { sessionId: "my-session", role: "assistant" },
},
}
const events = toAsyncIterable([payload])
const { processEvents } = await import("./events")
//#when
await processEvents(ctx, events, state)
//#then
expect(state.hasReceivedMeaningfulWork).toBe(true)
})
it("message.updated with user role does not set hasReceivedMeaningfulWork", async () => {
// given - user message should not count as meaningful work
const ctx = createMockContext("my-session")
@@ -251,6 +273,7 @@ describe("event handling", () => {
lastPartText: "",
currentTool: null,
hasReceivedMeaningfulWork: false,
messageCount: 0,
}
const payload: EventPayload = {

View File

@@ -1,9 +1,11 @@
import { describe, it, expect, mock, spyOn, beforeEach, afterEach } from "bun:test"
import { describe, it, expect, mock, spyOn, beforeEach, afterEach, afterAll } from "bun:test"
import type { RunResult } from "./types"
import { createJsonOutputManager } from "./json-output"
import { resolveSession } from "./session-resolver"
import { executeOnCompleteHook } from "./on-complete-hook"
import type { OpencodeClient } from "./types"
import * as originalSdk from "@opencode-ai/sdk"
import * as originalPortUtils from "../../shared/port-utils"
const mockServerClose = mock(() => {})
const mockCreateOpencode = mock(() =>
@@ -27,6 +29,11 @@ mock.module("../../shared/port-utils", () => ({
DEFAULT_SERVER_PORT: 4096,
}))
afterAll(() => {
mock.module("@opencode-ai/sdk", () => originalSdk)
mock.module("../../shared/port-utils", () => originalPortUtils)
})
const { createServerConnection } = await import("./server-connection")
interface MockWriteStream {
@@ -120,11 +127,14 @@ describe("integration: --session-id", () => {
const mockClient = createMockClient({ data: { id: sessionId } })
// when
const result = await resolveSession({ client: mockClient, sessionId })
const result = await resolveSession({ client: mockClient, sessionId, directory: "/test" })
// then
expect(result).toBe(sessionId)
expect(mockClient.session.get).toHaveBeenCalledWith({ path: { id: sessionId } })
expect(mockClient.session.get).toHaveBeenCalledWith({
path: { id: sessionId },
query: { directory: "/test" },
})
expect(mockClient.session.create).not.toHaveBeenCalled()
})
@@ -134,11 +144,14 @@ describe("integration: --session-id", () => {
const mockClient = createMockClient({ error: { message: "Session not found" } })
// when
const result = resolveSession({ client: mockClient, sessionId })
const result = resolveSession({ client: mockClient, sessionId, directory: "/test" })
// then
await expect(result).rejects.toThrow(`Session not found: ${sessionId}`)
expect(mockClient.session.get).toHaveBeenCalledWith({ path: { id: sessionId } })
expect(mockClient.session.get).toHaveBeenCalledWith({
path: { id: sessionId },
query: { directory: "/test" },
})
expect(mockClient.session.create).not.toHaveBeenCalled()
})
})

View File

@@ -0,0 +1,52 @@
/// <reference types="bun-types" />
import { describe, expect, it } from "bun:test"
import { prependResolvedOpencodeBinToPath } from "./opencode-bin-path"
describe("prependResolvedOpencodeBinToPath", () => {
it("prepends resolved opencode-ai bin path to PATH", () => {
//#given
const env: Record<string, string | undefined> = {
PATH: "/Users/yeongyu/node_modules/.bin:/usr/bin",
}
const resolver = () => "/tmp/bunx-123/node_modules/opencode-ai/bin/opencode"
//#when
prependResolvedOpencodeBinToPath(env, resolver)
//#then
expect(env.PATH).toBe(
"/tmp/bunx-123/node_modules/opencode-ai/bin:/Users/yeongyu/node_modules/.bin:/usr/bin",
)
})
it("does not duplicate an existing opencode-ai bin path", () => {
//#given
const env: Record<string, string | undefined> = {
PATH: "/tmp/bunx-123/node_modules/opencode-ai/bin:/usr/bin",
}
const resolver = () => "/tmp/bunx-123/node_modules/opencode-ai/bin/opencode"
//#when
prependResolvedOpencodeBinToPath(env, resolver)
//#then
expect(env.PATH).toBe("/tmp/bunx-123/node_modules/opencode-ai/bin:/usr/bin")
})
it("keeps PATH unchanged when opencode-ai cannot be resolved", () => {
//#given
const env: Record<string, string | undefined> = {
PATH: "/Users/yeongyu/node_modules/.bin:/usr/bin",
}
const resolver = () => {
throw new Error("module not found")
}
//#when
prependResolvedOpencodeBinToPath(env, resolver)
//#then
expect(env.PATH).toBe("/Users/yeongyu/node_modules/.bin:/usr/bin")
})
})

View File

@@ -0,0 +1,30 @@
import { delimiter, dirname } from "node:path"
import { createRequire } from "node:module"
type EnvLike = Record<string, string | undefined>
const resolveFromCurrentModule = createRequire(import.meta.url).resolve
export function prependResolvedOpencodeBinToPath(
env: EnvLike = process.env,
resolve: (id: string) => string = resolveFromCurrentModule,
): void {
let resolvedPath: string
try {
resolvedPath = resolve("opencode-ai/bin/opencode")
} catch {
return
}
const opencodeBinDir = dirname(resolvedPath)
const currentPath = env.PATH ?? ""
const pathSegments = currentPath ? currentPath.split(delimiter) : []
if (pathSegments.includes(opencodeBinDir)) {
return
}
env.PATH = currentPath
? `${opencodeBinDir}${delimiter}${currentPath}`
: opencodeBinDir
}

View File

@@ -0,0 +1,102 @@
import { describe, expect, it } from "bun:test"
import { delimiter, join } from "node:path"
import {
buildPathWithBinaryFirst,
collectCandidateBinaryPaths,
findWorkingOpencodeBinary,
withWorkingOpencodePath,
} from "./opencode-binary-resolver"
describe("collectCandidateBinaryPaths", () => {
it("includes Bun.which results first and removes duplicates", () => {
// given
const pathEnv = ["/bad", "/good"].join(delimiter)
const which = (command: string): string | undefined => {
if (command === "opencode") return "/bad/opencode"
return undefined
}
// when
const candidates = collectCandidateBinaryPaths(pathEnv, which, "darwin")
// then
expect(candidates[0]).toBe("/bad/opencode")
expect(candidates).toContain("/good/opencode")
expect(candidates.filter((candidate) => candidate === "/bad/opencode")).toHaveLength(1)
})
})
describe("findWorkingOpencodeBinary", () => {
it("returns the first runnable candidate", async () => {
// given
const pathEnv = ["/bad", "/good"].join(delimiter)
const which = (command: string): string | undefined => {
if (command === "opencode") return "/bad/opencode"
return undefined
}
const probe = async (binaryPath: string): Promise<boolean> =>
binaryPath === "/good/opencode"
// when
const resolved = await findWorkingOpencodeBinary(pathEnv, probe, which, "darwin")
// then
expect(resolved).toBe("/good/opencode")
})
})
describe("buildPathWithBinaryFirst", () => {
it("prepends the binary directory and avoids duplicate entries", () => {
// given
const binaryPath = "/good/opencode"
const pathEnv = ["/bad", "/good", "/other"].join(delimiter)
// when
const updated = buildPathWithBinaryFirst(pathEnv, binaryPath)
// then
expect(updated).toBe(["/good", "/bad", "/other"].join(delimiter))
})
})
describe("withWorkingOpencodePath", () => {
it("temporarily updates PATH while starting the server", async () => {
// given
const originalPath = process.env.PATH
process.env.PATH = ["/bad", "/other"].join(delimiter)
const finder = async (): Promise<string | null> => "/good/opencode"
let observedPath = ""
// when
await withWorkingOpencodePath(
async () => {
observedPath = process.env.PATH ?? ""
},
finder,
)
// then
expect(observedPath).toBe(["/good", "/bad", "/other"].join(delimiter))
expect(process.env.PATH).toBe(["/bad", "/other"].join(delimiter))
process.env.PATH = originalPath
})
it("restores PATH when server startup fails", async () => {
// given
const originalPath = process.env.PATH
process.env.PATH = ["/bad", "/other"].join(delimiter)
const finder = async (): Promise<string | null> => join("/good", "opencode")
// when & then
await expect(
withWorkingOpencodePath(
async () => {
throw new Error("boom")
},
finder,
),
).rejects.toThrow("boom")
expect(process.env.PATH).toBe(["/bad", "/other"].join(delimiter))
process.env.PATH = originalPath
})
})

View File

@@ -0,0 +1,95 @@
import { delimiter, dirname, join } from "node:path"
const OPENCODE_COMMANDS = ["opencode", "opencode-desktop"] as const
const WINDOWS_SUFFIXES = ["", ".exe", ".cmd", ".bat", ".ps1"] as const
function getCommandCandidates(platform: NodeJS.Platform): string[] {
if (platform !== "win32") return [...OPENCODE_COMMANDS]
return OPENCODE_COMMANDS.flatMap((command) =>
WINDOWS_SUFFIXES.map((suffix) => `${command}${suffix}`),
)
}
export function collectCandidateBinaryPaths(
pathEnv: string | undefined,
which: (command: string) => string | null | undefined = Bun.which,
platform: NodeJS.Platform = process.platform,
): string[] {
const seen = new Set<string>()
const candidates: string[] = []
const commandCandidates = getCommandCandidates(platform)
const addCandidate = (binaryPath: string | undefined | null): void => {
if (!binaryPath || seen.has(binaryPath)) return
seen.add(binaryPath)
candidates.push(binaryPath)
}
for (const command of commandCandidates) {
addCandidate(which(command))
}
for (const entry of (pathEnv ?? "").split(delimiter).filter(Boolean)) {
for (const command of commandCandidates) {
addCandidate(join(entry, command))
}
}
return candidates
}
export async function canExecuteBinary(binaryPath: string): Promise<boolean> {
try {
const proc = Bun.spawn([binaryPath, "--version"], {
stdout: "pipe",
stderr: "pipe",
})
await proc.exited
return proc.exitCode === 0
} catch {
return false
}
}
export async function findWorkingOpencodeBinary(
pathEnv: string | undefined = process.env.PATH,
probe: (binaryPath: string) => Promise<boolean> = canExecuteBinary,
which: (command: string) => string | null | undefined = Bun.which,
platform: NodeJS.Platform = process.platform,
): Promise<string | null> {
const candidates = collectCandidateBinaryPaths(pathEnv, which, platform)
for (const candidate of candidates) {
if (await probe(candidate)) {
return candidate
}
}
return null
}
export function buildPathWithBinaryFirst(pathEnv: string | undefined, binaryPath: string): string {
const preferredDir = dirname(binaryPath)
const existing = (pathEnv ?? "").split(delimiter).filter(
(entry) => entry.length > 0 && entry !== preferredDir,
)
return [preferredDir, ...existing].join(delimiter)
}
export async function withWorkingOpencodePath<T>(
startServer: () => Promise<T>,
finder: (pathEnv: string | undefined) => Promise<string | null> = findWorkingOpencodeBinary,
): Promise<T> {
const originalPath = process.env.PATH
const binaryPath = await finder(originalPath)
if (!binaryPath) {
return startServer()
}
process.env.PATH = buildPathWithBinaryFirst(originalPath, binaryPath)
try {
return await startServer()
} finally {
process.env.PATH = originalPath
}
}

View File

@@ -207,6 +207,52 @@ describe("pollForCompletion", () => {
expect(todoCallCount).toBe(0)
})
it("falls back to session.status API when idle event is missing", async () => {
//#given - mainSessionIdle not set by events, but status API says idle
spyOn(console, "log").mockImplementation(() => {})
spyOn(console, "error").mockImplementation(() => {})
const ctx = createMockContext({
statuses: {
"test-session": { type: "idle" },
},
})
const eventState = createEventState()
eventState.mainSessionIdle = false
eventState.hasReceivedMeaningfulWork = true
const abortController = new AbortController()
//#when
const result = await pollForCompletion(ctx, eventState, abortController, {
pollIntervalMs: 10,
requiredConsecutive: 2,
minStabilizationMs: 0,
})
//#then - completion succeeds without idle event
expect(result).toBe(0)
})
it("allows silent completion after stabilization when no meaningful work is received", async () => {
//#given - session is idle and stable but no assistant message/tool event arrived
spyOn(console, "log").mockImplementation(() => {})
spyOn(console, "error").mockImplementation(() => {})
const ctx = createMockContext()
const eventState = createEventState()
eventState.mainSessionIdle = true
eventState.hasReceivedMeaningfulWork = false
const abortController = new AbortController()
//#when
const result = await pollForCompletion(ctx, eventState, abortController, {
pollIntervalMs: 10,
requiredConsecutive: 1,
minStabilizationMs: 30,
})
//#then - completion succeeds after stabilization window
expect(result).toBe(0)
})
it("simulates race condition: brief idle with 0 todos does not cause immediate exit", async () => {
//#given - simulate Sisyphus outputting text, session goes idle briefly, then tool fires
spyOn(console, "log").mockImplementation(() => {})

View File

@@ -2,6 +2,7 @@ import pc from "picocolors"
import type { RunContext } from "./types"
import type { EventState } from "./events"
import { checkCompletionConditions } from "./completion"
import { normalizeSDKResponse } from "../../shared"
const DEFAULT_POLL_INTERVAL_MS = 500
const DEFAULT_REQUIRED_CONSECUTIVE = 3
@@ -28,6 +29,7 @@ export async function pollForCompletion(
let consecutiveCompleteChecks = 0
let errorCycleCount = 0
let firstWorkTimestamp: number | null = null
const pollStartTimestamp = Date.now()
while (!abortController.signal.aborted) {
await new Promise((resolve) => setTimeout(resolve, pollIntervalMs))
@@ -51,6 +53,13 @@ export async function pollForCompletion(
errorCycleCount = 0
}
const mainSessionStatus = await getMainSessionStatus(ctx)
if (mainSessionStatus === "busy" || mainSessionStatus === "retry") {
eventState.mainSessionIdle = false
} else if (mainSessionStatus === "idle") {
eventState.mainSessionIdle = true
}
if (!eventState.mainSessionIdle) {
consecutiveCompleteChecks = 0
continue
@@ -62,8 +71,11 @@ export async function pollForCompletion(
}
if (!eventState.hasReceivedMeaningfulWork) {
if (Date.now() - pollStartTimestamp < minStabilizationMs) {
consecutiveCompleteChecks = 0
continue
}
consecutiveCompleteChecks = 0
continue
}
// Track when first meaningful work was received
@@ -91,3 +103,24 @@ export async function pollForCompletion(
return 130
}
async function getMainSessionStatus(
ctx: RunContext
): Promise<"idle" | "busy" | "retry" | null> {
try {
const statusesRes = await ctx.client.session.status({
query: { directory: ctx.directory },
})
const statuses = normalizeSDKResponse(
statusesRes,
{} as Record<string, { type?: string }>
)
const status = statuses[ctx.sessionID]?.type
if (status === "idle" || status === "busy" || status === "retry") {
return status
}
return null
} catch {
return null
}
}

View File

@@ -22,7 +22,7 @@ describe("resolveRunAgent", () => {
)
// then
expect(agent).toBe("hephaestus")
expect(agent).toBe("Hephaestus (Deep Agent)")
})
it("uses env agent over config", () => {
@@ -34,7 +34,7 @@ describe("resolveRunAgent", () => {
const agent = resolveRunAgent({ message: "test" }, config, env)
// then
expect(agent).toBe("atlas")
expect(agent).toBe("Atlas (Plan Executor)")
})
it("uses config agent over default", () => {
@@ -45,7 +45,7 @@ describe("resolveRunAgent", () => {
const agent = resolveRunAgent({ message: "test" }, config, {})
// then
expect(agent).toBe("prometheus")
expect(agent).toBe("Prometheus (Plan Builder)")
})
it("falls back to sisyphus when none set", () => {
@@ -56,7 +56,7 @@ describe("resolveRunAgent", () => {
const agent = resolveRunAgent({ message: "test" }, config, {})
// then
expect(agent).toBe("sisyphus")
expect(agent).toBe("Sisyphus (Ultraworker)")
})
it("skips disabled sisyphus for next available core agent", () => {
@@ -67,7 +67,18 @@ describe("resolveRunAgent", () => {
const agent = resolveRunAgent({ message: "test" }, config, {})
// then
expect(agent).toBe("hephaestus")
expect(agent).toBe("Hephaestus (Deep Agent)")
})
it("maps display-name style default_run_agent values to canonical display names", () => {
// given
const config = createConfig({ default_run_agent: "Sisyphus (Ultraworker)" })
// when
const agent = resolveRunAgent({ message: "test" }, config, {})
// then
expect(agent).toBe("Sisyphus (Ultraworker)")
})
})
@@ -107,7 +118,7 @@ describe("waitForEventProcessorShutdown", () => {
const eventProcessor = new Promise<void>(() => {})
const spy = spyOn(console, "log").mockImplementation(() => {})
consoleLogSpy = spy
const timeoutMs = 50
const timeoutMs = 200
const start = performance.now()
try {
@@ -116,11 +127,8 @@ describe("waitForEventProcessorShutdown", () => {
//#then
const elapsed = performance.now() - start
expect(elapsed).toBeGreaterThanOrEqual(timeoutMs)
const callArgs = spy.mock.calls.flat().join("")
expect(callArgs).toContain(
`[run] Event stream did not close within ${timeoutMs}ms after abort; continuing shutdown.`,
)
expect(elapsed).toBeGreaterThanOrEqual(timeoutMs - 10)
expect(spy.mock.calls.length).toBeGreaterThanOrEqual(1)
} finally {
spy.mockRestore()
}

View File

@@ -79,6 +79,7 @@ export async function run(options: RunOptions): Promise<number> {
const sessionID = await resolveSession({
client,
sessionId: options.sessionId,
directory,
})
console.log(pc.dim(`Session: ${sessionID}`))

View File

@@ -1,4 +1,8 @@
import { describe, it, expect, mock, beforeEach, afterEach } from "bun:test"
import { describe, it, expect, mock, beforeEach, afterEach, afterAll } from "bun:test"
import * as originalSdk from "@opencode-ai/sdk"
import * as originalPortUtils from "../../shared/port-utils"
import * as originalBinaryResolver from "./opencode-binary-resolver"
const originalConsole = globalThis.console
@@ -13,6 +17,7 @@ const mockCreateOpencodeClient = mock(() => ({ session: {} }))
const mockIsPortAvailable = mock(() => Promise.resolve(true))
const mockGetAvailableServerPort = mock(() => Promise.resolve({ port: 4096, wasAutoSelected: false }))
const mockConsoleLog = mock(() => {})
const mockWithWorkingOpencodePath = mock((startServer: () => Promise<unknown>) => startServer())
mock.module("@opencode-ai/sdk", () => ({
createOpencode: mockCreateOpencode,
@@ -25,6 +30,16 @@ mock.module("../../shared/port-utils", () => ({
DEFAULT_SERVER_PORT: 4096,
}))
mock.module("./opencode-binary-resolver", () => ({
withWorkingOpencodePath: mockWithWorkingOpencodePath,
}))
afterAll(() => {
mock.module("@opencode-ai/sdk", () => originalSdk)
mock.module("../../shared/port-utils", () => originalPortUtils)
mock.module("./opencode-binary-resolver", () => originalBinaryResolver)
})
const { createServerConnection } = await import("./server-connection")
describe("createServerConnection", () => {
@@ -35,6 +50,7 @@ describe("createServerConnection", () => {
mockGetAvailableServerPort.mockClear()
mockServerClose.mockClear()
mockConsoleLog.mockClear()
mockWithWorkingOpencodePath.mockClear()
globalThis.console = { ...console, log: mockConsoleLog } as typeof console
})
@@ -52,6 +68,7 @@ describe("createServerConnection", () => {
// then
expect(mockCreateOpencodeClient).toHaveBeenCalledWith({ baseUrl: attachUrl })
expect(mockWithWorkingOpencodePath).not.toHaveBeenCalled()
expect(result.client).toBeDefined()
expect(result.cleanup).toBeDefined()
result.cleanup()
@@ -69,6 +86,7 @@ describe("createServerConnection", () => {
// then
expect(mockIsPortAvailable).toHaveBeenCalledWith(8080, "127.0.0.1")
expect(mockWithWorkingOpencodePath).toHaveBeenCalledTimes(1)
expect(mockCreateOpencode).toHaveBeenCalledWith({ signal, port: 8080, hostname: "127.0.0.1" })
expect(mockCreateOpencodeClient).not.toHaveBeenCalled()
expect(result.client).toBeDefined()
@@ -106,6 +124,7 @@ describe("createServerConnection", () => {
// then
expect(mockGetAvailableServerPort).toHaveBeenCalledWith(4096, "127.0.0.1")
expect(mockWithWorkingOpencodePath).toHaveBeenCalledTimes(1)
expect(mockCreateOpencode).toHaveBeenCalledWith({ signal, port: 4100, hostname: "127.0.0.1" })
expect(mockCreateOpencodeClient).not.toHaveBeenCalled()
expect(result.client).toBeDefined()

View File

@@ -2,12 +2,16 @@ import { createOpencode, createOpencodeClient } from "@opencode-ai/sdk"
import pc from "picocolors"
import type { ServerConnection } from "./types"
import { getAvailableServerPort, isPortAvailable, DEFAULT_SERVER_PORT } from "../../shared/port-utils"
import { withWorkingOpencodePath } from "./opencode-binary-resolver"
import { prependResolvedOpencodeBinToPath } from "./opencode-bin-path"
export async function createServerConnection(options: {
port?: number
attach?: string
signal: AbortSignal
}): Promise<ServerConnection> {
prependResolvedOpencodeBinToPath()
const { port, attach, signal } = options
if (attach !== undefined) {
@@ -25,7 +29,9 @@ export async function createServerConnection(options: {
if (available) {
console.log(pc.dim("Starting server on port"), pc.cyan(port.toString()))
const { client, server } = await createOpencode({ signal, port, hostname: "127.0.0.1" })
const { client, server } = await withWorkingOpencodePath(() =>
createOpencode({ signal, port, hostname: "127.0.0.1" }),
)
console.log(pc.dim("Server listening at"), pc.cyan(server.url))
return { client, cleanup: () => server.close() }
}
@@ -41,7 +47,9 @@ export async function createServerConnection(options: {
} else {
console.log(pc.dim("Starting server on port"), pc.cyan(selectedPort.toString()))
}
const { client, server } = await createOpencode({ signal, port: selectedPort, hostname: "127.0.0.1" })
const { client, server } = await withWorkingOpencodePath(() =>
createOpencode({ signal, port: selectedPort, hostname: "127.0.0.1" }),
)
console.log(pc.dim("Server listening at"), pc.cyan(server.url))
return { client, cleanup: () => server.close() }
}

View File

@@ -26,6 +26,8 @@ const createMockClient = (overrides: {
}
describe("resolveSession", () => {
const directory = "/test-project"
beforeEach(() => {
spyOn(console, "log").mockImplementation(() => {})
spyOn(console, "error").mockImplementation(() => {})
@@ -39,12 +41,13 @@ describe("resolveSession", () => {
})
// when
const result = await resolveSession({ client: mockClient, sessionId })
const result = await resolveSession({ client: mockClient, sessionId, directory })
// then
expect(result).toBe(sessionId)
expect(mockClient.session.get).toHaveBeenCalledWith({
path: { id: sessionId },
query: { directory },
})
expect(mockClient.session.create).not.toHaveBeenCalled()
})
@@ -57,7 +60,7 @@ describe("resolveSession", () => {
})
// when
const result = resolveSession({ client: mockClient, sessionId })
const result = resolveSession({ client: mockClient, sessionId, directory })
// then
await Promise.resolve(
@@ -65,6 +68,7 @@ describe("resolveSession", () => {
)
expect(mockClient.session.get).toHaveBeenCalledWith({
path: { id: sessionId },
query: { directory },
})
expect(mockClient.session.create).not.toHaveBeenCalled()
})
@@ -76,7 +80,7 @@ describe("resolveSession", () => {
})
// when
const result = await resolveSession({ client: mockClient })
const result = await resolveSession({ client: mockClient, directory })
// then
expect(result).toBe("new-session-id")
@@ -87,6 +91,7 @@ describe("resolveSession", () => {
{ permission: "question", action: "deny", pattern: "*" },
],
},
query: { directory },
})
expect(mockClient.session.get).not.toHaveBeenCalled()
})
@@ -101,7 +106,7 @@ describe("resolveSession", () => {
})
// when
const result = await resolveSession({ client: mockClient })
const result = await resolveSession({ client: mockClient, directory })
// then
expect(result).toBe("retried-session-id")
@@ -113,6 +118,7 @@ describe("resolveSession", () => {
{ permission: "question", action: "deny", pattern: "*" },
],
},
query: { directory },
})
})
@@ -127,7 +133,7 @@ describe("resolveSession", () => {
})
// when
const result = resolveSession({ client: mockClient })
const result = resolveSession({ client: mockClient, directory })
// then
await Promise.resolve(
@@ -147,7 +153,7 @@ describe("resolveSession", () => {
})
// when
const result = resolveSession({ client: mockClient })
const result = resolveSession({ client: mockClient, directory })
// then
await Promise.resolve(

View File

@@ -8,11 +8,15 @@ const SESSION_CREATE_RETRY_DELAY_MS = 1000
export async function resolveSession(options: {
client: OpencodeClient
sessionId?: string
directory: string
}): Promise<string> {
const { client, sessionId } = options
const { client, sessionId, directory } = options
if (sessionId) {
const res = await client.session.get({ path: { id: sessionId } })
const res = await client.session.get({
path: { id: sessionId },
query: { directory },
})
if (res.error || !res.data) {
throw new Error(`Session not found: ${sessionId}`)
}
@@ -28,6 +32,7 @@ export async function resolveSession(options: {
{ permission: "question", action: "deny" as const, pattern: "*" },
],
} as any,
query: { directory },
})
if (res.error) {

View File

@@ -34,10 +34,10 @@ export interface RunContext {
}
export interface Todo {
id: string
content: string
status: string
priority: string
id?: string;
content: string;
status: string;
priority: string;
}
export interface SessionStatus {
@@ -55,16 +55,19 @@ export interface EventPayload {
export interface SessionIdleProps {
sessionID?: string
sessionId?: string
}
export interface SessionStatusProps {
sessionID?: string
sessionId?: string
status?: { type?: string }
}
export interface MessageUpdatedProps {
info?: {
sessionID?: string
sessionId?: string
role?: string
modelID?: string
providerID?: string
@@ -73,28 +76,47 @@ export interface MessageUpdatedProps {
}
export interface MessagePartUpdatedProps {
info?: { sessionID?: string; role?: string }
/** @deprecated Legacy structure — current OpenCode puts sessionID inside part */
info?: { sessionID?: string; sessionId?: string; role?: string }
part?: {
id?: string
sessionID?: string
sessionId?: string
messageID?: string
type?: string
text?: string
/** Tool name (for part.type === "tool") */
tool?: string
/** Tool state (for part.type === "tool") */
state?: { status?: string; input?: Record<string, unknown>; output?: string }
name?: string
input?: unknown
time?: { start?: number; end?: number }
}
}
export interface ToolExecuteProps {
sessionID?: string
sessionId?: string
name?: string
input?: Record<string, unknown>
}
export interface ToolResultProps {
sessionID?: string
sessionId?: string
name?: string
output?: string
}
export interface SessionErrorProps {
sessionID?: string
sessionId?: string
error?: unknown
}
export interface TuiToastShowProps {
title?: string
message?: string
variant?: "info" | "success" | "warning" | "error"
}

View File

@@ -1,52 +1,50 @@
# CONFIG KNOWLEDGE BASE
# src/config/ — Zod v4 Schema System
**Generated:** 2026-02-17
## OVERVIEW
Zod schema definitions for plugin configuration. 21 component files composing `OhMyOpenCodeConfigSchema` with multi-level inheritance and JSONC support.
22 schema files composing `OhMyOpenCodeConfigSchema`. Zod v4 validation with `safeParse()`. All fields optional — omitted fields use plugin defaults.
## SCHEMA TREE
## STRUCTURE
```
config/
├── schema/ # 21 schema component files
│ ├── oh-my-opencode-config.ts # Root schema composition (57 lines)
├── agent-names.ts # BuiltinAgentNameSchema (11 agents), BuiltinSkillNameSchema
│ ├── agent-overrides.ts # AgentOverrideConfigSchema (model, variant, temp, thinking...)
│ ├── categories.ts # 8 categories: visual-engineering, ultrabrain, deep, artistry, quick, ...
│ ├── hooks.ts # HookNameSchema (100+ hook names)
├── commands.ts # BuiltinCommandNameSchema
├── experimental.ts # ExperimentalConfigSchema
│ ├── dynamic-context-pruning.ts # DynamicContextPruningConfigSchema (55 lines)
│ ├── background-task.ts # BackgroundTaskConfigSchema
│ ├── claude-code.ts # ClaudeCodeConfigSchema
│ ├── comment-checker.ts # CommentCheckerConfigSchema
│ ├── notification.ts # NotificationConfigSchema
│ ├── ralph-loop.ts # RalphLoopConfigSchema
│ ├── sisyphus.ts # SisyphusConfigSchema
│ ├── sisyphus-agent.ts # SisyphusAgentConfigSchema
│ ├── skills.ts # SkillsConfigSchema (45 lines)
│ ├── tmux.ts # TmuxConfigSchema, TmuxLayoutSchema
│ ├── websearch.ts # WebsearchConfigSchema
├── browser-automation.ts # BrowserAutomationConfigSchema
│ ├── git-master.ts # GitMasterConfigSchema
│ └── babysitting.ts # BabysittingConfigSchema
├── schema.ts # Barrel export (24 lines)
├── schema.test.ts # Validation tests (735 lines)
├── types.ts # TypeScript types from schemas
└── index.ts # Barrel export (33 lines)
config/schema/
├── oh-my-opencode-config.ts # ROOT: OhMyOpenCodeConfigSchema (composes all below)
├── agent-names.ts # BuiltinAgentNameSchema (11), OverridableAgentNameSchema (14)
├── agent-overrides.ts # AgentOverrideConfigSchema (21 fields per agent)
├── categories.ts # 8 built-in + custom categories
├── hooks.ts # HookNameSchema (46 hooks)
├── skills.ts # SkillsConfigSchema (sources, paths, recursive)
├── commands.ts # BuiltinCommandNameSchema
├── experimental.ts # Feature flags (plugin_load_timeout_ms min 1000, hashline_edit)
├── sisyphus.ts # SisyphusConfigSchema (task system)
├── sisyphus-agent.ts # SisyphusAgentConfigSchema
├── ralph-loop.ts # RalphLoopConfigSchema
├── tmux.ts # TmuxConfigSchema + TmuxLayoutSchema
├── websearch.ts # provider: "exa" | "tavily"
├── claude-code.ts # CC compatibility settings
├── comment-checker.ts # AI comment detection config
├── notification.ts # OS notification settings
├── git-master.ts # commit_footer: boolean | string
├── browser-automation.ts # provider: playwright | agent-browser | playwright-cli
├── background-task.ts # Concurrency limits per model/provider
├── babysitting.ts # Unstable agent monitoring
├── dynamic-context-pruning.ts # Context pruning settings
└── internal/permission.ts # AgentPermissionSchema
```
## ROOT SCHEMA
## ROOT SCHEMA FIELDS (26)
`OhMyOpenCodeConfigSchema` composes: `$schema`, `new_task_system_enabled`, `default_run_agent`, `auto_update`, `disabled_{mcps,agents,skills,hooks,commands,tools}`, `agents` (14 agent keys), `categories` (8 built-in), `claude_code`, `sisyphus_agent`, `comment_checker`, `experimental`, `skills`, `ralph_loop`, `background_task`, `notification`, `babysitting`, `git_master`, `browser_automation_engine`, `websearch`, `tmux`, `sisyphus`
`$schema`, `new_task_system_enabled`, `default_run_agent`, `disabled_mcps`, `disabled_agents`, `disabled_skills`, `disabled_hooks`, `disabled_commands`, `disabled_tools`, `agents`, `categories`, `claude_code`, `sisyphus_agent`, `comment_checker`, `experimental`, `auto_update`, `skills`, `ralph_loop`, `background_task`, `notification`, `babysitting`, `git_master`, `browser_automation_engine`, `websearch`, `tmux`, `sisyphus`, `_migrations`
## CONFIGURATION HIERARCHY
## AGENT OVERRIDE FIELDS (21)
Project (`.opencode/oh-my-opencode.json`) → User (`~/.config/opencode/oh-my-opencode.json`) → Defaults
`model`, `variant`, `category`, `skills`, `temperature`, `top_p`, `prompt`, `prompt_append`, `tools`, `disable`, `description`, `mode`, `color`, `permission`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `providerOptions`
## AGENT OVERRIDE FIELDS
## HOW TO ADD CONFIG
`model`, `variant`, `category`, `skills`, `temperature`, `top_p`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `prompt`, `prompt_append`, `tools`, `permission`, `providerOptions`, `disable`, `description`, `mode`, `color`
## AFTER SCHEMA CHANGES
Run `bun run build:schema` to regenerate `dist/oh-my-opencode.schema.json`
1. Create `src/config/schema/{name}.ts` with Zod schema
2. Add field to `oh-my-opencode-config.ts` root schema
3. Reference via `z.infer<typeof YourSchema>` for TypeScript types
4. Access in handlers via `pluginConfig.{name}`

View File

@@ -553,6 +553,18 @@ describe("BrowserAutomationProviderSchema", () => {
// then
expect(result.success).toBe(false)
})
test("accepts 'playwright-cli' as valid provider", () => {
// given
const input = "playwright-cli"
// when
const result = BrowserAutomationProviderSchema.safeParse(input)
// then
expect(result.success).toBe(true)
expect(result.data).toBe("playwright-cli")
})
})
describe("BrowserAutomationConfigSchema", () => {
@@ -577,6 +589,17 @@ describe("BrowserAutomationConfigSchema", () => {
// then
expect(result.provider).toBe("agent-browser")
})
test("accepts playwright-cli provider in config", () => {
// given
const input = { provider: "playwright-cli" }
// when
const result = BrowserAutomationConfigSchema.parse(input)
// then
expect(result.provider).toBe("playwright-cli")
})
})
describe("OhMyOpenCodeConfigSchema - browser_automation_engine", () => {
@@ -607,6 +630,18 @@ describe("OhMyOpenCodeConfigSchema - browser_automation_engine", () => {
expect(result.success).toBe(true)
expect(result.data?.browser_automation_engine).toBeUndefined()
})
test("accepts browser_automation_engine with playwright-cli", () => {
// given
const input = { browser_automation_engine: { provider: "playwright-cli" } }
// when
const result = OhMyOpenCodeConfigSchema.safeParse(input)
// then
expect(result.success).toBe(true)
expect(result.data?.browser_automation_engine?.provider).toBe("playwright-cli")
})
})
describe("ExperimentalConfigSchema feature flags", () => {
@@ -663,6 +698,59 @@ describe("ExperimentalConfigSchema feature flags", () => {
expect(result.data.safe_hook_creation).toBeUndefined()
}
})
test("accepts hashline_edit as true", () => {
//#given
const config = { hashline_edit: true }
//#when
const result = ExperimentalConfigSchema.safeParse(config)
//#then
expect(result.success).toBe(true)
if (result.success) {
expect(result.data.hashline_edit).toBe(true)
}
})
test("accepts hashline_edit as false", () => {
//#given
const config = { hashline_edit: false }
//#when
const result = ExperimentalConfigSchema.safeParse(config)
//#then
expect(result.success).toBe(true)
if (result.success) {
expect(result.data.hashline_edit).toBe(false)
}
})
test("hashline_edit is optional", () => {
//#given
const config = { safe_hook_creation: true }
//#when
const result = ExperimentalConfigSchema.safeParse(config)
//#then
expect(result.success).toBe(true)
if (result.success) {
expect(result.data.hashline_edit).toBeUndefined()
}
})
test("rejects non-boolean hashline_edit", () => {
//#given
const config = { hashline_edit: "true" }
//#when
const result = ExperimentalConfigSchema.safeParse(config)
//#then
expect(result.success).toBe(false)
})
})
describe("GitMasterConfigSchema", () => {

View File

@@ -4,6 +4,7 @@ export const BrowserAutomationProviderSchema = z.enum([
"playwright",
"agent-browser",
"dev-browser",
"playwright-cli",
])
export const BrowserAutomationConfigSchema = z.object({
@@ -12,6 +13,7 @@ export const BrowserAutomationConfigSchema = z.object({
* - "playwright": Uses Playwright MCP server (@playwright/mcp) - default
* - "agent-browser": Uses Vercel's agent-browser CLI (requires: bun add -g agent-browser)
* - "dev-browser": Uses dev-browser skill with persistent browser state
* - "playwright-cli": Uses Playwright CLI (@playwright/cli) - token-efficient CLI alternative
*/
provider: BrowserAutomationProviderSchema.default("playwright"),
})

View File

@@ -15,6 +15,8 @@ export const ExperimentalConfigSchema = z.object({
plugin_load_timeout_ms: z.number().min(1000).optional(),
/** Wrap hook creation in try/catch to prevent one failing hook from crashing the plugin (default: true at call site) */
safe_hook_creation: z.boolean().optional(),
/** Enable hashline_edit tool for improved file editing with hash-based line anchors */
hashline_edit: z.boolean().optional(),
})
export type ExperimentalConfig = z.infer<typeof ExperimentalConfigSchema>

View File

@@ -45,6 +45,7 @@ export const HookNameSchema = z.enum([
"tasks-todowrite-disabler",
"write-existing-file-guard",
"anthropic-effort",
"hashline-read-enhancer",
])
export type HookName = z.infer<typeof HookNameSchema>

View File

@@ -3,6 +3,7 @@ import type { HookName, OhMyOpenCodeConfig } from "./config"
import type { LoadedSkill } from "./features/opencode-skill-loader/types"
import type { BackgroundManager } from "./features/background-agent"
import type { PluginContext } from "./plugin/types"
import type { ModelCacheState } from "./plugin-state"
import { createCoreHooks } from "./plugin/hooks/create-core-hooks"
import { createContinuationHooks } from "./plugin/hooks/create-continuation-hooks"
@@ -13,6 +14,7 @@ export type CreatedHooks = ReturnType<typeof createHooks>
export function createHooks(args: {
ctx: PluginContext
pluginConfig: OhMyOpenCodeConfig
modelCacheState: ModelCacheState
backgroundManager: BackgroundManager
isHookEnabled: (hookName: HookName) => boolean
safeHookEnabled: boolean
@@ -22,6 +24,7 @@ export function createHooks(args: {
const {
ctx,
pluginConfig,
modelCacheState,
backgroundManager,
isHookEnabled,
safeHookEnabled,
@@ -32,6 +35,7 @@ export function createHooks(args: {
const core = createCoreHooks({
ctx,
pluginConfig,
modelCacheState,
isHookEnabled,
safeHookEnabled,
})

View File

@@ -22,8 +22,9 @@ export function createManagers(args: {
pluginConfig: OhMyOpenCodeConfig
tmuxConfig: TmuxConfig
modelCacheState: ModelCacheState
backgroundNotificationHookEnabled: boolean
}): Managers {
const { ctx, pluginConfig, tmuxConfig, modelCacheState } = args
const { ctx, pluginConfig, tmuxConfig, modelCacheState, backgroundNotificationHookEnabled } = args
const tmuxSessionManager = new TmuxSessionManager(ctx, tmuxConfig)
@@ -57,6 +58,7 @@ export function createManagers(args: {
log("[index] tmux cleanup error during shutdown:", error)
})
},
enableParentSessionNotifications: backgroundNotificationHookEnabled,
},
)

View File

@@ -1,79 +1,69 @@
# FEATURES KNOWLEDGE BASE
# src/features/ — 18 Feature Modules
**Generated:** 2026-02-17
## OVERVIEW
18 feature modules extending plugin capabilities: agent orchestration, skill loading, Claude Code compatibility, MCP management, task storage, and tmux integration.
Standalone feature modules wired into plugin/ layer. Each is self-contained with own types, implementation, and tests.
## STRUCTURE
```
features/
├── background-agent/ # Task lifecycle, concurrency (50 files, 8330 LOC)
│ ├── manager.ts # Main task orchestration (1646 lines)
│ ├── concurrency.ts # Parallel execution limits per provider/model
│ └── spawner/ # Task spawning utilities (8 files)
├── tmux-subagent/ # Tmux integration (28 files, 3303 LOC)
│ └── manager.ts # Pane management, grid planning (350 lines)
├── opencode-skill-loader/ # YAML frontmatter skill loading (28 files, 2967 LOC)
│ ├── loader.ts # Skill discovery (4 scopes)
│ ├── skill-directory-loader.ts # Recursive directory scanning
│ ├── skill-discovery.ts # getAllSkills() with caching
│ └── merger/ # Skill merging with scope priority
├── mcp-oauth/ # OAuth 2.0 flow for MCP (18 files, 2164 LOC)
│ ├── provider.ts # McpOAuthProvider class
│ ├── oauth-authorization-flow.ts # PKCE, callback handling
│ └── dcr.ts # Dynamic Client Registration (RFC 7591)
├── skill-mcp-manager/ # MCP client lifecycle per session (12 files, 1769 LOC)
│ └── manager.ts # SkillMcpManager class (150 lines)
├── builtin-skills/ # 5 built-in skills (10 files, 1921 LOC)
│ └── skills/ # git-master (1111), playwright, dev-browser, frontend-ui-ux
├── builtin-commands/ # 6 command templates (11 files, 1511 LOC)
│ └── templates/ # refactor, ralph-loop, init-deep, handoff, start-work, stop-continuation
├── claude-tasks/ # Task schema + storage (7 files, 1165 LOC)
├── context-injector/ # AGENTS.md, README.md, rules injection (6 files, 809 LOC)
├── claude-code-plugin-loader/ # Plugin discovery from .opencode/plugins/ (10 files)
├── claude-code-mcp-loader/ # .mcp.json with ${VAR} expansion (6 files)
├── claude-code-command-loader/ # Command loading from .opencode/commands/ (3 files)
├── claude-code-agent-loader/ # Agent loading from .opencode/agents/ (3 files)
├── claude-code-session-state/ # Subagent session state tracking (3 files)
├── hook-message-injector/ # System message injection (4 files)
├── task-toast-manager/ # Task progress notifications (4 files)
├── boulder-state/ # Persistent state for multi-step ops (5 files)
└── tool-metadata-store/ # Tool execution metadata caching (3 files)
```
## MODULE MAP
## KEY PATTERNS
| Module | Files | Complexity | Purpose |
|--------|-------|------------|---------|
| **background-agent** | 49 | HIGH | Task lifecycle, concurrency (5/model), polling, spawner pattern |
| **tmux-subagent** | 27 | HIGH | Tmux pane management, grid planning, session orchestration |
| **opencode-skill-loader** | 25 | HIGH | YAML frontmatter skill loading from 4 scopes |
| **mcp-oauth** | 10 | HIGH | OAuth 2.0 + PKCE + DCR (RFC 7591) for MCP servers |
| **builtin-skills** | 10 | LOW | 6 skills: git-master, playwright, playwright-cli, agent-browser, dev-browser, frontend-ui-ux |
| **skill-mcp-manager** | 10 | MEDIUM | MCP client lifecycle per session (stdio + HTTP) |
| **claude-code-plugin-loader** | 10 | MEDIUM | Unified plugin discovery from .opencode/plugins/ |
| **builtin-commands** | 9 | LOW | Command templates: refactor, init-deep, handoff, etc. |
| **claude-code-mcp-loader** | 5 | MEDIUM | .mcp.json loading with ${VAR} env expansion |
| **context-injector** | 4 | MEDIUM | AGENTS.md/README.md injection into context |
| **boulder-state** | 4 | LOW | Persistent state for multi-step operations |
| **hook-message-injector** | 4 | MEDIUM | System message injection for hooks |
| **claude-tasks** | 4 | MEDIUM | Task schema + file storage + OpenCode todo sync |
| **task-toast-manager** | 3 | MEDIUM | Task progress notifications |
| **claude-code-agent-loader** | 3 | LOW | Load agents from .opencode/agents/ |
| **claude-code-command-loader** | 3 | LOW | Load commands from .opencode/commands/ |
| **claude-code-session-state** | 2 | LOW | Subagent session state tracking |
| **tool-metadata-store** | 2 | LOW | Tool execution metadata cache |
**Background Agent Lifecycle:**
Task creation → Queue → Concurrency check → Execute → Monitor/Poll → Notification → Cleanup
## KEY MODULES
**Skill Loading Pipeline (4-scope priority):**
opencode-project (`.opencode/skills/`) > opencode (`~/.config/opencode/skills/`) > project (`.claude/skills/`) > user (`~/.claude/skills/`)
### background-agent (49 files, ~10k LOC)
**Claude Code Compatibility Layer:**
5 loaders: agent-loader, command-loader, mcp-loader, plugin-loader, session-state
Core orchestration engine. `BackgroundManager` manages task lifecycle:
- States: pending → running → completed/error/cancelled/interrupt
- Concurrency: per-model/provider limits via `ConcurrencyManager` (FIFO queue)
- Polling: 3s interval, completion via idle events + stability detection (10s unchanged)
- spawner/: 8 focused files composing via `SpawnerContext` interface
**SKILL.md Format:**
```yaml
---
name: my-skill
description: "..."
model: "claude-opus-4-6" # optional
agent: "sisyphus" # optional
mcp: # optional embedded MCPs
server-name:
type: http
url: https://...
---
# Skill instruction content
```
### opencode-skill-loader (25 files, ~3.2k LOC)
## HOW TO ADD
4-scope skill discovery (project > opencode > user > global):
- YAML frontmatter parsing from SKILL.md files
- Skill merger with priority deduplication
- Template resolution with variable substitution
- Provider gating for model-specific skills
1. Create directory under `src/features/`
2. Add `index.ts`, `types.ts`, `constants.ts` as needed
3. Export from `index.ts` following barrel pattern
4. Register in main plugin if plugin-level feature
### tmux-subagent (27 files, ~3.6k LOC)
## CHILD DOCUMENTATION
State-first tmux integration:
- `TmuxSessionManager`: pane lifecycle, grid planning
- Spawn action decider + target finder
- Polling manager for session health
- Event handlers for pane creation/destruction
- See `claude-tasks/AGENTS.md` for task schema and storage details
### builtin-skills (6 skill objects)
| Skill | Size | MCP | Tools |
|-------|------|-----|-------|
| git-master | 1111 LOC | — | Bash |
| playwright | 312 LOC | @playwright/mcp | — |
| agent-browser | (in playwright.ts) | — | Bash(agent-browser:*) |
| playwright-cli | 268 LOC | — | Bash(playwright-cli:*) |
| dev-browser | 221 LOC | — | Bash |
| frontend-ui-ux | 79 LOC | — | — |
Browser variant selected by `browserProvider` config: playwright (default) | playwright-cli | agent-browser.

View File

@@ -1,168 +0,0 @@
import { log } from "../../shared"
import type { BackgroundTask } from "./types"
import { cleanupTaskAfterSessionEnds } from "./session-task-cleanup"
import { handleSessionIdleBackgroundEvent } from "./session-idle-event-handler"
type Event = { type: string; properties?: Record<string, unknown> }
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null
}
function getString(obj: Record<string, unknown>, key: string): string | undefined {
const value = obj[key]
return typeof value === "string" ? value : undefined
}
export function handleBackgroundEvent(args: {
event: Event
findBySession: (sessionID: string) => BackgroundTask | undefined
getAllDescendantTasks: (sessionID: string) => BackgroundTask[]
releaseConcurrencyKey?: (key: string) => void
cancelTask: (
taskId: string,
options: { source: string; reason: string; skipNotification: true }
) => Promise<boolean>
tryCompleteTask: (task: BackgroundTask, source: string) => Promise<boolean>
validateSessionHasOutput: (sessionID: string) => Promise<boolean>
checkSessionTodos: (sessionID: string) => Promise<boolean>
idleDeferralTimers: Map<string, ReturnType<typeof setTimeout>>
completionTimers: Map<string, ReturnType<typeof setTimeout>>
tasks: Map<string, BackgroundTask>
cleanupPendingByParent: (task: BackgroundTask) => void
clearNotificationsForTask: (taskId: string) => void
emitIdleEvent: (sessionID: string) => void
}): void {
const {
event,
findBySession,
getAllDescendantTasks,
releaseConcurrencyKey,
cancelTask,
tryCompleteTask,
validateSessionHasOutput,
checkSessionTodos,
idleDeferralTimers,
completionTimers,
tasks,
cleanupPendingByParent,
clearNotificationsForTask,
emitIdleEvent,
} = args
const props = event.properties
if (event.type === "message.part.updated" || event.type === "message.part.delta") {
if (!props || !isRecord(props)) return
const sessionID = getString(props, "sessionID")
if (!sessionID) return
const task = findBySession(sessionID)
if (!task) return
const existingTimer = idleDeferralTimers.get(task.id)
if (existingTimer) {
clearTimeout(existingTimer)
idleDeferralTimers.delete(task.id)
}
const type = getString(props, "type")
const tool = getString(props, "tool")
if (!task.progress) {
task.progress = { toolCalls: 0, lastUpdate: new Date() }
}
task.progress.lastUpdate = new Date()
if (type === "tool" || tool) {
task.progress.toolCalls += 1
task.progress.lastTool = tool
}
}
if (event.type === "session.idle") {
if (!props || !isRecord(props)) return
handleSessionIdleBackgroundEvent({
properties: props,
findBySession,
idleDeferralTimers,
validateSessionHasOutput,
checkSessionTodos,
tryCompleteTask,
emitIdleEvent,
})
}
if (event.type === "session.error") {
if (!props || !isRecord(props)) return
const sessionID = getString(props, "sessionID")
if (!sessionID) return
const task = findBySession(sessionID)
if (!task || task.status !== "running") return
const errorRaw = props["error"]
const dataRaw = isRecord(errorRaw) ? errorRaw["data"] : undefined
const message =
(isRecord(dataRaw) ? getString(dataRaw, "message") : undefined) ??
(isRecord(errorRaw) ? getString(errorRaw, "message") : undefined) ??
"Session error"
task.status = "error"
task.error = message
task.completedAt = new Date()
cleanupTaskAfterSessionEnds({
task,
tasks,
idleDeferralTimers,
completionTimers,
cleanupPendingByParent,
clearNotificationsForTask,
releaseConcurrencyKey,
})
}
if (event.type === "session.deleted") {
if (!props || !isRecord(props)) return
const infoRaw = props["info"]
if (!isRecord(infoRaw)) return
const sessionID = getString(infoRaw, "id")
if (!sessionID) return
const tasksToCancel = new Map<string, BackgroundTask>()
const directTask = findBySession(sessionID)
if (directTask) {
tasksToCancel.set(directTask.id, directTask)
}
for (const descendant of getAllDescendantTasks(sessionID)) {
tasksToCancel.set(descendant.id, descendant)
}
if (tasksToCancel.size === 0) return
for (const task of tasksToCancel.values()) {
if (task.status === "running" || task.status === "pending") {
void cancelTask(task.id, {
source: "session.deleted",
reason: "Session deleted",
skipNotification: true,
}).catch((err) => {
log("[background-agent] Failed to cancel task on session.deleted:", {
taskId: task.id,
error: err,
})
})
}
cleanupTaskAfterSessionEnds({
task,
tasks,
idleDeferralTimers,
completionTimers,
cleanupPendingByParent,
clearNotificationsForTask,
releaseConcurrencyKey,
})
}
}
}

View File

@@ -1,82 +0,0 @@
import { log } from "../../shared"
import type { BackgroundTask, LaunchInput } from "./types"
import type { ConcurrencyManager } from "./concurrency"
import type { PluginInput } from "@opencode-ai/plugin"
type QueueItem = { task: BackgroundTask; input: LaunchInput }
export function shutdownBackgroundManager(args: {
shutdownTriggered: { value: boolean }
stopPolling: () => void
tasks: Map<string, BackgroundTask>
client: PluginInput["client"]
onShutdown?: () => void
concurrencyManager: ConcurrencyManager
completionTimers: Map<string, ReturnType<typeof setTimeout>>
idleDeferralTimers: Map<string, ReturnType<typeof setTimeout>>
notifications: Map<string, BackgroundTask[]>
pendingByParent: Map<string, Set<string>>
queuesByKey: Map<string, QueueItem[]>
processingKeys: Set<string>
unregisterProcessCleanup: () => void
}): void {
const {
shutdownTriggered,
stopPolling,
tasks,
client,
onShutdown,
concurrencyManager,
completionTimers,
idleDeferralTimers,
notifications,
pendingByParent,
queuesByKey,
processingKeys,
unregisterProcessCleanup,
} = args
if (shutdownTriggered.value) return
shutdownTriggered.value = true
log("[background-agent] Shutting down BackgroundManager")
stopPolling()
for (const task of tasks.values()) {
if (task.status === "running" && task.sessionID) {
client.session.abort({ path: { id: task.sessionID } }).catch(() => {})
}
}
if (onShutdown) {
try {
onShutdown()
} catch (error) {
log("[background-agent] Error in onShutdown callback:", error)
}
}
for (const task of tasks.values()) {
if (task.concurrencyKey) {
concurrencyManager.release(task.concurrencyKey)
task.concurrencyKey = undefined
}
}
for (const timer of completionTimers.values()) clearTimeout(timer)
completionTimers.clear()
for (const timer of idleDeferralTimers.values()) clearTimeout(timer)
idleDeferralTimers.clear()
concurrencyManager.clear()
tasks.clear()
notifications.clear()
pendingByParent.clear()
queuesByKey.clear()
processingKeys.clear()
unregisterProcessCleanup()
log("[background-agent] Shutdown complete")
}

View File

@@ -33,10 +33,10 @@ export interface BackgroundEvent {
}
export interface Todo {
content: string
status: string
priority: string
id: string
content: string;
status: string;
priority: string;
id?: string;
}
export interface QueueItem {

View File

@@ -0,0 +1,53 @@
import { describe, test, expect } from "bun:test"
import { tmpdir } from "node:os"
import type { PluginInput } from "@opencode-ai/plugin"
import { BackgroundManager } from "./manager"
function createManagerWithStatus(statusImpl: () => Promise<{ data: Record<string, { type: string }> }>): BackgroundManager {
const client = {
session: {
status: statusImpl,
prompt: async () => ({}),
promptAsync: async () => ({}),
abort: async () => ({}),
todo: async () => ({ data: [] }),
messages: async () => ({ data: [] }),
},
}
return new BackgroundManager({ client, directory: tmpdir() } as unknown as PluginInput)
}
describe("BackgroundManager polling overlap", () => {
test("skips overlapping pollRunningTasks executions", async () => {
//#given
let activeCalls = 0
let maxActiveCalls = 0
let statusCallCount = 0
let releaseStatus: (() => void) | undefined
const statusGate = new Promise<void>((resolve) => {
releaseStatus = resolve
})
const manager = createManagerWithStatus(async () => {
statusCallCount += 1
activeCalls += 1
maxActiveCalls = Math.max(maxActiveCalls, activeCalls)
await statusGate
activeCalls -= 1
return { data: {} }
})
//#when
const firstPoll = (manager as unknown as { pollRunningTasks: () => Promise<void> }).pollRunningTasks()
await Promise.resolve()
const secondPoll = (manager as unknown as { pollRunningTasks: () => Promise<void> }).pollRunningTasks()
releaseStatus?.()
await Promise.all([firstPoll, secondPoll])
manager.shutdown()
//#then
expect(maxActiveCalls).toBe(1)
expect(statusCallCount).toBe(1)
})
})

View File

@@ -6,6 +6,7 @@ import type { BackgroundTask, ResumeInput } from "./types"
import { MIN_IDLE_TIME_MS } from "./constants"
import { BackgroundManager } from "./manager"
import { ConcurrencyManager } from "./concurrency"
import { initTaskToastManager, _resetTaskToastManagerForTesting } from "../task-toast-manager/manager"
const TASK_TTL_MS = 30 * 60 * 1000
@@ -190,6 +191,10 @@ function getPendingByParent(manager: BackgroundManager): Map<string, Set<string>
return (manager as unknown as { pendingByParent: Map<string, Set<string>> }).pendingByParent
}
function getCompletionTimers(manager: BackgroundManager): Map<string, ReturnType<typeof setTimeout>> {
return (manager as unknown as { completionTimers: Map<string, ReturnType<typeof setTimeout>> }).completionTimers
}
function getQueuesByKey(
manager: BackgroundManager
): Map<string, Array<{ task: BackgroundTask; input: import("./types").LaunchInput }>> {
@@ -215,6 +220,23 @@ function stubNotifyParentSession(manager: BackgroundManager): void {
;(manager as unknown as { notifyParentSession: () => Promise<void> }).notifyParentSession = async () => {}
}
function createToastRemoveTaskTracker(): { removeTaskCalls: string[]; resetToastManager: () => void } {
_resetTaskToastManagerForTesting()
const toastManager = initTaskToastManager({
tui: { showToast: async () => {} },
} as unknown as PluginInput["client"])
const removeTaskCalls: string[] = []
const originalRemoveTask = toastManager.removeTask.bind(toastManager)
toastManager.removeTask = (taskId: string): void => {
removeTaskCalls.push(taskId)
originalRemoveTask(taskId)
}
return {
removeTaskCalls,
resetToastManager: _resetTaskToastManagerForTesting,
}
}
function getCleanupSignals(): Array<NodeJS.Signals | "beforeExit" | "exit"> {
const signals: Array<NodeJS.Signals | "beforeExit" | "exit"> = ["SIGINT", "SIGTERM", "beforeExit", "exit"]
if (process.platform === "win32") {
@@ -783,6 +805,62 @@ interface CurrentMessage {
}
describe("BackgroundManager.notifyParentSession - dynamic message lookup", () => {
test("should skip compaction agent and use nearest non-compaction message", async () => {
//#given
let capturedBody: Record<string, unknown> | undefined
const client = {
session: {
prompt: async () => ({}),
promptAsync: async (args: { body: Record<string, unknown> }) => {
capturedBody = args.body
return {}
},
abort: async () => ({}),
messages: async () => ({
data: [
{
info: {
agent: "sisyphus",
model: { providerID: "anthropic", modelID: "claude-opus-4-6" },
},
},
{
info: {
agent: "compaction",
model: { providerID: "anthropic", modelID: "claude-sonnet-4-5" },
},
},
],
}),
},
}
const manager = new BackgroundManager({ client, directory: tmpdir() } as unknown as PluginInput)
const task: BackgroundTask = {
id: "task-skip-compaction",
sessionID: "session-child",
parentSessionID: "session-parent",
parentMessageID: "msg-parent",
description: "task with compaction at tail",
prompt: "test",
agent: "explore",
status: "completed",
startedAt: new Date(),
completedAt: new Date(),
parentAgent: "fallback-agent",
}
getPendingByParent(manager).set("session-parent", new Set([task.id, "still-running"]))
//#when
await (manager as unknown as { notifyParentSession: (value: BackgroundTask) => Promise<void> })
.notifyParentSession(task)
//#then
expect(capturedBody?.agent).toBe("sisyphus")
expect(capturedBody?.model).toEqual({ providerID: "anthropic", modelID: "claude-opus-4-6" })
manager.shutdown()
})
test("should use currentMessage model/agent when available", async () => {
// given - currentMessage has model and agent
const task: BackgroundTask = {
@@ -894,7 +972,7 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
})
describe("BackgroundManager.notifyParentSession - aborted parent", () => {
test("should skip notification when parent session is aborted", async () => {
test("should fall back and still notify when parent session messages are aborted", async () => {
//#given
let promptCalled = false
const promptMock = async () => {
@@ -933,7 +1011,7 @@ describe("BackgroundManager.notifyParentSession - aborted parent", () => {
.notifyParentSession(task)
//#then
expect(promptCalled).toBe(false)
expect(promptCalled).toBe(true)
manager.shutdown()
})
@@ -981,6 +1059,52 @@ describe("BackgroundManager.notifyParentSession - aborted parent", () => {
})
})
describe("BackgroundManager.notifyParentSession - notifications toggle", () => {
test("should skip parent prompt injection when notifications are disabled", async () => {
//#given
let promptCalled = false
const promptMock = async () => {
promptCalled = true
return {}
}
const client = {
session: {
prompt: promptMock,
promptAsync: promptMock,
abort: async () => ({}),
messages: async () => ({ data: [] }),
},
}
const manager = new BackgroundManager(
{ client, directory: tmpdir() } as unknown as PluginInput,
undefined,
{ enableParentSessionNotifications: false },
)
const task: BackgroundTask = {
id: "task-no-parent-notification",
sessionID: "session-child",
parentSessionID: "session-parent",
parentMessageID: "msg-parent",
description: "task notifications disabled",
prompt: "test",
agent: "explore",
status: "completed",
startedAt: new Date(),
completedAt: new Date(),
}
getPendingByParent(manager).set("session-parent", new Set([task.id]))
//#when
await (manager as unknown as { notifyParentSession: (task: BackgroundTask) => Promise<void> })
.notifyParentSession(task)
//#then
expect(promptCalled).toBe(false)
manager.shutdown()
})
})
function buildNotificationPromptBody(
task: BackgroundTask,
currentMessage: CurrentMessage | null
@@ -1770,6 +1894,32 @@ describe("BackgroundManager - Non-blocking Queue Integration", () => {
const pendingSet = pendingByParent.get(task.parentSessionID)
expect(pendingSet?.has(task.id) ?? false).toBe(false)
})
test("should remove task from toast manager when notification is skipped", async () => {
//#given
const { removeTaskCalls, resetToastManager } = createToastRemoveTaskTracker()
const manager = createBackgroundManager()
const task = createMockTask({
id: "task-cancel-skip-notification",
sessionID: "session-cancel-skip-notification",
parentSessionID: "parent-cancel-skip-notification",
status: "running",
})
getTaskMap(manager).set(task.id, task)
//#when
const cancelled = await manager.cancelTask(task.id, {
source: "test",
skipNotification: true,
})
//#then
expect(cancelled).toBe(true)
expect(removeTaskCalls).toContain(task.id)
manager.shutdown()
resetToastManager()
})
})
describe("multiple keys process in parallel", () => {
@@ -2730,6 +2880,43 @@ describe("BackgroundManager.handleEvent - session.deleted cascade", () => {
manager.shutdown()
})
test("should remove tasks from toast manager when session is deleted", () => {
//#given
const { removeTaskCalls, resetToastManager } = createToastRemoveTaskTracker()
const manager = createBackgroundManager()
const parentSessionID = "session-parent-toast"
const childTask = createMockTask({
id: "task-child-toast",
sessionID: "session-child-toast",
parentSessionID,
status: "running",
})
const grandchildTask = createMockTask({
id: "task-grandchild-toast",
sessionID: "session-grandchild-toast",
parentSessionID: "session-child-toast",
status: "pending",
startedAt: undefined,
queuedAt: new Date(),
})
const taskMap = getTaskMap(manager)
taskMap.set(childTask.id, childTask)
taskMap.set(grandchildTask.id, grandchildTask)
//#when
manager.handleEvent({
type: "session.deleted",
properties: { info: { id: parentSessionID } },
})
//#then
expect(removeTaskCalls).toContain(childTask.id)
expect(removeTaskCalls).toContain(grandchildTask.id)
manager.shutdown()
resetToastManager()
})
})
describe("BackgroundManager.handleEvent - session.error", () => {
@@ -2777,6 +2964,35 @@ describe("BackgroundManager.handleEvent - session.error", () => {
manager.shutdown()
})
test("removes errored task from toast manager", () => {
//#given
const { removeTaskCalls, resetToastManager } = createToastRemoveTaskTracker()
const manager = createBackgroundManager()
const sessionID = "ses_error_toast"
const task = createMockTask({
id: "task-session-error-toast",
sessionID,
parentSessionID: "parent-session",
status: "running",
})
getTaskMap(manager).set(task.id, task)
//#when
manager.handleEvent({
type: "session.error",
properties: {
sessionID,
error: { name: "UnknownError", message: "boom" },
},
})
//#then
expect(removeTaskCalls).toContain(task.id)
manager.shutdown()
resetToastManager()
})
test("ignores session.error for non-running tasks", () => {
//#given
const manager = createBackgroundManager()
@@ -2922,13 +3138,32 @@ describe("BackgroundManager.pruneStaleTasksAndNotifications - removes pruned tas
manager.shutdown()
})
test("removes stale task from toast manager", () => {
//#given
const { removeTaskCalls, resetToastManager } = createToastRemoveTaskTracker()
const manager = createBackgroundManager()
const staleTask = createMockTask({
id: "task-stale-toast",
sessionID: "session-stale-toast",
parentSessionID: "parent-session",
status: "running",
startedAt: new Date(Date.now() - 31 * 60 * 1000),
})
getTaskMap(manager).set(staleTask.id, staleTask)
//#when
pruneStaleTasksAndNotificationsForTest(manager)
//#then
expect(removeTaskCalls).toContain(staleTask.id)
manager.shutdown()
resetToastManager()
})
})
describe("BackgroundManager.completionTimers - Memory Leak Fix", () => {
function getCompletionTimers(manager: BackgroundManager): Map<string, ReturnType<typeof setTimeout>> {
return (manager as unknown as { completionTimers: Map<string, ReturnType<typeof setTimeout>> }).completionTimers
}
function setCompletionTimer(manager: BackgroundManager, taskId: string): void {
const completionTimers = getCompletionTimers(manager)
const timer = setTimeout(() => {
@@ -3454,3 +3689,93 @@ describe("BackgroundManager.handleEvent - non-tool event lastUpdate", () => {
expect(task.status).toBe("running")
})
})
describe("BackgroundManager regression fixes - resume and aborted notification", () => {
test("should keep resumed task in memory after previous completion timer deadline", async () => {
//#given
const client = {
session: {
prompt: async () => ({}),
promptAsync: async () => ({}),
abort: async () => ({}),
},
}
const manager = new BackgroundManager({ client, directory: tmpdir() } as unknown as PluginInput)
const task: BackgroundTask = {
id: "task-resume-timer-regression",
sessionID: "session-resume-timer-regression",
parentSessionID: "parent-session",
parentMessageID: "msg-1",
description: "resume timer regression",
prompt: "test",
agent: "explore",
status: "completed",
startedAt: new Date(),
completedAt: new Date(),
concurrencyGroup: "explore",
}
getTaskMap(manager).set(task.id, task)
const completionTimers = getCompletionTimers(manager)
const timer = setTimeout(() => {
completionTimers.delete(task.id)
getTaskMap(manager).delete(task.id)
}, 25)
completionTimers.set(task.id, timer)
//#when
await manager.resume({
sessionId: "session-resume-timer-regression",
prompt: "resume task",
parentSessionID: "parent-session-2",
parentMessageID: "msg-2",
})
await new Promise((resolve) => setTimeout(resolve, 60))
//#then
expect(getTaskMap(manager).has(task.id)).toBe(true)
expect(completionTimers.has(task.id)).toBe(false)
manager.shutdown()
})
test("should start cleanup timer even when promptAsync aborts", async () => {
//#given
const client = {
session: {
prompt: async () => ({}),
promptAsync: async () => {
const error = new Error("User aborted")
error.name = "MessageAbortedError"
throw error
},
abort: async () => ({}),
messages: async () => ({ data: [] }),
},
}
const manager = new BackgroundManager({ client, directory: tmpdir() } as unknown as PluginInput)
const task: BackgroundTask = {
id: "task-aborted-cleanup-regression",
sessionID: "session-aborted-cleanup-regression",
parentSessionID: "parent-session",
parentMessageID: "msg-1",
description: "aborted prompt cleanup regression",
prompt: "test",
agent: "explore",
status: "completed",
startedAt: new Date(),
completedAt: new Date(),
}
getTaskMap(manager).set(task.id, task)
getPendingByParent(manager).set(task.parentSessionID, new Set([task.id]))
//#when
await (manager as unknown as { notifyParentSession: (task: BackgroundTask) => Promise<void> }).notifyParentSession(task)
//#then
expect(getCompletionTimers(manager).has(task.id)).toBe(true)
manager.shutdown()
})
})

View File

@@ -6,7 +6,7 @@ import type {
ResumeInput,
} from "./types"
import { TaskHistory } from "./task-history"
import { log, getAgentToolRestrictions, promptWithModelSuggestionRetry } from "../../shared"
import { log, getAgentToolRestrictions, normalizeSDKResponse, promptWithModelSuggestionRetry } from "../../shared"
import { setSessionTools } from "../../shared/session-tools-store"
import { ConcurrencyManager } from "./concurrency"
import type { BackgroundTaskConfig, TmuxConfig } from "../../config/schema"
@@ -16,7 +16,6 @@ import {
DEFAULT_STALE_TIMEOUT_MS,
MIN_IDLE_TIME_MS,
MIN_RUNTIME_BEFORE_STALE_MS,
MIN_STABILITY_TIME_MS,
POLLING_INTERVAL_MS,
TASK_CLEANUP_DELAY_MS,
TASK_TTL_MS,
@@ -24,8 +23,8 @@ import {
import { subagentSessions } from "../claude-code-session-state"
import { getTaskToastManager } from "../task-toast-manager"
import { findNearestMessageWithFields, MESSAGE_STORAGE } from "../hook-message-injector"
import { existsSync, readdirSync } from "node:fs"
import { MESSAGE_STORAGE, type StoredMessage } from "../hook-message-injector"
import { existsSync, readFileSync, readdirSync } from "node:fs"
import { join } from "node:path"
type ProcessCleanupEvent = NodeJS.Signals | "beforeExit" | "exit"
@@ -81,6 +80,7 @@ export class BackgroundManager {
private client: OpencodeClient
private directory: string
private pollingInterval?: ReturnType<typeof setInterval>
private pollingInFlight = false
private concurrencyManager: ConcurrencyManager
private shutdownTriggered = false
private config?: BackgroundTaskConfig
@@ -93,6 +93,7 @@ export class BackgroundManager {
private completionTimers: Map<string, ReturnType<typeof setTimeout>> = new Map()
private idleDeferralTimers: Map<string, ReturnType<typeof setTimeout>> = new Map()
private notificationQueueByParent: Map<string, Promise<void>> = new Map()
private enableParentSessionNotifications: boolean
readonly taskHistory = new TaskHistory()
constructor(
@@ -102,6 +103,7 @@ export class BackgroundManager {
tmuxConfig?: TmuxConfig
onSubagentSessionCreated?: OnSubagentSessionCreated
onShutdown?: () => void
enableParentSessionNotifications?: boolean
}
) {
this.tasks = new Map()
@@ -114,6 +116,7 @@ export class BackgroundManager {
this.tmuxEnabled = options?.tmuxConfig?.enabled ?? false
this.onSubagentSessionCreated = options?.onSubagentSessionCreated
this.onShutdown = options?.onShutdown
this.enableParentSessionNotifications = options?.enableParentSessionNotifications ?? true
this.registerProcessCleanup()
}
@@ -528,6 +531,12 @@ export class BackgroundManager {
return existingTask
}
const completionTimer = this.completionTimers.get(existingTask.id)
if (completionTimer) {
clearTimeout(completionTimer)
this.completionTimers.delete(existingTask.id)
}
// Re-acquire concurrency using the persisted concurrency group
const concurrencyKey = existingTask.concurrencyGroup ?? existingTask.agent
await this.concurrencyManager.acquire(concurrencyKey)
@@ -645,7 +654,7 @@ export class BackgroundManager {
const response = await this.client.session.todo({
path: { id: sessionID },
})
const todos = (response.data ?? response) as Todo[]
const todos = normalizeSDKResponse(response, [] as Todo[], { preferResponseOnMissingData: true })
if (!todos || todos.length === 0) return false
const incomplete = todos.filter(
@@ -783,6 +792,10 @@ export class BackgroundManager {
this.cleanupPendingByParent(task)
this.tasks.delete(task.id)
this.clearNotificationsForTask(task.id)
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.removeTask(task.id)
}
if (task.sessionID) {
subagentSessions.delete(task.sessionID)
}
@@ -830,6 +843,10 @@ export class BackgroundManager {
this.cleanupPendingByParent(task)
this.tasks.delete(task.id)
this.clearNotificationsForTask(task.id)
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.removeTask(task.id)
}
if (task.sessionID) {
subagentSessions.delete(task.sessionID)
}
@@ -861,7 +878,7 @@ export class BackgroundManager {
path: { id: sessionID },
})
const messages = response.data ?? []
const messages = normalizeSDKResponse(response, [] as Array<{ info?: { role?: string } }>, { preferResponseOnMissingData: true })
// Check for at least one assistant or tool message
const hasAssistantOrToolMessage = messages.some(
@@ -1000,6 +1017,10 @@ export class BackgroundManager {
}
if (options?.skipNotification) {
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.removeTask(task.id)
}
log(`[background-agent] Task cancelled via ${source} (notification skipped):`, task.id)
return true
}
@@ -1186,19 +1207,21 @@ export class BackgroundManager {
allComplete = true
}
const completedTasks = allComplete
? Array.from(this.tasks.values())
.filter(t => t.parentSessionID === task.parentSessionID && t.status !== "running" && t.status !== "pending")
: []
const statusText = task.status === "completed" ? "COMPLETED" : task.status === "interrupt" ? "INTERRUPTED" : "CANCELLED"
const errorInfo = task.error ? `\n**Error:** ${task.error}` : ""
let notification: string
let completedTasks: BackgroundTask[] = []
if (allComplete) {
completedTasks = Array.from(this.tasks.values())
.filter(t => t.parentSessionID === task.parentSessionID && t.status !== "running" && t.status !== "pending")
const completedTasksText = completedTasks
.map(t => `- \`${t.id}\`: ${t.description}`)
.join("\n")
notification = `<system-reminder>
let notification: string
if (allComplete) {
const completedTasksText = completedTasks
.map(t => `- \`${t.id}\`: ${t.description}`)
.join("\n")
notification = `<system-reminder>
[ALL BACKGROUND TASKS COMPLETE]
**Completed:**
@@ -1221,70 +1244,79 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
</system-reminder>`
}
let agent: string | undefined = task.parentAgent
let model: { providerID: string; modelID: string } | undefined
let agent: string | undefined = task.parentAgent
let model: { providerID: string; modelID: string } | undefined
try {
const messagesResp = await this.client.session.messages({ path: { id: task.parentSessionID } })
const messages = (messagesResp.data ?? []) as Array<{
info?: { agent?: string; model?: { providerID: string; modelID: string }; modelID?: string; providerID?: string }
}>
for (let i = messages.length - 1; i >= 0; i--) {
const info = messages[i].info
if (info?.agent || info?.model || (info?.modelID && info?.providerID)) {
agent = info.agent ?? task.parentAgent
model = info.model ?? (info.providerID && info.modelID ? { providerID: info.providerID, modelID: info.modelID } : undefined)
break
if (this.enableParentSessionNotifications) {
try {
const messagesResp = await this.client.session.messages({ path: { id: task.parentSessionID } })
const messages = normalizeSDKResponse(messagesResp, [] as Array<{
info?: { agent?: string; model?: { providerID: string; modelID: string }; modelID?: string; providerID?: string }
}>)
for (let i = messages.length - 1; i >= 0; i--) {
const info = messages[i].info
if (isCompactionAgent(info?.agent)) {
continue
}
if (info?.agent || info?.model || (info?.modelID && info?.providerID)) {
agent = info.agent ?? task.parentAgent
model = info.model ?? (info.providerID && info.modelID ? { providerID: info.providerID, modelID: info.modelID } : undefined)
break
}
}
} catch (error) {
if (this.isAbortedSessionError(error)) {
log("[background-agent] Parent session aborted while loading messages; using messageDir fallback:", {
taskId: task.id,
parentSessionID: task.parentSessionID,
})
}
const messageDir = getMessageDir(task.parentSessionID)
const currentMessage = messageDir ? findNearestMessageExcludingCompaction(messageDir) : null
agent = currentMessage?.agent ?? task.parentAgent
model = currentMessage?.model?.providerID && currentMessage?.model?.modelID
? { providerID: currentMessage.model.providerID, modelID: currentMessage.model.modelID }
: undefined
}
}
} catch (error) {
if (this.isAbortedSessionError(error)) {
log("[background-agent] Parent session aborted, skipping notification:", {
log("[background-agent] notifyParentSession context:", {
taskId: task.id,
resolvedAgent: agent,
resolvedModel: model,
})
try {
await this.client.session.promptAsync({
path: { id: task.parentSessionID },
body: {
noReply: !allComplete,
...(agent !== undefined ? { agent } : {}),
...(model !== undefined ? { model } : {}),
...(task.parentTools ? { tools: task.parentTools } : {}),
parts: [{ type: "text", text: notification }],
},
})
log("[background-agent] Sent notification to parent session:", {
taskId: task.id,
allComplete,
noReply: !allComplete,
})
} catch (error) {
if (this.isAbortedSessionError(error)) {
log("[background-agent] Parent session aborted while sending notification; continuing cleanup:", {
taskId: task.id,
parentSessionID: task.parentSessionID,
})
} else {
log("[background-agent] Failed to send notification:", error)
}
}
} else {
log("[background-agent] Parent session notifications disabled, skipping prompt injection:", {
taskId: task.id,
parentSessionID: task.parentSessionID,
})
return
}
const messageDir = getMessageDir(task.parentSessionID)
const currentMessage = messageDir ? findNearestMessageWithFields(messageDir) : null
agent = currentMessage?.agent ?? task.parentAgent
model = currentMessage?.model?.providerID && currentMessage?.model?.modelID
? { providerID: currentMessage.model.providerID, modelID: currentMessage.model.modelID }
: undefined
}
log("[background-agent] notifyParentSession context:", {
taskId: task.id,
resolvedAgent: agent,
resolvedModel: model,
})
try {
await this.client.session.promptAsync({
path: { id: task.parentSessionID },
body: {
noReply: !allComplete,
...(agent !== undefined ? { agent } : {}),
...(model !== undefined ? { model } : {}),
...(task.parentTools ? { tools: task.parentTools } : {}),
parts: [{ type: "text", text: notification }],
},
})
log("[background-agent] Sent notification to parent session:", {
taskId: task.id,
allComplete,
noReply: !allComplete,
})
} catch (error) {
if (this.isAbortedSessionError(error)) {
log("[background-agent] Parent session aborted, skipping notification:", {
taskId: task.id,
parentSessionID: task.parentSessionID,
})
return
}
log("[background-agent] Failed to send notification:", error)
}
if (allComplete) {
for (const completedTask of completedTasks) {
@@ -1413,6 +1445,10 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
}
}
this.clearNotificationsForTask(taskId)
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.removeTask(taskId)
}
this.tasks.delete(taskId)
if (task.sessionID) {
subagentSessions.delete(task.sessionID)
@@ -1511,10 +1547,13 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
}
private async pollRunningTasks(): Promise<void> {
if (this.pollingInFlight) return
this.pollingInFlight = true
try {
this.pruneStaleTasksAndNotifications()
const statusResult = await this.client.session.status()
const allStatuses = (statusResult.data ?? {}) as Record<string, { type: string }>
const allStatuses = normalizeSDKResponse(statusResult, {} as Record<string, { type: string }>)
await this.checkAndInterruptStaleTasks(allStatuses)
@@ -1566,6 +1605,9 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
if (!this.hasRunningTasks()) {
this.stopPolling()
}
} finally {
this.pollingInFlight = false
}
}
/**
@@ -1683,3 +1725,57 @@ function getMessageDir(sessionID: string): string | null {
}
return null
}
function isCompactionAgent(agent: string | undefined): boolean {
return agent?.trim().toLowerCase() === "compaction"
}
function hasFullAgentAndModel(message: StoredMessage): boolean {
return !!message.agent &&
!isCompactionAgent(message.agent) &&
!!message.model?.providerID &&
!!message.model?.modelID
}
function hasPartialAgentOrModel(message: StoredMessage): boolean {
const hasAgent = !!message.agent && !isCompactionAgent(message.agent)
const hasModel = !!message.model?.providerID && !!message.model?.modelID
return hasAgent || hasModel
}
function findNearestMessageExcludingCompaction(messageDir: string): StoredMessage | null {
try {
const files = readdirSync(messageDir)
.filter((name) => name.endsWith(".json"))
.sort()
.reverse()
for (const file of files) {
try {
const content = readFileSync(join(messageDir, file), "utf-8")
const parsed = JSON.parse(content) as StoredMessage
if (hasFullAgentAndModel(parsed)) {
return parsed
}
} catch {
continue
}
}
for (const file of files) {
try {
const content = readFileSync(join(messageDir, file), "utf-8")
const parsed = JSON.parse(content) as StoredMessage
if (hasPartialAgentOrModel(parsed)) {
return parsed
}
} catch {
continue
}
}
} catch {
return null
}
return null
}

View File

@@ -1 +1 @@
export { getMessageDir } from "./message-storage-locator"
export { getMessageDir } from "../../shared"

View File

@@ -1,17 +0,0 @@
import { existsSync, readdirSync } from "node:fs"
import { join } from "node:path"
import { MESSAGE_STORAGE } from "../hook-message-injector"
export function getMessageDir(sessionID: string): string | null {
if (!existsSync(MESSAGE_STORAGE)) return null
const directPath = join(MESSAGE_STORAGE, sessionID)
if (existsSync(directPath)) return directPath
for (const dir of readdirSync(MESSAGE_STORAGE)) {
const sessionPath = join(MESSAGE_STORAGE, dir, sessionID)
if (existsSync(sessionPath)) return sessionPath
}
return null
}

View File

@@ -1,52 +0,0 @@
import type { BackgroundTask } from "./types"
export function markForNotification(
notifications: Map<string, BackgroundTask[]>,
task: BackgroundTask
): void {
const queue = notifications.get(task.parentSessionID) ?? []
queue.push(task)
notifications.set(task.parentSessionID, queue)
}
export function getPendingNotifications(
notifications: Map<string, BackgroundTask[]>,
sessionID: string
): BackgroundTask[] {
return notifications.get(sessionID) ?? []
}
export function clearNotifications(
notifications: Map<string, BackgroundTask[]>,
sessionID: string
): void {
notifications.delete(sessionID)
}
export function clearNotificationsForTask(
notifications: Map<string, BackgroundTask[]>,
taskId: string
): void {
for (const [sessionID, tasks] of notifications.entries()) {
const filtered = tasks.filter((t) => t.id !== taskId)
if (filtered.length === 0) {
notifications.delete(sessionID)
} else {
notifications.set(sessionID, filtered)
}
}
}
export function cleanupPendingByParent(
pendingByParent: Map<string, Set<string>>,
task: BackgroundTask
): void {
if (!task.parentSessionID) return
const pending = pendingByParent.get(task.parentSessionID)
if (!pending) return
pending.delete(task.id)
if (pending.size === 0) {
pendingByParent.delete(task.parentSessionID)
}
}

View File

@@ -1,193 +0,0 @@
import { log } from "../../shared"
import { findNearestMessageWithFields } from "../hook-message-injector"
import { getTaskToastManager } from "../task-toast-manager"
import { TASK_CLEANUP_DELAY_MS } from "./constants"
import { formatDuration } from "./format-duration"
import { isAbortedSessionError } from "./error-classifier"
import { getMessageDir } from "./message-dir"
import { buildBackgroundTaskNotificationText } from "./notification-builder"
import type { BackgroundTask } from "./types"
import type { OpencodeClient } from "./opencode-client"
type AgentModel = { providerID: string; modelID: string }
type MessageInfo = {
agent?: string
model?: AgentModel
providerID?: string
modelID?: string
}
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null
}
function extractMessageInfo(message: unknown): MessageInfo {
if (!isRecord(message)) return {}
const info = message["info"]
if (!isRecord(info)) return {}
const agent = typeof info["agent"] === "string" ? info["agent"] : undefined
const modelObj = info["model"]
if (isRecord(modelObj)) {
const providerID = modelObj["providerID"]
const modelID = modelObj["modelID"]
if (typeof providerID === "string" && typeof modelID === "string") {
return { agent, model: { providerID, modelID } }
}
}
const providerID = info["providerID"]
const modelID = info["modelID"]
if (typeof providerID === "string" && typeof modelID === "string") {
return { agent, model: { providerID, modelID } }
}
return { agent }
}
export async function notifyParentSession(args: {
task: BackgroundTask
tasks: Map<string, BackgroundTask>
pendingByParent: Map<string, Set<string>>
completionTimers: Map<string, ReturnType<typeof setTimeout>>
clearNotificationsForTask: (taskId: string) => void
client: OpencodeClient
}): Promise<void> {
const { task, tasks, pendingByParent, completionTimers, clearNotificationsForTask, client } = args
const duration = formatDuration(task.startedAt ?? new Date(), task.completedAt)
log("[background-agent] notifyParentSession called for task:", task.id)
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.showCompletionToast({
id: task.id,
description: task.description,
duration,
})
}
const pendingSet = pendingByParent.get(task.parentSessionID)
if (pendingSet) {
pendingSet.delete(task.id)
if (pendingSet.size === 0) {
pendingByParent.delete(task.parentSessionID)
}
}
const allComplete = !pendingSet || pendingSet.size === 0
const remainingCount = pendingSet?.size ?? 0
const completedTasks = allComplete
? Array.from(tasks.values()).filter(
(t) =>
t.parentSessionID === task.parentSessionID &&
t.status !== "running" &&
t.status !== "pending"
)
: []
const notification = buildBackgroundTaskNotificationText({
task,
duration,
allComplete,
remainingCount,
completedTasks,
})
let agent: string | undefined = task.parentAgent
let model: AgentModel | undefined
try {
const messagesResp = await client.session.messages({
path: { id: task.parentSessionID },
})
const raw = (messagesResp as { data?: unknown }).data ?? []
const messages = Array.isArray(raw) ? raw : []
for (let i = messages.length - 1; i >= 0; i--) {
const extracted = extractMessageInfo(messages[i])
if (extracted.agent || extracted.model) {
agent = extracted.agent ?? task.parentAgent
model = extracted.model
break
}
}
} catch (error) {
if (isAbortedSessionError(error)) {
log("[background-agent] Parent session aborted, skipping notification:", {
taskId: task.id,
parentSessionID: task.parentSessionID,
})
return
}
const messageDir = getMessageDir(task.parentSessionID)
const currentMessage = messageDir ? findNearestMessageWithFields(messageDir) : null
agent = currentMessage?.agent ?? task.parentAgent
model =
currentMessage?.model?.providerID && currentMessage?.model?.modelID
? { providerID: currentMessage.model.providerID, modelID: currentMessage.model.modelID }
: undefined
}
log("[background-agent] notifyParentSession context:", {
taskId: task.id,
resolvedAgent: agent,
resolvedModel: model,
})
try {
await client.session.promptAsync({
path: { id: task.parentSessionID },
body: {
noReply: !allComplete,
...(agent !== undefined ? { agent } : {}),
...(model !== undefined ? { model } : {}),
...(task.parentTools ? { tools: task.parentTools } : {}),
parts: [{ type: "text", text: notification }],
},
})
log("[background-agent] Sent notification to parent session:", {
taskId: task.id,
allComplete,
noReply: !allComplete,
})
} catch (error) {
if (isAbortedSessionError(error)) {
log("[background-agent] Parent session aborted, skipping notification:", {
taskId: task.id,
parentSessionID: task.parentSessionID,
})
return
}
log("[background-agent] Failed to send notification:", error)
}
if (!allComplete) return
for (const completedTask of completedTasks) {
const taskId = completedTask.id
const existingTimer = completionTimers.get(taskId)
if (existingTimer) {
clearTimeout(existingTimer)
completionTimers.delete(taskId)
}
const timer = setTimeout(() => {
completionTimers.delete(taskId)
if (tasks.has(taskId)) {
clearNotificationsForTask(taskId)
tasks.delete(taskId)
log("[background-agent] Removed completed task from memory:", taskId)
}
}, TASK_CLEANUP_DELAY_MS)
completionTimers.set(taskId, timer)
}
}

View File

@@ -1,7 +1,7 @@
import type { OpencodeClient } from "./constants"
import type { BackgroundTask } from "./types"
import { findNearestMessageWithFields } from "../hook-message-injector"
import { getMessageDir } from "./message-storage-locator"
import { getMessageDir } from "../../shared"
type AgentModel = { providerID: string; modelID: string }

View File

@@ -1,182 +0,0 @@
import { log } from "../../shared"
import {
MIN_STABILITY_TIME_MS,
} from "./constants"
import type { BackgroundTask } from "./types"
import type { OpencodeClient } from "./opencode-client"
type SessionStatusMap = Record<string, { type: string }>
type MessagePart = {
type?: string
tool?: string
name?: string
text?: string
}
type SessionMessage = {
info?: { role?: string }
parts?: MessagePart[]
}
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null
}
function asSessionMessages(value: unknown): SessionMessage[] {
if (!Array.isArray(value)) return []
return value.filter(isRecord) as SessionMessage[]
}
export async function pollRunningTasks(args: {
tasks: Iterable<BackgroundTask>
client: OpencodeClient
pruneStaleTasksAndNotifications: () => void
checkAndInterruptStaleTasks: (statuses: Record<string, { type: string }>) => Promise<void>
validateSessionHasOutput: (sessionID: string) => Promise<boolean>
checkSessionTodos: (sessionID: string) => Promise<boolean>
tryCompleteTask: (task: BackgroundTask, source: string) => Promise<boolean>
hasRunningTasks: () => boolean
stopPolling: () => void
}): Promise<void> {
const {
tasks,
client,
pruneStaleTasksAndNotifications,
checkAndInterruptStaleTasks,
validateSessionHasOutput,
checkSessionTodos,
tryCompleteTask,
hasRunningTasks,
stopPolling,
} = args
pruneStaleTasksAndNotifications()
const statusResult = await client.session.status()
const allStatuses = ((statusResult as { data?: unknown }).data ?? {}) as SessionStatusMap
await checkAndInterruptStaleTasks(allStatuses)
for (const task of tasks) {
if (task.status !== "running") continue
const sessionID = task.sessionID
if (!sessionID) continue
try {
const sessionStatus = allStatuses[sessionID]
if (sessionStatus?.type === "idle") {
const hasValidOutput = await validateSessionHasOutput(sessionID)
if (!hasValidOutput) {
log("[background-agent] Polling idle but no valid output yet, waiting:", task.id)
continue
}
if (task.status !== "running") continue
const hasIncompleteTodos = await checkSessionTodos(sessionID)
if (hasIncompleteTodos) {
log("[background-agent] Task has incomplete todos via polling, waiting:", task.id)
continue
}
await tryCompleteTask(task, "polling (idle status)")
continue
}
const messagesResult = await client.session.messages({
path: { id: sessionID },
})
if ((messagesResult as { error?: unknown }).error) {
continue
}
const messagesPayload = Array.isArray(messagesResult)
? messagesResult
: (messagesResult as { data?: unknown }).data
const messages = asSessionMessages(messagesPayload)
const assistantMsgs = messages.filter((m) => m.info?.role === "assistant")
let toolCalls = 0
let lastTool: string | undefined
let lastMessage: string | undefined
for (const msg of assistantMsgs) {
const parts = msg.parts ?? []
for (const part of parts) {
if (part.type === "tool_use" || part.tool) {
toolCalls += 1
lastTool = part.tool || part.name || "unknown"
}
if (part.type === "text" && part.text) {
lastMessage = part.text
}
}
}
if (!task.progress) {
task.progress = { toolCalls: 0, lastUpdate: new Date() }
}
task.progress.toolCalls = toolCalls
task.progress.lastTool = lastTool
task.progress.lastUpdate = new Date()
if (lastMessage) {
task.progress.lastMessage = lastMessage
task.progress.lastMessageAt = new Date()
}
const currentMsgCount = messages.length
const startedAt = task.startedAt
if (!startedAt) continue
const elapsedMs = Date.now() - startedAt.getTime()
if (elapsedMs >= MIN_STABILITY_TIME_MS) {
if (task.lastMsgCount === currentMsgCount) {
task.stablePolls = (task.stablePolls ?? 0) + 1
if (task.stablePolls >= 3) {
const recheckStatus = await client.session.status()
const recheckData = ((recheckStatus as { data?: unknown }).data ?? {}) as SessionStatusMap
const currentStatus = recheckData[sessionID]
if (currentStatus?.type !== "idle") {
log("[background-agent] Stability reached but session not idle, resetting:", {
taskId: task.id,
sessionStatus: currentStatus?.type ?? "not_in_status",
})
task.stablePolls = 0
continue
}
const hasValidOutput = await validateSessionHasOutput(sessionID)
if (!hasValidOutput) {
log("[background-agent] Stability reached but no valid output, waiting:", task.id)
continue
}
if (task.status !== "running") continue
const hasIncompleteTodos = await checkSessionTodos(sessionID)
if (!hasIncompleteTodos) {
await tryCompleteTask(task, "stability detection")
continue
}
}
} else {
task.stablePolls = 0
}
}
task.lastMsgCount = currentMsgCount
} catch (error) {
log("[background-agent] Poll error for task:", { taskId: task.id, error })
}
}
if (!hasRunningTasks()) {
stopPolling()
}
}

View File

@@ -1,19 +0,0 @@
export type ProcessCleanupEvent = NodeJS.Signals | "beforeExit" | "exit"
export function registerProcessSignal(
signal: ProcessCleanupEvent,
handler: () => void,
exitAfter: boolean
): () => void {
const listener = () => {
handler()
if (exitAfter) {
// Set exitCode and schedule exit after delay to allow other handlers to complete async cleanup
// Use 6s delay to accommodate LSP cleanup (5s timeout + 1s SIGKILL wait)
process.exitCode = 0
setTimeout(() => process.exit(), 6000)
}
}
process.on(signal, listener)
return listener
}

View File

@@ -1,6 +1,6 @@
export type { ResultHandlerContext } from "./result-handler-context"
export { formatDuration } from "./duration-formatter"
export { getMessageDir } from "./message-storage-locator"
export { getMessageDir } from "../../shared"
export { checkSessionTodos } from "./session-todo-checker"
export { validateSessionHasOutput } from "./session-output-validator"
export { tryCompleteTask } from "./background-task-completer"

View File

@@ -4,7 +4,7 @@ function isTodo(value: unknown): value is Todo {
if (typeof value !== "object" || value === null) return false
const todo = value as Record<string, unknown>
return (
typeof todo["id"] === "string" &&
(typeof todo["id"] === "string" || todo["id"] === undefined) &&
typeof todo["content"] === "string" &&
typeof todo["status"] === "string" &&
typeof todo["priority"] === "string"

View File

@@ -1,111 +0,0 @@
import { log } from "../../shared"
import type { OpencodeClient } from "./opencode-client"
type Todo = {
content: string
status: string
priority: string
id: string
}
type SessionMessage = {
info?: { role?: string }
parts?: unknown
}
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null
}
function asSessionMessages(value: unknown): SessionMessage[] {
if (!Array.isArray(value)) return []
return value as SessionMessage[]
}
function asParts(value: unknown): Array<Record<string, unknown>> {
if (!Array.isArray(value)) return []
return value.filter(isRecord)
}
function hasNonEmptyText(value: unknown): boolean {
return typeof value === "string" && value.trim().length > 0
}
function isToolResultContentNonEmpty(content: unknown): boolean {
if (typeof content === "string") return content.trim().length > 0
if (Array.isArray(content)) return content.length > 0
return false
}
/**
* Validates that a session has actual assistant/tool output before marking complete.
* Prevents premature completion when session.idle fires before agent responds.
*/
export async function validateSessionHasOutput(
client: OpencodeClient,
sessionID: string
): Promise<boolean> {
try {
const response = await client.session.messages({
path: { id: sessionID },
})
const messages = asSessionMessages((response as { data?: unknown }).data ?? response)
const hasAssistantOrToolMessage = messages.some(
(m) => m.info?.role === "assistant" || m.info?.role === "tool"
)
if (!hasAssistantOrToolMessage) {
log("[background-agent] No assistant/tool messages found in session:", sessionID)
return false
}
const hasContent = messages.some((m) => {
if (m.info?.role !== "assistant" && m.info?.role !== "tool") return false
const parts = asParts(m.parts)
return parts.some((part) => {
const type = part.type
if (type === "tool") return true
if (type === "text" && hasNonEmptyText(part.text)) return true
if (type === "reasoning" && hasNonEmptyText(part.text)) return true
if (type === "tool_result" && isToolResultContentNonEmpty(part.content)) return true
return false
})
})
if (!hasContent) {
log("[background-agent] Messages exist but no content found in session:", sessionID)
return false
}
return true
} catch (error) {
log("[background-agent] Error validating session output:", error)
// On error, allow completion to proceed (don't block indefinitely)
return true
}
}
export async function checkSessionTodos(
client: OpencodeClient,
sessionID: string
): Promise<boolean> {
try {
const response = await client.session.todo({
path: { id: sessionID },
})
const raw = (response as { data?: unknown }).data ?? response
const todos = Array.isArray(raw) ? (raw as Todo[]) : []
if (todos.length === 0) return false
const incomplete = todos.filter(
(t) => t.status !== "completed" && t.status !== "cancelled"
)
return incomplete.length > 0
} catch {
return false
}
}

View File

@@ -0,0 +1,33 @@
import { describe, expect, test } from "bun:test"
import { resolveParentDirectory } from "./parent-directory-resolver"
describe("background-agent parent-directory-resolver", () => {
const originalPlatform = process.platform
test("uses current working directory on Windows when parent session directory is AppData", async () => {
//#given
Object.defineProperty(process, "platform", { value: "win32" })
try {
const client = {
session: {
get: async () => ({
data: { directory: "C:\\Users\\test\\AppData\\Local\\ai.opencode.desktop" },
}),
},
}
//#when
const result = await resolveParentDirectory({
client: client as Parameters<typeof resolveParentDirectory>[0]["client"],
parentSessionID: "ses_parent",
defaultDirectory: "C:\\Users\\test\\AppData\\Roaming\\opencode",
})
//#then
expect(result).toBe(process.cwd())
} finally {
Object.defineProperty(process, "platform", { value: originalPlatform })
}
})
})

View File

@@ -1,5 +1,5 @@
import type { OpencodeClient } from "../constants"
import { log } from "../../../shared"
import { log, resolveSessionDirectory } from "../../../shared"
export async function resolveParentDirectory(options: {
client: OpencodeClient
@@ -15,7 +15,10 @@ export async function resolveParentDirectory(options: {
return null
})
const parentDirectory = parentSession?.data?.directory ?? defaultDirectory
const parentDirectory = resolveSessionDirectory({
parentDirectory: parentSession?.data?.directory,
fallbackDirectory: defaultDirectory,
})
log(`[background-agent] Parent dir: ${parentSession?.data?.directory}, using: ${parentDirectory}`)
return parentDirectory
}

View File

@@ -1,19 +0,0 @@
import { randomUUID } from "crypto"
import type { BackgroundTask, LaunchInput } from "../types"
export function createTask(input: LaunchInput): BackgroundTask {
return {
id: `bg_${randomUUID().slice(0, 8)}`,
status: "pending",
queuedAt: new Date(),
description: input.description,
prompt: input.prompt,
agent: input.agent,
parentSessionID: input.parentSessionID,
parentMessageID: input.parentMessageID,
parentModel: input.parentModel,
parentAgent: input.parentAgent,
parentTools: input.parentTools,
model: input.model,
}
}

View File

@@ -1,99 +0,0 @@
import type { BackgroundTask, ResumeInput } from "../types"
import { log, getAgentToolRestrictions } from "../../../shared"
import { setSessionTools } from "../../../shared/session-tools-store"
import type { SpawnerContext } from "./spawner-context"
import { subagentSessions } from "../../claude-code-session-state"
import { getTaskToastManager } from "../../task-toast-manager"
export async function resumeTask(
task: BackgroundTask,
input: ResumeInput,
ctx: Pick<SpawnerContext, "client" | "concurrencyManager" | "onTaskError">
): Promise<void> {
const { client, concurrencyManager, onTaskError } = ctx
if (!task.sessionID) {
throw new Error(`Task has no sessionID: ${task.id}`)
}
if (task.status === "running") {
log("[background-agent] Resume skipped - task already running:", {
taskId: task.id,
sessionID: task.sessionID,
})
return
}
const concurrencyKey = task.concurrencyGroup ?? task.agent
await concurrencyManager.acquire(concurrencyKey)
task.concurrencyKey = concurrencyKey
task.concurrencyGroup = concurrencyKey
task.status = "running"
task.completedAt = undefined
task.error = undefined
task.parentSessionID = input.parentSessionID
task.parentMessageID = input.parentMessageID
task.parentModel = input.parentModel
task.parentAgent = input.parentAgent
if (input.parentTools) {
task.parentTools = input.parentTools
}
task.startedAt = new Date()
task.progress = {
toolCalls: task.progress?.toolCalls ?? 0,
lastUpdate: new Date(),
}
subagentSessions.add(task.sessionID)
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.addTask({
id: task.id,
description: task.description,
agent: task.agent,
isBackground: true,
})
}
log("[background-agent] Resuming task:", { taskId: task.id, sessionID: task.sessionID })
log("[background-agent] Resuming task - calling prompt (fire-and-forget) with:", {
sessionID: task.sessionID,
agent: task.agent,
model: task.model,
promptLength: input.prompt.length,
})
const resumeModel = task.model
? { providerID: task.model.providerID, modelID: task.model.modelID }
: undefined
const resumeVariant = task.model?.variant
client.session
.promptAsync({
path: { id: task.sessionID },
body: {
agent: task.agent,
...(resumeModel ? { model: resumeModel } : {}),
...(resumeVariant ? { variant: resumeVariant } : {}),
tools: (() => {
const tools = {
...getAgentToolRestrictions(task.agent),
task: false,
call_omo_agent: true,
question: false,
}
setSessionTools(task.sessionID!, tools)
return tools
})(),
parts: [{ type: "text", text: input.prompt }],
},
})
.catch((error: unknown) => {
log("[background-agent] resume prompt error:", error)
onTaskError(task, error instanceof Error ? error : new Error(String(error)))
})
}

View File

@@ -1,99 +0,0 @@
import type { QueueItem } from "../constants"
import { log, getAgentToolRestrictions, promptWithModelSuggestionRetry } from "../../../shared"
import { setSessionTools } from "../../../shared/session-tools-store"
import { subagentSessions } from "../../claude-code-session-state"
import { getTaskToastManager } from "../../task-toast-manager"
import { createBackgroundSession } from "./background-session-creator"
import { getConcurrencyKeyFromLaunchInput } from "./concurrency-key-from-launch-input"
import { resolveParentDirectory } from "./parent-directory-resolver"
import type { SpawnerContext } from "./spawner-context"
import { maybeInvokeTmuxCallback } from "./tmux-callback-invoker"
export async function startTask(item: QueueItem, ctx: SpawnerContext): Promise<void> {
const { task, input } = item
const { client, directory, concurrencyManager, tmuxEnabled, onSubagentSessionCreated, onTaskError } = ctx
log("[background-agent] Starting task:", {
taskId: task.id,
agent: input.agent,
model: input.model,
})
const concurrencyKey = getConcurrencyKeyFromLaunchInput(input)
const parentDirectory = await resolveParentDirectory({
client,
parentSessionID: input.parentSessionID,
defaultDirectory: directory,
})
const sessionID = await createBackgroundSession({
client,
input,
parentDirectory,
concurrencyManager,
concurrencyKey,
})
subagentSessions.add(sessionID)
await maybeInvokeTmuxCallback({
onSubagentSessionCreated,
tmuxEnabled,
sessionID,
parentID: input.parentSessionID,
title: input.description,
})
task.status = "running"
task.startedAt = new Date()
task.sessionID = sessionID
task.progress = {
toolCalls: 0,
lastUpdate: new Date(),
}
task.concurrencyKey = concurrencyKey
task.concurrencyGroup = concurrencyKey
log("[background-agent] Launching task:", { taskId: task.id, sessionID, agent: input.agent })
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.updateTask(task.id, "running")
}
log("[background-agent] Calling prompt (fire-and-forget) for launch with:", {
sessionID,
agent: input.agent,
model: input.model,
hasSkillContent: !!input.skillContent,
promptLength: input.prompt.length,
})
const launchModel = input.model
? { providerID: input.model.providerID, modelID: input.model.modelID }
: undefined
const launchVariant = input.model?.variant
promptWithModelSuggestionRetry(client, {
path: { id: sessionID },
body: {
agent: input.agent,
...(launchModel ? { model: launchModel } : {}),
...(launchVariant ? { variant: launchVariant } : {}),
system: input.skillContent,
tools: (() => {
const tools = {
...getAgentToolRestrictions(input.agent),
task: false,
call_omo_agent: true,
question: false,
}
setSessionTools(sessionID, tools)
return tools
})(),
parts: [{ type: "text", text: input.prompt }],
},
}).catch((error: unknown) => {
log("[background-agent] promptAsync error:", error)
onTaskError(task, error instanceof Error ? error : new Error(String(error)))
})
}

View File

@@ -1,77 +0,0 @@
import { log } from "../../shared"
import { TASK_TTL_MS } from "./constants"
import { subagentSessions } from "../claude-code-session-state"
import { pruneStaleTasksAndNotifications } from "./task-poller"
import type { BackgroundTask, LaunchInput } from "./types"
import type { ConcurrencyManager } from "./concurrency"
type QueueItem = { task: BackgroundTask; input: LaunchInput }
export function pruneStaleState(args: {
tasks: Map<string, BackgroundTask>
notifications: Map<string, BackgroundTask[]>
queuesByKey: Map<string, QueueItem[]>
concurrencyManager: ConcurrencyManager
cleanupPendingByParent: (task: BackgroundTask) => void
clearNotificationsForTask: (taskId: string) => void
}): void {
const {
tasks,
notifications,
queuesByKey,
concurrencyManager,
cleanupPendingByParent,
clearNotificationsForTask,
} = args
pruneStaleTasksAndNotifications({
tasks,
notifications,
onTaskPruned: (taskId, task, errorMessage) => {
const wasPending = task.status === "pending"
const now = Date.now()
const timestamp = task.status === "pending"
? task.queuedAt?.getTime()
: task.startedAt?.getTime()
const age = timestamp ? now - timestamp : TASK_TTL_MS
log("[background-agent] Pruning stale task:", {
taskId,
status: task.status,
age: Math.round(age / 1000) + "s",
})
task.status = "error"
task.error = errorMessage
task.completedAt = new Date()
if (task.concurrencyKey) {
concurrencyManager.release(task.concurrencyKey)
task.concurrencyKey = undefined
}
cleanupPendingByParent(task)
if (wasPending) {
const key = task.model
? `${task.model.providerID}/${task.model.modelID}`
: task.agent
const queue = queuesByKey.get(key)
if (queue) {
const index = queue.findIndex((item) => item.task.id === taskId)
if (index !== -1) {
queue.splice(index, 1)
if (queue.length === 0) {
queuesByKey.delete(key)
}
}
}
}
clearNotificationsForTask(taskId)
tasks.delete(taskId)
if (task.sessionID) {
subagentSessions.delete(task.sessionID)
}
},
})
}

View File

@@ -1,117 +0,0 @@
import { log } from "../../shared"
import type { BackgroundTask } from "./types"
import type { LaunchInput } from "./types"
import type { ConcurrencyManager } from "./concurrency"
import type { OpencodeClient } from "./opencode-client"
type QueueItem = { task: BackgroundTask; input: LaunchInput }
export async function cancelBackgroundTask(args: {
taskId: string
options?: {
source?: string
reason?: string
abortSession?: boolean
skipNotification?: boolean
}
tasks: Map<string, BackgroundTask>
queuesByKey: Map<string, QueueItem[]>
completionTimers: Map<string, ReturnType<typeof setTimeout>>
idleDeferralTimers: Map<string, ReturnType<typeof setTimeout>>
concurrencyManager: ConcurrencyManager
client: OpencodeClient
cleanupPendingByParent: (task: BackgroundTask) => void
markForNotification: (task: BackgroundTask) => void
notifyParentSession: (task: BackgroundTask) => Promise<void>
}): Promise<boolean> {
const {
taskId,
options,
tasks,
queuesByKey,
completionTimers,
idleDeferralTimers,
concurrencyManager,
client,
cleanupPendingByParent,
markForNotification,
notifyParentSession,
} = args
const task = tasks.get(taskId)
if (!task || (task.status !== "running" && task.status !== "pending")) {
return false
}
const source = options?.source ?? "cancel"
const abortSession = options?.abortSession !== false
const reason = options?.reason
if (task.status === "pending") {
const key = task.model
? `${task.model.providerID}/${task.model.modelID}`
: task.agent
const queue = queuesByKey.get(key)
if (queue) {
const index = queue.findIndex((item) => item.task.id === taskId)
if (index !== -1) {
queue.splice(index, 1)
if (queue.length === 0) {
queuesByKey.delete(key)
}
}
}
log("[background-agent] Cancelled pending task:", { taskId, key })
}
task.status = "cancelled"
task.completedAt = new Date()
if (reason) {
task.error = reason
}
if (task.concurrencyKey) {
concurrencyManager.release(task.concurrencyKey)
task.concurrencyKey = undefined
}
const completionTimer = completionTimers.get(task.id)
if (completionTimer) {
clearTimeout(completionTimer)
completionTimers.delete(task.id)
}
const idleTimer = idleDeferralTimers.get(task.id)
if (idleTimer) {
clearTimeout(idleTimer)
idleDeferralTimers.delete(task.id)
}
cleanupPendingByParent(task)
if (abortSession && task.sessionID) {
client.session.abort({
path: { id: task.sessionID },
}).catch(() => {})
}
if (options?.skipNotification) {
log(`[background-agent] Task cancelled via ${source} (notification skipped):`, task.id)
return true
}
markForNotification(task)
try {
await notifyParentSession(task)
log(`[background-agent] Task cancelled via ${source}:`, task.id)
} catch (err) {
log("[background-agent] Error in notifyParentSession for cancelled task:", {
taskId: task.id,
error: err,
})
}
return true
}

View File

@@ -1,68 +0,0 @@
import { log } from "../../shared"
import type { BackgroundTask } from "./types"
import type { ConcurrencyManager } from "./concurrency"
import type { OpencodeClient } from "./opencode-client"
export async function tryCompleteBackgroundTask(args: {
task: BackgroundTask
source: string
concurrencyManager: ConcurrencyManager
idleDeferralTimers: Map<string, ReturnType<typeof setTimeout>>
client: OpencodeClient
markForNotification: (task: BackgroundTask) => void
cleanupPendingByParent: (task: BackgroundTask) => void
notifyParentSession: (task: BackgroundTask) => Promise<void>
}): Promise<boolean> {
const {
task,
source,
concurrencyManager,
idleDeferralTimers,
client,
markForNotification,
cleanupPendingByParent,
notifyParentSession,
} = args
if (task.status !== "running") {
log("[background-agent] Task already completed, skipping:", {
taskId: task.id,
status: task.status,
source,
})
return false
}
task.status = "completed"
task.completedAt = new Date()
if (task.concurrencyKey) {
concurrencyManager.release(task.concurrencyKey)
task.concurrencyKey = undefined
}
markForNotification(task)
cleanupPendingByParent(task)
const idleTimer = idleDeferralTimers.get(task.id)
if (idleTimer) {
clearTimeout(idleTimer)
idleDeferralTimers.delete(task.id)
}
if (task.sessionID) {
client.session.abort({
path: { id: task.sessionID },
}).catch(() => {})
}
try {
await notifyParentSession(task)
log(`[background-agent] Task completed via ${source}:`, task.id)
} catch (err) {
log("[background-agent] Error in notifyParentSession:", { taskId: task.id, error: err })
}
return true
}

View File

@@ -1,77 +0,0 @@
import { getTaskToastManager } from "../task-toast-manager"
import { log } from "../../shared"
import type { BackgroundTask } from "./types"
import type { LaunchInput } from "./types"
type QueueItem = {
task: BackgroundTask
input: LaunchInput
}
export function launchBackgroundTask(args: {
input: LaunchInput
tasks: Map<string, BackgroundTask>
pendingByParent: Map<string, Set<string>>
queuesByKey: Map<string, QueueItem[]>
getConcurrencyKeyFromInput: (input: LaunchInput) => string
processKey: (key: string) => void
}): BackgroundTask {
const { input, tasks, pendingByParent, queuesByKey, getConcurrencyKeyFromInput, processKey } = args
log("[background-agent] launch() called with:", {
agent: input.agent,
model: input.model,
description: input.description,
parentSessionID: input.parentSessionID,
})
if (!input.agent || input.agent.trim() === "") {
throw new Error("Agent parameter is required")
}
const task: BackgroundTask = {
id: `bg_${crypto.randomUUID().slice(0, 8)}`,
status: "pending",
queuedAt: new Date(),
description: input.description,
prompt: input.prompt,
agent: input.agent,
parentSessionID: input.parentSessionID,
parentMessageID: input.parentMessageID,
parentModel: input.parentModel,
parentAgent: input.parentAgent,
model: input.model,
category: input.category,
}
tasks.set(task.id, task)
if (input.parentSessionID) {
const pending = pendingByParent.get(input.parentSessionID) ?? new Set<string>()
pending.add(task.id)
pendingByParent.set(input.parentSessionID, pending)
}
const key = getConcurrencyKeyFromInput(input)
const queue = queuesByKey.get(key) ?? []
queue.push({ task, input })
queuesByKey.set(key, queue)
log("[background-agent] Task queued:", { taskId: task.id, key, queueLength: queue.length })
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.addTask({
id: task.id,
description: input.description,
agent: input.agent,
isBackground: true,
status: "queued",
skills: input.skills,
})
}
processKey(key)
return task
}

View File

@@ -1,56 +0,0 @@
import type { BackgroundTask } from "./types"
export function getTasksByParentSession(
tasks: Iterable<BackgroundTask>,
sessionID: string
): BackgroundTask[] {
const result: BackgroundTask[] = []
for (const task of tasks) {
if (task.parentSessionID === sessionID) {
result.push(task)
}
}
return result
}
export function getAllDescendantTasks(
tasksByParent: (sessionID: string) => BackgroundTask[],
sessionID: string
): BackgroundTask[] {
const result: BackgroundTask[] = []
const directChildren = tasksByParent(sessionID)
for (const child of directChildren) {
result.push(child)
if (child.sessionID) {
result.push(...getAllDescendantTasks(tasksByParent, child.sessionID))
}
}
return result
}
export function findTaskBySession(
tasks: Iterable<BackgroundTask>,
sessionID: string
): BackgroundTask | undefined {
for (const task of tasks) {
if (task.sessionID === sessionID) return task
}
return undefined
}
export function getRunningTasks(tasks: Iterable<BackgroundTask>): BackgroundTask[] {
return Array.from(tasks).filter((t) => t.status === "running")
}
export function getNonRunningTasks(tasks: Iterable<BackgroundTask>): BackgroundTask[] {
return Array.from(tasks).filter((t) => t.status !== "running")
}
export function hasRunningTasks(tasks: Iterable<BackgroundTask>): boolean {
for (const task of tasks) {
if (task.status === "running") return true
}
return false
}

View File

@@ -1,52 +0,0 @@
import { log } from "../../shared"
import type { BackgroundTask } from "./types"
import type { ConcurrencyManager } from "./concurrency"
type QueueItem = {
task: BackgroundTask
input: import("./types").LaunchInput
}
export async function processConcurrencyKeyQueue(args: {
key: string
queuesByKey: Map<string, QueueItem[]>
processingKeys: Set<string>
concurrencyManager: ConcurrencyManager
startTask: (item: QueueItem) => Promise<void>
}): Promise<void> {
const { key, queuesByKey, processingKeys, concurrencyManager, startTask } = args
if (processingKeys.has(key)) return
processingKeys.add(key)
try {
const queue = queuesByKey.get(key)
while (queue && queue.length > 0) {
const item = queue[0]
await concurrencyManager.acquire(key)
if (item.task.status === "cancelled" || item.task.status === "error") {
concurrencyManager.release(key)
queue.shift()
continue
}
try {
await startTask(item)
} catch (error) {
log("[background-agent] Error starting task:", error)
// Release concurrency slot if startTask failed and didn't release it itself
// This prevents slot leaks when errors occur after acquire but before task.concurrencyKey is set
if (!item.task.concurrencyKey) {
concurrencyManager.release(key)
}
}
queue.shift()
}
} finally {
processingKeys.delete(key)
}
}

View File

@@ -1,148 +0,0 @@
import { log, getAgentToolRestrictions } from "../../shared"
import { subagentSessions } from "../claude-code-session-state"
import { getTaskToastManager } from "../task-toast-manager"
import type { BackgroundTask, ResumeInput } from "./types"
import type { ConcurrencyManager } from "./concurrency"
import type { OpencodeClient } from "./opencode-client"
type ModelRef = { providerID: string; modelID: string }
export async function resumeBackgroundTask(args: {
input: ResumeInput
findBySession: (sessionID: string) => BackgroundTask | undefined
client: OpencodeClient
concurrencyManager: ConcurrencyManager
pendingByParent: Map<string, Set<string>>
startPolling: () => void
markForNotification: (task: BackgroundTask) => void
cleanupPendingByParent: (task: BackgroundTask) => void
notifyParentSession: (task: BackgroundTask) => Promise<void>
}): Promise<BackgroundTask> {
const {
input,
findBySession,
client,
concurrencyManager,
pendingByParent,
startPolling,
markForNotification,
cleanupPendingByParent,
notifyParentSession,
} = args
const existingTask = findBySession(input.sessionId)
if (!existingTask) {
throw new Error(`Task not found for session: ${input.sessionId}`)
}
if (!existingTask.sessionID) {
throw new Error(`Task has no sessionID: ${existingTask.id}`)
}
if (existingTask.status === "running") {
log("[background-agent] Resume skipped - task already running:", {
taskId: existingTask.id,
sessionID: existingTask.sessionID,
})
return existingTask
}
const concurrencyKey =
existingTask.concurrencyGroup ??
(existingTask.model
? `${existingTask.model.providerID}/${existingTask.model.modelID}`
: existingTask.agent)
await concurrencyManager.acquire(concurrencyKey)
existingTask.concurrencyKey = concurrencyKey
existingTask.concurrencyGroup = concurrencyKey
existingTask.status = "running"
existingTask.completedAt = undefined
existingTask.error = undefined
existingTask.parentSessionID = input.parentSessionID
existingTask.parentMessageID = input.parentMessageID
existingTask.parentModel = input.parentModel
existingTask.parentAgent = input.parentAgent
existingTask.startedAt = new Date()
existingTask.progress = {
toolCalls: existingTask.progress?.toolCalls ?? 0,
lastUpdate: new Date(),
}
startPolling()
if (existingTask.sessionID) {
subagentSessions.add(existingTask.sessionID)
}
if (input.parentSessionID) {
const pending = pendingByParent.get(input.parentSessionID) ?? new Set<string>()
pending.add(existingTask.id)
pendingByParent.set(input.parentSessionID, pending)
}
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.addTask({
id: existingTask.id,
description: existingTask.description,
agent: existingTask.agent,
isBackground: true,
})
}
log("[background-agent] Resuming task:", { taskId: existingTask.id, sessionID: existingTask.sessionID })
log("[background-agent] Resuming task - calling prompt (fire-and-forget) with:", {
sessionID: existingTask.sessionID,
agent: existingTask.agent,
model: existingTask.model,
promptLength: input.prompt.length,
})
const resumeModel: ModelRef | undefined = existingTask.model
? { providerID: existingTask.model.providerID, modelID: existingTask.model.modelID }
: undefined
const resumeVariant = existingTask.model?.variant
client.session.promptAsync({
path: { id: existingTask.sessionID },
body: {
agent: existingTask.agent,
...(resumeModel ? { model: resumeModel } : {}),
...(resumeVariant ? { variant: resumeVariant } : {}),
tools: {
...getAgentToolRestrictions(existingTask.agent),
task: false,
call_omo_agent: true,
question: false,
},
parts: [{ type: "text", text: input.prompt }],
},
}).catch((error) => {
log("[background-agent] resume prompt error:", error)
existingTask.status = "interrupt"
const errorMessage = error instanceof Error ? error.message : String(error)
existingTask.error = errorMessage
existingTask.completedAt = new Date()
if (existingTask.concurrencyKey) {
concurrencyManager.release(existingTask.concurrencyKey)
existingTask.concurrencyKey = undefined
}
if (existingTask.sessionID) {
client.session.abort({
path: { id: existingTask.sessionID },
}).catch(() => {})
}
markForNotification(existingTask)
cleanupPendingByParent(existingTask)
notifyParentSession(existingTask).catch((err) => {
log("[background-agent] Failed to notify on resume error:", err)
})
})
return existingTask
}

View File

@@ -1,190 +0,0 @@
import { log, getAgentToolRestrictions, promptWithModelSuggestionRetry } from "../../shared"
import { isInsideTmux } from "../../shared/tmux"
import { subagentSessions } from "../claude-code-session-state"
import { getTaskToastManager } from "../task-toast-manager"
import type { BackgroundTask } from "./types"
import type { LaunchInput } from "./types"
import type { ConcurrencyManager } from "./concurrency"
import type { OpencodeClient } from "./opencode-client"
type QueueItem = {
task: BackgroundTask
input: LaunchInput
}
type ModelRef = { providerID: string; modelID: string }
export async function startQueuedTask(args: {
item: QueueItem
client: OpencodeClient
defaultDirectory: string
tmuxEnabled: boolean
onSubagentSessionCreated?: (event: { sessionID: string; parentID: string; title: string }) => Promise<void>
startPolling: () => void
getConcurrencyKeyFromInput: (input: LaunchInput) => string
concurrencyManager: ConcurrencyManager
findBySession: (sessionID: string) => BackgroundTask | undefined
markForNotification: (task: BackgroundTask) => void
cleanupPendingByParent: (task: BackgroundTask) => void
notifyParentSession: (task: BackgroundTask) => Promise<void>
}): Promise<void> {
const {
item,
client,
defaultDirectory,
tmuxEnabled,
onSubagentSessionCreated,
startPolling,
getConcurrencyKeyFromInput,
concurrencyManager,
findBySession,
markForNotification,
cleanupPendingByParent,
notifyParentSession,
} = args
const { task, input } = item
log("[background-agent] Starting task:", {
taskId: task.id,
agent: input.agent,
model: input.model,
})
const concurrencyKey = getConcurrencyKeyFromInput(input)
const parentSession = await client.session.get({
path: { id: input.parentSessionID },
}).catch((err) => {
log(`[background-agent] Failed to get parent session: ${err}`)
return null
})
const parentDirectory = parentSession?.data?.directory ?? defaultDirectory
log(`[background-agent] Parent dir: ${parentSession?.data?.directory}, using: ${parentDirectory}`)
const createResult = await client.session.create({
body: {
parentID: input.parentSessionID,
title: `${input.description} (@${input.agent} subagent)`,
} as any,
query: {
directory: parentDirectory,
},
})
if (createResult.error) {
throw new Error(`Failed to create background session: ${createResult.error}`)
}
if (!createResult.data?.id) {
throw new Error("Failed to create background session: API returned no session ID")
}
const sessionID = createResult.data.id
subagentSessions.add(sessionID)
log("[background-agent] tmux callback check", {
hasCallback: !!onSubagentSessionCreated,
tmuxEnabled,
isInsideTmux: isInsideTmux(),
sessionID,
parentID: input.parentSessionID,
})
if (onSubagentSessionCreated && tmuxEnabled && isInsideTmux()) {
log("[background-agent] Invoking tmux callback NOW", { sessionID })
await onSubagentSessionCreated({
sessionID,
parentID: input.parentSessionID,
title: input.description,
}).catch((err) => {
log("[background-agent] Failed to spawn tmux pane:", err)
})
log("[background-agent] tmux callback completed, waiting 200ms")
await new Promise<void>((resolve) => {
setTimeout(() => resolve(), 200)
})
} else {
log("[background-agent] SKIP tmux callback - conditions not met")
}
task.status = "running"
task.startedAt = new Date()
task.sessionID = sessionID
task.progress = {
toolCalls: 0,
lastUpdate: new Date(),
}
task.concurrencyKey = concurrencyKey
task.concurrencyGroup = concurrencyKey
startPolling()
log("[background-agent] Launching task:", { taskId: task.id, sessionID, agent: input.agent })
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.updateTask(task.id, "running")
}
log("[background-agent] Calling prompt (fire-and-forget) for launch with:", {
sessionID,
agent: input.agent,
model: input.model,
hasSkillContent: !!input.skillContent,
promptLength: input.prompt.length,
})
const launchModel: ModelRef | undefined = input.model
? { providerID: input.model.providerID, modelID: input.model.modelID }
: undefined
const launchVariant = input.model?.variant
promptWithModelSuggestionRetry(client, {
path: { id: sessionID },
body: {
agent: input.agent,
...(launchModel ? { model: launchModel } : {}),
...(launchVariant ? { variant: launchVariant } : {}),
system: input.skillContent,
tools: {
...getAgentToolRestrictions(input.agent),
task: false,
call_omo_agent: true,
question: false,
},
parts: [{ type: "text", text: input.prompt }],
},
}).catch((error) => {
log("[background-agent] promptAsync error:", error)
const existingTask = findBySession(sessionID)
if (!existingTask) return
existingTask.status = "interrupt"
const errorMessage = error instanceof Error ? error.message : String(error)
if (errorMessage.includes("agent.name") || errorMessage.includes("undefined")) {
existingTask.error = `Agent "${input.agent}" not found. Make sure the agent is registered in your opencode.json or provided by a plugin.`
} else {
existingTask.error = errorMessage
}
existingTask.completedAt = new Date()
if (existingTask.concurrencyKey) {
concurrencyManager.release(existingTask.concurrencyKey)
existingTask.concurrencyKey = undefined
}
client.session.abort({
path: { id: sessionID },
}).catch(() => {})
markForNotification(existingTask)
cleanupPendingByParent(existingTask)
notifyParentSession(existingTask).catch((err) => {
log("[background-agent] Failed to notify on error:", err)
})
})
}

Some files were not shown because too many files have changed in this diff Show More