Compare commits

87 Commits

Author SHA1 Message Date
github-actions[bot]
5558ddf468 release: v3.1.4 2026-01-28 07:22:03 +00:00
justsisyphus
aa03d9b811 ci: sync publish.yml test isolation with ci.yml 2026-01-28 16:18:21 +09:00
YeonGyu-Kim
28a0dd06c7 fix: resolve version detection for npm global installations (#1194)
When oh-my-opencode is installed via npm global install and run as a
compiled binary, import.meta.url returns a virtual bun path ($bunfs)
instead of the actual filesystem path. This caused getCachedVersion()
to return null, resulting in 'unknown' version display.

Add fallback using process.execPath which correctly points to the actual
binary location, allowing us to walk up and find the package.json.

Fixes #1182
2026-01-28 15:54:17 +09:00
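The fallback described in this commit can be sketched as a walk up the directory tree from the binary's location. This is a minimal illustration, not the project's actual `getCachedVersion()` code: `findPackageJsonDir` and the injected `exists` predicate are hypothetical names, and the starting directory is assumed to be an absolute POSIX-style path.

```typescript
// Hypothetical helper: walk up from a starting directory until a
// package.json is found. `exists` is injected so the sketch does not
// depend on any particular filesystem API.
function findPackageJsonDir(
  startDir: string,
  exists: (path: string) => boolean,
): string | null {
  // Normalize trailing slashes; an empty result means the root directory.
  let dir = startDir.replace(/\/+$/, "") || "/";
  while (true) {
    const candidate = dir === "/" ? "/package.json" : `${dir}/package.json`;
    if (exists(candidate)) return dir;
    if (dir === "/") return null; // reached the filesystem root without a match
    const idx = dir.lastIndexOf("/");
    dir = idx <= 0 ? "/" : dir.slice(0, idx); // step to the parent directory
  }
}
```

In a compiled binary, the starting directory would presumably be the directory containing `process.execPath`, with a real filesystem check passed as `exists`.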
YeonGyu-Kim
995b7751af ci(cla): add repository owner to CLA allowlist (#1195)
The repository owner (code-yeongyu) was not in the CLA allowlist,
causing CLA signature requirement on their own PRs.

Added code-yeongyu to the allowlist to skip CLA for owner commits.

Co-authored-by: 김연규 <yeongyu@mengmotaMacbookAir.local>
2026-01-28 15:46:42 +09:00
justsisyphus
5087788f66 ci: split test execution to prevent mock.module pollution 2026-01-28 15:06:32 +09:00
justsisyphus
19524c8a27 ci: run tests sequentially to prevent mock.module pollution 2026-01-28 14:59:26 +09:00
justsisyphus
fbb4d46945 fix: explicit reset in mainSessionID test for parallel test safety 2026-01-28 14:40:15 +09:00
justsisyphus
5dc8d577a4 fix: add afterEach cleanup in session-state tests for parallel test isolation 2026-01-28 14:36:58 +09:00
justsisyphus
c249763d7e fix: reset sessionAgentMap in _resetForTesting for test isolation
- Add sessionAgentMap.clear() to _resetForTesting()
- Prevents test pollution when tests run in parallel in CI
2026-01-28 14:33:14 +09:00
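The isolation fix in this commit amounts to clearing module-level state in the test reset hook. A minimal sketch, assuming a module that keeps a session-to-agent map: only `sessionAgentMap` and `_resetForTesting` are named in the commit, and the accessor helpers here are hypothetical.

```typescript
// Module-level state shared by every test that imports this module.
const sessionAgentMap = new Map<string, string>();

// Hypothetical accessors, included only to make the sketch runnable.
function setSessionAgent(sessionID: string, agent: string): void {
  sessionAgentMap.set(sessionID, agent);
}

function getSessionAgent(sessionID: string): string | undefined {
  return sessionAgentMap.get(sessionID);
}

// Test-only reset: without the clear() call, entries written by one test
// would still be visible to the next test running in the same process.
function _resetForTesting(): void {
  sessionAgentMap.clear();
}
```

Calling `_resetForTesting()` from an `afterEach` hook, as the neighbouring commits do, keeps each test independent of the order in which tests run.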
justsisyphus
b2d618e851 fix: mock provider cache in delegate-task tests for CI stability
- Add spyOn for readConnectedProvidersCache to return connected providers
- Tests now work consistently regardless of actual provider cache state
- Fixes CI failures for category variant and unstable agent tests
2026-01-28 14:27:34 +09:00
justsisyphus
6f348a8a5c fix: resolve CI test timeouts with configurable timing
- Add timing.ts module for test-only timing configuration
- Replace hardcoded wait times with getTimingConfig()
- Enable all previously skipped tests (ralph-loop, session-state, delegate-task)
- Tests now complete in ~2s instead of timing out
2026-01-28 14:17:56 +09:00
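One way to make wait times configurable for tests is sketched below. `getTimingConfig` is named in the commit, but the field names, the default values, and the `overrideTimingForTesting` and `resetTimingForTesting` helpers are hypothetical; the real `timing.ts` may look quite different.

```typescript
interface TimingConfig {
  pollIntervalMs: number; // hypothetical field
  settleDelayMs: number;  // hypothetical field
}

// Hypothetical production defaults.
const DEFAULT_TIMING: TimingConfig = { pollIntervalMs: 500, settleDelayMs: 2000 };

let activeTiming: TimingConfig = { ...DEFAULT_TIMING };

// Production code reads wait times from here instead of hardcoding them.
function getTimingConfig(): TimingConfig {
  return activeTiming;
}

// Test-only: shrink selected wait times so tests finish quickly.
function overrideTimingForTesting(partial: Partial<TimingConfig>): void {
  activeTiming = { ...activeTiming, ...partial };
}

// Test-only: restore the defaults after each test.
function resetTimingForTesting(): void {
  activeTiming = { ...DEFAULT_TIMING };
}
```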
justsisyphus
1da0adcbe8 feat(index): add provider cache missing warning toast
Show warning toast when hasConnectedProvidersCache() returns false,
indicating model filtering is disabled. Prompts user to restart
OpenCode for full functionality.
2026-01-28 13:31:11 +09:00
justsisyphus
8a9d966a3d fix(model-resolver): skip fallback chain when no cache exists
When no provider cache exists, skip the fallback chain entirely and let
OpenCode use Provider.defaultModel() as the final fallback. This prevents
incorrect model selection when the plugin loads before providers connect.

- Remove forced first-entry fallback when no cache
- Add log messages for cache miss scenarios
- Update tests for new behavior
2026-01-28 13:31:03 +09:00
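The new resolution behavior can be sketched as follows. The `ResolveInput` shape, the function name, and the rule that an explicit user model wins are assumptions made for illustration; the commit only states that the fallback chain is skipped when no provider cache exists.

```typescript
// Hypothetical input shape; the real model-resolver types are not shown here.
interface ResolveInput {
  userModel?: string;               // model explicitly configured by the user
  fallbackChain: string[];          // ordered candidate models
  connectedModels: string[] | null; // null means no provider cache exists yet
}

// Returns a model ID, or undefined to let the host pick its own default.
function resolveModelSketch(input: ResolveInput): string | undefined {
  if (input.userModel) return input.userModel;
  if (input.connectedModels === null) {
    // No cache: skip the fallback chain entirely rather than forcing the
    // first entry, which may belong to a provider that is not connected.
    return undefined;
  }
  for (const candidate of input.fallbackChain) {
    if (input.connectedModels.indexOf(candidate) !== -1) return candidate;
  }
  return undefined;
}
```

Returning `undefined` stands in for "defer to OpenCode", which per the commit message then falls back to `Provider.defaultModel()`.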
justsisyphus
76f8c500cb fix(config): add 'dev-browser' to BrowserAutomationProviderSchema
Config validation was failing when 'dev-browser' was set as the browser
automation provider, causing the entire config to be rejected. This
silently disabled all config options including tmux.enabled.

- Add 'dev-browser' as valid option in BrowserAutomationProviderSchema
- Update JSDoc with dev-browser description
- Regenerate JSON schema
2026-01-28 12:05:20 +09:00
github-actions[bot]
388516bcc5 @agno01 has signed the CLA in code-yeongyu/oh-my-opencode#1188 2026-01-28 01:02:15 +00:00
github-actions[bot]
8dff875929 @zycaskevin has signed the CLA in code-yeongyu/oh-my-opencode#1184 2026-01-27 16:20:49 +00:00
github-actions[bot]
966cc90a02 release: v3.1.3 2026-01-27 16:12:43 +00:00
justsisyphus
1d27d78127 test: skip flaky sync variant test (CI timeout) 2026-01-28 01:07:14 +09:00
justsisyphus
38156d49f3 ci: use find/xargs to exclude mock-heavy test files 2026-01-28 01:01:45 +09:00
justsisyphus
897eea0263 ci: isolate mock-heavy test files to prevent parallel pollution 2026-01-28 01:00:17 +09:00
justsisyphus
9b59ef66e4 test: fix flaky tests caused by mock.module pollution across parallel test files 2026-01-28 00:54:20 +09:00
github-actions[bot]
0d938059f9 @moha-abdi has signed the CLA in code-yeongyu/oh-my-opencode#1179 2026-01-27 12:36:31 +00:00
github-actions[bot]
9d35f23725 @MoerAI has signed the CLA in code-yeongyu/oh-my-opencode#1172 2026-01-27 09:31:52 +00:00
justsisyphus
aa1646f82c fix(delegate-task): pass variant as top-level field in prompt body
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-27 17:54:58 +09:00
justsisyphus
e47ab084fd fix(keyword-detector): skip ultrawork injection for planner agents
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-27 17:54:52 +09:00
justsisyphus
baf6358736 fix(background-agent): pass variant as top-level field in prompt body 2026-01-27 16:49:03 +09:00
justsisyphus
488c89156b test(config-handler): add tests for plan demote and prometheus mode 2026-01-27 16:06:03 +09:00
justsisyphus
c4957a469d fix(prometheus): set mode to 'all' and restore plan demote logic
- Change prometheus mode from 'primary' to 'all' to allow delegate_task calls
- Restore plan agent demote logic to use prometheus config as base
- Revert d481c596 changes that broke plan agent inheritance
2026-01-27 15:57:45 +09:00
justsisyphus
d481c596bd fix(plan-agent): only inherit model from prometheus as fallback
Plan agent was incorrectly inheriting prometheus's entire config (prompt,
permission, etc.) causing it to behave as primary instead of subagent.

Now plan agent:
1. Uses plan config model if explicitly set
2. Falls back to prometheus model only if plan config has no model
3. Keeps original OpenCode plan config intact
2026-01-27 15:18:28 +09:00
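The three rules in this commit reduce to a single fallback on the `model` field. A minimal sketch, assuming a simplified config shape: `AgentConfigSketch` and `resolvePlanConfig` are hypothetical names, not the project's actual types.

```typescript
// Hypothetical, simplified agent config.
interface AgentConfigSketch {
  model?: string;
  prompt?: string;
  mode?: string;
}

// Keep the plan config intact; borrow only the model from prometheus,
// and only when the plan config does not set one itself.
function resolvePlanConfig(
  plan: AgentConfigSketch,
  prometheus: AgentConfigSketch,
): AgentConfigSketch {
  return { ...plan, model: plan.model ?? prometheus.model };
}
```

Note that the later commit c4957a469d reverts this behavior and again uses the prometheus config as the base.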
justsisyphus
655d511294 Revert "docs: add v2.x to v3.x migration guide (#1057)"
This PR was incorrectly merged by an AI agent without proper project owner review.

This reverts commit 1cb6b3de39a49acb43b76ac55a5b44b47ca4a9f7.
2026-01-27 14:09:37 +09:00
justsisyphus
7dedd6cf90 Revert "Add oh-my-opencode-slim (#1100)"
This PR was incorrectly merged by an AI agent without proper project owner review.

The AI evaluated this as 'ULTRA SAFE' because it only modified README files,
but failed to recognize that adding external fork promotions to the project
README requires explicit project owner approval - not just technical safety.

This reverts commit 912a56db85.
2026-01-27 14:09:18 +09:00
justsisyphus
bd18f231f5 feat(sisyphus): add foundation schemas for tasks and swarm (Wave 1)
- Add SisyphusTasksConfig and SisyphusSwarmConfig to schema.ts
- Create Task JSON schema with Zod validation
- Create Mailbox IPC protocol message schemas
- Add storage utilities with Claude Code path compatibility
- 25 tests passing
2026-01-27 13:07:09 +09:00
justsisyphus
de439edc22 feat(subagent): block question tool at both SDK and hook level
- Add permission: [{ permission: 'question', action: 'deny' }] to session.create()
  in background-agent and delegate-task for SDK-level blocking
- Add subagent-question-blocker hook as backup layer to intercept question tool
  calls in tool.execute.before event
- Ensures subagents cannot ask users questions and must work autonomously
2026-01-27 13:07:09 +09:00
github-actions[bot]
04500bae7d @code-yeongyu has signed the CLA in code-yeongyu/oh-my-opencode#1100 2026-01-27 02:59:24 +00:00
Sisyphus
1cb6b3de7d docs: add v2.x to v3.x migration guide (#1057)
Comprehensive migration guide covering:
- TL;DR quick upgrade section for most users
- What's new in v3.x (Atlas, Prometheus, categories, skills)
- Breaking changes checklist (high/medium/low impact)
- Step-by-step upgrade path
- Configuration changes (categories, permissions)
- API changes for plugin developers
- Troubleshooting common issues
- Complete agent and category reference

Consulted Oracle for migration guide strategy and structure.

Closes #1034 (item 4)

Co-authored-by: sisyphus-dev-ai <sisyphus-dev-ai@users.noreply.github.com>
2026-01-27 11:59:15 +09:00
Alvin
912a56db85 Add oh-my-opencode-slim (#1100) 2026-01-27 11:59:12 +09:00
itsmylife44
a5d9929c0a feat: support OPENCODE_SERVER_PORT and OPENCODE_SERVER_HOSTNAME env vars (#1157)
Add support for customizing the OpenCode server port and hostname via
environment variables. This enables orchestration tools like Open Agent
to run multiple concurrent missions without port conflicts.

Environment variables:
- OPENCODE_SERVER_PORT: Custom port for the OpenCode server
- OPENCODE_SERVER_HOSTNAME: Custom hostname for the OpenCode server

When running oh-my-opencode in parallel (e.g., multiple missions in
Open Agent), each instance can now use a unique port to avoid conflicts
with the default port 4096.
2026-01-27 11:59:10 +09:00
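Reading the two variables might look like the sketch below. The environment variable names and the default port 4096 come from the commit message; the function name, the return shape, and the choice to fall back to the default on an invalid port are assumptions.

```typescript
const DEFAULT_SERVER_PORT = 4096; // default port named in the commit message

interface ServerAddress {
  port: number;
  hostname: string | undefined; // undefined: leave the server's own default in place
}

// `env` is passed in (for example, process.env) to keep the function pure.
function resolveServerAddress(
  env: Record<string, string | undefined>,
): ServerAddress {
  const parsed = Number(env.OPENCODE_SERVER_PORT);
  // Assumption: anything that is not a valid TCP port falls back to the default.
  const valid = Number.isInteger(parsed) && parsed >= 1 && parsed <= 65535;
  return {
    port: valid ? parsed : DEFAULT_SERVER_PORT,
    hostname: env.OPENCODE_SERVER_HOSTNAME || undefined,
  };
}
```

Running several instances in parallel then only requires giving each process a distinct `OPENCODE_SERVER_PORT`.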
vmlinuzx
7f43f160b5 docs: clarify category model resolution priority and fallback behavior (#1074)
The previous documentation implied that categories automatically use their
built-in default models (e.g., Gemini for visual, GPT-5.2 for ultrabrain).

This was misleading. Categories only use built-in defaults if explicitly
configured. Otherwise, they fall back to the system default model.

Changes:
- Add explicit warning about model resolution priority
- Document all 7 built-in categories (was only showing 2)
- Show complete example config with all categories
- Explain the wasteful fallback scenario
- Add 'variant' to supported category options

Fixes confusion where users expect optimized model selection but get
system default for all unconfigured categories.

Co-authored-by: DC <vmlinux@p16.tailnet.freeflight.co>
2026-01-27 11:58:59 +09:00
0ln
af67bc8592 fix(mcp): add optional Context7 Authorization header (#1133)
Context7 should mirror `websearch` by only sending auth when
`CONTEXT7_API_KEY` is set.

Change: set bearer auth in `headers` using `CONTEXT7_API_KEY` if that
environment variable is set; otherwise leave `headers` as `undefined`.
2026-01-27 11:58:55 +09:00
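The conditional header can be sketched in a few lines. `buildContext7Headers` is a hypothetical name; only the `CONTEXT7_API_KEY` variable and the bearer-auth behavior come from the commit message.

```typescript
// Returns an Authorization header only when an API key is configured;
// otherwise returns undefined so no auth is sent at all.
function buildContext7Headers(
  env: Record<string, string | undefined>,
): Record<string, string> | undefined {
  const key = env.CONTEXT7_API_KEY;
  return key ? { Authorization: `Bearer ${key}` } : undefined;
}
```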
Peter Rallojay
c74d79e28a fix: prevent builtin MCPs from overwriting user MCP configs (#956) 2026-01-27 11:58:42 +09:00
justsisyphus
fc5298d778 feat(workflow): add ZAI Coding + OpenAI provider for sisyphus-agent
- Add zai-coding-plan provider with GLM 4.7 and GLM 4.6v models
- Add OpenAI provider with GPT-5.2 models
- Configure unspecified-low category to use zai-coding-plan/glm-4.7
- Auth is provided via OPENCODE_AUTH_JSON secret
2026-01-27 10:51:24 +09:00
justsisyphus
3e8e3db961 feat(prompts): enhance plan output with TL;DR, agent profiles, and parallelization
- prometheus-prompt: Add TL;DR section with quick summary, deliverables, effort estimate
- prometheus-prompt: Add recommended agent profile (category + skills) per task
- prometheus-prompt: Enhance parallelization with execution waves and dependency matrix
- ultrawork: Change plan agent to prometheus agent invocation
- ultrawork: Add session_id resume workflow for Prometheus iteration
2026-01-27 10:50:38 +09:00
justsisyphus
6fa5cac616 fix(compaction): preserve agent verification state (#1144) 2026-01-27 10:35:20 +09:00
justsisyphus
158ccabf24 fix(notification): prevent false positive plugin detection (#1148) 2026-01-27 10:35:20 +09:00
justsisyphus
2efbf2650f fix(cli): add baseline builds for non-AVX2 CPUs (#1154) 2026-01-27 10:35:20 +09:00
justsisyphus
acded4ba2a fix(delegate-task): add clear error when model not configured (#1139) 2026-01-27 10:35:20 +09:00
github-actions[bot]
911e43445f @ghtndl has signed the CLA in code-yeongyu/oh-my-opencode#1158 2026-01-27 01:27:26 +00:00
sisyphus-dev-ai
3049e1ebfb chore: changes by sisyphus-dev-ai 2026-01-27 01:10:31 +00:00
github-actions[bot]
62921b9e44 release: v3.1.2 2026-01-27 01:07:09 +00:00
github-actions[bot]
cd23f7ab7d release: v3.1.1 2026-01-26 23:48:28 +00:00
justsisyphus
518dceac72 Revert "feat(librarian): conditionally enable thinking based on model type"
This reverts commit f033b30549a396db90e148756130cddec1fcdb2b.
2026-01-27 08:39:45 +09:00
justsisyphus
19f43e30c8 feat(librarian): conditionally enable thinking based on model type
- Add isGeminiModel helper to detect Gemini models
- Disable thinking config for Gemini models (not supported)
- Enable thinking with 32000 token budget for other models
- Add tests verifying both Gemini and Claude behavior

🤖 Generated with assistance of OhMyOpenCode
2026-01-27 08:39:45 +09:00
justsisyphus
b3be9f33c6 feat(ultrawork): enforce plan agent invocation and parallel delegation
- Add MANDATORY section for delegate_task(subagent_type='plan') at top of ultrawork prompt
- Establish 'DELEGATE by default, work yourself only when trivial' principle
- Add parallel execution rules with anti-pattern and correct pattern examples
- Remove emoji (checkmark/cross) from PLAN_AGENT_SYSTEM_PREPEND
- Restructure workflow into clear 4-step sequence
2026-01-27 08:39:45 +09:00
github-actions[bot]
430098856a @itsmylife44 has signed the CLA in code-yeongyu/oh-my-opencode#1157 2026-01-26 23:20:52 +00:00
github-actions[bot]
5932f5f94f @acamq has signed the CLA in code-yeongyu/oh-my-opencode#1151 2026-01-26 18:20:30 +00:00
github-actions[bot]
fcf2e32071 @craftaholic has signed the CLA in code-yeongyu/oh-my-opencode#1110 2026-01-26 16:12:39 +00:00
github-actions[bot]
19827dac70 @orientpine has signed the CLA in code-yeongyu/oh-my-opencode#1145 2026-01-26 14:30:44 +00:00
github-actions[bot]
3ed1c6644e @Jeremy-Kr has signed the CLA in code-yeongyu/oh-my-opencode#1141 2026-01-26 11:59:22 +00:00
justsisyphus
cf6e714946 feat(plan-agent): apply prometheus config to plan agent with fallback chain
- Add prometheus model fallback chain (claude-opus-4-5 → gpt-5.2 → gemini-3-pro)
- Plan agent now inherits prometheus settings (model, prompt, permission, variant)
- Plan agent mode remains 'subagent' while using prometheus config
- Add name field to prometheus config to fix agent.name undefined error
2026-01-26 18:31:48 +09:00
justsisyphus
383f43548b feat(plan-agent): enforce dependency/parallel graphs and category+skill recommendations
Add mandatory sections to PLAN_AGENT_SYSTEM_PREPEND:
- Task Dependency Graph with blockers/dependents/reasons
- Parallel Execution Graph with wave structure
- Category + Skills recommendations per task
- Response format specification with exact structure

Uses ASCII art banners and visual emphasis for critical requirements.
2026-01-26 18:31:35 +09:00
justsisyphus
26b1c67964 fix(background-agent): disable question tool for background tasks 2026-01-26 18:25:06 +09:00
justsisyphus
7e065dfe12 feat(delegate-task): prepend system prompt for plan agent invocations
When plan agent (plan/prometheus/planner) is invoked via delegate_task,
automatically prepend a <system> prompt instructing the agent to:
- Launch explore/librarian agents in background to gather context
- Summarize user request and list uncertainties
- Ask clarifying questions until requirements are 100% clear
2026-01-26 18:25:06 +09:00
justsisyphus
8429da02b8 feat(config): add thinking/reasoningEffort/providerOptions to AgentOverrideConfigSchema
- Add maxTokens, thinking, reasoningEffort, textVerbosity, providerOptions fields to AgentOverrideConfigSchema
- Update think-mode hook to respect agent-level thinking settings (disabled or custom providerOptions)
- Add tests for agent-level thinking configuration override behavior
2026-01-26 18:25:06 +09:00
github-actions[bot]
ab51f5d39f @boguan has signed the CLA in code-yeongyu/oh-my-opencode#1137 2026-01-26 08:46:14 +00:00
justsisyphus
3ee519c7b0 feat: make systemDefaultModel optional for OpenCode fallback (#1136)
- Remove mandatory model requirement from plugin initialization
- Allow OpenCode to use its built-in model fallback when user doesn't specify
- Update model-resolver to handle undefined systemDefaultModel
- Remove throw errors in config-handler, utils, atlas, delegate-task
- Add tests for optional model scenarios

Closes #1129

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-26 17:01:08 +09:00
justsisyphus
c9b86b7815 test(cli): add version display test to verify package.json reading (#1134)
Closes #1063

Investigation findings:
- The CLI code correctly reads version from package.json
- The reported issue (bunx showing old version) is a caching issue
- Added test to ensure version is read as valid semver from package.json

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-26 17:00:55 +09:00
github-actions[bot]
9b6d8f629a @misyuari has signed the CLA in code-yeongyu/oh-my-opencode#1132 2026-01-26 07:31:12 +00:00
justsisyphus
6a2f43858a docs: add server mode and shell function examples for tmux integration
- Add --port flag requirement for tmux subagent pane spawning
- Add Fish shell function example with automatic port allocation
- Add Bash/Zsh equivalent function example
- Document how subagent panes work (opencode attach flow)
- Add OPENCODE_PORT environment variable documentation
- Add server mode reference section with opencode serve command
2026-01-26 16:24:14 +09:00
justsisyphus
601ea32a1c docs: add tmux integration and interactive terminal documentation
- Add Tmux Integration section to configurations.md with all config options
- Add Visual Multi-Agent with Tmux subsection to features.md
- Add Interactive Terminal Tools section documenting interactive_bash tool
2026-01-26 16:02:34 +09:00
github-actions[bot]
8f31211c75 release: v3.1.0 2026-01-26 06:46:47 +00:00
justsisyphus
04f2b513c6 feat(tmux-subagent): add replace action to prevent mass eviction
- Add column-based splittable calculation (getColumnCount, getColumnWidth)
- New decision tree: splittable → split, k=1 eviction → close+spawn, else → replace
- Add 'replace' action type using tmux respawn-pane (preserves layout)
- Replace oldest pane in-place instead of closing all panes when unsplittable
- Prevents scenario where all agent panes get closed leaving only 1
2026-01-26 15:25:11 +09:00
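The decision tree in this commit can be read as a three-way choice. The sketch below is one interpretation, not the project's decision engine: it assumes "k=1 eviction" means that closing exactly one pane would make room for a split, and the type and function names are hypothetical.

```typescript
type PaneAction = "split" | "close-spawn" | "replace";

interface PaneCapacity {
  canSplit: boolean;                 // a new pane fits without removing anything
  canSplitAfterEvictingOne: boolean; // closing exactly one pane would make room
}

function decidePaneAction(capacity: PaneCapacity): PaneAction {
  if (capacity.canSplit) return "split";
  if (capacity.canSplitAfterEvictingOne) return "close-spawn";
  // Otherwise reuse the oldest pane in place (the commit uses
  // `tmux respawn-pane` for this) instead of closing several panes.
  return "replace";
}
```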
justsisyphus
8ebc933118 fix(tmux-subagent): enable 2D grid layout with divider-aware calculations
- Account for tmux pane dividers (1 char) in all size calculations
- Reduce MIN_PANE_WIDTH from 53 to 52 to fit 2 columns in standard terminals
- Fix enforceMainPaneWidth to use (windowWidth - divider) / 2
- Add virtual mainPane handling for close-spawn eviction loop
- Add comprehensive decision-engine tests (23 test cases)
2026-01-26 15:11:16 +09:00
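The divider-aware arithmetic can be illustrated as below. The 1-character divider, the `(windowWidth - divider) / 2` formula, and the 52/53 minimum widths come from the commit; the function names and the column-count formula are a sketch under those stated values, not the project's code.

```typescript
const DIVIDER_WIDTH = 1;   // tmux draws a 1-character divider between panes
const MIN_PANE_WIDTH = 52; // value after this commit (previously 53)

// Width of the main pane when the window is halved around one divider.
function mainPaneWidth(windowWidth: number): number {
  return Math.floor((windowWidth - DIVIDER_WIDTH) / 2);
}

// Number of columns of at least `minWidth` that fit in `availableWidth`
// with one divider between adjacent columns:
//   n * minWidth + (n - 1) * divider <= availableWidth
function columnsThatFit(
  availableWidth: number,
  minWidth: number = MIN_PANE_WIDTH,
): number {
  const n = Math.floor((availableWidth + DIVIDER_WIDTH) / (minWidth + DIVIDER_WIDTH));
  return Math.max(0, n);
}
```

As an illustration with an arbitrarily chosen 212-column window: the main pane gets 105 columns, leaving 212 - 105 - 1 = 106 for agent panes. Two 52-wide columns plus one divider need 105 columns and fit; two 53-wide columns would need 107 and do not.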
justsisyphus
a67a35aea8 docs: regenerate AGENTS.md knowledge base via /init-deep 2026-01-26 14:56:55 +09:00
justsisyphus
9d66b80709 feat(hooks): add active working context section to compaction summary
Include files, code in progress, external references, and state/variables
in compaction summary for seamless continuation after context compaction.
2026-01-26 14:23:05 +09:00
justsisyphus
5c7eb02d5b chore(test): sync agent name casing in tests (#1128)
Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-26 12:10:30 +09:00
justsisyphus
68aa913499 refactor(tmux-subagent): state-first architecture with decision engine (#1125)
* refactor(tmux-subagent): add state-first architecture with decision engine

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(tmux): add pane spawn callbacks for background and sync sessions

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-26 12:02:37 +09:00
justsisyphus
3a79b8761b feat(shared): add connected-providers-cache for model availability (#1121)
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-26 11:53:41 +09:00
justsisyphus
da416b362b feat(hooks): add category-skill-reminder hook (#1123)
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-26 11:48:32 +09:00
justsisyphus
90054b28ad chore(docs): regenerate AGENTS.md knowledge base (#1118)
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-26 11:48:30 +09:00
justsisyphus
892b245779 fix(test): update builtin skills count from 3 to 4 (#1126)
* fix(test): update builtin skills count from 3 to 4 (dev-browser added)

* chore(ci): add block-master-pr workflow

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-26 02:29:28 +00:00
YeonGyu-Kim
aead4aebd2 Add tmux pane management for background agent sessions (#1094)
* feat(config): add TmuxConfigSchema for tmux subagent pane management

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(shared): add tmux module structure

* feat(shared/tmux): implement tmux pane utilities

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* test(tmux-subagent): add TmuxSessionManager tests (TDD RED)

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(tmux-subagent): implement TmuxSessionManager

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(integration): wire TmuxSessionManager with 500ms delay

- Task 5: Add 500ms delay in BackgroundManager after session creation
- Task 6: Wire TmuxSessionManager event handlers (session.created/deleted)
- Both changes integrate tmux pane management into plugin lifecycle

Co-authored-by: Sisyphus <ultrawork@oh-my-opencode>

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Co-authored-by: Sisyphus <ultrawork@oh-my-opencode>
2026-01-25 15:34:10 +09:00
YeonGyu-Kim
bccc943173 feat(skills): add dev-browser skill with Windows support (#1093)
* feat(skills): add dev-browser skill with Windows support

* chore: trigger CI
2026-01-25 15:34:07 +09:00
justsisyphus
05904ca617 docs(agent-browser): add detailed installation guide with Playwright troubleshooting 2026-01-25 15:12:32 +09:00
YeonGyu-Kim
3af30b0a21 feat(skills): add agent-browser option for browser automation (#1090)
Add configurable browser automation allowing users to choose between
Playwright MCP (default) and Vercel's agent-browser CLI.

Changes:
- Add browser_automation_engine.provider config option
- Dynamic skill loading based on provider selection
- Comprehensive agent-browser CLI reference (inline in skills.ts)
- Propagate browserProvider to delegate_task and buildAgent
- Update documentation with provider comparison

Co-authored-by: Suyeol Jeon <devxoul@gmail.com>
Co-authored-by: YeonGyu Kim <code.yeongyu@gmail.com>
2026-01-25 15:02:41 +09:00
YeonGyu-Kim
b55fd8d76f feat(explore): add github-copilot/gpt-5-mini to fallback chain (#1091)
* feat(explore): add github-copilot/gpt-5-mini to fallback chain

* test(explore): add tests for github-copilot/gpt-5-mini fallback

---------

Co-authored-by: Suyeol Jeon <devxoul@gmail.com>
2026-01-25 05:53:11 +00:00
Sisyphus
208af055ef fix: generate skill/slashcommand descriptions synchronously when pre-provided (#1087)
* fix: generate skill/slashcommand tool descriptions synchronously when pre-provided

When skills are passed via options (pre-resolved), build the tool description
synchronously instead of fire-and-forget async. This eliminates the race
condition where the description getter returns the bare prefix before the
async cache-warming microtask completes.

Fixes #1039

* chore: changes by sisyphus-dev-ai

---------

Co-authored-by: sisyphus-dev-ai <sisyphus-dev-ai@users.noreply.github.com>
2026-01-25 14:52:50 +09:00
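The race fixed here can be sketched with a description getter backed by a cache. All names below (`SkillSketch`, `createSkillToolSketch`, `DESCRIPTION_PREFIX`, the description format) are hypothetical; the point is only that pre-provided skills are formatted synchronously, so the getter never returns the bare prefix.

```typescript
interface SkillSketch {
  name: string;
  description: string;
}

const DESCRIPTION_PREFIX = "Load a skill."; // hypothetical prefix text

function formatDescription(skills: SkillSketch[]): string {
  const lines = skills.map((s) => `- ${s.name}: ${s.description}`);
  return [DESCRIPTION_PREFIX, ...lines].join("\n");
}

function createSkillToolSketch(options: {
  skills?: SkillSketch[];
  loadSkills: () => Promise<SkillSketch[]>;
}): { readonly description: string } {
  let cached: string | null = null;
  if (options.skills) {
    // Pre-resolved skills: build the description before returning.
    cached = formatDescription(options.skills);
  } else {
    // No skills yet: warm the cache in the background (the old behavior
    // for every case, which allowed the getter to run first).
    void options.loadSkills().then((skills) => {
      cached = formatDescription(skills);
    });
  }
  return {
    get description(): string {
      return cached ?? DESCRIPTION_PREFIX;
    },
  };
}
```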
YeonGyu-Kim
0aa8f486af feat(hooks): add sisyphus-junior-notepad hook for conditional notepad rules injection (#1092)
* refactor(shared): extract isCallerOrchestrator to session-utils

* refactor(atlas): use shared isCallerOrchestrator, change to prepend

* refactor(prometheus-md-only): change to prepend pattern

* refactor(sisyphus-junior): remove Work_Context (moved to hook)

* feat(hooks): add sisyphus-junior-notepad hook

* fix(shared): replace dynamic require with static import in session-utils

- Change from dynamic require to static import for better bundler compatibility
- Fix import path: ../../features -> ../features
- Add barrel export to src/shared/index.ts

* feat(hooks): register sisyphus-junior-notepad hook

- Add to HookNameSchema in schema.ts
- Export from hooks/index.ts
- Register with isHookEnabled in index.ts
- Auto-generated schema.json update

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-25 14:52:11 +09:00
127 changed files with 10502 additions and 853 deletions

View File

@@ -4,13 +4,32 @@ on:
push:
branches: [master, dev]
pull_request:
-branches: [dev]
+branches: [master, dev]
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
+# Block PRs targeting master branch
+block-master-pr:
+runs-on: ubuntu-latest
+if: github.event_name == 'pull_request'
+steps:
+- name: Check PR target branch
+run: |
+if [ "${{ github.base_ref }}" = "master" ]; then
+echo "::error::PRs to master branch are not allowed. Please target the 'dev' branch instead."
+echo ""
+echo "PULL REQUESTS TO MASTER ARE BLOCKED"
+echo ""
+echo "All PRs must target the 'dev' branch."
+echo "Please close this PR and create a new one targeting 'dev'."
+exit 1
+else
+echo "PR targets '${{ github.base_ref }}' branch - OK"
+fi
test:
runs-on: ubuntu-latest
steps:
@@ -25,8 +44,34 @@ jobs:
env:
BUN_INSTALL_ALLOW_SCRIPTS: "@ast-grep/napi"
-- name: Run tests
-run: bun test
+- name: Run mock-heavy tests (isolated)
+run: |
+# These files use mock.module() which pollutes module cache
+# Run them in separate processes to prevent cross-file contamination
+bun test src/plugin-handlers
+bun test src/hooks/atlas
+bun test src/hooks/compaction-context-injector
+bun test src/features/tmux-subagent
+- name: Run remaining tests
+run: |
+# Run all other tests (mock-heavy ones are re-run but that's acceptable)
+bun test bin script src/cli src/config src/mcp src/index.test.ts \
+src/agents src/tools src/shared \
+src/hooks/anthropic-context-window-limit-recovery \
+src/hooks/claude-code-compatibility \
+src/hooks/context-injection \
+src/hooks/provider-toast \
+src/hooks/session-notification \
+src/hooks/sisyphus \
+src/hooks/todo-continuation-enforcer \
+src/features/background-agent \
+src/features/builtin-commands \
+src/features/builtin-skills \
+src/features/claude-code-session-state \
+src/features/hook-message-injector \
+src/features/opencode-skill-loader \
+src/features/skill-mcp-manager
typecheck:
runs-on: ubuntu-latest

View File

@@ -25,7 +25,7 @@ jobs:
path-to-signatures: 'signatures/cla.json'
path-to-document: 'https://github.com/code-yeongyu/oh-my-opencode/blob/master/CLA.md'
branch: 'dev'
-allowlist: bot*,dependabot*,github-actions*,*[bot],sisyphus-dev-ai
+allowlist: code-yeongyu,bot*,dependabot*,github-actions*,*[bot],sisyphus-dev-ai
custom-notsigned-prcomment: |
Thank you for your contribution! Before we can merge this PR, we need you to sign our [Contributor License Agreement (CLA)](https://github.com/code-yeongyu/oh-my-opencode/blob/master/CLA.md).

View File

@@ -45,8 +45,34 @@ jobs:
env:
BUN_INSTALL_ALLOW_SCRIPTS: "@ast-grep/napi"
-- name: Run tests
-run: bun test
+- name: Run mock-heavy tests (isolated)
+run: |
+# These files use mock.module() which pollutes module cache
+# Run them in separate processes to prevent cross-file contamination
+bun test src/plugin-handlers
+bun test src/hooks/atlas
+bun test src/hooks/compaction-context-injector
+bun test src/features/tmux-subagent
+- name: Run remaining tests
+run: |
+# Run all other tests (mock-heavy ones are re-run but that's acceptable)
+bun test bin script src/cli src/config src/mcp src/index.test.ts \
+src/agents src/tools src/shared \
+src/hooks/anthropic-context-window-limit-recovery \
+src/hooks/claude-code-compatibility \
+src/hooks/context-injection \
+src/hooks/provider-toast \
+src/hooks/session-notification \
+src/hooks/sisyphus \
+src/hooks/todo-continuation-enforcer \
+src/features/background-agent \
+src/features/builtin-commands \
+src/features/builtin-skills \
+src/features/claude-code-session-state \
+src/features/hook-message-injector \
+src/features/opencode-skill-loader \
+src/features/skill-mcp-manager
typecheck:
runs-on: ubuntu-latest

View File

@@ -152,6 +152,41 @@ jobs:
"limit": { "context": 200000, "output": 64000 }
}
}
} |
.provider["zai-coding-plan"] = {
"name": "Z.AI Coding Plan",
"npm": "@ai-sdk/openai-compatible",
"options": {
"baseURL": "https://api.z.ai/api/paas/v4"
},
"models": {
"glm-4.7": {
"id": "glm-4.7",
"name": "GLM 4.7",
"limit": { "context": 128000, "output": 16000 }
},
"glm-4.6v": {
"id": "glm-4.6v",
"name": "GLM 4.6 Vision",
"limit": { "context": 128000, "output": 16000 }
}
}
} |
.provider.openai = {
"name": "OpenAI",
"npm": "@ai-sdk/openai",
"models": {
"gpt-5.2": {
"id": "gpt-5.2",
"name": "GPT-5.2",
"limit": { "context": 128000, "output": 16000 }
},
"gpt-5.2-codex": {
"id": "gpt-5.2-codex",
"name": "GPT-5.2 Codex",
"limit": { "context": 128000, "output": 32000 }
}
}
}
' "$OPENCODE_JSON" > /tmp/oc.json && mv /tmp/oc.json "$OPENCODE_JSON"
@@ -287,6 +322,9 @@ jobs:
)
jq --arg append "$PROMPT_APPEND" '.agents.Sisyphus.prompt_append = $append' "$OMO_JSON" > /tmp/omo.json && mv /tmp/omo.json "$OMO_JSON"
+# Add categories configuration for unspecified-low to use GLM 4.7
+jq '.categories["unspecified-low"] = { "model": "zai-coding-plan/glm-4.7" }' "$OMO_JSON" > /tmp/omo.json && mv /tmp/omo.json "$OMO_JSON"
mkdir -p ~/.local/share/opencode
echo "$OPENCODE_AUTH_JSON" > ~/.local/share/opencode/auth.json
chmod 600 ~/.local/share/opencode/auth.json

View File

@@ -1,12 +1,24 @@
# PROJECT KNOWLEDGE BASE
-**Generated:** 2026-01-25T13:10:00+09:00
-**Commit:** 043b1a33
+**Generated:** 2026-01-26T14:50:00+09:00
+**Commit:** 9d66b807
**Branch:** dev
+---
+## **IMPORTANT: PULL REQUEST TARGET BRANCH**
+> **ALL PULL REQUESTS MUST TARGET THE `dev` BRANCH.**
+>
+> **DO NOT CREATE PULL REQUESTS TARGETING `master` BRANCH.**
+>
+> PRs to `master` will be automatically rejected by CI.
+---
## OVERVIEW
-OpenCode plugin: multi-model agent orchestration (Claude Opus 4.5, GPT-5.2, Gemini 3 Flash, Grok Code, GLM-4.7). 31 lifecycle hooks, 20+ tools (LSP, AST-Grep, delegation), 10 specialized agents, full Claude Code compatibility. "oh-my-zsh" for OpenCode.
+OpenCode plugin: multi-model agent orchestration (Claude Opus 4.5, GPT-5.2, Gemini 3 Flash, Grok Code). 32 lifecycle hooks, 20+ tools (LSP, AST-Grep, delegation), 10 specialized agents, full Claude Code compatibility. "oh-my-zsh" for OpenCode.
## STRUCTURE
@@ -14,14 +26,14 @@ OpenCode plugin: multi-model agent orchestration (Claude Opus 4.5, GPT-5.2, Gemi
oh-my-opencode/
├── src/
│ ├── agents/ # 10 AI agents - see src/agents/AGENTS.md
-│ ├── hooks/ # 31 lifecycle hooks - see src/hooks/AGENTS.md
+│ ├── hooks/ # 32 lifecycle hooks - see src/hooks/AGENTS.md
│ ├── tools/ # 20+ tools - see src/tools/AGENTS.md
│ ├── features/ # Background agents, Claude Code compat - see src/features/AGENTS.md
-│ ├── shared/ # 50 cross-cutting utilities - see src/shared/AGENTS.md
+│ ├── shared/ # 55 cross-cutting utilities - see src/shared/AGENTS.md
│ ├── cli/ # CLI installer, doctor - see src/cli/AGENTS.md
│ ├── mcp/ # Built-in MCPs - see src/mcp/AGENTS.md
│ ├── config/ # Zod schema, TypeScript types
│ └── index.ts # Main plugin entry (601 lines)
│ └── index.ts # Main plugin entry (672 lines)
├── script/ # build-schema.ts, build-binaries.ts
├── packages/ # 7 platform-specific binaries
└── dist/ # Build output (ESM + .d.ts)
@@ -38,8 +50,8 @@ oh-my-opencode/
| Add skill | `src/features/builtin-skills/` | Create dir with SKILL.md |
| Add command | `src/features/builtin-commands/` | Add template + register in commands.ts |
| Config schema | `src/config/schema.ts` | Zod schema, run `bun run build:schema` |
| Background agents | `src/features/background-agent/` | manager.ts (1335 lines) |
| Orchestrator | `src/hooks/atlas/` | Main orchestration hook (773 lines) |
| Background agents | `src/features/background-agent/` | manager.ts (1377 lines) |
| Orchestrator | `src/hooks/atlas/` | Main orchestration hook (752 lines) |
## TDD (Test-Driven Development)
@@ -51,8 +63,8 @@ oh-my-opencode/
**Rules:**
- NEVER write implementation before test
- NEVER delete failing tests - fix the code
- Test file: `*.test.ts` alongside source
- BDD comments: `#given`, `#when`, `#then`
- Test file: `*.test.ts` alongside source (100 test files)
- BDD comments: `//#given`, `//#when`, `//#then`
## CONVENTIONS
@@ -61,7 +73,7 @@ oh-my-opencode/
- **Build**: `bun build` (ESM) + `tsc --emitDeclarationOnly`
- **Exports**: Barrel pattern via index.ts
- **Naming**: kebab-case dirs, `createXXXHook`/`createXXXTool` factories
- **Testing**: BDD comments, 95 test files
- **Testing**: BDD comments, 100 test files
- **Temperature**: 0.1 for code agents, max 0.3
## ANTI-PATTERNS
@@ -100,7 +112,7 @@ oh-my-opencode/
bun run typecheck # Type check
bun run build # ESM + declarations + schema
bun run rebuild # Clean + Build
bun test # 95 test files
bun test # 100 test files
```
## DEPLOYMENT
@@ -114,16 +126,14 @@ bun test # 95 test files
| File | Lines | Description |
|------|-------|-------------|
| `src/features/background-agent/manager.ts` | 1335 | Task lifecycle, concurrency |
| `src/features/builtin-skills/skills.ts` | 1203 | Skill definitions |
| `src/features/builtin-skills/skills.ts` | 1729 | Skill definitions |
| `src/features/background-agent/manager.ts` | 1377 | Task lifecycle, concurrency |
| `src/agents/prometheus-prompt.ts` | 1196 | Planning agent |
| `src/tools/delegate-task/tools.ts` | 1039 | Category-based delegation |
| `src/hooks/atlas/index.ts` | 773 | Orchestrator hook |
| `src/tools/delegate-task/tools.ts` | 1070 | Category-based delegation |
| `src/hooks/atlas/index.ts` | 752 | Orchestrator hook |
| `src/cli/config-manager.ts` | 664 | JSONC config parsing |
| `src/index.ts` | 672 | Main plugin entry |
| `src/features/builtin-commands/templates/refactor.ts` | 619 | Refactor command template |
| `src/index.ts` | 601 | Main plugin entry |
| `src/tools/lsp/client.ts` | 596 | LSP JSON-RPC client |
| `src/agents/atlas.ts` | 572 | Atlas orchestrator agent |
## MCP ARCHITECTURE


@@ -38,6 +38,7 @@
"type": "string",
"enum": [
"playwright",
"agent-browser",
"frontend-ui-ux",
"git-master"
]
@@ -70,12 +71,14 @@
"interactive-bash-session",
"thinking-block-validator",
"ralph-loop",
"category-skill-reminder",
"compaction-context-injector",
"claude-code-hooks",
"auto-slash-command",
"edit-error-recovery",
"delegate-task-retry",
"prometheus-md-only",
"sisyphus-junior-notepad",
"start-work",
"atlas"
]
@@ -217,6 +220,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -343,6 +391,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -469,6 +562,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -595,6 +733,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -721,6 +904,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -847,6 +1075,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -973,6 +1246,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1099,6 +1417,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1225,6 +1588,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1351,6 +1759,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1477,6 +1930,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1603,6 +2101,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1729,6 +2272,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
}
@@ -2171,6 +2759,100 @@
"type": "boolean"
}
}
},
"browser_automation_engine": {
"type": "object",
"properties": {
"provider": {
"default": "playwright",
"type": "string",
"enum": [
"playwright",
"agent-browser",
"dev-browser"
]
}
}
},
"tmux": {
"type": "object",
"properties": {
"enabled": {
"default": false,
"type": "boolean"
},
"layout": {
"default": "main-vertical",
"type": "string",
"enum": [
"main-horizontal",
"main-vertical",
"tiled",
"even-horizontal",
"even-vertical"
]
},
"main_pane_size": {
"default": 60,
"type": "number",
"minimum": 20,
"maximum": 80
},
"main_pane_min_width": {
"default": 120,
"type": "number",
"minimum": 40
},
"agent_pane_min_width": {
"default": 40,
"type": "number",
"minimum": 20
}
}
},
"sisyphus": {
"type": "object",
"properties": {
"tasks": {
"type": "object",
"properties": {
"enabled": {
"default": false,
"type": "boolean"
},
"storage_path": {
"default": ".sisyphus/tasks",
"type": "string"
},
"claude_code_compat": {
"default": false,
"type": "boolean"
}
}
},
"swarm": {
"type": "object",
"properties": {
"enabled": {
"default": false,
"type": "boolean"
},
"storage_path": {
"default": ".sisyphus/teams",
"type": "string"
},
"ui_mode": {
"default": "toast",
"type": "string",
"enum": [
"toast",
"tmux",
"both"
]
}
}
}
}
}
}
}


@@ -27,13 +27,13 @@
"typescript": "^5.7.3",
},
"optionalDependencies": {
"oh-my-opencode-darwin-arm64": "3.0.0",
"oh-my-opencode-darwin-x64": "3.0.0",
"oh-my-opencode-linux-arm64": "3.0.0",
"oh-my-opencode-linux-arm64-musl": "3.0.0",
"oh-my-opencode-linux-x64": "3.0.0",
"oh-my-opencode-linux-x64-musl": "3.0.0",
"oh-my-opencode-windows-x64": "3.0.0",
"oh-my-opencode-darwin-arm64": "3.1.2",
"oh-my-opencode-darwin-x64": "3.1.2",
"oh-my-opencode-linux-arm64": "3.1.2",
"oh-my-opencode-linux-arm64-musl": "3.1.2",
"oh-my-opencode-linux-x64": "3.1.2",
"oh-my-opencode-linux-x64-musl": "3.1.2",
"oh-my-opencode-windows-x64": "3.1.2",
},
},
},
@@ -225,20 +225,6 @@
"object-inspect": ["object-inspect@1.13.4", "", {}, "sha512-W67iLl4J2EXEGTbfeHCffrjDfitvLANg0UlX3wFUUSTx92KXRFegMHUVgSqE+wvhAbi4WqjGg9czysTV2Epbew=="],
"oh-my-opencode-darwin-arm64": ["oh-my-opencode-darwin-arm64@3.0.0", "", { "os": "darwin", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-zelvb7qz5GsS+Dhyz9rACZrkUMtWbAZGijiHSQqmRcjlN/sRPNhXtsL55VheDjlPM3VP+t3+psv+se0WA/aw5w=="],
"oh-my-opencode-darwin-x64": ["oh-my-opencode-darwin-x64@3.0.0", "", { "os": "darwin", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-dRMD1U5zIrb6BsiKQJZtAFtuD8clAQquZyU2LajMoFTHBNhcBDIgsaBBwvMBIq7dTe8rnFq91ExiFA8OfdrzBA=="],
"oh-my-opencode-linux-arm64": ["oh-my-opencode-linux-arm64@3.0.0", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-Wx6Cx2Nu2T69mfZa3FQ3gk0OFONvMh48rMVYK0Cp8VX5W4Zb/GZgTUFmZlYsApyxqP+7J9m18skd46qPOhzuEQ=="],
"oh-my-opencode-linux-arm64-musl": ["oh-my-opencode-linux-arm64-musl@3.0.0", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-mfOlptgLoXLVuhFRcXgZU7BYGuL1axZOMOOjONgncNzOp/BQYU5B9BRFihBUXdDsWGmeMiLowrYGBhVpSv3NlA=="],
"oh-my-opencode-linux-x64": ["oh-my-opencode-linux-x64@3.0.0", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-vVjshfaz0UC9NrGD9FfjlYK5NvckIW0sZaE/wRv/LKjrukHFH1jJpJa5KKXxBWLsEJjt6ooJRguXXxtfNXpAWw=="],
"oh-my-opencode-linux-x64-musl": ["oh-my-opencode-linux-x64-musl@3.0.0", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-N6cNJ7+Dj0a5dWqPf6OKfB39o8HWw5HQ3hB4omgYqc6Gzo6nChA4KIiVefEC3+tIL98x4XvMeD7OU+UYgwxHnQ=="],
"oh-my-opencode-windows-x64": ["oh-my-opencode-windows-x64@3.0.0", "", { "os": "win32", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode.exe" } }, "sha512-TaC0hiHpnsS42GWTVUKoTwCb+QzNLBlQtTkIQ0PjlkDYFjlEC2LuR2FFcscik055PRRIGishyB9A1n/8XAgcvA=="],
"on-finished": ["on-finished@2.4.1", "", { "dependencies": { "ee-first": "1.1.1" } }, "sha512-oVlzkg3ENAhCk2zdv7IJwd/QUD4z2RxRwpkcGY8psCVcCYZNq4wYnVWALHM+brtuJjePWiYF/ClmuDr8Ch5+kg=="],
"once": ["once@1.4.0", "", { "dependencies": { "wrappy": "1" } }, "sha512-lNaJgI+2Q5URQBkccEKHTQOPaXdUxnZZElQTZY0MFUAuaEqe1E+Nyvgdz/aIyNi6Z9MzO5dv1H8n58/GELp3+w=="],


@@ -159,7 +159,7 @@ Available agents: `oracle`, `librarian`, `explore`, `multimodal-looker`
Oh My OpenCode includes built-in skills that provide additional capabilities:
- **playwright**: Browser automation with Playwright MCP. Use for web scraping, testing, screenshots, and browser interactions.
- **playwright** (default) / **agent-browser**: Browser automation for web scraping, testing, screenshots, and browser interactions. See [Browser Automation](#browser-automation) for switching between providers.
- **git-master**: Git expert for atomic commits, rebase/squash, and history search (blame, bisect, log -S). STRONGLY RECOMMENDED: Use with `delegate_task(category='quick', load_skills=['git-master'], ...)` to save context.
Disable built-in skills via `disabled_skills` in `~/.config/opencode/oh-my-opencode.json` or `.opencode/oh-my-opencode.json`:
@@ -170,7 +170,231 @@ Disable built-in skills via `disabled_skills` in `~/.config/opencode/oh-my-openc
}
```
Available built-in skills: `playwright`, `git-master`
Available built-in skills: `playwright`, `agent-browser`, `git-master`
## Browser Automation
Choose between two browser automation providers:
| Provider | Interface | Features | Installation |
|----------|-----------|----------|--------------|
| **playwright** (default) | MCP tools | Playwright MCP server with structured tool calls | Auto-installed via npx |
| **agent-browser** | Bash CLI | Vercel's CLI with session management, parallel browsers | Requires `bun add -g agent-browser` |
**Switch providers** via `browser_automation_engine` in `oh-my-opencode.json`:
```json
{
"browser_automation_engine": {
"provider": "agent-browser"
}
}
```
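As a rough illustration of how the provider switch could be read from config, here is a minimal TypeScript sketch. The names `OmoConfig`, `DEFAULT_PROVIDER`, and `resolveBrowserProvider` are hypothetical and not taken from the plugin source. Falling back to the default on an unrecognized value is also an assumption; the real plugin may reject such a config instead.

```typescript
// Hypothetical types mirroring the `browser_automation_engine` shape shown above.
// Only the two providers documented in this section are modeled.
type BrowserProvider = "playwright" | "agent-browser";

interface OmoConfig {
  browser_automation_engine?: { provider?: string };
}

// `playwright` is documented as the default provider.
const DEFAULT_PROVIDER: BrowserProvider = "playwright";

// Returns the configured provider, or the default when the key is missing.
// Assumption: unknown values also fall back to the default.
function resolveBrowserProvider(config: OmoConfig): BrowserProvider {
  const requested = config.browser_automation_engine?.provider;
  if (requested === "playwright" || requested === "agent-browser") {
    return requested;
  }
  return DEFAULT_PROVIDER;
}
```

With an empty config the sketch returns `playwright`; with the JSON above it returns `agent-browser`.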
### Playwright (Default)
Uses the official Playwright MCP server (`@playwright/mcp`). Browser automation happens through structured MCP tool calls.
### agent-browser
Uses [Vercel's agent-browser CLI](https://github.com/vercel-labs/agent-browser). Key advantages:
- **Session management**: Run multiple isolated browser instances with `--session` flag
- **Persistent profiles**: Keep browser state across restarts with `--profile`
- **Snapshot-based workflow**: Get element refs via `snapshot -i`, interact with `@e1`, `@e2`, etc.
- **CLI-first**: All commands via Bash - great for scripting
**Installation required**:
```bash
bun add -g agent-browser
agent-browser install # Download Chromium
```
**Example workflow**:
```bash
agent-browser open https://example.com
agent-browser snapshot -i # Get interactive elements with refs
agent-browser fill @e1 "user@example.com"
agent-browser click @e2
agent-browser screenshot result.png
agent-browser close
```
## Tmux Integration
Run background subagents in separate tmux panes for **visual multi-agent execution**. See your agents working in parallel, each in its own terminal pane.
**Enable tmux integration** via `tmux` in `oh-my-opencode.json`:
```json
{
"tmux": {
"enabled": true,
"layout": "main-vertical",
"main_pane_size": 60,
"main_pane_min_width": 120,
"agent_pane_min_width": 40
}
}
```
| Option | Default | Description |
|--------|---------|-------------|
| `enabled` | `false` | Enable tmux subagent pane spawning. Only works when running inside an existing tmux session. |
| `layout` | `main-vertical` | Tmux layout for agent panes. See [Layout Options](#layout-options) below. |
| `main_pane_size` | `60` | Main pane size as percentage (20-80). |
| `main_pane_min_width` | `120` | Minimum width for main pane in columns. |
| `agent_pane_min_width` | `40` | Minimum width for each agent pane in columns. |
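The defaults and limits in the table can be sketched as a small merge-and-validate step. This is an illustrative TypeScript sketch, not the plugin's actual code (which uses a Zod schema). `TmuxConfig`, `TMUX_DEFAULTS`, and `resolveTmuxConfig` are hypothetical names. The layout list and the minimum widths (40 and 20 columns) are taken from the JSON schema hunk in this same diff.

```typescript
// Layout names as listed in the schema's `tmux.layout` enum.
const TMUX_LAYOUTS = [
  "main-horizontal",
  "main-vertical",
  "tiled",
  "even-horizontal",
  "even-vertical",
] as const;
type TmuxLayout = (typeof TMUX_LAYOUTS)[number];

interface TmuxConfig {
  enabled: boolean;
  layout: TmuxLayout;
  main_pane_size: number; // percentage
  main_pane_min_width: number; // columns
  agent_pane_min_width: number; // columns
}

// Defaults copied from the options table.
const TMUX_DEFAULTS: TmuxConfig = {
  enabled: false,
  layout: "main-vertical",
  main_pane_size: 60,
  main_pane_min_width: 120,
  agent_pane_min_width: 40,
};

// Merges user overrides onto the defaults and rejects out-of-range values.
function resolveTmuxConfig(user: Partial<TmuxConfig> = {}): TmuxConfig {
  const merged: TmuxConfig = { ...TMUX_DEFAULTS, ...user };
  if (!TMUX_LAYOUTS.includes(merged.layout)) {
    throw new Error(`unknown layout: ${String(merged.layout)}`);
  }
  if (merged.main_pane_size < 20 || merged.main_pane_size > 80) {
    throw new Error("main_pane_size must be between 20 and 80");
  }
  if (merged.main_pane_min_width < 40) {
    throw new Error("main_pane_min_width must be at least 40");
  }
  if (merged.agent_pane_min_width < 20) {
    throw new Error("agent_pane_min_width must be at least 20");
  }
  return merged;
}
```

Calling `resolveTmuxConfig({ enabled: true })` keeps every other option at its documented default.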
### Layout Options
| Layout | Description |
|--------|-------------|
| `main-vertical` | Main pane left, agent panes stacked on right (default) |
| `main-horizontal` | Main pane top, agent panes stacked bottom |
| `tiled` | All panes in equal-sized grid |
| `even-horizontal` | All panes in horizontal row |
| `even-vertical` | All panes in vertical stack |
### Requirements
1. **Must run inside tmux**: The feature only activates when OpenCode is already running inside a tmux session
2. **Tmux installed**: Requires tmux to be available in PATH
3. **Server mode**: OpenCode must run with `--port` flag to enable subagent pane spawning
### How It Works
When `tmux.enabled` is `true` and you're inside a tmux session:
- Background agents (via `delegate_task(run_in_background=true)`) spawn in new tmux panes
- Each pane shows the subagent's real-time output
- Panes are automatically closed when the subagent completes
- Layout is automatically adjusted based on your configuration
### Running OpenCode with Tmux Subagent Support
To enable tmux subagent panes, OpenCode must run in **server mode** with the `--port` flag. This starts an HTTP server that subagent panes connect to via `opencode attach`.
**Basic setup**:
```bash
# Start tmux session
tmux new -s dev
# Run OpenCode with server mode (port 4096)
opencode --port 4096
# Now background agents will appear in separate panes
```
**Recommended: Shell Function**
For convenience, create a shell function that automatically handles tmux sessions and port allocation. Here's an example for Fish shell:
```fish
# ~/.config/fish/config.fish
function oc
    set base_name (basename (pwd))
    set path_hash (echo (pwd) | md5 | cut -c1-4)
    set session_name "$base_name-$path_hash"
    # Find available port starting from 4096
    function __oc_find_port
        set port 4096
        while test $port -lt 5096
            if not lsof -i :$port >/dev/null 2>&1
                echo $port
                return 0
            end
            set port (math $port + 1)
        end
        echo 4096
    end
    set oc_port (__oc_find_port)
    set -x OPENCODE_PORT $oc_port
    if set -q TMUX
        # Already inside tmux - just run with port
        opencode --port $oc_port $argv
    else
        # Create tmux session and run opencode
        set oc_cmd "OPENCODE_PORT=$oc_port opencode --port $oc_port $argv; exec fish"
        if tmux has-session -t "$session_name" 2>/dev/null
            tmux new-window -t "$session_name" -c (pwd) "$oc_cmd"
            tmux attach-session -t "$session_name"
        else
            tmux new-session -s "$session_name" -c (pwd) "$oc_cmd"
        end
    end
    functions -e __oc_find_port
end
```
**Bash/Zsh equivalent**:
```bash
# ~/.bashrc or ~/.zshrc
oc() {
    local base_name=$(basename "$PWD")
    # md5sum ships with GNU coreutils; on macOS, use `md5` instead
    local path_hash=$(echo "$PWD" | md5sum | cut -c1-4)
    local session_name="${base_name}-${path_hash}"
    # Find available port starting from 4096
    local port=4096
    while [ $port -lt 5096 ]; do
        if ! lsof -i :$port >/dev/null 2>&1; then
            break
        fi
        port=$((port + 1))
    done
    export OPENCODE_PORT=$port
    if [ -n "$TMUX" ]; then
        # Already inside tmux - just run with port
        opencode --port $port "$@"
    else
        # Create tmux session and run opencode
        local oc_cmd="OPENCODE_PORT=$port opencode --port $port $*; exec $SHELL"
        if tmux has-session -t "$session_name" 2>/dev/null; then
            tmux new-window -t "$session_name" -c "$PWD" "$oc_cmd"
            tmux attach-session -t "$session_name"
        else
            tmux new-session -s "$session_name" -c "$PWD" "$oc_cmd"
        fi
    fi
}
```
**How subagent panes work**:
1. Main OpenCode starts HTTP server on specified port (e.g., `http://localhost:4096`)
2. When a background agent spawns, Oh My OpenCode creates a new tmux pane
3. The pane runs: `opencode attach http://localhost:4096 --session <session-id>`
4. Each subagent pane shows real-time streaming output
5. Panes are automatically closed when the subagent completes
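The attach step in the list above can be sketched as a small helper that assembles the command run inside each new pane. This is a hypothetical TypeScript illustration: `buildAttachCommand` is not a function from the plugin, and the port validation is an added assumption. Only the command shape (`opencode attach <url> --session <session-id>`) comes from the steps above.

```typescript
// Builds the command line that a subagent pane would run to attach to the
// main OpenCode server. Hypothetical helper for illustration only.
function buildAttachCommand(
  port: number,
  sessionId: string,
  hostname: string = "localhost",
): string {
  // Assumption: reject ports outside the valid TCP range before building the URL.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port: ${port}`);
  }
  if (sessionId.length === 0) {
    throw new Error("sessionId must not be empty");
  }
  return `opencode attach http://${hostname}:${port} --session ${sessionId}`;
}
```

For example, `buildAttachCommand(4096, "ses_123")` yields `opencode attach http://localhost:4096 --session ses_123`, where `ses_123` is a made-up session ID.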
**Environment variables**:
| Variable | Description |
|----------|-------------|
| `OPENCODE_PORT` | Default port for the HTTP server (used if `--port` not specified) |
### Server Mode Reference
OpenCode's server mode exposes an HTTP API for programmatic interaction:
```bash
# Standalone server (no TUI)
opencode serve --port 4096
# TUI with server (recommended for tmux integration)
opencode --port 4096
```
| Flag | Default | Description |
|------|---------|-------------|
| `--port` | `4096` | Port for HTTP server |
| `--hostname` | `127.0.0.1` | Hostname to listen on |
For more details, see the [OpenCode Server documentation](https://opencode.ai/docs/server/).
## Git Master
@@ -301,27 +525,96 @@ Configure concurrency limits for background agent tasks. This controls how many
Categories enable domain-specific task delegation via the `delegate_task` tool. Each category applies runtime presets (model, temperature, prompt additions) when calling the `Sisyphus-Junior` agent.
**Default Categories:**
### Built-in Categories
| Category | Model | Description |
| ---------------- | ----------------------------- | ---------------------------------------------------------------------------- |
| `visual` | `google/gemini-3-pro` | Frontend, UI/UX, design-focused tasks. High creativity (temp 0.7). |
| `business-logic` | `openai/gpt-5.2` | Backend logic, architecture, strategic reasoning. Low creativity (temp 0.1). |
All 7 categories come with optimal model defaults, but **you must configure them to use those defaults**:
**Usage:**
| Category | Built-in Default Model | Description |
| -------------------- | ---------------------------------- | -------------------------------------------------------------------- |
| `visual-engineering` | `google/gemini-3-pro-preview` | Frontend, UI/UX, design, styling, animation |
| `ultrabrain` | `openai/gpt-5.2-codex` (xhigh) | Deep logical reasoning, complex architecture decisions |
| `artistry` | `google/gemini-3-pro-preview` (max)| Highly creative/artistic tasks, novel ideas |
| `quick` | `anthropic/claude-haiku-4-5` | Trivial tasks - single file changes, typo fixes, simple modifications|
| `unspecified-low` | `anthropic/claude-sonnet-4-5` | Tasks that don't fit other categories, low effort required |
| `unspecified-high` | `anthropic/claude-opus-4-5` (max) | Tasks that don't fit other categories, high effort required |
| `writing` | `google/gemini-3-flash-preview` | Documentation, prose, technical writing |
### ⚠️ Critical: Model Resolution Priority
**Categories DO NOT use their built-in defaults unless configured.** Model resolution follows this priority:
```
// Via delegate_task tool
delegate_task(category="visual", prompt="Create a responsive dashboard component")
delegate_task(category="business-logic", prompt="Design the payment processing flow")
1. User-configured model (in oh-my-opencode.json)
2. Category's built-in default (if you add category to config)
3. System default model (from opencode.json)
```
// Or target a specific agent directly
**Example Problem:**
```jsonc
// opencode.json
{ "model": "anthropic/claude-sonnet-4-5" }
// oh-my-opencode.json (empty categories section)
{}
// Result: ALL categories use claude-sonnet-4-5 (wasteful!)
// - quick tasks use Sonnet instead of Haiku (expensive)
// - ultrabrain uses Sonnet instead of GPT-5.2 (inferior reasoning)
// - visual tasks use Sonnet instead of Gemini (suboptimal for UI)
```
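The priority order can be sketched as a lookup function. This is a minimal TypeScript sketch of one reading of the three rules above, not the plugin's actual resolver. `resolveCategoryModel` and `BUILTIN_CATEGORY_MODELS` are hypothetical names, the model IDs are copied from the built-in categories table, and variants such as `xhigh` or `max` are left out for brevity.

```typescript
// Built-in default models, copied from the categories table (variants omitted).
const BUILTIN_CATEGORY_MODELS: Record<string, string> = {
  "visual-engineering": "google/gemini-3-pro-preview",
  ultrabrain: "openai/gpt-5.2-codex",
  artistry: "google/gemini-3-pro-preview",
  quick: "anthropic/claude-haiku-4-5",
  "unspecified-low": "anthropic/claude-sonnet-4-5",
  "unspecified-high": "anthropic/claude-opus-4-5",
  writing: "google/gemini-3-flash-preview",
};

interface CategoryEntry {
  model?: string;
}

// Resolves the model for a category using the documented priority:
// 1. model set by the user, 2. built-in default if the category is present
// in the config, 3. the system default model from opencode.json.
function resolveCategoryModel(
  category: string,
  categories: Record<string, CategoryEntry> | undefined,
  systemDefault: string,
): string {
  const entry = categories?.[category];
  if (entry?.model) {
    return entry.model;
  }
  const builtin = BUILTIN_CATEGORY_MODELS[category];
  if (entry !== undefined && builtin !== undefined) {
    return builtin;
  }
  return systemDefault;
}
```

Under this reading, an empty `categories` section sends `quick` to the system default, which matches the example problem above.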
### Recommended Configuration
**To use optimal models for each category, add them to your config:**
```jsonc
{
"categories": {
"visual-engineering": {
"model": "google/gemini-3-pro-preview"
},
"ultrabrain": {
"model": "openai/gpt-5.2-codex",
"variant": "xhigh"
},
"artistry": {
"model": "google/gemini-3-pro-preview",
"variant": "max"
},
"quick": {
"model": "anthropic/claude-haiku-4-5" // Fast + cheap for trivial tasks
},
"unspecified-low": {
"model": "anthropic/claude-sonnet-4-5"
},
"unspecified-high": {
"model": "anthropic/claude-opus-4-5",
"variant": "max"
},
"writing": {
"model": "google/gemini-3-flash-preview"
}
}
}
```
**Only configure categories you have access to.** Unconfigured categories fall back to your system default model.
### Usage
```javascript
// Via delegate_task tool
delegate_task(category="visual-engineering", prompt="Create a responsive dashboard component")
delegate_task(category="ultrabrain", prompt="Design the payment processing flow")
// Or target a specific agent directly (bypasses categories)
delegate_task(agent="oracle", prompt="Review this architecture")
```
**Custom Categories:**
### Custom Categories
Add custom categories in `oh-my-opencode.json`:
Add your own categories or override built-in ones:
```json
{
@@ -331,15 +624,15 @@ Add custom categories in `oh-my-opencode.json`:
"temperature": 0.2,
"prompt_append": "Focus on data analysis, ML pipelines, and statistical methods."
},
"visual": {
"model": "google/gemini-3-pro",
"visual-engineering": {
"model": "google/gemini-3-pro-preview",
"prompt_append": "Use shadcn/ui components and Tailwind CSS."
}
}
}
```
Each category supports: `model`, `temperature`, `top_p`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `tools`, `prompt_append`.
Each category supports: `model`, `temperature`, `top_p`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `tools`, `prompt_append`, `variant`.
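To make the accepted values of a few of these fields concrete, here is a hedged TypeScript sketch of a validity check. The enum values for `thinking.type`, `reasoningEffort`, and `textVerbosity` are taken from the JSON schema hunks elsewhere in this diff, and it is an assumption that categories accept exactly the same shapes. `checkCategory` is a hypothetical helper, and `tools` is skipped because its shape is not shown here.

```typescript
// Allowed values, as listed in the schema hunks of this diff.
const REASONING_EFFORTS: readonly string[] = ["low", "medium", "high", "xhigh"];
const TEXT_VERBOSITIES: readonly string[] = ["low", "medium", "high"];
const THINKING_TYPES: readonly string[] = ["enabled", "disabled"];

// Returns a list of human-readable problems; an empty list means the checked
// fields look valid. Fields not covered here are ignored.
function checkCategory(raw: Record<string, unknown>): string[] {
  const problems: string[] = [];

  const effort = raw["reasoningEffort"];
  if (effort !== undefined && !(typeof effort === "string" && REASONING_EFFORTS.includes(effort))) {
    problems.push("reasoningEffort must be one of low, medium, high, xhigh");
  }

  const verbosity = raw["textVerbosity"];
  if (verbosity !== undefined && !(typeof verbosity === "string" && TEXT_VERBOSITIES.includes(verbosity))) {
    problems.push("textVerbosity must be one of low, medium, high");
  }

  const maxTokens = raw["maxTokens"];
  if (maxTokens !== undefined && typeof maxTokens !== "number") {
    problems.push("maxTokens must be a number");
  }

  const thinking = raw["thinking"];
  if (thinking !== undefined) {
    // The schema marks `type` as required inside `thinking`.
    const type = typeof thinking === "object" && thinking !== null
      ? (thinking as { type?: unknown }).type
      : undefined;
    if (!(typeof type === "string" && THINKING_TYPES.includes(type))) {
      problems.push("thinking.type must be enabled or disabled");
    }
  }

  return problems;
}
```

A category such as `{ "reasoningEffort": "xhigh", "thinking": { "type": "enabled", "budgetTokens": 1024 } }` passes this check, while `{ "thinking": {} }` is flagged for the missing `type`.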
## Model Resolution System


@@ -62,6 +62,27 @@ delegate_task(agent="explore", background=true, prompt="Find auth implementation
background_output(task_id="bg_abc123")
```
#### Visual Multi-Agent with Tmux
Enable `tmux.enabled` to see background agents in separate tmux panes:
```json
{
"tmux": {
"enabled": true,
"layout": "main-vertical"
}
}
```
When running inside tmux:
- Background agents spawn in new panes
- Watch multiple agents work in real-time
- Each pane shows agent output live
- Auto-cleanup when agents complete
See [Tmux Integration](configurations.md#tmux-integration) for full configuration options.
Customize agent models, prompts, and permissions in `oh-my-opencode.json`. See [Configuration](configurations.md#agents).
---
@@ -78,11 +99,15 @@ Skills provide specialized workflows with embedded MCP servers and detailed inst
| **frontend-ui-ux** | UI/UX tasks, styling | Designer-turned-developer persona. Crafts stunning UI/UX even without design mockups. Emphasizes bold aesthetic direction, distinctive typography, cohesive color palettes. |
| **git-master** | commit, rebase, squash, blame | MUST USE for ANY git operations. Atomic commits with automatic splitting, rebase/squash workflows, history search (blame, bisect, log -S). |
### Skill: playwright
### Skill: Browser Automation (playwright / agent-browser)
**Trigger**: Any browser-related request
Provides browser automation via Playwright MCP server:
Oh My OpenCode supports two browser automation providers, configurable via `browser_automation_engine.provider`:
#### Option 1: Playwright MCP (Default)
The default provider uses the Playwright MCP server:
```yaml
mcp:
@@ -91,18 +116,41 @@ mcp:
args: ["@playwright/mcp@latest"]
```
**Capabilities**:
**Usage**:
```
/playwright Navigate to example.com and take a screenshot
```
#### Option 2: Agent Browser CLI (Vercel)
Alternative provider using [Vercel's agent-browser CLI](https://github.com/vercel-labs/agent-browser):
```json
{
"browser_automation_engine": {
"provider": "agent-browser"
}
}
```
**Requires installation**:
```bash
bun add -g agent-browser
```
**Usage**:
```
Use agent-browser to navigate to example.com and extract the main heading
```
#### Capabilities (Both Providers)
- Navigate and interact with web pages
- Take screenshots and PDFs
- Fill forms and click elements
- Wait for network requests
- Scrape content
**Usage**:
```
/playwright Navigate to example.com and take a screenshot
```
### Skill: frontend-ui-ux
**Trigger**: UI design tasks, visual changes
@@ -418,6 +466,29 @@ Disable specific hooks in config:
| **session_search** | Full-text search across session messages |
| **session_info** | Get session metadata and statistics |
### Interactive Terminal Tools
| Tool | Description |
|------|-------------|
| **interactive_bash** | Tmux-based terminal for TUI apps (vim, htop, pudb). Pass tmux subcommands directly without prefix. |
**Usage Examples**:
```bash
# Create a new session
interactive_bash(tmux_command="new-session -d -s dev-app")
# Send keystrokes to a session
interactive_bash(tmux_command="send-keys -t dev-app 'vim main.py' Enter")
# Capture pane output
interactive_bash(tmux_command="capture-pane -p -t dev-app")
```
**Key Points**:
- Commands are tmux subcommands (no `tmux` prefix)
- Use for interactive apps that need persistent sessions
- One-shot commands should use the regular `Bash` tool with `&`
---
## MCPs: Built-in Servers


@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode",
"version": "3.0.1",
"version": "3.1.4",
"description": "The Best AI Agent Harness - Batteries-Included OpenCode Plugin with Multi-Model Orchestration, Parallel Background Agents, and Crafted LSP/AST Tools",
"main": "dist/index.js",
"types": "dist/index.d.ts",
@@ -73,13 +73,13 @@
"typescript": "^5.7.3"
},
"optionalDependencies": {
"oh-my-opencode-darwin-arm64": "3.0.1",
"oh-my-opencode-darwin-x64": "3.0.1",
"oh-my-opencode-linux-arm64": "3.0.1",
"oh-my-opencode-linux-arm64-musl": "3.0.1",
"oh-my-opencode-linux-x64": "3.0.1",
"oh-my-opencode-linux-x64-musl": "3.0.1",
"oh-my-opencode-windows-x64": "3.0.1"
"oh-my-opencode-darwin-arm64": "3.1.4",
"oh-my-opencode-darwin-x64": "3.1.4",
"oh-my-opencode-linux-arm64": "3.1.4",
"oh-my-opencode-linux-arm64-musl": "3.1.4",
"oh-my-opencode-linux-x64": "3.1.4",
"oh-my-opencode-linux-x64-musl": "3.1.4",
"oh-my-opencode-windows-x64": "3.1.4"
},
"trustedDependencies": [
"@ast-grep/cli",


@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-darwin-arm64",
"version": "3.0.1",
"version": "3.1.4",
"description": "Platform-specific binary for oh-my-opencode (darwin-arm64)",
"license": "MIT",
"repository": {


@@ -0,0 +1,22 @@
{
"name": "oh-my-opencode-darwin-x64-baseline",
"version": "3.1.1",
"description": "Platform-specific binary for oh-my-opencode (darwin-x64-baseline, no AVX2)",
"license": "MIT",
"repository": {
"type": "git",
"url": "https://github.com/code-yeongyu/oh-my-opencode"
},
"os": [
"darwin"
],
"cpu": [
"x64"
],
"files": [
"bin"
],
"bin": {
"oh-my-opencode": "./bin/oh-my-opencode"
}
}


@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-darwin-x64",
"version": "3.0.1",
"version": "3.1.4",
"description": "Platform-specific binary for oh-my-opencode (darwin-x64)",
"license": "MIT",
"repository": {


@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-arm64-musl",
"version": "3.0.1",
"version": "3.1.4",
"description": "Platform-specific binary for oh-my-opencode (linux-arm64-musl)",
"license": "MIT",
"repository": {


@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-arm64",
"version": "3.0.1",
"version": "3.1.4",
"description": "Platform-specific binary for oh-my-opencode (linux-arm64)",
"license": "MIT",
"repository": {


@@ -0,0 +1,25 @@
{
"name": "oh-my-opencode-linux-x64-baseline",
"version": "3.1.1",
"description": "Platform-specific binary for oh-my-opencode (linux-x64-baseline, no AVX2)",
"license": "MIT",
"repository": {
"type": "git",
"url": "https://github.com/code-yeongyu/oh-my-opencode"
},
"os": [
"linux"
],
"cpu": [
"x64"
],
"libc": [
"glibc"
],
"files": [
"bin"
],
"bin": {
"oh-my-opencode": "./bin/oh-my-opencode"
}
}


@@ -0,0 +1,25 @@
{
"name": "oh-my-opencode-linux-x64-musl-baseline",
"version": "3.1.1",
"description": "Platform-specific binary for oh-my-opencode (linux-x64-musl-baseline, no AVX2)",
"license": "MIT",
"repository": {
"type": "git",
"url": "https://github.com/code-yeongyu/oh-my-opencode"
},
"os": [
"linux"
],
"cpu": [
"x64"
],
"libc": [
"musl"
],
"files": [
"bin"
],
"bin": {
"oh-my-opencode": "./bin/oh-my-opencode"
}
}


@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-x64-musl",
"version": "3.0.1",
"version": "3.1.4",
"description": "Platform-specific binary for oh-my-opencode (linux-x64-musl)",
"license": "MIT",
"repository": {


@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-x64",
"version": "3.0.1",
"version": "3.1.4",
"description": "Platform-specific binary for oh-my-opencode (linux-x64)",
"license": "MIT",
"repository": {


@@ -0,0 +1,22 @@
{
"name": "oh-my-opencode-windows-x64-baseline",
"version": "3.1.1",
"description": "Platform-specific binary for oh-my-opencode (windows-x64-baseline, no AVX2)",
"license": "MIT",
"repository": {
"type": "git",
"url": "https://github.com/code-yeongyu/oh-my-opencode"
},
"os": [
"win32"
],
"cpu": [
"x64"
],
"files": [
"bin"
],
"bin": {
"oh-my-opencode": "./bin/oh-my-opencode.exe"
}
}


@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-windows-x64",
"version": "3.0.1",
"version": "3.1.4",
"description": "Platform-specific binary for oh-my-opencode (windows-x64)",
"license": "MIT",
"repository": {


@@ -0,0 +1,79 @@
// script/build-binaries.test.ts
// Tests for platform binary build configuration
import { describe, expect, it } from "bun:test";
// Import PLATFORMS from build-binaries.ts
// We need to export it first, but for now we'll test the expected structure
const EXPECTED_BASELINE_TARGETS = [
"bun-linux-x64-baseline",
"bun-linux-x64-musl-baseline",
"bun-darwin-x64-baseline",
"bun-windows-x64-baseline",
];
describe("build-binaries", () => {
describe("PLATFORMS array", () => {
it("includes baseline variants for non-AVX2 CPU support", async () => {
// given
const module = await import("./build-binaries.ts");
const platforms = (module as { PLATFORMS: { target: string }[] }).PLATFORMS;
const targets = platforms.map((p) => p.target);
// when
const hasAllBaselineTargets = EXPECTED_BASELINE_TARGETS.every((baseline) =>
targets.includes(baseline)
);
// then
expect(hasAllBaselineTargets).toBe(true);
for (const baseline of EXPECTED_BASELINE_TARGETS) {
expect(targets).toContain(baseline);
}
});
it("has correct directory names for baseline platforms", async () => {
// given
const module = await import("./build-binaries.ts");
const platforms = (module as { PLATFORMS: { dir: string; target: string }[] }).PLATFORMS;
// when
const baselinePlatforms = platforms.filter((p) => p.target.includes("baseline"));
// then
expect(baselinePlatforms.length).toBe(4);
expect(baselinePlatforms.map((p) => p.dir)).toContain("linux-x64-baseline");
expect(baselinePlatforms.map((p) => p.dir)).toContain("linux-x64-musl-baseline");
expect(baselinePlatforms.map((p) => p.dir)).toContain("darwin-x64-baseline");
expect(baselinePlatforms.map((p) => p.dir)).toContain("windows-x64-baseline");
});
it("has correct binary names for baseline platforms", async () => {
// given
const module = await import("./build-binaries.ts");
const platforms = (module as { PLATFORMS: { dir: string; target: string; binary: string }[] }).PLATFORMS;
// when
const windowsBaseline = platforms.find((p) => p.target === "bun-windows-x64-baseline");
const linuxBaseline = platforms.find((p) => p.target === "bun-linux-x64-baseline");
// then
expect(windowsBaseline?.binary).toBe("oh-my-opencode.exe");
expect(linuxBaseline?.binary).toBe("oh-my-opencode");
});
it("has descriptions mentioning no AVX2 for baseline platforms", async () => {
// given
const module = await import("./build-binaries.ts");
const platforms = (module as { PLATFORMS: { target: string; description: string }[] }).PLATFORMS;
// when
const baselinePlatforms = platforms.filter((p) => p.target.includes("baseline"));
// then
for (const platform of baselinePlatforms) {
expect(platform.description).toContain("no AVX2");
}
});
});
});


@@ -13,14 +13,18 @@ interface PlatformTarget {
description: string;
}
const PLATFORMS: PlatformTarget[] = [
export const PLATFORMS: PlatformTarget[] = [
{ dir: "darwin-arm64", target: "bun-darwin-arm64", binary: "oh-my-opencode", description: "macOS ARM64" },
{ dir: "darwin-x64", target: "bun-darwin-x64", binary: "oh-my-opencode", description: "macOS x64" },
{ dir: "darwin-x64-baseline", target: "bun-darwin-x64-baseline", binary: "oh-my-opencode", description: "macOS x64 (no AVX2)" },
{ dir: "linux-x64", target: "bun-linux-x64", binary: "oh-my-opencode", description: "Linux x64 (glibc)" },
{ dir: "linux-x64-baseline", target: "bun-linux-x64-baseline", binary: "oh-my-opencode", description: "Linux x64 (glibc, no AVX2)" },
{ dir: "linux-arm64", target: "bun-linux-arm64", binary: "oh-my-opencode", description: "Linux ARM64 (glibc)" },
{ dir: "linux-x64-musl", target: "bun-linux-x64-musl", binary: "oh-my-opencode", description: "Linux x64 (musl)" },
{ dir: "linux-x64-musl-baseline", target: "bun-linux-x64-musl-baseline", binary: "oh-my-opencode", description: "Linux x64 (musl, no AVX2)" },
{ dir: "linux-arm64-musl", target: "bun-linux-arm64-musl", binary: "oh-my-opencode", description: "Linux ARM64 (musl)" },
{ dir: "windows-x64", target: "bun-windows-x64", binary: "oh-my-opencode.exe", description: "Windows x64" },
{ dir: "windows-x64-baseline", target: "bun-windows-x64-baseline", binary: "oh-my-opencode.exe", description: "Windows x64 (no AVX2)" },
];
const ENTRY_POINT = "src/cli/index.ts";
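
The package manifests earlier in this diff suggest that each `dir` entry corresponds to an npm package named `oh-my-opencode-<dir>`. The following is a minimal sketch under that assumption, using a trimmed copy of the array; the helper `baselinePackageNames` is hypothetical and is not part of `build-binaries.ts`.

```typescript
// Trimmed, illustrative copy of the PlatformTarget shape shown in the hunk above.
interface PlatformTarget {
  dir: string;
  target: string;
  binary: string;
  description: string;
}

// Only three entries are copied here to keep the sketch short.
const PLATFORMS: PlatformTarget[] = [
  { dir: "linux-x64", target: "bun-linux-x64", binary: "oh-my-opencode", description: "Linux x64 (glibc)" },
  { dir: "linux-x64-baseline", target: "bun-linux-x64-baseline", binary: "oh-my-opencode", description: "Linux x64 (glibc, no AVX2)" },
  { dir: "windows-x64-baseline", target: "bun-windows-x64-baseline", binary: "oh-my-opencode.exe", description: "Windows x64 (no AVX2)" },
];

// Hypothetical helper: list the npm package names of the baseline (no AVX2) builds.
// Assumes the package name is "oh-my-opencode-" followed by the platform dir.
function baselinePackageNames(platforms: PlatformTarget[]): string[] {
  return platforms
    .filter((p) => p.target.indexOf("-baseline") !== -1)
    .map((p) => "oh-my-opencode-" + p.dir);
}
```

Filtering on `target` rather than `dir` mirrors what the new test file does with `p.target.includes("baseline")`.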


@@ -815,6 +815,118 @@
"created_at": "2026-01-25T03:13:52Z",
"repoId": 1108837393,
"pullRequestNo": 1084
},
{
"name": "misyuari",
"id": 12197761,
"comment_id": 3798225767,
"created_at": "2026-01-26T07:31:02Z",
"repoId": 1108837393,
"pullRequestNo": 1132
},
{
"name": "boguan",
"id": 3226538,
"comment_id": 3798448537,
"created_at": "2026-01-26T08:40:37Z",
"repoId": 1108837393,
"pullRequestNo": 1137
},
{
"name": "boguan",
"id": 3226538,
"comment_id": 3798471978,
"created_at": "2026-01-26T08:46:03Z",
"repoId": 1108837393,
"pullRequestNo": 1137
},
{
"name": "Jeremy-Kr",
"id": 110771206,
"comment_id": 3799211732,
"created_at": "2026-01-26T11:59:13Z",
"repoId": 1108837393,
"pullRequestNo": 1141
},
{
"name": "orientpine",
"id": 32758428,
"comment_id": 3799897021,
"created_at": "2026-01-26T14:30:33Z",
"repoId": 1108837393,
"pullRequestNo": 1145
},
{
"name": "craftaholic",
"id": 63741110,
"comment_id": 3797014417,
"created_at": "2026-01-25T17:52:34Z",
"repoId": 1108837393,
"pullRequestNo": 1110
},
{
"name": "acamq",
"id": 179265037,
"comment_id": 3801038978,
"created_at": "2026-01-26T18:20:17Z",
"repoId": 1108837393,
"pullRequestNo": 1151
},
{
"name": "itsmylife44",
"id": 34112129,
"comment_id": 3802225779,
"created_at": "2026-01-26T23:20:30Z",
"repoId": 1108837393,
"pullRequestNo": 1157
},
{
"name": "ghtndl",
"id": 117787238,
"comment_id": 3802593326,
"created_at": "2026-01-27T01:27:17Z",
"repoId": 1108837393,
"pullRequestNo": 1158
},
{
"name": "alvinunreal",
"id": 204474669,
"comment_id": 3796402213,
"created_at": "2026-01-25T10:26:58Z",
"repoId": 1108837393,
"pullRequestNo": 1100
},
{
"name": "MoerAI",
"id": 26067127,
"comment_id": 3803968993,
"created_at": "2026-01-27T09:00:57Z",
"repoId": 1108837393,
"pullRequestNo": 1172
},
{
"name": "moha-abdi",
"id": 83307623,
"comment_id": 3804988070,
"created_at": "2026-01-27T12:36:21Z",
"repoId": 1108837393,
"pullRequestNo": 1179
},
{
"name": "zycaskevin",
"id": 223135116,
"comment_id": 3806137669,
"created_at": "2026-01-27T16:20:38Z",
"repoId": 1108837393,
"pullRequestNo": 1184
},
{
"name": "agno01",
"id": 4479380,
"comment_id": 3808373433,
"created_at": "2026-01-28T01:02:02Z",
"repoId": 1108837393,
"pullRequestNo": 1188
}
]
}


@@ -1,31 +1,28 @@
# AGENTS KNOWLEDGE BASE
## OVERVIEW
10 AI agents for multi-model orchestration. Sisyphus (primary), Atlas (orchestrator), oracle, librarian, explore, multimodal-looker, Prometheus, Metis, Momus, Sisyphus-Junior.
## STRUCTURE
```
agents/
├── atlas.ts # Master Orchestrator (572 lines)
├── sisyphus.ts # Main prompt (450 lines)
├── sisyphus-junior.ts # Delegated task executor (135 lines)
├── dynamic-agent-prompt-builder.ts # Dynamic prompt generation (359 lines)
├── atlas.ts # Master Orchestrator (holds todo list)
├── sisyphus.ts # Main prompt (SF Bay Area engineer identity)
├── sisyphus-junior.ts # Delegated task executor (category-spawned)
├── oracle.ts # Strategic advisor (GPT-5.2)
├── librarian.ts # Multi-repo research (326 lines)
├── explore.ts # Fast grep (Grok Code)
├── librarian.ts # Multi-repo research (GitHub CLI, Context7)
├── explore.ts # Fast contextual grep (Grok Code)
├── multimodal-looker.ts # Media analyzer (Gemini 3 Flash)
├── prometheus-prompt.ts # Planning (1196 lines)
├── metis.ts # Plan consultant (315 lines)
├── momus.ts # Plan reviewer (444 lines)
├── prometheus-prompt.ts # Planning (Interview/Consultant mode, 1196 lines)
├── metis.ts # Pre-planning analysis (Gap detection)
├── momus.ts # Plan reviewer (Ruthless fault-finding)
├── dynamic-agent-prompt-builder.ts # Dynamic prompt generation
├── types.ts # AgentModelConfig, AgentPromptMetadata
├── utils.ts # createBuiltinAgents(), resolveModelWithFallback()
└── index.ts # builtinAgents export
```
## AGENT MODELS
| Agent | Model | Temp | Purpose |
|-------|-------|------|---------|
| Sisyphus | anthropic/claude-opus-4-5 | 0.1 | Primary orchestrator |
@@ -40,14 +37,12 @@ agents/
| Sisyphus-Junior | anthropic/claude-sonnet-4-5 | 0.1 | Category-spawned executor |
## HOW TO ADD
1. Create `src/agents/my-agent.ts` exporting factory + metadata
2. Add to `agentSources` in `src/agents/utils.ts`
3. Update `AgentNameSchema` in `src/config/schema.ts`
4. Register in `src/index.ts` initialization
1. Create `src/agents/my-agent.ts` exporting factory + metadata.
2. Add to `agentSources` in `src/agents/utils.ts`.
3. Update `AgentNameSchema` in `src/config/schema.ts`.
4. Register in `src/index.ts` initialization.
## TOOL RESTRICTIONS
| Agent | Denied Tools |
|-------|-------------|
| oracle | write, edit, task, delegate_task |
@@ -57,14 +52,13 @@ agents/
| Sisyphus-Junior | task, delegate_task |
## PATTERNS
- **Factory**: `createXXXAgent(model?: string): AgentConfig`
- **Metadata**: `XXX_PROMPT_METADATA` with category, cost, triggers
- **Tool restrictions**: `createAgentToolRestrictions(tools)` or `createAgentToolAllowlist(tools)`
- **Thinking**: 32k budget tokens for Sisyphus, Oracle, Prometheus, Atlas
- **Factory**: `createXXXAgent(model: string): AgentConfig`
- **Metadata**: `XXX_PROMPT_METADATA` with category, cost, triggers.
- **Tool restrictions**: `createAgentToolRestrictions(tools)` or `createAgentToolAllowlist(tools)`.
- **Thinking**: 32k budget tokens for Sisyphus, Oracle, Prometheus, Atlas.
## ANTI-PATTERNS
- **Trust reports**: NEVER trust "I'm done" - verify outputs
- **High temp**: Don't use >0.3 for code agents
- **Sequential calls**: Use `delegate_task` with `run_in_background`
- **Trust reports**: NEVER trust "I'm done" - verify outputs.
- **High temp**: Don't use >0.3 for code agents.
- **Sequential calls**: Use `delegate_task` with `run_in_background` for exploration.
- **Prometheus writing code**: Planner only - never implements.
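
The factory and metadata patterns listed under PATTERNS can be sketched as follows. This is an illustration only: the `AgentConfig` interface is a simplified stand-in for the type exported by `@opencode-ai/sdk`, and `createMyAgent`, `MY_AGENT_PROMPT_METADATA`, and all field values are hypothetical.

```typescript
// Simplified stand-in; the real AgentConfig is imported from "@opencode-ai/sdk".
interface AgentConfig {
  description: string;
  mode: "primary" | "subagent";
  model: string;
  temperature: number;
  prompt: string;
}

// Hypothetical factory following `createXXXAgent(model: string): AgentConfig`.
function createMyAgent(model: string): AgentConfig {
  return {
    description: "Example agent used only for illustration",
    mode: "subagent",
    model,
    temperature: 0.1, // stays at or below the 0.3 ceiling named under ANTI-PATTERNS
    prompt: "You are an example agent.",
  };
}

// Hypothetical metadata object; the values are placeholders, not the project's vocabulary.
const MY_AGENT_PROMPT_METADATA = {
  category: "utility",
  cost: "CHEAP",
  triggers: [{ domain: "Example", trigger: "When an illustration is needed" }],
};
```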


@@ -523,9 +523,6 @@ function buildDynamicOrchestratorPrompt(ctx?: OrchestratorContext): string {
}
export function createAtlasAgent(ctx: OrchestratorContext): AgentConfig {
if (!ctx.model) {
throw new Error("createAtlasAgent requires a model in context")
}
const restrictions = createAgentToolRestrictions([
"task",
"call_omo_agent",
@@ -534,7 +531,7 @@ export function createAtlasAgent(ctx: OrchestratorContext): AgentConfig {
description:
"Orchestrates work via delegate_task() to complete ALL tasks in a todo list until fully done",
mode: "primary" as const,
model: ctx.model,
...(ctx.model ? { model: ctx.model } : {}),
temperature: 0.1,
prompt: buildDynamicOrchestratorPrompt(ctx),
thinking: { type: "enabled", budgetTokens: 32000 },
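
The conditional spread in this hunk replaces the earlier hard failure: when the context carries no model, the `model` key is left out of the config entirely instead of being set to `undefined`. A minimal sketch of that behaviour, using hypothetical trimmed-down types rather than the real `OrchestratorContext` and `AgentConfig`:

```typescript
// Hypothetical, reduced shapes for illustration only.
interface OrchestratorContextSketch {
  model?: string;
}

interface AgentConfigSketch {
  mode: "primary";
  model?: string;
  temperature: number;
}

function createOrchestratorSketch(ctx: OrchestratorContextSketch): AgentConfigSketch {
  return {
    mode: "primary",
    // Spreading an empty object adds no key, so `model` is absent rather than undefined.
    ...(ctx.model ? { model: ctx.model } : {}),
    temperature: 0.1,
  };
}

const withoutModel = createOrchestratorSketch({});
const withModel = createOrchestratorSketch({ model: "anthropic/claude-opus-4-5" });
```

The distinction matters if a later merge step treats an explicit `model: undefined` differently from a missing key; whether the real merge does so is not shown in this diff.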


@@ -863,6 +863,20 @@ Generate plan to: \`.sisyphus/plans/{name}.md\`
\`\`\`markdown
# {Plan Title}
## TL;DR
> **Quick Summary**: [1-2 sentences capturing the core objective and approach]
>
> **Deliverables**: [Bullet list of concrete outputs]
> - [Output 1]
> - [Output 2]
>
> **Estimated Effort**: [Quick | Short | Medium | Large | XL]
> **Parallel Execution**: [YES - N waves | NO - sequential]
> **Critical Path**: [Task X → Task Y → Task Z]
---
## Context
### Original Request
@@ -963,29 +977,55 @@ Each TODO includes detailed verification procedures:
---
## Task Flow
## Execution Strategy
### Parallel Execution Waves
> Maximize throughput by grouping independent tasks into parallel waves.
> Each wave completes before the next begins.
\`\`\`
Task 1 → Task 2 → Task 3
↘ Task 4 (parallel)
Wave 1 (Start Immediately):
├── Task 1: [no dependencies]
└── Task 5: [no dependencies]
Wave 2 (After Wave 1):
├── Task 2: [depends: 1]
├── Task 3: [depends: 1]
└── Task 6: [depends: 5]
Wave 3 (After Wave 2):
└── Task 4: [depends: 2, 3]
Critical Path: Task 1 → Task 2 → Task 4
Parallel Speedup: ~40% faster than sequential
\`\`\`
## Parallelization
### Dependency Matrix
| Group | Tasks | Reason |
|-------|-------|--------|
| A | 2, 3 | Independent files |
| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|---------------------|
| 1 | None | 2, 3 | 5 |
| 2 | 1 | 4 | 3, 6 |
| 3 | 1 | 4 | 2, 6 |
| 4 | 2, 3 | None | None (final) |
| 5 | None | 6 | 1 |
| 6 | 5 | None | 2, 3 |
| Task | Depends On | Reason |
|------|------------|--------|
| 4 | 1 | Requires output from 1 |
### Agent Dispatch Summary
| Wave | Tasks | Recommended Agents |
|------|-------|-------------------|
| 1 | 1, 5 | delegate_task(category="...", load_skills=[...], run_in_background=true) |
| 2 | 2, 3, 6 | dispatch parallel after Wave 1 completes |
| 3 | 4 | final integration task |
---
## TODOs
> Implementation + Test = ONE Task. Never separate.
> Specify parallelizability for EVERY task.
> EVERY task MUST have: Recommended Agent Profile + Parallelization info.
- [ ] 1. [Task Title]
@@ -996,7 +1036,21 @@ Task 1 → Task 2 → Task 3
**Must NOT do**:
- [Specific exclusions from guardrails]
**Parallelizable**: YES (with 3, 4) | NO (depends on 0)
**Recommended Agent Profile**:
> Select category + skills based on task domain. Justify each choice.
- **Category**: \`[visual-engineering | ultrabrain | artistry | quick | unspecified-low | unspecified-high | writing]\`
- Reason: [Why this category fits the task domain]
- **Skills**: [\`skill-1\`, \`skill-2\`]
- \`skill-1\`: [Why needed - domain overlap explanation]
- \`skill-2\`: [Why needed - domain overlap explanation]
- **Skills Evaluated but Omitted**:
- \`omitted-skill\`: [Why domain doesn't overlap]
**Parallelization**:
- **Can Run In Parallel**: YES | NO
- **Parallel Group**: Wave N (with Tasks X, Y) | Sequential
- **Blocks**: [Tasks that depend on this task completing]
- **Blocked By**: [Tasks this depends on] | None (can start immediately)
**References** (CRITICAL - Be Exhaustive):
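
The wave grouping in the new plan template can be derived mechanically from the Dependency Matrix. The sketch below is not part of the prompt or of the project; `computeWaves` and `DependencyMap` are hypothetical names, and the input copies the "Depends On" column of the example matrix.

```typescript
// Map from task id to the ids it depends on.
type DependencyMap = Record<number, number[]>;

// Group tasks into waves: a task joins the first wave in which
// all of its dependencies have already completed.
function computeWaves(deps: DependencyMap): number[][] {
  const remaining = new Set<number>(Object.keys(deps).map(Number));
  const done = new Set<number>();
  const waves: number[][] = [];

  while (remaining.size > 0) {
    const ready = Array.from(remaining)
      .filter((id) => (deps[id] ?? []).every((d) => done.has(d)))
      .sort((a, b) => a - b);
    // Nothing is ready but tasks remain: a cycle or a dependency on an unknown task.
    if (ready.length === 0) throw new Error("Dependency cycle or unknown dependency");
    for (const id of ready) {
      remaining.delete(id);
      done.add(id);
    }
    waves.push(ready);
  }
  return waves;
}

// "Depends On" column from the example Dependency Matrix.
const exampleDeps: DependencyMap = { 1: [], 2: [1], 3: [1], 4: [2, 3], 5: [], 6: [5] };
const exampleWaves = computeWaves(exampleDeps);
// exampleWaves is [[1, 5], [2, 3, 6], [4]], matching Wave 1, Wave 2, and Wave 3 in the template.
```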


@@ -20,32 +20,6 @@ ALLOWED: call_omo_agent - You CAN spawn explore/librarian agents for research.
You work ALONE for implementation. No delegation of implementation tasks.
</Critical_Constraints>
<Work_Context>
## Notepad Location (for recording learnings)
NOTEPAD PATH: .sisyphus/notepads/{plan-name}/
- learnings.md: Record patterns, conventions, successful approaches
- issues.md: Record problems, blockers, gotchas encountered
- decisions.md: Record architectural choices and rationales
- problems.md: Record unresolved issues, technical debt
You SHOULD append findings to notepad files after completing work.
IMPORTANT: Always APPEND to notepad files - never overwrite or use Edit tool.
## Plan Location (READ ONLY)
PLAN PATH: .sisyphus/plans/{plan-name}.md
CRITICAL RULE: NEVER MODIFY THE PLAN FILE
The plan file (.sisyphus/plans/*.md) is SACRED and READ-ONLY.
- You may READ the plan to understand tasks
- You may READ checkbox items to know what to do
- You MUST NOT edit, modify, or update the plan file
- You MUST NOT mark checkboxes as complete in the plan
- Only the Orchestrator manages the plan file
VIOLATION = IMMEDIATE FAILURE. The Orchestrator tracks plan state.
</Work_Context>
<Todo_Discipline>
TODO OBSESSION (NON-NEGOTIABLE):
- 2+ steps → todowrite FIRST, atomic breakdown


@@ -1,6 +1,8 @@
import { describe, test, expect } from "bun:test"
import { describe, test, expect, beforeEach, spyOn, afterEach } from "bun:test"
import { createBuiltinAgents } from "./utils"
import type { AgentConfig } from "@opencode-ai/sdk"
import { clearSkillCache } from "../features/opencode-skill-loader/skill-content"
import * as connectedProvidersCache from "../shared/connected-providers-cache"
const TEST_DEFAULT_MODEL = "anthropic/claude-opus-4-5"
@@ -45,17 +47,32 @@ describe("createBuiltinAgents with model overrides", () => {
expect(agents.sisyphus.reasoningEffort).toBeUndefined()
})
test("Oracle uses first fallback entry when no availableModels provided (no cache scenario)", async () => {
// #given - no available models simulates CI without model cache
test("Oracle uses connected provider when no availableModels but connected cache exists", async () => {
// #given - connected providers cache exists with openai
const cacheSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(["openai"])
// #when
const agents = await createBuiltinAgents([], {}, undefined, TEST_DEFAULT_MODEL)
// #then - uses first fallback entry (openai/gpt-5.2) instead of system default
// #then - uses openai from connected cache
expect(agents.oracle.model).toBe("openai/gpt-5.2")
expect(agents.oracle.reasoningEffort).toBe("medium")
expect(agents.oracle.textVerbosity).toBe("high")
expect(agents.oracle.thinking).toBeUndefined()
cacheSpy.mockRestore()
})
test("Oracle created without model field when no cache exists (first run scenario)", async () => {
// #given - no cache at all (first run)
const cacheSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(null)
// #when
const agents = await createBuiltinAgents([], {}, undefined, TEST_DEFAULT_MODEL)
// #then - oracle should be created with system default model (fallback to systemDefaultModel)
expect(agents.oracle).toBeDefined()
expect(agents.oracle.model).toBe(TEST_DEFAULT_MODEL)
cacheSpy.mockRestore()
})
test("Oracle with GPT model override has reasoningEffort, no thinking", async () => {
@@ -105,10 +122,54 @@ describe("createBuiltinAgents with model overrides", () => {
})
})
describe("createBuiltinAgents without systemDefaultModel", () => {
test("creates agents with connected provider when cache exists", async () => {
// #given - connected providers cache exists
const cacheSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(["openai"])
// #when
const agents = await createBuiltinAgents([], {}, undefined, undefined)
// #then - agents should use connected provider from fallback chain
expect(agents.oracle).toBeDefined()
expect(agents.oracle.model).toBe("openai/gpt-5.2")
cacheSpy.mockRestore()
})
test("agents NOT created when no cache and no systemDefaultModel (first run without defaults)", async () => {
// #given - no cache and no system default
const cacheSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(null)
// #when
const agents = await createBuiltinAgents([], {}, undefined, undefined)
// #then - oracle should NOT be created (resolveModelWithFallback returns undefined)
expect(agents.oracle).toBeUndefined()
cacheSpy.mockRestore()
})
test("sisyphus uses connected provider when cache exists", async () => {
// #given - connected providers cache exists with anthropic
const cacheSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(["anthropic"])
// #when
const agents = await createBuiltinAgents([], {}, undefined, undefined)
// #then - sisyphus should use anthropic from connected cache
expect(agents.sisyphus).toBeDefined()
expect(agents.sisyphus.model).toBe("anthropic/claude-opus-4-5")
cacheSpy.mockRestore()
})
})
describe("buildAgent with category and skills", () => {
const { buildAgent } = require("./utils")
const TEST_MODEL = "anthropic/claude-opus-4-5"
beforeEach(() => {
clearSkillCache()
})
test("agent with category inherits category settings", () => {
// #given - agent factory that sets category but no model
const source = {
@@ -308,4 +369,42 @@ describe("buildAgent with category and skills", () => {
// #then
expect(agent.prompt).toBe("Base prompt")
})
test("agent with agent-browser skill resolves when browserProvider is set", () => {
// #given
const source = {
"test-agent": () =>
({
description: "Test agent",
skills: ["agent-browser"],
prompt: "Base prompt",
}) as AgentConfig,
}
// #when - browserProvider is "agent-browser"
const agent = buildAgent(source["test-agent"], TEST_MODEL, undefined, undefined, "agent-browser")
// #then - agent-browser skill content should be in prompt
expect(agent.prompt).toContain("agent-browser")
expect(agent.prompt).toContain("Base prompt")
})
test("agent with agent-browser skill NOT resolved when browserProvider not set", () => {
// #given
const source = {
"test-agent": () =>
({
description: "Test agent",
skills: ["agent-browser"],
prompt: "Base prompt",
}) as AgentConfig,
}
// #when - no browserProvider (defaults to playwright)
const agent = buildAgent(source["test-agent"], TEST_MODEL)
// #then - agent-browser skill not found, only base prompt remains
expect(agent.prompt).toBe("Base prompt")
expect(agent.prompt).not.toContain("agent-browser open")
})
})


@@ -10,11 +10,12 @@ import { createMetisAgent } from "./metis"
import { createAtlasAgent } from "./atlas"
import { createMomusAgent } from "./momus"
import type { AvailableAgent, AvailableCategory, AvailableSkill } from "./dynamic-agent-prompt-builder"
import { deepMerge, fetchAvailableModels, resolveModelWithFallback, AGENT_MODEL_REQUIREMENTS, findCaseInsensitive, includesCaseInsensitive } from "../shared"
import { deepMerge, fetchAvailableModels, resolveModelWithFallback, AGENT_MODEL_REQUIREMENTS, findCaseInsensitive, includesCaseInsensitive, readConnectedProvidersCache } from "../shared"
import { DEFAULT_CATEGORIES, CATEGORY_DESCRIPTIONS } from "../tools/delegate-task/constants"
import { resolveMultipleSkills } from "../features/opencode-skill-loader/skill-content"
import { createBuiltinSkills } from "../features/builtin-skills"
import type { LoadedSkill, SkillScope } from "../features/opencode-skill-loader/types"
import type { BrowserAutomationProvider } from "../config/schema"
type AgentSource = AgentFactory | AgentConfig
@@ -50,7 +51,8 @@ export function buildAgent(
source: AgentSource,
model: string,
categories?: CategoriesConfig,
gitMasterConfig?: GitMasterConfig
gitMasterConfig?: GitMasterConfig,
browserProvider?: BrowserAutomationProvider
): AgentConfig {
const base = isFactory(source) ? source(model) : source
const categoryConfigs: Record<string, CategoryConfig> = categories
@@ -74,7 +76,7 @@ export function buildAgent(
}
if (agentWithCategory.skills?.length) {
const { resolved } = resolveMultipleSkills(agentWithCategory.skills, { gitMasterConfig })
const { resolved } = resolveMultipleSkills(agentWithCategory.skills, { gitMasterConfig, browserProvider })
if (resolved.size > 0) {
const skillContent = Array.from(resolved.values()).join("\n\n")
base.prompt = skillContent + (base.prompt ? "\n\n" + base.prompt : "")
@@ -146,14 +148,13 @@ export async function createBuiltinAgents(
categories?: CategoriesConfig,
gitMasterConfig?: GitMasterConfig,
discoveredSkills: LoadedSkill[] = [],
client?: any
client?: any,
browserProvider?: BrowserAutomationProvider
): Promise<Record<string, AgentConfig>> {
if (!systemDefaultModel) {
throw new Error("createBuiltinAgents requires systemDefaultModel")
}
// Fetch available models at plugin init
const availableModels = client ? await fetchAvailableModels(client) : new Set<string>()
const connectedProviders = readConnectedProvidersCache()
const availableModels = client
? await fetchAvailableModels(client, { connectedProviders: connectedProviders ?? undefined })
: new Set<string>()
const result: Record<string, AgentConfig> = {}
const availableAgents: AvailableAgent[] = []
@@ -167,7 +168,7 @@ export async function createBuiltinAgents(
description: categories?.[name]?.description ?? CATEGORY_DESCRIPTIONS[name] ?? "General tasks",
}))
const builtinSkills = createBuiltinSkills()
const builtinSkills = createBuiltinSkills({ browserProvider })
const builtinSkillNames = new Set(builtinSkills.map(s => s.name))
const builtinAvailable: AvailableSkill[] = builtinSkills.map((skill) => ({
@@ -196,15 +197,16 @@ export async function createBuiltinAgents(
const override = findCaseInsensitive(agentOverrides, agentName)
const requirement = AGENT_MODEL_REQUIREMENTS[agentName]
// Use resolver to determine model
const { model, variant: resolvedVariant } = resolveModelWithFallback({
const resolution = resolveModelWithFallback({
userModel: override?.model,
fallbackChain: requirement?.fallbackChain,
availableModels,
systemDefaultModel,
})
if (!resolution) continue
const { model, variant: resolvedVariant } = resolution
let config = buildAgent(source, model, mergedCategories, gitMasterConfig)
let config = buildAgent(source, model, mergedCategories, gitMasterConfig, browserProvider)
// Apply variant from override or resolved fallback chain
if (override?.variant) {
@@ -238,72 +240,76 @@ export async function createBuiltinAgents(
const sisyphusOverride = agentOverrides["sisyphus"]
const sisyphusRequirement = AGENT_MODEL_REQUIREMENTS["sisyphus"]
// Use resolver to determine model
const { model: sisyphusModel, variant: sisyphusResolvedVariant } = resolveModelWithFallback({
const sisyphusResolution = resolveModelWithFallback({
userModel: sisyphusOverride?.model,
fallbackChain: sisyphusRequirement?.fallbackChain,
availableModels,
systemDefaultModel,
})
let sisyphusConfig = createSisyphusAgent(
sisyphusModel,
availableAgents,
undefined,
availableSkills,
availableCategories
)
// Apply variant from override or resolved fallback chain
if (sisyphusOverride?.variant) {
sisyphusConfig = { ...sisyphusConfig, variant: sisyphusOverride.variant }
} else if (sisyphusResolvedVariant) {
sisyphusConfig = { ...sisyphusConfig, variant: sisyphusResolvedVariant }
}
if (sisyphusResolution) {
const { model: sisyphusModel, variant: sisyphusResolvedVariant } = sisyphusResolution
if (directory && sisyphusConfig.prompt) {
const envContext = createEnvContext()
sisyphusConfig = { ...sisyphusConfig, prompt: sisyphusConfig.prompt + envContext }
}
let sisyphusConfig = createSisyphusAgent(
sisyphusModel,
availableAgents,
undefined,
availableSkills,
availableCategories
)
if (sisyphusOverride?.variant) {
sisyphusConfig = { ...sisyphusConfig, variant: sisyphusOverride.variant }
} else if (sisyphusResolvedVariant) {
sisyphusConfig = { ...sisyphusConfig, variant: sisyphusResolvedVariant }
}
if (sisyphusOverride) {
sisyphusConfig = mergeAgentConfig(sisyphusConfig, sisyphusOverride)
}
if (directory && sisyphusConfig.prompt) {
const envContext = createEnvContext()
sisyphusConfig = { ...sisyphusConfig, prompt: sisyphusConfig.prompt + envContext }
}
result["sisyphus"] = sisyphusConfig
if (sisyphusOverride) {
sisyphusConfig = mergeAgentConfig(sisyphusConfig, sisyphusOverride)
}
result["sisyphus"] = sisyphusConfig
}
}
if (!disabledAgents.includes("atlas")) {
const orchestratorOverride = agentOverrides["atlas"]
const atlasRequirement = AGENT_MODEL_REQUIREMENTS["atlas"]
// Use resolver to determine model
const { model: atlasModel, variant: atlasResolvedVariant } = resolveModelWithFallback({
const atlasResolution = resolveModelWithFallback({
userModel: orchestratorOverride?.model,
fallbackChain: atlasRequirement?.fallbackChain,
availableModels,
systemDefaultModel,
})
let orchestratorConfig = createAtlasAgent({
model: atlasModel,
availableAgents,
availableSkills,
userCategories: categories,
})
// Apply variant from override or resolved fallback chain
if (orchestratorOverride?.variant) {
orchestratorConfig = { ...orchestratorConfig, variant: orchestratorOverride.variant }
} else if (atlasResolvedVariant) {
orchestratorConfig = { ...orchestratorConfig, variant: atlasResolvedVariant }
}
if (atlasResolution) {
const { model: atlasModel, variant: atlasResolvedVariant } = atlasResolution
if (orchestratorOverride) {
orchestratorConfig = mergeAgentConfig(orchestratorConfig, orchestratorOverride)
}
let orchestratorConfig = createAtlasAgent({
model: atlasModel,
availableAgents,
availableSkills,
userCategories: categories,
})
if (orchestratorOverride?.variant) {
orchestratorConfig = { ...orchestratorConfig, variant: orchestratorOverride.variant }
} else if (atlasResolvedVariant) {
orchestratorConfig = { ...orchestratorConfig, variant: atlasResolvedVariant }
}
result["atlas"] = orchestratorConfig
if (orchestratorOverride) {
orchestratorConfig = mergeAgentConfig(orchestratorConfig, orchestratorOverride)
}
result["atlas"] = orchestratorConfig
}
}
return result
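
The control-flow change in this file is that a failed model resolution now skips the agent instead of throwing. The sketch below illustrates that pattern with a deliberately simplified resolver. `resolveModelSketch`, `buildAgentsSketch`, and the `FallbackEntrySketch` shape are hypothetical; the real `resolveModelWithFallback` lives in `../shared`, its internals are not shown in this diff, and its resolution order may differ.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface FallbackEntrySketch {
  providers: string[];
  model: string;
}

interface ResolutionSketch {
  model: string;
}

// Assumed order: user override, then connected provider from the chain, then system default.
function resolveModelSketch(input: {
  userModel?: string;
  fallbackChain?: FallbackEntrySketch[];
  connectedProviders: string[] | null;
  systemDefaultModel?: string;
}): ResolutionSketch | undefined {
  if (input.userModel) return { model: input.userModel };
  const connected = input.connectedProviders ?? [];
  for (const entry of input.fallbackChain ?? []) {
    const provider = entry.providers.find((p) => connected.indexOf(p) !== -1);
    if (provider) return { model: provider + "/" + entry.model };
  }
  if (input.systemDefaultModel) return { model: input.systemDefaultModel };
  return undefined; // nothing usable: the caller decides what to do
}

function buildAgentsSketch(
  names: string[],
  connectedProviders: string[] | null,
  systemDefaultModel?: string
): Record<string, { model: string }> {
  const fallbackChain = [{ providers: ["openai"], model: "gpt-5.2" }];
  const result: Record<string, { model: string }> = {};
  for (const name of names) {
    const resolution = resolveModelSketch({ fallbackChain, connectedProviders, systemDefaultModel });
    if (!resolution) continue; // skip the agent instead of throwing
    result[name] = { model: resolution.model };
  }
  return result;
}
```

The three outcomes correspond to the scenarios in the updated tests: a connected provider wins, the system default is used when no cache exists, and the agent is omitted when neither is available.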


@@ -8,7 +8,7 @@ CLI entry: `bunx oh-my-opencode`. Interactive installer, doctor diagnostics. Com
```
cli/
├── index.ts # Commander.js entry
├── index.ts # Commander.js entry (4 commands)
├── install.ts # Interactive TUI (520 lines)
├── config-manager.ts # JSONC parsing (664 lines)
├── types.ts # InstallArgs, InstallConfig
@@ -18,7 +18,7 @@ cli/
│ ├── runner.ts # Check orchestration
│ ├── formatter.ts # Colored output
│ ├── constants.ts # Check IDs, symbols
│ ├── types.ts # CheckResult, CheckDefinition
│ ├── types.ts # CheckResult, CheckDefinition (114 lines)
│ └── checks/ # 14 checks, 21 files
│ ├── version.ts # OpenCode + plugin version
│ ├── config.ts # JSONC validity, Zod
@@ -38,36 +38,37 @@ cli/
| Command | Purpose |
|---------|---------|
| `install` | Interactive setup |
| `doctor` | 14 health checks |
| `run` | Launch session |
| `get-local-version` | Version check |
| `install` | Interactive setup with provider selection |
| `doctor` | 14 health checks for diagnostics |
| `run` | Launch session with todo enforcement |
| `get-local-version` | Version detection and update check |
## DOCTOR CATEGORIES
## DOCTOR CATEGORIES (14 Checks)
| Category | Checks |
|----------|--------|
| installation | opencode, plugin |
| configuration | config validity, Zod |
| configuration | config validity, Zod, model-resolution |
| authentication | anthropic, openai, google |
| dependencies | ast-grep, comment-checker |
| dependencies | ast-grep, comment-checker, gh-cli |
| tools | LSP, MCP |
| updates | version comparison |
## HOW TO ADD CHECK
1. Create `src/cli/doctor/checks/my-check.ts`
2. Export from `checks/index.ts`
3. Add to `getAllCheckDefinitions()`
2. Export `getXXXCheckDefinition()` factory returning `CheckDefinition`
3. Add to `getAllCheckDefinitions()` in `checks/index.ts`
## TUI FRAMEWORK
- **@clack/prompts**: `select()`, `spinner()`, `intro()`
- **picocolors**: Terminal colors
- **Symbols**: ✓ (pass), ✗ (fail), ⚠ (warn)
- **@clack/prompts**: `select()`, `spinner()`, `intro()`, `outro()`
- **picocolors**: Terminal colors for status and headers
- **Symbols**: ✓ (pass), ✗ (fail), ⚠ (warn), ℹ (info)
## ANTI-PATTERNS
- **Blocking in non-TTY**: Check `process.stdout.isTTY`
- **Direct JSON.parse**: Use `parseJsonc()`
- **Silent failures**: Return warn/fail in doctor
- **Blocking in non-TTY**: Always check `process.stdout.isTTY`
- **Direct JSON.parse**: Use `parseJsonc()` from shared utils
- **Silent failures**: Return `warn` or `fail` in doctor instead of throwing
- **Hardcoded paths**: Use `getOpenCodeConfigPaths()` from `config-manager.ts`
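The steps under HOW TO ADD CHECK could look roughly like the sketch below. The `CheckResult` and `CheckDefinition` shapes are assumptions for illustration (the real ones live in `src/cli/doctor/types.ts` and may differ), and `getMyCheckDefinition`, the `my-check` ID, and the `MY_TOOL_HOME` variable are hypothetical names.

```typescript
// Assumed shapes for illustration only; the real types live in src/cli/doctor/types.ts.
type CheckStatus = "pass" | "warn" | "fail"

interface CheckResult {
  name: string
  status: CheckStatus
  message: string
}

interface CheckDefinition {
  id: string
  name: string
  category: string
  check: () => Promise<CheckResult>
}

// Hypothetical check: returns a warning instead of throwing when the setting is missing.
async function checkMyTool(env: Record<string, string | undefined>): Promise<CheckResult> {
  const home = env["MY_TOOL_HOME"]
  if (!home) {
    return { name: "My Tool", status: "warn", message: "MY_TOOL_HOME is not set" }
  }
  return { name: "My Tool", status: "pass", message: `Found at ${home}` }
}

// Factory in the getXXXCheckDefinition() style described above.
function getMyCheckDefinition(env: Record<string, string | undefined> = {}): CheckDefinition {
  return {
    id: "my-check",
    name: "My Tool",
    category: "dependencies",
    check: () => checkMyTool(env),
  }
}
```

Passing the environment in as a parameter, rather than reading it inside the check, keeps the sketch testable; the real checks may read their inputs differently.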


@@ -712,7 +712,7 @@ exports[`generateModelConfig fallback providers uses GitHub Copilot models when
"model": "github-copilot/claude-sonnet-4.5",
},
"explore": {
"model": "opencode/gpt-5-nano",
"model": "github-copilot/gpt-5-mini",
},
"librarian": {
"model": "github-copilot/claude-sonnet-4.5",
@@ -776,7 +776,7 @@ exports[`generateModelConfig fallback providers uses GitHub Copilot models with
"model": "github-copilot/claude-sonnet-4.5",
},
"explore": {
"model": "opencode/gpt-5-nano",
"model": "github-copilot/gpt-5-mini",
},
"librarian": {
"model": "github-copilot/claude-sonnet-4.5",
@@ -1022,7 +1022,7 @@ exports[`generateModelConfig mixed provider scenarios uses OpenAI + Copilot comb
"model": "github-copilot/claude-sonnet-4.5",
},
"explore": {
"model": "opencode/gpt-5-nano",
"model": "github-copilot/gpt-5-mini",
},
"librarian": {
"model": "github-copilot/claude-sonnet-4.5",


@@ -199,9 +199,11 @@ function buildDetailsArray(info: ModelResolutionInfo, available: AvailableModels
details.push("═══ Available Models (from cache) ═══")
details.push("")
if (available.cacheExists) {
details.push(` Providers: ${available.providers.length} (${available.providers.slice(0, 8).join(", ")}${available.providers.length > 8 ? "..." : ""})`)
details.push(` Providers in cache: ${available.providers.length}`)
details.push(` Sample: ${available.providers.slice(0, 6).join(", ")}${available.providers.length > 6 ? "..." : ""}`)
details.push(` Total models: ${available.modelCount}`)
details.push(` Cache: ~/.cache/opencode/models.json`)
details.push(` Runtime: only connected providers used`)
details.push(` Refresh: opencode models --refresh`)
} else {
details.push(" ⚠ Cache not found. Run 'opencode' to populate.")

src/cli/index.test.ts Normal file

@@ -0,0 +1,17 @@
import { describe, it, expect } from "bun:test"
import packageJson from "../../package.json" with { type: "json" }
describe("CLI version", () => {
it("reads version from package.json as valid semver", () => {
//#given
const semverRegex = /^\d+\.\d+\.\d+(-[\w.]+)?$/
//#when
const version = packageJson.version
//#then
expect(version).toMatch(semverRegex)
expect(typeof version).toBe("string")
expect(version.length).toBeGreaterThan(0)
})
})
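To make the regex in the test above concrete, here is a small sketch of which version strings it accepts. The `isSimpleSemver` helper is a hypothetical name introduced for illustration. Note that the pattern is narrower than the full SemVer grammar: it allows one prerelease suffix made of word characters and dots, and no build metadata.

```typescript
// Same pattern as in the test: MAJOR.MINOR.PATCH with an optional "-prerelease" suffix.
const semverRegex = /^\d+\.\d+\.\d+(-[\w.]+)?$/

// Hypothetical helper wrapping the pattern.
function isSimpleSemver(version: string): boolean {
  return semverRegex.test(version)
}

// Accepted: plain release and dotted prerelease versions.
const accepted = ["3.1.4", "3.1.4-beta.1"].map(isSimpleSemver)

// Rejected: missing patch, leading "v", build metadata, and a hyphen inside the prerelease tag.
const rejected = ["3.1", "v3.1.4", "3.1.4+build.5", "3.1.4-beta-1"].map(isSimpleSemver)
```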


@@ -353,6 +353,17 @@ describe("generateModelConfig", () => {
// #then explore should use gpt-5-nano (fallback)
expect(result.agents?.explore?.model).toBe("opencode/gpt-5-nano")
})
test("explore uses gpt-5-mini when only Copilot available", () => {
// #given only Copilot is available
const config = createConfig({ hasCopilot: true })
// #when generateModelConfig is called
const result = generateModelConfig(config)
// #then explore should use gpt-5-mini (Copilot fallback)
expect(result.agents?.explore?.model).toBe("github-copilot/gpt-5-mini")
})
})
describe("Sisyphus agent special cases", () => {


@@ -139,12 +139,14 @@ export function generateModelConfig(config: InstallConfig): GeneratedOmoConfig {
continue
}
// Special case: explore uses Claude haiku → OpenCode gpt-5-nano
// Special case: explore uses Claude haiku → GitHub Copilot gpt-5-mini → OpenCode gpt-5-nano
if (role === "explore") {
if (avail.native.claude) {
agents[role] = { model: "anthropic/claude-haiku-4-5" }
} else if (avail.opencodeZen) {
agents[role] = { model: "opencode/claude-haiku-4-5" }
} else if (avail.copilot) {
agents[role] = { model: "github-copilot/gpt-5-mini" }
} else {
agents[role] = { model: "opencode/gpt-5-nano" }
}
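The fallback order in this hunk can be read as a priority list. The sketch below restates it as a standalone function; `ProviderAvailability` and `pickExploreModel` are hypothetical names, and the flat flags only loosely mirror the `avail` object used in the real code.

```typescript
// Hypothetical availability flags, loosely mirroring `avail` in the hunk above.
interface ProviderAvailability {
  nativeClaude: boolean
  opencodeZen: boolean
  copilot: boolean
}

// Returns the first model whose provider is available, in priority order:
// Claude haiku (native) → Claude haiku (OpenCode Zen) → Copilot gpt-5-mini → OpenCode gpt-5-nano.
function pickExploreModel(avail: ProviderAvailability): string {
  const chain: Array<[boolean, string]> = [
    [avail.nativeClaude, "anthropic/claude-haiku-4-5"],
    [avail.opencodeZen, "opencode/claude-haiku-4-5"],
    [avail.copilot, "github-copilot/gpt-5-mini"],
  ]
  const hit = chain.find(([available]) => available)
  return hit ? hit[1] : "opencode/gpt-5-nano"
}
```

A table-driven chain like this makes it easier to see where a new provider would slot in, though the if/else form in the diff expresses the same order.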


@@ -31,8 +31,18 @@ export async function run(options: RunOptions): Promise<number> {
}
try {
// Support custom OpenCode server port via environment variable
// This allows Open Agent and other orchestrators to run multiple
// concurrent missions without port conflicts
const serverPort = process.env.OPENCODE_SERVER_PORT
? parseInt(process.env.OPENCODE_SERVER_PORT, 10)
: undefined
const serverHostname = process.env.OPENCODE_SERVER_HOSTNAME || undefined
const { client, server } = await createOpencode({
signal: abortController.signal,
...(serverPort && !isNaN(serverPort) ? { port: serverPort } : {}),
...(serverHostname ? { hostname: serverHostname } : {}),
})
const cleanup = () => {
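The environment handling in this hunk can be isolated into a small helper, sketched below. `parseServerPort` and `buildServerOptions` are hypothetical names, and the 1 to 65535 range check is an added assumption that the diff itself does not perform (it only rejects `NaN` and falsy values).

```typescript
type Env = Record<string, string | undefined>

// Hypothetical helper: reads OPENCODE_SERVER_PORT from an environment-style map.
function parseServerPort(env: Env): number | undefined {
  const raw = env["OPENCODE_SERVER_PORT"]
  if (!raw) return undefined
  const port = Number.parseInt(raw, 10)
  // Reject NaN and anything outside the valid TCP port range (stricter than the diff).
  if (!Number.isInteger(port) || port < 1 || port > 65535) return undefined
  return port
}

// Builds only the keys that are actually set, so defaults apply otherwise.
function buildServerOptions(env: Env): { port?: number; hostname?: string } {
  const port = parseServerPort(env)
  const hostname = env["OPENCODE_SERVER_HOSTNAME"] || undefined
  return {
    ...(port !== undefined ? { port } : {}),
    ...(hostname ? { hostname } : {}),
  }
}
```

In the real `run()` the map would be `process.env`, and the result would be spread into the `createOpencode` call next to `signal`.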


@@ -9,6 +9,8 @@ export {
SisyphusAgentConfigSchema,
ExperimentalConfigSchema,
RalphLoopConfigSchema,
TmuxConfigSchema,
TmuxLayoutSchema,
} from "./schema"
export type {
@@ -23,4 +25,6 @@ export type {
ExperimentalConfig,
DynamicContextPruningConfig,
RalphLoopConfig,
TmuxConfig,
TmuxLayout,
} from "./schema"


@@ -1,5 +1,12 @@
import { describe, expect, test } from "bun:test"
import { AgentOverrideConfigSchema, BuiltinCategoryNameSchema, CategoryConfigSchema, OhMyOpenCodeConfigSchema } from "./schema"
import {
AgentOverrideConfigSchema,
BrowserAutomationConfigSchema,
BrowserAutomationProviderSchema,
BuiltinCategoryNameSchema,
CategoryConfigSchema,
OhMyOpenCodeConfigSchema,
} from "./schema"
describe("disabled_mcps schema", () => {
test("should accept built-in MCP names", () => {
@@ -508,3 +515,94 @@ describe("Sisyphus-Junior agent override", () => {
}
})
})
describe("BrowserAutomationProviderSchema", () => {
test("accepts 'playwright' as valid provider", () => {
// #given
const input = "playwright"
// #when
const result = BrowserAutomationProviderSchema.safeParse(input)
// #then
expect(result.success).toBe(true)
expect(result.data).toBe("playwright")
})
test("accepts 'agent-browser' as valid provider", () => {
// #given
const input = "agent-browser"
// #when
const result = BrowserAutomationProviderSchema.safeParse(input)
// #then
expect(result.success).toBe(true)
expect(result.data).toBe("agent-browser")
})
test("rejects invalid provider", () => {
// #given
const input = "invalid-provider"
// #when
const result = BrowserAutomationProviderSchema.safeParse(input)
// #then
expect(result.success).toBe(false)
})
})
describe("BrowserAutomationConfigSchema", () => {
test("defaults provider to 'playwright' when not specified", () => {
// #given
const input = {}
// #when
const result = BrowserAutomationConfigSchema.parse(input)
// #then
expect(result.provider).toBe("playwright")
})
test("accepts agent-browser provider", () => {
// #given
const input = { provider: "agent-browser" }
// #when
const result = BrowserAutomationConfigSchema.parse(input)
// #then
expect(result.provider).toBe("agent-browser")
})
})
describe("OhMyOpenCodeConfigSchema - browser_automation_engine", () => {
test("accepts browser_automation_engine config", () => {
// #given
const input = {
browser_automation_engine: {
provider: "agent-browser",
},
}
// #when
const result = OhMyOpenCodeConfigSchema.safeParse(input)
// #then
expect(result.success).toBe(true)
expect(result.data?.browser_automation_engine?.provider).toBe("agent-browser")
})
test("accepts config without browser_automation_engine", () => {
// #given
const input = {}
// #when
const result = OhMyOpenCodeConfigSchema.safeParse(input)
// #then
expect(result.success).toBe(true)
expect(result.data?.browser_automation_engine).toBeUndefined()
})
})


@@ -30,6 +30,7 @@ export const BuiltinAgentNameSchema = z.enum([
export const BuiltinSkillNameSchema = z.enum([
"playwright",
"agent-browser",
"frontend-ui-ux",
"git-master",
])
@@ -76,6 +77,7 @@ export const HookNameSchema = z.enum([
"thinking-block-validator",
"ralph-loop",
"category-skill-reminder",
"compaction-context-injector",
"claude-code-hooks",
@@ -83,6 +85,7 @@ export const HookNameSchema = z.enum([
"edit-error-recovery",
"delegate-task-retry",
"prometheus-md-only",
"sisyphus-junior-notepad",
"start-work",
"atlas",
])
@@ -113,6 +116,19 @@ export const AgentOverrideConfigSchema = z.object({
.regex(/^#[0-9A-Fa-f]{6}$/)
.optional(),
permission: AgentPermissionSchema.optional(),
/** Maximum tokens for response. Passed directly to OpenCode SDK. */
maxTokens: z.number().optional(),
/** Extended thinking configuration (Anthropic). Overrides category and default settings. */
thinking: z.object({
type: z.enum(["enabled", "disabled"]),
budgetTokens: z.number().optional(),
}).optional(),
/** Reasoning effort level (OpenAI). Overrides category and default settings. */
reasoningEffort: z.enum(["low", "medium", "high", "xhigh"]).optional(),
/** Text verbosity level. */
textVerbosity: z.enum(["low", "medium", "high"]).optional(),
/** Provider-specific options. Passed directly to OpenCode SDK. */
providerOptions: z.record(z.string(), z.unknown()).optional(),
})
export const AgentOverridesSchema = z.object({
@@ -297,6 +313,56 @@ export const GitMasterConfigSchema = z.object({
include_co_authored_by: z.boolean().default(true),
})
export const BrowserAutomationProviderSchema = z.enum(["playwright", "agent-browser", "dev-browser"])
export const BrowserAutomationConfigSchema = z.object({
/**
* Browser automation provider to use for the "playwright" skill.
* - "playwright": Uses Playwright MCP server (@playwright/mcp) - default
* - "agent-browser": Uses Vercel's agent-browser CLI (requires: bun add -g agent-browser)
* - "dev-browser": Uses dev-browser skill with persistent browser state
*/
provider: BrowserAutomationProviderSchema.default("playwright"),
})
export const TmuxLayoutSchema = z.enum([
'main-horizontal', // main pane top, agent panes bottom stack
'main-vertical', // main pane left, agent panes right stack (default)
'tiled', // all panes same size grid
'even-horizontal', // all panes horizontal row
'even-vertical', // all panes vertical stack
])
export const TmuxConfigSchema = z.object({
enabled: z.boolean().default(false),
layout: TmuxLayoutSchema.default('main-vertical'),
main_pane_size: z.number().min(20).max(80).default(60),
main_pane_min_width: z.number().min(40).default(120),
agent_pane_min_width: z.number().min(20).default(40),
})
export const SisyphusTasksConfigSchema = z.object({
/** Enable Sisyphus Tasks system (default: false) */
enabled: z.boolean().default(false),
/** Storage path for tasks (default: .sisyphus/tasks) */
storage_path: z.string().default(".sisyphus/tasks"),
/** Enable Claude Code path compatibility mode */
claude_code_compat: z.boolean().default(false),
})
export const SisyphusSwarmConfigSchema = z.object({
/** Enable Sisyphus Swarm system (default: false) */
enabled: z.boolean().default(false),
/** Storage path for teams (default: .sisyphus/teams) */
storage_path: z.string().default(".sisyphus/teams"),
/** UI mode: toast notifications, tmux panes, or both */
ui_mode: z.enum(["toast", "tmux", "both"]).default("toast"),
})
export const SisyphusConfigSchema = z.object({
tasks: SisyphusTasksConfigSchema.optional(),
swarm: SisyphusSwarmConfigSchema.optional(),
})
export const OhMyOpenCodeConfigSchema = z.object({
$schema: z.string().optional(),
disabled_mcps: z.array(AnyMcpNameSchema).optional(),
@@ -316,6 +382,9 @@ export const OhMyOpenCodeConfigSchema = z.object({
background_task: BackgroundTaskConfigSchema.optional(),
notification: NotificationConfigSchema.optional(),
git_master: GitMasterConfigSchema.optional(),
browser_automation_engine: BrowserAutomationConfigSchema.optional(),
tmux: TmuxConfigSchema.optional(),
sisyphus: SisyphusConfigSchema.optional(),
})
export type OhMyOpenCodeConfig = z.infer<typeof OhMyOpenCodeConfigSchema>
@@ -338,5 +407,12 @@ export type CategoryConfig = z.infer<typeof CategoryConfigSchema>
export type CategoriesConfig = z.infer<typeof CategoriesConfigSchema>
export type BuiltinCategoryName = z.infer<typeof BuiltinCategoryNameSchema>
export type GitMasterConfig = z.infer<typeof GitMasterConfigSchema>
export type BrowserAutomationProvider = z.infer<typeof BrowserAutomationProviderSchema>
export type BrowserAutomationConfig = z.infer<typeof BrowserAutomationConfigSchema>
export type TmuxConfig = z.infer<typeof TmuxConfigSchema>
export type TmuxLayout = z.infer<typeof TmuxLayoutSchema>
export type SisyphusTasksConfig = z.infer<typeof SisyphusTasksConfigSchema>
export type SisyphusSwarmConfig = z.infer<typeof SisyphusSwarmConfigSchema>
export type SisyphusConfig = z.infer<typeof SisyphusConfigSchema>
export { AnyMcpNameSchema, type AnyMcpName, McpNameSchema, type McpName } from "../mcp/types"
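To illustrate what the defaults and bounds in `TmuxConfigSchema` amount to, here is a dependency-free sketch that mirrors them without Zod. It is not the project's implementation: `resolveTmuxConfig` and `TMUX_DEFAULTS` are hypothetical names, and the error messages are made up.

```typescript
type TmuxLayout = "main-horizontal" | "main-vertical" | "tiled" | "even-horizontal" | "even-vertical"

interface TmuxConfig {
  enabled: boolean
  layout: TmuxLayout
  main_pane_size: number
  main_pane_min_width: number
  agent_pane_min_width: number
}

// Defaults copied from the schema in the diff above.
const TMUX_DEFAULTS: TmuxConfig = {
  enabled: false,
  layout: "main-vertical",
  main_pane_size: 60,
  main_pane_min_width: 120,
  agent_pane_min_width: 40,
}

// Hypothetical helper: fills in defaults, then enforces the same numeric bounds as the schema.
function resolveTmuxConfig(input: Partial<TmuxConfig> = {}): TmuxConfig {
  const config: TmuxConfig = { ...TMUX_DEFAULTS, ...input }
  if (config.main_pane_size < 20 || config.main_pane_size > 80) {
    throw new RangeError("main_pane_size must be between 20 and 80")
  }
  if (config.main_pane_min_width < 40) {
    throw new RangeError("main_pane_min_width must be at least 40")
  }
  if (config.agent_pane_min_width < 20) {
    throw new RangeError("agent_pane_min_width must be at least 20")
  }
  return config
}
```

With the real schema, `TmuxConfigSchema.parse({ enabled: true })` would be expected to yield the same defaults as `resolveTmuxConfig({ enabled: true })` here.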


@@ -2,34 +2,31 @@
## OVERVIEW
Core feature modules + Claude Code compatibility layer. Background agents, skill MCP, builtin skills/commands, 5 loaders.
Core feature modules + Claude Code compatibility layer. Orchestrates background agents, skill MCPs, builtin skills/commands, and 16 feature modules.
## STRUCTURE
```
features/
├── background-agent/ # Task lifecycle (1335 lines)
├── background-agent/ # Task lifecycle (1377 lines)
│ ├── manager.ts # Launch → poll → complete
│ ├── concurrency.ts # Per-provider limits
│ └── types.ts # BackgroundTask, LaunchInput
├── skill-mcp-manager/ # MCP client lifecycle (520 lines)
│ ├── manager.ts # Lazy loading, cleanup
│ └── types.ts # SkillMcpConfig
├── builtin-skills/ # Playwright, git-master, frontend-ui-ux
│ └── skills.ts # 1203 lines
├── builtin-commands/ # ralph-loop, refactor, init-deep, start-work, remove-deadcode
│ ├── commands.ts # Command registry
│ └── templates/ # Command templates (4 files)
│ └── concurrency.ts # Per-provider limits
├── builtin-skills/ # Core skills (1729 lines)
│ └── skills.ts # agent-browser, dev-browser, frontend-ui-ux, git-master, typescript-programmer
├── builtin-commands/ # ralph-loop, refactor, ulw-loop, init-deep, start-work, cancel-ralph
├── claude-code-agent-loader/ # ~/.claude/agents/*.md
├── claude-code-command-loader/ # ~/.claude/commands/*.md
├── claude-code-mcp-loader/ # .mcp.json
├── claude-code-mcp-loader/ # .mcp.json with ${VAR} expansion
├── claude-code-plugin-loader/ # installed_plugins.json
├── claude-code-session-state/ # Session persistence
├── opencode-skill-loader/ # Skills from 6 directories
├── context-injector/ # AGENTS.md/README.md injection
├── boulder-state/ # Todo state persistence
├── hook-message-injector/ # Message injection
└── task-toast-manager/ # Background task notifications
├── task-toast-manager/ # Background task notifications
├── skill-mcp-manager/ # MCP client lifecycle (520 lines)
├── tmux-subagent/ # Tmux session management
└── ... (16 modules total)
```
## LOADER PRIORITY
@@ -44,8 +41,9 @@ features/
- **Lifecycle**: `launch` → `poll` (2s) → `complete`
- **Stability**: 3 consecutive polls = idle
- **Concurrency**: Per-provider/model limits
- **Concurrency**: Per-provider/model limits via `ConcurrencyManager`
- **Cleanup**: 30m TTL, 3m stale timeout
- **State**: Per-session Maps, cleaned on `session.deleted`
## SKILL MCP
@@ -58,3 +56,4 @@ features/
- **Sequential delegation**: Use `delegate_task` in parallel
- **Trust self-reports**: ALWAYS verify
- **Main thread blocks**: No heavy I/O in loader init
- **Direct state mutation**: Use managers for boulder/session state


@@ -776,7 +776,7 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
parentModel: { providerID: "old", modelID: "old-model" },
}
const currentMessage: CurrentMessage = {
agent: "Sisyphus",
agent: "sisyphus",
model: { providerID: "anthropic", modelID: "claude-opus-4-5" },
}
@@ -784,7 +784,7 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
const promptBody = buildNotificationPromptBody(task, currentMessage)
// #then - uses currentMessage values, not task.parentModel/parentAgent
expect(promptBody.agent).toBe("Sisyphus")
expect(promptBody.agent).toBe("sisyphus")
expect(promptBody.model).toEqual({ providerID: "anthropic", modelID: "claude-opus-4-5" })
})
@@ -827,11 +827,11 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
status: "completed",
startedAt: new Date(),
completedAt: new Date(),
parentAgent: "Sisyphus",
parentAgent: "sisyphus",
parentModel: { providerID: "anthropic", modelID: "claude-opus" },
}
const currentMessage: CurrentMessage = {
agent: "Sisyphus",
agent: "sisyphus",
model: { providerID: "anthropic" },
}
@@ -839,7 +839,7 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
const promptBody = buildNotificationPromptBody(task, currentMessage)
// #then - model not passed due to incomplete data
expect(promptBody.agent).toBe("Sisyphus")
expect(promptBody.agent).toBe("sisyphus")
expect("model" in promptBody).toBe(false)
})
@@ -856,7 +856,7 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
status: "completed",
startedAt: new Date(),
completedAt: new Date(),
parentAgent: "Sisyphus",
parentAgent: "sisyphus",
parentModel: { providerID: "anthropic", modelID: "claude-opus" },
}
@@ -864,7 +864,7 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
const promptBody = buildNotificationPromptBody(task, null)
// #then - falls back to task.parentAgent, no model
expect(promptBody.agent).toBe("Sisyphus")
expect(promptBody.agent).toBe("sisyphus")
expect("model" in promptBody).toBe(false)
})
})


@@ -7,7 +7,8 @@ import type {
} from "./types"
import { log, getAgentToolRestrictions } from "../../shared"
import { ConcurrencyManager } from "./concurrency"
import type { BackgroundTaskConfig } from "../../config/schema"
import type { BackgroundTaskConfig, TmuxConfig } from "../../config/schema"
import { isInsideTmux } from "../../shared/tmux"
import { subagentSessions } from "../claude-code-session-state"
import { getTaskToastManager } from "../task-toast-manager"
@@ -54,6 +55,14 @@ interface QueueItem {
input: LaunchInput
}
export interface SubagentSessionCreatedEvent {
sessionID: string
parentID: string
title: string
}
export type OnSubagentSessionCreated = (event: SubagentSessionCreatedEvent) => Promise<void>
export class BackgroundManager {
private static cleanupManagers = new Set<BackgroundManager>()
private static cleanupRegistered = false
@@ -68,12 +77,20 @@ export class BackgroundManager {
private concurrencyManager: ConcurrencyManager
private shutdownTriggered = false
private config?: BackgroundTaskConfig
private tmuxEnabled: boolean
private onSubagentSessionCreated?: OnSubagentSessionCreated
private queuesByKey: Map<string, QueueItem[]> = new Map()
private processingKeys: Set<string> = new Set()
constructor(ctx: PluginInput, config?: BackgroundTaskConfig) {
constructor(
ctx: PluginInput,
config?: BackgroundTaskConfig,
options?: {
tmuxConfig?: TmuxConfig
onSubagentSessionCreated?: OnSubagentSessionCreated
}
) {
this.tasks = new Map()
this.notifications = new Map()
this.pendingByParent = new Map()
@@ -81,6 +98,8 @@ export class BackgroundManager {
this.directory = ctx.directory
this.concurrencyManager = new ConcurrencyManager(config)
this.config = config
this.tmuxEnabled = options?.tmuxConfig?.enabled ?? false
this.onSubagentSessionCreated = options?.onSubagentSessionCreated
this.registerProcessCleanup()
}
@@ -205,7 +224,10 @@ export class BackgroundManager {
body: {
parentID: input.parentSessionID,
title: `Background: ${input.description}`,
},
permission: [
{ permission: "question", action: "deny" as const, pattern: "*" },
],
} as any,
query: {
directory: parentDirectory,
},
@@ -222,6 +244,29 @@ export class BackgroundManager {
const sessionID = createResult.data.id
subagentSessions.add(sessionID)
log("[background-agent] tmux callback check", {
hasCallback: !!this.onSubagentSessionCreated,
tmuxEnabled: this.tmuxEnabled,
isInsideTmux: isInsideTmux(),
sessionID,
parentID: input.parentSessionID,
})
if (this.onSubagentSessionCreated && this.tmuxEnabled && isInsideTmux()) {
log("[background-agent] Invoking tmux callback NOW", { sessionID })
await this.onSubagentSessionCreated({
sessionID,
parentID: input.parentSessionID,
title: input.description,
}).catch((err) => {
log("[background-agent] Failed to spawn tmux pane:", err)
})
log("[background-agent] tmux callback completed, waiting 200ms")
await new Promise(r => setTimeout(r, 200))
} else {
log("[background-agent] SKIP tmux callback - conditions not met")
}
// Update task to running state
task.status = "running"
task.startedAt = new Date()
@@ -252,17 +297,26 @@ export class BackgroundManager {
// Use prompt() instead of promptAsync() to properly initialize agent loop (fire-and-forget)
// Include model if caller provided one (e.g., from Sisyphus category configs)
// IMPORTANT: variant must be a top-level field in the body, NOT nested inside model
// OpenCode's PromptInput schema expects: { model: { providerID, modelID }, variant: "max" }
const launchModel = input.model
? { providerID: input.model.providerID, modelID: input.model.modelID }
: undefined
const launchVariant = input.model?.variant
this.client.session.prompt({
path: { id: sessionID },
body: {
agent: input.agent,
...(input.model ? { model: input.model } : {}),
...(launchModel ? { model: launchModel } : {}),
...(launchVariant ? { variant: launchVariant } : {}),
system: input.skillContent,
tools: {
...getAgentToolRestrictions(input.agent),
task: false,
delegate_task: false,
call_omo_agent: true,
question: false,
},
parts: [{ type: "text", text: input.prompt }],
},
@@ -499,16 +553,24 @@ export class BackgroundManager {
// Use prompt() instead of promptAsync() to properly initialize agent loop
// Include model if task has one (preserved from original launch with category config)
// variant must be top-level in body, not nested inside model (OpenCode PromptInput schema)
const resumeModel = existingTask.model
? { providerID: existingTask.model.providerID, modelID: existingTask.model.modelID }
: undefined
const resumeVariant = existingTask.model?.variant
this.client.session.prompt({
path: { id: existingTask.sessionID },
body: {
agent: existingTask.agent,
...(existingTask.model ? { model: existingTask.model } : {}),
...(resumeModel ? { model: resumeModel } : {}),
...(resumeVariant ? { variant: resumeVariant } : {}),
tools: {
...getAgentToolRestrictions(existingTask.agent),
task: false,
delegate_task: false,
call_omo_agent: true,
question: false,
},
parts: [{ type: "text", text: input.prompt }],
},
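Both the launch and resume paths apply the same transformation: strip `variant` out of the model object and pass it as a top-level body field. A minimal sketch of that step as a shared helper follows. `ModelWithVariant` and `toPromptModelFields` are hypothetical names, and the body shape is taken only from the comments in the diff.

```typescript
// Hypothetical input shape: a model reference that may carry a variant.
interface ModelWithVariant {
  providerID: string
  modelID: string
  variant?: string
}

interface PromptModelFields {
  model?: { providerID: string; modelID: string }
  variant?: string
}

// Splits the variant out of the model so it can be spread into the prompt body
// as { model: { providerID, modelID }, variant }.
function toPromptModelFields(model?: ModelWithVariant): PromptModelFields {
  if (!model) return {}
  return {
    model: { providerID: model.providerID, modelID: model.modelID },
    ...(model.variant ? { variant: model.variant } : {}),
  }
}

// Usage: the result is spread into the body next to agent, tools, and parts.
const body = {
  agent: "explore",
  ...toPromptModelFields({ providerID: "anthropic", modelID: "claude-opus-4-5", variant: "max" }),
}
```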


@@ -25,7 +25,7 @@ export const START_WORK_TEMPLATE = `You are starting a Sisyphus work session.
}
\`\`\`
5. **Read the plan file** and start executing tasks according to Orchestrator Sisyphus workflow
5. **Read the plan file** and start executing tasks according to atlas workflow
## OUTPUT FORMAT
@@ -69,4 +69,4 @@ Reading plan and beginning execution...
- The session_id is injected by the hook - use it directly
- Always update boulder.json BEFORE starting work
- Read the FULL plan file before delegating any tasks
- Follow Orchestrator Sisyphus delegation protocols (7-section format)`
- Follow atlas delegation protocols (7-section format)`


@@ -0,0 +1,336 @@
---
name: agent-browser
description: Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.
---
# Browser Automation with agent-browser
## Quick start
```bash
agent-browser open <url> # Navigate to page
agent-browser snapshot -i # Get interactive elements with refs
agent-browser click @e1 # Click element by ref
agent-browser fill @e2 "text" # Fill input by ref
agent-browser close # Close browser
```
## Core workflow
1. Navigate: `agent-browser open <url>`
2. Snapshot: `agent-browser snapshot -i` (returns elements with refs like `@e1`, `@e2`)
3. Interact using refs from the snapshot
4. Re-snapshot after navigation or significant DOM changes
## Commands
### Navigation
```bash
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser
```
### Snapshot (page analysis)
```bash
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to CSS selector
```
### Interactions (use @refs from snapshot)
```bash
agent-browser click @e1 # Click
agent-browser dblclick @e1 # Double-click
agent-browser focus @e1 # Focus element
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key
agent-browser press Control+a # Key combination
agent-browser keydown Shift # Hold key down
agent-browser keyup Shift # Release key
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value" # Select dropdown
agent-browser scroll down 500 # Scroll page
agent-browser scrollintoview @e1 # Scroll element into view
agent-browser drag @e1 @e2 # Drag and drop
agent-browser upload @e1 file.pdf # Upload files
```
### Get information
```bash
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get innerHTML
agent-browser get value @e1 # Get input value
agent-browser get attr @e1 href # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count ".item" # Count matching elements
agent-browser get box @e1 # Get bounding box
```
### Check state
```bash
agent-browser is visible @e1 # Check if visible
agent-browser is enabled @e1 # Check if enabled
agent-browser is checked @e1 # Check if checked
```
### Screenshots & PDF
```bash
agent-browser screenshot # Screenshot to stdout
agent-browser screenshot path.png # Save to file
agent-browser screenshot --full # Full page
agent-browser pdf output.pdf # Save as PDF
```
### Video recording
```bash
agent-browser record start ./demo.webm # Start recording (uses current URL + state)
agent-browser click @e1 # Perform actions
agent-browser record stop # Stop and save video
agent-browser record restart ./take2.webm # Stop current + start new recording
```
Recording creates a fresh context but preserves cookies/storage from your session.
### Wait
```bash
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text
agent-browser wait --url "**/dashboard" # Wait for URL pattern
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --fn "window.ready" # Wait for JS condition
```
### Mouse control
```bash
agent-browser mouse move 100 200 # Move mouse
agent-browser mouse down left # Press button
agent-browser mouse up left # Release button
agent-browser mouse wheel 100 # Scroll wheel
```
### Semantic locators (alternative to refs)
```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find first ".item" click
agent-browser find nth 2 "a" text
```
### Browser settings
```bash
agent-browser set viewport 1920 1080 # Set viewport size
agent-browser set device "iPhone 14" # Emulate device
agent-browser set geo 37.7749 -122.4194 # Set geolocation
agent-browser set offline on # Toggle offline mode
agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
agent-browser set credentials user pass # HTTP basic auth
agent-browser set media dark # Emulate color scheme
```
### Cookies & Storage
```bash
agent-browser cookies # Get all cookies
agent-browser cookies set name value # Set cookie
agent-browser cookies clear # Clear cookies
agent-browser storage local # Get all localStorage
agent-browser storage local key # Get specific key
agent-browser storage local set k v # Set value
agent-browser storage local clear # Clear all
agent-browser storage session # Get all sessionStorage
agent-browser storage session key # Get specific key
agent-browser storage session set k v # Set value
agent-browser storage session clear # Clear all
```
### Network
```bash
agent-browser network route <url> # Intercept requests
agent-browser network route <url> --abort # Block requests
agent-browser network route <url> --body '{}' # Mock response
agent-browser network unroute [url] # Remove routes
agent-browser network requests # View tracked requests
agent-browser network requests --filter api # Filter requests
```
### Tabs & Windows
```bash
agent-browser tab # List tabs
agent-browser tab new [url] # New tab
agent-browser tab 2 # Switch to tab
agent-browser tab close # Close tab
agent-browser window new # New window
```
### Frames
```bash
agent-browser frame "#iframe" # Switch to iframe
agent-browser frame main # Back to main frame
```
### Dialogs
```bash
agent-browser dialog accept [text] # Accept dialog
agent-browser dialog dismiss # Dismiss dialog
```
### JavaScript
```bash
agent-browser eval "document.title" # Run JavaScript
```
## Global Options
| Option | Description |
|--------|-------------|
| `--session <name>` | Isolated browser session (`AGENT_BROWSER_SESSION` env) |
| `--profile <path>` | Persistent browser profile (`AGENT_BROWSER_PROFILE` env) |
| `--headers <json>` | HTTP headers scoped to URL's origin |
| `--executable-path <path>` | Custom browser binary (`AGENT_BROWSER_EXECUTABLE_PATH` env) |
| `--args <args>` | Browser launch args (`AGENT_BROWSER_ARGS` env) |
| `--user-agent <ua>` | Custom User-Agent (`AGENT_BROWSER_USER_AGENT` env) |
| `--proxy <url>` | Proxy server (`AGENT_BROWSER_PROXY` env) |
| `--proxy-bypass <hosts>` | Hosts to bypass proxy (`AGENT_BROWSER_PROXY_BYPASS` env) |
| `-p, --provider <name>` | Cloud browser provider (`AGENT_BROWSER_PROVIDER` env) |
| `--json` | Machine-readable JSON output |
| `--headed` | Show browser window (not headless) |
| `--cdp <port\|wss://url>` | Connect via Chrome DevTools Protocol |
| `--debug` | Debug output |
## Example: Form submission
```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
```
## Example: Authentication with saved state
```bash
# Login once
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Later sessions: load saved state
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```
### Header-based Auth (Skip login flows)
```bash
# Headers scoped to api.example.com only
agent-browser open api.example.com --headers '{"Authorization": "Bearer <token>"}'
# Navigate to another domain - headers NOT sent (safe)
agent-browser open other-site.com
# Global headers (all domains)
agent-browser set headers '{"X-Custom-Header": "value"}'
```
## Sessions & Persistent Profiles
### Sessions (parallel browsers)
```bash
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list
```
### Persistent Profiles
Persists cookies, localStorage, IndexedDB, service workers, cache, login sessions across browser restarts.
```bash
agent-browser --profile ~/.myapp-profile open myapp.com
# Or via env var
AGENT_BROWSER_PROFILE=~/.myapp-profile agent-browser open myapp.com
```
- Use different profile paths for different projects
- Login once → restart browser → still logged in
- Stores: cookies, localStorage, IndexedDB, service workers, browser cache
## JSON output (for parsing)
Add `--json` for machine-readable output:
```bash
agent-browser snapshot -i --json
agent-browser get text @e1 --json
```
## Debugging
```bash
agent-browser open example.com --headed # Show browser window
agent-browser console # View console messages
agent-browser errors # View page errors
agent-browser record start ./debug.webm # Record from current page
agent-browser record stop # Save recording
agent-browser connect 9222 # Local CDP port
agent-browser --cdp "wss://browser-service.com/cdp?token=..." snapshot # Remote via WebSocket
agent-browser console --clear # Clear console
agent-browser errors --clear # Clear errors
agent-browser highlight @e1 # Highlight element
agent-browser trace start # Start recording trace
agent-browser trace stop trace.zip # Stop and save trace
```
---
## Installation
### Step 1: Install agent-browser CLI
```bash
bun add -g agent-browser
```
### Step 2: Install Playwright browsers
**IMPORTANT**: `agent-browser install` may fail on some platforms (e.g., darwin-arm64) with "No binary found" error. In that case, install Playwright browsers directly:
```bash
# Create a temp project and install playwright
cd /tmp && bun init -y && bun add playwright
# Install Chromium browser
bun playwright install chromium
```
On macOS, this downloads Chrome for Testing to `~/Library/Caches/ms-playwright/`.
### Verify installation
```bash
agent-browser open https://example.com --headed
```
If the browser opens successfully, installation is complete.
### Troubleshooting
| Error | Solution |
|-------|----------|
| `No binary found for darwin-arm64` | Run `bun playwright install chromium` in a project with playwright dependency |
| `Executable doesn't exist at .../chromium-XXXX` | Re-run `bun playwright install chromium` |
| Browser doesn't open | Ensure `--headed` flag is used for visible browser |
---
Run `agent-browser --help` for all commands. Repo: https://github.com/vercel-labs/agent-browser


@@ -0,0 +1,213 @@
---
name: dev-browser
description: Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.
---
# Dev Browser Skill
Browser automation that maintains page state across script executions. Write small, focused scripts to accomplish tasks incrementally. Once you've proven out part of a workflow and there is repeated work to be done, you can write a script to do the repeated work in a single execution.
## Choosing Your Approach
- **Local/source-available sites**: Read the source code first to write selectors directly
- **Unknown page layouts**: Use `getAISnapshot()` to discover elements and `selectSnapshotRef()` to interact with them
- **Visual feedback**: Take screenshots to see what the user sees
## Setup
> **Installation**: See [references/installation.md](references/installation.md) for detailed setup instructions including Windows support.
Two modes available. Ask the user if unclear which to use.
### Standalone Mode (Default)
Launches a new Chromium browser for fresh automation sessions.
```bash
./skills/dev-browser/server.sh &
```
Add `--headless` flag if user requests it. **Wait for the `Ready` message before running scripts.**
### Extension Mode
Connects to user's existing Chrome browser. Use this when:
- The user is already logged into sites and wants you to work inside an authenticated session that isn't local dev.
- The user asks you to use the extension
**Important**: The core flow is still the same. You create named pages inside of their browser.
**Start the relay server:**
```bash
cd skills/dev-browser && npm i && npm run start-extension &
```
Wait for `Waiting for extension to connect...` followed by `Extension connected` in the console. The second message confirms that a client has connected and the browser is ready to be controlled.
**Workflow:**
1. Scripts call `client.page("name")`, just as in standalone mode, to create new pages or connect to existing ones.
2. Automation runs on the user's actual browser session
If the extension hasn't connected yet, tell the user to launch and activate it. Download link: https://github.com/SawyerHood/dev-browser/releases
## Writing Scripts
> **Run all scripts from `skills/dev-browser/` directory.** The `@/` import alias requires this directory's config.
Execute scripts inline using heredocs:
```bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect, waitForPageLoad } from "@/client.js";
const client = await connect();
// Create page with custom viewport size (optional)
const page = await client.page("example", { viewport: { width: 1920, height: 1080 } });
await page.goto("https://example.com");
await waitForPageLoad(page);
console.log({ title: await page.title(), url: page.url() });
await client.disconnect();
EOF
```
**Write to `tmp/` files only when** the script needs reuse, is complex, or user explicitly requests it.
### Key Principles
1. **Small scripts**: Each script does ONE thing (navigate, click, fill, check)
2. **Evaluate state**: Log/return state at the end to decide next steps
3. **Descriptive page names**: Use `"checkout"`, `"login"`, not `"main"`
4. **Disconnect to exit**: `await client.disconnect()` - pages persist on server
5. **Plain JS in evaluate**: `page.evaluate()` runs in browser - no TypeScript syntax
## Workflow Loop
Follow this pattern for complex tasks:
1. **Write a script** to perform one action
2. **Run it** and observe the output
3. **Evaluate** - did it work? What's the current state?
4. **Decide** - is the task complete or do we need another script?
5. **Repeat** until task is done
### No TypeScript in Browser Context
Code passed to `page.evaluate()` runs in the browser, which doesn't understand TypeScript:
```typescript
// ✅ Correct: plain JavaScript
const text = await page.evaluate(() => {
  return document.body.innerText;
});
// ❌ Wrong: TypeScript syntax will fail at runtime
const text = await page.evaluate(() => {
  const el: HTMLElement = document.body; // Type annotation breaks in browser!
  return el.innerText;
});
```
## Scraping Data
For scraping large datasets, intercept and replay network requests rather than scrolling the DOM. See [references/scraping.md](references/scraping.md) for the complete guide covering request capture, schema discovery, and paginated API replay.
## Client API
```typescript
const client = await connect();
// Get or create named page (viewport only applies to new pages)
const page = await client.page("name");
const pageWithSize = await client.page("name", { viewport: { width: 1920, height: 1080 } });
const pages = await client.list(); // List all page names
await client.close("name"); // Close a page
await client.disconnect(); // Disconnect (pages persist)
// ARIA Snapshot methods
const snapshot = await client.getAISnapshot("name"); // Get accessibility tree
const element = await client.selectSnapshotRef("name", "e5"); // Get element by ref
```
The `page` object is a standard Playwright Page.
## Waiting
```typescript
import { waitForPageLoad } from "@/client.js";
await waitForPageLoad(page); // After navigation
await page.waitForSelector(".results"); // For specific elements
await page.waitForURL("**/success"); // For specific URL
```
## Inspecting Page State
### Screenshots
```typescript
await page.screenshot({ path: "tmp/screenshot.png" });
await page.screenshot({ path: "tmp/full.png", fullPage: true });
```
### ARIA Snapshot (Element Discovery)
Use `getAISnapshot()` to discover page elements. Returns YAML-formatted accessibility tree:
```yaml
- banner:
  - link "Hacker News" [ref=e1]
  - navigation:
    - link "new" [ref=e2]
- main:
  - list:
    - listitem:
      - link "Article Title" [ref=e8]
      - link "328 comments" [ref=e9]
- contentinfo:
  - textbox [ref=e10]
    - /placeholder: "Search"
```
**Interpreting refs:**
- `[ref=eN]` - Element reference for interaction (visible, clickable elements only)
- `[checked]`, `[disabled]`, `[expanded]` - Element states
- `[level=N]` - Heading level
- `/url:`, `/placeholder:` - Element properties
**Interacting with refs:**
```typescript
const snapshot = await client.getAISnapshot("hackernews");
console.log(snapshot); // Find the ref you need
const element = await client.selectSnapshotRef("hackernews", "e2");
await element.click();
```
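When a snapshot is long, picking the ref by eye is error-prone. As a minimal sketch, assuming the snapshot text follows the line format shown above (`- role "name" [ref=eN]`), a small parser can look up a ref by role and accessible name. `parseSnapshotRefs` and `findRef` are hypothetical helpers, not part of the dev-browser client.

```typescript
// Hypothetical helpers for snapshot text; not part of the dev-browser client API.
interface SnapshotEntry {
  role: string;
  name: string; // empty string when the element has no quoted name
  ref: string;
}

// Matches lines such as: - link "new" [ref=e2]
const ENTRY_PATTERN = /^\s*-\s*(\w+)(?:\s+"([^"]*)")?.*\[ref=(e\d+)\]/;

function parseSnapshotRefs(snapshot: string): SnapshotEntry[] {
  const entries: SnapshotEntry[] = [];
  for (const line of snapshot.split("\n")) {
    const match = ENTRY_PATTERN.exec(line);
    if (match) {
      entries.push({ role: match[1], name: match[2] ?? "", ref: match[3] });
    }
  }
  return entries;
}

// Return the ref of the first element with the given role and name, if any.
function findRef(snapshot: string, role: string, name: string): string | undefined {
  return parseSnapshotRefs(snapshot).find((e) => e.role === role && e.name === name)?.ref;
}
```

The returned string can then be passed to `client.selectSnapshotRef()` in place of a hand-copied ref.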
## Error Recovery
Page state persists after failures. Debug with:
```bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect } from "@/client.js";
const client = await connect();
const page = await client.page("hackernews");
await page.screenshot({ path: "tmp/debug.png" });
console.log({
url: page.url(),
title: await page.title(),
bodyText: await page.textContent("body").then((t) => t?.slice(0, 200)),
});
await client.disconnect();
EOF
```


@@ -0,0 +1,193 @@
# Dev Browser Installation Guide
This guide covers installation for all platforms: macOS, Linux, and Windows.
## Prerequisites
- [Node.js](https://nodejs.org) v18 or later with npm
- Git (for cloning the skill)
## Installation
### Step 1: Clone the Skill
```bash
# Clone dev-browser to a temporary location
git clone https://github.com/sawyerhood/dev-browser /tmp/dev-browser-skill
# Copy to skills directory (adjust path as needed)
# For oh-my-opencode: already bundled
# For manual installation:
mkdir -p ~/.config/opencode/skills
cp -r /tmp/dev-browser-skill/skills/dev-browser ~/.config/opencode/skills/dev-browser
# Cleanup
rm -rf /tmp/dev-browser-skill
```
**Windows (PowerShell):**
```powershell
# Clone dev-browser to temp location
git clone https://github.com/sawyerhood/dev-browser $env:TEMP\dev-browser-skill
# Copy to skills directory
New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.config\opencode\skills"
Copy-Item -Recurse "$env:TEMP\dev-browser-skill\skills\dev-browser" "$env:USERPROFILE\.config\opencode\skills\dev-browser"
# Cleanup
Remove-Item -Recurse -Force "$env:TEMP\dev-browser-skill"
```
### Step 2: Install Dependencies
```bash
cd ~/.config/opencode/skills/dev-browser
npm install
```
**Windows (PowerShell):**
```powershell
cd "$env:USERPROFILE\.config\opencode\skills\dev-browser"
npm install
```
### Step 3: Start the Server
#### Standalone Mode (New Browser Instance)
**macOS/Linux:**
```bash
cd ~/.config/opencode/skills/dev-browser
./server.sh &
# Or for headless:
./server.sh --headless &
```
**Windows (PowerShell):**
```powershell
cd "$env:USERPROFILE\.config\opencode\skills\dev-browser"
Start-Process -NoNewWindow -FilePath "node" -ArgumentList "server.js"
# Or for headless:
Start-Process -NoNewWindow -FilePath "node" -ArgumentList "server.js", "--headless"
```
**Windows (CMD):**
```cmd
cd %USERPROFILE%\.config\opencode\skills\dev-browser
start /B node server.js
```
Wait for the `Ready` message before running scripts.
#### Extension Mode (Use Existing Chrome)
**macOS/Linux:**
```bash
cd ~/.config/opencode/skills/dev-browser
npm run start-extension &
```
**Windows (PowerShell):**
```powershell
cd "$env:USERPROFILE\.config\opencode\skills\dev-browser"
Start-Process -NoNewWindow -FilePath "npm" -ArgumentList "run", "start-extension"
```
Wait for `Extension connected` message.
## Chrome Extension Setup (Optional)
The Chrome extension allows controlling your existing Chrome browser with all your logged-in sessions.
### Installation
1. Download `extension.zip` from [latest release](https://github.com/sawyerhood/dev-browser/releases/latest)
2. Extract to a permanent location:
- **macOS/Linux:** `~/.dev-browser-extension`
- **Windows:** `%USERPROFILE%\.dev-browser-extension`
3. Open Chrome → `chrome://extensions`
4. Enable "Developer mode" (toggle in top right)
5. Click "Load unpacked" → select the extracted folder
### Usage
1. Click the Dev Browser extension icon in Chrome toolbar
2. Toggle to "Active"
3. Start the extension relay server (see above)
4. Use dev-browser scripts - they'll control your existing Chrome
## Troubleshooting
### Server Won't Start
**Check Node.js version:**
```bash
node --version # Should be v18+
```
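If the check needs to happen inside a script rather than in the shell, the same comparison can be sketched as below. `isSupportedNode` is a hypothetical helper. The default of 18 comes from the prerequisites above.

```typescript
// Hypothetical helper: does a Node.js version string meet the minimum major version?
function isSupportedNode(version: string, minimumMajor: number = 18): boolean {
  // Accepts both "v18.19.0" (from `node --version`) and "18.19.0".
  const major = parseInt(version.replace(/^v/, ""), 10);
  return !isNaN(major) && major >= minimumMajor;
}
```

In a Node.js script, the running version is available as `process.versions.node`.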
**Check port availability:**
```bash
# macOS/Linux
lsof -i :3000
# Windows
netstat -ano | findstr :3000
```
### Playwright Installation Issues
If Chromium fails to install:
```bash
npx playwright install chromium
```
### Windows-Specific Issues
**Execution Policy:**
If PowerShell scripts are blocked:
```powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```
**Path Issues:**
Use forward slashes or backslashes inside quoted paths; PowerShell accepts both, and backslashes need no escaping:
```powershell
# Good
cd "$env:USERPROFILE/.config/opencode/skills/dev-browser"
# Also good
cd "$env:USERPROFILE\.config\opencode\skills\dev-browser"
```
### Extension Not Connecting
1. Ensure extension is "Active" (click icon to toggle)
2. Check relay server is running (`npm run start-extension`)
3. Look for `Extension connected` message in console
4. Try reloading the extension in `chrome://extensions`
## Permissions
To skip permission prompts in Claude Code, add to `~/.claude/settings.json`:
```json
{
  "permissions": {
    "allow": ["Skill(dev-browser:dev-browser)", "Bash(npx tsx:*)"]
  }
}
```
## Updating
```bash
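# The manual install copies only skills/dev-browser (no .git), so re-clone and copy again
git clone https://github.com/sawyerhood/dev-browser /tmp/dev-browser-skill
cp -r /tmp/dev-browser-skill/skills/dev-browser/. ~/.config/opencode/skills/dev-browser/
rm -rf /tmp/dev-browser-skill
cd ~/.config/opencode/skills/dev-browser
npm install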
```
**Windows:**
```powershell
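# The manual install copies only skills\dev-browser (no .git), so re-clone and copy again
git clone https://github.com/sawyerhood/dev-browser $env:TEMP\dev-browser-skill
Copy-Item -Recurse -Force "$env:TEMP\dev-browser-skill\skills\dev-browser\*" "$env:USERPROFILE\.config\opencode\skills\dev-browser"
Remove-Item -Recurse -Force "$env:TEMP\dev-browser-skill"
cd "$env:USERPROFILE\.config\opencode\skills\dev-browser"
npm install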
```


@@ -0,0 +1,155 @@
# Data Scraping Guide
For large datasets (followers, posts, search results), **intercept and replay network requests** rather than scrolling and parsing the DOM. This is faster and more reliable, and pagination reduces to following the cursors the API already returns.
## Why Not Scroll?
Scrolling is slow and unreliable. APIs return structured data with pagination built in. Always prefer API replay.
## Start Small, Then Scale
**Don't try to automate everything at once.** Work incrementally:
1. **Capture one request** - verify you're intercepting the right endpoint
2. **Inspect one response** - understand the schema before writing extraction code
3. **Extract a few items** - make sure your parsing logic works
4. **Then scale up** - add pagination loop only after the basics work
This prevents wasting time debugging a complex script when the issue is a simple path like `data.user.timeline` vs `data.user.result.timeline`.
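A path mismatch of this kind is easier to spot with a helper that reports where the walk fails. The following is a minimal sketch; `probePath` and `ProbeResult` are hypothetical names, and the sample paths are the ones mentioned above.

```typescript
// Hypothetical helper: walk a dotted path and report the first segment that is missing.
interface ProbeResult {
  found: boolean;
  value?: unknown;
  missingAt?: string; // path up to and including the first missing key
}

function probePath(root: unknown, path: string): ProbeResult {
  let current: unknown = root;
  const walked: string[] = [];
  for (const key of path.split(".")) {
    walked.push(key);
    if (typeof current !== "object" || current === null || !(key in current)) {
      return { found: false, missingAt: walked.join(".") };
    }
    current = (current as Record<string, unknown>)[key];
  }
  return { found: true, value: current };
}
```

Running it against a saved response before writing any extraction code shows immediately which segment of the assumed path does not exist.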
## Step-by-Step Workflow
### 1. Capture Request Details
First, intercept a request to understand URL structure and required headers:
```typescript
import { connect, waitForPageLoad } from "@/client.js";
import * as fs from "node:fs";
const client = await connect();
const page = await client.page("site");
let capturedRequest = null;
page.on("request", (request) => {
  const url = request.url();
  // Look for API endpoints (adjust pattern for your target site)
  if (url.includes("/api/") || url.includes("/graphql/")) {
    capturedRequest = {
      url: url,
      headers: request.headers(),
      method: request.method(),
    };
    fs.writeFileSync("tmp/request-details.json", JSON.stringify(capturedRequest, null, 2));
    console.log("Captured request:", url.substring(0, 80) + "...");
  }
});
await page.goto("https://example.com/profile");
await waitForPageLoad(page);
await page.waitForTimeout(3000);
await client.disconnect();
```
### 2. Capture Response to Understand Schema
Save a raw response to inspect the data structure:
```typescript
// Assumes `page` and `fs` are set up as in the previous script
page.on("response", async (response) => {
  const url = response.url();
  if (url.includes("UserTweets") || url.includes("/api/data")) {
    const json = await response.json();
    fs.writeFileSync("tmp/api-response.json", JSON.stringify(json, null, 2));
    console.log("Captured response");
  }
});
```
Then analyze the structure to find:
- Where the data array lives (e.g., `data.user.result.timeline.instructions[].entries`)
- Where pagination cursors are (e.g., `cursor-bottom` entries)
- What fields you need to extract
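To locate the data array without reading the whole file by hand, a short recursive walk can list every array in the saved response. This is a sketch only: `findArrays` and `ArrayLocation` are hypothetical names, and the `[]` notation for array elements mirrors the path style used above.

```typescript
// Hypothetical helper: list every array in a parsed JSON value with its path and length.
interface ArrayLocation {
  path: string;
  length: number; // longest array seen at this path
}

function findArrays(root: unknown): ArrayLocation[] {
  const longest = new Map<string, number>();
  const visit = (value: unknown, path: string): void => {
    if (Array.isArray(value)) {
      longest.set(path, Math.max(longest.get(path) ?? 0, value.length));
      // Elements share one path so repeated structures collapse into a single entry.
      for (const item of value) visit(item, `${path}[]`);
    } else if (typeof value === "object" && value !== null) {
      for (const [key, child] of Object.entries(value)) {
        visit(child, path ? `${path}.${key}` : key);
      }
    }
  };
  visit(root, "");
  const locations: ArrayLocation[] = [];
  longest.forEach((length, path) => locations.push({ path, length }));
  return locations;
}
```

The longest arrays are usually the item lists; cursor entries tend to sit inside or next to them.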
### 3. Replay API with Pagination
Once you understand the schema, replay requests directly:
```typescript
import { connect } from "@/client.js";
import * as fs from "node:fs";
const client = await connect();
const page = await client.page("site");
const results = new Map(); // Use Map for deduplication
const headers = JSON.parse(fs.readFileSync("tmp/request-details.json", "utf8")).headers;
const baseUrl = "https://example.com/api/data";
let cursor = null;
let hasMore = true;
while (hasMore) {
  // Build URL with pagination cursor
  const params: Record<string, unknown> = { count: 20 };
  if (cursor) params.cursor = cursor;
  const url = `${baseUrl}?params=${encodeURIComponent(JSON.stringify(params))}`;
  // Execute fetch in browser context (has auth cookies/headers)
  const response = await page.evaluate(
    async ({ url, headers }) => {
      const res = await fetch(url, { headers });
      return res.json();
    },
    { url, headers }
  );
  // Extract data and cursor (adjust paths for your API)
  const entries = response?.data?.entries || [];
  let nextCursor = null; // Reset per page so a stale cursor cannot loop forever
  for (const entry of entries) {
    if (entry.type === "cursor-bottom") {
      nextCursor = entry.value;
    } else if (entry.id && !results.has(entry.id)) {
      results.set(entry.id, {
        id: entry.id,
        text: entry.content,
        timestamp: entry.created_at,
      });
    }
  }
  console.log(`Fetched page, total: ${results.size}`);
  // Check stop conditions: no cursor, cursor not advancing, or empty page
  if (!nextCursor || nextCursor === cursor || entries.length === 0) hasMore = false;
  cursor = nextCursor;
  // Rate limiting - be respectful
  await new Promise((r) => setTimeout(r, 500));
}
// Export results
const data = Array.from(results.values());
fs.writeFileSync("tmp/results.json", JSON.stringify(data, null, 2));
console.log(`Saved ${data.length} items`);
await client.disconnect();
```
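The stop conditions for a loop like the one above can also be gathered into one function, which makes them easy to test without a browser. This is a sketch under assumptions: `PageState`, `Limits`, and `shouldContinue` are hypothetical names, and the page cap and timestamp cutoff are illustrative extras rather than anything the target API requires.

```typescript
// Hypothetical stop-condition check for a cursor-paginated loop.
interface PageState {
  itemCount: number; // items returned by the latest page
  nextCursor: string | null; // cursor extracted from the latest page
  previousCursor: string | null; // cursor used to request the latest page
  pagesFetched: number;
  oldestTimestampMs?: number; // oldest item on the latest page, if known
}

interface Limits {
  maxPages: number; // safety cap (assumed value chosen by the caller)
  cutoffMs?: number; // stop once items are older than this
}

function shouldContinue(state: PageState, limits: Limits): boolean {
  if (state.itemCount === 0) return false; // empty page
  if (state.nextCursor === null) return false; // no cursor to follow
  if (state.nextCursor === state.previousCursor) return false; // cursor not advancing
  if (state.pagesFetched >= limits.maxPages) return false; // safety cap reached
  if (
    limits.cutoffMs !== undefined &&
    state.oldestTimestampMs !== undefined &&
    state.oldestTimestampMs < limits.cutoffMs
  ) {
    return false; // passed the date threshold
  }
  return true;
}
```

Keeping the decision separate from the fetch means the loop body only has to fill in a `PageState` after each page.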
## Key Patterns
| Pattern | Description |
| ----------------------- | ------------------------------------------------------ |
| `page.on('request')` | Capture outgoing request URL + headers |
| `page.on('response')` | Capture response data to understand schema |
| `page.evaluate(fetch)` | Replay requests in browser context (inherits auth) |
| `Map` for deduplication | APIs often return overlapping data across pages |
| Cursor-based pagination | Look for `cursor`, `next_token`, `offset` in responses |
## Tips
- **Extension mode**: `page.context().cookies()` doesn't work - capture auth headers from intercepted requests instead
- **Rate limiting**: Add 500ms+ delays between requests to avoid blocks
- **Stop conditions**: Check for empty results, missing cursor, or reaching a date/ID threshold
- **GraphQL APIs**: URL params often include `variables` and `features` JSON objects - capture and reuse them
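
For the GraphQL case, the captured `variables` and `features` objects can be reused by re-encoding them into the query string. The sketch below assumes the endpoint accepts them as URL-encoded JSON in a GET request and that the pagination cursor lives under a `cursor` key inside `variables`. Both are assumptions to verify against the captured request. `buildGraphqlUrl` and `withCursor` are hypothetical helpers.

```typescript
// Hypothetical helpers for replaying a captured GraphQL GET request.
type JsonObject = Record<string, unknown>;

// Re-encode the captured objects as URL-encoded JSON query parameters.
function buildGraphqlUrl(endpoint: string, variables: JsonObject, features: JsonObject): string {
  const query = [
    `variables=${encodeURIComponent(JSON.stringify(variables))}`,
    `features=${encodeURIComponent(JSON.stringify(features))}`,
  ].join("&");
  return `${endpoint}?${query}`;
}

// Copy the captured variables and override only the pagination cursor.
// Assumption: the API reads the cursor from a "cursor" key.
function withCursor(variables: JsonObject, cursor: string | null): JsonObject {
  return cursor ? { ...variables, cursor } : { ...variables };
}
```

Each page of a replay loop would then call `buildGraphqlUrl(endpoint, withCursor(captured, cursor), features)` with the cursor from the previous response.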


@@ -1,2 +1,2 @@
export * from "./types"
export { createBuiltinSkills } from "./skills"
export { createBuiltinSkills, type CreateBuiltinSkillsOptions } from "./skills"


@@ -0,0 +1,89 @@
import { describe, test, expect } from "bun:test"
import { createBuiltinSkills } from "./skills"
describe("createBuiltinSkills", () => {
test("returns playwright skill by default", () => {
// #given - no options (default)
// #when
const skills = createBuiltinSkills()
// #then
const browserSkill = skills.find((s) => s.name === "playwright")
expect(browserSkill).toBeDefined()
expect(browserSkill!.description).toContain("browser")
expect(browserSkill!.mcpConfig).toHaveProperty("playwright")
})
test("returns playwright skill when browserProvider is 'playwright'", () => {
// #given
const options = { browserProvider: "playwright" as const }
// #when
const skills = createBuiltinSkills(options)
// #then
const playwrightSkill = skills.find((s) => s.name === "playwright")
const agentBrowserSkill = skills.find((s) => s.name === "agent-browser")
expect(playwrightSkill).toBeDefined()
expect(agentBrowserSkill).toBeUndefined()
})
test("returns agent-browser skill when browserProvider is 'agent-browser'", () => {
// #given
const options = { browserProvider: "agent-browser" as const }
// #when
const skills = createBuiltinSkills(options)
// #then
const agentBrowserSkill = skills.find((s) => s.name === "agent-browser")
const playwrightSkill = skills.find((s) => s.name === "playwright")
expect(agentBrowserSkill).toBeDefined()
expect(agentBrowserSkill!.description).toContain("browser")
expect(agentBrowserSkill!.allowedTools).toContain("Bash(agent-browser:*)")
expect(agentBrowserSkill!.template).toContain("agent-browser")
expect(playwrightSkill).toBeUndefined()
})
test("agent-browser skill template is inlined (not loaded from file)", () => {
// #given
const options = { browserProvider: "agent-browser" as const }
// #when
const skills = createBuiltinSkills(options)
const agentBrowserSkill = skills.find((s) => s.name === "agent-browser")
// #then - template should contain substantial content (inlined, not fallback)
expect(agentBrowserSkill!.template).toContain("## Quick start")
expect(agentBrowserSkill!.template).toContain("## Commands")
expect(agentBrowserSkill!.template).toContain("agent-browser open")
expect(agentBrowserSkill!.template).toContain("agent-browser snapshot")
})
test("always includes frontend-ui-ux and git-master skills", () => {
// #given - both provider options
// #when
const defaultSkills = createBuiltinSkills()
const agentBrowserSkills = createBuiltinSkills({ browserProvider: "agent-browser" })
// #then
for (const skills of [defaultSkills, agentBrowserSkills]) {
expect(skills.find((s) => s.name === "frontend-ui-ux")).toBeDefined()
expect(skills.find((s) => s.name === "git-master")).toBeDefined()
}
})
test("returns exactly 4 skills regardless of provider", () => {
// #given
// #when
const defaultSkills = createBuiltinSkills()
const agentBrowserSkills = createBuiltinSkills({ browserProvider: "agent-browser" })
// #then
expect(defaultSkills).toHaveLength(4)
expect(agentBrowserSkills).toHaveLength(4)
})
})


@@ -1,4 +1,5 @@
import type { BuiltinSkill } from "./types"
import type { BrowserAutomationProvider } from "../../config/schema"
const playwrightSkill: BuiltinSkill = {
name: "playwright",
@@ -14,6 +15,303 @@ This skill provides browser automation capabilities via the Playwright MCP serve
},
}
const agentBrowserSkill: BuiltinSkill = {
name: "agent-browser",
description: "MUST USE for any browser-related tasks. Browser automation via agent-browser CLI - verification, browsing, information gathering, web scraping, testing, screenshots, and all browser interactions.",
template: `# Browser Automation with agent-browser
## Quick start
\`\`\`bash
agent-browser open <url> # Navigate to page
agent-browser snapshot -i # Get interactive elements with refs
agent-browser click @e1 # Click element by ref
agent-browser fill @e2 "text" # Fill input by ref
agent-browser close # Close browser
\`\`\`
## Core workflow
1. Navigate: \`agent-browser open <url>\`
2. Snapshot: \`agent-browser snapshot -i\` (returns elements with refs like \`@e1\`, \`@e2\`)
3. Interact using refs from the snapshot
4. Re-snapshot after navigation or significant DOM changes
## Commands
### Navigation
\`\`\`bash
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser
\`\`\`
### Snapshot (page analysis)
\`\`\`bash
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to CSS selector
\`\`\`
### Interactions (use @refs from snapshot)
\`\`\`bash
agent-browser click @e1 # Click
agent-browser dblclick @e1 # Double-click
agent-browser focus @e1 # Focus element
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key
agent-browser press Control+a # Key combination
agent-browser keydown Shift # Hold key down
agent-browser keyup Shift # Release key
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value" # Select dropdown
agent-browser scroll down 500 # Scroll page
agent-browser scrollintoview @e1 # Scroll element into view
agent-browser drag @e1 @e2 # Drag and drop
agent-browser upload @e1 file.pdf # Upload files
\`\`\`
### Get information
\`\`\`bash
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get innerHTML
agent-browser get value @e1 # Get input value
agent-browser get attr @e1 href # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count ".item" # Count matching elements
agent-browser get box @e1 # Get bounding box
\`\`\`
### Check state
\`\`\`bash
agent-browser is visible @e1 # Check if visible
agent-browser is enabled @e1 # Check if enabled
agent-browser is checked @e1 # Check if checked
\`\`\`
### Screenshots & PDF
\`\`\`bash
agent-browser screenshot # Screenshot to stdout
agent-browser screenshot path.png # Save to file
agent-browser screenshot --full # Full page
agent-browser pdf output.pdf # Save as PDF
\`\`\`
### Video recording
\`\`\`bash
agent-browser record start ./demo.webm # Start recording (uses current URL + state)
agent-browser click @e1 # Perform actions
agent-browser record stop # Stop and save video
agent-browser record restart ./take2.webm # Stop current + start new recording
\`\`\`
Recording creates a fresh context but preserves cookies/storage from your session.
### Wait
\`\`\`bash
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text
agent-browser wait --url "**/dashboard" # Wait for URL pattern
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --fn "window.ready" # Wait for JS condition
\`\`\`
### Mouse control
\`\`\`bash
agent-browser mouse move 100 200 # Move mouse
agent-browser mouse down left # Press button
agent-browser mouse up left # Release button
agent-browser mouse wheel 100 # Scroll wheel
\`\`\`
### Semantic locators (alternative to refs)
\`\`\`bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find first ".item" click
agent-browser find nth 2 "a" text
\`\`\`
### Browser settings
\`\`\`bash
agent-browser set viewport 1920 1080 # Set viewport size
agent-browser set device "iPhone 14" # Emulate device
agent-browser set geo 37.7749 -122.4194 # Set geolocation
agent-browser set offline on # Toggle offline mode
agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
agent-browser set credentials user pass # HTTP basic auth
agent-browser set media dark # Emulate color scheme
\`\`\`
### Cookies & Storage
\`\`\`bash
agent-browser cookies # Get all cookies
agent-browser cookies set name value # Set cookie
agent-browser cookies clear # Clear cookies
agent-browser storage local # Get all localStorage
agent-browser storage local key # Get specific key
agent-browser storage local set k v # Set value
agent-browser storage local clear # Clear all
agent-browser storage session # Get all sessionStorage
agent-browser storage session key # Get specific key
agent-browser storage session set k v # Set value
agent-browser storage session clear # Clear all
\`\`\`
### Network
\`\`\`bash
agent-browser network route <url> # Intercept requests
agent-browser network route <url> --abort # Block requests
agent-browser network route <url> --body '{}' # Mock response
agent-browser network unroute [url] # Remove routes
agent-browser network requests # View tracked requests
agent-browser network requests --filter api # Filter requests
\`\`\`
### Tabs & Windows
\`\`\`bash
agent-browser tab # List tabs
agent-browser tab new [url] # New tab
agent-browser tab 2 # Switch to tab
agent-browser tab close # Close tab
agent-browser window new # New window
\`\`\`
### Frames
\`\`\`bash
agent-browser frame "#iframe" # Switch to iframe
agent-browser frame main # Back to main frame
\`\`\`
### Dialogs
\`\`\`bash
agent-browser dialog accept [text] # Accept dialog
agent-browser dialog dismiss # Dismiss dialog
\`\`\`
### JavaScript
\`\`\`bash
agent-browser eval "document.title" # Run JavaScript
\`\`\`
## Global Options
| Option | Description |
|--------|-------------|
| \`--session <name>\` | Isolated browser session (\`AGENT_BROWSER_SESSION\` env) |
| \`--profile <path>\` | Persistent browser profile (\`AGENT_BROWSER_PROFILE\` env) |
| \`--headers <json>\` | HTTP headers scoped to URL's origin |
| \`--executable-path <path>\` | Custom browser binary (\`AGENT_BROWSER_EXECUTABLE_PATH\` env) |
| \`--args <args>\` | Browser launch args (\`AGENT_BROWSER_ARGS\` env) |
| \`--user-agent <ua>\` | Custom User-Agent (\`AGENT_BROWSER_USER_AGENT\` env) |
| \`--proxy <url>\` | Proxy server (\`AGENT_BROWSER_PROXY\` env) |
| \`--proxy-bypass <hosts>\` | Hosts to bypass proxy (\`AGENT_BROWSER_PROXY_BYPASS\` env) |
| \`-p, --provider <name>\` | Cloud browser provider (\`AGENT_BROWSER_PROVIDER\` env) |
| \`--json\` | Machine-readable JSON output |
| \`--headed\` | Show browser window (not headless) |
| \`--cdp <port\\|wss://url>\` | Connect via Chrome DevTools Protocol |
| \`--debug\` | Debug output |
## Example: Form submission
\`\`\`bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
\`\`\`
## Example: Authentication with saved state
\`\`\`bash
# Login once
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Later sessions: load saved state
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
\`\`\`
### Header-based Auth (Skip login flows)
\`\`\`bash
# Headers scoped to api.example.com only
agent-browser open api.example.com --headers '{"Authorization": "Bearer <token>"}'
# Navigate to another domain - headers NOT sent (safe)
agent-browser open other-site.com
# Global headers (all domains)
agent-browser set headers '{"X-Custom-Header": "value"}'
\`\`\`
## Sessions & Persistent Profiles
### Sessions (parallel browsers)
\`\`\`bash
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list
\`\`\`
### Persistent Profiles
Persists cookies, localStorage, IndexedDB, service workers, cache, login sessions across browser restarts.
\`\`\`bash
agent-browser --profile ~/.myapp-profile open myapp.com
# Or via env var
AGENT_BROWSER_PROFILE=~/.myapp-profile agent-browser open myapp.com
\`\`\`
- Use different profile paths for different projects
- Login once → restart browser → still logged in
- Stores: cookies, localStorage, IndexedDB, service workers, browser cache
## JSON output (for parsing)
Add \`--json\` for machine-readable output:
\`\`\`bash
agent-browser snapshot -i --json
agent-browser get text @e1 --json
\`\`\`
## Debugging
\`\`\`bash
agent-browser open example.com --headed # Show browser window
agent-browser console # View console messages
agent-browser errors # View page errors
agent-browser record start ./debug.webm # Record from current page
agent-browser record stop # Save recording
agent-browser connect 9222 # Local CDP port
agent-browser --cdp "wss://browser-service.com/cdp?token=..." snapshot # Remote via WebSocket
agent-browser console --clear # Clear console
agent-browser errors --clear # Clear errors
agent-browser highlight @e1 # Highlight element
agent-browser trace start # Start recording trace
agent-browser trace stop trace.zip # Stop and save trace
\`\`\`
---
Install: \`bun add -g agent-browser && agent-browser install\`. Run \`agent-browser --help\` for all commands. Repo: https://github.com/vercel-labs/agent-browser`,
allowedTools: ["Bash(agent-browser:*)"],
}
const frontendUiUxSkill: BuiltinSkill = {
name: "frontend-ui-ux",
description: "Designer-turned-developer who crafts stunning UI/UX even without design mockups",
@@ -1198,6 +1496,234 @@ POTENTIAL ACTIONS:
- Bisect without proper good/bad boundaries -> Wasted time`,
}
export function createBuiltinSkills(): BuiltinSkill[] {
return [playwrightSkill, frontendUiUxSkill, gitMasterSkill]
const devBrowserSkill: BuiltinSkill = {
name: "dev-browser",
description:
"Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include 'go to [url]', 'click on', 'fill out the form', 'take a screenshot', 'scrape', 'automate', 'test the website', 'log into', or any browser interaction request.",
template: `# Dev Browser Skill
Browser automation that maintains page state across script executions. Write small, focused scripts to accomplish tasks incrementally. Once you've proven out part of a workflow and there is repeated work to be done, you can write a script to do the repeated work in a single execution.
## Choosing Your Approach
- **Local/source-available sites**: Read the source code first to write selectors directly
- **Unknown page layouts**: Use \`getAISnapshot()\` to discover elements and \`selectSnapshotRef()\` to interact with them
- **Visual feedback**: Take screenshots to see what the user sees
## Setup
**IMPORTANT**: Before using this skill, ensure the server is running. See [references/installation.md](references/installation.md) for platform-specific setup instructions (macOS, Linux, Windows).
Two modes available. Ask the user if unclear which to use.
### Standalone Mode (Default)
Launches a new Chromium browser for fresh automation sessions.
**macOS/Linux:**
\`\`\`bash
./skills/dev-browser/server.sh &
\`\`\`
**Windows (PowerShell):**
\`\`\`powershell
Start-Process -NoNewWindow -FilePath "node" -ArgumentList "skills/dev-browser/server.js"
\`\`\`
Add \`--headless\` flag if user requests it. **Wait for the \`Ready\` message before running scripts.**
### Extension Mode
Connects to user's existing Chrome browser. Use this when:
- The user is already logged into sites and wants you to work inside an authenticated session that is not local development.
- The user asks you to use the extension
**Important**: The core flow is still the same. You create named pages inside of their browser.
**Start the relay server:**
**macOS/Linux:**
\`\`\`bash
cd skills/dev-browser && npm i && npm run start-extension &
\`\`\`
**Windows (PowerShell):**
\`\`\`powershell
cd skills/dev-browser; npm i; Start-Process -NoNewWindow -FilePath "npm" -ArgumentList "run", "start-extension"
\`\`\`
Wait for \`Waiting for extension to connect...\` followed by \`Extension connected\` in the console.
If the extension hasn't connected yet, tell the user to launch and activate it. Download link: https://github.com/SawyerHood/dev-browser/releases
## Writing Scripts
> **Run all scripts from \`skills/dev-browser/\` directory.** The \`@/\` import alias requires this directory's config.
Execute scripts inline using heredocs:
**macOS/Linux:**
\`\`\`bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect, waitForPageLoad } from "@/client.js";
const client = await connect();
const page = await client.page("example", { viewport: { width: 1920, height: 1080 } });
await page.goto("https://example.com");
await waitForPageLoad(page);
console.log({ title: await page.title(), url: page.url() });
await client.disconnect();
EOF
\`\`\`
**Windows (PowerShell):**
\`\`\`powershell
cd skills/dev-browser
@"
import { connect, waitForPageLoad } from "@/client.js";
const client = await connect();
const page = await client.page("example", { viewport: { width: 1920, height: 1080 } });
await page.goto("https://example.com");
await waitForPageLoad(page);
console.log({ title: await page.title(), url: page.url() });
await client.disconnect();
"@ | npx tsx --input-type=module
\`\`\`
### Key Principles
1. **Small scripts**: Each script does ONE thing (navigate, click, fill, check)
2. **Evaluate state**: Log/return state at the end to decide next steps
3. **Descriptive page names**: Use \`"checkout"\`, \`"login"\`, not \`"main"\`
4. **Disconnect to exit**: \`await client.disconnect()\` - pages persist on server
5. **Plain JS in evaluate**: \`page.evaluate()\` runs in browser - no TypeScript syntax
## Workflow Loop
1. **Write a script** to perform one action
2. **Run it** and observe the output
3. **Evaluate** - did it work? What's the current state?
4. **Decide** - is the task complete or do we need another script?
5. **Repeat** until task is done
### No TypeScript in Browser Context
Code passed to \`page.evaluate()\` runs in the browser, which doesn't understand TypeScript:
\`\`\`typescript
// Correct: plain JavaScript
const text = await page.evaluate(() => {
return document.body.innerText;
});
// Wrong: TypeScript syntax will fail at runtime
const text = await page.evaluate(() => {
const el: HTMLElement = document.body; // Type annotation breaks in browser!
return el.innerText;
});
\`\`\`
## Scraping Data
For scraping large datasets, intercept and replay network requests rather than scrolling the DOM. See [references/scraping.md](references/scraping.md) for the complete guide.
## Client API
\`\`\`typescript
const client = await connect();
// Get or create named page
const page = await client.page("name");
const pageWithSize = await client.page("name", { viewport: { width: 1920, height: 1080 } });
const pages = await client.list(); // List all page names
await client.close("name"); // Close a page
await client.disconnect(); // Disconnect (pages persist)
// ARIA Snapshot methods
const snapshot = await client.getAISnapshot("name"); // Get accessibility tree
const element = await client.selectSnapshotRef("name", "e5"); // Get element by ref
\`\`\`
## Waiting
\`\`\`typescript
import { waitForPageLoad } from "@/client.js";
await waitForPageLoad(page); // After navigation
await page.waitForSelector(".results"); // For specific elements
await page.waitForURL("**/success"); // For specific URL
\`\`\`
## Screenshots
\`\`\`typescript
await page.screenshot({ path: "tmp/screenshot.png" });
await page.screenshot({ path: "tmp/full.png", fullPage: true });
\`\`\`
## ARIA Snapshot (Element Discovery)
Use \`getAISnapshot()\` to discover page elements. Returns YAML-formatted accessibility tree:
\`\`\`yaml
- banner:
- link "Hacker News" [ref=e1]
- navigation:
- link "new" [ref=e2]
- main:
- list:
- listitem:
- link "Article Title" [ref=e8]
\`\`\`
**Interacting with refs:**
\`\`\`typescript
const snapshot = await client.getAISnapshot("hackernews");
console.log(snapshot); // Find the ref you need
const element = await client.selectSnapshotRef("hackernews", "e2");
await element.click();
\`\`\`
## Error Recovery
Page state persists after failures. Debug with:
\`\`\`bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect } from "@/client.js";
const client = await connect();
const page = await client.page("hackernews");
await page.screenshot({ path: "tmp/debug.png" });
console.log({
url: page.url(),
title: await page.title(),
bodyText: await page.textContent("body").then((t) => t?.slice(0, 200)),
});
await client.disconnect();
EOF
\`\`\``,
}
export interface CreateBuiltinSkillsOptions {
browserProvider?: BrowserAutomationProvider
}
export function createBuiltinSkills(options: CreateBuiltinSkillsOptions = {}): BuiltinSkill[] {
const { browserProvider = "playwright" } = options
const browserSkill = browserProvider === "agent-browser" ? agentBrowserSkill : playwrightSkill
return [browserSkill, frontendUiUxSkill, gitMasterSkill, devBrowserSkill]
}
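
The provider switch in `createBuiltinSkills` can be looked at in isolation. The following is a minimal sketch, not the project's code: the `Skill` type, the `pickSkills` name, and the two-value `BrowserProvider` union are hypothetical stand-ins (the real `BrowserAutomationProvider` type is defined in the config schema and is not shown in this diff). It illustrates that exactly one browser skill is returned and that an explicit `undefined` option still falls back to `playwright`, which matters because the resolver passes `options?.browserProvider` straight through.

```typescript
// Hypothetical, simplified stand-ins for the real BuiltinSkill objects.
type BrowserProvider = "playwright" | "agent-browser"

interface Skill {
  name: string
}

const playwright: Skill = { name: "playwright" }
const agentBrowser: Skill = { name: "agent-browser" }
const gitMaster: Skill = { name: "git-master" }

// Same shape as the selection in the diff: one browser skill is included,
// chosen by the option, with "playwright" as the destructuring default.
function pickSkills(options: { browserProvider?: BrowserProvider | undefined } = {}): Skill[] {
  const { browserProvider = "playwright" } = options
  const browserSkill = browserProvider === "agent-browser" ? agentBrowser : playwright
  return [browserSkill, gitMaster]
}

const defaultNames = pickSkills().map((s) => s.name)
const agentNames = pickSkills({ browserProvider: "agent-browser" }).map((s) => s.name)
const undefinedNames = pickSkills({ browserProvider: undefined }).map((s) => s.name)
console.log(defaultNames, agentNames, undefinedNames)
```

Because the two browser skills are mutually exclusive, a lookup for `playwright` under the `agent-browser` provider finds nothing, which is what the resolver tests later in this diff assert.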


@@ -1,4 +1,4 @@
import { describe, test, expect, beforeEach } from "bun:test"
import { describe, test, expect, beforeEach, afterEach } from "bun:test"
import {
setSessionAgent,
getSessionAgent,
@@ -13,9 +13,11 @@ describe("claude-code-session-state", () => {
beforeEach(() => {
// #given - clean state before each test
_resetForTesting()
clearSessionAgent("test-session-1")
clearSessionAgent("test-session-2")
clearSessionAgent("test-prometheus-session")
})
afterEach(() => {
// #then - cleanup after each test to prevent pollution
_resetForTesting()
})
describe("setSessionAgent", () => {
@@ -37,7 +39,7 @@ describe("claude-code-session-state", () => {
setSessionAgent(sessionID, "Prometheus (Planner)")
// #when - try to overwrite
setSessionAgent(sessionID, "Sisyphus")
setSessionAgent(sessionID, "sisyphus")
// #then - first agent preserved
expect(getSessionAgent(sessionID)).toBe("Prometheus (Planner)")
@@ -58,10 +60,10 @@ describe("claude-code-session-state", () => {
setSessionAgent(sessionID, "Prometheus (Planner)")
// #when - force update
updateSessionAgent(sessionID, "Sisyphus")
updateSessionAgent(sessionID, "sisyphus")
// #then
expect(getSessionAgent(sessionID)).toBe("Sisyphus")
expect(getSessionAgent(sessionID)).toBe("sisyphus")
})
})
@@ -92,9 +94,9 @@ describe("claude-code-session-state", () => {
expect(getMainSessionID()).toBe(mainID)
})
test.skip("should return undefined when not set", () => {
// #given - not set
// TODO: Fix flaky test - parallel test execution causes state pollution
test("should return undefined when not set", () => {
// #given - explicit reset to ensure clean state (parallel test isolation)
_resetForTesting()
// #then
expect(getMainSessionID()).toBeUndefined()
})
@@ -129,7 +131,7 @@ describe("claude-code-session-state", () => {
// #given - user switches to custom agent "MyCustomAgent"
const sessionID = "test-session-custom"
const customAgent = "MyCustomAgent"
const defaultAgent = "Sisyphus"
const defaultAgent = "sisyphus"
// User switches to custom agent (via UI)
setSessionAgent(sessionID, customAgent)


@@ -14,6 +14,7 @@ export function getMainSessionID(): string | undefined {
export function _resetForTesting(): void {
_mainSessionID = undefined
subagentSessions.clear()
sessionAgentMap.clear()
}
const sessionAgentMap = new Map<string, string>()
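
Why the extra `sessionAgentMap.clear()` matters can be shown with a small model of module-level state. This is a sketch under assumptions: the functions below are hypothetical miniatures, and the "first writer wins" behaviour of the setter is inferred from the "first agent preserved" test in this diff rather than from the real implementation.

```typescript
// Hypothetical miniature of a module that keeps shared, module-level state.
const agentBySession = new Map<string, string>()

function setAgent(sessionID: string, agent: string): void {
  // First writer wins (assumed from the "first agent preserved" test).
  if (!agentBySession.has(sessionID)) agentBySession.set(sessionID, agent)
}

function getAgent(sessionID: string): string | undefined {
  return agentBySession.get(sessionID)
}

function resetForTesting(): void {
  // Without this clear(), entries written by one test stay visible to the next.
  agentBySession.clear()
}

setAgent("s1", "Prometheus (Planner)")
setAgent("s1", "sisyphus") // ignored: an agent is already recorded for s1
const beforeReset = getAgent("s1")
resetForTesting()
const afterReset = getAgent("s1")
console.log({ beforeReset, afterReset })
```

With a first-writer-wins setter, a leftover entry from an earlier test silently blocks later writes, so clearing the map in the reset helper removes a whole class of order-dependent failures.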


@@ -21,7 +21,7 @@ describe("createContextInjectorMessagesTransformHook", () => {
sessionID,
role,
time: { created: Date.now() },
agent: "Sisyphus",
agent: "sisyphus",
model: { providerID: "test", modelID: "test" },
path: { cwd: "/", root: "/" },
},


@@ -265,3 +265,66 @@ describe("resolveMultipleSkillsAsync", () => {
expect(result.notFound).toEqual([])
})
})
describe("resolveSkillContent with browserProvider", () => {
it("should resolve agent-browser skill when browserProvider is 'agent-browser'", () => {
// #given: browserProvider set to agent-browser
const options = { browserProvider: "agent-browser" as const }
// #when: resolving content for 'agent-browser'
const result = resolveSkillContent("agent-browser", options)
// #then: returns agent-browser template
expect(result).not.toBeNull()
expect(result).toContain("agent-browser")
})
it("should return null for agent-browser when browserProvider is default", () => {
// #given: no browserProvider (defaults to playwright)
// #when: resolving content for 'agent-browser'
const result = resolveSkillContent("agent-browser")
// #then: returns null because agent-browser is not in default builtin skills
expect(result).toBeNull()
})
it("should return null for playwright when browserProvider is agent-browser", () => {
// #given: browserProvider set to agent-browser
const options = { browserProvider: "agent-browser" as const }
// #when: resolving content for 'playwright'
const result = resolveSkillContent("playwright", options)
// #then: returns null because playwright is replaced by agent-browser
expect(result).toBeNull()
})
})
describe("resolveMultipleSkills with browserProvider", () => {
it("should resolve agent-browser when browserProvider is set", () => {
// #given: agent-browser and git-master requested with browserProvider
const skillNames = ["agent-browser", "git-master"]
const options = { browserProvider: "agent-browser" as const }
// #when: resolving multiple skills
const result = resolveMultipleSkills(skillNames, options)
// #then: both resolved
expect(result.resolved.has("agent-browser")).toBe(true)
expect(result.resolved.has("git-master")).toBe(true)
expect(result.notFound).toHaveLength(0)
})
it("should not resolve agent-browser without browserProvider option", () => {
// #given: agent-browser requested without browserProvider
const skillNames = ["agent-browser"]
// #when: resolving multiple skills
const result = resolveMultipleSkills(skillNames)
// #then: agent-browser not found
expect(result.resolved.has("agent-browser")).toBe(false)
expect(result.notFound).toContain("agent-browser")
})
})


@@ -3,24 +3,27 @@ import { discoverSkills } from "./loader"
import type { LoadedSkill } from "./types"
import { parseFrontmatter } from "../../shared/frontmatter"
import { readFileSync } from "node:fs"
import type { GitMasterConfig } from "../../config/schema"
import type { GitMasterConfig, BrowserAutomationProvider } from "../../config/schema"
export interface SkillResolutionOptions {
gitMasterConfig?: GitMasterConfig
browserProvider?: BrowserAutomationProvider
}
let cachedSkills: LoadedSkill[] | null = null
const cachedSkillsByProvider = new Map<string, LoadedSkill[]>()
function clearSkillCache(): void {
cachedSkills = null
cachedSkillsByProvider.clear()
}
async function getAllSkills(): Promise<LoadedSkill[]> {
if (cachedSkills) return cachedSkills
async function getAllSkills(options?: SkillResolutionOptions): Promise<LoadedSkill[]> {
const cacheKey = options?.browserProvider ?? "playwright"
const cached = cachedSkillsByProvider.get(cacheKey)
if (cached) return cached
const [discoveredSkills, builtinSkillDefs] = await Promise.all([
discoverSkills({ includeClaudeCodePaths: true }),
Promise.resolve(createBuiltinSkills()),
Promise.resolve(createBuiltinSkills({ browserProvider: options?.browserProvider })),
])
const builtinSkillsAsLoaded: LoadedSkill[] = builtinSkillDefs.map((skill) => ({
@@ -44,8 +47,9 @@ async function getAllSkills(): Promise<LoadedSkill[]> {
const discoveredNames = new Set(discoveredSkills.map((s) => s.name))
const uniqueBuiltins = builtinSkillsAsLoaded.filter((s) => !discoveredNames.has(s.name))
cachedSkills = [...discoveredSkills, ...uniqueBuiltins]
return cachedSkills
const allSkills = [...discoveredSkills, ...uniqueBuiltins]
cachedSkillsByProvider.set(cacheKey, allSkills)
return allSkills
}
async function extractSkillTemplate(skill: LoadedSkill): Promise<string> {
@@ -118,7 +122,7 @@ export function injectGitMasterConfig(template: string, config?: GitMasterConfig
}
export function resolveSkillContent(skillName: string, options?: SkillResolutionOptions): string | null {
const skills = createBuiltinSkills()
const skills = createBuiltinSkills({ browserProvider: options?.browserProvider })
const skill = skills.find((s) => s.name === skillName)
if (!skill) return null
@@ -133,7 +137,7 @@ export function resolveMultipleSkills(skillNames: string[], options?: SkillResol
resolved: Map<string, string>
notFound: string[]
} {
const skills = createBuiltinSkills()
const skills = createBuiltinSkills({ browserProvider: options?.browserProvider })
const skillMap = new Map(skills.map((s) => [s.name, s.template]))
const resolved = new Map<string, string>()
@@ -159,7 +163,7 @@ export async function resolveSkillContentAsync(
skillName: string,
options?: SkillResolutionOptions
): Promise<string | null> {
const allSkills = await getAllSkills()
const allSkills = await getAllSkills(options)
const skill = allSkills.find((s) => s.name === skillName)
if (!skill) return null
@@ -179,7 +183,7 @@ export async function resolveMultipleSkillsAsync(
resolved: Map<string, string>
notFound: string[]
}> {
const allSkills = await getAllSkills()
const allSkills = await getAllSkills(options)
const skillMap = new Map<string, LoadedSkill>()
for (const skill of allSkills) {
skillMap.set(skill.name, skill)
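
The move from a single `cachedSkills` variable to a `Map` keyed by provider can be sketched as follows. This is a synchronous simplification for illustration only: `loadSkillsFor` is a hypothetical stand-in for skill discovery, and the real `getAllSkills` is async and also merges discovered skills.

```typescript
// Hypothetical loader standing in for skill discovery; it counts its calls
// so the caching behaviour is observable.
let loadCount = 0
function loadSkillsFor(provider: string): string[] {
  loadCount += 1
  return [provider, "git-master"]
}

const cacheByProvider = new Map<string, string[]>()

function getSkills(provider?: string): string[] {
  // A missing provider shares the cache entry of the default provider.
  const cacheKey = provider ?? "playwright"
  const cached = cacheByProvider.get(cacheKey)
  if (cached) return cached
  const loaded = loadSkillsFor(cacheKey)
  cacheByProvider.set(cacheKey, loaded)
  return loaded
}

const first = getSkills()
const second = getSkills("playwright")
const third = getSkills("agent-browser")
console.log({ sameEntry: first === second, loadCount, third })
```

A single shared cache would have returned whichever provider's skills were loaded first; keying by provider keeps the two result sets apart while still loading each at most once.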


@@ -0,0 +1,112 @@
import { describe, it, expect } from "bun:test"
import {
MailboxMessageSchema,
PermissionRequestSchema,
PermissionResponseSchema,
ShutdownRequestSchema,
TaskAssignmentSchema,
JoinRequestSchema,
ProtocolMessageSchema,
} from "./types"
describe("MailboxMessageSchema", () => {
//#given a valid mailbox message
//#when parsing
//#then it should succeed
it("parses valid message", () => {
const msg = {
from: "agent-001",
text: '{"type":"idle_notification"}',
timestamp: "2026-01-27T10:00:00Z",
read: false,
}
expect(MailboxMessageSchema.safeParse(msg).success).toBe(true)
})
//#given a message with optional color
//#when parsing
//#then it should succeed
it("parses message with color", () => {
const msg = {
from: "agent-001",
text: "{}",
timestamp: "2026-01-27T10:00:00Z",
color: "blue",
read: true,
}
expect(MailboxMessageSchema.safeParse(msg).success).toBe(true)
})
})
describe("ProtocolMessageSchema", () => {
//#given permission_request message
//#when parsing
//#then it should succeed
it("parses permission_request", () => {
const msg = {
type: "permission_request",
requestId: "req-123",
toolName: "Bash",
input: { command: "rm -rf /" },
agentId: "agent-001",
timestamp: Date.now(),
}
expect(PermissionRequestSchema.safeParse(msg).success).toBe(true)
})
//#given permission_response message
//#when parsing
//#then it should succeed
it("parses permission_response", () => {
const approved = {
type: "permission_response",
requestId: "req-123",
decision: "approved",
updatedInput: { command: "ls" },
}
expect(PermissionResponseSchema.safeParse(approved).success).toBe(true)
const rejected = {
type: "permission_response",
requestId: "req-123",
decision: "rejected",
feedback: "Too dangerous",
}
expect(PermissionResponseSchema.safeParse(rejected).success).toBe(true)
})
//#given shutdown_request message
//#when parsing
//#then it should succeed
it("parses shutdown_request", () => {
const request = { type: "shutdown_request" }
expect(ShutdownRequestSchema.safeParse(request).success).toBe(true)
})
//#given task_assignment message
//#when parsing
//#then it should succeed
it("parses task_assignment", () => {
const msg = {
type: "task_assignment",
taskId: "1",
subject: "Fix bug",
description: "Fix the auth bug",
assignedBy: "team-lead",
timestamp: Date.now(),
}
expect(TaskAssignmentSchema.safeParse(msg).success).toBe(true)
})
//#given join_request message
//#when parsing
//#then it should succeed
it("parses join_request", () => {
const msg = {
type: "join_request",
agentName: "new-agent",
sessionId: "sess-123",
}
expect(JoinRequestSchema.safeParse(msg).success).toBe(true)
})
})


@@ -0,0 +1,153 @@
import { z } from "zod"
export const MailboxMessageSchema = z.object({
from: z.string(),
text: z.string(),
timestamp: z.string(),
color: z.string().optional(),
read: z.boolean(),
})
export type MailboxMessage = z.infer<typeof MailboxMessageSchema>
export const PermissionRequestSchema = z.object({
type: z.literal("permission_request"),
requestId: z.string(),
toolName: z.string(),
input: z.unknown(),
agentId: z.string(),
timestamp: z.number(),
})
export type PermissionRequest = z.infer<typeof PermissionRequestSchema>
export const PermissionResponseSchema = z.object({
type: z.literal("permission_response"),
requestId: z.string(),
decision: z.enum(["approved", "rejected"]),
updatedInput: z.unknown().optional(),
feedback: z.string().optional(),
permissionUpdates: z.unknown().optional(),
})
export type PermissionResponse = z.infer<typeof PermissionResponseSchema>
export const ShutdownRequestSchema = z.object({
type: z.literal("shutdown_request"),
})
export type ShutdownRequest = z.infer<typeof ShutdownRequestSchema>
export const ShutdownApprovedSchema = z.object({
type: z.literal("shutdown_approved"),
})
export type ShutdownApproved = z.infer<typeof ShutdownApprovedSchema>
export const ShutdownRejectedSchema = z.object({
type: z.literal("shutdown_rejected"),
reason: z.string().optional(),
})
export type ShutdownRejected = z.infer<typeof ShutdownRejectedSchema>
export const TaskAssignmentSchema = z.object({
type: z.literal("task_assignment"),
taskId: z.string(),
subject: z.string(),
description: z.string(),
assignedBy: z.string(),
timestamp: z.number(),
})
export type TaskAssignment = z.infer<typeof TaskAssignmentSchema>
export const TaskCompletedSchema = z.object({
type: z.literal("task_completed"),
taskId: z.string(),
agentId: z.string(),
timestamp: z.number(),
})
export type TaskCompleted = z.infer<typeof TaskCompletedSchema>
export const IdleNotificationSchema = z.object({
type: z.literal("idle_notification"),
})
export type IdleNotification = z.infer<typeof IdleNotificationSchema>
export const JoinRequestSchema = z.object({
type: z.literal("join_request"),
agentName: z.string(),
sessionId: z.string(),
})
export type JoinRequest = z.infer<typeof JoinRequestSchema>
export const JoinApprovedSchema = z.object({
type: z.literal("join_approved"),
agentName: z.string(),
teamName: z.string(),
})
export type JoinApproved = z.infer<typeof JoinApprovedSchema>
export const JoinRejectedSchema = z.object({
type: z.literal("join_rejected"),
reason: z.string().optional(),
})
export type JoinRejected = z.infer<typeof JoinRejectedSchema>
export const PlanApprovalRequestSchema = z.object({
type: z.literal("plan_approval_request"),
requestId: z.string(),
plan: z.string(),
agentId: z.string(),
})
export type PlanApprovalRequest = z.infer<typeof PlanApprovalRequestSchema>
export const PlanApprovalResponseSchema = z.object({
type: z.literal("plan_approval_response"),
requestId: z.string(),
decision: z.enum(["approved", "rejected"]),
feedback: z.string().optional(),
})
export type PlanApprovalResponse = z.infer<typeof PlanApprovalResponseSchema>
export const ModeSetRequestSchema = z.object({
type: z.literal("mode_set_request"),
mode: z.enum(["acceptEdits", "bypassPermissions", "default", "delegate", "dontAsk", "plan"]),
})
export type ModeSetRequest = z.infer<typeof ModeSetRequestSchema>
export const TeamPermissionUpdateSchema = z.object({
type: z.literal("team_permission_update"),
permissions: z.record(z.string(), z.unknown()),
})
export type TeamPermissionUpdate = z.infer<typeof TeamPermissionUpdateSchema>
export const ProtocolMessageSchema = z.discriminatedUnion("type", [
PermissionRequestSchema,
PermissionResponseSchema,
ShutdownRequestSchema,
ShutdownApprovedSchema,
ShutdownRejectedSchema,
TaskAssignmentSchema,
TaskCompletedSchema,
IdleNotificationSchema,
JoinRequestSchema,
JoinApprovedSchema,
JoinRejectedSchema,
PlanApprovalRequestSchema,
PlanApprovalResponseSchema,
ModeSetRequestSchema,
TeamPermissionUpdateSchema,
])
export type ProtocolMessage = z.infer<typeof ProtocolMessageSchema>
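
The mailbox test fixture uses `text: '{"type":"idle_notification"}'`, which suggests (an assumption, not confirmed by this diff) that a mailbox message's `text` field carries a JSON-encoded protocol message. The sketch below shows the decode-then-narrow idea on the `type` discriminant without any dependency. It hand-writes a three-variant subset of the union and a hypothetical `decodeProtocolMessage` helper; real code would more likely call `ProtocolMessageSchema.safeParse` on the parsed JSON.

```typescript
// Hand-written subset of the protocol union, for illustration only.
type ProtocolMessageSubset =
  | { type: "idle_notification" }
  | { type: "shutdown_rejected"; reason?: string }
  | { type: "join_request"; agentName: string; sessionId: string }

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value)
}

// Returns null for malformed JSON or unknown shapes instead of throwing.
function decodeProtocolMessage(text: string): ProtocolMessageSubset | null {
  let parsed: unknown
  try {
    parsed = JSON.parse(text)
  } catch {
    return null
  }
  if (!isRecord(parsed)) return null

  switch (parsed["type"]) {
    case "idle_notification":
      return { type: "idle_notification" }
    case "shutdown_rejected": {
      const reason = parsed["reason"]
      if (reason === undefined) return { type: "shutdown_rejected" }
      return typeof reason === "string" ? { type: "shutdown_rejected", reason } : null
    }
    case "join_request": {
      const agentName = parsed["agentName"]
      const sessionId = parsed["sessionId"]
      if (typeof agentName !== "string" || typeof sessionId !== "string") return null
      return { type: "join_request", agentName, sessionId }
    }
    default:
      return null
  }
}

const idle = decodeProtocolMessage('{"type":"idle_notification"}')
const join = decodeProtocolMessage('{"type":"join_request","agentName":"new-agent","sessionId":"sess-123"}')
const unknownShape = decodeProtocolMessage("{}")
const malformed = decodeProtocolMessage("not valid json")
console.log({ idle, join, unknownShape, malformed })
```

The discriminated union gives the same benefit either way: once `type` is known, the compiler narrows the remaining fields, so callers can switch on `type` without casts.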


@@ -0,0 +1,178 @@
import { describe, it, expect, beforeEach, afterEach } from "bun:test"
import { join } from "path"
import { mkdirSync, rmSync, existsSync, writeFileSync, readFileSync } from "fs"
import { z } from "zod"
import {
getTaskDir,
getTaskPath,
getTeamDir,
getInboxPath,
ensureDir,
readJsonSafe,
writeJsonAtomic,
} from "./storage"
const TEST_DIR = join(import.meta.dirname, ".test-storage")
describe("Storage Utilities", () => {
beforeEach(() => {
rmSync(TEST_DIR, { recursive: true, force: true })
mkdirSync(TEST_DIR, { recursive: true })
})
afterEach(() => {
rmSync(TEST_DIR, { recursive: true, force: true })
})
describe("getTaskDir", () => {
//#given default config (no claude_code_compat)
//#when getting task directory
//#then it should return .sisyphus/tasks/{listId}
it("returns sisyphus path by default", () => {
const config = { sisyphus: { tasks: { storage_path: ".sisyphus/tasks" } } }
const result = getTaskDir("list-123", config as any)
expect(result).toContain(".sisyphus/tasks/list-123")
})
//#given claude_code_compat enabled
//#when getting task directory
//#then it should return Claude Code path
it("returns claude code path when compat enabled", () => {
const config = {
sisyphus: {
tasks: {
storage_path: ".sisyphus/tasks",
claude_code_compat: true,
},
},
}
const result = getTaskDir("list-123", config as any)
expect(result).toContain(".cache/claude-code/tasks/list-123")
})
})
describe("getTaskPath", () => {
//#given list and task IDs
//#when getting task path
//#then it should return path to task JSON file
it("returns path to task JSON", () => {
const config = { sisyphus: { tasks: { storage_path: ".sisyphus/tasks" } } }
const result = getTaskPath("list-123", "1", config as any)
expect(result).toContain("list-123/1.json")
})
})
describe("getTeamDir", () => {
//#given team name and default config
//#when getting team directory
//#then it should return .sisyphus/teams/{teamName}
it("returns sisyphus team path", () => {
const config = { sisyphus: { swarm: { storage_path: ".sisyphus/teams" } } }
const result = getTeamDir("my-team", config as any)
expect(result).toContain(".sisyphus/teams/my-team")
})
})
describe("getInboxPath", () => {
//#given team and agent names
//#when getting inbox path
//#then it should return path to inbox JSON file
it("returns path to inbox JSON", () => {
const config = { sisyphus: { swarm: { storage_path: ".sisyphus/teams" } } }
const result = getInboxPath("my-team", "agent-001", config as any)
expect(result).toContain("my-team/inboxes/agent-001.json")
})
})
describe("ensureDir", () => {
//#given a non-existent directory path
//#when calling ensureDir
//#then it should create the directory
it("creates directory if not exists", () => {
const dirPath = join(TEST_DIR, "new-dir", "nested")
ensureDir(dirPath)
expect(existsSync(dirPath)).toBe(true)
})
//#given an existing directory
//#when calling ensureDir
//#then it should not throw
it("does not throw for existing directory", () => {
const dirPath = join(TEST_DIR, "existing")
mkdirSync(dirPath, { recursive: true })
expect(() => ensureDir(dirPath)).not.toThrow()
})
})
describe("readJsonSafe", () => {
//#given a valid JSON file matching schema
//#when reading with readJsonSafe
//#then it should return parsed object
it("reads and parses valid JSON", () => {
const testSchema = z.object({ name: z.string(), value: z.number() })
const filePath = join(TEST_DIR, "test.json")
writeFileSync(filePath, JSON.stringify({ name: "test", value: 42 }))
const result = readJsonSafe(filePath, testSchema)
expect(result).toEqual({ name: "test", value: 42 })
})
//#given a non-existent file
//#when reading with readJsonSafe
//#then it should return null
it("returns null for non-existent file", () => {
const testSchema = z.object({ name: z.string() })
const result = readJsonSafe(join(TEST_DIR, "missing.json"), testSchema)
expect(result).toBeNull()
})
//#given invalid JSON content
//#when reading with readJsonSafe
//#then it should return null
it("returns null for invalid JSON", () => {
const testSchema = z.object({ name: z.string() })
const filePath = join(TEST_DIR, "invalid.json")
writeFileSync(filePath, "not valid json")
const result = readJsonSafe(filePath, testSchema)
expect(result).toBeNull()
})
//#given JSON that doesn't match schema
//#when reading with readJsonSafe
//#then it should return null
it("returns null for schema mismatch", () => {
const testSchema = z.object({ name: z.string(), required: z.number() })
const filePath = join(TEST_DIR, "mismatch.json")
writeFileSync(filePath, JSON.stringify({ name: "test" }))
const result = readJsonSafe(filePath, testSchema)
expect(result).toBeNull()
})
})
describe("writeJsonAtomic", () => {
//#given data to write
//#when calling writeJsonAtomic
//#then it should write to file atomically
it("writes JSON atomically", () => {
const filePath = join(TEST_DIR, "atomic.json")
const data = { key: "value", number: 123 }
writeJsonAtomic(filePath, data)
const content = readFileSync(filePath, "utf-8")
expect(JSON.parse(content)).toEqual(data)
})
//#given a deeply nested path
//#when calling writeJsonAtomic
//#then it should create parent directories
it("creates parent directories", () => {
const filePath = join(TEST_DIR, "deep", "nested", "file.json")
writeJsonAtomic(filePath, { test: true })
expect(existsSync(filePath)).toBe(true)
})
})
})


@@ -0,0 +1,82 @@
import { join, dirname } from "path"
import { existsSync, mkdirSync, readFileSync, writeFileSync, renameSync, unlinkSync } from "fs"
import { homedir } from "os"
import type { z } from "zod"
import type { OhMyOpenCodeConfig } from "../../config/schema"
export function getTaskDir(listId: string, config: Partial<OhMyOpenCodeConfig>): string {
const tasksConfig = config.sisyphus?.tasks
if (tasksConfig?.claude_code_compat) {
return join(homedir(), ".cache", "claude-code", "tasks", listId)
}
const storagePath = tasksConfig?.storage_path ?? ".sisyphus/tasks"
return join(process.cwd(), storagePath, listId)
}
export function getTaskPath(listId: string, taskId: string, config: Partial<OhMyOpenCodeConfig>): string {
return join(getTaskDir(listId, config), `${taskId}.json`)
}
export function getTeamDir(teamName: string, config: Partial<OhMyOpenCodeConfig>): string {
const swarmConfig = config.sisyphus?.swarm
// Any storage_path containing "claude" switches to the shared ~/.claude/teams location
if (swarmConfig?.storage_path?.includes("claude")) {
return join(homedir(), ".claude", "teams", teamName)
}
const storagePath = swarmConfig?.storage_path ?? ".sisyphus/teams"
return join(process.cwd(), storagePath, teamName)
}
export function getInboxPath(teamName: string, agentName: string, config: Partial<OhMyOpenCodeConfig>): string {
return join(getTeamDir(teamName, config), "inboxes", `${agentName}.json`)
}
export function ensureDir(dirPath: string): void {
if (!existsSync(dirPath)) {
mkdirSync(dirPath, { recursive: true })
}
}
export function readJsonSafe<T>(filePath: string, schema: z.ZodType<T>): T | null {
try {
if (!existsSync(filePath)) {
return null
}
const content = readFileSync(filePath, "utf-8")
const parsed = JSON.parse(content)
const result = schema.safeParse(parsed)
if (!result.success) {
return null
}
return result.data
} catch {
return null
}
}
export function writeJsonAtomic(filePath: string, data: unknown): void {
const dir = dirname(filePath)
ensureDir(dir)
// Write to a sibling temp file first, then rename it over the target
const tempPath = `${filePath}.tmp.${Date.now()}`
try {
writeFileSync(tempPath, JSON.stringify(data, null, 2), "utf-8")
renameSync(tempPath, filePath)
} catch (error) {
try {
if (existsSync(tempPath)) {
unlinkSync(tempPath)
}
} catch {
// Ignore cleanup errors
}
throw error
}
}
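
The write-to-temp-then-rename pattern used by `writeJsonAtomic` can be exercised on its own. The sketch below is a variation, not the code in this diff: `writeJsonAtomicSketch` is a hypothetical name, and it swaps the `Date.now()` suffix for `randomUUID()` as one possible way to avoid two writers picking the same temp name within a millisecond. It assumes the temp file and the target live in the same directory, so the rename stays on one filesystem.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs"
import { randomUUID } from "node:crypto"
import { tmpdir } from "node:os"
import { dirname, join } from "node:path"

// Hypothetical variant of the helper above, using a random temp suffix.
function writeJsonAtomicSketch(filePath: string, data: unknown): void {
  mkdirSync(dirname(filePath), { recursive: true })
  const tempPath = `${filePath}.tmp.${randomUUID()}`
  try {
    // Readers never see a half-written target: the content is complete
    // in the temp file before the rename replaces the target.
    writeFileSync(tempPath, JSON.stringify(data, null, 2), "utf-8")
    renameSync(tempPath, filePath)
  } catch (error) {
    // Best-effort cleanup; force: true ignores a missing temp file.
    rmSync(tempPath, { force: true })
    throw error
  }
}

// Demo in a throwaway directory under the OS temp dir.
const demoDir = mkdtempSync(join(tmpdir(), "atomic-demo-"))
const demoFile = join(demoDir, "nested", "state.json")
writeJsonAtomicSketch(demoFile, { key: "value" })
const roundTripped: unknown = JSON.parse(readFileSync(demoFile, "utf-8"))
console.log(roundTripped)
rmSync(demoDir, { recursive: true, force: true })
```

Paired with a reader like `readJsonSafe`, which returns `null` on a missing or invalid file, this keeps concurrent readers from ever parsing a partially written file.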


@@ -0,0 +1,82 @@
import { describe, it, expect } from "bun:test"
import { TaskSchema, TaskStatusSchema, type Task } from "./types"
describe("TaskSchema", () => {
//#given a valid task object
//#when parsing with TaskSchema
//#then it should succeed
it("parses valid task object", () => {
const validTask = {
id: "1",
subject: "Fix authentication bug",
description: "Users report 401 errors",
status: "pending",
blocks: [],
blockedBy: [],
}
const result = TaskSchema.safeParse(validTask)
expect(result.success).toBe(true)
})
//#given a task with all optional fields
//#when parsing with TaskSchema
//#then it should succeed
it("parses task with optional fields", () => {
const taskWithOptionals = {
id: "2",
subject: "Add unit tests",
description: "Write tests for auth module",
activeForm: "Adding unit tests",
owner: "agent-001",
status: "in_progress",
blocks: ["3"],
blockedBy: ["1"],
metadata: { priority: "high", labels: ["bug"] },
}
const result = TaskSchema.safeParse(taskWithOptionals)
expect(result.success).toBe(true)
})
//#given an invalid status value
//#when parsing with TaskSchema
//#then it should fail
it("rejects invalid status", () => {
const invalidTask = {
id: "1",
subject: "Test",
description: "Test",
status: "invalid_status",
blocks: [],
blockedBy: [],
}
const result = TaskSchema.safeParse(invalidTask)
expect(result.success).toBe(false)
})
//#given missing required fields
//#when parsing with TaskSchema
//#then it should fail
it("rejects missing required fields", () => {
const invalidTask = {
id: "1",
// missing subject, description, status, blocks, blockedBy
}
const result = TaskSchema.safeParse(invalidTask)
expect(result.success).toBe(false)
})
})
describe("TaskStatusSchema", () => {
//#given valid status values
//#when parsing
//#then all should succeed
it("accepts valid statuses", () => {
expect(TaskStatusSchema.safeParse("pending").success).toBe(true)
expect(TaskStatusSchema.safeParse("in_progress").success).toBe(true)
expect(TaskStatusSchema.safeParse("completed").success).toBe(true)
})
})

@@ -0,0 +1,41 @@
import { z } from "zod"
export const TaskStatusSchema = z.enum(["pending", "in_progress", "completed"])
export type TaskStatus = z.infer<typeof TaskStatusSchema>
export const TaskSchema = z.object({
id: z.string(),
subject: z.string(),
description: z.string(),
activeForm: z.string().optional(),
owner: z.string().optional(),
status: TaskStatusSchema,
blocks: z.array(z.string()),
blockedBy: z.array(z.string()),
metadata: z.record(z.string(), z.unknown()).optional(),
})
export type Task = z.infer<typeof TaskSchema>
export const TaskCreateInputSchema = z.object({
subject: z.string().describe("Task title"),
description: z.string().describe("Detailed description"),
activeForm: z.string().optional().describe("Text shown when in progress"),
metadata: z.record(z.string(), z.unknown()).optional(),
})
export type TaskCreateInput = z.infer<typeof TaskCreateInputSchema>
export const TaskUpdateInputSchema = z.object({
taskId: z.string().describe("Task ID to update"),
subject: z.string().optional(),
description: z.string().optional(),
activeForm: z.string().optional(),
status: z.enum(["pending", "in_progress", "completed", "deleted"]).optional(),
addBlocks: z.array(z.string()).optional().describe("Task IDs this task will block"),
addBlockedBy: z.array(z.string()).optional().describe("Task IDs that block this task"),
owner: z.string().optional(),
metadata: z.record(z.string(), z.unknown()).optional(),
})
export type TaskUpdateInput = z.infer<typeof TaskUpdateInputSchema>
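Note that `TaskUpdateInputSchema` accepts a `"deleted"` status that `TaskStatusSchema` does not, and exposes `addBlocks` / `addBlockedBy` rather than the full arrays. How an update is applied is not shown in this diff; the sketch below is one plausible interpretation. `applyTaskUpdate` is a hypothetical helper, and the interfaces are hand-written mirrors of the zod-inferred types so the example runs without zod.

```typescript
// Hand-written mirrors of the zod-inferred types (assumption: kept in sync manually).
type TaskStatus = "pending" | "in_progress" | "completed"

interface Task {
  id: string
  subject: string
  description: string
  activeForm?: string
  owner?: string
  status: TaskStatus
  blocks: string[]
  blockedBy: string[]
  metadata?: Record<string, unknown>
}

interface TaskUpdateInput {
  taskId: string
  subject?: string
  description?: string
  activeForm?: string
  status?: TaskStatus | "deleted"
  addBlocks?: string[]
  addBlockedBy?: string[]
  owner?: string
  metadata?: Record<string, unknown>
}

// Append IDs that are not already present, preserving order.
function mergeIds(current: string[], added: string[] = []): string[] {
  const merged = current.slice()
  for (const id of added) {
    if (merged.indexOf(id) === -1) merged.push(id)
  }
  return merged
}

// Hypothetical helper: returns the updated task, or null when the update deletes it.
function applyTaskUpdate(task: Task, update: TaskUpdateInput): Task | null {
  if (update.status === "deleted") return null
  return {
    ...task,
    subject: update.subject ?? task.subject,
    description: update.description ?? task.description,
    activeForm: update.activeForm ?? task.activeForm,
    owner: update.owner ?? task.owner,
    status: update.status ?? task.status,
    blocks: mergeIds(task.blocks, update.addBlocks),
    blockedBy: mergeIds(task.blockedBy, update.addBlockedBy),
    metadata: update.metadata ? { ...task.metadata, ...update.metadata } : task.metadata,
  }
}

const baseTask: Task = {
  id: "1",
  subject: "Fix authentication bug",
  description: "Users report 401 errors",
  status: "pending",
  blocks: [],
  blockedBy: [],
}
const started = applyTaskUpdate(baseTask, { taskId: "1", status: "in_progress", addBlocks: ["3", "3"] })
const removed = applyTaskUpdate(baseTask, { taskId: "1", status: "deleted" })
```

Treating `"deleted"` as an instruction rather than a stored state keeps the persisted `Task` within the three values of `TaskStatusSchema`; whether the real tool does this is an assumption.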

@@ -0,0 +1,97 @@
import type { TmuxConfig } from "../../config/schema"
import type { PaneAction, WindowState } from "./types"
import { spawnTmuxPane, closeTmuxPane, enforceMainPaneWidth, replaceTmuxPane } from "../../shared/tmux"
import { log } from "../../shared"
export interface ActionResult {
success: boolean
paneId?: string
error?: string
}
export interface ExecuteActionsResult {
success: boolean
spawnedPaneId?: string
results: Array<{ action: PaneAction; result: ActionResult }>
}
export interface ExecuteContext {
config: TmuxConfig
serverUrl: string
windowState: WindowState
}
async function enforceMainPane(windowState: WindowState): Promise<void> {
if (!windowState.mainPane) return
await enforceMainPaneWidth(windowState.mainPane.paneId, windowState.windowWidth)
}
export async function executeAction(
action: PaneAction,
ctx: ExecuteContext
): Promise<ActionResult> {
if (action.type === "close") {
const success = await closeTmuxPane(action.paneId)
if (success) {
await enforceMainPane(ctx.windowState)
}
return { success }
}
if (action.type === "replace") {
const result = await replaceTmuxPane(
action.paneId,
action.newSessionId,
action.description,
ctx.config,
ctx.serverUrl
)
return {
success: result.success,
paneId: result.paneId,
}
}
const result = await spawnTmuxPane(
action.sessionId,
action.description,
ctx.config,
ctx.serverUrl,
action.targetPaneId,
action.splitDirection
)
if (result.success) {
await enforceMainPane(ctx.windowState)
}
return {
success: result.success,
paneId: result.paneId,
}
}
export async function executeActions(
actions: PaneAction[],
ctx: ExecuteContext
): Promise<ExecuteActionsResult> {
const results: Array<{ action: PaneAction; result: ActionResult }> = []
let spawnedPaneId: string | undefined
for (const action of actions) {
log("[action-executor] executing", { type: action.type })
const result = await executeAction(action, ctx)
results.push({ action, result })
if (!result.success) {
log("[action-executor] action failed", { type: action.type, error: result.error })
return { success: false, results }
}
if ((action.type === "spawn" || action.type === "replace") && result.paneId) {
spawnedPaneId = result.paneId
}
}
return { success: true, spawnedPaneId, results }
}
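`executeActions` runs the actions strictly in order and stops at the first failure, so a failed `close` prevents the following `spawn`. The control flow can be sketched without tmux as below. This is a simplified, synchronous illustration: `runActionsSketch`, `stubRun`, the `Demo*` types, and the pane IDs are hypothetical, and the real function is async and calls into `../../shared/tmux`.

```typescript
type DemoAction =
  | { type: "close"; paneId: string }
  | { type: "spawn"; sessionId: string }
  | { type: "replace"; paneId: string; newSessionId: string }

interface DemoResult {
  success: boolean
  paneId?: string
}

interface DemoBatchResult {
  success: boolean
  spawnedPaneId?: string
  results: Array<{ action: DemoAction; result: DemoResult }>
}

// Same loop shape as executeActions: record each result, bail out on the first failure,
// and remember the pane ID produced by the last successful spawn or replace.
function runActionsSketch(
  actions: DemoAction[],
  run: (action: DemoAction) => DemoResult
): DemoBatchResult {
  const results: DemoBatchResult["results"] = []
  let spawnedPaneId: string | undefined
  for (const action of actions) {
    const result = run(action)
    results.push({ action, result })
    if (!result.success) {
      return { success: false, results }
    }
    if ((action.type === "spawn" || action.type === "replace") && result.paneId) {
      spawnedPaneId = result.paneId
    }
  }
  return { success: true, spawnedPaneId, results }
}

// Stub executor: closing pane "%9" fails, everything else succeeds with pane "%2".
const stubRun = (action: DemoAction): DemoResult => {
  if (action.type === "close") return { success: action.paneId !== "%9" }
  return { success: true, paneId: "%2" }
}

const okBatch = runActionsSketch(
  [{ type: "close", paneId: "%1" }, { type: "spawn", sessionId: "ses_new" }],
  stubRun
)
const failedBatch = runActionsSketch(
  [{ type: "close", paneId: "%9" }, { type: "spawn", sessionId: "ses_new" }],
  stubRun
)
```

In the failing batch only the `close` is recorded and no `spawnedPaneId` is reported, which mirrors the early `return { success: false, results }` in the diff.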

@@ -0,0 +1,354 @@
import { describe, it, expect } from "bun:test"
import {
decideSpawnActions,
calculateCapacity,
canSplitPane,
canSplitPaneAnyDirection,
getBestSplitDirection,
type SessionMapping
} from "./decision-engine"
import type { WindowState, CapacityConfig, TmuxPaneInfo } from "./types"
import { MIN_PANE_WIDTH, MIN_PANE_HEIGHT } from "./types"
const MIN_SPLIT_WIDTH = 2 * MIN_PANE_WIDTH + 1
const MIN_SPLIT_HEIGHT = 2 * MIN_PANE_HEIGHT + 1
describe("canSplitPane", () => {
const createPane = (width: number, height: number): TmuxPaneInfo => ({
paneId: "%1",
width,
height,
left: 100,
top: 0,
title: "test",
isActive: false,
})
it("returns true for horizontal split when width >= 2*MIN+1", () => {
//#given - pane with exactly minimum splittable width (MIN_SPLIT_WIDTH = 2*MIN_PANE_WIDTH+1)
const pane = createPane(MIN_SPLIT_WIDTH, 20)
//#when
const result = canSplitPane(pane, "-h")
//#then
expect(result).toBe(true)
})
it("returns false for horizontal split when width < 2*MIN+1", () => {
//#given - pane just below minimum splittable width
const pane = createPane(MIN_SPLIT_WIDTH - 1, 20)
//#when
const result = canSplitPane(pane, "-h")
//#then
expect(result).toBe(false)
})
it("returns true for vertical split when height >= 2*MIN+1", () => {
//#given - pane with exactly minimum splittable height (23)
const pane = createPane(50, MIN_SPLIT_HEIGHT)
//#when
const result = canSplitPane(pane, "-v")
//#then
expect(result).toBe(true)
})
it("returns false for vertical split when height < 2*MIN+1", () => {
//#given - pane just below minimum splittable height
const pane = createPane(50, MIN_SPLIT_HEIGHT - 1)
//#when
const result = canSplitPane(pane, "-v")
//#then
expect(result).toBe(false)
})
})
describe("canSplitPaneAnyDirection", () => {
const createPane = (width: number, height: number): TmuxPaneInfo => ({
paneId: "%1",
width,
height,
left: 100,
top: 0,
title: "test",
isActive: false,
})
it("returns true when can split horizontally but not vertically", () => {
//#given
const pane = createPane(MIN_SPLIT_WIDTH, MIN_SPLIT_HEIGHT - 1)
//#when
const result = canSplitPaneAnyDirection(pane)
//#then
expect(result).toBe(true)
})
it("returns true when can split vertically but not horizontally", () => {
//#given
const pane = createPane(MIN_SPLIT_WIDTH - 1, MIN_SPLIT_HEIGHT)
//#when
const result = canSplitPaneAnyDirection(pane)
//#then
expect(result).toBe(true)
})
it("returns false when cannot split in any direction", () => {
//#given - pane too small in both dimensions
const pane = createPane(MIN_SPLIT_WIDTH - 1, MIN_SPLIT_HEIGHT - 1)
//#when
const result = canSplitPaneAnyDirection(pane)
//#then
expect(result).toBe(false)
})
})
describe("getBestSplitDirection", () => {
const createPane = (width: number, height: number): TmuxPaneInfo => ({
paneId: "%1",
width,
height,
left: 100,
top: 0,
title: "test",
isActive: false,
})
it("returns -h when only horizontal split possible", () => {
//#given
const pane = createPane(MIN_SPLIT_WIDTH, MIN_SPLIT_HEIGHT - 1)
//#when
const result = getBestSplitDirection(pane)
//#then
expect(result).toBe("-h")
})
it("returns -v when only vertical split possible", () => {
//#given
const pane = createPane(MIN_SPLIT_WIDTH - 1, MIN_SPLIT_HEIGHT)
//#when
const result = getBestSplitDirection(pane)
//#then
expect(result).toBe("-v")
})
it("returns null when no split possible", () => {
//#given
const pane = createPane(MIN_SPLIT_WIDTH - 1, MIN_SPLIT_HEIGHT - 1)
//#when
const result = getBestSplitDirection(pane)
//#then
expect(result).toBe(null)
})
it("returns -h when width >= height and both splits possible", () => {
//#given - wider than tall
const pane = createPane(MIN_SPLIT_WIDTH + 10, MIN_SPLIT_HEIGHT)
//#when
const result = getBestSplitDirection(pane)
//#then
expect(result).toBe("-h")
})
it("returns -v when height > width and both splits possible", () => {
//#given - taller than wide (height needs to be > width for -v)
const pane = createPane(MIN_SPLIT_WIDTH, MIN_SPLIT_WIDTH + 10)
//#when
const result = getBestSplitDirection(pane)
//#then
expect(result).toBe("-v")
})
})
describe("decideSpawnActions", () => {
const defaultConfig: CapacityConfig = {
mainPaneMinWidth: 120,
agentPaneWidth: 40,
}
const createWindowState = (
windowWidth: number,
windowHeight: number,
agentPanes: Array<{ paneId: string; width: number; height: number; left: number; top: number }> = []
): WindowState => ({
windowWidth,
windowHeight,
mainPane: { paneId: "%0", width: Math.floor(windowWidth / 2), height: windowHeight, left: 0, top: 0, title: "main", isActive: true },
agentPanes: agentPanes.map((p, i) => ({
...p,
title: `agent-${i}`,
isActive: false,
})),
})
describe("minimum size enforcement", () => {
it("returns canSpawn=false when window too small", () => {
//#given - window smaller than minimum pane size
const state = createWindowState(50, 5)
//#when
const result = decideSpawnActions(state, "ses1", "test", defaultConfig, [])
//#then
expect(result.canSpawn).toBe(false)
expect(result.reason).toContain("too small")
})
it("returns canSpawn=true when main pane can be split", () => {
//#given - no agent panes yet; window width 220 >= MIN_SPLIT_WIDTH (2*MIN_PANE_WIDTH+1)
const state = createWindowState(220, 44)
//#when
const result = decideSpawnActions(state, "ses1", "test", defaultConfig, [])
//#then
expect(result.canSpawn).toBe(true)
expect(result.actions.length).toBe(1)
expect(result.actions[0].type).toBe("spawn")
})
it("closes oldest pane when existing panes are too small to split", () => {
//#given - existing pane is below minimum splittable size
const state = createWindowState(220, 30, [
{ paneId: "%1", width: 50, height: 15, left: 110, top: 0 },
])
const mappings: SessionMapping[] = [
{ sessionId: "old-ses", paneId: "%1", createdAt: new Date("2024-01-01") },
]
//#when
const result = decideSpawnActions(state, "ses1", "test", defaultConfig, mappings)
//#then
expect(result.canSpawn).toBe(true)
expect(result.actions.length).toBe(2)
expect(result.actions[0].type).toBe("close")
expect(result.actions[1].type).toBe("spawn")
})
it("can spawn when existing pane is large enough to split", () => {
//#given - existing pane is above minimum splittable size
const state = createWindowState(320, 50, [
{ paneId: "%1", width: MIN_SPLIT_WIDTH + 10, height: MIN_SPLIT_HEIGHT + 10, left: 160, top: 0 },
])
//#when
const result = decideSpawnActions(state, "ses1", "test", defaultConfig, [])
//#then
expect(result.canSpawn).toBe(true)
expect(result.actions.length).toBe(1)
expect(result.actions[0].type).toBe("spawn")
})
})
describe("basic spawn decisions", () => {
it("returns canSpawn=true when capacity allows new pane", () => {
//#given - 220x44 window with no agent panes; window width 220 >= MIN_SPLIT_WIDTH
const state = createWindowState(220, 44)
//#when
const result = decideSpawnActions(state, "ses1", "test", defaultConfig, [])
//#then
expect(result.canSpawn).toBe(true)
expect(result.actions.length).toBe(1)
expect(result.actions[0].type).toBe("spawn")
})
it("spawns with splitDirection", () => {
//#given
const state = createWindowState(212, 44, [
{ paneId: "%1", width: MIN_SPLIT_WIDTH, height: MIN_SPLIT_HEIGHT, left: 106, top: 0 },
])
//#when
const result = decideSpawnActions(state, "ses1", "test", defaultConfig, [])
//#then
expect(result.canSpawn).toBe(true)
expect(result.actions[0].type).toBe("spawn")
if (result.actions[0].type === "spawn") {
expect(result.actions[0].sessionId).toBe("ses1")
expect(result.actions[0].splitDirection).toBeDefined()
}
})
it("returns canSpawn=false when no main pane", () => {
//#given
const state: WindowState = { windowWidth: 212, windowHeight: 44, mainPane: null, agentPanes: [] }
//#when
const result = decideSpawnActions(state, "ses1", "test", defaultConfig, [])
//#then
expect(result.canSpawn).toBe(false)
expect(result.reason).toBe("no main pane found")
})
})
})
describe("calculateCapacity", () => {
it("calculates 2D grid capacity (cols x rows)", () => {
//#given - 212x44 window (user's actual screen)
//#when
const capacity = calculateCapacity(212, 44)
//#then - availableWidth=106, cols=(106+1)/(52+1)=2, rows=(44+1)/(11+1)=3 (accounting for dividers)
expect(capacity.cols).toBe(2)
expect(capacity.rows).toBe(3)
expect(capacity.total).toBe(6)
})
it("returns 0 cols when agent area too narrow", () => {
//#given - window too narrow for even 1 agent pane
//#when
const capacity = calculateCapacity(100, 44)
//#then - availableWidth=50, cols=floor((50+1)/(52+1))=0
expect(capacity.cols).toBe(0)
expect(capacity.total).toBe(0)
})
it("returns 0 rows when window too short", () => {
//#given - window too short
//#when
const capacity = calculateCapacity(212, 10)
//#then - rows=floor((10+1)/(11+1))=0
expect(capacity.rows).toBe(0)
expect(capacity.total).toBe(0)
})
it("scales with larger screens but caps at MAX_GRID_SIZE=4", () => {
//#given - larger 4K-like screen (400x100)
//#when
const capacity = calculateCapacity(400, 100)
//#then - cols=floor((200+1)/(52+1))=3 (under the cap), rows=floor((100+1)/(11+1))=8 capped at 4 (MAX_GRID_SIZE)
expect(capacity.cols).toBe(3)
expect(capacity.rows).toBe(4)
expect(capacity.total).toBe(12)
})
})
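The arithmetic in the `calculateCapacity` test comments can be reproduced with a standalone sketch. `./types` is not part of this excerpt, so the minimum pane size here is an assumption: 52 columns by 11 rows, inferred from the `(52+1)` and `(11+1)` divisors in the comments. `capacitySketch` and the constant names are hypothetical.

```typescript
// Assumed values, inferred from the test comments rather than read from ./types.
const ASSUMED_MIN_PANE_WIDTH = 52
const ASSUMED_MIN_PANE_HEIGHT = 11
const DIVIDER = 1
const MAX_GRID = 4
const MAIN_RATIO = 0.5

// N panes side by side need N minimum sizes plus N-1 dividers,
// which is where the (size + DIVIDER) / (min + DIVIDER) form comes from.
function capacitySketch(windowWidth: number, windowHeight: number) {
  const availableWidth = Math.floor(windowWidth * (1 - MAIN_RATIO))
  const cols = Math.min(
    MAX_GRID,
    Math.max(0, Math.floor((availableWidth + DIVIDER) / (ASSUMED_MIN_PANE_WIDTH + DIVIDER)))
  )
  const rows = Math.min(
    MAX_GRID,
    Math.max(0, Math.floor((windowHeight + DIVIDER) / (ASSUMED_MIN_PANE_HEIGHT + DIVIDER)))
  )
  return { availableWidth, cols, rows, total: cols * rows }
}

const laptop = capacitySketch(212, 44) // floor(107/53)=2 cols, floor(45/12)=3 rows
const narrow = capacitySketch(100, 44) // floor(51/53)=0 cols
const short = capacitySketch(212, 10) // floor(11/12)=0 rows
const large = capacitySketch(400, 100) // floor(201/53)=3 cols, floor(101/12)=8 rows capped to 4
```

Under these assumed minimums only the row count hits the cap for the 400x100 case; the column count stays at 3.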

@@ -0,0 +1,386 @@
import type { WindowState, PaneAction, SpawnDecision, CapacityConfig, TmuxPaneInfo, SplitDirection } from "./types"
import { MIN_PANE_WIDTH, MIN_PANE_HEIGHT } from "./types"
export interface SessionMapping {
sessionId: string
paneId: string
createdAt: Date
}
export interface GridCapacity {
cols: number
rows: number
total: number
}
export interface GridSlot {
row: number
col: number
}
export interface GridPlan {
cols: number
rows: number
slotWidth: number
slotHeight: number
}
export interface SpawnTarget {
targetPaneId: string
splitDirection: SplitDirection
}
const MAIN_PANE_RATIO = 0.5
const MAX_COLS = 2
const MAX_ROWS = 3
const MAX_GRID_SIZE = 4
const DIVIDER_SIZE = 1
const MIN_SPLIT_WIDTH = 2 * MIN_PANE_WIDTH + DIVIDER_SIZE
const MIN_SPLIT_HEIGHT = 2 * MIN_PANE_HEIGHT + DIVIDER_SIZE
export function getColumnCount(paneCount: number): number {
if (paneCount <= 0) return 1
return Math.min(MAX_COLS, Math.max(1, Math.ceil(paneCount / MAX_ROWS)))
}
export function getColumnWidth(agentAreaWidth: number, paneCount: number): number {
const cols = getColumnCount(paneCount)
const dividersWidth = (cols - 1) * DIVIDER_SIZE
return Math.floor((agentAreaWidth - dividersWidth) / cols)
}
export function isSplittableAtCount(agentAreaWidth: number, paneCount: number): boolean {
const columnWidth = getColumnWidth(agentAreaWidth, paneCount)
return columnWidth >= MIN_SPLIT_WIDTH
}
export function findMinimalEvictions(agentAreaWidth: number, currentCount: number): number | null {
for (let k = 1; k <= currentCount; k++) {
if (isSplittableAtCount(agentAreaWidth, currentCount - k)) {
return k
}
}
return null
}
export function canSplitPane(pane: TmuxPaneInfo, direction: SplitDirection): boolean {
if (direction === "-h") {
return pane.width >= MIN_SPLIT_WIDTH
}
return pane.height >= MIN_SPLIT_HEIGHT
}
export function canSplitPaneAnyDirection(pane: TmuxPaneInfo): boolean {
return pane.width >= MIN_SPLIT_WIDTH || pane.height >= MIN_SPLIT_HEIGHT
}
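// Returns the only feasible direction when just one fits; when both fit,
// splits along the longer side (ties go to "-h"). Returns null when neither fits.
export function getBestSplitDirection(pane: TmuxPaneInfo): SplitDirection | null {
  const canH = pane.width >= MIN_SPLIT_WIDTH
  const canV = pane.height >= MIN_SPLIT_HEIGHT
  if (!canH && !canV) return null
  if (canH && !canV) return "-h"
  if (!canH && canV) return "-v"
  return pane.width >= pane.height ? "-h" : "-v"
}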
// Grid capacity of the agent area (the window share left over after MAIN_PANE_RATIO).
// N panes need N minimum sizes plus N-1 dividers, hence (size + DIVIDER_SIZE) / (min + DIVIDER_SIZE).
export function calculateCapacity(
  windowWidth: number,
  windowHeight: number
): GridCapacity {
  const availableWidth = Math.floor(windowWidth * (1 - MAIN_PANE_RATIO))
  const cols = Math.min(MAX_GRID_SIZE, Math.max(0, Math.floor((availableWidth + DIVIDER_SIZE) / (MIN_PANE_WIDTH + DIVIDER_SIZE))))
  const rows = Math.min(MAX_GRID_SIZE, Math.max(0, Math.floor((windowHeight + DIVIDER_SIZE) / (MIN_PANE_HEIGHT + DIVIDER_SIZE))))
  const total = cols * rows
  return { cols, rows, total }
}
export function computeGridPlan(
windowWidth: number,
windowHeight: number,
paneCount: number
): GridPlan {
const capacity = calculateCapacity(windowWidth, windowHeight)
const { cols: maxCols, rows: maxRows } = capacity
if (maxCols === 0 || maxRows === 0 || paneCount === 0) {
return { cols: 1, rows: 1, slotWidth: 0, slotHeight: 0 }
}
let bestCols = 1
let bestRows = 1
let bestArea = Infinity
for (let rows = 1; rows <= maxRows; rows++) {
for (let cols = 1; cols <= maxCols; cols++) {
if (cols * rows >= paneCount) {
const area = cols * rows
if (area < bestArea || (area === bestArea && rows < bestRows)) {
bestCols = cols
bestRows = rows
bestArea = area
}
}
}
}
const availableWidth = Math.floor(windowWidth * (1 - MAIN_PANE_RATIO))
const slotWidth = Math.floor(availableWidth / bestCols)
const slotHeight = Math.floor(windowHeight / bestRows)
return { cols: bestCols, rows: bestRows, slotWidth, slotHeight }
}
export function mapPaneToSlot(
pane: TmuxPaneInfo,
plan: GridPlan,
mainPaneWidth: number
): GridSlot {
const rightAreaX = mainPaneWidth
const relativeX = Math.max(0, pane.left - rightAreaX)
const relativeY = pane.top
const col = plan.slotWidth > 0
? Math.min(plan.cols - 1, Math.floor(relativeX / plan.slotWidth))
: 0
const row = plan.slotHeight > 0
? Math.min(plan.rows - 1, Math.floor(relativeY / plan.slotHeight))
: 0
return { row, col }
}
function buildOccupancy(
agentPanes: TmuxPaneInfo[],
plan: GridPlan,
mainPaneWidth: number
): Map<string, TmuxPaneInfo> {
const occupancy = new Map<string, TmuxPaneInfo>()
for (const pane of agentPanes) {
const slot = mapPaneToSlot(pane, plan, mainPaneWidth)
const key = `${slot.row}:${slot.col}`
occupancy.set(key, pane)
}
return occupancy
}
function findFirstEmptySlot(
occupancy: Map<string, TmuxPaneInfo>,
plan: GridPlan
): GridSlot {
for (let row = 0; row < plan.rows; row++) {
for (let col = 0; col < plan.cols; col++) {
const key = `${row}:${col}`
if (!occupancy.has(key)) {
return { row, col }
}
}
}
return { row: plan.rows - 1, col: plan.cols - 1 }
}
function findSplittableTarget(
state: WindowState,
preferredDirection?: SplitDirection
): SpawnTarget | null {
if (!state.mainPane) return null
const existingCount = state.agentPanes.length
if (existingCount === 0) {
const virtualMainPane: TmuxPaneInfo = {
...state.mainPane,
width: state.windowWidth,
}
if (canSplitPane(virtualMainPane, "-h")) {
return { targetPaneId: state.mainPane.paneId, splitDirection: "-h" }
}
return null
}
const plan = computeGridPlan(state.windowWidth, state.windowHeight, existingCount + 1)
const mainPaneWidth = Math.floor(state.windowWidth * MAIN_PANE_RATIO)
const occupancy = buildOccupancy(state.agentPanes, plan, mainPaneWidth)
const targetSlot = findFirstEmptySlot(occupancy, plan)
const leftKey = `${targetSlot.row}:${targetSlot.col - 1}`
const leftPane = occupancy.get(leftKey)
if (leftPane && canSplitPane(leftPane, "-h")) {
return { targetPaneId: leftPane.paneId, splitDirection: "-h" }
}
const aboveKey = `${targetSlot.row - 1}:${targetSlot.col}`
const abovePane = occupancy.get(aboveKey)
if (abovePane && canSplitPane(abovePane, "-v")) {
return { targetPaneId: abovePane.paneId, splitDirection: "-v" }
}
const splittablePanes = state.agentPanes
.map(p => ({ pane: p, direction: getBestSplitDirection(p) }))
.filter(({ direction }) => direction !== null)
.sort((a, b) => (b.pane.width * b.pane.height) - (a.pane.width * a.pane.height))
if (splittablePanes.length > 0) {
const best = splittablePanes[0]
return { targetPaneId: best.pane.paneId, splitDirection: best.direction! }
}
return null
}
export function findSpawnTarget(state: WindowState): SpawnTarget | null {
return findSplittableTarget(state)
}
function findOldestSession(mappings: SessionMapping[]): SessionMapping | null {
if (mappings.length === 0) return null
return mappings.reduce((oldest, current) =>
current.createdAt < oldest.createdAt ? current : oldest
)
}
function findOldestAgentPane(
agentPanes: TmuxPaneInfo[],
sessionMappings: SessionMapping[]
): TmuxPaneInfo | null {
if (agentPanes.length === 0) return null
const paneIdToAge = new Map<string, Date>()
for (const mapping of sessionMappings) {
paneIdToAge.set(mapping.paneId, mapping.createdAt)
}
const panesWithAge = agentPanes
.map(p => ({ pane: p, age: paneIdToAge.get(p.paneId) }))
.filter(({ age }) => age !== undefined)
.sort((a, b) => a.age!.getTime() - b.age!.getTime())
if (panesWithAge.length > 0) {
return panesWithAge[0].pane
}
return agentPanes.reduce((oldest, p) => {
if (p.top < oldest.top || (p.top === oldest.top && p.left < oldest.left)) {
return p
}
return oldest
})
}
export function decideSpawnActions(
state: WindowState,
sessionId: string,
description: string,
_config: CapacityConfig,
sessionMappings: SessionMapping[]
): SpawnDecision {
if (!state.mainPane) {
return { canSpawn: false, actions: [], reason: "no main pane found" }
}
const agentAreaWidth = Math.floor(state.windowWidth * (1 - MAIN_PANE_RATIO))
const currentCount = state.agentPanes.length
if (agentAreaWidth < MIN_PANE_WIDTH) {
return {
canSpawn: false,
actions: [],
reason: `window too small for agent panes: ${state.windowWidth}x${state.windowHeight}`,
}
}
const oldestPane = findOldestAgentPane(state.agentPanes, sessionMappings)
const oldestMapping = oldestPane
? sessionMappings.find(m => m.paneId === oldestPane.paneId)
: null
if (currentCount === 0) {
const virtualMainPane: TmuxPaneInfo = { ...state.mainPane, width: state.windowWidth }
if (canSplitPane(virtualMainPane, "-h")) {
return {
canSpawn: true,
actions: [{
type: "spawn",
sessionId,
description,
targetPaneId: state.mainPane.paneId,
splitDirection: "-h"
}]
}
}
return { canSpawn: false, actions: [], reason: "mainPane too small to split" }
}
if (isSplittableAtCount(agentAreaWidth, currentCount)) {
const spawnTarget = findSplittableTarget(state)
if (spawnTarget) {
return {
canSpawn: true,
actions: [{
type: "spawn",
sessionId,
description,
targetPaneId: spawnTarget.targetPaneId,
splitDirection: spawnTarget.splitDirection
}]
}
}
}
const minEvictions = findMinimalEvictions(agentAreaWidth, currentCount)
if (minEvictions === 1 && oldestPane) {
return {
canSpawn: true,
actions: [
{
type: "close",
paneId: oldestPane.paneId,
sessionId: oldestMapping?.sessionId || ""
},
{
type: "spawn",
sessionId,
description,
targetPaneId: state.mainPane.paneId,
splitDirection: "-h"
}
],
reason: "closed 1 pane to make room for split"
}
}
if (oldestPane) {
return {
canSpawn: true,
actions: [{
type: "replace",
paneId: oldestPane.paneId,
oldSessionId: oldestMapping?.sessionId || "",
newSessionId: sessionId,
description
}],
reason: "replaced oldest pane (no split possible)"
}
}
return {
canSpawn: false,
actions: [],
reason: "no pane available to replace"
}
}
export function decideCloseAction(
state: WindowState,
sessionId: string,
sessionMappings: SessionMapping[]
): PaneAction | null {
const mapping = sessionMappings.find((m) => m.sessionId === sessionId)
if (!mapping) return null
const paneExists = state.agentPanes.some((p) => p.paneId === mapping.paneId)
if (!paneExists) return null
return { type: "close", paneId: mapping.paneId, sessionId }
}
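The eviction search that drives the close-versus-replace choice in `decideSpawnActions` can be traced with concrete numbers. The sketch below copies the shape of `getColumnCount`, `getColumnWidth`, and `findMinimalEvictions`, but it assumes `MIN_PANE_WIDTH` is 52 (the value implied by the test comments, not confirmed by this excerpt); all `*Sketch` names are hypothetical.

```typescript
// Assumption: MIN_PANE_WIDTH = 52, so a column must be at least 2*52+1 = 105 wide to split.
const SKETCH_MIN_PANE_WIDTH = 52
const SKETCH_DIVIDER = 1
const SKETCH_MIN_SPLIT_WIDTH = 2 * SKETCH_MIN_PANE_WIDTH + SKETCH_DIVIDER
const SKETCH_MAX_COLS = 2
const SKETCH_MAX_ROWS = 3

function columnCountSketch(paneCount: number): number {
  if (paneCount <= 0) return 1
  return Math.min(SKETCH_MAX_COLS, Math.max(1, Math.ceil(paneCount / SKETCH_MAX_ROWS)))
}

function columnWidthSketch(agentAreaWidth: number, paneCount: number): number {
  const cols = columnCountSketch(paneCount)
  return Math.floor((agentAreaWidth - (cols - 1) * SKETCH_DIVIDER) / cols)
}

// Smallest number of panes to close so that a column becomes wide enough to split,
// or null when even an empty agent area is too narrow.
function minimalEvictionsSketch(agentAreaWidth: number, currentCount: number): number | null {
  for (let k = 1; k <= currentCount; k++) {
    if (columnWidthSketch(agentAreaWidth, currentCount - k) >= SKETCH_MIN_SPLIT_WIDTH) {
      return k
    }
  }
  return null
}

// 220-wide window: agent area is 110. Four panes need two columns of 54, too narrow to split;
// dropping to three panes gives one 110-wide column, so one eviction is enough.
const widthAtFour = columnWidthSketch(110, 4)
const widthAtThree = columnWidthSketch(110, 3)
const evictionsWide = minimalEvictionsSketch(110, 4)

// 160-wide window: agent area is 80, which never reaches 105, so no eviction count helps.
const evictionsNarrow = minimalEvictionsSketch(80, 1)
```

In `decideSpawnActions` a result of 1 leads to the `close` plus `spawn` pair, while any other result (including `null`) falls through to `replace` as long as an oldest pane exists. This matches the small-window case in the manager tests, where a 160-wide window yields a single `replace` action.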

@@ -0,0 +1,5 @@
export * from "./manager"
export * from "./types"
export * from "./pane-state-querier"
export * from "./decision-engine"
export * from "./action-executor"

@@ -0,0 +1,690 @@
import { describe, test, expect, mock, beforeEach } from 'bun:test'
import type { TmuxConfig } from '../../config/schema'
import type { WindowState, PaneAction } from './types'
import type { ActionResult, ExecuteContext } from './action-executor'
type ExecuteActionsResult = {
success: boolean
spawnedPaneId?: string
results: Array<{ action: PaneAction; result: ActionResult }>
}
const mockQueryWindowState = mock<(paneId: string) => Promise<WindowState | null>>(
async () => ({
windowWidth: 212,
windowHeight: 44,
mainPane: { paneId: '%0', width: 106, height: 44, left: 0, top: 0, title: 'main', isActive: true },
agentPanes: [],
})
)
const mockPaneExists = mock<(paneId: string) => Promise<boolean>>(async () => true)
const mockExecuteActions = mock<(
actions: PaneAction[],
ctx: ExecuteContext
) => Promise<ExecuteActionsResult>>(async () => ({
success: true,
spawnedPaneId: '%mock',
results: [],
}))
const mockExecuteAction = mock<(
action: PaneAction,
ctx: ExecuteContext
) => Promise<ActionResult>>(async () => ({ success: true }))
const mockIsInsideTmux = mock<() => boolean>(() => true)
const mockGetCurrentPaneId = mock<() => string | undefined>(() => '%0')
mock.module('./pane-state-querier', () => ({
queryWindowState: mockQueryWindowState,
paneExists: mockPaneExists,
getRightmostAgentPane: (state: WindowState) =>
state.agentPanes.length > 0
? state.agentPanes.reduce((r, p) => (p.left > r.left ? p : r))
: null,
getOldestAgentPane: (state: WindowState) =>
state.agentPanes.length > 0
? state.agentPanes.reduce((o, p) => (p.left < o.left ? p : o))
: null,
}))
mock.module('./action-executor', () => ({
executeActions: mockExecuteActions,
executeAction: mockExecuteAction,
}))
mock.module('../../shared/tmux', () => ({
isInsideTmux: mockIsInsideTmux,
getCurrentPaneId: mockGetCurrentPaneId,
POLL_INTERVAL_BACKGROUND_MS: 2000,
SESSION_TIMEOUT_MS: 600000,
SESSION_MISSING_GRACE_MS: 6000,
SESSION_READY_POLL_INTERVAL_MS: 100,
SESSION_READY_TIMEOUT_MS: 500,
}))
const trackedSessions = new Set<string>()
function createMockContext(overrides?: {
sessionStatusResult?: { data?: Record<string, { type: string }> }
}) {
return {
serverUrl: new URL('http://localhost:4096'),
client: {
session: {
status: mock(async () => {
if (overrides?.sessionStatusResult) {
return overrides.sessionStatusResult
}
const data: Record<string, { type: string }> = {}
for (const sessionId of trackedSessions) {
data[sessionId] = { type: 'running' }
}
return { data }
}),
},
},
} as any
}
function createSessionCreatedEvent(
id: string,
parentID: string | undefined,
title: string
) {
return {
type: 'session.created',
properties: {
info: { id, parentID, title },
},
}
}
function createWindowState(overrides?: Partial<WindowState>): WindowState {
return {
windowWidth: 220,
windowHeight: 44,
mainPane: { paneId: '%0', width: 110, height: 44, left: 0, top: 0, title: 'main', isActive: true },
agentPanes: [],
...overrides,
}
}
describe('TmuxSessionManager', () => {
beforeEach(() => {
mockQueryWindowState.mockClear()
mockPaneExists.mockClear()
mockExecuteActions.mockClear()
mockExecuteAction.mockClear()
mockIsInsideTmux.mockClear()
mockGetCurrentPaneId.mockClear()
trackedSessions.clear()
mockQueryWindowState.mockImplementation(async () => createWindowState())
mockExecuteActions.mockImplementation(async (actions) => {
for (const action of actions) {
if (action.type === 'spawn') {
trackedSessions.add(action.sessionId)
}
}
return {
success: true,
spawnedPaneId: '%mock',
results: [],
}
})
})
describe('constructor', () => {
test('enabled when config.enabled=true and isInsideTmux=true', async () => {
//#given
mockIsInsideTmux.mockReturnValue(true)
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: true,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 80,
agent_pane_min_width: 40,
}
//#when
const manager = new TmuxSessionManager(ctx, config)
//#then
expect(manager).toBeDefined()
})
test('disabled when config.enabled=true but isInsideTmux=false', async () => {
//#given
mockIsInsideTmux.mockReturnValue(false)
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: true,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 80,
agent_pane_min_width: 40,
}
//#when
const manager = new TmuxSessionManager(ctx, config)
//#then
expect(manager).toBeDefined()
})
test('disabled when config.enabled=false', async () => {
//#given
mockIsInsideTmux.mockReturnValue(true)
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: false,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 80,
agent_pane_min_width: 40,
}
//#when
const manager = new TmuxSessionManager(ctx, config)
//#then
expect(manager).toBeDefined()
})
})
describe('onSessionCreated', () => {
test('first agent spawns from source pane via decision engine', async () => {
//#given
mockIsInsideTmux.mockReturnValue(true)
mockQueryWindowState.mockImplementation(async () => createWindowState())
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: true,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 80,
agent_pane_min_width: 40,
}
const manager = new TmuxSessionManager(ctx, config)
const event = createSessionCreatedEvent(
'ses_child',
'ses_parent',
'Background: Test Task'
)
//#when
await manager.onSessionCreated(event)
//#then
expect(mockQueryWindowState).toHaveBeenCalledTimes(1)
expect(mockExecuteActions).toHaveBeenCalledTimes(1)
const call = mockExecuteActions.mock.calls[0]
expect(call).toBeDefined()
const actionsArg = call![0]
expect(actionsArg).toHaveLength(1)
expect(actionsArg[0].type).toBe('spawn')
if (actionsArg[0].type === 'spawn') {
expect(actionsArg[0].sessionId).toBe('ses_child')
expect(actionsArg[0].description).toBe('Background: Test Task')
expect(actionsArg[0].targetPaneId).toBe('%0')
expect(actionsArg[0].splitDirection).toBe('-h')
}
})
test('second agent spawns with correct split direction', async () => {
//#given
mockIsInsideTmux.mockReturnValue(true)
let callCount = 0
mockQueryWindowState.mockImplementation(async () => {
callCount++
if (callCount === 1) {
return createWindowState()
}
return createWindowState({
agentPanes: [
{
paneId: '%1',
width: 40,
height: 44,
left: 100,
top: 0,
title: 'omo-subagent-Task 1',
isActive: false,
},
],
})
})
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: true,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 80,
agent_pane_min_width: 40,
}
const manager = new TmuxSessionManager(ctx, config)
//#when - first agent
await manager.onSessionCreated(
createSessionCreatedEvent('ses_1', 'ses_parent', 'Task 1')
)
mockExecuteActions.mockClear()
//#when - second agent
await manager.onSessionCreated(
createSessionCreatedEvent('ses_2', 'ses_parent', 'Task 2')
)
//#then
expect(mockExecuteActions).toHaveBeenCalledTimes(1)
const call = mockExecuteActions.mock.calls[0]
expect(call).toBeDefined()
const actionsArg = call![0]
expect(actionsArg).toHaveLength(1)
expect(actionsArg[0].type).toBe('spawn')
})
test('does NOT spawn pane when session has no parentID', async () => {
//#given
mockIsInsideTmux.mockReturnValue(true)
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: true,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 80,
agent_pane_min_width: 40,
}
const manager = new TmuxSessionManager(ctx, config)
const event = createSessionCreatedEvent('ses_root', undefined, 'Root Session')
//#when
await manager.onSessionCreated(event)
//#then
expect(mockExecuteActions).toHaveBeenCalledTimes(0)
})
test('does NOT spawn pane when disabled', async () => {
//#given
mockIsInsideTmux.mockReturnValue(true)
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: false,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 80,
agent_pane_min_width: 40,
}
const manager = new TmuxSessionManager(ctx, config)
const event = createSessionCreatedEvent(
'ses_child',
'ses_parent',
'Background: Test Task'
)
//#when
await manager.onSessionCreated(event)
//#then
expect(mockExecuteActions).toHaveBeenCalledTimes(0)
})
test('does NOT spawn pane for non session.created event type', async () => {
//#given
mockIsInsideTmux.mockReturnValue(true)
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: true,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 80,
agent_pane_min_width: 40,
}
const manager = new TmuxSessionManager(ctx, config)
const event = {
type: 'session.deleted',
properties: {
info: { id: 'ses_child', parentID: 'ses_parent', title: 'Task' },
},
}
//#when
await manager.onSessionCreated(event)
//#then
expect(mockExecuteActions).toHaveBeenCalledTimes(0)
})
test('replaces oldest agent when unsplittable (small window)', async () => {
//#given - small window where split is not possible
mockIsInsideTmux.mockReturnValue(true)
mockQueryWindowState.mockImplementation(async () =>
createWindowState({
windowWidth: 160,
windowHeight: 11,
agentPanes: [
{
paneId: '%1',
width: 40,
height: 11,
left: 80,
top: 0,
title: 'omo-subagent-Task 1',
isActive: false,
},
],
})
)
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: true,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 120,
agent_pane_min_width: 40,
}
const manager = new TmuxSessionManager(ctx, config)
//#when
await manager.onSessionCreated(
createSessionCreatedEvent('ses_new', 'ses_parent', 'New Task')
)
//#then - with small window, replace action is used instead of close+spawn
expect(mockExecuteActions).toHaveBeenCalledTimes(1)
const call = mockExecuteActions.mock.calls[0]
expect(call).toBeDefined()
const actionsArg = call![0]
expect(actionsArg).toHaveLength(1)
expect(actionsArg[0].type).toBe('replace')
})
})
describe('onSessionDeleted', () => {
test('closes pane when tracked session is deleted', async () => {
//#given
mockIsInsideTmux.mockReturnValue(true)
let stateCallCount = 0
mockQueryWindowState.mockImplementation(async () => {
stateCallCount++
if (stateCallCount === 1) {
return createWindowState()
}
return createWindowState({
agentPanes: [
{
paneId: '%mock',
width: 40,
height: 44,
left: 100,
top: 0,
title: 'omo-subagent-Task',
isActive: false,
},
],
})
})
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: true,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 80,
agent_pane_min_width: 40,
}
const manager = new TmuxSessionManager(ctx, config)
await manager.onSessionCreated(
createSessionCreatedEvent(
'ses_child',
'ses_parent',
'Background: Test Task'
)
)
mockExecuteAction.mockClear()
//#when
await manager.onSessionDeleted({ sessionID: 'ses_child' })
//#then
expect(mockExecuteAction).toHaveBeenCalledTimes(1)
const call = mockExecuteAction.mock.calls[0]
expect(call).toBeDefined()
expect(call![0]).toEqual({
type: 'close',
paneId: '%mock',
sessionId: 'ses_child',
})
})
test('does nothing when untracked session is deleted', async () => {
//#given
mockIsInsideTmux.mockReturnValue(true)
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: true,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 80,
agent_pane_min_width: 40,
}
const manager = new TmuxSessionManager(ctx, config)
//#when
await manager.onSessionDeleted({ sessionID: 'ses_unknown' })
//#then
expect(mockExecuteAction).toHaveBeenCalledTimes(0)
})
})
describe('cleanup', () => {
test('closes all tracked panes', async () => {
//#given
mockIsInsideTmux.mockReturnValue(true)
let callCount = 0
mockExecuteActions.mockImplementation(async () => {
callCount++
return {
success: true,
spawnedPaneId: `%${callCount}`,
results: [],
}
})
const { TmuxSessionManager } = await import('./manager')
const ctx = createMockContext()
const config: TmuxConfig = {
enabled: true,
layout: 'main-vertical',
main_pane_size: 60,
main_pane_min_width: 80,
agent_pane_min_width: 40,
}
const manager = new TmuxSessionManager(ctx, config)
await manager.onSessionCreated(
createSessionCreatedEvent('ses_1', 'ses_parent', 'Task 1')
)
await manager.onSessionCreated(
createSessionCreatedEvent('ses_2', 'ses_parent', 'Task 2')
)
mockExecuteAction.mockClear()
//#when
await manager.cleanup()
//#then
expect(mockExecuteAction).toHaveBeenCalledTimes(2)
})
})
})
describe('DecisionEngine', () => {
describe('calculateCapacity', () => {
test('calculates correct 2D grid capacity', async () => {
//#given
const { calculateCapacity } = await import('./decision-engine')
//#when
const result = calculateCapacity(212, 44)
//#then - availableWidth=106, cols=floor((106+1)/(52+1))=2, rows=floor((44+1)/(11+1))=3 (accounting for dividers)
expect(result.cols).toBe(2)
expect(result.rows).toBe(3)
expect(result.total).toBe(6)
})
test('returns 0 cols when agent area too narrow', async () => {
//#given
const { calculateCapacity } = await import('./decision-engine')
//#when
const result = calculateCapacity(100, 44)
//#then - availableWidth=50, cols=floor((50+1)/(52+1))=0
expect(result.cols).toBe(0)
expect(result.total).toBe(0)
})
})
describe('decideSpawnActions', () => {
test('returns spawn action with splitDirection when under capacity', async () => {
//#given
const { decideSpawnActions } = await import('./decision-engine')
const state: WindowState = {
windowWidth: 212,
windowHeight: 44,
mainPane: {
paneId: '%0',
width: 106,
height: 44,
left: 0,
top: 0,
title: 'main',
isActive: true,
},
agentPanes: [],
}
//#when
const decision = decideSpawnActions(
state,
'ses_1',
'Test Task',
{ mainPaneMinWidth: 120, agentPaneWidth: 40 },
[]
)
//#then
expect(decision.canSpawn).toBe(true)
expect(decision.actions).toHaveLength(1)
expect(decision.actions[0].type).toBe('spawn')
if (decision.actions[0].type === 'spawn') {
expect(decision.actions[0].sessionId).toBe('ses_1')
expect(decision.actions[0].description).toBe('Test Task')
expect(decision.actions[0].targetPaneId).toBe('%0')
expect(decision.actions[0].splitDirection).toBe('-h')
}
})
test('returns replace when split not possible', async () => {
//#given - small window where split is never possible
const { decideSpawnActions } = await import('./decision-engine')
const state: WindowState = {
windowWidth: 160,
windowHeight: 11,
mainPane: {
paneId: '%0',
width: 80,
height: 11,
left: 0,
top: 0,
title: 'main',
isActive: true,
},
agentPanes: [
{
paneId: '%1',
width: 80,
height: 11,
left: 80,
top: 0,
title: 'omo-subagent-Old',
isActive: false,
},
],
}
const sessionMappings = [
{ sessionId: 'ses_old', paneId: '%1', createdAt: new Date('2024-01-01') },
]
//#when
const decision = decideSpawnActions(
state,
'ses_new',
'New Task',
{ mainPaneMinWidth: 120, agentPaneWidth: 40 },
sessionMappings
)
//#then - agent area (80) < MIN_SPLIT_WIDTH (105), so replace is used
expect(decision.canSpawn).toBe(true)
expect(decision.actions).toHaveLength(1)
expect(decision.actions[0].type).toBe('replace')
})
test('returns canSpawn=false when window too small', async () => {
//#given
const { decideSpawnActions } = await import('./decision-engine')
const state: WindowState = {
windowWidth: 60,
windowHeight: 5,
mainPane: {
paneId: '%0',
width: 30,
height: 5,
left: 0,
top: 0,
title: 'main',
isActive: true,
},
agentPanes: [],
}
//#when
const decision = decideSpawnActions(
state,
'ses_1',
'Test Task',
{ mainPaneMinWidth: 120, agentPaneWidth: 40 },
[]
)
//#then
expect(decision.canSpawn).toBe(false)
expect(decision.reason).toContain('too small')
})
})
})
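
The arithmetic in the `calculateCapacity` test comments can be reproduced with a small sketch. This is not the code from `./decision-engine`, which is not shown here. It assumes the agent area is half the window width (as the `availableWidth` values in the comments suggest) and that each divider takes one cell. The name `sketchCalculateCapacity` and the `DIVIDER_SIZE` constant are hypothetical; the two minimum sizes match the constants in the types file.

```typescript
// Minimum pane sizes, matching MIN_PANE_WIDTH / MIN_PANE_HEIGHT in the types file.
const MIN_PANE_WIDTH = 52
const MIN_PANE_HEIGHT = 11
// Assumption: one cell is consumed by each divider between panes.
const DIVIDER_SIZE = 1

interface GridCapacity {
  cols: number
  rows: number
  total: number
}

// Hypothetical re-derivation of the grid capacity described in the test comments.
function sketchCalculateCapacity(windowWidth: number, windowHeight: number): GridCapacity {
  // Assumption: the main pane keeps half of the window, agents get the rest.
  const availableWidth = Math.floor(windowWidth / 2)
  // n panes need n * MIN + (n - 1) dividers, so n = floor((space + 1) / (MIN + 1)).
  const cols = Math.max(0, Math.floor((availableWidth + DIVIDER_SIZE) / (MIN_PANE_WIDTH + DIVIDER_SIZE)))
  const rows = Math.max(0, Math.floor((windowHeight + DIVIDER_SIZE) / (MIN_PANE_HEIGHT + DIVIDER_SIZE)))
  return { cols, rows, total: cols * rows }
}

console.log(sketchCalculateCapacity(212, 44)) // { cols: 2, rows: 3, total: 6 }
console.log(sketchCalculateCapacity(100, 44)) // { cols: 0, rows: 3, total: 0 }
```

Adding the divider size to both numerator and denominator accounts for the fact that the last pane in a row or column has no trailing divider.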


@@ -0,0 +1,396 @@
import type { PluginInput } from "@opencode-ai/plugin"
import type { TmuxConfig } from "../../config/schema"
import type { TrackedSession, CapacityConfig } from "./types"
import {
isInsideTmux,
getCurrentPaneId,
POLL_INTERVAL_BACKGROUND_MS,
SESSION_MISSING_GRACE_MS,
SESSION_READY_POLL_INTERVAL_MS,
SESSION_READY_TIMEOUT_MS,
} from "../../shared/tmux"
import { log } from "../../shared"
import { queryWindowState } from "./pane-state-querier"
import { decideSpawnActions, decideCloseAction, type SessionMapping } from "./decision-engine"
import { executeActions, executeAction } from "./action-executor"
type OpencodeClient = PluginInput["client"]
interface SessionCreatedEvent {
type: string
properties?: { info?: { id?: string; parentID?: string; title?: string } }
}
const SESSION_TIMEOUT_MS = 10 * 60 * 1000
/**
* State-first Tmux Session Manager
*
* Architecture:
* 1. QUERY: Get actual tmux pane state (source of truth)
* 2. DECIDE: Pure function determines actions based on state
* 3. EXECUTE: Execute actions with verification
* 4. UPDATE: Update internal cache only after tmux confirms success
*
* The internal `sessions` Map is just a cache for sessionId<->paneId mapping.
* The REAL source of truth is always queried from tmux.
*/
export class TmuxSessionManager {
private client: OpencodeClient
private tmuxConfig: TmuxConfig
private serverUrl: string
private sourcePaneId: string | undefined
private sessions = new Map<string, TrackedSession>()
private pendingSessions = new Set<string>()
private pollInterval?: ReturnType<typeof setInterval>
constructor(ctx: PluginInput, tmuxConfig: TmuxConfig) {
this.client = ctx.client
this.tmuxConfig = tmuxConfig
const defaultPort = process.env.OPENCODE_PORT ?? "4096"
this.serverUrl = ctx.serverUrl?.toString() ?? `http://localhost:${defaultPort}`
this.sourcePaneId = getCurrentPaneId()
log("[tmux-session-manager] initialized", {
configEnabled: this.tmuxConfig.enabled,
tmuxConfig: this.tmuxConfig,
serverUrl: this.serverUrl,
sourcePaneId: this.sourcePaneId,
})
}
private isEnabled(): boolean {
return this.tmuxConfig.enabled && isInsideTmux()
}
private getCapacityConfig(): CapacityConfig {
return {
mainPaneMinWidth: this.tmuxConfig.main_pane_min_width,
agentPaneWidth: this.tmuxConfig.agent_pane_min_width,
}
}
private getSessionMappings(): SessionMapping[] {
return Array.from(this.sessions.values()).map((s) => ({
sessionId: s.sessionId,
paneId: s.paneId,
createdAt: s.createdAt,
}))
}
private async waitForSessionReady(sessionId: string): Promise<boolean> {
const startTime = Date.now()
while (Date.now() - startTime < SESSION_READY_TIMEOUT_MS) {
try {
const statusResult = await this.client.session.status({ path: undefined })
const allStatuses = (statusResult.data ?? {}) as Record<string, { type: string }>
if (allStatuses[sessionId]) {
log("[tmux-session-manager] session ready", {
sessionId,
status: allStatuses[sessionId].type,
waitedMs: Date.now() - startTime,
})
return true
}
} catch (err) {
log("[tmux-session-manager] session status check error", { error: String(err) })
}
await new Promise((resolve) => setTimeout(resolve, SESSION_READY_POLL_INTERVAL_MS))
}
log("[tmux-session-manager] session ready timeout", {
sessionId,
timeoutMs: SESSION_READY_TIMEOUT_MS,
})
return false
}
async onSessionCreated(event: SessionCreatedEvent): Promise<void> {
const enabled = this.isEnabled()
log("[tmux-session-manager] onSessionCreated called", {
enabled,
tmuxConfigEnabled: this.tmuxConfig.enabled,
isInsideTmux: isInsideTmux(),
eventType: event.type,
infoId: event.properties?.info?.id,
infoParentID: event.properties?.info?.parentID,
})
if (!enabled) return
if (event.type !== "session.created") return
const info = event.properties?.info
if (!info?.id || !info?.parentID) return
const sessionId = info.id
const title = info.title ?? "Subagent"
if (this.sessions.has(sessionId) || this.pendingSessions.has(sessionId)) {
log("[tmux-session-manager] session already tracked or pending", { sessionId })
return
}
if (!this.sourcePaneId) {
log("[tmux-session-manager] no source pane id")
return
}
this.pendingSessions.add(sessionId)
try {
const state = await queryWindowState(this.sourcePaneId)
if (!state) {
log("[tmux-session-manager] failed to query window state")
return
}
log("[tmux-session-manager] window state queried", {
windowWidth: state.windowWidth,
mainPane: state.mainPane?.paneId,
agentPaneCount: state.agentPanes.length,
agentPanes: state.agentPanes.map((p) => p.paneId),
})
const decision = decideSpawnActions(
state,
sessionId,
title,
this.getCapacityConfig(),
this.getSessionMappings()
)
log("[tmux-session-manager] spawn decision", {
canSpawn: decision.canSpawn,
reason: decision.reason,
actionCount: decision.actions.length,
actions: decision.actions.map((a) => {
if (a.type === "close") return { type: "close", paneId: a.paneId }
if (a.type === "replace") return { type: "replace", paneId: a.paneId, newSessionId: a.newSessionId }
return { type: "spawn", sessionId: a.sessionId }
}),
})
if (!decision.canSpawn) {
log("[tmux-session-manager] cannot spawn", { reason: decision.reason })
return
}
const result = await executeActions(
decision.actions,
{ config: this.tmuxConfig, serverUrl: this.serverUrl, windowState: state }
)
for (const { action, result: actionResult } of result.results) {
if (action.type === "close" && actionResult.success) {
this.sessions.delete(action.sessionId)
log("[tmux-session-manager] removed closed session from cache", {
sessionId: action.sessionId,
})
}
if (action.type === "replace" && actionResult.success) {
this.sessions.delete(action.oldSessionId)
log("[tmux-session-manager] removed replaced session from cache", {
oldSessionId: action.oldSessionId,
newSessionId: action.newSessionId,
})
}
}
if (result.success && result.spawnedPaneId) {
const sessionReady = await this.waitForSessionReady(sessionId)
if (!sessionReady) {
log("[tmux-session-manager] session not ready after timeout, tracking anyway", {
sessionId,
paneId: result.spawnedPaneId,
})
}
const now = Date.now()
this.sessions.set(sessionId, {
sessionId,
paneId: result.spawnedPaneId,
description: title,
createdAt: new Date(now),
lastSeenAt: new Date(now),
})
log("[tmux-session-manager] pane spawned and tracked", {
sessionId,
paneId: result.spawnedPaneId,
sessionReady,
})
this.startPolling()
} else {
log("[tmux-session-manager] spawn failed", {
success: result.success,
results: result.results.map((r) => ({
type: r.action.type,
success: r.result.success,
error: r.result.error,
})),
})
}
} finally {
this.pendingSessions.delete(sessionId)
}
}
async onSessionDeleted(event: { sessionID: string }): Promise<void> {
if (!this.isEnabled()) return
if (!this.sourcePaneId) return
const tracked = this.sessions.get(event.sessionID)
if (!tracked) return
log("[tmux-session-manager] onSessionDeleted", { sessionId: event.sessionID })
const state = await queryWindowState(this.sourcePaneId)
if (!state) {
this.sessions.delete(event.sessionID)
return
}
const closeAction = decideCloseAction(state, event.sessionID, this.getSessionMappings())
if (closeAction) {
await executeAction(closeAction, { config: this.tmuxConfig, serverUrl: this.serverUrl, windowState: state })
}
this.sessions.delete(event.sessionID)
if (this.sessions.size === 0) {
this.stopPolling()
}
}
private startPolling(): void {
if (this.pollInterval) return
this.pollInterval = setInterval(
() => this.pollSessions(),
POLL_INTERVAL_BACKGROUND_MS,
)
log("[tmux-session-manager] polling started")
}
private stopPolling(): void {
if (this.pollInterval) {
clearInterval(this.pollInterval)
this.pollInterval = undefined
log("[tmux-session-manager] polling stopped")
}
}
private async pollSessions(): Promise<void> {
if (this.sessions.size === 0) {
this.stopPolling()
return
}
try {
const statusResult = await this.client.session.status({ path: undefined })
const allStatuses = (statusResult.data ?? {}) as Record<string, { type: string }>
log("[tmux-session-manager] pollSessions", {
trackedSessions: Array.from(this.sessions.keys()),
allStatusKeys: Object.keys(allStatuses),
})
const now = Date.now()
const sessionsToClose: string[] = []
for (const [sessionId, tracked] of this.sessions.entries()) {
const status = allStatuses[sessionId]
const isIdle = status?.type === "idle"
if (status) {
tracked.lastSeenAt = new Date(now)
}
const missingSince = !status ? now - tracked.lastSeenAt.getTime() : 0
const missingTooLong = missingSince >= SESSION_MISSING_GRACE_MS
const isTimedOut = now - tracked.createdAt.getTime() > SESSION_TIMEOUT_MS
log("[tmux-session-manager] session check", {
sessionId,
statusType: status?.type,
isIdle,
missingSince,
missingTooLong,
isTimedOut,
shouldClose: isIdle || missingTooLong || isTimedOut,
})
if (isIdle || missingTooLong || isTimedOut) {
sessionsToClose.push(sessionId)
}
}
for (const sessionId of sessionsToClose) {
log("[tmux-session-manager] closing session due to poll", { sessionId })
await this.closeSessionById(sessionId)
}
} catch (err) {
log("[tmux-session-manager] poll error", { error: String(err) })
}
}
private async closeSessionById(sessionId: string): Promise<void> {
const tracked = this.sessions.get(sessionId)
if (!tracked) return
log("[tmux-session-manager] closing session pane", {
sessionId,
paneId: tracked.paneId,
})
const state = this.sourcePaneId ? await queryWindowState(this.sourcePaneId) : null
if (state) {
await executeAction(
{ type: "close", paneId: tracked.paneId, sessionId },
{ config: this.tmuxConfig, serverUrl: this.serverUrl, windowState: state }
)
}
this.sessions.delete(sessionId)
if (this.sessions.size === 0) {
this.stopPolling()
}
}
createEventHandler(): (input: { event: { type: string; properties?: unknown } }) => Promise<void> {
return async (input) => {
await this.onSessionCreated(input.event as SessionCreatedEvent)
}
}
async cleanup(): Promise<void> {
this.stopPolling()
if (this.sessions.size > 0) {
log("[tmux-session-manager] closing all panes", { count: this.sessions.size })
const state = this.sourcePaneId ? await queryWindowState(this.sourcePaneId) : null
if (state) {
const closePromises = Array.from(this.sessions.values()).map((s) =>
executeAction(
{ type: "close", paneId: s.paneId, sessionId: s.sessionId },
{ config: this.tmuxConfig, serverUrl: this.serverUrl, windowState: state }
).catch((err) =>
log("[tmux-session-manager] cleanup error for pane", {
paneId: s.paneId,
error: String(err),
}),
),
)
await Promise.all(closePromises)
}
this.sessions.clear()
}
log("[tmux-session-manager] cleanup complete")
}
}
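
The close decision inside `pollSessions` can be read as a pure function of the tracked timestamps and the reported status. The sketch below restates that logic in isolation so it can be tested without tmux or a client. The names `evaluateCloseCheck`, `CloseLimits`, and the grace value used in the usage lines are hypothetical; the real `SESSION_MISSING_GRACE_MS` value is not shown in this diff. One small difference from the committed code: the sketch only reports `missingTooLong` when no status was returned at all.

```typescript
interface TrackedTimes {
  createdAt: Date
  lastSeenAt: Date
}

interface SessionStatus {
  type: string
}

// Hypothetical container for the two thresholds used by the poll loop.
interface CloseLimits {
  missingGraceMs: number
  timeoutMs: number
}

interface CloseCheck {
  isIdle: boolean
  missingTooLong: boolean
  isTimedOut: boolean
  shouldClose: boolean
}

// Pure restatement of the per-session check: close when the session is idle,
// has been absent from the status map for too long, or has outlived the timeout.
function evaluateCloseCheck(
  tracked: TrackedTimes,
  status: SessionStatus | undefined,
  nowMs: number,
  limits: CloseLimits
): CloseCheck {
  const isIdle = status?.type === "idle"
  const missingForMs = status ? 0 : nowMs - tracked.lastSeenAt.getTime()
  const missingTooLong = !status && missingForMs >= limits.missingGraceMs
  const isTimedOut = nowMs - tracked.createdAt.getTime() > limits.timeoutMs
  return { isIdle, missingTooLong, isTimedOut, shouldClose: isIdle || missingTooLong || isTimedOut }
}

// Usage: 10-minute timeout as in SESSION_TIMEOUT_MS; the 6-second grace is an assumption.
const exampleLimits: CloseLimits = { missingGraceMs: 6_000, timeoutMs: 10 * 60 * 1000 }
const exampleNow = Date.now()
const exampleTracked: TrackedTimes = {
  createdAt: new Date(exampleNow - 60_000),
  lastSeenAt: new Date(exampleNow - 1_000),
}
console.log(evaluateCloseCheck(exampleTracked, { type: "idle" }, exampleNow, exampleLimits).shouldClose)
```

Keeping the decision separate from the side effects mirrors the QUERY, DECIDE, EXECUTE, UPDATE split described in the class comment.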


@@ -0,0 +1,73 @@
import { spawn } from "bun"
import type { WindowState, TmuxPaneInfo } from "./types"
import { getTmuxPath } from "../../tools/interactive-bash/utils"
import { log } from "../../shared"
export async function queryWindowState(sourcePaneId: string): Promise<WindowState | null> {
const tmux = await getTmuxPath()
if (!tmux) return null
const proc = spawn(
[
tmux,
"list-panes",
"-t",
sourcePaneId,
"-F",
"#{pane_id},#{pane_width},#{pane_height},#{pane_left},#{pane_top},#{pane_title},#{pane_active},#{window_width},#{window_height}",
],
{ stdout: "pipe", stderr: "pipe" }
)
const exitCode = await proc.exited
const stdout = await new Response(proc.stdout).text()
if (exitCode !== 0) {
log("[pane-state-querier] list-panes failed", { exitCode })
return null
}
const lines = stdout.trim().split("\n").filter(Boolean)
if (lines.length === 0) return null
let windowWidth = 0
let windowHeight = 0
const panes: TmuxPaneInfo[] = []
for (const line of lines) {
const [paneId, widthStr, heightStr, leftStr, topStr, title, activeStr, windowWidthStr, windowHeightStr] = line.split(",")
const width = parseInt(widthStr, 10)
const height = parseInt(heightStr, 10)
const left = parseInt(leftStr, 10)
const top = parseInt(topStr, 10)
const isActive = activeStr === "1"
windowWidth = parseInt(windowWidthStr, 10)
windowHeight = parseInt(windowHeightStr, 10)
if (!isNaN(width) && !isNaN(left) && !isNaN(height) && !isNaN(top)) {
panes.push({ paneId, width, height, left, top, title, isActive })
}
}
panes.sort((a, b) => a.left - b.left || a.top - b.top)
const mainPane = panes.find((p) => p.paneId === sourcePaneId)
if (!mainPane) {
log("[pane-state-querier] CRITICAL: sourcePaneId not found in panes", {
sourcePaneId,
availablePanes: panes.map((p) => p.paneId),
})
return null
}
const agentPanes = panes.filter((p) => p.paneId !== mainPane.paneId)
log("[pane-state-querier] window state", {
windowWidth,
windowHeight,
mainPane: mainPane.paneId,
agentPaneCount: agentPanes.length,
})
return { windowWidth, windowHeight, mainPane, agentPanes }
}
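
`queryWindowState` splits each `list-panes` line on commas, which works as long as the pane title itself contains no comma. As a hedged illustration only (the committed code above uses a plain `split(",")`), the sketch below shows one way to parse the same nine-field format while tolerating commas in the title: take five fields from the front, three from the back, and rejoin whatever is left as the title. The names `parsePaneLine` and `ParsedPaneLine` are hypothetical.

```typescript
interface ParsedPaneLine {
  paneId: string
  width: number
  height: number
  left: number
  top: number
  title: string
  isActive: boolean
  windowWidth: number
  windowHeight: number
}

// Parses one line in the format
// pane_id,pane_width,pane_height,pane_left,pane_top,pane_title,pane_active,window_width,window_height
// Returns null when the line is too short or a numeric field does not parse.
function parsePaneLine(line: string): ParsedPaneLine | null {
  const fields = line.split(",")
  if (fields.length < 9) return null

  const [paneId, widthStr, heightStr, leftStr, topStr] = fields
  const [activeStr, windowWidthStr, windowHeightStr] = fields.slice(-3)
  // Everything between the fixed head and tail is the title, commas included.
  const title = fields.slice(5, -3).join(",")

  const numbers = [widthStr, heightStr, leftStr, topStr, windowWidthStr, windowHeightStr].map((s) =>
    Number.parseInt(s, 10)
  )
  if (numbers.some((n) => Number.isNaN(n))) return null
  const [width, height, left, top, windowWidth, windowHeight] = numbers

  return { paneId, width, height, left, top, title, isActive: activeStr === "1", windowWidth, windowHeight }
}

console.log(parsePaneLine("%2,40,44,100,0,fix a, b,1,212,44")?.title) // "fix a, b"
```

An alternative with the same effect would be to use a separator in the `-F` format string that is unlikely to appear in titles.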


@@ -0,0 +1,45 @@
export interface TrackedSession {
sessionId: string
paneId: string
description: string
createdAt: Date
lastSeenAt: Date
}
export const MIN_PANE_WIDTH = 52
export const MIN_PANE_HEIGHT = 11
export interface TmuxPaneInfo {
paneId: string
width: number
height: number
left: number
top: number
title: string
isActive: boolean
}
export interface WindowState {
windowWidth: number
windowHeight: number
mainPane: TmuxPaneInfo | null
agentPanes: TmuxPaneInfo[]
}
export type SplitDirection = "-h" | "-v"
export type PaneAction =
| { type: "close"; paneId: string; sessionId: string }
| { type: "spawn"; sessionId: string; description: string; targetPaneId: string; splitDirection: SplitDirection }
| { type: "replace"; paneId: string; oldSessionId: string; newSessionId: string; description: string }
export interface SpawnDecision {
canSpawn: boolean
actions: PaneAction[]
reason?: string
}
export interface CapacityConfig {
mainPaneMinWidth: number
agentPaneWidth: number
}
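
Because `PaneAction` is a discriminated union on `type`, callers can narrow each variant with a `switch` and let the compiler flag any variant that is not handled. The sketch below repeats the union from the types above and adds a hypothetical `summarizeAction` helper, similar in spirit to the per-action log mapping in the manager; the helper itself is not part of the diff.

```typescript
type SplitDirection = "-h" | "-v"

type PaneAction =
  | { type: "close"; paneId: string; sessionId: string }
  | { type: "spawn"; sessionId: string; description: string; targetPaneId: string; splitDirection: SplitDirection }
  | { type: "replace"; paneId: string; oldSessionId: string; newSessionId: string; description: string }

// Hypothetical helper: produces a one-line summary per action variant.
function summarizeAction(action: PaneAction): string {
  switch (action.type) {
    case "close":
      return `close ${action.paneId} (${action.sessionId})`
    case "spawn":
      return `spawn ${action.sessionId} from ${action.targetPaneId} ${action.splitDirection}`
    case "replace":
      return `replace ${action.oldSessionId} with ${action.newSessionId} in ${action.paneId}`
    default: {
      // If a new variant is added to PaneAction, this assignment stops compiling.
      const unreachable: never = action
      throw new Error(`unhandled action: ${JSON.stringify(unreachable)}`)
    }
  }
}

console.log(summarizeAction({ type: "close", paneId: "%1", sessionId: "ses_1" }))
// prints "close %1 (ses_1)"
```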


@@ -1,16 +1,14 @@
# HOOKS KNOWLEDGE BASE
## OVERVIEW
31 lifecycle hooks intercepting/modifying agent behavior. Events: PreToolUse, PostToolUse, UserPromptSubmit, Stop, onSummarize.
32 lifecycle hooks intercepting/modifying agent behavior. Events: PreToolUse, PostToolUse, UserPromptSubmit, Stop, onSummarize.
## STRUCTURE
```
hooks/
├── atlas/ # Main orchestration (773 lines)
├── anthropic-context-window-limit-recovery/ # Auto-summarize
├── todo-continuation-enforcer.ts # Force TODO completion (489 lines)
├── atlas/ # Main orchestration (752 lines)
├── anthropic-context-window-limit-recovery/ # Auto-summarize
├── todo-continuation-enforcer.ts # Force TODO completion (16k lines)
├── ralph-loop/ # Self-referential dev loop
├── claude-code-hooks/ # settings.json compat layer - see AGENTS.md
├── comment-checker/ # Prevents AI slop
@@ -35,45 +33,54 @@ hooks/
├── non-interactive-env/ # Non-TTY environment handling
├── start-work/ # Sisyphus work session starter
├── task-resume-info/ # Resume info for cancelled tasks
├── question-label-truncator/ # Auto-truncates question labels >30 chars
├── question-label-truncator/ # Auto-truncates question labels
├── category-skill-reminder/ # Reminds of category skills
├── empty-task-response-detector.ts # Detects empty responses
├── sisyphus-junior-notepad/ # Sisyphus Junior notepad
└── index.ts # Hook aggregation + registration
```
## HOOK EVENTS
| Event | Timing | Can Block | Use Case |
|-------|--------|-----------|----------|
| PreToolUse | Before tool | Yes | Validate/modify inputs |
| PostToolUse | After tool | No | Append warnings, truncate |
| UserPromptSubmit | On prompt | Yes | Keyword detection |
| Stop | Session idle | No | Auto-continue |
| onSummarize | Compaction | No | Preserve state |
| UserPromptSubmit | `chat.message` | Yes | Keyword detection, slash commands |
| PreToolUse | `tool.execute.before` | Yes | Validate/modify inputs, inject context |
| PostToolUse | `tool.execute.after` | No | Truncate output, error recovery |
| Stop | `event` (session.stop) | No | Auto-continue, notifications |
| onSummarize | Compaction | No | Preserve state, inject summary context |
## EXECUTION ORDER
**chat.message**: keywordDetector → claudeCodeHooks → autoSlashCommand → startWork → ralphLoop
**tool.execute.before**: claudeCodeHooks → nonInteractiveEnv → commentChecker → directoryAgentsInjector → rulesInjector
**tool.execute.after**: editErrorRecovery → delegateTaskRetry → commentChecker → toolOutputTruncator → claudeCodeHooks
- **UserPromptSubmit**: keywordDetector → claudeCodeHooks → autoSlashCommand → startWork
- **PreToolUse**: questionLabelTruncator → claudeCodeHooks → nonInteractiveEnv → commentChecker → directoryAgentsInjector → directoryReadmeInjector → rulesInjector → prometheusMdOnly → sisyphusJuniorNotepad → atlasHook
- **PostToolUse**: claudeCodeHooks → toolOutputTruncator → contextWindowMonitor → commentChecker → directoryAgentsInjector → directoryReadmeInjector → rulesInjector → emptyTaskResponseDetector → agentUsageReminder → interactiveBashSession → editErrorRecovery → delegateTaskRetry → atlasHook → taskResumeInfo
## HOW TO ADD
1. Create `src/hooks/name/` with `index.ts` exporting `createMyHook(ctx)`
2. Add hook name to `HookNameSchema` in `src/config/schema.ts`
3. Register in `src/index.ts`:
```typescript
const myHook = isHookEnabled("my-hook") ? createMyHook(ctx) : null
```
3. Register in `src/index.ts` and add to relevant lifecycle methods
## PATTERNS
## HOOK PATTERNS
- **Session-scoped state**: `Map<sessionID, Set<string>>`
- **Conditional execution**: Check `input.tool` before processing
- **Output modification**: `output.output += "\n${REMINDER}"`
**Simple Single-Event**:
```typescript
export function createToolOutputTruncatorHook(ctx) {
return { "tool.execute.after": async (input, output) => { ... } }
}
```
**Multi-Event with State**:
```typescript
export function createThinkModeHook() {
const state = new Map<string, ThinkModeState>()
return {
"chat.params": async (output, sessionID) => { ... },
"event": async ({ event }) => { /* cleanup */ }
}
}
```
## ANTI-PATTERNS
- **Blocking non-critical**: Use PostToolUse warnings instead
- **Heavy computation**: Keep PreToolUse light
- **Redundant injection**: Track injected files
- **Heavy computation**: Keep PreToolUse light to avoid latency
- **Redundant injection**: Track injected files to avoid context bloat
- **Direct state mutation**: Use `output.output +=` instead of replacing
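
**Session-Scoped State (sketch)**: a minimal, hypothetical example that combines the points above: per-session state in a `Map<sessionID, Set<string>>`, an early `input.tool` check, and appending to `output.output` rather than replacing it. The hook name, the `write` tool filter, and the reminder text are assumptions for illustration, and the input/output shapes are simplified.

```typescript
// Simplified shapes; the real plugin types carry more fields.
interface ToolAfterInput {
  tool: string
  sessionID: string
}
interface ToolAfterOutput {
  output: string
}

const EXAMPLE_REMINDER = "[reminder] example text" // hypothetical

// Appends the reminder at most once per session and tool. Returns true when it appended.
function appendReminderOnce(
  reminded: Map<string, Set<string>>,
  input: ToolAfterInput,
  output: ToolAfterOutput
): boolean {
  if (input.tool !== "write") return false // conditional execution: only act on one tool
  const seen = reminded.get(input.sessionID) ?? new Set<string>()
  if (seen.has(input.tool)) return false // already reminded in this session
  seen.add(input.tool)
  reminded.set(input.sessionID, seen)
  output.output += `\n${EXAMPLE_REMINDER}` // append, never replace
  return true
}

export function createExampleReminderHook() {
  const reminded = new Map<string, Set<string>>()
  return {
    "tool.execute.after": async (input: ToolAfterInput, output: ToolAfterOutput) => {
      appendReminderOnce(reminded, input, output)
    },
  }
}
```

Keeping the state inside the factory closure gives each hook instance its own map, which also makes the hook easy to reset in tests.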


@@ -373,7 +373,7 @@ describe("atlas hook", () => {
const ORCHESTRATOR_SESSION = "orchestrator-write-test"
beforeEach(() => {
setupMessageStorage(ORCHESTRATOR_SESSION, "Atlas")
setupMessageStorage(ORCHESTRATOR_SESSION, "atlas")
})
afterEach(() => {
@@ -396,9 +396,9 @@ describe("atlas hook", () => {
)
// #then
expect(output.output).toContain("DELEGATION REQUIRED")
expect(output.output).toContain("ORCHESTRATOR, not an IMPLEMENTER")
expect(output.output).toContain("delegate_task")
expect(output.output).toContain("delegate_task")
})
test("should append delegation reminder when orchestrator edits outside .sisyphus/", async () => {
@@ -417,7 +417,7 @@ describe("atlas hook", () => {
)
// #then
expect(output.output).toContain("DELEGATION REQUIRED")
expect(output.output).toContain("ORCHESTRATOR, not an IMPLEMENTER")
})
test("should NOT append reminder when orchestrator writes inside .sisyphus/", async () => {
@@ -438,13 +438,13 @@ describe("atlas hook", () => {
// #then
expect(output.output).toBe(originalOutput)
expect(output.output).not.toContain("DELEGATION REQUIRED")
expect(output.output).not.toContain("ORCHESTRATOR, not an IMPLEMENTER")
})
test("should NOT append reminder when non-orchestrator writes outside .sisyphus/", async () => {
// #given
const nonOrchestratorSession = "non-orchestrator-session"
setupMessageStorage(nonOrchestratorSession, "Sisyphus-Junior")
setupMessageStorage(nonOrchestratorSession, "sisyphus-junior")
const hook = createAtlasHook(createMockPluginInput())
const originalOutput = "File written successfully"
@@ -462,7 +462,7 @@ describe("atlas hook", () => {
// #then
expect(output.output).toBe(originalOutput)
expect(output.output).not.toContain("DELEGATION REQUIRED")
expect(output.output).not.toContain("ORCHESTRATOR, not an IMPLEMENTER")
cleanupMessageStorage(nonOrchestratorSession)
})
@@ -526,7 +526,7 @@ describe("atlas hook", () => {
// #then
expect(output.output).toBe(originalOutput)
expect(output.output).not.toContain("DELEGATION REQUIRED")
expect(output.output).not.toContain("ORCHESTRATOR, not an IMPLEMENTER")
})
test("should NOT append reminder when orchestrator writes inside .sisyphus with mixed separators", async () => {
@@ -547,7 +547,7 @@ describe("atlas hook", () => {
// #then
expect(output.output).toBe(originalOutput)
expect(output.output).not.toContain("DELEGATION REQUIRED")
expect(output.output).not.toContain("ORCHESTRATOR, not an IMPLEMENTER")
})
test("should NOT append reminder for absolute Windows path inside .sisyphus\\", async () => {
@@ -568,7 +568,7 @@ describe("atlas hook", () => {
// #then
expect(output.output).toBe(originalOutput)
expect(output.output).not.toContain("DELEGATION REQUIRED")
expect(output.output).not.toContain("ORCHESTRATOR, not an IMPLEMENTER")
})
test("should append reminder for Windows path outside .sisyphus\\", async () => {
@@ -587,7 +587,7 @@ describe("atlas hook", () => {
)
// #then
expect(output.output).toContain("DELEGATION REQUIRED")
expect(output.output).toContain("ORCHESTRATOR, not an IMPLEMENTER")
})
})
})
@@ -601,7 +601,7 @@ describe("atlas hook", () => {
getMainSessionID: () => MAIN_SESSION_ID,
subagentSessions: new Set<string>(),
}))
setupMessageStorage(MAIN_SESSION_ID, "Atlas")
setupMessageStorage(MAIN_SESSION_ID, "atlas")
})
afterEach(() => {
@@ -636,7 +636,7 @@ describe("atlas hook", () => {
expect(mockInput._promptMock).toHaveBeenCalled()
const callArgs = mockInput._promptMock.mock.calls[0][0]
expect(callArgs.path.id).toBe(MAIN_SESSION_ID)
expect(callArgs.body.parts[0].text).toContain("BOULDER CONTINUATION")
expect(callArgs.body.parts[0].text).toContain("incomplete tasks")
expect(callArgs.body.parts[0].text).toContain("2 remaining")
})
@@ -845,7 +845,7 @@ describe("atlas hook", () => {
// #given - last agent is NOT Atlas
cleanupMessageStorage(MAIN_SESSION_ID)
setupMessageStorage(MAIN_SESSION_ID, "Sisyphus")
setupMessageStorage(MAIN_SESSION_ID, "sisyphus")
const mockInput = createMockPluginInput()
const hook = createAtlasHook(mockInput)


@@ -11,6 +11,7 @@ import { getMainSessionID, subagentSessions } from "../../features/claude-code-s
import { findNearestMessageWithFields, MESSAGE_STORAGE } from "../../features/hook-message-injector"
import { log } from "../../shared/logger"
import { createSystemDirective, SYSTEM_DIRECTIVE_PREFIX, SystemDirectiveTypes } from "../../shared/system-directive"
import { isCallerOrchestrator, getMessageDir } from "../../shared/session-utils"
import type { BackgroundManager } from "../../features/background-agent"
export const HOOK_NAME = "atlas"
@@ -380,28 +381,6 @@ interface ToolExecuteAfterOutput {
metadata: Record<string, unknown>
}
function getMessageDir(sessionID: string): string | null {
if (!existsSync(MESSAGE_STORAGE)) return null
const directPath = join(MESSAGE_STORAGE, sessionID)
if (existsSync(directPath)) return directPath
for (const dir of readdirSync(MESSAGE_STORAGE)) {
const sessionPath = join(MESSAGE_STORAGE, dir, sessionID)
if (existsSync(sessionPath)) return sessionPath
}
return null
}
function isCallerOrchestrator(sessionID?: string): boolean {
if (!sessionID) return false
const messageDir = getMessageDir(sessionID)
if (!messageDir) return false
const nearest = findNearestMessageWithFields(messageDir)
return nearest?.agent?.toLowerCase() === "atlas"
}
interface SessionState {
lastEventWasAbortError?: boolean
lastContinuationInjectedAt?: number
@@ -672,7 +651,7 @@ export function createAtlasHook(
if (input.tool === "delegate_task") {
const prompt = output.args.prompt as string | undefined
if (prompt && !prompt.includes(SYSTEM_DIRECTIVE_PREFIX)) {
output.args.prompt = prompt + `\n<system-reminder>${SINGLE_TASK_DIRECTIVE}</system-reminder>`
output.args.prompt = `<system-reminder>${SINGLE_TASK_DIRECTIVE}</system-reminder>\n` + prompt
log(`[${HOOK_NAME}] Injected single-task directive to delegate_task`, {
sessionID: input.sessionID,
})
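
The last hunk moves the single-task directive from the end of the delegated prompt to the front, and skips injection when the prompt already carries a system directive. A minimal sketch of that behavior in isolation follows. The constant values are placeholders, since the real `SYSTEM_DIRECTIVE_PREFIX` and `SINGLE_TASK_DIRECTIVE` strings are not shown in this diff, and `withSingleTaskDirective` is a hypothetical helper name.

```typescript
// Placeholder values; the real constants live in shared/system-directive and the atlas hook.
const EXAMPLE_DIRECTIVE_PREFIX = "[SYSTEM DIRECTIVE"
const EXAMPLE_SINGLE_TASK_DIRECTIVE = `${EXAMPLE_DIRECTIVE_PREFIX}: SINGLE TASK ONLY]`

// Prepends the directive unless the prompt is empty or already contains a directive.
function withSingleTaskDirective(prompt: string | undefined): string | undefined {
  if (!prompt || prompt.includes(EXAMPLE_DIRECTIVE_PREFIX)) return prompt
  return `<system-reminder>${EXAMPLE_SINGLE_TASK_DIRECTIVE}</system-reminder>\n` + prompt
}

console.log(withSingleTaskDirective("Implement the parser"))
```

Because the injected text contains the prefix, applying the helper twice leaves the prompt unchanged after the first call.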


@@ -170,6 +170,20 @@ export function getCachedVersion(): string | null {
log("[auto-update-checker] Failed to resolve version from current directory:", err)
}
// Fallback for compiled binaries (npm global install)
// process.execPath points to the actual binary location
try {
const execDir = path.dirname(fs.realpathSync(process.execPath))
const pkgPath = findPackageJsonUp(execDir)
if (pkgPath) {
const content = fs.readFileSync(pkgPath, "utf-8")
const pkg = JSON.parse(content) as PackageJson
if (pkg.version) return pkg.version
}
} catch (err) {
log("[auto-update-checker] Failed to resolve version from execPath:", err)
}
return null
}
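
The fallback above relies on `findPackageJsonUp`, whose body is outside this hunk. As a rough sketch of what such a helper might look like (an assumption, not the project's implementation), the function below walks up from a starting directory until it finds a `package.json`, reaches the filesystem root, or exhausts a level limit. The name `findPackageJsonUpSketch` and the `maxLevels` parameter are hypothetical.

```typescript
import * as fs from "node:fs"
import * as os from "node:os"
import * as path from "node:path"

// Walks up from startDir looking for package.json; returns its path or null.
function findPackageJsonUpSketch(startDir: string, maxLevels = 10): string | null {
  let current = path.resolve(startDir)
  for (let level = 0; level < maxLevels; level++) {
    const candidate = path.join(current, "package.json")
    if (fs.existsSync(candidate)) return candidate
    const parent = path.dirname(current)
    if (parent === current) return null // reached the filesystem root
    current = parent
  }
  return null
}

// Usage: build a throwaway layout and resolve the version from a nested directory.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "pkg-demo-"))
fs.writeFileSync(path.join(demoRoot, "package.json"), JSON.stringify({ version: "1.2.3" }))
const demoBinDir = path.join(demoRoot, "dist", "bin")
fs.mkdirSync(demoBinDir, { recursive: true })

const demoPkgPath = findPackageJsonUpSketch(demoBinDir)
if (demoPkgPath) {
  const pkg = JSON.parse(fs.readFileSync(demoPkgPath, "utf-8")) as { version?: string }
  console.log(pkg.version) // "1.2.3"
}
```

A real helper would likely also confirm that the `package.json` it found belongs to the expected package before trusting its version.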


@@ -6,6 +6,7 @@ import { log } from "../../shared/logger"
import { getConfigLoadErrors, clearConfigLoadErrors } from "../../shared/config-errors"
import { runBunInstall } from "../../cli/config-manager"
import { isModelCacheAvailable } from "../../shared/model-availability"
import { hasConnectedProvidersCache, updateConnectedProvidersCache } from "../../shared/connected-providers-cache"
import type { AutoUpdateCheckerOptions } from "./types"
const SISYPHUS_SPINNER = ["·", "•", "●", "○", "◌", "◦", " "]
@@ -77,6 +78,7 @@ export function createAutoUpdateCheckerHook(ctx: PluginInput, options: AutoUpdat
await showConfigErrorsIfAny(ctx)
await showModelCacheWarningIfNeeded(ctx)
await updateAndShowConnectedProvidersCacheStatus(ctx)
if (localDevVersion) {
if (showStartupToast) {
@@ -186,6 +188,29 @@ async function showModelCacheWarningIfNeeded(ctx: PluginInput): Promise<void> {
log("[auto-update-checker] Model cache warning shown")
}
async function updateAndShowConnectedProvidersCacheStatus(ctx: PluginInput): Promise<void> {
const hadCache = hasConnectedProvidersCache()
updateConnectedProvidersCache(ctx.client).catch(() => {})
if (!hadCache) {
await ctx.client.tui
.showToast({
body: {
title: "Connected Providers Cache",
message: "Building provider cache for the first time. Restart OpenCode for full model filtering.",
variant: "info" as const,
duration: 8000,
},
})
.catch(() => {})
log("[auto-update-checker] Connected providers cache toast shown (first run)")
} else {
log("[auto-update-checker] Connected providers cache exists, updating in background")
}
}
async function showConfigErrorsIfAny(ctx: PluginInput): Promise<void> {
const errors = getConfigLoadErrors()
if (errors.length === 0) return

@@ -0,0 +1,346 @@
import { describe, expect, test, beforeEach, afterEach, spyOn } from "bun:test"
import { createCategorySkillReminderHook } from "./index"
import { updateSessionAgent, clearSessionAgent, _resetForTesting } from "../../features/claude-code-session-state"
import * as sharedModule from "../../shared"
describe("category-skill-reminder hook", () => {
let logCalls: Array<{ msg: string; data?: unknown }>
let logSpy: ReturnType<typeof spyOn>
beforeEach(() => {
_resetForTesting()
logCalls = []
logSpy = spyOn(sharedModule, "log").mockImplementation((msg: string, data?: unknown) => {
logCalls.push({ msg, data })
})
})
afterEach(() => {
logSpy?.mockRestore()
})
function createMockPluginInput() {
return {
client: {
tui: {
showToast: async () => {},
},
},
} as any
}
describe("target agent detection", () => {
test("should inject reminder for sisyphus agent after 3 tool calls", async () => {
// #given - sisyphus agent session with multiple tool calls
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "sisyphus-session"
updateSessionAgent(sessionID, "Sisyphus")
const output = { title: "", output: "file content", metadata: {} }
// #when - 3 edit tool calls are made
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "1" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "2" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "3" }, output)
// #then - reminder should be injected
expect(output.output).toContain("[Category+Skill Reminder]")
expect(output.output).toContain("delegate_task")
clearSessionAgent(sessionID)
})
test("should inject reminder for atlas agent", async () => {
// #given - atlas agent session
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "atlas-session"
updateSessionAgent(sessionID, "Atlas")
const output = { title: "", output: "result", metadata: {} }
// #when - 3 tool calls are made
await hook["tool.execute.after"]({ tool: "bash", sessionID, callID: "1" }, output)
await hook["tool.execute.after"]({ tool: "bash", sessionID, callID: "2" }, output)
await hook["tool.execute.after"]({ tool: "bash", sessionID, callID: "3" }, output)
// #then - reminder should be injected
expect(output.output).toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
test("should inject reminder for sisyphus-junior agent", async () => {
// #given - sisyphus-junior agent session
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "junior-session"
updateSessionAgent(sessionID, "sisyphus-junior")
const output = { title: "", output: "result", metadata: {} }
// #when - 3 tool calls are made
await hook["tool.execute.after"]({ tool: "write", sessionID, callID: "1" }, output)
await hook["tool.execute.after"]({ tool: "write", sessionID, callID: "2" }, output)
await hook["tool.execute.after"]({ tool: "write", sessionID, callID: "3" }, output)
// #then - reminder should be injected
expect(output.output).toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
test("should NOT inject reminder for non-target agents", async () => {
// #given - librarian agent session (not a target)
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "librarian-session"
updateSessionAgent(sessionID, "librarian")
const output = { title: "", output: "result", metadata: {} }
// #when - 3 tool calls are made
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "1" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "2" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "3" }, output)
// #then - reminder should NOT be injected
expect(output.output).not.toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
test("should detect agent from input.agent when session state is empty", async () => {
// #given - no session state, agent provided in input
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "input-agent-session"
const output = { title: "", output: "result", metadata: {} }
// #when - 3 tool calls with agent in input
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "1", agent: "Sisyphus" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "2", agent: "Sisyphus" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "3", agent: "Sisyphus" }, output)
// #then - reminder should be injected
expect(output.output).toContain("[Category+Skill Reminder]")
})
})
describe("delegation tool tracking", () => {
test("should NOT inject reminder if delegate_task is used", async () => {
// #given - sisyphus agent that uses delegate_task
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "delegation-session"
updateSessionAgent(sessionID, "Sisyphus")
const output = { title: "", output: "result", metadata: {} }
// #when - delegate_task is used, then more tool calls
await hook["tool.execute.after"]({ tool: "delegate_task", sessionID, callID: "1" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "2" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "3" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "4" }, output)
// #then - reminder should NOT be injected (delegation was used)
expect(output.output).not.toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
test("should NOT inject reminder if call_omo_agent is used", async () => {
// #given - sisyphus agent that uses call_omo_agent
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "omo-agent-session"
updateSessionAgent(sessionID, "Sisyphus")
const output = { title: "", output: "result", metadata: {} }
// #when - call_omo_agent is used first
await hook["tool.execute.after"]({ tool: "call_omo_agent", sessionID, callID: "1" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "2" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "3" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "4" }, output)
// #then - reminder should NOT be injected
expect(output.output).not.toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
test("should NOT inject reminder if task tool is used", async () => {
// #given - sisyphus agent that uses task tool
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "task-session"
updateSessionAgent(sessionID, "Sisyphus")
const output = { title: "", output: "result", metadata: {} }
// #when - task tool is used
await hook["tool.execute.after"]({ tool: "task", sessionID, callID: "1" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "2" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "3" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "4" }, output)
// #then - reminder should NOT be injected
expect(output.output).not.toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
})
describe("tool call counting", () => {
test("should NOT inject reminder before 3 tool calls", async () => {
// #given - sisyphus agent with only 2 tool calls
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "few-calls-session"
updateSessionAgent(sessionID, "Sisyphus")
const output = { title: "", output: "result", metadata: {} }
// #when - only 2 tool calls are made
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "1" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "2" }, output)
// #then - reminder should NOT be injected yet
expect(output.output).not.toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
test("should only inject reminder once per session", async () => {
// #given - sisyphus agent session
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "once-session"
updateSessionAgent(sessionID, "Sisyphus")
const output1 = { title: "", output: "result1", metadata: {} }
const output2 = { title: "", output: "result2", metadata: {} }
// #when - 6 tool calls are made (should trigger at 3, not again at 6)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "1" }, output1)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "2" }, output1)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "3" }, output1)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "4" }, output2)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "5" }, output2)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "6" }, output2)
// #then - reminder should be in output1 but not output2
expect(output1.output).toContain("[Category+Skill Reminder]")
expect(output2.output).not.toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
test("should only count delegatable work tools", async () => {
// #given - sisyphus agent with mixed tool calls
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "mixed-tools-session"
updateSessionAgent(sessionID, "Sisyphus")
const output = { title: "", output: "result", metadata: {} }
// #when - non-delegatable tools are called (should not count)
await hook["tool.execute.after"]({ tool: "lsp_goto_definition", sessionID, callID: "1" }, output)
await hook["tool.execute.after"]({ tool: "lsp_find_references", sessionID, callID: "2" }, output)
await hook["tool.execute.after"]({ tool: "lsp_symbols", sessionID, callID: "3" }, output)
// #then - reminder should NOT be injected (LSP tools don't count)
expect(output.output).not.toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
})
describe("event handling", () => {
test("should reset state on session.deleted event", async () => {
// #given - sisyphus agent with reminder already shown
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "delete-session"
updateSessionAgent(sessionID, "Sisyphus")
const output1 = { title: "", output: "result1", metadata: {} }
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "1" }, output1)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "2" }, output1)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "3" }, output1)
expect(output1.output).toContain("[Category+Skill Reminder]")
// #when - session is deleted and new session starts
await hook.event({ event: { type: "session.deleted", properties: { info: { id: sessionID } } } })
const output2 = { title: "", output: "result2", metadata: {} }
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "4" }, output2)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "5" }, output2)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "6" }, output2)
// #then - reminder should be shown again (state was reset)
expect(output2.output).toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
test("should reset state on session.compacted event", async () => {
// #given - sisyphus agent with reminder already shown
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "compact-session"
updateSessionAgent(sessionID, "Sisyphus")
const output1 = { title: "", output: "result1", metadata: {} }
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "1" }, output1)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "2" }, output1)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "3" }, output1)
expect(output1.output).toContain("[Category+Skill Reminder]")
// #when - session is compacted
await hook.event({ event: { type: "session.compacted", properties: { sessionID } } })
const output2 = { title: "", output: "result2", metadata: {} }
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "4" }, output2)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "5" }, output2)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "6" }, output2)
// #then - reminder should be shown again (state was reset)
expect(output2.output).toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
})
describe("case insensitivity", () => {
test("should handle tool names case-insensitively", async () => {
// #given - sisyphus agent with mixed case tool names
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "case-session"
updateSessionAgent(sessionID, "Sisyphus")
const output = { title: "", output: "result", metadata: {} }
// #when - tool calls with different cases
await hook["tool.execute.after"]({ tool: "EDIT", sessionID, callID: "1" }, output)
await hook["tool.execute.after"]({ tool: "Edit", sessionID, callID: "2" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "3" }, output)
// #then - reminder should be injected (all counted)
expect(output.output).toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
test("should handle delegation tool names case-insensitively", async () => {
// #given - sisyphus agent using DELEGATE_TASK in uppercase
const hook = createCategorySkillReminderHook(createMockPluginInput())
const sessionID = "case-delegate-session"
updateSessionAgent(sessionID, "Sisyphus")
const output = { title: "", output: "result", metadata: {} }
// #when - DELEGATE_TASK in uppercase is used
await hook["tool.execute.after"]({ tool: "DELEGATE_TASK", sessionID, callID: "1" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "2" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "3" }, output)
await hook["tool.execute.after"]({ tool: "edit", sessionID, callID: "4" }, output)
// #then - reminder should NOT be injected (delegation was detected)
expect(output.output).not.toContain("[Category+Skill Reminder]")
clearSessionAgent(sessionID)
})
})
})

@@ -0,0 +1,165 @@
import type { PluginInput } from "@opencode-ai/plugin"
import { getSessionAgent } from "../../features/claude-code-session-state"
import { log } from "../../shared"
/**
* Target agents that should receive category+skill reminders.
* These are orchestrator agents that delegate work to specialized agents.
*/
const TARGET_AGENTS = new Set([
"sisyphus",
"sisyphus-junior",
"atlas",
])
/**
* Tools that indicate the agent is doing work that could potentially be delegated.
* When these tools are used, we remind the agent about the category+skill system.
*/
const DELEGATABLE_WORK_TOOLS = new Set([
"edit",
"write",
"bash",
"read",
"grep",
"glob",
])
/**
* Tools that indicate the agent is already using delegation properly.
*/
const DELEGATION_TOOLS = new Set([
"delegate_task",
"call_omo_agent",
"task",
])
const REMINDER_MESSAGE = `
[Category+Skill Reminder]
You are an orchestrator agent. Consider whether this work should be delegated:
**DELEGATE when:**
- UI/Frontend work → category: "visual-engineering", skills: ["frontend-ui-ux"]
- Complex logic/architecture → category: "ultrabrain"
- Quick/trivial tasks → category: "quick"
- Git operations → skills: ["git-master"]
- Browser automation → skills: ["playwright"] or ["agent-browser"]
**DO IT YOURSELF when:**
- Gathering context/exploring codebase
- Simple edits that are part of a larger task you're coordinating
- Tasks requiring your full context understanding
Example delegation:
\`\`\`
delegate_task(
category="visual-engineering",
load_skills=["frontend-ui-ux"],
description="Implement responsive navbar with animations",
run_in_background=true
)
\`\`\`
`
interface ToolExecuteInput {
tool: string
sessionID: string
callID: string
agent?: string
}
interface ToolExecuteOutput {
title: string
output: string
metadata: unknown
}
interface SessionState {
delegationUsed: boolean
reminderShown: boolean
toolCallCount: number
}
export function createCategorySkillReminderHook(_ctx: PluginInput) {
const sessionStates = new Map<string, SessionState>()
function getOrCreateState(sessionID: string): SessionState {
if (!sessionStates.has(sessionID)) {
sessionStates.set(sessionID, {
delegationUsed: false,
reminderShown: false,
toolCallCount: 0,
})
}
return sessionStates.get(sessionID)!
}
function isTargetAgent(sessionID: string, inputAgent?: string): boolean {
const agent = getSessionAgent(sessionID) ?? inputAgent
if (!agent) return false
const agentLower = agent.toLowerCase()
return TARGET_AGENTS.has(agentLower) ||
agentLower.includes("sisyphus") ||
agentLower.includes("atlas")
}
const toolExecuteAfter = async (
input: ToolExecuteInput,
output: ToolExecuteOutput,
) => {
const { tool, sessionID } = input
const toolLower = tool.toLowerCase()
if (!isTargetAgent(sessionID, input.agent)) {
return
}
const state = getOrCreateState(sessionID)
if (DELEGATION_TOOLS.has(toolLower)) {
state.delegationUsed = true
log("[category-skill-reminder] Delegation tool used", { sessionID, tool })
return
}
if (!DELEGATABLE_WORK_TOOLS.has(toolLower)) {
return
}
state.toolCallCount++
if (state.toolCallCount >= 3 && !state.delegationUsed && !state.reminderShown) {
output.output += REMINDER_MESSAGE
state.reminderShown = true
log("[category-skill-reminder] Reminder injected", {
sessionID,
toolCallCount: state.toolCallCount
})
}
}
const eventHandler = async ({ event }: { event: { type: string; properties?: unknown } }) => {
const props = event.properties as Record<string, unknown> | undefined
if (event.type === "session.deleted") {
const sessionInfo = props?.info as { id?: string } | undefined
if (sessionInfo?.id) {
sessionStates.delete(sessionInfo.id)
}
}
if (event.type === "session.compacted") {
const sessionID = (props?.sessionID ??
(props?.info as { id?: string } | undefined)?.id) as string | undefined
if (sessionID) {
sessionStates.delete(sessionID)
}
}
}
return {
"tool.execute.after": toolExecuteAfter,
event: eventHandler,
}
}

@@ -1,51 +1,48 @@
# CLAUDE CODE HOOKS COMPATIBILITY
## OVERVIEW
Full Claude Code settings.json hook compatibility. 5 lifecycle events: PreToolUse, PostToolUse, UserPromptSubmit, Stop, PreCompact.
Full Claude Code `settings.json` hook compatibility layer. Intercepts OpenCode events to execute external scripts/commands defined in Claude Code configuration.
## STRUCTURE
```
claude-code-hooks/
├── index.ts # Main factory (401 lines)
├── config.ts # Loads ~/.claude/settings.json
├── config-loader.ts # Extended config
├── config-loader.ts # Extended config (disabledHooks)
├── pre-tool-use.ts # PreToolUse executor
├── post-tool-use.ts # PostToolUse executor
├── user-prompt-submit.ts # UserPromptSubmit executor
├── stop.ts # Stop hook executor
├── stop.ts # Stop hook executor (with active state tracking)
├── pre-compact.ts # PreCompact executor
├── transcript.ts # Tool use recording
├── tool-input-cache.ts # Pre→post caching
├── types.ts # Hook types
└── todo.ts # Todo JSON fix
├── tool-input-cache.ts # Pre→post input caching
└── types.ts # Hook & IO type definitions
```
## HOOK LIFECYCLE
| Event | When | Can Block | Context |
|-------|------|-----------|---------|
| PreToolUse | Before tool | Yes | sessionId, toolName, toolInput |
| PostToolUse | After tool | Warn | + toolOutput, transcriptPath |
| UserPromptSubmit | On message | Yes | sessionId, prompt, parts |
| Stop | Session idle | inject | sessionId, parentSessionId |
| PreCompact | Before summarize | No | sessionId |
| Event | Timing | Can Block | Context Provided |
|-------|--------|-----------|------------------|
| PreToolUse | Before tool exec | Yes | sessionId, toolName, toolInput, cwd |
| PostToolUse | After tool exec | Warn | + toolOutput, transcriptPath |
| UserPromptSubmit | On message send | Yes | sessionId, prompt, parts, cwd |
| Stop | Session idle/end | Inject | sessionId, parentSessionId, cwd |
| PreCompact | Before summarize | No | sessionId, cwd |
## CONFIG SOURCES
Priority (highest first):
1. `.claude/settings.json` (project)
2. `~/.claude/settings.json` (user)
1. `.claude/settings.json` (Project-local)
2. `~/.claude/settings.json` (Global user)
## HOOK EXECUTION
1. Hooks loaded from settings.json
2. Matchers filter by tool name
3. Commands via subprocess with `$SESSION_ID`, `$TOOL_NAME`
4. Exit codes: 0=pass, 1=warn, 2=block
- **Matchers**: Hooks filter by tool name or event type via regex/glob.
- **Commands**: Executed via subprocess with env vars (`$SESSION_ID`, `$TOOL_NAME`).
- **Exit Codes**:
- `0`: Pass (Success)
- `1`: Warn (Continue with system message)
- `2`: Block (Abort operation/prompt)
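
The exit-code convention in this section can be sketched as a small mapping plus a subprocess call. This is an illustrative sketch only: `HookDecision`, `interpretExitCode`, and `runHookCommand` are hypothetical names, running the command through `sh -c` is an assumption, and treating every exit code other than `0` and `2` as a warning is also an assumption (the section only defines `0`, `1`, and `2`).

```typescript
import { spawnSync } from "node:child_process"

// Hypothetical result type for a single hook command.
type HookDecision = "pass" | "warn" | "block"

// Maps a process exit code to a decision: 0 = pass, 2 = block.
// ASSUMPTION: any other code (including 1, or null when killed by a signal) is a warning.
function interpretExitCode(code: number | null): HookDecision {
  if (code === 0) return "pass"
  if (code === 2) return "block"
  return "warn"
}

// Runs a hook command in a subprocess, exposing SESSION_ID and TOOL_NAME as env vars.
// ASSUMPTION: the command is executed via `sh -c`; the real executor may differ.
function runHookCommand(command: string, sessionId: string, toolName: string): HookDecision {
  const result = spawnSync("sh", ["-c", command], {
    env: { ...process.env, SESSION_ID: sessionId, TOOL_NAME: toolName },
    encoding: "utf-8",
  })
  return interpretExitCode(result.status)
}
```

For example, a command such as `test "$TOOL_NAME" = "bash" && exit 2 || exit 0` would block `bash` calls and pass everything else under this sketch.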
## ANTI-PATTERNS
- **Heavy PreToolUse**: Runs before EVERY tool call
- **Blocking non-critical**: Use PostToolUse warnings
- **Heavy PreToolUse**: Runs before EVERY tool; keep logic light to avoid latency.
- **Blocking non-critical**: Prefer PostToolUse warnings for non-fatal issues.
- **Direct state mutation**: Use `updatedInput` in PreToolUse instead of side effects.
- **Ignoring Exit Codes**: Ensure scripts return `2` to properly block sensitive tools.

@@ -0,0 +1,102 @@
import { describe, expect, it, mock, beforeEach } from "bun:test"
// Mock dependencies before importing
const mockInjectHookMessage = mock(() => true)
mock.module("../../features/hook-message-injector", () => ({
injectHookMessage: mockInjectHookMessage,
}))
mock.module("../../shared/logger", () => ({
log: () => {},
}))
mock.module("../../shared/system-directive", () => ({
createSystemDirective: (type: string) => `[DIRECTIVE:${type}]`,
SystemDirectiveTypes: {
TODO_CONTINUATION: "TODO CONTINUATION",
RALPH_LOOP: "RALPH LOOP",
BOULDER_CONTINUATION: "BOULDER CONTINUATION",
DELEGATION_REQUIRED: "DELEGATION REQUIRED",
SINGLE_TASK_ONLY: "SINGLE TASK ONLY",
COMPACTION_CONTEXT: "COMPACTION CONTEXT",
CONTEXT_WINDOW_MONITOR: "CONTEXT WINDOW MONITOR",
PROMETHEUS_READ_ONLY: "PROMETHEUS READ-ONLY",
},
}))
import { createCompactionContextInjector } from "./index"
import type { SummarizeContext } from "./index"
describe("createCompactionContextInjector", () => {
beforeEach(() => {
mockInjectHookMessage.mockClear()
})
describe("Agent Verification State preservation", () => {
it("includes Agent Verification State section in compaction prompt", async () => {
// given
const injector = createCompactionContextInjector()
const context: SummarizeContext = {
sessionID: "test-session",
providerID: "anthropic",
modelID: "claude-sonnet-4-5",
usageRatio: 0.85,
directory: "/test/dir",
}
// when
await injector(context)
// then
expect(mockInjectHookMessage).toHaveBeenCalledTimes(1)
const calls = mockInjectHookMessage.mock.calls as unknown as [string, string, unknown][]
const injectedPrompt = calls[0]?.[1] ?? ""
expect(injectedPrompt).toContain("Agent Verification State")
expect(injectedPrompt).toContain("Current Agent")
expect(injectedPrompt).toContain("Verification Progress")
})
it("includes Momus-specific context for reviewer agents", async () => {
// given
const injector = createCompactionContextInjector()
const context: SummarizeContext = {
sessionID: "test-session",
providerID: "anthropic",
modelID: "claude-sonnet-4-5",
usageRatio: 0.9,
directory: "/test/dir",
}
// when
await injector(context)
// then
const calls = mockInjectHookMessage.mock.calls as unknown as [string, string, unknown][]
const injectedPrompt = calls[0]?.[1] ?? ""
expect(injectedPrompt).toContain("Previous Rejections")
expect(injectedPrompt).toContain("Acceptance Status")
expect(injectedPrompt).toContain("reviewer agents")
})
it("preserves file verification progress in compaction prompt", async () => {
// given
const injector = createCompactionContextInjector()
const context: SummarizeContext = {
sessionID: "test-session",
providerID: "anthropic",
modelID: "claude-sonnet-4-5",
usageRatio: 0.95,
directory: "/test/dir",
}
// when
await injector(context)
// then
const calls = mockInjectHookMessage.mock.calls as unknown as [string, string, unknown][]
const injectedPrompt = calls[0]?.[1] ?? ""
expect(injectedPrompt).toContain("Pending Verifications")
expect(injectedPrompt).toContain("Files already verified")
})
})
})

@@ -33,12 +33,27 @@ When summarizing this session, you MUST include the following sections in your s
- Pending items from the original request
- Follow-up tasks identified during the work
## 5. MUST NOT Do (Critical Constraints)
## 5. Active Working Context (For Seamless Continuation)
- **Files**: Paths of files currently being edited or frequently referenced
- **Code in Progress**: Key code snippets, function signatures, or data structures under active development
- **External References**: Documentation URLs, library APIs, or external resources being consulted
- **State & Variables**: Important variable names, configuration values, or runtime state relevant to ongoing work
## 6. MUST NOT Do (Critical Constraints)
- Things that were explicitly forbidden
- Approaches that failed and should not be retried
- User's explicit restrictions or preferences
- Anti-patterns identified during the session
## 7. Agent Verification State (Critical for Reviewers)
- **Current Agent**: What agent is running (momus, oracle, etc.)
- **Verification Progress**: Files already verified/validated
- **Pending Verifications**: Files still needing verification
- **Previous Rejections**: If reviewer agent, what was rejected and why
- **Acceptance Status**: Current state of review process
This section is CRITICAL for reviewer agents (momus, oracle) to maintain continuity.
This context is critical for maintaining continuity after compaction.
`

@@ -22,12 +22,15 @@ export { createNonInteractiveEnvHook } from "./non-interactive-env";
export { createInteractiveBashSessionHook } from "./interactive-bash-session";
export { createThinkingBlockValidatorHook } from "./thinking-block-validator";
export { createCategorySkillReminderHook } from "./category-skill-reminder";
export { createRalphLoopHook, type RalphLoopHook } from "./ralph-loop";
export { createAutoSlashCommandHook } from "./auto-slash-command";
export { createEditErrorRecoveryHook } from "./edit-error-recovery";
export { createPrometheusMdOnlyHook } from "./prometheus-md-only";
export { createSisyphusJuniorNotepadHook } from "./sisyphus-junior-notepad";
export { createTaskResumeInfoHook } from "./task-resume-info";
export { createStartWorkHook } from "./start-work";
export { createAtlasHook } from "./atlas";
export { createDelegateTaskRetryHook } from "./delegate-task-retry";
export { createQuestionLabelTruncatorHook } from "./question-label-truncator";
export { createSubagentQuestionBlockerHook } from "./subagent-question-blocker";

@@ -55,7 +55,7 @@ You ARE the planner. Your job: create bulletproof work plans.
* Determines if the agent is a planner-type agent.
* Planner agents should NOT be told to call plan agent (they ARE the planner).
*/
function isPlannerAgent(agentName?: string): boolean {
export function isPlannerAgent(agentName?: string): boolean {
if (!agentName) return false
const lowerName = agentName.toLowerCase()
return lowerName.includes("prometheus") || lowerName.includes("planner") || lowerName === "plan"
@@ -166,34 +166,142 @@ delegate_task(agent="oracle", prompt="Review my approach: [describe plan]")
YOU MUST LEVERAGE ALL AVAILABLE AGENTS / **CATEGORY + SKILLS** TO THEIR FULLEST POTENTIAL.
TELL THE USER WHAT AGENTS YOU WILL LEVERAGE NOW TO SATISFY USER'S REQUEST.
## AGENTS / **CATEGORY + SKILLS** UTILIZATION PRINCIPLES (by capability, not by name)
- **Codebase Exploration**: Spawn exploration agents using BACKGROUND TASKS for file patterns, internal implementations, project structure
- **Documentation & References**: Use librarian-type agents via BACKGROUND TASKS for API references, examples, external library docs
- **Planning & Strategy**: NEVER plan yourself - ALWAYS spawn the Plan agent for work breakdown
- MUST invoke: \`delegate_task(subagent_type="plan", prompt="<gathered context + user request>")\`
- In your prompt to the Plan agent, ASK it to recommend which CATEGORY + SKILLS / AGENTS to leverage for implementation.
- IF IMPLEMENT TASK, MUST ADD TODO NOW: "Consult Plan agent via delegate_task(subagent_type='plan') for work breakdown with category + skills recommendations"
- **High-IQ Reasoning**: Leverage specialized agents for architecture decisions, code review, strategic planning
- **SPECIAL TASKS COVERED WITH CATEGORY + LOAD_SKILLS**: Delegate to specialized agents with category+skills for design and implementation, as following guide:
- CATEGORY + SKILL GUIDE
- MUST PASS \`load_skills\` FOR REQUIRED_SKILLS. MUST USE \`load_skills\` FOR REQUIRED_SKILLS.
- Simple project setup -> delegate_task(category="unspecified-low", load_skills=[{project-setup-skill}])
- Super Complex Server Workflow Implementation -> delegate_task(category="ultrabrain", load_skills=["terraform-master"], ...)
- Web Frontend Component Writing -> delegate_task(category="visual-engineering", load_skills=["frontend-ui-ux", "playwright"], ...)
## MANDATORY: PROMETHEUS AGENT INVOCATION (NON-NEGOTIABLE)
## EXECUTION RULES
- **TODO**: Track EVERY step. Mark complete IMMEDIATELY after each.
- **PARALLEL**: Fire independent agent calls simultaneously via delegate_task(background=true) - NEVER wait sequentially.
- **BACKGROUND FIRST**: Use delegate_task for exploration/research agents (10+ concurrent if needed).
- **VERIFY**: Re-read request after completion. Check ALL requirements met before reporting done.
- **DELEGATE**: Don't do everything yourself - orchestrate specialized agents for their strengths.
- **CATEGORY + LOAD_SKILLS**
**YOU MUST ALWAYS INVOKE PROMETHEUS (THE PLANNER) FOR ANY NON-TRIVIAL TASK.**
## WORKFLOW
1. Analyze the request and identify required capabilities
2. Spawn exploration/librarian agents via delegate_task(background=true) in PARALLEL (10+ if needed)
3. Spawn Plan agent: \`delegate_task(subagent_type="plan", prompt="<context + request>")\` to create detailed work breakdown
4. Execute with continuous verification against original requirements
| Condition | Action |
|-----------|--------|
| Task has 2+ steps | MUST call Prometheus |
| Task scope unclear | MUST call Prometheus |
| Implementation required | MUST call Prometheus |
| Architecture decision needed | MUST call Prometheus |
\`\`\`
delegate_task(subagent_type="prometheus", prompt="<gathered context + user request>")
\`\`\`
**WHY PROMETHEUS IS MANDATORY:**
- Prometheus analyzes dependencies and parallel execution opportunities
- Prometheus recommends CATEGORY + SKILLS for each task (in TL;DR + per-task)
- Prometheus ensures nothing is missed with structured work plans
- YOU are an orchestrator, NOT an implementer
### SESSION CONTINUITY WITH PROMETHEUS (CRITICAL)
**Prometheus returns a session_id. USE IT for follow-up interactions.**
| Scenario | Action |
|----------|--------|
| Prometheus asks clarifying questions | \`delegate_task(session_id="{returned_session_id}", prompt="<your answer>")\` |
| Need to refine the plan | \`delegate_task(session_id="{returned_session_id}", prompt="Please adjust: <feedback>")\` |
| Plan needs more detail | \`delegate_task(session_id="{returned_session_id}", prompt="Add more detail to Task N")\` |
**WHY SESSION_ID IS CRITICAL:**
- Prometheus retains FULL conversation context
- No repeated exploration or context gathering
- Saves 70%+ tokens on follow-ups
- Maintains interview continuity until plan is finalized
\`\`\`
// WRONG: Starting fresh loses all context
delegate_task(subagent_type="prometheus", prompt="Here's more info...")
// CORRECT: Resume preserves everything
delegate_task(session_id="ses_abc123", prompt="Here's my answer to your question: ...")
\`\`\`
**FAILURE TO CALL PROMETHEUS = INCOMPLETE WORK.**
---
## AGENTS / **CATEGORY + SKILLS** UTILIZATION PRINCIPLES
**DEFAULT BEHAVIOR: DELEGATE. DO NOT WORK YOURSELF.**
| Task Type | Action | Why |
|-----------|--------|-----|
| Codebase exploration | delegate_task(subagent_type="explore", run_in_background=true) | Parallel, context-efficient |
| Documentation lookup | delegate_task(subagent_type="librarian", run_in_background=true) | Specialized knowledge |
| Planning | delegate_task(subagent_type="plan") | Structured work breakdown |
| Architecture/Debugging | delegate_task(subagent_type="oracle") | High-IQ reasoning |
| Implementation | delegate_task(category="...", load_skills=[...]) | Domain-optimized models |
**CATEGORY + SKILL DELEGATION:**
\`\`\`
// Frontend work
delegate_task(category="visual-engineering", load_skills=["frontend-ui-ux"])
// Complex logic
delegate_task(category="ultrabrain", load_skills=["typescript-programmer"])
// Quick fixes
delegate_task(category="quick", load_skills=["git-master"])
\`\`\`
**YOU SHOULD ONLY DO IT YOURSELF WHEN:**
- Task is trivially simple (1-2 lines, obvious change)
- You have ALL context already loaded
- Delegation overhead exceeds task complexity
**OTHERWISE: DELEGATE. ALWAYS.**
---
## EXECUTION RULES (PARALLELIZATION MANDATORY)
| Rule | Implementation |
|------|----------------|
| **PARALLEL FIRST** | Fire ALL independent agents simultaneously via delegate_task(run_in_background=true) |
| **NEVER SEQUENTIAL** | If tasks A and B are independent, launch BOTH at once |
| **10+ CONCURRENT** | Use 10+ background agents if needed for comprehensive exploration |
| **COLLECT LATER** | Launch agents -> continue work -> background_output when needed |
**ANTI-PATTERN (BLOCKING):**
\`\`\`
// WRONG: Sequential, slow
result1 = delegate_task(..., run_in_background=false) // waits
result2 = delegate_task(..., run_in_background=false) // waits again
\`\`\`
**CORRECT PATTERN:**
\`\`\`
// RIGHT: Parallel, fast
delegate_task(..., run_in_background=true) // task_id_1
delegate_task(..., run_in_background=true) // task_id_2
delegate_task(..., run_in_background=true) // task_id_3
// Continue working, collect with background_output when needed
\`\`\`
---
## WORKFLOW (MANDATORY SEQUENCE)
1. **GATHER CONTEXT** (parallel background agents):
\`\`\`
delegate_task(subagent_type="explore", run_in_background=true, prompt="...")
delegate_task(subagent_type="librarian", run_in_background=true, prompt="...")
\`\`\`
2. **INVOKE PROMETHEUS** (MANDATORY for non-trivial tasks):
\`\`\`
result = delegate_task(subagent_type="prometheus", prompt="<context + request>")
// STORE the session_id for follow-ups!
prometheus_session_id = result.session_id
\`\`\`
3. **ITERATE WITH PROMETHEUS** (if clarification needed):
\`\`\`
// Use session_id to continue the conversation
delegate_task(session_id=prometheus_session_id, prompt="<answer to Prometheus's question>")
\`\`\`
4. **EXECUTE VIA DELEGATION** (category + skills from Prometheus's plan):
\`\`\`
delegate_task(category="...", load_skills=[...], prompt="<task from plan>")
\`\`\`
5. **VERIFY** against original requirements
## VERIFICATION GUARANTEE (NON-NEGOTIABLE)
@@ -267,8 +375,9 @@ Write these criteria explicitly. Share with user if scope is non-trivial.
THE USER ASKED FOR X. DELIVER EXACTLY X. NOT A SUBSET. NOT A DEMO. NOT A STARTING POINT.
1. EXPLORES + LIBRARIANS (background)
2. GATHER -> delegate_task(subagent_type="plan", prompt="<context + request>")
3. WORK BY DELEGATING TO CATEGORY + SKILLS AGENTS
2. GATHER -> delegate_task(subagent_type="prometheus", prompt="<context + request>")
3. ITERATE WITH PROMETHEUS (session_id resume) UNTIL PLAN IS FINALIZED
4. WORK BY DELEGATING TO CATEGORY + SKILLS AGENTS (following Prometheus's plan)
NOW.

@@ -365,7 +365,7 @@ describe("keyword-detector agent-specific ultrawork messages", () => {
} as any
}
test("should use planner-specific ultrawork message when agent is prometheus", async () => {
test("should skip ultrawork injection when agent is prometheus", async () => {
// #given - collector and prometheus agent
const collector = new ContextCollector()
const hook = createKeywordDetectorHook(createMockPluginInput(), collector)
@@ -378,16 +378,15 @@ describe("keyword-detector agent-specific ultrawork messages", () => {
// #when - ultrawork keyword detected with prometheus agent
await hook["chat.message"]({ sessionID, agent: "prometheus" }, output)
// #then - should use planner-specific message with "YOU ARE A PLANNER" content
// #then - ultrawork should be skipped for planner agents, text unchanged
const textPart = output.parts.find(p => p.type === "text")
expect(textPart).toBeDefined()
expect(textPart!.text).toContain("YOU ARE A PLANNER, NOT AN IMPLEMENTER")
expect(textPart!.text).toBe("ultrawork plan this feature")
expect(textPart!.text).not.toContain("YOU ARE A PLANNER, NOT AN IMPLEMENTER")
expect(textPart!.text).not.toContain("YOU MUST LEVERAGE ALL AVAILABLE AGENTS")
expect(textPart!.text).toContain("---")
expect(textPart!.text).toContain("plan this feature")
})
test("should use planner-specific ultrawork message when agent name contains 'planner'", async () => {
test("should skip ultrawork injection when agent name contains 'planner'", async () => {
// #given - collector and agent with 'planner' in name
const collector = new ContextCollector()
const hook = createKeywordDetectorHook(createMockPluginInput(), collector)
@@ -400,12 +399,11 @@ describe("keyword-detector agent-specific ultrawork messages", () => {
// #when - ultrawork keyword detected with planner agent
await hook["chat.message"]({ sessionID, agent: "Prometheus (Planner)" }, output)
// #then - should use planner-specific message
// #then - ultrawork should be skipped, text unchanged
const textPart = output.parts.find(p => p.type === "text")
expect(textPart).toBeDefined()
expect(textPart!.text).toContain("YOU ARE A PLANNER, NOT AN IMPLEMENTER")
expect(textPart!.text).toContain("---")
expect(textPart!.text).toContain("create a work plan")
expect(textPart!.text).toBe("ulw create a work plan")
expect(textPart!.text).not.toContain("YOU ARE A PLANNER, NOT AN IMPLEMENTER")
})
test("should use normal ultrawork message when agent is Sisyphus", async () => {
@@ -419,7 +417,7 @@ describe("keyword-detector agent-specific ultrawork messages", () => {
}
// #when - ultrawork keyword detected with Sisyphus agent
await hook["chat.message"]({ sessionID, agent: "Sisyphus" }, output)
await hook["chat.message"]({ sessionID, agent: "sisyphus" }, output)
// #then - should use normal ultrawork message with agent utilization instructions
const textPart = output.parts.find(p => p.type === "text")
@@ -452,7 +450,7 @@ describe("keyword-detector agent-specific ultrawork messages", () => {
expect(textPart!.text).toContain("do something")
})
test("should switch from planner to normal message when agent changes", async () => {
test("should skip ultrawork for prometheus but inject for sisyphus", async () => {
// #given - two sessions, one with prometheus, one with sisyphus
const collector = new ContextCollector()
const hook = createKeywordDetectorHook(createMockPluginInput(), collector)
@@ -471,13 +469,11 @@ describe("keyword-detector agent-specific ultrawork messages", () => {
message: {} as Record<string, unknown>,
parts: [{ type: "text", text: "ultrawork implement" }],
}
await hook["chat.message"]({ sessionID: sisyphusSessionID, agent: "Sisyphus" }, sisyphusOutput)
await hook["chat.message"]({ sessionID: sisyphusSessionID, agent: "sisyphus" }, sisyphusOutput)
// #then - each session should have the correct message type
// #then - prometheus should have no injection, sisyphus should have normal ultrawork
const prometheusTextPart = prometheusOutput.parts.find(p => p.type === "text")
expect(prometheusTextPart!.text).toContain("YOU ARE A PLANNER, NOT AN IMPLEMENTER")
expect(prometheusTextPart!.text).toContain("---")
expect(prometheusTextPart!.text).toContain("plan")
expect(prometheusTextPart!.text).toBe("ultrawork plan")
const sisyphusTextPart = sisyphusOutput.parts.find(p => p.type === "text")
expect(sisyphusTextPart!.text).toContain("YOU MUST LEVERAGE ALL AVAILABLE AGENTS")
@@ -492,7 +488,7 @@ describe("keyword-detector agent-specific ultrawork messages", () => {
const sessionID = "same-session-agent-switch"
// Simulate: session state was updated to sisyphus (by index.ts updateSessionAgent)
updateSessionAgent(sessionID, "Sisyphus")
updateSessionAgent(sessionID, "sisyphus")
const output = {
message: {} as Record<string, unknown>,
@@ -514,7 +510,7 @@ describe("keyword-detector agent-specific ultrawork messages", () => {
clearSessionAgent(sessionID)
})
test("should fall back to input.agent when session state is empty", async () => {
test("should fall back to input.agent when session state is empty and skip ultrawork for prometheus", async () => {
// #given - no session state, only input.agent available
const collector = new ContextCollector()
const hook = createKeywordDetectorHook(createMockPluginInput(), collector)
@@ -531,11 +527,10 @@ describe("keyword-detector agent-specific ultrawork messages", () => {
// #when - hook receives input.agent="prometheus" with no session state
await hook["chat.message"]({ sessionID, agent: "prometheus" }, output)
// #then - should use prometheus from input.agent as fallback
// #then - prometheus fallback from input.agent, ultrawork skipped
const textPart = output.parts.find(p => p.type === "text")
expect(textPart).toBeDefined()
expect(textPart!.text).toContain("YOU ARE A PLANNER, NOT AN IMPLEMENTER")
expect(textPart!.text).toContain("---")
expect(textPart!.text).toContain("plan this")
expect(textPart!.text).toBe("ultrawork plan this")
expect(textPart!.text).not.toContain("YOU ARE A PLANNER, NOT AN IMPLEMENTER")
})
})

@@ -1,5 +1,6 @@
import type { PluginInput } from "@opencode-ai/plugin"
import { detectKeywordsWithType, extractPromptText, removeCodeBlocks } from "./detector"
import { isPlannerAgent } from "./constants"
import { log } from "../../shared"
import { isSystemDirective } from "../../shared/system-directive"
import { getMainSessionID, getSessionAgent, subagentSessions } from "../../features/claude-code-session-state"
@@ -33,6 +34,10 @@ export function createKeywordDetectorHook(ctx: PluginInput, collector?: ContextC
const currentAgent = getSessionAgent(input.sessionID) ?? input.agent
let detectedKeywords = detectKeywordsWithType(removeCodeBlocks(promptText), currentAgent)
if (isPlannerAgent(currentAgent)) {
detectedKeywords = detectedKeywords.filter((k) => k.type !== "ultrawork")
}
if (detectedKeywords.length === 0) {
return
}
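The hunk above drops `ultrawork` keywords whenever the current agent is a planner, so the prompt text is left untouched for Prometheus. Here is a minimal sketch of that filter in isolation. `isPlannerAgentSketch`, the `DetectedKeyword` shape, and the `"other"` keyword type are assumptions for illustration. The real `isPlannerAgent` lives in `./constants` and is not shown in this diff; the matcher below is only chosen to agree with the two agent names the tests exercise.

```typescript
// Hypothetical shape; the real type comes from ./detector and is not shown in this diff.
type DetectedKeyword = { type: string; message: string }

// Assumed matcher: treats "prometheus" and any name containing "planner" as a planner.
function isPlannerAgentSketch(agent?: string): boolean {
  if (!agent) return false
  const name = agent.toLowerCase()
  return name.includes("prometheus") || name.includes("planner")
}

// Mirrors the hunk: planner agents keep every detected keyword except "ultrawork".
function filterKeywordsForAgent(keywords: DetectedKeyword[], agent?: string): DetectedKeyword[] {
  return isPlannerAgentSketch(agent)
    ? keywords.filter((k) => k.type !== "ultrawork")
    : keywords
}

const detected: DetectedKeyword[] = [
  { type: "ultrawork", message: "..." },
  { type: "other", message: "..." }, // placeholder for any non-ultrawork keyword
]

const forPlanner = filterKeywordsForAgent(detected, "Prometheus (Planner)")
const forSisyphus = filterKeywordsForAgent(detected, "sisyphus")
```

If only `ultrawork` was detected, the filtered list is empty and the hook's existing `detectedKeywords.length === 0` early return fires, which is consistent with the updated tests expecting the text to be unchanged.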

@@ -277,7 +277,7 @@ describe("prometheus-md-only", () => {
describe("with non-Prometheus agent in message storage", () => {
beforeEach(() => {
setupMessageStorage(TEST_SESSION_ID, "Sisyphus")
setupMessageStorage(TEST_SESSION_ID, "sisyphus")
})
test("should not affect non-Prometheus agents", async () => {

@@ -89,10 +89,10 @@ export function createPrometheusMdOnlyHook(ctx: PluginInput) {
const toolName = input.tool
// Inject read-only warning for task tools called by Prometheus
if (TASK_TOOLS.includes(toolName)) {
const prompt = output.args.prompt as string | undefined
if (prompt && !prompt.includes(SYSTEM_DIRECTIVE_PREFIX)) {
output.args.prompt = prompt + PLANNING_CONSULT_WARNING
if (TASK_TOOLS.includes(toolName)) {
const prompt = output.args.prompt as string | undefined
if (prompt && !prompt.includes(SYSTEM_DIRECTIVE_PREFIX)) {
output.args.prompt = PLANNING_CONSULT_WARNING + prompt
log(`[${HOOK_NAME}] Injected read-only planning warning to ${toolName}`, {
sessionID: input.sessionID,
tool: toolName,

@@ -891,40 +891,40 @@ Original task: Build something`
})
describe("API timeout protection", () => {
// FIXME: Flaky in CI - times out intermittently
test.skip("should not hang when session.messages() times out", async () => {
// #given - slow API that takes longer than timeout
const slowMock = {
test("should not hang when session.messages() throws", async () => {
// #given - API that throws (simulates timeout error)
let apiCallCount = 0
const errorMock = {
...createMockPluginInput(),
client: {
...createMockPluginInput().client,
session: {
...createMockPluginInput().client.session,
messages: async () => {
// Simulate slow API (would hang without timeout)
await new Promise((resolve) => setTimeout(resolve, 10000))
return { data: [] }
apiCallCount++
throw new Error("API timeout")
},
},
},
}
const hook = createRalphLoopHook(slowMock as any, {
const hook = createRalphLoopHook(errorMock as any, {
getTranscriptPath: () => join(TEST_DIR, "nonexistent.jsonl"),
apiTimeout: 100, // 100ms timeout for test
apiTimeout: 100,
})
hook.startLoop("session-123", "Build something")
// #when - session goes idle (API will timeout)
// #when - session goes idle (API will throw)
const startTime = Date.now()
await hook.event({
event: { type: "session.idle", properties: { sessionID: "session-123" } },
})
const elapsed = Date.now() - startTime
// #then - should complete within timeout + buffer (not hang for 10s)
expect(elapsed).toBeLessThan(500)
// #then - loop should continue (API timeout = no completion detected)
// #then - should complete quickly (not hang for 10s)
expect(elapsed).toBeLessThan(2000)
// #then - loop should continue (API error = no completion detected)
expect(promptCalls.length).toBe(1)
expect(apiCallCount).toBeGreaterThan(0)
})
})
})

@@ -0,0 +1,29 @@
export const HOOK_NAME = "sisyphus-junior-notepad"
export const NOTEPAD_DIRECTIVE = `
<Work_Context>
## Notepad Location (for recording learnings)
NOTEPAD PATH: .sisyphus/notepads/{plan-name}/
- learnings.md: Record patterns, conventions, successful approaches
- issues.md: Record problems, blockers, gotchas encountered
- decisions.md: Record architectural choices and rationales
- problems.md: Record unresolved issues, technical debt
You SHOULD append findings to notepad files after completing work.
IMPORTANT: Always APPEND to notepad files - never overwrite or use Edit tool.
## Plan Location (READ ONLY)
PLAN PATH: .sisyphus/plans/{plan-name}.md
CRITICAL RULE: NEVER MODIFY THE PLAN FILE
The plan file (.sisyphus/plans/*.md) is SACRED and READ-ONLY.
- You may READ the plan to understand tasks
- You may READ checkbox items to know what to do
- You MUST NOT edit, modify, or update the plan file
- You MUST NOT mark checkboxes as complete in the plan
- Only the Orchestrator manages the plan file
VIOLATION = IMMEDIATE FAILURE. The Orchestrator tracks plan state.
</Work_Context>
`

@@ -0,0 +1,45 @@
import type { PluginInput } from "@opencode-ai/plugin"
import { isCallerOrchestrator } from "../../shared/session-utils"
import { SYSTEM_DIRECTIVE_PREFIX } from "../../shared/system-directive"
import { log } from "../../shared/logger"
import { HOOK_NAME, NOTEPAD_DIRECTIVE } from "./constants"
export * from "./constants"
export function createSisyphusJuniorNotepadHook(ctx: PluginInput) {
return {
"tool.execute.before": async (
input: { tool: string; sessionID: string; callID: string },
output: { args: Record<string, unknown>; message?: string }
): Promise<void> => {
// 1. Check if tool is delegate_task
if (input.tool !== "delegate_task") {
return
}
// 2. Check if caller is Atlas (orchestrator)
if (!isCallerOrchestrator(input.sessionID)) {
return
}
// 3. Get prompt from output.args
const prompt = output.args.prompt as string | undefined
if (!prompt) {
return
}
// 4. Check for double injection
if (prompt.includes(SYSTEM_DIRECTIVE_PREFIX)) {
return
}
// 5. Prepend directive
output.args.prompt = NOTEPAD_DIRECTIVE + prompt
// 6. Log injection
log(`[${HOOK_NAME}] Injected notepad directive to delegate_task`, {
sessionID: input.sessionID,
})
},
}
}
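The guard sequence in this hook (missing prompt, already-injected check, prepend) can be sketched as a small pure function. The constant values below are placeholders, not the project's real values. In particular, this sketch assumes the directive itself embeds the system-directive prefix, which is what makes a second call a no-op. The `NOTEPAD_DIRECTIVE` shown in this diff does not visibly contain such a prefix, so whether the real guard catches a repeat injection depends on the value of `SYSTEM_DIRECTIVE_PREFIX`, which is not shown here.

```typescript
// Placeholder values; the real constants are defined elsewhere in the repository.
const PREFIX_SKETCH = "[SYSTEM DIRECTIVE"
const DIRECTIVE_SKETCH = `${PREFIX_SKETCH}: NOTEPAD]\n` // assumed to embed the prefix

// Returns true when the directive was prepended, false when a guard skipped it.
function injectDirective(args: Record<string, unknown>): boolean {
  const prompt = args.prompt as string | undefined
  if (!prompt) return false                          // nothing to decorate
  if (prompt.includes(PREFIX_SKETCH)) return false   // already carries a directive
  args.prompt = DIRECTIVE_SKETCH + prompt            // prepend, as in step 5 of the hook
  return true
}

const args: Record<string, unknown> = { prompt: "Implement task 3" }
const first = injectDirective(args)   // injects
const second = injectDirective(args)  // skipped by the prefix guard
```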

@@ -0,0 +1,82 @@
import { describe, test, expect, beforeEach } from "bun:test"
import { createSubagentQuestionBlockerHook } from "./index"
import { subagentSessions, _resetForTesting } from "../../features/claude-code-session-state"
describe("createSubagentQuestionBlockerHook", () => {
const hook = createSubagentQuestionBlockerHook()
beforeEach(() => {
_resetForTesting()
})
describe("tool.execute.before", () => {
test("allows question tool for non-subagent sessions", async () => {
//#given
const sessionID = "ses_main"
const input = { tool: "question", sessionID, callID: "call_1" }
const output = { args: { questions: [] } }
//#when
const result = hook["tool.execute.before"]?.(input as any, output as any)
//#then
await expect(result).resolves.toBeUndefined()
})
test("blocks question tool for subagent sessions", async () => {
//#given
const sessionID = "ses_subagent"
subagentSessions.add(sessionID)
const input = { tool: "question", sessionID, callID: "call_1" }
const output = { args: { questions: [] } }
//#when
const result = hook["tool.execute.before"]?.(input as any, output as any)
//#then
await expect(result).rejects.toThrow("Question tool is disabled for subagent sessions")
})
test("blocks Question tool (case insensitive) for subagent sessions", async () => {
//#given
const sessionID = "ses_subagent"
subagentSessions.add(sessionID)
const input = { tool: "Question", sessionID, callID: "call_1" }
const output = { args: { questions: [] } }
//#when
const result = hook["tool.execute.before"]?.(input as any, output as any)
//#then
await expect(result).rejects.toThrow("Question tool is disabled for subagent sessions")
})
test("blocks AskUserQuestion tool for subagent sessions", async () => {
//#given
const sessionID = "ses_subagent"
subagentSessions.add(sessionID)
const input = { tool: "AskUserQuestion", sessionID, callID: "call_1" }
const output = { args: { questions: [] } }
//#when
const result = hook["tool.execute.before"]?.(input as any, output as any)
//#then
await expect(result).rejects.toThrow("Question tool is disabled for subagent sessions")
})
test("ignores non-question tools for subagent sessions", async () => {
//#given
const sessionID = "ses_subagent"
subagentSessions.add(sessionID)
const input = { tool: "bash", sessionID, callID: "call_1" }
const output = { args: { command: "ls" } }
//#when
const result = hook["tool.execute.before"]?.(input as any, output as any)
//#then
await expect(result).resolves.toBeUndefined()
})
})
})

@@ -0,0 +1,29 @@
import type { Hooks } from "@opencode-ai/plugin"
import { subagentSessions } from "../../features/claude-code-session-state"
import { log } from "../../shared"
export function createSubagentQuestionBlockerHook(): Hooks {
return {
"tool.execute.before": async (input) => {
const toolName = input.tool?.toLowerCase()
if (toolName !== "question" && toolName !== "askuserquestion") {
return
}
if (!subagentSessions.has(input.sessionID)) {
return
}
log("[subagent-question-blocker] Blocking question tool call from subagent session", {
sessionID: input.sessionID,
tool: input.tool,
})
throw new Error(
"Question tool is disabled for subagent sessions. " +
"Subagents should complete their work autonomously without asking questions to users. " +
"If you need clarification, return to the parent agent with your findings and uncertainties."
)
},
}
}
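The blocking decision above reduces to two checks: the lowercased tool name is one of the question tools, and the session is a registered subagent session. Here is a minimal sketch of that predicate, separated from logging and the thrown error. `shouldBlockQuestionTool` is a hypothetical helper name, and `subagentSessions` is assumed to be a `Set<string>` because the diff only uses `.add` and `.has` on it.

```typescript
// Tool names the hook treats as "question" tools, compared after lowercasing.
const QUESTION_TOOLS = new Set(["question", "askuserquestion"])

// Hypothetical helper: true when the hook would throw for this call.
function shouldBlockQuestionTool(
  tool: string | undefined,
  sessionID: string,
  subagents: Set<string>,
): boolean {
  const name = tool?.toLowerCase()
  if (name === undefined || !QUESTION_TOOLS.has(name)) return false
  return subagents.has(sessionID)
}

const subagents = new Set<string>(["ses_subagent"])

const mainQuestion = shouldBlockQuestionTool("question", "ses_main", subagents)
const subQuestion = shouldBlockQuestionTool("Question", "ses_subagent", subagents)
const subAskUser = shouldBlockQuestionTool("AskUserQuestion", "ses_subagent", subagents)
const subBash = shouldBlockQuestionTool("bash", "ses_subagent", subagents)
```

The four calls correspond to the test cases in the accompanying test file: the main session is allowed, both question-tool spellings are blocked for a subagent, and unrelated tools pass through.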

@@ -350,4 +350,63 @@ describe("createThinkModeHook integration", () => {
expect(input.message.model?.modelID).toBe("claude-opus-4-5")
})
})
describe("Agent-level thinking configuration respect", () => {
it("should NOT inject thinking config when agent has thinking disabled", async () => {
// #given agent with thinking explicitly disabled
const hook = createThinkModeHook()
const input: ThinkModeInput = {
parts: [{ type: "text", text: "ultrathink deeply" }],
message: {
model: { providerID: "google", modelID: "gemini-3-pro" },
thinking: { type: "disabled" },
} as ThinkModeInput["message"],
}
// #when the chat.params hook is called
await hook["chat.params"](input, sessionID)
// #then should NOT override agent's thinking disabled setting
const message = input.message as MessageWithInjectedProps
expect((message.thinking as { type: string }).type).toBe("disabled")
expect(message.providerOptions).toBeUndefined()
})
it("should NOT inject thinking config when agent has custom providerOptions", async () => {
// #given agent with custom providerOptions
const hook = createThinkModeHook()
const input: ThinkModeInput = {
parts: [{ type: "text", text: "ultrathink" }],
message: {
model: { providerID: "google", modelID: "gemini-3-flash" },
providerOptions: {
google: { thinkingConfig: { thinkingBudget: 0 } },
},
} as ThinkModeInput["message"],
}
// #when the chat.params hook is called
await hook["chat.params"](input, sessionID)
// #then should NOT override agent's providerOptions
const message = input.message as MessageWithInjectedProps
const providerOpts = message.providerOptions as Record<string, unknown>
expect((providerOpts.google as Record<string, unknown>).thinkingConfig).toEqual({
thinkingBudget: 0,
})
})
it("should still inject thinking config when agent has no thinking override", async () => {
// #given agent without thinking override
const hook = createThinkModeHook()
const input = createMockInput("google", "gemini-3-pro", "ultrathink")
// #when the chat.params hook is called
await hook["chat.params"](input, sessionID)
// #then should inject thinking config as normal
const message = input.message as MessageWithInjectedProps
expect(message.providerOptions).toBeDefined()
})
})
})

@@ -65,13 +65,32 @@ export function createThinkModeHook() {
}
if (thinkingConfig) {
Object.assign(output.message, thinkingConfig)
state.thinkingConfigInjected = true
log("Think mode: thinking config injected", {
sessionID,
provider: currentModel.providerID,
config: thinkingConfig,
})
const messageData = output.message as Record<string, unknown>
const agentThinking = messageData.thinking as { type?: string } | undefined
const agentProviderOptions = messageData.providerOptions
const agentDisabledThinking = agentThinking?.type === "disabled"
const agentHasCustomProviderOptions = Boolean(agentProviderOptions)
if (agentDisabledThinking) {
log("Think mode: skipping - agent has thinking disabled", {
sessionID,
provider: currentModel.providerID,
})
} else if (agentHasCustomProviderOptions) {
log("Think mode: skipping - agent has custom providerOptions", {
sessionID,
provider: currentModel.providerID,
})
} else {
Object.assign(output.message, thinkingConfig)
state.thinkingConfigInjected = true
log("Think mode: thinking config injected", {
sessionID,
provider: currentModel.providerID,
config: thinkingConfig,
})
}
}
thinkModeState.set(sessionID, state)
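The new branch order in this hunk (agent disabled thinking, then agent set its own `providerOptions`, otherwise inject) can be sketched as a pure decision function. `decideThinkingInjection` and the `ThinkDecision` labels are hypothetical names for illustration. Only the two checks themselves are taken from the diff.

```typescript
type ThinkDecision = "skip-disabled" | "skip-custom-provider-options" | "inject"

// Mirrors the branch order in the hunk; the message is treated as a loose record.
function decideThinkingInjection(message: Record<string, unknown>): ThinkDecision {
  const thinking = message.thinking as { type?: string } | undefined
  if (thinking?.type === "disabled") return "skip-disabled"
  if (Boolean(message.providerOptions)) return "skip-custom-provider-options"
  return "inject"
}

const disabled = decideThinkingInjection({ thinking: { type: "disabled" } })
const custom = decideThinkingInjection({
  providerOptions: { google: { thinkingConfig: { thinkingBudget: 0 } } },
})
const plain = decideThinkingInjection({ model: { providerID: "google", modelID: "gemini-3-pro" } })
const emptyOptions = decideThinkingInjection({ providerOptions: {} })
```

One consequence of using `Boolean(...)` for the second check: any object is truthy in JavaScript, so even an empty `providerOptions: {}` suppresses injection in this sketch.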

@@ -835,8 +835,8 @@ describe("todo-continuation-enforcer", () => {
// OpenCode returns assistant messages with flat modelID/providerID, not nested model object
const mockMessagesWithAssistant = [
{ info: { id: "msg-1", role: "user", agent: "Sisyphus", model: { providerID: "openai", modelID: "gpt-5.2" } } },
{ info: { id: "msg-2", role: "assistant", agent: "Sisyphus", modelID: "gpt-5.2", providerID: "openai" } },
{ info: { id: "msg-1", role: "user", agent: "sisyphus", model: { providerID: "openai", modelID: "gpt-5.2" } } },
{ info: { id: "msg-2", role: "assistant", agent: "sisyphus", modelID: "gpt-5.2", providerID: "openai" } },
]
const mockInput = {
@@ -886,8 +886,8 @@ describe("todo-continuation-enforcer", () => {
setMainSession(sessionID)
const mockMessagesWithCompaction = [
{ info: { id: "msg-1", role: "user", agent: "Sisyphus", model: { providerID: "anthropic", modelID: "claude-sonnet-4-5" } } },
{ info: { id: "msg-2", role: "assistant", agent: "Sisyphus", modelID: "claude-sonnet-4-5", providerID: "anthropic" } },
{ info: { id: "msg-1", role: "user", agent: "sisyphus", model: { providerID: "anthropic", modelID: "claude-sonnet-4-5" } } },
{ info: { id: "msg-2", role: "assistant", agent: "sisyphus", modelID: "claude-sonnet-4-5", providerID: "anthropic" } },
{ info: { id: "msg-3", role: "assistant", agent: "compaction", modelID: "claude-sonnet-4-5", providerID: "anthropic" } },
]
@@ -923,7 +923,7 @@ describe("todo-continuation-enforcer", () => {
// #then - continuation uses Sisyphus (skipped compaction agent)
expect(promptCalls.length).toBe(1)
expect(promptCalls[0].agent).toBe("Sisyphus")
expect(promptCalls[0].agent).toBe("sisyphus")
})
test("should skip injection when only compaction agent messages exist", async () => {

@@ -23,6 +23,7 @@ import {
createInteractiveBashSessionHook,
createThinkingBlockValidatorHook,
createCategorySkillReminderHook,
createRalphLoopHook,
createAutoSlashCommandHook,
createEditErrorRecoveryHook,
@@ -31,7 +32,9 @@ import {
createStartWorkHook,
createAtlasHook,
createPrometheusMdOnlyHook,
createSisyphusJuniorNotepadHook,
createQuestionLabelTruncatorHook,
createSubagentQuestionBlockerHook,
} from "./hooks";
import {
contextCollector,
@@ -73,8 +76,9 @@ import {
import { BackgroundManager } from "./features/background-agent";
import { SkillMcpManager } from "./features/skill-mcp-manager";
import { initTaskToastManager } from "./features/task-toast-manager";
import { TmuxSessionManager } from "./features/tmux-subagent";
import { type HookName } from "./config";
import { log, detectExternalNotificationPlugin, getNotificationConflictWarning, resetMessageCursor, includesCaseInsensitive } from "./shared";
import { log, detectExternalNotificationPlugin, getNotificationConflictWarning, resetMessageCursor, includesCaseInsensitive, hasConnectedProvidersCache } from "./shared";
import { loadPluginConfig } from "./plugin-config";
import { createModelCacheState, getModelLimit } from "./plugin-state";
import { createConfigHandler } from "./plugin-handlers";
@@ -87,6 +91,14 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
const pluginConfig = loadPluginConfig(ctx.directory, ctx);
const disabledHooks = new Set(pluginConfig.disabled_hooks ?? []);
const firstMessageVariantGate = createFirstMessageVariantGate();
const tmuxConfig = {
enabled: pluginConfig.tmux?.enabled ?? false,
layout: pluginConfig.tmux?.layout ?? 'main-vertical',
main_pane_size: pluginConfig.tmux?.main_pane_size ?? 60,
main_pane_min_width: pluginConfig.tmux?.main_pane_min_width ?? 120,
agent_pane_min_width: pluginConfig.tmux?.agent_pane_min_width ?? 40,
} as const;
const isHookEnabled = (hookName: HookName) => !disabledHooks.has(hookName);
const modelCacheState = createModelCacheState();
@@ -181,6 +193,10 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
? createThinkingBlockValidatorHook()
: null;
const categorySkillReminder = isHookEnabled("category-skill-reminder")
? createCategorySkillReminderHook(ctx)
: null;
const ralphLoop = isHookEnabled("ralph-loop")
? createRalphLoopHook(ctx, {
config: pluginConfig.ralph_loop,
@@ -204,11 +220,38 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
? createPrometheusMdOnlyHook(ctx)
: null;
const sisyphusJuniorNotepad = isHookEnabled("sisyphus-junior-notepad")
? createSisyphusJuniorNotepadHook(ctx)
: null;
const questionLabelTruncator = createQuestionLabelTruncatorHook();
const subagentQuestionBlocker = createSubagentQuestionBlockerHook();
const taskResumeInfo = createTaskResumeInfoHook();
const backgroundManager = new BackgroundManager(ctx, pluginConfig.background_task);
const tmuxSessionManager = new TmuxSessionManager(ctx, tmuxConfig);
const backgroundManager = new BackgroundManager(ctx, pluginConfig.background_task, {
tmuxConfig,
onSubagentSessionCreated: async (event) => {
log("[index] onSubagentSessionCreated callback received", {
sessionID: event.sessionID,
parentID: event.parentID,
title: event.title,
});
await tmuxSessionManager.onSessionCreated({
type: "session.created",
properties: {
info: {
id: event.sessionID,
parentID: event.parentID,
title: event.title,
},
},
});
log("[index] onSubagentSessionCreated callback completed");
},
});
const atlasHook = isHookEnabled("atlas")
? createAtlasHook(ctx, { directory: ctx.directory, backgroundManager })
@@ -238,6 +281,7 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
"multimodal-looker"
);
const lookAt = isMultimodalLookerEnabled ? createLookAt(ctx) : null;
const browserProvider = pluginConfig.browser_automation_engine?.provider ?? "playwright";
const delegateTask = createDelegateTask({
manager: backgroundManager,
client: ctx.client,
@@ -245,10 +289,28 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
userCategories: pluginConfig.categories,
gitMasterConfig: pluginConfig.git_master,
sisyphusJuniorModel: pluginConfig.agents?.["sisyphus-junior"]?.model,
browserProvider,
onSyncSessionCreated: async (event) => {
log("[index] onSyncSessionCreated callback", {
sessionID: event.sessionID,
parentID: event.parentID,
title: event.title,
});
await tmuxSessionManager.onSessionCreated({
type: "session.created",
properties: {
info: {
id: event.sessionID,
parentID: event.parentID,
title: event.title,
},
},
});
},
});
const disabledSkills = new Set(pluginConfig.disabled_skills ?? []);
const systemMcpNames = getSystemMcpServerNames();
const builtinSkills = createBuiltinSkills().filter((skill) => {
const builtinSkills = createBuiltinSkills({ browserProvider }).filter((skill) => {
if (disabledSkills.has(skill.name as never)) return false;
if (skill.mcpConfig) {
for (const mcpName of Object.keys(skill.mcpConfig)) {
@@ -336,6 +398,17 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
await autoSlashCommand?.["chat.message"]?.(input, output);
await startWork?.["chat.message"]?.(input, output);
if (!hasConnectedProvidersCache()) {
ctx.client.tui.showToast({
body: {
title: "⚠️ Provider Cache Missing",
message: "Model filtering disabled. RESTART OpenCode to enable full functionality.",
variant: "warning" as const,
duration: 6000,
},
}).catch(() => {});
}
if (ralphLoop) {
const parts = (
output as { parts?: Array<{ type: string; text?: string }> }
@@ -418,6 +491,7 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
await thinkMode?.event(input);
await anthropicContextWindowLimitRecovery?.event(input);
await agentUsageReminder?.event(input);
await categorySkillReminder?.event(input);
await interactiveBashSession?.event(input);
await ralphLoop?.event(input);
await atlasHook?.handler(input);
@@ -425,29 +499,36 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
const { event } = input;
const props = event.properties as Record<string, unknown> | undefined;
if (event.type === "session.created") {
const sessionInfo = props?.info as
| { id?: string; title?: string; parentID?: string }
| undefined;
if (!sessionInfo?.parentID) {
setMainSession(sessionInfo?.id);
}
firstMessageVariantGate.markSessionCreated(sessionInfo);
}
if (event.type === "session.created") {
const sessionInfo = props?.info as
| { id?: string; title?: string; parentID?: string }
| undefined;
log("[event] session.created", { sessionInfo, props });
if (!sessionInfo?.parentID) {
setMainSession(sessionInfo?.id);
}
firstMessageVariantGate.markSessionCreated(sessionInfo);
await tmuxSessionManager.onSessionCreated(
event as { type: string; properties?: { info?: { id?: string; parentID?: string; title?: string } } }
);
}
if (event.type === "session.deleted") {
const sessionInfo = props?.info as { id?: string } | undefined;
if (sessionInfo?.id === getMainSessionID()) {
setMainSession(undefined);
}
if (sessionInfo?.id) {
clearSessionAgent(sessionInfo.id);
resetMessageCursor(sessionInfo.id);
firstMessageVariantGate.clear(sessionInfo.id);
await skillMcpManager.disconnectSession(sessionInfo.id);
await lspManager.cleanupTempDirectoryClients();
}
}
if (event.type === "session.deleted") {
const sessionInfo = props?.info as { id?: string } | undefined;
if (sessionInfo?.id === getMainSessionID()) {
setMainSession(undefined);
}
if (sessionInfo?.id) {
clearSessionAgent(sessionInfo.id);
resetMessageCursor(sessionInfo.id);
firstMessageVariantGate.clear(sessionInfo.id);
await skillMcpManager.disconnectSession(sessionInfo.id);
await lspManager.cleanupTempDirectoryClients();
await tmuxSessionManager.onSessionDeleted({
sessionID: sessionInfo.id,
});
}
}
if (event.type === "message.updated") {
const info = props?.info as Record<string, unknown> | undefined;
@@ -487,6 +568,7 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
},
"tool.execute.before": async (input, output) => {
await subagentQuestionBlocker["tool.execute.before"]?.(input, output);
await questionLabelTruncator["tool.execute.before"]?.(input, output);
await claudeCodeHooks["tool.execute.before"](input, output);
await nonInteractiveEnv?.["tool.execute.before"](input, output);
@@ -495,6 +577,7 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
await directoryReadmeInjector?.["tool.execute.before"]?.(input, output);
await rulesInjector?.["tool.execute.before"]?.(input, output);
await prometheusMdOnly?.["tool.execute.before"]?.(input, output);
await sisyphusJuniorNotepad?.["tool.execute.before"]?.(input, output);
await atlasHook?.["tool.execute.before"]?.(input, output);
if (input.tool === "task") {
@@ -574,6 +657,7 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
await rulesInjector?.["tool.execute.after"](input, output);
await emptyTaskResponseDetector?.["tool.execute.after"](input, output);
await agentUsageReminder?.["tool.execute.after"](input, output);
await categorySkillReminder?.["tool.execute.after"](input, output);
await interactiveBashSession?.["tool.execute.after"](input, output);
await editErrorRecovery?.["tool.execute.after"](input, output);
await delegateTaskRetry?.["tool.execute.after"](input, output);

@@ -2,7 +2,7 @@
## OVERVIEW
3 remote MCP servers: web search, documentation, code search. HTTP/SSE transport.
3 remote MCP servers: web search, documentation, code search. HTTP/SSE transport. Part of three-tier MCP system.
## STRUCTURE
@@ -20,10 +20,16 @@ mcp/
| Name | URL | Purpose | Auth |
|------|-----|---------|------|
| websearch | mcp.exa.ai | Real-time web search | EXA_API_KEY |
| context7 | mcp.context7.com | Library docs | None |
| websearch | mcp.exa.ai/mcp?tools=web_search_exa | Real-time web search | EXA_API_KEY |
| context7 | mcp.context7.com/mcp | Library docs | CONTEXT7_API_KEY |
| grep_app | mcp.grep.app | GitHub code search | None |
## THREE-TIER MCP SYSTEM
1. **Built-in** (this directory): websearch, context7, grep_app
2. **Claude Code compat**: `.mcp.json` with `${VAR}` expansion
3. **Skill-embedded**: YAML frontmatter in skills (handled by skill-mcp-manager)
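The second tier depends on `${VAR}` expansion in `.mcp.json`. The loader itself is not part of this diff, so the following is only a minimal sketch of how such expansion could work. The names `expandEnvVars` and `expandConfig` are hypothetical, and treating unset variables as empty strings is an assumption.

```typescript
// Hypothetical helper: expands ${VAR} placeholders from a supplied environment map.
// Assumption: unset variables expand to an empty string; the real loader may differ.
function expandEnvVars(
  input: string,
  env: Record<string, string | undefined>,
): string {
  return input.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match, name: string) => env[name] ?? "",
  );
}

// Recursively expands every string value in a parsed .mcp.json-like object.
// Non-string leaves (numbers, booleans, null) are returned unchanged.
function expandConfig<T>(value: T, env: Record<string, string | undefined>): T {
  if (typeof value === "string") {
    return expandEnvVars(value, env) as unknown as T;
  }
  if (Array.isArray(value)) {
    return value.map((item) => expandConfig(item, env)) as unknown as T;
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, item] of Object.entries(value as Record<string, unknown>)) {
      out[key] = expandConfig(item, env);
    }
    return out as T;
  }
  return value;
}
```

Passing the environment as a parameter (for example `expandConfig(parsed, process.env)`) keeps the helper free of global state, so it can be tested without touching the real environment.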
## CONFIG PATTERN
```typescript
@@ -54,5 +60,6 @@ const mcps = createBuiltinMcps(["websearch"]) // Disable specific
## NOTES
- **Remote only**: HTTP/SSE, no stdio
- **Disable**: User can set `disabled_mcps: ["name"]`
- **Exa**: Requires `EXA_API_KEY` env var
- **Disable**: User can set `disabled_mcps: ["name"]` in config
- **Context7**: Optional auth using `CONTEXT7_API_KEY` env var
- **Exa**: Optional auth using `EXA_API_KEY` env var

@@ -2,5 +2,9 @@ export const context7 = {
type: "remote" as const,
url: "https://mcp.context7.com/mcp",
enabled: true,
headers: process.env.CONTEXT7_API_KEY
? { Authorization: `Bearer ${process.env.CONTEXT7_API_KEY}` }
: undefined,
// Disable OAuth auto-detection - Context7 uses API key header, not OAuth
oauth: false as const,
}

@@ -1,6 +1,185 @@
import { describe, test, expect } from "bun:test"
import { resolveCategoryConfig } from "./config-handler"
import { describe, test, expect, mock, beforeEach } from "bun:test"
import { resolveCategoryConfig, createConfigHandler } from "./config-handler"
import type { CategoryConfig } from "../config/schema"
import type { OhMyOpenCodeConfig } from "../config"
mock.module("../agents", () => ({
createBuiltinAgents: async () => ({
sisyphus: { name: "sisyphus", prompt: "test", mode: "primary" },
oracle: { name: "oracle", prompt: "test", mode: "subagent" },
}),
}))
mock.module("../agents/sisyphus-junior", () => ({
createSisyphusJuniorAgentWithOverrides: () => ({
name: "sisyphus-junior",
prompt: "test",
mode: "subagent",
}),
}))
mock.module("../features/claude-code-command-loader", () => ({
loadUserCommands: async () => ({}),
loadProjectCommands: async () => ({}),
loadOpencodeGlobalCommands: async () => ({}),
loadOpencodeProjectCommands: async () => ({}),
}))
mock.module("../features/builtin-commands", () => ({
loadBuiltinCommands: () => ({}),
}))
mock.module("../features/opencode-skill-loader", () => ({
loadUserSkills: async () => ({}),
loadProjectSkills: async () => ({}),
loadOpencodeGlobalSkills: async () => ({}),
loadOpencodeProjectSkills: async () => ({}),
discoverUserClaudeSkills: async () => [],
discoverProjectClaudeSkills: async () => [],
discoverOpencodeGlobalSkills: async () => [],
discoverOpencodeProjectSkills: async () => [],
}))
mock.module("../features/claude-code-agent-loader", () => ({
loadUserAgents: () => ({}),
loadProjectAgents: () => ({}),
}))
mock.module("../features/claude-code-mcp-loader", () => ({
loadMcpConfigs: async () => ({ servers: {} }),
}))
mock.module("../features/claude-code-plugin-loader", () => ({
loadAllPluginComponents: async () => ({
commands: {},
skills: {},
agents: {},
mcpServers: {},
hooksConfigs: [],
plugins: [],
errors: [],
}),
}))
mock.module("../mcp", () => ({
createBuiltinMcps: () => ({}),
}))
mock.module("../shared", () => ({
log: () => {},
fetchAvailableModels: async () => new Set(["anthropic/claude-opus-4-5"]),
readConnectedProvidersCache: () => null,
}))
mock.module("../shared/opencode-config-dir", () => ({
getOpenCodeConfigPaths: () => ({
global: "/tmp/.config/opencode",
project: "/tmp/.opencode",
}),
}))
mock.module("../shared/permission-compat", () => ({
migrateAgentConfig: (config: Record<string, unknown>) => config,
}))
mock.module("../shared/migration", () => ({
AGENT_NAME_MAP: {},
}))
mock.module("../shared/model-resolver", () => ({
resolveModelWithFallback: () => ({ model: "anthropic/claude-opus-4-5" }),
}))
mock.module("../shared/model-requirements", () => ({
AGENT_MODEL_REQUIREMENTS: {
sisyphus: { fallbackChain: [{ providers: ["anthropic", "github-copilot", "opencode"], model: "claude-opus-4-5" }] },
oracle: { fallbackChain: [{ providers: ["openai", "github-copilot", "opencode"], model: "gpt-5.2" }] },
librarian: { fallbackChain: [{ providers: ["anthropic", "github-copilot", "opencode"], model: "claude-sonnet-4-5" }] },
explore: { fallbackChain: [{ providers: ["anthropic", "opencode"], model: "claude-haiku-4-5" }] },
"multimodal-looker": { fallbackChain: [{ providers: ["google", "github-copilot", "opencode"], model: "gemini-3-flash" }] },
prometheus: { fallbackChain: [{ providers: ["anthropic", "github-copilot", "opencode"], model: "claude-opus-4-5" }] },
metis: { fallbackChain: [{ providers: ["anthropic", "github-copilot", "opencode"], model: "claude-opus-4-5" }] },
momus: { fallbackChain: [{ providers: ["openai", "github-copilot", "opencode"], model: "gpt-5.2" }] },
atlas: { fallbackChain: [{ providers: ["anthropic", "github-copilot", "opencode"], model: "claude-sonnet-4-5" }] },
},
CATEGORY_MODEL_REQUIREMENTS: {
"visual-engineering": { fallbackChain: [{ providers: ["google", "github-copilot", "opencode"], model: "gemini-3-pro" }] },
ultrabrain: { fallbackChain: [{ providers: ["openai", "github-copilot", "opencode"], model: "gpt-5.2-codex" }] },
artistry: { fallbackChain: [{ providers: ["google", "github-copilot", "opencode"], model: "gemini-3-pro" }] },
quick: { fallbackChain: [{ providers: ["anthropic", "github-copilot", "opencode"], model: "claude-haiku-4-5" }] },
"unspecified-low": { fallbackChain: [{ providers: ["anthropic", "github-copilot", "opencode"], model: "claude-sonnet-4-5" }] },
"unspecified-high": { fallbackChain: [{ providers: ["anthropic", "github-copilot", "opencode"], model: "claude-opus-4-5" }] },
writing: { fallbackChain: [{ providers: ["google", "github-copilot", "opencode"], model: "gemini-3-flash" }] },
},
}))
describe("Plan agent demote behavior", () => {
test("plan agent should be demoted to subagent mode when replacePlan is true", async () => {
// #given
const pluginConfig: OhMyOpenCodeConfig = {
sisyphus_agent: {
planner_enabled: true,
replace_plan: true,
},
}
const config: Record<string, unknown> = {
model: "anthropic/claude-opus-4-5",
agent: {
plan: {
name: "plan",
mode: "primary",
prompt: "original plan prompt",
},
},
}
const handler = createConfigHandler({
ctx: { directory: "/tmp" },
pluginConfig,
modelCacheState: {
anthropicContext1MEnabled: false,
modelContextLimitsCache: new Map(),
},
})
// #when
await handler(config)
// #then
const agents = config.agent as Record<string, { mode?: string; name?: string }>
expect(agents.plan).toBeDefined()
expect(agents.plan.mode).toBe("subagent")
expect(agents.plan.name).toBe("plan")
})
test("prometheus should have mode 'all' to be callable via delegate_task", async () => {
// #given
const pluginConfig: OhMyOpenCodeConfig = {
sisyphus_agent: {
planner_enabled: true,
},
}
const config: Record<string, unknown> = {
model: "anthropic/claude-opus-4-5",
agent: {},
}
const handler = createConfigHandler({
ctx: { directory: "/tmp" },
pluginConfig,
modelCacheState: {
anthropicContext1MEnabled: false,
modelContextLimitsCache: new Map(),
},
})
// #when
await handler(config)
// #then
const agents = config.agent as Record<string, { mode?: string }>
expect(agents.prometheus).toBeDefined()
expect(agents.prometheus.mode).toBe("all")
})
})
describe("Prometheus category config resolution", () => {
test("resolves ultrabrain category config", () => {

@@ -25,10 +25,12 @@ import { loadMcpConfigs } from "../features/claude-code-mcp-loader";
import { loadAllPluginComponents } from "../features/claude-code-plugin-loader";
import { createBuiltinMcps } from "../mcp";
import type { OhMyOpenCodeConfig } from "../config";
import { log } from "../shared";
import { log, fetchAvailableModels, readConnectedProvidersCache } from "../shared";
import { getOpenCodeConfigPaths } from "../shared/opencode-config-dir";
import { migrateAgentConfig } from "../shared/permission-compat";
import { AGENT_NAME_MAP } from "../shared/migration";
import { resolveModelWithFallback } from "../shared/model-resolver";
import { AGENT_MODEL_REQUIREMENTS } from "../shared/model-requirements";
import { PROMETHEUS_SYSTEM_PROMPT, PROMETHEUS_PERMISSION } from "../agents/prometheus-prompt";
import { DEFAULT_CATEGORIES } from "../tools/delegate-task/constants";
import type { ModelCacheState } from "../plugin-state";
@@ -105,41 +107,6 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
log(`Plugin load errors`, { errors: pluginComponents.errors });
}
if (!(config.model as string | undefined)?.trim()) {
let fallbackModel: string | undefined
for (const agentConfig of Object.values(pluginConfig.agents ?? {})) {
const model = (agentConfig as { model?: string })?.model
if (model && typeof model === 'string' && model.trim()) {
fallbackModel = model.trim()
break
}
}
if (!fallbackModel) {
for (const categoryConfig of Object.values(pluginConfig.categories ?? {})) {
const model = (categoryConfig as { model?: string })?.model
if (model && typeof model === 'string' && model.trim()) {
fallbackModel = model.trim()
break
}
}
}
if (fallbackModel) {
config.model = fallbackModel
log(`No default model specified, using fallback from config: ${fallbackModel}`)
} else {
const paths = getOpenCodeConfigPaths({ binary: "opencode", version: null })
throw new Error(
'oh-my-opencode requires a default model.\n\n' +
`Add this to ${paths.configJsonc}:\n\n` +
' "model": "anthropic/claude-sonnet-4-5"\n\n' +
'(Replace with your preferred provider/model)'
)
}
}
// Migrate disabled_agents from old names to new names
const migratedDisabledAgents = (pluginConfig.disabled_agents ?? []).map(agent => {
return AGENT_NAME_MAP[agent.toLowerCase()] ?? AGENT_NAME_MAP[agent] ?? agent
@@ -165,6 +132,7 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
...discoveredUserSkills,
];
const browserProvider = pluginConfig.browser_automation_engine?.provider ?? "playwright";
const builtinAgents = await createBuiltinAgents(
migratedDisabledAgents,
pluginConfig.agents,
@@ -173,7 +141,8 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
pluginConfig.categories,
pluginConfig.git_master,
allDiscoveredSkills,
ctx.client
ctx.client,
browserProvider
);
// Claude Code agents: Do NOT apply permission migration
@@ -254,13 +223,10 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
);
const prometheusOverride =
pluginConfig.agents?.["prometheus"] as
| (Record<string, unknown> & { category?: string; model?: string })
| (Record<string, unknown> & { category?: string; model?: string; variant?: string })
| undefined;
const defaultModel = config.model as string | undefined;
// Resolve full category config (model, temperature, top_p, tools, etc.)
// Apply all category properties when category is specified, but explicit
// overrides (model, temperature, etc.) will take precedence during merge
const categoryConfig = prometheusOverride?.category
? resolveCategoryConfig(
prometheusOverride.category,
@@ -268,19 +234,31 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
)
: undefined;
// Model resolution: explicit override → category config → OpenCode default
// No hardcoded fallback - OpenCode config.model is the terminal fallback
const resolvedModel = prometheusOverride?.model ?? categoryConfig?.model ?? defaultModel;
const prometheusRequirement = AGENT_MODEL_REQUIREMENTS["prometheus"];
const connectedProviders = readConnectedProvidersCache();
const availableModels = ctx.client
? await fetchAvailableModels(ctx.client, { connectedProviders: connectedProviders ?? undefined })
: new Set<string>();
const modelResolution = resolveModelWithFallback({
userModel: prometheusOverride?.model ?? categoryConfig?.model,
fallbackChain: prometheusRequirement?.fallbackChain,
availableModels,
systemDefaultModel: defaultModel ?? "",
});
const resolvedModel = modelResolution?.model;
const resolvedVariant = modelResolution?.variant;
const variantToUse = prometheusOverride?.variant ?? resolvedVariant;
const prometheusBase = {
// Only include model if one was resolved - let OpenCode apply its own default if none
name: "prometheus",
...(resolvedModel ? { model: resolvedModel } : {}),
mode: "primary" as const,
...(variantToUse ? { variant: variantToUse } : {}),
mode: "all" as const,
prompt: PROMETHEUS_SYSTEM_PROMPT,
permission: PROMETHEUS_PERMISSION,
description: `${configAgent?.plan?.description ?? "Plan agent"} (Prometheus - OhMyOpenCode)`,
color: (configAgent?.plan?.color as string) ?? "#FF6347",
// Apply category properties (temperature, top_p, tools, etc.)
...(categoryConfig?.temperature !== undefined
? { temperature: categoryConfig.temperature }
: {}),
@@ -328,8 +306,12 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
? migrateAgentConfig(configAgent.build as Record<string, unknown>)
: {};
const planDemoteConfig = replacePlan
? { mode: "subagent" as const }
const planDemoteConfig = replacePlan && agentConfig["prometheus"]
? {
...agentConfig["prometheus"],
name: "plan",
mode: "subagent" as const
}
: undefined;
config.agent = {
@@ -403,8 +385,8 @@ export function createConfigHandler(deps: ConfigHandlerDeps) {
: { servers: {} };
config.mcp = {
...(config.mcp as Record<string, unknown>),
...createBuiltinMcps(pluginConfig.disabled_mcps),
...(config.mcp as Record<string, unknown>),
...mcpResult.servers,
...pluginComponents.mcpServers,
};

@@ -1,81 +1,78 @@
# SHARED UTILITIES KNOWLEDGE BASE
## OVERVIEW
34 cross-cutting utilities: path resolution, token truncation, config parsing, model resolution, agent display names.
55 cross-cutting utilities: path resolution, token truncation, config parsing, model resolution.
## STRUCTURE
```
shared/
├── logger.ts # File-based logging
├── permission-compat.ts # Agent tool restrictions
├── dynamic-truncator.ts # Token-aware truncation
├── frontmatter.ts # YAML frontmatter
├── jsonc-parser.ts # JSON with Comments
├── data-path.ts # XDG-compliant storage
├── opencode-config-dir.ts # ~/.config/opencode
├── claude-config-dir.ts # ~/.claude
├── migration.ts # Legacy config migration
├── opencode-version.ts # Version comparison
├── external-plugin-detector.ts # OAuth spoofing detection
├── model-requirements.ts # Agent/Category requirements
├── model-availability.ts # Models fetch + fuzzy match
├── model-resolver.ts # 3-step resolution
├── model-sanitizer.ts # Model ID normalization
├── shell-env.ts # Cross-platform shell
├── agent-display-names.ts # Agent display name mapping
├── agent-tool-restrictions.ts # Tool restriction helpers
├── agent-variant.ts # Agent variant detection
├── command-executor.ts # Subprocess execution
├── config-errors.ts # Config error types
├── deep-merge.ts # Deep object merge
├── file-reference-resolver.ts # File path resolution
├── file-utils.ts # File utilities
├── hook-disabled.ts # Hook enable/disable check
├── pattern-matcher.ts # Glob pattern matching
├── session-cursor.ts # Session cursor tracking
├── snake-case.ts # String case conversion
├── system-directive.ts # System prompt helpers
├── tool-name.ts # Tool name constants
├── zip-extractor.ts # ZIP file extraction
├── index.ts # Barrel export
└── *.test.ts # Colocated tests
├── tmux/ # Tmux TUI integration (types, utils, constants)
├── logger.ts # File-based logging (/tmp/oh-my-opencode.log)
├── dynamic-truncator.ts # Token-aware context window management (194 lines)
├── model-resolver.ts # 3-step resolution (Override → Fallback → Default)
├── model-requirements.ts # Agent/category model fallback chains (132 lines)
├── model-availability.ts # Provider model fetching & fuzzy matching (154 lines)
├── jsonc-parser.ts # JSONC parsing with comment support
├── frontmatter.ts # YAML frontmatter extraction (JSON_SCHEMA only)
├── data-path.ts # XDG-compliant storage resolution
├── opencode-config-dir.ts # ~/.config/opencode resolution (143 lines)
├── claude-config-dir.ts # ~/.claude resolution
├── migration.ts # Legacy config migration logic (231 lines)
├── opencode-version.ts # Semantic version comparison
├── permission-compat.ts # Agent tool restriction enforcement
├── system-directive.ts # Unified system message prefix & types
├── session-utils.ts # Session cursor, orchestrator detection
├── shell-env.ts # Cross-platform shell environment
├── agent-variant.ts # Agent variant from config
├── zip-extractor.ts # Binary/Resource ZIP extraction
├── deep-merge.ts # Recursive object merging (proto-pollution safe, MAX_DEPTH=50)
├── case-insensitive.ts # Case-insensitive object lookups
├── session-cursor.ts # Session message cursor tracking
├── command-executor.ts # Shell command execution (225 lines)
└── index.ts # Barrel export for all utilities
```
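The tree describes `deep-merge.ts` as proto-pollution safe with `MAX_DEPTH=50`. The file's contents are not shown in this diff, so the following is only a sketch of what those two guards could look like. `deepMergeSketch`, `BLOCKED_KEYS`, and the choice to let the source value win once the depth limit is exceeded are all assumptions.

```typescript
// Hypothetical sketch; not the project's deep-merge.ts.
const MAX_DEPTH = 50; // value taken from the tree comment
const BLOCKED_KEYS = new Set(["__proto__", "constructor", "prototype"]);

type PlainObject = Record<string, unknown>;

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function deepMergeSketch(
  target: PlainObject,
  source: PlainObject,
  depth = 0,
): PlainObject {
  // Assumption: past MAX_DEPTH the source value wins without further recursion.
  if (depth > MAX_DEPTH) return source;

  const result: PlainObject = { ...target };
  for (const key of Object.keys(source)) {
    // Skip keys that could be used to tamper with object prototypes.
    if (BLOCKED_KEYS.has(key)) continue;

    const previous = result[key];
    const next = source[key];
    result[key] =
      isPlainObject(previous) && isPlainObject(next)
        ? deepMergeSketch(previous, next, depth + 1)
        : next; // arrays and primitives are replaced, not merged
  }
  return result;
}
```

The blocked-key check matters mostly for untrusted input such as parsed JSON, where `"__proto__"` can appear as an ordinary own property.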
## WHEN TO USE
## MOST IMPORTED
| Utility | Users | Purpose |
|---------|-------|---------|
| logger.ts | 16+ | Background task visibility |
| system-directive.ts | 8+ | Message filtering |
| opencode-config-dir.ts | 8+ | Path resolution |
| permission-compat.ts | 6+ | Tool restrictions |
## WHEN TO USE
| Task | Utility |
|------|---------|
| Debug logging | `log(message, data)` |
| Limit context | `dynamicTruncate(ctx, sessionId, output)` |
| Parse frontmatter | `parseFrontmatter(content)` |
| Load JSONC | `parseJsonc(text)` or `readJsoncFile(path)` |
| Restrict tools | `createAgentToolAllowlist(tools)` |
| Resolve paths | `getOpenCodeConfigDir()` |
| Compare versions | `isOpenCodeVersionAtLeast("1.1.0")` |
| Resolve model | `resolveModelWithFallback()` |
| Agent display name | `getAgentDisplayName(agentName)` |
| Path Resolution | `getOpenCodeConfigDir()`, `getDataPath()` |
| Token Truncation | `dynamicTruncate(ctx, sessionId, output)` |
| Config Parsing | `readJsoncFile<T>(path)`, `parseJsonc(text)` |
| Model Resolution | `resolveModelWithFallback({ userModel, fallbackChain, availableModels })` |
| Version Gating | `isOpenCodeVersionAtLeast(version)` |
| YAML Metadata | `parseFrontmatter(content)` |
| Tool Security | `createAgentToolAllowlist(tools)` |
| System Messages | `createSystemDirective(type)`, `isSystemDirective(msg)` |
| Deep Merge | `deepMerge(target, source)` |
## PATTERNS
## KEY PATTERNS
**3-Step Resolution** (Override → Fallback → Default):
```typescript
// Token-aware truncation
const { result } = await dynamicTruncate(ctx, sessionID, buffer)
const model = resolveModelWithFallback({
userModel: config.agents.sisyphus.model,
fallbackChain: AGENT_MODEL_REQUIREMENTS.sisyphus.fallbackChain,
availableModels: fetchedModels,
})
```
// JSONC config
const settings = readJsoncFile<Settings>(configPath)
// Version-gated
if (isOpenCodeVersionAtLeast("1.1.0")) { /* ... */ }
// Model resolution
const model = await resolveModelWithFallback(client, requirements, override)
**System Directive Filtering**:
```typescript
if (isSystemDirective(message)) return // Skip system-generated
const directive = createSystemDirective("TODO CONTINUATION")
```
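The Override → Fallback → Default order named under 3-Step Resolution can be sketched as below. The field names `userModel`, `fallbackChain`, `availableModels`, and `systemDefaultModel` come from the calls shown in this diff. Everything else is an assumption: `resolveModelSketch` is a hypothetical stand-in, the `source` field is invented for illustration, and matching is exact on `provider/model` rather than fuzzy.

```typescript
interface FallbackEntry {
  providers: string[];
  model: string;
}

interface ResolveInput {
  userModel?: string;
  fallbackChain?: FallbackEntry[];
  availableModels: Set<string>;
  systemDefaultModel?: string;
}

type ResolveSource = "override" | "fallback" | "default";

// Hypothetical re-implementation for illustration; not the project's model-resolver.ts.
function resolveModelSketch(
  input: ResolveInput,
): { model: string; source: ResolveSource } | undefined {
  // Step 1: an explicit user override always wins.
  const override = input.userModel?.trim();
  if (override) return { model: override, source: "override" };

  // Step 2: walk the fallback chain in order and take the first
  // provider/model pair that is actually available.
  for (const entry of input.fallbackChain ?? []) {
    for (const provider of entry.providers) {
      const candidate = `${provider}/${entry.model}`;
      if (input.availableModels.has(candidate)) {
        return { model: candidate, source: "fallback" };
      }
    }
  }

  // Step 3: fall back to the system default, if one is configured.
  const systemDefault = input.systemDefaultModel?.trim();
  return systemDefault ? { model: systemDefault, source: "default" } : undefined;
}
```

Returning `undefined` when nothing resolves matches the way the caller in `config-handler.ts` uses optional chaining on the result (`modelResolution?.model`).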
## ANTI-PATTERNS
- **Raw JSON.parse**: Use `jsonc-parser.ts`
- **Hardcoded paths**: Use `*-config-dir.ts`
- **console.log**: Use `logger.ts` for background
- **Unbounded output**: Use `dynamic-truncator.ts`
- **Raw JSON.parse**: Use `jsonc-parser.ts` for comment support
- **Hardcoded Paths**: Use `*-config-dir.ts` or `data-path.ts`
- **console.log**: Use `logger.ts` for background task visibility
- **Unbounded Output**: Use `dynamic-truncator.ts` to prevent overflow
- **Manual Version Check**: Use `opencode-version.ts` for semver safety
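The last anti-pattern is easiest to see with a concrete case: comparing version strings directly orders `"1.10.0"` before `"1.9.0"`. The sketch below compares numeric components instead. `parseVersion` and `isAtLeast` are hypothetical names, not the API of `opencode-version.ts`, and the sketch assumes plain `major.minor.patch` versions with any pre-release or build suffix ignored.

```typescript
// Hypothetical sketch; not the project's opencode-version.ts.
function parseVersion(version: string): number[] {
  // Drop a leading "v" and any pre-release/build suffix (assumption: they are ignored).
  const [core = ""] = version.replace(/^v/, "").split(/[-+]/);
  return core.split(".").map((part) => Number.parseInt(part, 10) || 0);
}

function isAtLeast(current: string, minimum: string): boolean {
  const a = parseVersion(current);
  const b = parseVersion(minimum);
  // Compare major, minor, patch numerically; missing parts count as 0.
  for (let i = 0; i < 3; i++) {
    const diff = (a[i] ?? 0) - (b[i] ?? 0);
    if (diff !== 0) return diff > 0;
  }
  return true; // all three components are equal
}

// A naive string comparison gets this case wrong:
const naive = "1.10.0" >= "1.9.0"; // false, because "1" sorts before "9"
const numeric = isAtLeast("1.10.0", "1.9.0"); // true
```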

Some files were not shown because too many files have changed in this diff.