Compare commits

..

125 Commits

Author SHA1 Message Date
github-actions[bot]
6c8527f29b release: v3.1.7 2026-01-29 12:39:22 +00:00
justsisyphus
cd4da93bf2 fix(test): migrate config-handler tests from mock.module to spyOn to prevent cross-file cache pollution 2026-01-29 21:35:14 +09:00
justsisyphus
71b2f1518a chore(agents): unify agent description format with OhMyOpenCode attribution 2026-01-29 21:27:04 +09:00
YeonGyu-Kim
dcda8769cc feat(mcp-oauth): add full OAuth 2.1 authentication for MCP servers (#1169)
* feat(mcp-oauth): add oauth field to ClaudeCodeMcpServer schema

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(mcp-oauth): add RFC 7591 Dynamic Client Registration

* feat(mcp-oauth): add RFC 9728 PRM + RFC 8414 AS discovery

* feat(mcp-oauth): add secure token storage with {host}/{resource} key format

* feat(mcp-oauth): add dynamic port OAuth callback server

* feat(mcp-oauth): add RFC 8707 Resource Indicators

* feat(mcp-oauth): implement full-spec McpOAuthProvider

* feat(mcp-oauth): add step-up authorization handler

* feat(mcp-oauth): integrate authProvider into SkillMcpManager

* feat(doctor): add MCP OAuth token status check

* feat(cli): add mcp oauth subcommand structure

* feat(cli): implement mcp oauth login command

* fix(mcp-oauth): address cubic review — security, correctness, and test issues

- Remove @ts-nocheck from provider.ts, storage.ts, provider.test.ts
- Fix server resource leak on missing code/state (close + reject)
- Fix command injection in openBrowser (spawn array args, cross-platform)
- Mock McpOAuthProvider in login.test.ts for deterministic CI
- Recreate auth provider with merged scopes in step-up flow
- Add listAllTokens() for global status listing
- Fix logout to accept --server-url for correct token deletion
- Support both quoted and unquoted WWW-Authenticate params (RFC 2617)
- Save/restore OPENCODE_CONFIG_DIR in storage.test.ts
- Fix index.test.ts: vitest → bun:test

* fix(mcp-oauth): use explorer instead of cmd /c start on Windows to prevent shell injection

* fix(mcp-oauth): address remaining cubic review issues

- Add 5-minute timeout to provider callback server to prevent indefinite hangs
- Persist client registration from token storage across process restarts
- Require --server-url for logout to match token storage key format
- Use listTokensByHost for server-specific status lookups
- Fix callback-server test to handle promise rejection ordering
- Fix provider test port expectations (8912 → 19877)
- Fix cli-guide.md duplicate Section 7 numbering
- Fix manager test for login-on-missing-tokens behavior

* fix(mcp-oauth): address final review issues

- P1: Redact token values in status.ts output to prevent credential leakage
- P2: Read OAuth error response body before throwing in token exchange
- Test: Fix mcp-oauth doctor test to use epoch seconds (not milliseconds)

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-29 19:48:36 +09:00
YeonGyu-Kim
a94fbadd57 Migrate LSP client to vscode-jsonrpc for improved stability (#1095)
* refactor(lsp): migrate to vscode-jsonrpc for improved stability

Replace custom JSON-RPC implementation with vscode-jsonrpc library.
Use MessageConnection with StreamMessageReader/Writer.
Implement Bun↔Node stream bridges for compatibility.
Preserve all existing functionality (warmup, cleanup, capabilities).
Net reduction of ~60 lines while improving protocol handling.

* fix(lsp): clear timeout on successful response to prevent unhandled rejections

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-29 19:48:28 +09:00
YeonGyu-Kim
23b49c4a5c fix: expand override.category and explicit reasoningEffort priority (#1219) (#1235)
* fix: expand override.category and explicit reasoningEffort priority (#1219)

Two bugs fixed:

1. createBuiltinAgents(): override.category was never expanded into concrete
   config properties (model, variant, reasoningEffort, etc.). Added
   applyCategoryOverride() helper and applied it in the standard agent loop,
   Sisyphus path, and Atlas path.

2. Prometheus config-handler: reasoningEffort/textVerbosity/thinking from
   direct override now use explicit priority chains (direct > category)
   matching the existing variant pattern, instead of relying on spread
   ordering.

Priority order (highest to lowest):
  1. Direct override properties
  2. Override category properties
  3. Resolved variant from model fallback chain
  4. Factory base defaults

Closes #1219

* fix: use undefined check for thinking to allow explicit false
2026-01-29 19:46:34 +09:00
YeonGyu-Kim
b4973954e3 fix(background-agent): prevent zombie processes by aborting sessions on shutdown (#1240) (#1243)
- BackgroundManager.shutdown() now aborts all running child sessions via
  client.session.abort() before clearing state, preventing orphaned
  opencode processes when parent exits
- Add onShutdown callback to BackgroundManager constructor, used to
  trigger TmuxSessionManager.cleanup() on process exit signals
- Interactive bash session hook now aborts tracked subagent opencode
  sessions when killing tmux sessions (defense-in-depth)
- Add 4 tests verifying shutdown abort behavior and callback invocation

Closes #1240
2026-01-29 18:29:47 +09:00
github-actions[bot]
6d50fbe563 @Lynricsy has signed the CLA in code-yeongyu/oh-my-opencode#1241 2026-01-29 09:00:40 +00:00
YeonGyu-Kim
9850dd0f6e fix(test): align agent tests with connected-providers-cache fallback behavior (#1227)
Tests in utils.test.ts were written before bffa1ad introduced
connected-providers-cache fallback in resolveModelWithFallback.
Update assertions to match the new resolution path:
- Oracle resolves to openai/gpt-5.2 via cache (not systemDefault)
- Agents are created via cache fallback even without systemDefaultModel
2026-01-29 11:47:17 +09:00
YeonGyu-Kim
34aaef2219 fix(delegate-task): pass registered agent model explicitly for subagent_type (#1225)
When delegate_task uses subagent_type, extract the matched agent's model
object and pass it explicitly to session.prompt/manager.launch. This
ensures the model is always in the correct object format regardless of
how OpenCode handles string→object conversion for plugin-registered
agents.

Closes #1225
2026-01-29 11:27:07 +09:00
Mike
faca80caa9 fix(start-work): prevent overwriting session agent if already set; inherit parent model for subagent types (#1201)
* fix(start-work): prevent overwriting session agent if already set; inherit parent model for subagent types

* fix(model): include variant in StoredMessage model structure for better context propagation

* fix(injector): include variant in model structure for hook message injection
2026-01-29 09:30:37 +09:00
SUHO LEE
0c3fbd724b fix(model-resolver): respect UI model selection in agent initialization (#1158)
- Add uiSelectedModel parameter to resolveModelWithFallback()
- Update model resolution priority: UI Selection → Config Override → Fallback → System Default
- Pass config.model as uiSelectedModel in createBuiltinAgents()
- Fix ProviderModelNotFoundError when model is unset in config but selected in UI
2026-01-29 09:30:35 +09:00
Srijan Guchhait
c7455708f8 docs: Add missing configuration options to configurations.md (#1186)
- Add disabled_commands section with available commands
- Add comment_checker configuration (custom_prompt)
- Add notification configuration (force_enable)
- Add sisyphus tasks & swarm configuration sections
- Add staleTimeoutMs to background_task section
- Add dynamic_context_pruning to experimental section with full documentation
- Extend skills configuration with advanced options (sources, custom skills)
- Extend agents configuration with missing options (category, variant, maxTokens, thinking, reasoningEffort, textVerbosity, providerOptions)
- Extend categories configuration with missing options (description, is_unstable_agent)
- Extend LSP configuration with missing server options (env, initialization, disabled) and detailed examples
- Add missing hooks (auto-slash-command, sisyphus-junior-notepad, start-work) to hooks list
- Update available agents list to include all agents from schema

Co-authored-by: GitHub Actions <actions@github.com>
2026-01-29 09:30:32 +09:00
Peïo Thibault
bffa1ad43d fix(model-resolver): use connected providers cache when model cache is empty (#1227)
- Remove resolved.model from userModel in tools.ts (was bypassing fallback chain)
- Use connected providers cache in model-resolver when availableModels is empty
- Allows proper provider selection (e.g., github-copilot instead of google)
2026-01-29 09:30:19 +09:00
github-actions[bot]
6560dedd4c @mrdavidlaing has signed the CLA in code-yeongyu/oh-my-opencode#1226 2026-01-28 19:51:45 +00:00
sisyphus-dev-ai
b7e32a99f2 chore: changes by sisyphus-dev-ai 2026-01-28 16:51:21 +00:00
github-actions[bot]
a06e656565 release: v3.1.6 2026-01-28 16:15:27 +00:00
justsisyphus
30ed086c40 fix(delegate-task): use category default model when availableModels is empty 2026-01-29 01:11:42 +09:00
justsisyphus
7c15b06da7 fix(test): update tests to reflect new model-resolver behavior 2026-01-29 00:54:16 +09:00
justsisyphus
0e7ee2ac30 chore: remove noisy console.warn for AGENTS.md auto-disable 2026-01-29 00:46:16 +09:00
justsisyphus
ca93d2f0fe fix(model-resolver): skip fallback chain when model availability cannot be verified
When model cache is empty, the fallback chain resolution was blindly
trusting connected providers without verifying if the model actually
exists. This caused errors when a provider (e.g., opencode) was marked
as connected but didn't have the requested model (e.g., claude-haiku-4-5).

Now skips fallback chain entirely when model cache is unavailable and
falls through to system default, letting OpenCode handle the resolution.
2026-01-29 00:15:57 +09:00
YeonGyu-Kim
3ab4529bc7 fix(look-at): handle JSON parse errors from session.prompt gracefully (#1216)
When multimodal-looker agent returns empty/malformed response, the SDK
throws 'JSON Parse error: Unexpected EOF'. This commit adds try-catch
around session.prompt() to provide user-friendly error message with
troubleshooting guidance.

- Add error handling for JSON parse errors with detailed guidance
- Add error handling for generic prompt failures
- Add test cases for both error scenarios
2026-01-28 23:58:01 +09:00
github-actions[bot]
9d3e152b19 @KennyDizi has signed the CLA in code-yeongyu/oh-my-opencode#1214 2026-01-28 14:26:21 +00:00
github-actions[bot]
68c8f3dda7 release: v3.1.5 2026-01-28 14:15:42 +00:00
justsisyphus
03f6e72c9b refactor(ultrawork): replace prometheus with plan agent, add parallel task graph output
- Change all prometheus references to plan agent in ultrawork mode
- Add MANDATORY OUTPUT section to ULTRAWORK_PLANNER_SECTION:
  - Parallel Execution Waves structure
  - Dependency Matrix format
  - TODO List with category + skills + parallel group
  - Agent Dispatch Summary table
- Plan agent now outputs parallel task graphs for orchestrator execution
2026-01-28 23:09:51 +09:00
justsisyphus
4fd9f0fd04 refactor(agents): enforce zero user intervention in QA/acceptance criteria
- Prometheus: rename 'Manual QA' to 'Automated Verification Only'
- Prometheus: add explicit ZERO USER INTERVENTION principle
- Prometheus: replace placeholder examples with concrete executable commands
- Metis: add QA automation directives in output format
- Metis: strengthen CRITICAL RULES to forbid user-intervention criteria
2026-01-28 23:00:55 +09:00
github-actions[bot]
4413336724 @youming-ai has signed the CLA in code-yeongyu/oh-my-opencode#1203 2026-01-28 13:04:28 +00:00
Doyoon Kwon
895f366a11 docs: add Ollama streaming NDJSON issue guide and workaround (#1197)
* docs: add Ollama streaming NDJSON issue troubleshooting guide

- Document problem: JSON Parse error when using Ollama with stream: true
- Explain root cause: NDJSON vs single JSON object mismatch
- Provide 3 solutions: disable streaming, avoid tool agents, wait for SDK fix
- Include NDJSON parsing code example for SDK maintainers
- Add curl testing command for verification
- Link to issue #1124 and Ollama API docs

Fixes #1124

* docs: add Ollama provider configuration with streaming workaround

- Add Ollama Provider section to configurations.md
- Document stream: false requirement for Ollama
- Explain NDJSON vs single JSON mismatch
- Provide supported models table (qwen3-coder, ministral-3, lfm2.5-thinking)
- Add troubleshooting steps and curl test command
- Link to troubleshooting guide

feat: add NDJSON parser utility for Ollama streaming responses

- Create src/shared/ollama-ndjson-parser.ts
- Implement parseOllamaStreamResponse() for merging NDJSON lines
- Implement isNDJSONResponse() for format detection
- Add TypeScript interfaces for Ollama message structures
- Include JSDoc with usage examples
- Handle edge cases: malformed lines, stats aggregation

This utility can be contributed to Claude Code SDK for proper NDJSON support.

Related to #1124

* fix: use logger instead of console, remove trailing whitespace

- Replace console.warn with log() from shared/logger
- Remove trailing whitespace from troubleshooting guide
- Ensure TypeScript compatibility
2026-01-28 19:01:33 +09:00
YeonGyu-Kim
acc19fcd41 feat(hooks): auto-disable directory-agents-injector for OpenCode 1.1.37+ native support (#1204)
* feat(delegate-task): add prometheus self-delegation block and delegate_task permission

- Block prometheus from delegating to itself via delegate_task
- Grant delegate_task permission to prometheus when called as subagent
- Other subagents still have delegate_task disabled

* feat(version): add OPENCODE_NATIVE_AGENTS_INJECTION_VERSION constant

* docs: add deprecation notes for directory-agents-injector

* feat(hooks): auto-disable directory-agents-injector for OpenCode 1.1.37+

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-28 18:46:51 +09:00
justsisyphus
68e0a32183 chore(issue-templates): add English language requirement checkbox 2026-01-28 18:24:15 +09:00
justsisyphus
dee89c1556 feat(delegate-task): add prometheus self-delegation block and delegate_task permission
- Block prometheus from delegating to itself via delegate_task
- Grant delegate_task permission to prometheus when called as subagent
- Other subagents still have delegate_task disabled
2026-01-28 18:24:15 +09:00
github-actions[bot]
315c75c51e @rooftop-Owl has signed the CLA in code-yeongyu/oh-my-opencode#1197 2026-01-28 08:47:09 +00:00
YeonGyu-Kim
3dd80889a5 fix(tools): add permission field to session.create() for consistency (#1192) (#1199)
- Add permission field to look_at and call_omo_agent session.create()
- Match pattern used in delegate_task and background-agent
- Add better error messages for Unauthorized failures
- Provide actionable guidance in error messages

This addresses potential session creation failures by ensuring
consistent session configuration across all tools that create
child sessions.
2026-01-28 17:35:25 +09:00
Sisyphus
8f6ed5b20f fix(hooks): add null guard for tool.execute.after output (#1054)
/review command and some Claude Code built-in commands trigger
tool.execute.after hooks with undefined output, causing crashes
when accessing output.metadata or output.output.

Fixes #1035

Co-authored-by: sisyphus-dev-ai <sisyphus-dev-ai@users.noreply.github.com>
2026-01-28 16:26:40 +09:00
TheEpTic
01500f1ebe Fix: prevent system-reminder tags from triggering mode keywords (#1155)
Automated system messages with <system-reminder> tags were incorrectly
triggering [search-mode], [analyze-mode], and other keyword modes when
they contained words like "search", "find", "explore", etc.

Changes:
- Add removeSystemReminders() to strip <system-reminder> content before keyword detection
- Add hasSystemReminder() utility function
- Update keyword-detector to clean text before pattern matching
- Add comprehensive test coverage for system-reminder filtering

Fixes issue where automated system notifications caused agents to
incorrectly enter MAXIMUM SEARCH EFFORT mode.

Co-authored-by: TheEpTic <git@eptic.me>
2026-01-28 16:26:37 +09:00
Thanh Nguyen
48f6c5e06d fix(skill): support YAML array format for allowed-tools field (#1163)
Fixes #1021

The allowed-tools field in skill frontmatter now supports both formats:
- Space-separated string: 'allowed-tools: Read Write Edit Bash'
- YAML array: 'allowed-tools: [Read, Write, Edit, Bash]'
- Multi-line YAML array format also works

Previously, skills using YAML array format would silently fail to parse,
causing them to not appear in the <available_skills> list.

Changes:
- Updated parseAllowedTools() in loader.ts, async-loader.ts, and merger.ts
  to handle both string and string[] types
- Updated SkillMetadata type to accept string | string[] for allowed-tools
- Added 4 test cases covering all allowed-tools formats
2026-01-28 16:26:34 +09:00
Moha Abdi
3e32afe646 fix(agent-variant): resolve variant based on current model, not static config (#1179) 2026-01-28 16:26:31 +09:00
Xiaoya Wang
d11c4a1f81 fix: guard JSON.parse(result.stdout) with || "{}" fallback in hook handlers (#1191)
Co-authored-by: wangxiaoya.2000 <wangxiaoya.2000@bytedance.com>
2026-01-28 16:26:28 +09:00
github-actions[bot]
5558ddf468 release: v3.1.4 2026-01-28 07:22:03 +00:00
justsisyphus
aa03d9b811 ci: sync publish.yml test isolation with ci.yml 2026-01-28 16:18:21 +09:00
YeonGyu-Kim
28a0dd06c7 fix: resolve version detection for npm global installations (#1194)
When oh-my-opencode is installed via npm global install and run as a
compiled binary, import.meta.url returns a virtual bun path ($bunfs)
instead of the actual filesystem path. This caused getCachedVersion()
to return null, resulting in 'unknown' version display.

Add fallback using process.execPath which correctly points to the actual
binary location, allowing us to walk up and find the package.json.

Fixes #1182
2026-01-28 15:54:17 +09:00
YeonGyu-Kim
995b7751af ci(cla): add repository owner to CLA allowlist (#1195)
The repository owner (code-yeongyu) was not in the CLA allowlist,
causing CLA signature requirement on their own PRs.

Added code-yeongyu to the allowlist to skip CLA for owner commits.

Co-authored-by: 김연규 <yeongyu@mengmotaMacbookAir.local>
2026-01-28 15:46:42 +09:00
justsisyphus
5087788f66 ci: split test execution to prevent mock.module pollution 2026-01-28 15:06:32 +09:00
justsisyphus
19524c8a27 ci: run tests sequentially to prevent mock.module pollution 2026-01-28 14:59:26 +09:00
justsisyphus
fbb4d46945 fix: explicit reset in mainSessionID test for parallel test safety 2026-01-28 14:40:15 +09:00
justsisyphus
5dc8d577a4 fix: add afterEach cleanup in session-state tests for parallel test isolation 2026-01-28 14:36:58 +09:00
justsisyphus
c249763d7e fix: reset sessionAgentMap in _resetForTesting for test isolation
- Add sessionAgentMap.clear() to _resetForTesting()
- Prevents test pollution when tests run in parallel in CI
2026-01-28 14:33:14 +09:00
justsisyphus
b2d618e851 fix: mock provider cache in delegate-task tests for CI stability
- Add spyOn for readConnectedProvidersCache to return connected providers
- Tests now work consistently regardless of actual provider cache state
- Fixes CI failures for category variant and unstable agent tests
2026-01-28 14:27:34 +09:00
justsisyphus
6f348a8a5c fix: resolve CI test timeouts with configurable timing
- Add timing.ts module for test-only timing configuration
- Replace hardcoded wait times with getTimingConfig()
- Enable all previously skipped tests (ralph-loop, session-state, delegate-task)
- Tests now complete in ~2s instead of timing out
2026-01-28 14:17:56 +09:00
justsisyphus
1da0adcbe8 feat(index): add provider cache missing warning toast
Show warning toast when hasConnectedProvidersCache() returns false,
indicating model filtering is disabled. Prompts user to restart
OpenCode for full functionality.
2026-01-28 13:31:11 +09:00
justsisyphus
8a9d966a3d fix(model-resolver): skip fallback chain when no cache exists
When no provider cache exists, skip the fallback chain entirely and let
OpenCode use Provider.defaultModel() as the final fallback. This prevents
incorrect model selection when the plugin loads before providers connect.

- Remove forced first-entry fallback when no cache
- Add log messages for cache miss scenarios
- Update tests for new behavior
2026-01-28 13:31:03 +09:00
justsisyphus
76f8c500cb fix(config): add 'dev-browser' to BrowserAutomationProviderSchema
Config validation was failing when 'dev-browser' was set as the browser
automation provider, causing the entire config to be rejected. This
silently disabled all config options including tmux.enabled.

- Add 'dev-browser' as valid option in BrowserAutomationProviderSchema
- Update JSDoc with dev-browser description
- Regenerate JSON schema
2026-01-28 12:05:20 +09:00
github-actions[bot]
388516bcc5 @agno01 has signed the CLA in code-yeongyu/oh-my-opencode#1188 2026-01-28 01:02:15 +00:00
github-actions[bot]
8dff875929 @zycaskevin has signed the CLA in code-yeongyu/oh-my-opencode#1184 2026-01-27 16:20:49 +00:00
github-actions[bot]
966cc90a02 release: v3.1.3 2026-01-27 16:12:43 +00:00
justsisyphus
1d27d78127 test: skip flaky sync variant test (CI timeout) 2026-01-28 01:07:14 +09:00
justsisyphus
38156d49f3 ci: use find/xargs to exclude mock-heavy test files 2026-01-28 01:01:45 +09:00
justsisyphus
897eea0263 ci: isolate mock-heavy test files to prevent parallel pollution 2026-01-28 01:00:17 +09:00
justsisyphus
9b59ef66e4 test: fix flaky tests caused by mock.module pollution across parallel test files 2026-01-28 00:54:20 +09:00
github-actions[bot]
0d938059f9 @moha-abdi has signed the CLA in code-yeongyu/oh-my-opencode#1179 2026-01-27 12:36:31 +00:00
github-actions[bot]
9d35f23725 @MoerAI has signed the CLA in code-yeongyu/oh-my-opencode#1172 2026-01-27 09:31:52 +00:00
justsisyphus
aa1646f82c fix(delegate-task): pass variant as top-level field in prompt body
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-27 17:54:58 +09:00
justsisyphus
e47ab084fd fix(keyword-detector): skip ultrawork injection for planner agents
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-27 17:54:52 +09:00
justsisyphus
baf6358736 fix(background-agent): pass variant as top-level field in prompt body 2026-01-27 16:49:03 +09:00
justsisyphus
488c89156b test(config-handler): add tests for plan demote and prometheus mode 2026-01-27 16:06:03 +09:00
justsisyphus
c4957a469d fix(prometheus): set mode to 'all' and restore plan demote logic
- Change prometheus mode from 'primary' to 'all' to allow delegate_task calls
- Restore plan agent demote logic to use prometheus config as base
- Revert d481c596 changes that broke plan agent inheritance
2026-01-27 15:57:45 +09:00
justsisyphus
d481c596bd fix(plan-agent): only inherit model from prometheus as fallback
Plan agent was incorrectly inheriting prometheus's entire config (prompt,
permission, etc.) causing it to behave as primary instead of subagent.

Now plan agent:
1. Uses plan config model if explicitly set
2. Falls back to prometheus model only if plan config has no model
3. Keeps original OpenCode plan config intact
2026-01-27 15:18:28 +09:00
justsisyphus
655d511294 Revert "docs: add v2.x to v3.x migration guide (#1057)"
This PR was incorrectly merged by AI agent without proper project owner review.

This reverts commit 1cb6b3de39a49acb43b76ac55a5b44b47ca4a9f7.
2026-01-27 14:09:37 +09:00
justsisyphus
7dedd6cf90 Revert "Add oh-my-opencode-slim (#1100)"
This PR was incorrectly merged by AI agent without proper project owner review.

The AI evaluated this as 'ULTRA SAFE' because it only modified README files,
but failed to recognize that adding external fork promotions to the project
README requires explicit project owner approval - not just technical safety.

This reverts commit 912a56db85.
2026-01-27 14:09:18 +09:00
justsisyphus
bd18f231f5 feat(sisyphus): add foundation schemas for tasks and swarm (Wave 1)
- Add SisyphusTasksConfig and SisyphusSwarmConfig to schema.ts
- Create Task JSON schema with Zod validation
- Create Mailbox IPC protocol message schemas
- Add storage utilities with Claude Code path compatibility
- 25 tests passing
2026-01-27 13:07:09 +09:00
justsisyphus
de439edc22 feat(subagent): block question tool at both SDK and hook level
- Add permission: [{ permission: 'question', action: 'deny' }] to session.create()
  in background-agent and delegate-task for SDK-level blocking
- Add subagent-question-blocker hook as backup layer to intercept question tool
  calls in tool.execute.before event
- Ensures subagents cannot ask questions to users and must work autonomously
2026-01-27 13:07:09 +09:00
github-actions[bot]
04500bae7d @code-yeongyu has signed the CLA in code-yeongyu/oh-my-opencode#1100 2026-01-27 02:59:24 +00:00
Sisyphus
1cb6b3de7d docs: add v2.x to v3.x migration guide (#1057)
Comprehensive migration guide covering:
- TL;DR quick upgrade section for most users
- What's new in v3.x (Atlas, Prometheus, categories, skills)
- Breaking changes checklist (high/medium/low impact)
- Step-by-step upgrade path
- Configuration changes (categories, permissions)
- API changes for plugin developers
- Troubleshooting common issues
- Complete agent and category reference

Consulted Oracle for migration guide strategy and structure.

Closes #1034 (item 4)

Co-authored-by: sisyphus-dev-ai <sisyphus-dev-ai@users.noreply.github.com>
2026-01-27 11:59:15 +09:00
Alvin
912a56db85 Add oh-my-opencode-slim (#1100) 2026-01-27 11:59:12 +09:00
itsmylife44
a5d9929c0a feat: support OPENCODE_SERVER_PORT and OPENCODE_SERVER_HOSTNAME env vars (#1157)
Add support for customizing the OpenCode server port and hostname via
environment variables. This enables orchestration tools like Open Agent
to run multiple concurrent missions without port conflicts.

Environment variables:
- OPENCODE_SERVER_PORT: Custom port for the OpenCode server
- OPENCODE_SERVER_HOSTNAME: Custom hostname for the OpenCode server

When running oh-my-opencode in parallel (e.g., multiple missions in
Open Agent), each instance can now use a unique port to avoid conflicts
with the default port 4096.
2026-01-27 11:59:10 +09:00
vmlinuzx
7f43f160b5 docs: clarify category model resolution priority and fallback behavior (#1074)
The previous documentation implied that categories automatically use their
built-in default models (e.g., Gemini for visual, GPT-5.2 for ultrabrain).

This was misleading. Categories only use built-in defaults if explicitly
configured. Otherwise, they fall back to the system default model.

Changes:
- Add explicit warning about model resolution priority
- Document all 7 built-in categories (was only showing 2)
- Show complete example config with all categories
- Explain the wasteful fallback scenario
- Add 'variant' to supported category options

Fixes confusion where users expect optimized model selection but get
system default for all unconfigured categories.

Co-authored-by: DC <vmlinux@p16.tailnet.freeflight.co>
2026-01-27 11:58:59 +09:00
0ln
af67bc8592 fix(mcp): add optional Context7 Authorization header (#1133)
Context7 should mirror `websearch` by only sending auth when
`CONTEXT7_API_KEY` is set.

Change: set bearer auth in `headers` using `CONTEXT7_API_KEY` if said environment variable is set, otherwise leave `headers` to `undefined`.
2026-01-27 11:58:55 +09:00
Peter Rallojay
c74d79e28a fix: prevent builtin MCPs from overwriting user MCP configs (#956) 2026-01-27 11:58:42 +09:00
justsisyphus
fc5298d778 feat(workflow): add ZAI Coding + OpenAI provider for sisyphus-agent
- Add zai-coding-plan provider with GLM 4.7 and GLM 4.6v models
- Add OpenAI provider with GPT-5.2 models
- Configure unspecified-low category to use zai-coding-plan/glm-4.7
- Auth is provided via OPENCODE_AUTH_JSON secret
2026-01-27 10:51:24 +09:00
justsisyphus
3e8e3db961 feat(prompts): enhance plan output with TL;DR, agent profiles, and parallelization
- prometheus-prompt: Add TL;DR section with quick summary, deliverables, effort estimate
- prometheus-prompt: Add recommended agent profile (category + skills) per task
- prometheus-prompt: Enhance parallelization with execution waves and dependency matrix
- ultrawork: Change plan agent to prometheus agent invocation
- ultrawork: Add session_id resume workflow for Prometheus iteration
2026-01-27 10:50:38 +09:00
justsisyphus
6fa5cac616 fix(compaction): preserve agent verification state (#1144) 2026-01-27 10:35:20 +09:00
justsisyphus
158ccabf24 fix(notification): prevent false positive plugin detection (#1148) 2026-01-27 10:35:20 +09:00
justsisyphus
2efbf2650f fix(cli): add baseline builds for non-AVX2 CPUs (#1154) 2026-01-27 10:35:20 +09:00
justsisyphus
acded4ba2a fix(delegate-task): add clear error when model not configured (#1139) 2026-01-27 10:35:20 +09:00
github-actions[bot]
911e43445f @ghtndl has signed the CLA in code-yeongyu/oh-my-opencode#1158 2026-01-27 01:27:26 +00:00
sisyphus-dev-ai
3049e1ebfb chore: changes by sisyphus-dev-ai 2026-01-27 01:10:31 +00:00
github-actions[bot]
62921b9e44 release: v3.1.2 2026-01-27 01:07:09 +00:00
github-actions[bot]
cd23f7ab7d release: v3.1.1 2026-01-26 23:48:28 +00:00
justsisyphus
518dceac72 Revert "feat(librarian): conditionally enable thinking based on model type"
This reverts commit f033b30549a396db90e148756130cddec1fcdb2b.
2026-01-27 08:39:45 +09:00
justsisyphus
19f43e30c8 feat(librarian): conditionally enable thinking based on model type
- Add isGeminiModel helper to detect Gemini models
- Disable thinking config for Gemini models (not supported)
- Enable thinking with 32000 token budget for other models
- Add tests verifying both Gemini and Claude behavior

🤖 Generated with assistance of OhMyOpenCode
2026-01-27 08:39:45 +09:00
justsisyphus
b3be9f33c6 feat(ultrawork): enforce plan agent invocation and parallel delegation
- Add MANDATORY section for delegate_task(subagent_type='plan') at top of ultrawork prompt
- Establish 'DELEGATE by default, work yourself only when trivial' principle
- Add parallel execution rules with anti-pattern and correct pattern examples
- Remove emoji (checkmark/cross) from PLAN_AGENT_SYSTEM_PREPEND
- Restructure workflow into clear 4-step sequence
2026-01-27 08:39:45 +09:00
github-actions[bot]
430098856a @itsmylife44 has signed the CLA in code-yeongyu/oh-my-opencode#1157 2026-01-26 23:20:52 +00:00
github-actions[bot]
5932f5f94f @acamq has signed the CLA in code-yeongyu/oh-my-opencode#1151 2026-01-26 18:20:30 +00:00
github-actions[bot]
fcf2e32071 @craftaholic has signed the CLA in code-yeongyu/oh-my-opencode#1110 2026-01-26 16:12:39 +00:00
github-actions[bot]
19827dac70 @orientpine has signed the CLA in code-yeongyu/oh-my-opencode#1145 2026-01-26 14:30:44 +00:00
github-actions[bot]
3ed1c6644e @Jeremy-Kr has signed the CLA in code-yeongyu/oh-my-opencode#1141 2026-01-26 11:59:22 +00:00
justsisyphus
cf6e714946 feat(plan-agent): apply prometheus config to plan agent with fallback chain
- Add prometheus model fallback chain (claude-opus-4-5 → gpt-5.2 → gemini-3-pro)
- Plan agent now inherits prometheus settings (model, prompt, permission, variant)
- Plan agent mode remains 'subagent' while using prometheus config
- Add name field to prometheus config to fix agent.name undefined error
2026-01-26 18:31:48 +09:00
justsisyphus
383f43548b feat(plan-agent): enforce dependency/parallel graphs and category+skill recommendations
Add mandatory sections to PLAN_AGENT_SYSTEM_PREPEND:
- Task Dependency Graph with blockers/dependents/reasons
- Parallel Execution Graph with wave structure
- Category + Skills recommendations per task
- Response format specification with exact structure

Uses ASCII art banners and visual emphasis for critical requirements.
2026-01-26 18:31:35 +09:00
justsisyphus
26b1c67964 fix(background-agent): disable question tool for background tasks 2026-01-26 18:25:06 +09:00
justsisyphus
7e065dfe12 feat(delegate-task): prepend system prompt for plan agent invocations
When plan agent (plan/prometheus/planner) is invoked via delegate_task,
automatically prepend a <system> prompt instructing the agent to:
- Launch explore/librarian agents in background to gather context
- Summarize user request and list uncertainties
- Ask clarifying questions until requirements are 100% clear
2026-01-26 18:25:06 +09:00
justsisyphus
8429da02b8 feat(config): add thinking/reasoningEffort/providerOptions to AgentOverrideConfigSchema
- Add maxTokens, thinking, reasoningEffort, textVerbosity, providerOptions fields to AgentOverrideConfigSchema
- Update think-mode hook to respect agent-level thinking settings (disabled or custom providerOptions)
- Add tests for agent-level thinking configuration override behavior
2026-01-26 18:25:06 +09:00
github-actions[bot]
ab51f5d39f @boguan has signed the CLA in code-yeongyu/oh-my-opencode#1137 2026-01-26 08:46:14 +00:00
justsisyphus
3ee519c7b0 feat: make systemDefaultModel optional for OpenCode fallback (#1136)
- Remove mandatory model requirement from plugin initialization
- Allow OpenCode to use its built-in model fallback when user doesn't specify
- Update model-resolver to handle undefined systemDefaultModel
- Remove throw errors in config-handler, utils, atlas, delegate-task
- Add tests for optional model scenarios

Closes #1129

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-26 17:01:08 +09:00
justsisyphus
c9b86b7815 test(cli): add version display test to verify package.json reading (#1134)
Closes #1063

Investigation findings:
- The CLI code correctly reads version from package.json
- The reported issue (bunx showing old version) is a caching issue
- Added test to ensure version is read as valid semver from package.json

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-26 17:00:55 +09:00
github-actions[bot]
9b6d8f629a @misyuari has signed the CLA in code-yeongyu/oh-my-opencode#1132 2026-01-26 07:31:12 +00:00
justsisyphus
6a2f43858a docs: add server mode and shell function examples for tmux integration
- Add --port flag requirement for tmux subagent pane spawning
- Add Fish shell function example with automatic port allocation
- Add Bash/Zsh equivalent function example
- Document how subagent panes work (opencode attach flow)
- Add OPENCODE_PORT environment variable documentation
- Add server mode reference section with opencode serve command
2026-01-26 16:24:14 +09:00
justsisyphus
601ea32a1c docs: add tmux integration and interactive terminal documentation
- Add Tmux Integration section to configurations.md with all config options
- Add Visual Multi-Agent with Tmux subsection to features.md
- Add Interactive Terminal Tools section documenting interactive_bash tool
2026-01-26 16:02:34 +09:00
github-actions[bot]
8f31211c75 release: v3.1.0 2026-01-26 06:46:47 +00:00
justsisyphus
04f2b513c6 feat(tmux-subagent): add replace action to prevent mass eviction
- Add column-based splittable calculation (getColumnCount, getColumnWidth)
- New decision tree: splittable → split, k=1 eviction → close+spawn, else → replace
- Add 'replace' action type using tmux respawn-pane (preserves layout)
- Replace oldest pane in-place instead of closing all panes when unsplittable
- Prevents scenario where all agent panes get closed leaving only 1
2026-01-26 15:25:11 +09:00
justsisyphus
8ebc933118 fix(tmux-subagent): enable 2D grid layout with divider-aware calculations
- Account for tmux pane dividers (1 char) in all size calculations
- Reduce MIN_PANE_WIDTH from 53 to 52 to fit 2 columns in standard terminals
- Fix enforceMainPaneWidth to use (windowWidth - divider) / 2
- Add virtual mainPane handling for close-spawn eviction loop
- Add comprehensive decision-engine tests (23 test cases)
2026-01-26 15:11:16 +09:00
justsisyphus
a67a35aea8 docs: regenerate AGENTS.md knowledge base via /init-deep 2026-01-26 14:56:55 +09:00
justsisyphus
9d66b80709 feat(hooks): add active working context section to compaction summary
Include files, code in progress, external references, and state/variables
in compaction summary for seamless continuation after context compaction.
2026-01-26 14:23:05 +09:00
justsisyphus
5c7eb02d5b chore(test): sync agent name casing in tests (#1128)
Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-26 12:10:30 +09:00
justsisyphus
68aa913499 refactor(tmux-subagent): state-first architecture with decision engine (#1125)
* refactor(tmux-subagent): add state-first architecture with decision engine

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(tmux): add pane spawn callbacks for background and sync sessions

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-26 12:02:37 +09:00
justsisyphus
3a79b8761b feat(shared): add connected-providers-cache for model availability (#1121)
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-26 11:53:41 +09:00
justsisyphus
da416b362b feat(hooks): add category-skill-reminder hook (#1123)
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-26 11:48:32 +09:00
justsisyphus
90054b28ad chore(docs): regenerate AGENTS.md knowledge base (#1118)
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-26 11:48:30 +09:00
justsisyphus
892b245779 fix(test): update builtin skills count from 3 to 4 (#1126)
* fix(test): update builtin skills count from 3 to 4 (dev-browser added)

* chore(ci): add block-master-pr workflow

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-26 02:29:28 +00:00
YeonGyu-Kim
aead4aebd2 Add tmux pane management for background agent sessions (#1094)
* feat(config): add TmuxConfigSchema for tmux subagent pane management

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(shared): add tmux module structure

* feat(shared/tmux): implement tmux pane utilities

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* test(tmux-subagent): add TmuxSessionManager tests (TDD RED)

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(tmux-subagent): implement TmuxSessionManager

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(integration): wire TmuxSessionManager with 500ms delay

- Task 5: Add 500ms delay in BackgroundManager after session creation
- Task 6: Wire TmuxSessionManager event handlers (session.created/deleted)
- Both changes integrate tmux pane management into plugin lifecycle

Co-authored-by: Sisyphus <ultrawork@oh-my-opencode>

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Co-authored-by: Sisyphus <ultrawork@oh-my-opencode>
2026-01-25 15:34:10 +09:00
YeonGyu-Kim
bccc943173 feat(skills): add dev-browser skill with Windows support (#1093)
* feat(skills): add dev-browser skill with Windows support

* chore: trigger CI
2026-01-25 15:34:07 +09:00
justsisyphus
05904ca617 docs(agent-browser): add detailed installation guide with Playwright troubleshooting 2026-01-25 15:12:32 +09:00
YeonGyu-Kim
3af30b0a21 feat(skills): add agent-browser option for browser automation (#1090)
Add configurable browser automation allowing users to choose between
Playwright MCP (default) and Vercel's agent-browser CLI.

Changes:
- Add browser_automation_engine.provider config option
- Dynamic skill loading based on provider selection
- Comprehensive agent-browser CLI reference (inline in skills.ts)
- Propagate browserProvider to delegate_task and buildAgent
- Update documentation with provider comparison

Co-authored-by: Suyeol Jeon <devxoul@gmail.com>
Co-authored-by: YeonGyu Kim <code.yeongyu@gmail.com>
2026-01-25 15:02:41 +09:00
YeonGyu-Kim
b55fd8d76f feat(explore): add github-copilot/gpt-5-mini to fallback chain (#1091)
* feat(explore): add github-copilot/gpt-5-mini to fallback chain

* test(explore): add tests for github-copilot/gpt-5-mini fallback

---------

Co-authored-by: Suyeol Jeon <devxoul@gmail.com>
2026-01-25 05:53:11 +00:00
Sisyphus
208af055ef fix: generate skill/slashcommand descriptions synchronously when pre-provided (#1087)
* fix: generate skill/slashcommand tool descriptions synchronously when pre-provided

When skills are passed via options (pre-resolved), build the tool description
synchronously instead of fire-and-forget async. This eliminates the race
condition where the description getter returns the bare prefix before the
async cache-warming microtask completes.

Fixes #1039

* chore: changes by sisyphus-dev-ai

---------

Co-authored-by: sisyphus-dev-ai <sisyphus-dev-ai@users.noreply.github.com>
2026-01-25 14:52:50 +09:00
YeonGyu-Kim
0aa8f486af feat(hooks): add sisyphus-junior-notepad hook for conditional notepad rules injection (#1092)
* refactor(shared): extract isCallerOrchestrator to session-utils

* refactor(atlas): use shared isCallerOrchestrator, change to prepend

* refactor(prometheus-md-only): change to prepend pattern

* refactor(sisyphus-junior): remove Work_Context (moved to hook)

* feat(hooks): add sisyphus-junior-notepad hook

* fix(shared): replace dynamic require with static import in session-utils

- Change from dynamic require to static import for better bundler compatibility
- Fix import path: ../../features -> ../features
- Add barrel export to src/shared/index.ts

* feat(hooks): register sisyphus-junior-notepad hook

- Add to HookNameSchema in schema.ts
- Export from hooks/index.ts
- Register with isHookEnabled in index.ts
- Auto-generated schema.json update

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-25 14:52:11 +09:00
201 changed files with 16901 additions and 1184 deletions

View File

@@ -14,6 +14,8 @@ body:
label: Prerequisites
description: Please confirm the following before submitting
options:
- label: I will write this issue in English (see our [Language Policy](https://github.com/code-yeongyu/oh-my-opencode/blob/dev/CONTRIBUTING.md#language-policy))
required: true
- label: I have searched existing issues to avoid duplicates
required: true
- label: I am using the latest version of oh-my-opencode

View File

@@ -14,6 +14,8 @@ body:
label: Prerequisites
description: Please confirm the following before submitting
options:
- label: I will write this issue in English (see our [Language Policy](https://github.com/code-yeongyu/oh-my-opencode/blob/dev/CONTRIBUTING.md#language-policy))
required: true
- label: I have searched existing issues and discussions to avoid duplicates
required: true
- label: This feature request is specific to oh-my-opencode (not OpenCode core)

View File

@@ -14,6 +14,8 @@ body:
label: Prerequisites
description: Please confirm the following before submitting
options:
- label: I will write this issue in English (see our [Language Policy](https://github.com/code-yeongyu/oh-my-opencode/blob/dev/CONTRIBUTING.md#language-policy))
required: true
- label: I have searched existing issues and discussions
required: true
- label: I have read the [documentation](https://github.com/code-yeongyu/oh-my-opencode#readme)

View File

@@ -4,13 +4,32 @@ on:
push:
branches: [master, dev]
pull_request:
branches: [dev]
branches: [master, dev]
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
# Block PRs targeting master branch
block-master-pr:
runs-on: ubuntu-latest
if: github.event_name == 'pull_request'
steps:
- name: Check PR target branch
run: |
if [ "${{ github.base_ref }}" = "master" ]; then
echo "::error::PRs to master branch are not allowed. Please target the 'dev' branch instead."
echo ""
echo "PULL REQUESTS TO MASTER ARE BLOCKED"
echo ""
echo "All PRs must target the 'dev' branch."
echo "Please close this PR and create a new one targeting 'dev'."
exit 1
else
echo "PR targets '${{ github.base_ref }}' branch - OK"
fi
test:
runs-on: ubuntu-latest
steps:
@@ -25,8 +44,34 @@ jobs:
env:
BUN_INSTALL_ALLOW_SCRIPTS: "@ast-grep/napi"
- name: Run tests
run: bun test
- name: Run mock-heavy tests (isolated)
run: |
# These files use mock.module() which pollutes module cache
# Run them in separate processes to prevent cross-file contamination
bun test src/plugin-handlers
bun test src/hooks/atlas
bun test src/hooks/compaction-context-injector
bun test src/features/tmux-subagent
- name: Run remaining tests
run: |
# Run all other tests (mock-heavy ones are re-run but that's acceptable)
bun test bin script src/cli src/config src/mcp src/index.test.ts \
src/agents src/tools src/shared \
src/hooks/anthropic-context-window-limit-recovery \
src/hooks/claude-code-compatibility \
src/hooks/context-injection \
src/hooks/provider-toast \
src/hooks/session-notification \
src/hooks/sisyphus \
src/hooks/todo-continuation-enforcer \
src/features/background-agent \
src/features/builtin-commands \
src/features/builtin-skills \
src/features/claude-code-session-state \
src/features/hook-message-injector \
src/features/opencode-skill-loader \
src/features/skill-mcp-manager
typecheck:
runs-on: ubuntu-latest

View File

@@ -25,7 +25,7 @@ jobs:
path-to-signatures: 'signatures/cla.json'
path-to-document: 'https://github.com/code-yeongyu/oh-my-opencode/blob/master/CLA.md'
branch: 'dev'
allowlist: bot*,dependabot*,github-actions*,*[bot],sisyphus-dev-ai
allowlist: code-yeongyu,bot*,dependabot*,github-actions*,*[bot],sisyphus-dev-ai
custom-notsigned-prcomment: |
Thank you for your contribution! Before we can merge this PR, we need you to sign our [Contributor License Agreement (CLA)](https://github.com/code-yeongyu/oh-my-opencode/blob/master/CLA.md).

View File

@@ -45,8 +45,34 @@ jobs:
env:
BUN_INSTALL_ALLOW_SCRIPTS: "@ast-grep/napi"
- name: Run tests
run: bun test
- name: Run mock-heavy tests (isolated)
run: |
# These files use mock.module() which pollutes module cache
# Run them in separate processes to prevent cross-file contamination
bun test src/plugin-handlers
bun test src/hooks/atlas
bun test src/hooks/compaction-context-injector
bun test src/features/tmux-subagent
- name: Run remaining tests
run: |
# Run all other tests (mock-heavy ones are re-run but that's acceptable)
bun test bin script src/cli src/config src/mcp src/index.test.ts \
src/agents src/tools src/shared \
src/hooks/anthropic-context-window-limit-recovery \
src/hooks/claude-code-compatibility \
src/hooks/context-injection \
src/hooks/provider-toast \
src/hooks/session-notification \
src/hooks/sisyphus \
src/hooks/todo-continuation-enforcer \
src/features/background-agent \
src/features/builtin-commands \
src/features/builtin-skills \
src/features/claude-code-session-state \
src/features/hook-message-injector \
src/features/opencode-skill-loader \
src/features/skill-mcp-manager
typecheck:
runs-on: ubuntu-latest

View File

@@ -152,6 +152,41 @@ jobs:
"limit": { "context": 200000, "output": 64000 }
}
}
} |
.provider["zai-coding-plan"] = {
"name": "Z.AI Coding Plan",
"npm": "@ai-sdk/openai-compatible",
"options": {
"baseURL": "https://api.z.ai/api/paas/v4"
},
"models": {
"glm-4.7": {
"id": "glm-4.7",
"name": "GLM 4.7",
"limit": { "context": 128000, "output": 16000 }
},
"glm-4.6v": {
"id": "glm-4.6v",
"name": "GLM 4.6 Vision",
"limit": { "context": 128000, "output": 16000 }
}
}
} |
.provider.openai = {
"name": "OpenAI",
"npm": "@ai-sdk/openai",
"models": {
"gpt-5.2": {
"id": "gpt-5.2",
"name": "GPT-5.2",
"limit": { "context": 128000, "output": 16000 }
},
"gpt-5.2-codex": {
"id": "gpt-5.2-codex",
"name": "GPT-5.2 Codex",
"limit": { "context": 128000, "output": 32000 }
}
}
}
' "$OPENCODE_JSON" > /tmp/oc.json && mv /tmp/oc.json "$OPENCODE_JSON"
@@ -287,6 +322,9 @@ jobs:
)
jq --arg append "$PROMPT_APPEND" '.agents.Sisyphus.prompt_append = $append' "$OMO_JSON" > /tmp/omo.json && mv /tmp/omo.json "$OMO_JSON"
# Add categories configuration for unspecified-low to use GLM 4.7
jq '.categories["unspecified-low"] = { "model": "zai-coding-plan/glm-4.7" }' "$OMO_JSON" > /tmp/omo.json && mv /tmp/omo.json "$OMO_JSON"
mkdir -p ~/.local/share/opencode
echo "$OPENCODE_AUTH_JSON" > ~/.local/share/opencode/auth.json
chmod 600 ~/.local/share/opencode/auth.json

1
.gitignore vendored
View File

@@ -33,3 +33,4 @@ yarn.lock
test-injection/
notepad.md
oauth-success.html
.188e87dbff6e7fd9-00000000.bun-build

View File

@@ -1,12 +1,24 @@
# PROJECT KNOWLEDGE BASE
**Generated:** 2026-01-25T13:10:00+09:00
**Commit:** 043b1a33
**Generated:** 2026-01-26T14:50:00+09:00
**Commit:** 9d66b807
**Branch:** dev
---
## **IMPORTANT: PULL REQUEST TARGET BRANCH**
> **ALL PULL REQUESTS MUST TARGET THE `dev` BRANCH.**
>
> **DO NOT CREATE PULL REQUESTS TARGETING `master` BRANCH.**
>
> PRs to `master` will be automatically rejected by CI.
---
## OVERVIEW
OpenCode plugin: multi-model agent orchestration (Claude Opus 4.5, GPT-5.2, Gemini 3 Flash, Grok Code, GLM-4.7). 31 lifecycle hooks, 20+ tools (LSP, AST-Grep, delegation), 10 specialized agents, full Claude Code compatibility. "oh-my-zsh" for OpenCode.
OpenCode plugin: multi-model agent orchestration (Claude Opus 4.5, GPT-5.2, Gemini 3 Flash, Grok Code). 32 lifecycle hooks, 20+ tools (LSP, AST-Grep, delegation), 10 specialized agents, full Claude Code compatibility. "oh-my-zsh" for OpenCode.
## STRUCTURE
@@ -14,14 +26,14 @@ OpenCode plugin: multi-model agent orchestration (Claude Opus 4.5, GPT-5.2, Gemi
oh-my-opencode/
├── src/
│ ├── agents/ # 10 AI agents - see src/agents/AGENTS.md
│ ├── hooks/ # 31 lifecycle hooks - see src/hooks/AGENTS.md
│ ├── hooks/ # 32 lifecycle hooks - see src/hooks/AGENTS.md
│ ├── tools/ # 20+ tools - see src/tools/AGENTS.md
│ ├── features/ # Background agents, Claude Code compat - see src/features/AGENTS.md
│ ├── shared/ # 50 cross-cutting utilities - see src/shared/AGENTS.md
│ ├── shared/ # 55 cross-cutting utilities - see src/shared/AGENTS.md
│ ├── cli/ # CLI installer, doctor - see src/cli/AGENTS.md
│ ├── mcp/ # Built-in MCPs - see src/mcp/AGENTS.md
│ ├── config/ # Zod schema, TypeScript types
│ └── index.ts # Main plugin entry (601 lines)
│ └── index.ts # Main plugin entry (672 lines)
├── script/ # build-schema.ts, build-binaries.ts
├── packages/ # 7 platform-specific binaries
└── dist/ # Build output (ESM + .d.ts)
@@ -38,8 +50,8 @@ oh-my-opencode/
| Add skill | `src/features/builtin-skills/` | Create dir with SKILL.md |
| Add command | `src/features/builtin-commands/` | Add template + register in commands.ts |
| Config schema | `src/config/schema.ts` | Zod schema, run `bun run build:schema` |
| Background agents | `src/features/background-agent/` | manager.ts (1335 lines) |
| Orchestrator | `src/hooks/atlas/` | Main orchestration hook (773 lines) |
| Background agents | `src/features/background-agent/` | manager.ts (1377 lines) |
| Orchestrator | `src/hooks/atlas/` | Main orchestration hook (752 lines) |
## TDD (Test-Driven Development)
@@ -51,8 +63,8 @@ oh-my-opencode/
**Rules:**
- NEVER write implementation before test
- NEVER delete failing tests - fix the code
- Test file: `*.test.ts` alongside source
- BDD comments: `#given`, `#when`, `#then`
- Test file: `*.test.ts` alongside source (100 test files)
- BDD comments: `//#given`, `//#when`, `//#then`
## CONVENTIONS
@@ -61,7 +73,7 @@ oh-my-opencode/
- **Build**: `bun build` (ESM) + `tsc --emitDeclarationOnly`
- **Exports**: Barrel pattern via index.ts
- **Naming**: kebab-case dirs, `createXXXHook`/`createXXXTool` factories
- **Testing**: BDD comments, 95 test files
- **Testing**: BDD comments, 100 test files
- **Temperature**: 0.1 for code agents, max 0.3
## ANTI-PATTERNS
@@ -100,7 +112,7 @@ oh-my-opencode/
bun run typecheck # Type check
bun run build # ESM + declarations + schema
bun run rebuild # Clean + Build
bun test # 95 test files
bun test # 100 test files
```
## DEPLOYMENT
@@ -114,16 +126,14 @@ bun test # 95 test files
| File | Lines | Description |
|------|-------|-------------|
| `src/features/background-agent/manager.ts` | 1335 | Task lifecycle, concurrency |
| `src/features/builtin-skills/skills.ts` | 1203 | Skill definitions |
| `src/features/builtin-skills/skills.ts` | 1729 | Skill definitions |
| `src/features/background-agent/manager.ts` | 1377 | Task lifecycle, concurrency |
| `src/agents/prometheus-prompt.ts` | 1196 | Planning agent |
| `src/tools/delegate-task/tools.ts` | 1039 | Category-based delegation |
| `src/hooks/atlas/index.ts` | 773 | Orchestrator hook |
| `src/tools/delegate-task/tools.ts` | 1070 | Category-based delegation |
| `src/hooks/atlas/index.ts` | 752 | Orchestrator hook |
| `src/cli/config-manager.ts` | 664 | JSONC config parsing |
| `src/index.ts` | 672 | Main plugin entry |
| `src/features/builtin-commands/templates/refactor.ts` | 619 | Refactor command template |
| `src/index.ts` | 601 | Main plugin entry |
| `src/tools/lsp/client.ts` | 596 | LSP JSON-RPC client |
| `src/agents/atlas.ts` | 572 | Atlas orchestrator agent |
## MCP ARCHITECTURE

View File

@@ -38,6 +38,7 @@
"type": "string",
"enum": [
"playwright",
"agent-browser",
"frontend-ui-ux",
"git-master"
]
@@ -70,12 +71,14 @@
"interactive-bash-session",
"thinking-block-validator",
"ralph-loop",
"category-skill-reminder",
"compaction-context-injector",
"claude-code-hooks",
"auto-slash-command",
"edit-error-recovery",
"delegate-task-retry",
"prometheus-md-only",
"sisyphus-junior-notepad",
"start-work",
"atlas"
]
@@ -217,6 +220,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -343,6 +391,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -469,6 +562,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -595,6 +733,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -721,6 +904,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -847,6 +1075,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -973,6 +1246,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1099,6 +1417,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1225,6 +1588,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1351,6 +1759,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1477,6 +1930,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1603,6 +2101,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
},
@@ -1729,6 +2272,51 @@
]
}
}
},
"maxTokens": {
"type": "number"
},
"thinking": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"enabled",
"disabled"
]
},
"budgetTokens": {
"type": "number"
}
},
"required": [
"type"
]
},
"reasoningEffort": {
"type": "string",
"enum": [
"low",
"medium",
"high",
"xhigh"
]
},
"textVerbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
"providerOptions": {
"type": "object",
"propertyNames": {
"type": "string"
},
"additionalProperties": {}
}
}
}
@@ -2171,6 +2759,100 @@
"type": "boolean"
}
}
},
"browser_automation_engine": {
"type": "object",
"properties": {
"provider": {
"default": "playwright",
"type": "string",
"enum": [
"playwright",
"agent-browser",
"dev-browser"
]
}
}
},
"tmux": {
"type": "object",
"properties": {
"enabled": {
"default": false,
"type": "boolean"
},
"layout": {
"default": "main-vertical",
"type": "string",
"enum": [
"main-horizontal",
"main-vertical",
"tiled",
"even-horizontal",
"even-vertical"
]
},
"main_pane_size": {
"default": 60,
"type": "number",
"minimum": 20,
"maximum": 80
},
"main_pane_min_width": {
"default": 120,
"type": "number",
"minimum": 40
},
"agent_pane_min_width": {
"default": 40,
"type": "number",
"minimum": 20
}
}
},
"sisyphus": {
"type": "object",
"properties": {
"tasks": {
"type": "object",
"properties": {
"enabled": {
"default": false,
"type": "boolean"
},
"storage_path": {
"default": ".sisyphus/tasks",
"type": "string"
},
"claude_code_compat": {
"default": false,
"type": "boolean"
}
}
},
"swarm": {
"type": "object",
"properties": {
"enabled": {
"default": false,
"type": "boolean"
},
"storage_path": {
"default": ".sisyphus/teams",
"type": "string"
},
"ui_mode": {
"default": "toast",
"type": "string",
"enum": [
"toast",
"tmux",
"both"
]
}
}
}
}
}
}
}

View File

@@ -18,6 +18,7 @@
"jsonc-parser": "^3.3.1",
"picocolors": "^1.1.1",
"picomatch": "^4.0.2",
"vscode-jsonrpc": "^8.2.0",
"zod": "^4.1.8",
},
"devDependencies": {
@@ -27,13 +28,13 @@
"typescript": "^5.7.3",
},
"optionalDependencies": {
"oh-my-opencode-darwin-arm64": "3.0.0",
"oh-my-opencode-darwin-x64": "3.0.0",
"oh-my-opencode-linux-arm64": "3.0.0",
"oh-my-opencode-linux-arm64-musl": "3.0.0",
"oh-my-opencode-linux-x64": "3.0.0",
"oh-my-opencode-linux-x64-musl": "3.0.0",
"oh-my-opencode-windows-x64": "3.0.0",
"oh-my-opencode-darwin-arm64": "3.1.6",
"oh-my-opencode-darwin-x64": "3.1.6",
"oh-my-opencode-linux-arm64": "3.1.6",
"oh-my-opencode-linux-arm64-musl": "3.1.6",
"oh-my-opencode-linux-x64": "3.1.6",
"oh-my-opencode-linux-x64-musl": "3.1.6",
"oh-my-opencode-windows-x64": "3.1.6",
},
},
},
@@ -225,19 +226,19 @@
"object-inspect": ["object-inspect@1.13.4", "", {}, "sha512-W67iLl4J2EXEGTbfeHCffrjDfitvLANg0UlX3wFUUSTx92KXRFegMHUVgSqE+wvhAbi4WqjGg9czysTV2Epbew=="],
"oh-my-opencode-darwin-arm64": ["oh-my-opencode-darwin-arm64@3.0.0", "", { "os": "darwin", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-zelvb7qz5GsS+Dhyz9rACZrkUMtWbAZGijiHSQqmRcjlN/sRPNhXtsL55VheDjlPM3VP+t3+psv+se0WA/aw5w=="],
"oh-my-opencode-darwin-arm64": ["oh-my-opencode-darwin-arm64@3.1.6", "", { "os": "darwin", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-KK+ptnkBigvDYbRtF/B5izEC4IoXDS8mAnRHWFBSCINhzQR2No6AtEcwijd6vKBPR+/r71ofq/8mTsIeb1PEVQ=="],
"oh-my-opencode-darwin-x64": ["oh-my-opencode-darwin-x64@3.0.0", "", { "os": "darwin", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-dRMD1U5zIrb6BsiKQJZtAFtuD8clAQquZyU2LajMoFTHBNhcBDIgsaBBwvMBIq7dTe8rnFq91ExiFA8OfdrzBA=="],
"oh-my-opencode-darwin-x64": ["oh-my-opencode-darwin-x64@3.1.6", "", { "os": "darwin", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-UkPI/RUi7INarFasBUZ4Rous6RUQXsU2nr0V8KFJp+70END43D/96dDUwX+zmPtpDhD+DfWkejuwzqfkZJ2ZDQ=="],
"oh-my-opencode-linux-arm64": ["oh-my-opencode-linux-arm64@3.0.0", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-Wx6Cx2Nu2T69mfZa3FQ3gk0OFONvMh48rMVYK0Cp8VX5W4Zb/GZgTUFmZlYsApyxqP+7J9m18skd46qPOhzuEQ=="],
"oh-my-opencode-linux-arm64": ["oh-my-opencode-linux-arm64@3.1.6", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-gvmvgh7WtTtcHiCbG7z43DOYfY/jrf2S6TX/jBMX2/e1AGkcLKwz30NjGhZxeK5SyzxRVypgfZZK1IuriRgbdA=="],
"oh-my-opencode-linux-arm64-musl": ["oh-my-opencode-linux-arm64-musl@3.0.0", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-mfOlptgLoXLVuhFRcXgZU7BYGuL1axZOMOOjONgncNzOp/BQYU5B9BRFihBUXdDsWGmeMiLowrYGBhVpSv3NlA=="],
"oh-my-opencode-linux-arm64-musl": ["oh-my-opencode-linux-arm64-musl@3.1.6", "", { "os": "linux", "cpu": "arm64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-j3R76pmQ4HGVGFJUMMCeF/1lO3Jg7xFdpcBUKCeFh42N1jMgn1aeyxkAaJYB9RwCF/p6+P8B6gVDLCEDu2mxjA=="],
"oh-my-opencode-linux-x64": ["oh-my-opencode-linux-x64@3.0.0", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-vVjshfaz0UC9NrGD9FfjlYK5NvckIW0sZaE/wRv/LKjrukHFH1jJpJa5KKXxBWLsEJjt6ooJRguXXxtfNXpAWw=="],
"oh-my-opencode-linux-x64": ["oh-my-opencode-linux-x64@3.1.6", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-VDdo0tHCOr5nm7ajd652u798nPNOLRSTcPOnVh6vIPddkZ+ujRke+enOKOw9Pd5e+4AkthqHBwFXNm2VFgnEKg=="],
"oh-my-opencode-linux-x64-musl": ["oh-my-opencode-linux-x64-musl@3.0.0", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-N6cNJ7+Dj0a5dWqPf6OKfB39o8HWw5HQ3hB4omgYqc6Gzo6nChA4KIiVefEC3+tIL98x4XvMeD7OU+UYgwxHnQ=="],
"oh-my-opencode-linux-x64-musl": ["oh-my-opencode-linux-x64-musl@3.1.6", "", { "os": "linux", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode" } }, "sha512-hBG/dhsr8PZelUlYsPBruSLnelB9ocB7H92I+S9svTpDVo67rAmXOoR04twKQ9TeCO4ShOa6hhMhbQnuI8fgNw=="],
"oh-my-opencode-windows-x64": ["oh-my-opencode-windows-x64@3.0.0", "", { "os": "win32", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode.exe" } }, "sha512-TaC0hiHpnsS42GWTVUKoTwCb+QzNLBlQtTkIQ0PjlkDYFjlEC2LuR2FFcscik055PRRIGishyB9A1n/8XAgcvA=="],
"oh-my-opencode-windows-x64": ["oh-my-opencode-windows-x64@3.1.6", "", { "os": "win32", "cpu": "x64", "bin": { "oh-my-opencode": "bin/oh-my-opencode.exe" } }, "sha512-c8Awp03p2DsbS0G589nzveRCeJPgJRJ0vQrha4ChRmmo31Qc5OSmJ5xuMaF8L4nM+/trbTgAQMFMtCMLgtC8IQ=="],
"on-finished": ["on-finished@2.4.1", "", { "dependencies": { "ee-first": "1.1.1" } }, "sha512-oVlzkg3ENAhCk2zdv7IJwd/QUD4z2RxRwpkcGY8psCVcCYZNq4wYnVWALHM+brtuJjePWiYF/ClmuDr8Ch5+kg=="],
@@ -303,6 +304,8 @@
"vary": ["vary@1.1.2", "", {}, "sha512-BNGbWLfd0eUPabhkXUVm0j8uuvREyTh5ovRa/dyow/BqAbZJyC+5fU+IzQOzmAKzYqYRAISoRhdQr3eIZ/PXqg=="],
"vscode-jsonrpc": ["vscode-jsonrpc@8.2.1", "", {}, "sha512-kdjOSJ2lLIn7r1rtrMbbNCHjyMPfRnowdKjBQ+mGq6NAW5QY2bEZC/khaC5OR8svbbjvLEaIXkOq45e2X9BIbQ=="],
"which": ["which@2.0.2", "", { "dependencies": { "isexe": "^2.0.0" }, "bin": { "node-which": "./bin/node-which" } }, "sha512-BLI3Tl1TW3Pvl70l3yq3Y64i+awpwXqsGBYWkkqMtnbXgrMD+yj7rhW0kuEDxzJaYXGjEW5ogapKNMEKNMjibA=="],
"wrappy": ["wrappy@1.0.2", "", {}, "sha512-l4Sp/DRseor9wL6EvV2+TuQn63dMkPjZ/sp9XkghTEbV9KlPS1xUsZ3u7/IQO4wxtcFB4bgpQPRcR3QCvezPcQ=="],

View File

@@ -134,7 +134,41 @@ bunx oh-my-opencode run [prompt]
---
## 6. `auth` - Authentication Management
## 6. `mcp oauth` - MCP OAuth Management
Manages OAuth 2.1 authentication for remote MCP servers.
### Usage
```bash
# Login to an OAuth-protected MCP server
bunx oh-my-opencode mcp oauth login <server-name> --server-url https://api.example.com
# Login with explicit client ID and scopes
bunx oh-my-opencode mcp oauth login my-api --server-url https://api.example.com --client-id my-client --scopes "read,write"
# Remove stored OAuth tokens
bunx oh-my-opencode mcp oauth logout <server-name>
# Check OAuth token status
bunx oh-my-opencode mcp oauth status [server-name]
```
### Options
| Option | Description |
|--------|-------------|
| `--server-url <url>` | MCP server URL (required for login) |
| `--client-id <id>` | OAuth client ID (optional if server supports Dynamic Client Registration) |
| `--scopes <scopes>` | Comma-separated OAuth scopes |
### Token Storage
Tokens are stored in `~/.config/opencode/mcp-oauth.json` with `0600` permissions (owner read/write only). Key format: `{serverHost}/{resource}`.
---
## 7. `auth` - Authentication Management
Manages Google Antigravity OAuth authentication. Required for using Gemini models.
@@ -153,7 +187,7 @@ bunx oh-my-opencode auth status
---
## 7. Configuration Files
## 8. Configuration Files
The CLI searches for configuration files in the following locations (in priority order):
@@ -183,7 +217,7 @@ Configuration files support **JSONC (JSON with Comments)** format. You can use c
---
## 8. Troubleshooting
## 9. Troubleshooting
### "OpenCode version too old" Error
@@ -213,7 +247,7 @@ bunx oh-my-opencode doctor --category authentication
---
## 9. Non-Interactive Mode
## 10. Non-Interactive Mode
Use the `--no-tui` option for CI/CD environments.
@@ -227,7 +261,7 @@ bunx oh-my-opencode doctor --json > doctor-report.json
---
## 10. Developer Information
## 11. Developer Information
### CLI Structure

View File

@@ -85,6 +85,66 @@ When both `oh-my-opencode.jsonc` and `oh-my-opencode.json` files exist, `.jsonc`
**Recommended**: For Google Gemini authentication, install the [`opencode-antigravity-auth`](https://github.com/NoeFabris/opencode-antigravity-auth) plugin (`@latest`). It provides multi-account load balancing, variant-based thinking levels, dual quota system (Antigravity + Gemini CLI), and active maintenance. See [Installation > Google Gemini](docs/guide/installation.md#google-gemini-antigravity-oauth).
## Ollama Provider
**IMPORTANT**: When using Ollama as a provider, you **must** disable streaming to avoid JSON parsing errors.
### Required Configuration
```json
{
"agents": {
"explore": {
"model": "ollama/qwen3-coder",
"stream": false
}
}
}
```
### Why `stream: false` is Required
Ollama returns NDJSON (newline-delimited JSON) when streaming is enabled, but Claude Code SDK expects a single JSON object. This causes `JSON Parse error: Unexpected EOF` when agents attempt tool calls.
**Example of the problem**:
```json
// Ollama streaming response (NDJSON - multiple lines)
{"message":{"tool_calls":[...]}, "done":false}
{"message":{"content":""}, "done":true}
// Claude Code SDK expects (single JSON object)
{"message":{"tool_calls":[...], "content":""}, "done":true}
```
### Supported Models
Common Ollama models that work with oh-my-opencode:
| Model | Best For | Configuration |
|-------|----------|---------------|
| `ollama/qwen3-coder` | Code generation, build fixes | `{"model": "ollama/qwen3-coder", "stream": false}` |
| `ollama/ministral-3:14b` | Exploration, codebase search | `{"model": "ollama/ministral-3:14b", "stream": false}` |
| `ollama/lfm2.5-thinking` | Documentation, writing | `{"model": "ollama/lfm2.5-thinking", "stream": false}` |
### Troubleshooting
If you encounter `JSON Parse error: Unexpected EOF`:
1. **Verify `stream: false` is set** in your agent configuration
2. **Check Ollama is running**: `curl http://localhost:11434/api/tags`
3. **Test with curl**:
```bash
curl -s http://localhost:11434/api/chat \
-d '{"model": "qwen3-coder", "messages": [{"role": "user", "content": "Hello"}], "stream": false}'
```
4. **See detailed troubleshooting**: [docs/troubleshooting/ollama-streaming-issue.md](troubleshooting/ollama-streaming-issue.md)
### Future SDK Fix
The proper long-term fix requires Claude Code SDK to parse NDJSON responses correctly. Until then, use `stream: false` as a workaround.
**Tracking**: https://github.com/code-yeongyu/oh-my-opencode/issues/1124
## Agents
Override built-in agent settings:
@@ -103,7 +163,39 @@ Override built-in agent settings:
}
```
Each agent supports: `model`, `temperature`, `top_p`, `prompt`, `prompt_append`, `tools`, `disable`, `description`, `mode`, `color`, `permission`.
Each agent supports: `model`, `temperature`, `top_p`, `prompt`, `prompt_append`, `tools`, `disable`, `description`, `mode`, `color`, `permission`, `category`, `variant`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `providerOptions`.
### Additional Agent Options
| Option | Type | Description |
| ------------------- | ------- | ----------------------------------------------------------------------------------------------- |
| `category` | string | Category name to inherit model and other settings from category defaults |
| `variant` | string | Model variant (e.g., `max`, `high`, `medium`, `low`, `xhigh`) |
| `maxTokens` | number | Maximum tokens for response. Passed directly to OpenCode SDK. |
| `thinking` | object | Extended thinking configuration for Anthropic models. See [Thinking Options](#thinking-options) below. |
| `reasoningEffort` | string | OpenAI reasoning effort level. Values: `low`, `medium`, `high`, `xhigh`. |
| `textVerbosity` | string | Text verbosity level. Values: `low`, `medium`, `high`. |
| `providerOptions` | object | Provider-specific options passed directly to OpenCode SDK. |
#### Thinking Options (Anthropic)
```json
{
"agents": {
"oracle": {
"thinking": {
"type": "enabled",
"budgetTokens": 200000
}
}
}
}
```
| Option | Type | Default | Description |
| ------------- | ------- | ------- | -------------------------------------------- |
| `type` | string | - | `enabled` or `disabled` |
| `budgetTokens`| number | - | Maximum budget tokens for extended thinking |
Use `prompt_append` to add extra instructions without replacing the default system prompt:
@@ -153,13 +245,13 @@ Or disable via `disabled_agents` in `~/.config/opencode/oh-my-opencode.json` or
}
```
Available agents: `oracle`, `librarian`, `explore`, `multimodal-looker`
Available agents: `sisyphus`, `prometheus`, `oracle`, `librarian`, `explore`, `multimodal-looker`, `metis`, `momus`, `atlas`
## Built-in Skills
Oh My OpenCode includes built-in skills that provide additional capabilities:
- **playwright**: Browser automation with Playwright MCP. Use for web scraping, testing, screenshots, and browser interactions.
- **playwright** (default) / **agent-browser**: Browser automation for web scraping, testing, screenshots, and browser interactions. See [Browser Automation](#browser-automation) for switching between providers.
- **git-master**: Git expert for atomic commits, rebase/squash, and history search (blame, bisect, log -S). STRONGLY RECOMMENDED: Use with `delegate_task(category='quick', load_skills=['git-master'], ...)` to save context.
Disable built-in skills via `disabled_skills` in `~/.config/opencode/oh-my-opencode.json` or `.opencode/oh-my-opencode.json`:
@@ -170,7 +262,330 @@ Disable built-in skills via `disabled_skills` in `~/.config/opencode/oh-my-openc
}
```
Available built-in skills: `playwright`, `git-master`
Available built-in skills: `playwright`, `agent-browser`, `git-master`
## Skills Configuration
Configure advanced skills settings including custom skill sources, enabling/disabling specific skills, and defining custom skills.
```json
{
"skills": {
"sources": [
{ "path": "./custom-skills", "recursive": true },
"https://example.com/skill.yaml"
],
"enable": ["my-custom-skill"],
"disable": ["other-skill"],
"my-skill": {
"description": "Custom skill description",
"template": "Custom prompt template",
"from": "source-file.ts",
"model": "custom/model",
"agent": "custom-agent",
"subtask": true,
"argument-hint": "usage hint",
"license": "MIT",
"compatibility": ">= 3.0.0",
"metadata": {
"author": "Your Name"
},
"allowed-tools": ["tool1", "tool2"]
}
}
}
```
### Sources
Load skills from local directories or remote URLs:
```json
{
"skills": {
"sources": [
{ "path": "./custom-skills", "recursive": true },
{ "path": "./single-skill.yaml" },
"https://example.com/skill.yaml",
"https://raw.githubusercontent.com/user/repo/main/skills/*"
]
}
}
```
| Option | Default | Description |
| ----------- | ------- | ---------------------------------------------- |
| `path` | - | Local file/directory path or remote URL |
| `recursive` | `false` | Recursively load from directory |
| `glob` | - | Glob pattern for file selection |
### Enable/Disable Skills
```json
{
"skills": {
"enable": ["skill-1", "skill-2"],
"disable": ["disabled-skill"]
}
}
```
### Custom Skill Definition
Define custom skills directly in your config:
| Option | Default | Description |
| ---------------- | ------- | ------------------------------------------------------------------------------------ |
| `description` | - | Human-readable description of the skill |
| `template` | - | Custom prompt template for the skill |
| `from` | - | Source file to load template from |
| `model` | - | Override model for this skill |
| `agent` | - | Override agent for this skill |
| `subtask` | `false` | Whether to run as a subtask |
| `argument-hint` | - | Hint for how to use the skill |
| `license` | - | Skill license |
| `compatibility` | - | Required oh-my-opencode version compatibility |
| `metadata` | - | Additional metadata as key-value pairs |
| `allowed-tools` | - | Array of tools this skill is allowed to use |
**Example: Custom skill**
```json
{
"skills": {
"data-analyst": {
"description": "Specialized for data analysis tasks",
"template": "You are a data analyst. Focus on statistical analysis, visualization, and data interpretation.",
"model": "openai/gpt-5.2",
"allowed-tools": ["read", "bash", "lsp_diagnostics"]
}
}
}
```
## Browser Automation
Choose between two browser automation providers:
| Provider | Interface | Features | Installation |
|----------|-----------|----------|--------------|
| **playwright** (default) | MCP tools | Playwright MCP server with structured tool calls | Auto-installed via npx |
| **agent-browser** | Bash CLI | Vercel's CLI with session management, parallel browsers | Requires `bun add -g agent-browser` |
**Switch providers** via `browser_automation_engine` in `oh-my-opencode.json`:
```json
{
"browser_automation_engine": {
"provider": "agent-browser"
}
}
```
### Playwright (Default)
Uses the official Playwright MCP server (`@playwright/mcp`). Browser automation happens through structured MCP tool calls.
### agent-browser
Uses [Vercel's agent-browser CLI](https://github.com/vercel-labs/agent-browser). Key advantages:
- **Session management**: Run multiple isolated browser instances with `--session` flag
- **Persistent profiles**: Keep browser state across restarts with `--profile`
- **Snapshot-based workflow**: Get element refs via `snapshot -i`, interact with `@e1`, `@e2`, etc.
- **CLI-first**: All commands via Bash - great for scripting
**Installation required**:
```bash
bun add -g agent-browser
agent-browser install # Download Chromium
```
**Example workflow**:
```bash
agent-browser open https://example.com
agent-browser snapshot -i # Get interactive elements with refs
agent-browser fill @e1 "user@example.com"
agent-browser click @e2
agent-browser screenshot result.png
agent-browser close
```
## Tmux Integration
Run background subagents in separate tmux panes for **visual multi-agent execution**. See your agents working in parallel, each in their own terminal pane.
**Enable tmux integration** via `tmux` in `oh-my-opencode.json`:
```json
{
"tmux": {
"enabled": true,
"layout": "main-vertical",
"main_pane_size": 60,
"main_pane_min_width": 120,
"agent_pane_min_width": 40
}
}
```
| Option | Default | Description |
|--------|---------|-------------|
| `enabled` | `false` | Enable tmux subagent pane spawning. Only works when running inside an existing tmux session. |
| `layout` | `main-vertical` | Tmux layout for agent panes. See [Layout Options](#layout-options) below. |
| `main_pane_size` | `60` | Main pane size as percentage (20-80). |
| `main_pane_min_width` | `120` | Minimum width for main pane in columns. |
| `agent_pane_min_width` | `40` | Minimum width for each agent pane in columns. |
### Layout Options
| Layout | Description |
|--------|-------------|
| `main-vertical` | Main pane left, agent panes stacked on right (default) |
| `main-horizontal` | Main pane top, agent panes stacked bottom |
| `tiled` | All panes in equal-sized grid |
| `even-horizontal` | All panes in horizontal row |
| `even-vertical` | All panes in vertical stack |
### Requirements
1. **Must run inside tmux**: The feature only activates when OpenCode is already running inside a tmux session
2. **Tmux installed**: Requires tmux to be available in PATH
3. **Server mode**: OpenCode must run with `--port` flag to enable subagent pane spawning
### How It Works
When `tmux.enabled` is `true` and you're inside a tmux session:
- Background agents (via `delegate_task(run_in_background=true)`) spawn in new tmux panes
- Each pane shows the subagent's real-time output
- Panes are automatically closed when the subagent completes
- Layout is automatically adjusted based on your configuration
### Running OpenCode with Tmux Subagent Support
To enable tmux subagent panes, OpenCode must run in **server mode** with the `--port` flag. This starts an HTTP server that subagent panes connect to via `opencode attach`.
**Basic setup**:
```bash
# Start tmux session
tmux new -s dev
# Run OpenCode with server mode (port 4096)
opencode --port 4096
# Now background agents will appear in separate panes
```
**Recommended: Shell Function**
For convenience, create a shell function that automatically handles tmux sessions and port allocation. Here's an example for Fish shell:
```fish
# ~/.config/fish/config.fish
function oc
set base_name (basename (pwd))
set path_hash (echo (pwd) | md5 | cut -c1-4)
set session_name "$base_name-$path_hash"
# Find available port starting from 4096
function __oc_find_port
set port 4096
while test $port -lt 5096
if not lsof -i :$port >/dev/null 2>&1
echo $port
return 0
end
set port (math $port + 1)
end
echo 4096
end
set oc_port (__oc_find_port)
set -x OPENCODE_PORT $oc_port
if set -q TMUX
# Already inside tmux - just run with port
opencode --port $oc_port $argv
else
# Create tmux session and run opencode
set oc_cmd "OPENCODE_PORT=$oc_port opencode --port $oc_port $argv; exec fish"
if tmux has-session -t "$session_name" 2>/dev/null
tmux new-window -t "$session_name" -c (pwd) "$oc_cmd"
tmux attach-session -t "$session_name"
else
tmux new-session -s "$session_name" -c (pwd) "$oc_cmd"
end
end
functions -e __oc_find_port
end
```
**Bash/Zsh equivalent**:
```bash
# ~/.bashrc or ~/.zshrc
oc() {
local base_name=$(basename "$PWD")
local path_hash=$(echo "$PWD" | md5sum | cut -c1-4)
local session_name="${base_name}-${path_hash}"
# Find available port
local port=4096
while [ $port -lt 5096 ]; do
if ! lsof -i :$port >/dev/null 2>&1; then
break
fi
port=$((port + 1))
done
export OPENCODE_PORT=$port
if [ -n "$TMUX" ]; then
opencode --port $port "$@"
else
local oc_cmd="OPENCODE_PORT=$port opencode --port $port $*; exec $SHELL"
if tmux has-session -t "$session_name" 2>/dev/null; then
tmux new-window -t "$session_name" -c "$PWD" "$oc_cmd"
tmux attach-session -t "$session_name"
else
tmux new-session -s "$session_name" -c "$PWD" "$oc_cmd"
fi
fi
}
```
**How subagent panes work**:
1. Main OpenCode starts HTTP server on specified port (e.g., `http://localhost:4096`)
2. When a background agent spawns, Oh My OpenCode creates a new tmux pane
3. The pane runs: `opencode attach http://localhost:4096 --session <session-id>`
4. Each subagent pane shows real-time streaming output
5. Panes are automatically closed when the subagent completes
**Environment variables**:
| Variable | Description |
|----------|-------------|
| `OPENCODE_PORT` | Default port for the HTTP server (used if `--port` not specified) |
### Server Mode Reference
OpenCode's server mode exposes an HTTP API for programmatic interaction:
```bash
# Standalone server (no TUI)
opencode serve --port 4096
# TUI with server (recommended for tmux integration)
opencode --port 4096
```
| Flag | Default | Description |
|------|---------|-------------|
| `--port` | `4096` | Port for HTTP server |
| `--hostname` | `127.0.0.1` | Hostname to listen on |
For more details, see the [OpenCode Server documentation](https://opencode.ai/docs/server/).
## Git Master
@@ -271,6 +686,7 @@ Configure concurrency limits for background agent tasks. This controls how many
{
"background_task": {
"defaultConcurrency": 5,
"staleTimeoutMs": 180000,
"providerConcurrency": {
"anthropic": 3,
"openai": 5,
@@ -287,6 +703,7 @@ Configure concurrency limits for background agent tasks. This controls how many
| Option | Default | Description |
| --------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------- |
| `defaultConcurrency` | - | Default maximum concurrent background tasks for all providers/models |
| `staleTimeoutMs` | `180000` | Stale timeout in milliseconds - interrupt tasks with no activity for this duration (minimum: 60000 = 1 minute) |
| `providerConcurrency` | - | Per-provider concurrency limits. Keys are provider names (e.g., `anthropic`, `openai`, `google`) |
| `modelConcurrency` | - | Per-model concurrency limits. Keys are full model names (e.g., `anthropic/claude-opus-4-5`). Overrides provider limits. |
@@ -301,27 +718,96 @@ Configure concurrency limits for background agent tasks. This controls how many
Categories enable domain-specific task delegation via the `delegate_task` tool. Each category applies runtime presets (model, temperature, prompt additions) when calling the `Sisyphus-Junior` agent.
**Default Categories:**
### Built-in Categories
| Category | Model | Description |
| ---------------- | ----------------------------- | ---------------------------------------------------------------------------- |
| `visual` | `google/gemini-3-pro` | Frontend, UI/UX, design-focused tasks. High creativity (temp 0.7). |
| `business-logic` | `openai/gpt-5.2` | Backend logic, architecture, strategic reasoning. Low creativity (temp 0.1). |
All 7 categories come with optimal model defaults, but **you must configure them to use those defaults**:
**Usage:**
| Category | Built-in Default Model | Description |
| -------------------- | ---------------------------------- | -------------------------------------------------------------------- |
| `visual-engineering` | `google/gemini-3-pro-preview` | Frontend, UI/UX, design, styling, animation |
| `ultrabrain` | `openai/gpt-5.2-codex` (xhigh) | Deep logical reasoning, complex architecture decisions |
| `artistry` | `google/gemini-3-pro-preview` (max)| Highly creative/artistic tasks, novel ideas |
| `quick` | `anthropic/claude-haiku-4-5` | Trivial tasks - single file changes, typo fixes, simple modifications|
| `unspecified-low` | `anthropic/claude-sonnet-4-5` | Tasks that don't fit other categories, low effort required |
| `unspecified-high` | `anthropic/claude-opus-4-5` (max) | Tasks that don't fit other categories, high effort required |
| `writing` | `google/gemini-3-flash-preview` | Documentation, prose, technical writing |
### ⚠️ Critical: Model Resolution Priority
**Categories DO NOT use their built-in defaults unless configured.** Model resolution follows this priority:
```
// Via delegate_task tool
delegate_task(category="visual", prompt="Create a responsive dashboard component")
delegate_task(category="business-logic", prompt="Design the payment processing flow")
1. User-configured model (in oh-my-opencode.json)
2. Category's built-in default (if you add category to config)
3. System default model (from opencode.json)
```
// Or target a specific agent directly
**Example Problem:**
```json
// opencode.json
{ "model": "anthropic/claude-sonnet-4-5" }
// oh-my-opencode.json (empty categories section)
{}
// Result: ALL categories use claude-sonnet-4-5 (wasteful!)
// - quick tasks use Sonnet instead of Haiku (expensive)
// - ultrabrain uses Sonnet instead of GPT-5.2 (inferior reasoning)
// - visual tasks use Sonnet instead of Gemini (suboptimal for UI)
```
### Recommended Configuration
**To use optimal models for each category, add them to your config:**
```json
{
"categories": {
"visual-engineering": {
"model": "google/gemini-3-pro-preview"
},
"ultrabrain": {
"model": "openai/gpt-5.2-codex",
"variant": "xhigh"
},
"artistry": {
"model": "google/gemini-3-pro-preview",
"variant": "max"
},
"quick": {
"model": "anthropic/claude-haiku-4-5" // Fast + cheap for trivial tasks
},
"unspecified-low": {
"model": "anthropic/claude-sonnet-4-5"
},
"unspecified-high": {
"model": "anthropic/claude-opus-4-5",
"variant": "max"
},
"writing": {
"model": "google/gemini-3-flash-preview"
}
}
}
```
**Only configure categories you have access to.** Unconfigured categories fall back to your system default model.
### Usage
```javascript
// Via delegate_task tool
delegate_task(category="visual-engineering", prompt="Create a responsive dashboard component")
delegate_task(category="ultrabrain", prompt="Design the payment processing flow")
// Or target a specific agent directly (bypasses categories)
delegate_task(agent="oracle", prompt="Review this architecture")
```
**Custom Categories:**
### Custom Categories
Add custom categories in `oh-my-opencode.json`:
Add your own categories or override built-in ones:
```json
{
@@ -331,15 +817,22 @@ Add custom categories in `oh-my-opencode.json`:
"temperature": 0.2,
"prompt_append": "Focus on data analysis, ML pipelines, and statistical methods."
},
"visual": {
"model": "google/gemini-3-pro",
"visual-engineering": {
"model": "google/gemini-3-pro-preview",
"prompt_append": "Use shadcn/ui components and Tailwind CSS."
}
}
}
```
Each category supports: `model`, `temperature`, `top_p`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `tools`, `prompt_append`.
Each category supports: `model`, `temperature`, `top_p`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `tools`, `prompt_append`, `variant`, `description`, `is_unstable_agent`.
### Additional Category Options
| Option | Type | Default | Description |
| ------------------ | ------- | ------- | --------------------------------------------------------------------------------------------------- |
| `description` | string | - | Human-readable description of the category's purpose. Shown in delegate_task prompt. |
| `is_unstable_agent`| boolean | `false` | Mark agent as unstable - forces background mode for monitoring. Auto-enabled for gemini models. |
## Model Resolution System
@@ -473,10 +966,93 @@ Disable specific built-in hooks via `disabled_hooks` in `~/.config/opencode/oh-m
}
```
Available hooks: `todo-continuation-enforcer`, `context-window-monitor`, `session-recovery`, `session-notification`, `comment-checker`, `grep-output-truncator`, `tool-output-truncator`, `directory-agents-injector`, `directory-readme-injector`, `empty-task-response-detector`, `think-mode`, `anthropic-context-window-limit-recovery`, `rules-injector`, `background-notification`, `auto-update-checker`, `startup-toast`, `keyword-detector`, `agent-usage-reminder`, `non-interactive-env`, `interactive-bash-session`, `compaction-context-injector`, `thinking-block-validator`, `claude-code-hooks`, `ralph-loop`, `preemptive-compaction`
Available hooks: `todo-continuation-enforcer`, `context-window-monitor`, `session-recovery`, `session-notification`, `comment-checker`, `grep-output-truncator`, `tool-output-truncator`, `directory-agents-injector`, `directory-readme-injector`, `empty-task-response-detector`, `think-mode`, `anthropic-context-window-limit-recovery`, `rules-injector`, `background-notification`, `auto-update-checker`, `startup-toast`, `keyword-detector`, `agent-usage-reminder`, `non-interactive-env`, `interactive-bash-session`, `compaction-context-injector`, `thinking-block-validator`, `claude-code-hooks`, `ralph-loop`, `preemptive-compaction`, `auto-slash-command`, `sisyphus-junior-notepad`, `start-work`
**Note on `directory-agents-injector`**: This hook is **automatically disabled** when running on OpenCode 1.1.37+ because OpenCode now has native support for dynamically resolving AGENTS.md files from subdirectories (PR #10678). This prevents duplicate AGENTS.md injection. For older OpenCode versions, the hook remains active to provide the same functionality.
**Note on `auto-update-checker` and `startup-toast`**: The `startup-toast` hook is a sub-feature of `auto-update-checker`. To disable only the startup toast notification while keeping update checking enabled, add `"startup-toast"` to `disabled_hooks`. To disable all update checking features (including the toast), add `"auto-update-checker"` to `disabled_hooks`.
## Disabled Commands
Disable specific built-in commands via `disabled_commands` in `~/.config/opencode/oh-my-opencode.json` or `.opencode/oh-my-opencode.json`:
```json
{
"disabled_commands": ["init-deep", "start-work"]
}
```
Available commands: `init-deep`, `start-work`
## Comment Checker
Configure comment-checker hook behavior. The comment checker warns when excessive comments are added to code.
```json
{
"comment_checker": {
"custom_prompt": "Your custom warning message. Use {{comments}} placeholder for detected comments XML."
}
}
```
| Option | Default | Description |
| ------------- | ------- | -------------------------------------------------------------------------- |
| `custom_prompt` | - | Custom warning message to replace the default. Use `{{comments}}` placeholder. |
## Notification
Configure notification behavior for background task completion.
```json
{
"notification": {
"force_enable": true
}
}
```
| Option | Default | Description |
| -------------- | ------- | ---------------------------------------------------------------------------------------------- |
| `force_enable` | `false` | Force enable session-notification even if external notification plugins are detected. Default: `false`. |
## Sisyphus Tasks & Swarm
Configure Sisyphus Tasks and Swarm systems for advanced task management and multi-agent orchestration.
```json
{
"sisyphus": {
"tasks": {
"enabled": false,
"storage_path": ".sisyphus/tasks",
"claude_code_compat": false
},
"swarm": {
"enabled": false,
"storage_path": ".sisyphus/teams",
"ui_mode": "toast"
}
}
}
```
### Tasks Configuration
| Option | Default | Description |
| -------------------- | ------------------ | ------------------------------------------------------------------------- |
| `enabled` | `false` | Enable Sisyphus Tasks system |
| `storage_path` | `.sisyphus/tasks` | Storage path for tasks (relative to project root) |
| `claude_code_compat` | `false` | Enable Claude Code path compatibility mode |
### Swarm Configuration
| Option | Default | Description |
| -------------- | ------------------ | -------------------------------------------------------------- |
| `enabled` | `false` | Enable Sisyphus Swarm system for multi-agent orchestration |
| `storage_path` | `.sisyphus/teams` | Storage path for teams (relative to project root) |
| `ui_mode` | `toast` | UI mode: `toast` (notifications), `tmux` (panes), or `both` |
## MCPs
Exa, Context7 and grep.app MCP enabled by default.
@@ -518,6 +1094,38 @@ Add LSP servers via the `lsp` option in `~/.config/opencode/oh-my-opencode.json`
Each server supports: `command`, `extensions`, `priority`, `env`, `initialization`, `disabled`.
| Option | Type | Default | Description |
| -------------- | -------- | ------- | ---------------------------------------------------------------------- |
| `command` | array | - | Command to start the LSP server (executable + args) |
| `extensions` | array | - | File extensions this server handles (e.g., `[".ts", ".tsx"]`) |
| `priority` | number | - | Server priority when multiple servers match a file |
| `env` | object | - | Environment variables for the LSP server (key-value pairs) |
| `initialization`| object | - | Custom initialization options passed to the LSP server |
| `disabled` | boolean | `false` | Whether to disable this LSP server |
**Example with advanced options:**
```json
{
"lsp": {
"typescript-language-server": {
"command": ["typescript-language-server", "--stdio"],
"extensions": [".ts", ".tsx"],
"priority": 10,
"env": {
"NODE_OPTIONS": "--max-old-space-size=4096"
},
"initialization": {
"preferences": {
"includeInlayParameterNameHints": "all",
"includeInlayFunctionParameterTypeHints": true
}
}
}
}
}
```
## Experimental
Opt-in experimental features that may change or be removed in future versions. Use with caution.
@@ -527,7 +1135,29 @@ Opt-in experimental features that may change or be removed in future versions. U
"experimental": {
"truncate_all_tool_outputs": true,
"aggressive_truncation": true,
"auto_resume": true
"auto_resume": true,
"dynamic_context_pruning": {
"enabled": false,
"notification": "detailed",
"turn_protection": {
"enabled": true,
"turns": 3
},
"protected_tools": ["task", "todowrite", "lsp_rename"],
"strategies": {
"deduplication": {
"enabled": true
},
"supersede_writes": {
"enabled": true,
"aggressive": false
},
"purge_errors": {
"enabled": true,
"turns": 5
}
}
}
}
}
```
@@ -536,7 +1166,72 @@ Opt-in experimental features that may change or be removed in future versions. U
| --------------------------- | ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `truncate_all_tool_outputs` | `false` | Truncates ALL tool outputs instead of just whitelisted tools (Grep, Glob, LSP, AST-grep). Tool output truncator is enabled by default - disable via `disabled_hooks`. |
| `aggressive_truncation` | `false` | When token limit is exceeded, aggressively truncates tool outputs to fit within limits. More aggressive than the default truncation behavior. Falls back to summarize/revert if insufficient. |
| `auto_resume` | `false` | Automatically resumes session after successful recovery from thinking block errors or thinking disabled violations. Extracts the last user message and continues. |
| `auto_resume` | `false` | Automatically resumes session after successful recovery from thinking block errors or thinking disabled violations. Extracts last user message and continues. |
| `dynamic_context_pruning` | See below | Dynamic context pruning configuration for managing context window usage automatically. See [Dynamic Context Pruning](#dynamic-context-pruning) below. |
### Dynamic Context Pruning
Dynamic context pruning automatically manages context window by intelligently pruning old tool outputs. This feature helps maintain performance in long sessions.
```json
{
"experimental": {
"dynamic_context_pruning": {
"enabled": false,
"notification": "detailed",
"turn_protection": {
"enabled": true,
"turns": 3
},
"protected_tools": ["task", "todowrite", "todoread", "lsp_rename", "session_read", "session_write", "session_search"],
"strategies": {
"deduplication": {
"enabled": true
},
"supersede_writes": {
"enabled": true,
"aggressive": false
},
"purge_errors": {
"enabled": true,
"turns": 5
}
}
}
}
}
```
| Option | Default | Description |
| ----------------- | ------- | ----------------------------------------------------------------------------------------- |
| `enabled` | `false` | Enable dynamic context pruning |
| `notification` | `detailed` | Notification level: `off`, `minimal`, or `detailed` |
| `turn_protection` | See below | Turn protection settings - prevent pruning recent tool outputs |
#### Turn Protection
| Option | Default | Description |
| --------- | ------- | ------------------------------------------------------------ |
| `enabled` | `true` | Enable turn protection |
| `turns` | `3` | Number of recent turns to protect from pruning (1-10) |
#### Protected Tools
Tools that should never be pruned (default):
```json
["task", "todowrite", "todoread", "lsp_rename", "session_read", "session_write", "session_search"]
```
#### Pruning Strategies
| Strategy | Option | Default | Description |
| ------------------- | ------------ | ------- | ---------------------------------------------------------------------------- |
| **deduplication** | `enabled` | `true` | Remove duplicate tool calls (same tool + same args) |
| **supersede_writes**| `enabled` | `true` | Prune write inputs when file subsequently read |
| | `aggressive` | `false` | Aggressive mode: prune any write if ANY subsequent read |
| **purge_errors** | `enabled` | `true` | Prune errored tool inputs after N turns |
| | `turns` | `5` | Number of turns before pruning errors (1-20) |
**Warning**: These features are experimental and may cause unexpected behavior. Enable only if you understand the implications.

View File

@@ -62,6 +62,27 @@ delegate_task(agent="explore", background=true, prompt="Find auth implementation
background_output(task_id="bg_abc123")
```
#### Visual Multi-Agent with Tmux
Enable `tmux.enabled` to see background agents in separate tmux panes:
```json
{
"tmux": {
"enabled": true,
"layout": "main-vertical"
}
}
```
When running inside tmux:
- Background agents spawn in new panes
- Watch multiple agents work in real-time
- Each pane shows agent output live
- Auto-cleanup when agents complete
See [Tmux Integration](configurations.md#tmux-integration) for full configuration options.
Customize agent models, prompts, and permissions in `oh-my-opencode.json`. See [Configuration](configurations.md#agents).
---
@@ -78,11 +99,15 @@ Skills provide specialized workflows with embedded MCP servers and detailed inst
| **frontend-ui-ux** | UI/UX tasks, styling | Designer-turned-developer persona. Crafts stunning UI/UX even without design mockups. Emphasizes bold aesthetic direction, distinctive typography, cohesive color palettes. |
| **git-master** | commit, rebase, squash, blame | MUST USE for ANY git operations. Atomic commits with automatic splitting, rebase/squash workflows, history search (blame, bisect, log -S). |
### Skill: playwright
### Skill: Browser Automation (playwright / agent-browser)
**Trigger**: Any browser-related request
Provides browser automation via Playwright MCP server:
Oh-My-OpenCode provides two browser automation providers, configurable via `browser_automation_engine.provider`:
#### Option 1: Playwright MCP (Default)
The default provider uses Playwright MCP server:
```yaml
mcp:
@@ -91,18 +116,41 @@ mcp:
args: ["@playwright/mcp@latest"]
```
**Capabilities**:
**Usage**:
```
/playwright Navigate to example.com and take a screenshot
```
#### Option 2: Agent Browser CLI (Vercel)
Alternative provider using [Vercel's agent-browser CLI](https://github.com/vercel-labs/agent-browser):
```json
{
"browser_automation_engine": {
"provider": "agent-browser"
}
}
```
**Requires installation**:
```bash
bun add -g agent-browser
```
**Usage**:
```
Use agent-browser to navigate to example.com and extract the main heading
```
#### Capabilities (Both Providers)
- Navigate and interact with web pages
- Take screenshots and PDFs
- Fill forms and click elements
- Wait for network requests
- Scrape content
**Usage**:
```
/playwright Navigate to example.com and take a screenshot
```
### Skill: frontend-ui-ux
**Trigger**: UI design tasks, visual changes
@@ -272,7 +320,7 @@ Hooks intercept and modify behavior at key points in the agent lifecycle.
| Hook | Event | Description |
|------|-------|-------------|
| **directory-agents-injector** | PostToolUse | Auto-injects AGENTS.md when reading files. Walks from file to project root, collecting all AGENTS.md files. |
| **directory-agents-injector** | PostToolUse | Auto-injects AGENTS.md when reading files. Walks from file to project root, collecting all AGENTS.md files. **Deprecated for OpenCode 1.1.37+** - Auto-disabled when native AGENTS.md injection is available. |
| **directory-readme-injector** | PostToolUse | Auto-injects README.md for directory context. |
| **rules-injector** | PostToolUse | Injects rules from `.claude/rules/` when conditions match. Supports globs and alwaysApply. |
| **compaction-context-injector** | Stop | Preserves critical context during session compaction. |
@@ -418,6 +466,29 @@ Disable specific hooks in config:
| **session_search** | Full-text search across session messages |
| **session_info** | Get session metadata and statistics |
### Interactive Terminal Tools
| Tool | Description |
|------|-------------|
| **interactive_bash** | Tmux-based terminal for TUI apps (vim, htop, pudb). Pass tmux subcommands directly without prefix. |
**Usage Examples**:
```bash
# Create a new session
interactive_bash(tmux_command="new-session -d -s dev-app")
# Send keystrokes to a session
interactive_bash(tmux_command="send-keys -t dev-app 'vim main.py' Enter")
# Capture pane output
interactive_bash(tmux_command="capture-pane -p -t dev-app")
```
**Key Points**:
- Commands are tmux subcommands (no `tmux` prefix)
- Use for interactive apps that need persistent sessions
- One-shot commands should use regular `Bash` tool with `&`
---
## MCPs: Built-in Servers
@@ -450,6 +521,37 @@ mcp:
The `skill_mcp` tool invokes these operations with full schema discovery.
#### OAuth-Enabled MCPs
Skills can define OAuth-protected remote MCP servers. OAuth 2.1 with full RFC compliance (RFC 9728, 8414, 8707, 7591) is supported:
```yaml
---
description: My API skill
mcp:
my-api:
url: https://api.example.com/mcp
oauth:
clientId: ${CLIENT_ID}
scopes: ["read", "write"]
---
```
When a skill MCP has `oauth` configured:
- **Auto-discovery**: Fetches `/.well-known/oauth-protected-resource` (RFC 9728), falls back to `/.well-known/oauth-authorization-server` (RFC 8414)
- **Dynamic Client Registration**: Auto-registers with servers supporting RFC 7591 (clientId becomes optional)
- **PKCE**: Mandatory for all flows
- **Resource Indicators**: Auto-generated from MCP URL per RFC 8707
- **Token Storage**: Persisted in `~/.config/opencode/mcp-oauth.json` (chmod 0600)
- **Auto-refresh**: Tokens refresh on 401; step-up authorization on 403 with `WWW-Authenticate`
- **Dynamic Port**: OAuth callback server uses an auto-discovered available port
Pre-authenticate via CLI:
```bash
bunx oh-my-opencode mcp oauth login <server-name> --server-url https://api.example.com
```
---
## Context Injection

View File

@@ -0,0 +1,126 @@
# Ollama Streaming Issue - JSON Parse Error
## Problem
When using Ollama as a provider with oh-my-opencode agents, you may encounter:
```
JSON Parse error: Unexpected EOF
```
This occurs when agents attempt tool calls (e.g., `explore` agent using `mcp_grep_search`).
## Root Cause
Ollama returns **NDJSON** (newline-delimited JSON) when `stream: true` is used in API requests:
```json
{"message":{"tool_calls":[{"function":{"name":"read","arguments":{"filePath":"README.md"}}}]}, "done":false}
{"message":{"content":""}, "done":true}
```
Claude Code SDK expects a single JSON object, not multiple NDJSON lines, causing the parse error.
### Why This Happens
- **Ollama API**: Returns streaming responses as NDJSON by design
- **Claude Code SDK**: Doesn't properly handle NDJSON responses for tool calls
- **oh-my-opencode**: Passes through the SDK's behavior (can't fix at this layer)
## Solutions
### Option 1: Disable Streaming (Recommended - Immediate Fix)
Configure your Ollama provider to use `stream: false`:
```json
{
"provider": "ollama",
"model": "qwen3-coder",
"stream": false
}
```
**Pros:**
- Works immediately
- No code changes needed
- Simple configuration
**Cons:**
- Slightly slower response time (no streaming)
- Less interactive feedback
### Option 2: Use Non-Tool Agents Only
If you need streaming, avoid agents that use tools:
-**Safe**: Simple text generation, non-tool tasks
-**Problematic**: Any agent with tool calls (explore, librarian, etc.)
### Option 3: Wait for SDK Fix (Long-term)
The proper fix requires Claude Code SDK to:
1. Detect NDJSON responses
2. Parse each line separately
3. Merge `tool_calls` from multiple lines
4. Return a single merged response
**Tracking**: https://github.com/code-yeongyu/oh-my-opencode/issues/1124
## Workaround Implementation
Until the SDK is fixed, here's how to implement NDJSON parsing (for SDK maintainers):
```typescript
async function parseOllamaStreamResponse(response: string): Promise<object> {
const lines = response.split('\n').filter(line => line.trim());
const mergedMessage = { tool_calls: [] };
for (const line of lines) {
try {
const json = JSON.parse(line);
if (json.message?.tool_calls) {
mergedMessage.tool_calls.push(...json.message.tool_calls);
}
if (json.message?.content) {
mergedMessage.content = json.message.content;
}
} catch (e) {
// Skip malformed lines
console.warn('Skipping malformed NDJSON line:', line);
}
}
return mergedMessage;
}
```
## Testing
To verify the fix works:
```bash
# Test with curl (should work with stream: false)
curl -s http://localhost:11434/api/chat \
-d '{
"model": "qwen3-coder",
"messages": [{"role": "user", "content": "Read file README.md"}],
"stream": false,
"tools": [{"type": "function", "function": {"name": "read", "description": "Read a file", "parameters": {"type": "object", "properties": {"filePath": {"type": "string"}}, "required": ["filePath"]}}}]
}'
```
## Related Issues
- **oh-my-opencode**: https://github.com/code-yeongyu/oh-my-opencode/issues/1124
- **Ollama API Docs**: https://github.com/ollama/ollama/blob/main/docs/api.md
## Getting Help
If you encounter this issue:
1. Check your Ollama provider configuration
2. Set `stream: false` as a workaround
3. Report any additional errors to the issue tracker
4. Provide your configuration (without secrets) for debugging

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode",
"version": "3.0.1",
"version": "3.1.7",
"description": "The Best AI Agent Harness - Batteries-Included OpenCode Plugin with Multi-Model Orchestration, Parallel Background Agents, and Crafted LSP/AST Tools",
"main": "dist/index.js",
"types": "dist/index.d.ts",
@@ -64,6 +64,7 @@
"jsonc-parser": "^3.3.1",
"picocolors": "^1.1.1",
"picomatch": "^4.0.2",
"vscode-jsonrpc": "^8.2.0",
"zod": "^4.1.8"
},
"devDependencies": {
@@ -73,13 +74,13 @@
"typescript": "^5.7.3"
},
"optionalDependencies": {
"oh-my-opencode-darwin-arm64": "3.0.1",
"oh-my-opencode-darwin-x64": "3.0.1",
"oh-my-opencode-linux-arm64": "3.0.1",
"oh-my-opencode-linux-arm64-musl": "3.0.1",
"oh-my-opencode-linux-x64": "3.0.1",
"oh-my-opencode-linux-x64-musl": "3.0.1",
"oh-my-opencode-windows-x64": "3.0.1"
"oh-my-opencode-darwin-arm64": "3.1.7",
"oh-my-opencode-darwin-x64": "3.1.7",
"oh-my-opencode-linux-arm64": "3.1.7",
"oh-my-opencode-linux-arm64-musl": "3.1.7",
"oh-my-opencode-linux-x64": "3.1.7",
"oh-my-opencode-linux-x64-musl": "3.1.7",
"oh-my-opencode-windows-x64": "3.1.7"
},
"trustedDependencies": [
"@ast-grep/cli",

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-darwin-arm64",
"version": "3.0.1",
"version": "3.1.7",
"description": "Platform-specific binary for oh-my-opencode (darwin-arm64)",
"license": "MIT",
"repository": {

View File

@@ -0,0 +1,22 @@
{
"name": "oh-my-opencode-darwin-x64-baseline",
"version": "3.1.1",
"description": "Platform-specific binary for oh-my-opencode (darwin-x64-baseline, no AVX2)",
"license": "MIT",
"repository": {
"type": "git",
"url": "https://github.com/code-yeongyu/oh-my-opencode"
},
"os": [
"darwin"
],
"cpu": [
"x64"
],
"files": [
"bin"
],
"bin": {
"oh-my-opencode": "./bin/oh-my-opencode"
}
}

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-darwin-x64",
"version": "3.0.1",
"version": "3.1.7",
"description": "Platform-specific binary for oh-my-opencode (darwin-x64)",
"license": "MIT",
"repository": {

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-arm64-musl",
"version": "3.0.1",
"version": "3.1.7",
"description": "Platform-specific binary for oh-my-opencode (linux-arm64-musl)",
"license": "MIT",
"repository": {

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-arm64",
"version": "3.0.1",
"version": "3.1.7",
"description": "Platform-specific binary for oh-my-opencode (linux-arm64)",
"license": "MIT",
"repository": {

View File

@@ -0,0 +1,25 @@
{
"name": "oh-my-opencode-linux-x64-baseline",
"version": "3.1.1",
"description": "Platform-specific binary for oh-my-opencode (linux-x64-baseline, no AVX2)",
"license": "MIT",
"repository": {
"type": "git",
"url": "https://github.com/code-yeongyu/oh-my-opencode"
},
"os": [
"linux"
],
"cpu": [
"x64"
],
"libc": [
"glibc"
],
"files": [
"bin"
],
"bin": {
"oh-my-opencode": "./bin/oh-my-opencode"
}
}

View File

@@ -0,0 +1,25 @@
{
"name": "oh-my-opencode-linux-x64-musl-baseline",
"version": "3.1.1",
"description": "Platform-specific binary for oh-my-opencode (linux-x64-musl-baseline, no AVX2)",
"license": "MIT",
"repository": {
"type": "git",
"url": "https://github.com/code-yeongyu/oh-my-opencode"
},
"os": [
"linux"
],
"cpu": [
"x64"
],
"libc": [
"musl"
],
"files": [
"bin"
],
"bin": {
"oh-my-opencode": "./bin/oh-my-opencode"
}
}

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-x64-musl",
"version": "3.0.1",
"version": "3.1.7",
"description": "Platform-specific binary for oh-my-opencode (linux-x64-musl)",
"license": "MIT",
"repository": {

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-linux-x64",
"version": "3.0.1",
"version": "3.1.7",
"description": "Platform-specific binary for oh-my-opencode (linux-x64)",
"license": "MIT",
"repository": {

View File

@@ -0,0 +1,22 @@
{
"name": "oh-my-opencode-windows-x64-baseline",
"version": "3.1.1",
"description": "Platform-specific binary for oh-my-opencode (windows-x64-baseline, no AVX2)",
"license": "MIT",
"repository": {
"type": "git",
"url": "https://github.com/code-yeongyu/oh-my-opencode"
},
"os": [
"win32"
],
"cpu": [
"x64"
],
"files": [
"bin"
],
"bin": {
"oh-my-opencode": "./bin/oh-my-opencode.exe"
}
}

View File

@@ -1,6 +1,6 @@
{
"name": "oh-my-opencode-windows-x64",
"version": "3.0.1",
"version": "3.1.7",
"description": "Platform-specific binary for oh-my-opencode (windows-x64)",
"license": "MIT",
"repository": {

View File

@@ -0,0 +1,79 @@
// script/build-binaries.test.ts
// Tests for platform binary build configuration
import { describe, expect, it } from "bun:test";
// Import PLATFORMS from build-binaries.ts
// We need to export it first, but for now we'll test the expected structure
const EXPECTED_BASELINE_TARGETS = [
"bun-linux-x64-baseline",
"bun-linux-x64-musl-baseline",
"bun-darwin-x64-baseline",
"bun-windows-x64-baseline",
];
describe("build-binaries", () => {
describe("PLATFORMS array", () => {
it("includes baseline variants for non-AVX2 CPU support", async () => {
// given
const module = await import("./build-binaries.ts");
const platforms = (module as { PLATFORMS: { target: string }[] }).PLATFORMS;
const targets = platforms.map((p) => p.target);
// when
const hasAllBaselineTargets = EXPECTED_BASELINE_TARGETS.every((baseline) =>
targets.includes(baseline)
);
// then
expect(hasAllBaselineTargets).toBe(true);
for (const baseline of EXPECTED_BASELINE_TARGETS) {
expect(targets).toContain(baseline);
}
});
it("has correct directory names for baseline platforms", async () => {
// given
const module = await import("./build-binaries.ts");
const platforms = (module as { PLATFORMS: { dir: string; target: string }[] }).PLATFORMS;
// when
const baselinePlatforms = platforms.filter((p) => p.target.includes("baseline"));
// then
expect(baselinePlatforms.length).toBe(4);
expect(baselinePlatforms.map((p) => p.dir)).toContain("linux-x64-baseline");
expect(baselinePlatforms.map((p) => p.dir)).toContain("linux-x64-musl-baseline");
expect(baselinePlatforms.map((p) => p.dir)).toContain("darwin-x64-baseline");
expect(baselinePlatforms.map((p) => p.dir)).toContain("windows-x64-baseline");
});
it("has correct binary names for baseline platforms", async () => {
// given
const module = await import("./build-binaries.ts");
const platforms = (module as { PLATFORMS: { dir: string; target: string; binary: string }[] }).PLATFORMS;
// when
const windowsBaseline = platforms.find((p) => p.target === "bun-windows-x64-baseline");
const linuxBaseline = platforms.find((p) => p.target === "bun-linux-x64-baseline");
// then
expect(windowsBaseline?.binary).toBe("oh-my-opencode.exe");
expect(linuxBaseline?.binary).toBe("oh-my-opencode");
});
it("has descriptions mentioning no AVX2 for baseline platforms", async () => {
// given
const module = await import("./build-binaries.ts");
const platforms = (module as { PLATFORMS: { target: string; description: string }[] }).PLATFORMS;
// when
const baselinePlatforms = platforms.filter((p) => p.target.includes("baseline"));
// then
for (const platform of baselinePlatforms) {
expect(platform.description).toContain("no AVX2");
}
});
});
});

View File

@@ -13,14 +13,18 @@ interface PlatformTarget {
description: string;
}
const PLATFORMS: PlatformTarget[] = [
export const PLATFORMS: PlatformTarget[] = [
{ dir: "darwin-arm64", target: "bun-darwin-arm64", binary: "oh-my-opencode", description: "macOS ARM64" },
{ dir: "darwin-x64", target: "bun-darwin-x64", binary: "oh-my-opencode", description: "macOS x64" },
{ dir: "darwin-x64-baseline", target: "bun-darwin-x64-baseline", binary: "oh-my-opencode", description: "macOS x64 (no AVX2)" },
{ dir: "linux-x64", target: "bun-linux-x64", binary: "oh-my-opencode", description: "Linux x64 (glibc)" },
{ dir: "linux-x64-baseline", target: "bun-linux-x64-baseline", binary: "oh-my-opencode", description: "Linux x64 (glibc, no AVX2)" },
{ dir: "linux-arm64", target: "bun-linux-arm64", binary: "oh-my-opencode", description: "Linux ARM64 (glibc)" },
{ dir: "linux-x64-musl", target: "bun-linux-x64-musl", binary: "oh-my-opencode", description: "Linux x64 (musl)" },
{ dir: "linux-x64-musl-baseline", target: "bun-linux-x64-musl-baseline", binary: "oh-my-opencode", description: "Linux x64 (musl, no AVX2)" },
{ dir: "linux-arm64-musl", target: "bun-linux-arm64-musl", binary: "oh-my-opencode", description: "Linux ARM64 (musl)" },
{ dir: "windows-x64", target: "bun-windows-x64", binary: "oh-my-opencode.exe", description: "Windows x64" },
{ dir: "windows-x64-baseline", target: "bun-windows-x64-baseline", binary: "oh-my-opencode.exe", description: "Windows x64 (no AVX2)" },
];
const ENTRY_POINT = "src/cli/index.ts";

View File

@@ -815,6 +815,158 @@
"created_at": "2026-01-25T03:13:52Z",
"repoId": 1108837393,
"pullRequestNo": 1084
},
{
"name": "misyuari",
"id": 12197761,
"comment_id": 3798225767,
"created_at": "2026-01-26T07:31:02Z",
"repoId": 1108837393,
"pullRequestNo": 1132
},
{
"name": "boguan",
"id": 3226538,
"comment_id": 3798448537,
"created_at": "2026-01-26T08:40:37Z",
"repoId": 1108837393,
"pullRequestNo": 1137
},
{
"name": "boguan",
"id": 3226538,
"comment_id": 3798471978,
"created_at": "2026-01-26T08:46:03Z",
"repoId": 1108837393,
"pullRequestNo": 1137
},
{
"name": "Jeremy-Kr",
"id": 110771206,
"comment_id": 3799211732,
"created_at": "2026-01-26T11:59:13Z",
"repoId": 1108837393,
"pullRequestNo": 1141
},
{
"name": "orientpine",
"id": 32758428,
"comment_id": 3799897021,
"created_at": "2026-01-26T14:30:33Z",
"repoId": 1108837393,
"pullRequestNo": 1145
},
{
"name": "craftaholic",
"id": 63741110,
"comment_id": 3797014417,
"created_at": "2026-01-25T17:52:34Z",
"repoId": 1108837393,
"pullRequestNo": 1110
},
{
"name": "acamq",
"id": 179265037,
"comment_id": 3801038978,
"created_at": "2026-01-26T18:20:17Z",
"repoId": 1108837393,
"pullRequestNo": 1151
},
{
"name": "itsmylife44",
"id": 34112129,
"comment_id": 3802225779,
"created_at": "2026-01-26T23:20:30Z",
"repoId": 1108837393,
"pullRequestNo": 1157
},
{
"name": "ghtndl",
"id": 117787238,
"comment_id": 3802593326,
"created_at": "2026-01-27T01:27:17Z",
"repoId": 1108837393,
"pullRequestNo": 1158
},
{
"name": "alvinunreal",
"id": 204474669,
"comment_id": 3796402213,
"created_at": "2026-01-25T10:26:58Z",
"repoId": 1108837393,
"pullRequestNo": 1100
},
{
"name": "MoerAI",
"id": 26067127,
"comment_id": 3803968993,
"created_at": "2026-01-27T09:00:57Z",
"repoId": 1108837393,
"pullRequestNo": 1172
},
{
"name": "moha-abdi",
"id": 83307623,
"comment_id": 3804988070,
"created_at": "2026-01-27T12:36:21Z",
"repoId": 1108837393,
"pullRequestNo": 1179
},
{
"name": "zycaskevin",
"id": 223135116,
"comment_id": 3806137669,
"created_at": "2026-01-27T16:20:38Z",
"repoId": 1108837393,
"pullRequestNo": 1184
},
{
"name": "agno01",
"id": 4479380,
"comment_id": 3808373433,
"created_at": "2026-01-28T01:02:02Z",
"repoId": 1108837393,
"pullRequestNo": 1188
},
{
"name": "rooftop-Owl",
"id": 254422872,
"comment_id": 3809867225,
"created_at": "2026-01-28T08:46:58Z",
"repoId": 1108837393,
"pullRequestNo": 1197
},
{
"name": "youming-ai",
"id": 173424537,
"comment_id": 3811195276,
"created_at": "2026-01-28T13:04:16Z",
"repoId": 1108837393,
"pullRequestNo": 1203
},
{
"name": "KennyDizi",
"id": 16578966,
"comment_id": 3811619818,
"created_at": "2026-01-28T14:26:10Z",
"repoId": 1108837393,
"pullRequestNo": 1214
},
{
"name": "mrdavidlaing",
"id": 227505,
"comment_id": 3813542625,
"created_at": "2026-01-28T19:51:34Z",
"repoId": 1108837393,
"pullRequestNo": 1226
},
{
"name": "Lynricsy",
"id": 62173814,
"comment_id": 3816370548,
"created_at": "2026-01-29T09:00:28Z",
"repoId": 1108837393,
"pullRequestNo": 1241
}
]
}

View File

@@ -1,31 +1,28 @@
# AGENTS KNOWLEDGE BASE
## OVERVIEW
10 AI agents for multi-model orchestration. Sisyphus (primary), Atlas (orchestrator), oracle, librarian, explore, multimodal-looker, Prometheus, Metis, Momus, Sisyphus-Junior.
## STRUCTURE
```
agents/
├── atlas.ts # Master Orchestrator (572 lines)
├── sisyphus.ts # Main prompt (450 lines)
├── sisyphus-junior.ts # Delegated task executor (135 lines)
├── dynamic-agent-prompt-builder.ts # Dynamic prompt generation (359 lines)
├── atlas.ts # Master Orchestrator (holds todo list)
├── sisyphus.ts # Main prompt (SF Bay Area engineer identity)
├── sisyphus-junior.ts # Delegated task executor (category-spawned)
├── oracle.ts # Strategic advisor (GPT-5.2)
├── librarian.ts # Multi-repo research (326 lines)
├── explore.ts # Fast grep (Grok Code)
├── librarian.ts # Multi-repo research (GitHub CLI, Context7)
├── explore.ts # Fast contextual grep (Grok Code)
├── multimodal-looker.ts # Media analyzer (Gemini 3 Flash)
├── prometheus-prompt.ts # Planning (1196 lines)
├── metis.ts # Plan consultant (315 lines)
├── momus.ts # Plan reviewer (444 lines)
├── prometheus-prompt.ts # Planning (Interview/Consultant mode, 1196 lines)
├── metis.ts # Pre-planning analysis (Gap detection)
├── momus.ts # Plan reviewer (Ruthless fault-finding)
├── dynamic-agent-prompt-builder.ts # Dynamic prompt generation
├── types.ts # AgentModelConfig, AgentPromptMetadata
├── utils.ts # createBuiltinAgents(), resolveModelWithFallback()
└── index.ts # builtinAgents export
```
## AGENT MODELS
| Agent | Model | Temp | Purpose |
|-------|-------|------|---------|
| Sisyphus | anthropic/claude-opus-4-5 | 0.1 | Primary orchestrator |
@@ -40,14 +37,12 @@ agents/
| Sisyphus-Junior | anthropic/claude-sonnet-4-5 | 0.1 | Category-spawned executor |
## HOW TO ADD
1. Create `src/agents/my-agent.ts` exporting factory + metadata
2. Add to `agentSources` in `src/agents/utils.ts`
3. Update `AgentNameSchema` in `src/config/schema.ts`
4. Register in `src/index.ts` initialization
1. Create `src/agents/my-agent.ts` exporting factory + metadata.
2. Add to `agentSources` in `src/agents/utils.ts`.
3. Update `AgentNameSchema` in `src/config/schema.ts`.
4. Register in `src/index.ts` initialization.
## TOOL RESTRICTIONS
| Agent | Denied Tools |
|-------|-------------|
| oracle | write, edit, task, delegate_task |
@@ -57,14 +52,13 @@ agents/
| Sisyphus-Junior | task, delegate_task |
## PATTERNS
- **Factory**: `createXXXAgent(model?: string): AgentConfig`
- **Metadata**: `XXX_PROMPT_METADATA` with category, cost, triggers
- **Tool restrictions**: `createAgentToolRestrictions(tools)` or `createAgentToolAllowlist(tools)`
- **Thinking**: 32k budget tokens for Sisyphus, Oracle, Prometheus, Atlas
- **Factory**: `createXXXAgent(model: string): AgentConfig`
- **Metadata**: `XXX_PROMPT_METADATA` with category, cost, triggers.
- **Tool restrictions**: `createAgentToolRestrictions(tools)` or `createAgentToolAllowlist(tools)`.
- **Thinking**: 32k budget tokens for Sisyphus, Oracle, Prometheus, Atlas.
## ANTI-PATTERNS
- **Trust reports**: NEVER trust "I'm done" - verify outputs
- **High temp**: Don't use >0.3 for code agents
- **Sequential calls**: Use `delegate_task` with `run_in_background`
- **Trust reports**: NEVER trust "I'm done" - verify outputs.
- **High temp**: Don't use >0.3 for code agents.
- **Sequential calls**: Use `delegate_task` with `run_in_background` for exploration.
- **Prometheus writing code**: Planner only - never implements.

View File

@@ -523,18 +523,15 @@ function buildDynamicOrchestratorPrompt(ctx?: OrchestratorContext): string {
}
export function createAtlasAgent(ctx: OrchestratorContext): AgentConfig {
if (!ctx.model) {
throw new Error("createAtlasAgent requires a model in context")
}
const restrictions = createAgentToolRestrictions([
"task",
"call_omo_agent",
])
return {
description:
"Orchestrates work via delegate_task() to complete ALL tasks in a todo list until fully done",
"Orchestrates work via delegate_task() to complete ALL tasks in a todo list until fully done. (Atlas - OhMyOpenCode)",
mode: "primary" as const,
model: ctx.model,
...(ctx.model ? { model: ctx.model } : {}),
temperature: 0.1,
prompt: buildDynamicOrchestratorPrompt(ctx),
thinking: { type: "enabled", budgetTokens: 32000 },

View File

@@ -33,7 +33,7 @@ export function createExploreAgent(model: string): AgentConfig {
return {
description:
'Contextual grep for codebases. Answers "Where is X?", "Which file has Y?", "Find the code that does Z". Fire multiple in parallel for broad searches. Specify thoroughness: "quick" for basic, "medium" for moderate, "very thorough" for comprehensive analysis.',
'Contextual grep for codebases. Answers "Where is X?", "Which file has Y?", "Find the code that does Z". Fire multiple in parallel for broad searches. Specify thoroughness: "quick" for basic, "medium" for moderate, "very thorough" for comprehensive analysis. (Explore - OhMyOpenCode)',
mode: "subagent" as const,
model,
temperature: 0.1,

View File

@@ -30,7 +30,7 @@ export function createLibrarianAgent(model: string): AgentConfig {
return {
description:
"Specialized codebase understanding agent for multi-repository analysis, searching remote codebases, retrieving official documentation, and finding implementation examples using GitHub CLI, Context7, and Web Search. MUST BE USED when users ask to look up code in remote repositories, explain library internals, or find usage examples in open source.",
"Specialized codebase understanding agent for multi-repository analysis, searching remote codebases, retrieving official documentation, and finding implementation examples using GitHub CLI, Context7, and Web Search. MUST BE USED when users ask to look up code in remote repositories, explain library internals, or find usage examples in open source. (Librarian - OhMyOpenCode)",
mode: "subagent" as const,
model,
temperature: 0.1,

View File

@@ -230,6 +230,8 @@ call_omo_agent(subagent_type="librarian", prompt="Find OSS implementations of Z.
- [Risk 2]: [Mitigation]
## Directives for Prometheus
### Core Directives
- MUST: [Required action]
- MUST: [Required action]
- MUST NOT: [Forbidden action]
@@ -237,6 +239,29 @@ call_omo_agent(subagent_type="librarian", prompt="Find OSS implementations of Z.
- PATTERN: Follow \`[file:lines]\`
- TOOL: Use \`[specific tool]\` for [purpose]
### QA/Acceptance Criteria Directives (MANDATORY)
> **ZERO USER INTERVENTION PRINCIPLE**: All acceptance criteria MUST be executable by agents.
- MUST: Write acceptance criteria as executable commands (curl, bun test, playwright actions)
- MUST: Include exact expected outputs, not vague descriptions
- MUST: Specify verification tool for each deliverable type (playwright for UI, curl for API, etc.)
- MUST NOT: Create criteria requiring "user manually tests..."
- MUST NOT: Create criteria requiring "user visually confirms..."
- MUST NOT: Create criteria requiring "user clicks/interacts..."
- MUST NOT: Use placeholders without concrete examples (bad: "[endpoint]", good: "/api/users")
Example of GOOD acceptance criteria:
\`\`\`
curl -s http://localhost:3000/api/health | jq '.status'
# Assert: Output is "ok"
\`\`\`
Example of BAD acceptance criteria (FORBIDDEN):
\`\`\`
User opens browser and checks if the page loads correctly.
User confirms the button works as expected.
\`\`\`
## Recommended Approach
[1-2 sentence summary of how to proceed]
\`\`\`
@@ -263,12 +288,16 @@ call_omo_agent(subagent_type="librarian", prompt="Find OSS implementations of Z.
- Ask generic questions ("What's the scope?")
- Proceed without addressing ambiguity
- Make assumptions about user's codebase
- Suggest acceptance criteria requiring user intervention ("user manually tests", "user confirms", "user clicks")
- Leave QA/acceptance criteria vague or placeholder-heavy
**ALWAYS**:
- Classify intent FIRST
- Be specific ("Should this change UserService only, or also AuthService?")
- Explore before asking (for Build/Research intents)
- Provide actionable directives for Prometheus
- Include QA automation directives in every output
- Ensure acceptance criteria are agent-executable (commands, not human actions)
`
const metisRestrictions = createAgentToolRestrictions([
@@ -281,7 +310,7 @@ const metisRestrictions = createAgentToolRestrictions([
export function createMetisAgent(model: string): AgentConfig {
return {
description:
"Pre-planning consultant that analyzes requests to identify hidden intentions, ambiguities, and AI failure points.",
"Pre-planning consultant that analyzes requests to identify hidden intentions, ambiguities, and AI failure points. (Metis - OhMyOpenCode)",
mode: "subagent" as const,
model,
temperature: 0.3,

View File

@@ -399,7 +399,7 @@ export function createMomusAgent(model: string): AgentConfig {
const base = {
description:
"Expert reviewer for evaluating work plans against rigorous clarity, verifiability, and completeness standards.",
"Expert reviewer for evaluating work plans against rigorous clarity, verifiability, and completeness standards. (Momus - OhMyOpenCode)",
mode: "subagent" as const,
model,
temperature: 0.1,

View File

@@ -14,7 +14,7 @@ export function createMultimodalLookerAgent(model: string): AgentConfig {
return {
description:
"Analyze media files (PDFs, images, diagrams) that require interpretation beyond raw text. Extracts specific information or summaries from documents, describes visual content. Use when you need analyzed/extracted data rather than literal file contents.",
"Analyze media files (PDFs, images, diagrams) that require interpretation beyond raw text. Extracts specific information or summaries from documents, describes visual content. Use when you need analyzed/extracted data rather than literal file contents. (Multimodal-Looker - OhMyOpenCode)",
mode: "subagent" as const,
model,
temperature: 0.1,

View File

@@ -105,7 +105,7 @@ export function createOracleAgent(model: string): AgentConfig {
const base = {
description:
"Read-only consultation agent. High-IQ reasoning specialist for debugging hard problems and high-difficulty architecture design.",
"Read-only consultation agent. High-IQ reasoning specialist for debugging hard problems and high-difficulty architecture design. (Oracle - OhMyOpenCode)",
mode: "subagent" as const,
model,
temperature: 0.1,

View File

@@ -863,6 +863,20 @@ Generate plan to: \`.sisyphus/plans/{name}.md\`
\`\`\`markdown
# {Plan Title}
## TL;DR
> **Quick Summary**: [1-2 sentences capturing the core objective and approach]
>
> **Deliverables**: [Bullet list of concrete outputs]
> - [Output 1]
> - [Output 2]
>
> **Estimated Effort**: [Quick | Short | Medium | Large | XL]
> **Parallel Execution**: [YES - N waves | NO - sequential]
> **Critical Path**: [Task X → Task Y → Task Z]
---
## Context
### Original Request
@@ -939,53 +953,89 @@ Each TODO follows RED-GREEN-REFACTOR:
- Example: Create \`src/__tests__/example.test.ts\`
- Verify: \`bun test\` → 1 test passes
### If Manual QA Only
### If Automated Verification Only (NO User Intervention)
**CRITICAL**: Without automated tests, manual verification MUST be exhaustive.
> **CRITICAL PRINCIPLE: ZERO USER INTERVENTION**
>
> **NEVER** create acceptance criteria that require:
> - "User manually tests..." / "사용자가 직접 테스트..."
> - "User visually confirms..." / "사용자가 눈으로 확인..."
> - "User interacts with..." / "사용자가 직접 조작..."
> - "Ask user to verify..." / "사용자에게 확인 요청..."
> - ANY step that requires a human to perform an action
>
> **ALL verification MUST be automated and executable by the agent.**
> If a verification cannot be automated, find an automated alternative or explicitly note it as a known limitation.
Each TODO includes detailed verification procedures:
Each TODO includes EXECUTABLE verification procedures that agents can run directly:
**By Deliverable Type:**
| Type | Verification Tool | Procedure |
|------|------------------|-----------|
| **Frontend/UI** | Playwright browser | Navigate, interact, screenshot |
| **TUI/CLI** | interactive_bash (tmux) | Run command, verify output |
| **API/Backend** | curl / httpie | Send request, verify response |
| **Library/Module** | Node/Python REPL | Import, call, verify |
| **Config/Infra** | Shell commands | Apply, verify state |
| Type | Verification Tool | Automated Procedure |
|------|------------------|---------------------|
| **Frontend/UI** | Playwright browser via playwright skill | Agent navigates, clicks, screenshots, asserts DOM state |
| **TUI/CLI** | interactive_bash (tmux) | Agent runs command, captures output, validates expected strings |
| **API/Backend** | curl / httpie via Bash | Agent sends request, parses response, validates JSON fields |
| **Library/Module** | Node/Python REPL via Bash | Agent imports, calls function, compares output |
| **Config/Infra** | Shell commands via Bash | Agent applies config, runs state check, validates output |
**Evidence Required:**
- Commands run with actual output
- Screenshots for visual changes
- Response bodies for API changes
- Terminal output for CLI changes
**Evidence Requirements (Agent-Executable):**
- Command output captured and compared against expected patterns
- Screenshots saved to .sisyphus/evidence/ for visual verification
- JSON response fields validated with specific assertions
- Exit codes checked (0 = success)
---
## Task Flow
## Execution Strategy
### Parallel Execution Waves
> Maximize throughput by grouping independent tasks into parallel waves.
> Each wave completes before the next begins.
\`\`\`
Task 1 → Task 2 → Task 3
↘ Task 4 (parallel)
Wave 1 (Start Immediately):
├── Task 1: [no dependencies]
└── Task 5: [no dependencies]
Wave 2 (After Wave 1):
├── Task 2: [depends: 1]
├── Task 3: [depends: 1]
└── Task 6: [depends: 5]
Wave 3 (After Wave 2):
└── Task 4: [depends: 2, 3]
Critical Path: Task 1 → Task 2 → Task 4
Parallel Speedup: ~40% faster than sequential
\`\`\`
## Parallelization
### Dependency Matrix
| Group | Tasks | Reason |
|-------|-------|--------|
| A | 2, 3 | Independent files |
| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|---------------------|
| 1 | None | 2, 3 | 5 |
| 2 | 1 | 4 | 3, 6 |
| 3 | 1 | 4 | 2, 6 |
| 4 | 2, 3 | None | None (final) |
| 5 | None | 6 | 1 |
| 6 | 5 | None | 2, 3 |
| Task | Depends On | Reason |
|------|------------|--------|
| 4 | 1 | Requires output from 1 |
### Agent Dispatch Summary
| Wave | Tasks | Recommended Agents |
|------|-------|-------------------|
| 1 | 1, 5 | delegate_task(category="...", load_skills=[...], run_in_background=true) |
| 2 | 2, 3, 6 | dispatch parallel after Wave 1 completes |
| 3 | 4 | final integration task |
---
## TODOs
> Implementation + Test = ONE Task. Never separate.
> Specify parallelizability for EVERY task.
> EVERY task MUST have: Recommended Agent Profile + Parallelization info.
- [ ] 1. [Task Title]
@@ -996,7 +1046,21 @@ Task 1 → Task 2 → Task 3
**Must NOT do**:
- [Specific exclusions from guardrails]
**Parallelizable**: YES (with 3, 4) | NO (depends on 0)
**Recommended Agent Profile**:
> Select category + skills based on task domain. Justify each choice.
- **Category**: \`[visual-engineering | ultrabrain | artistry | quick | unspecified-low | unspecified-high | writing]\`
- Reason: [Why this category fits the task domain]
- **Skills**: [\`skill-1\`, \`skill-2\`]
- \`skill-1\`: [Why needed - domain overlap explanation]
- \`skill-2\`: [Why needed - domain overlap explanation]
- **Skills Evaluated but Omitted**:
- \`omitted-skill\`: [Why domain doesn't overlap]
**Parallelization**:
- **Can Run In Parallel**: YES | NO
- **Parallel Group**: Wave N (with Tasks X, Y) | Sequential
- **Blocks**: [Tasks that depend on this task completing]
- **Blocked By**: [Tasks this depends on] | None (can start immediately)
**References** (CRITICAL - Be Exhaustive):
@@ -1029,53 +1093,76 @@ Task 1 → Task 2 → Task 3
**Acceptance Criteria**:
> CRITICAL: Acceptance = EXECUTION, not just "it should work".
> The executor MUST run these commands and verify output.
> **CRITICAL: AGENT-EXECUTABLE VERIFICATION ONLY**
>
> - Acceptance = EXECUTION by the agent, not "user checks if it works"
> - Every criterion MUST be verifiable by running a command or using a tool
> - NO steps like "user opens browser", "user clicks", "user confirms"
> - If you write "[placeholder]" - REPLACE IT with actual values based on task context
**If TDD (tests enabled):**
- [ ] Test file created: \`[path].test.ts\`
- [ ] Test covers: [specific scenario]
- [ ] \`bun test [file]\` → PASS (N tests, 0 failures)
- [ ] Test file created: src/auth/login.test.ts
- [ ] Test covers: successful login returns JWT token
- [ ] bun test src/auth/login.test.ts → PASS (3 tests, 0 failures)
**Manual Execution Verification (ALWAYS include, even with tests):**
**Automated Verification (ALWAYS include, choose by deliverable type):**
*Choose based on deliverable type:*
**For Frontend/UI changes** (using playwright skill):
\\\`\\\`\\\`
# Agent executes via playwright browser automation:
1. Navigate to: http://localhost:3000/login
2. Fill: input[name="email"] with "test@example.com"
3. Fill: input[name="password"] with "password123"
4. Click: button[type="submit"]
5. Wait for: selector ".dashboard-welcome" to be visible
6. Assert: text "Welcome back" appears on page
7. Screenshot: .sisyphus/evidence/task-1-login-success.png
\\\`\\\`\\\`
**For Frontend/UI changes:**
- [ ] Using playwright browser automation:
- Navigate to: \`http://localhost:[port]/[path]\`
- Action: [click X, fill Y, scroll to Z]
- Verify: [visual element appears, animation completes, state changes]
- Screenshot: Save evidence to \`.sisyphus/evidence/[task-id]-[step].png\`
**For TUI/CLI changes** (using interactive_bash):
\\\`\\\`\\\`
# Agent executes via tmux session:
1. Command: ./my-cli --config test.yaml
2. Wait for: "Configuration loaded" in output
3. Send keys: "q" to quit
4. Assert: Exit code 0
5. Assert: Output contains "Goodbye"
\\\`\\\`\\\`
**For TUI/CLI changes:**
- [ ] Using interactive_bash (tmux session):
- Command: \`[exact command to run]\`
- Input sequence: [if interactive, list inputs]
- Expected output contains: \`[expected string or pattern]\`
- Exit code: [0 for success, specific code if relevant]
**For API/Backend changes** (using Bash curl):
\\\`\\\`\\\`bash
# Agent runs:
curl -s -X POST http://localhost:8080/api/users \\
-H "Content-Type: application/json" \\
-d '{"email":"new@test.com","name":"Test User"}' \\
| jq '.id'
# Assert: Returns non-empty UUID
# Assert: HTTP status 201
\\\`\\\`\\\`
**For API/Backend changes:**
- [ ] Request: \`curl -X [METHOD] http://localhost:[port]/[endpoint] -H "Content-Type: application/json" -d '[body]'\`
- [ ] Response status: [200/201/etc]
- [ ] Response body contains: \`{"key": "expected_value"}\`
**For Library/Module changes** (using Bash node/bun):
\\\`\\\`\\\`bash
# Agent runs:
bun -e "import { validateEmail } from './src/utils/validate'; console.log(validateEmail('test@example.com'))"
# Assert: Output is "true"
bun -e "import { validateEmail } from './src/utils/validate'; console.log(validateEmail('invalid'))"
# Assert: Output is "false"
\\\`\\\`\\\`
**For Library/Module changes:**
- [ ] REPL verification:
\`\`\`
> import { [function] } from '[module]'
> [function]([args])
Expected: [output]
\`\`\`
**For Config/Infra changes** (using Bash):
\\\`\\\`\\\`bash
# Agent runs:
docker compose up -d
# Wait 5s for containers
docker compose ps --format json | jq '.[].State'
# Assert: All states are "running"
\\\`\\\`\\\`
**For Config/Infra changes:**
- [ ] Apply: \`[command to apply config]\`
- [ ] Verify state: \`[command to check state]\`\`[expected output]\`
**Evidence Required:**
- [ ] Command output captured (copy-paste actual terminal output)
- [ ] Screenshot saved (for visual changes)
- [ ] Response body logged (for API changes)
**Evidence to Capture:**
- [ ] Terminal output from verification commands (actual output, not expected)
- [ ] Screenshot files in .sisyphus/evidence/ for UI changes
- [ ] JSON response bodies for API changes
**Commit**: YES | NO (groups with N)
- Message: \`type(scope): desc\`

View File

@@ -20,32 +20,6 @@ ALLOWED: call_omo_agent - You CAN spawn explore/librarian agents for research.
You work ALONE for implementation. No delegation of implementation tasks.
</Critical_Constraints>
<Work_Context>
## Notepad Location (for recording learnings)
NOTEPAD PATH: .sisyphus/notepads/{plan-name}/
- learnings.md: Record patterns, conventions, successful approaches
- issues.md: Record problems, blockers, gotchas encountered
- decisions.md: Record architectural choices and rationales
- problems.md: Record unresolved issues, technical debt
You SHOULD append findings to notepad files after completing work.
IMPORTANT: Always APPEND to notepad files - never overwrite or use Edit tool.
## Plan Location (READ ONLY)
PLAN PATH: .sisyphus/plans/{plan-name}.md
CRITICAL RULE: NEVER MODIFY THE PLAN FILE
The plan file (.sisyphus/plans/*.md) is SACRED and READ-ONLY.
- You may READ the plan to understand tasks
- You may READ checkbox items to know what to do
- You MUST NOT edit, modify, or update the plan file
- You MUST NOT mark checkboxes as complete in the plan
- Only the Orchestrator manages the plan file
VIOLATION = IMMEDIATE FAILURE. The Orchestrator tracks plan state.
</Work_Context>
<Todo_Discipline>
TODO OBSESSION (NON-NEGOTIABLE):
- 2+ steps → todowrite FIRST, atomic breakdown
@@ -110,7 +84,7 @@ export function createSisyphusJuniorAgentWithOverrides(
const base: AgentConfig = {
description: override?.description ??
"Sisyphus-Junior - Focused task executor. Same discipline, no delegation.",
"Focused task executor. Same discipline, no delegation. (Sisyphus-Junior - OhMyOpenCode)",
mode: "subagent" as const,
model,
temperature,

View File

@@ -433,7 +433,7 @@ export function createSisyphusAgent(
const permission = { question: "allow", call_omo_agent: "deny" } as AgentConfig["permission"]
const base = {
description:
"Sisyphus - Powerful AI orchestrator from OhMyOpenCode. Plans obsessively with todos, assesses search complexity before exploration, delegates strategically via category+skills combinations. Uses explore for internal code (parallel-friendly), librarian for external docs.",
"Powerful AI orchestrator. Plans obsessively with todos, assesses search complexity before exploration, delegates strategically via category+skills combinations. Uses explore for internal code (parallel-friendly), librarian for external docs. (Sisyphus - OhMyOpenCode)",
mode: "primary" as const,
model,
maxTokens: 64000,

View File

@@ -1,6 +1,8 @@
import { describe, test, expect } from "bun:test"
import { describe, test, expect, beforeEach, spyOn, afterEach } from "bun:test"
import { createBuiltinAgents } from "./utils"
import type { AgentConfig } from "@opencode-ai/sdk"
import { clearSkillCache } from "../features/opencode-skill-loader/skill-content"
import * as connectedProvidersCache from "../shared/connected-providers-cache"
const TEST_DEFAULT_MODEL = "anthropic/claude-opus-4-5"
@@ -45,17 +47,31 @@ describe("createBuiltinAgents with model overrides", () => {
expect(agents.sisyphus.reasoningEffort).toBeUndefined()
})
test("Oracle uses first fallback entry when no availableModels provided (no cache scenario)", async () => {
// #given - no available models simulates CI without model cache
test("Oracle uses connected provider fallback when availableModels is empty and cache exists", async () => {
// #given - connected providers cache has "openai", which matches oracle's first fallback entry
const cacheSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(["openai"])
// #when
const agents = await createBuiltinAgents([], {}, undefined, TEST_DEFAULT_MODEL)
// #then - uses first fallback entry (openai/gpt-5.2) instead of system default
// #then - oracle resolves via connected cache fallback to openai/gpt-5.2 (not system default)
expect(agents.oracle.model).toBe("openai/gpt-5.2")
expect(agents.oracle.reasoningEffort).toBe("medium")
expect(agents.oracle.textVerbosity).toBe("high")
expect(agents.oracle.thinking).toBeUndefined()
cacheSpy.mockRestore()
})
test("Oracle created without model field when no cache exists (first run scenario)", async () => {
// #given - no cache at all (first run)
const cacheSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(null)
// #when
const agents = await createBuiltinAgents([], {}, undefined, TEST_DEFAULT_MODEL)
// #then - oracle should be created with system default model (fallback to systemDefaultModel)
expect(agents.oracle).toBeDefined()
expect(agents.oracle.model).toBe(TEST_DEFAULT_MODEL)
cacheSpy.mockRestore()
})
test("Oracle with GPT model override has reasoningEffort, no thinking", async () => {
@@ -105,10 +121,54 @@ describe("createBuiltinAgents with model overrides", () => {
})
})
describe("createBuiltinAgents without systemDefaultModel", () => {
test("agents created via connected cache fallback even without systemDefaultModel", async () => {
// #given - connected cache has "openai", which matches oracle's fallback chain
const cacheSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(["openai"])
// #when
const agents = await createBuiltinAgents([], {}, undefined, undefined)
// #then - connected cache enables model resolution despite no systemDefaultModel
expect(agents.oracle).toBeDefined()
expect(agents.oracle.model).toBe("openai/gpt-5.2")
cacheSpy.mockRestore()
})
test("agents NOT created when no cache and no systemDefaultModel (first run without defaults)", async () => {
// #given
const cacheSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(null)
// #when
const agents = await createBuiltinAgents([], {}, undefined, undefined)
// #then
expect(agents.oracle).toBeUndefined()
cacheSpy.mockRestore()
})
test("sisyphus created via connected cache fallback even without systemDefaultModel", async () => {
// #given - connected cache has "anthropic", which matches sisyphus's first fallback entry
const cacheSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(["anthropic"])
// #when
const agents = await createBuiltinAgents([], {}, undefined, undefined)
// #then - connected cache enables model resolution despite no systemDefaultModel
expect(agents.sisyphus).toBeDefined()
expect(agents.sisyphus.model).toBe("anthropic/claude-opus-4-5")
cacheSpy.mockRestore()
})
})
describe("buildAgent with category and skills", () => {
const { buildAgent } = require("./utils")
const TEST_MODEL = "anthropic/claude-opus-4-5"
beforeEach(() => {
clearSkillCache()
})
test("agent with category inherits category settings", () => {
// #given - agent factory that sets category but no model
const source = {
@@ -308,4 +368,158 @@ describe("buildAgent with category and skills", () => {
// #then
expect(agent.prompt).toBe("Base prompt")
})
test("agent with agent-browser skill resolves when browserProvider is set", () => {
// #given
const source = {
"test-agent": () =>
({
description: "Test agent",
skills: ["agent-browser"],
prompt: "Base prompt",
}) as AgentConfig,
}
// #when - browserProvider is "agent-browser"
const agent = buildAgent(source["test-agent"], TEST_MODEL, undefined, undefined, "agent-browser")
// #then - agent-browser skill content should be in prompt
expect(agent.prompt).toContain("agent-browser")
expect(agent.prompt).toContain("Base prompt")
})
test("agent with agent-browser skill NOT resolved when browserProvider not set", () => {
// #given
const source = {
"test-agent": () =>
({
description: "Test agent",
skills: ["agent-browser"],
prompt: "Base prompt",
}) as AgentConfig,
}
// #when - no browserProvider (defaults to playwright)
const agent = buildAgent(source["test-agent"], TEST_MODEL)
// #then - agent-browser skill not found, only base prompt remains
expect(agent.prompt).toBe("Base prompt")
expect(agent.prompt).not.toContain("agent-browser open")
})
})
describe("override.category expansion in createBuiltinAgents", () => {
test("standard agent override with category expands category properties", async () => {
// #given
const overrides = {
oracle: { category: "ultrabrain" } as any,
}
// #when
const agents = await createBuiltinAgents([], overrides, undefined, TEST_DEFAULT_MODEL)
// #then - ultrabrain category: model=openai/gpt-5.2-codex, variant=xhigh
expect(agents.oracle).toBeDefined()
expect(agents.oracle.model).toBe("openai/gpt-5.2-codex")
expect(agents.oracle.variant).toBe("xhigh")
})
test("standard agent override with category AND direct variant - direct wins", async () => {
// #given - ultrabrain has variant=xhigh, but direct override says "max"
const overrides = {
oracle: { category: "ultrabrain", variant: "max" } as any,
}
// #when
const agents = await createBuiltinAgents([], overrides, undefined, TEST_DEFAULT_MODEL)
// #then - direct variant overrides category variant
expect(agents.oracle).toBeDefined()
expect(agents.oracle.variant).toBe("max")
})
test("standard agent override with category AND direct reasoningEffort - direct wins", async () => {
// #given - custom category has reasoningEffort=xhigh, direct override says "low"
const categories = {
"test-cat": {
model: "openai/gpt-5.2",
reasoningEffort: "xhigh" as const,
},
}
const overrides = {
oracle: { category: "test-cat", reasoningEffort: "low" } as any,
}
// #when
const agents = await createBuiltinAgents([], overrides, undefined, TEST_DEFAULT_MODEL, categories)
// #then - direct reasoningEffort wins over category
expect(agents.oracle).toBeDefined()
expect(agents.oracle.reasoningEffort).toBe("low")
})
test("standard agent override with category applies reasoningEffort from category when no direct override", async () => {
// #given - custom category has reasoningEffort, no direct reasoningEffort in override
const categories = {
"reasoning-cat": {
model: "openai/gpt-5.2",
reasoningEffort: "high" as const,
},
}
const overrides = {
oracle: { category: "reasoning-cat" } as any,
}
// #when
const agents = await createBuiltinAgents([], overrides, undefined, TEST_DEFAULT_MODEL, categories)
// #then - category reasoningEffort is applied
expect(agents.oracle).toBeDefined()
expect(agents.oracle.reasoningEffort).toBe("high")
})
test("sisyphus override with category expands category properties", async () => {
// #given
const overrides = {
sisyphus: { category: "ultrabrain" } as any,
}
// #when
const agents = await createBuiltinAgents([], overrides, undefined, TEST_DEFAULT_MODEL)
// #then - ultrabrain category: model=openai/gpt-5.2-codex, variant=xhigh
expect(agents.sisyphus).toBeDefined()
expect(agents.sisyphus.model).toBe("openai/gpt-5.2-codex")
expect(agents.sisyphus.variant).toBe("xhigh")
})
test("atlas override with category expands category properties", async () => {
// #given
const overrides = {
atlas: { category: "ultrabrain" } as any,
}
// #when
const agents = await createBuiltinAgents([], overrides, undefined, TEST_DEFAULT_MODEL)
// #then - ultrabrain category: model=openai/gpt-5.2-codex, variant=xhigh
expect(agents.atlas).toBeDefined()
expect(agents.atlas.model).toBe("openai/gpt-5.2-codex")
expect(agents.atlas.variant).toBe("xhigh")
})
test("override with non-existent category has no effect on config", async () => {
// #given
const overrides = {
oracle: { category: "non-existent-category" } as any,
}
// #when
const agents = await createBuiltinAgents([], overrides, undefined, TEST_DEFAULT_MODEL)
// #then - no category-specific variant/reasoningEffort applied from non-existent category
expect(agents.oracle).toBeDefined()
const agentsWithoutOverride = await createBuiltinAgents([], {}, undefined, TEST_DEFAULT_MODEL)
expect(agents.oracle.model).toBe(agentsWithoutOverride.oracle.model)
})
})

View File

@@ -10,11 +10,12 @@ import { createMetisAgent } from "./metis"
import { createAtlasAgent } from "./atlas"
import { createMomusAgent } from "./momus"
import type { AvailableAgent, AvailableCategory, AvailableSkill } from "./dynamic-agent-prompt-builder"
import { deepMerge, fetchAvailableModels, resolveModelWithFallback, AGENT_MODEL_REQUIREMENTS, findCaseInsensitive, includesCaseInsensitive } from "../shared"
import { deepMerge, fetchAvailableModels, resolveModelWithFallback, AGENT_MODEL_REQUIREMENTS, findCaseInsensitive, includesCaseInsensitive, readConnectedProvidersCache } from "../shared"
import { DEFAULT_CATEGORIES, CATEGORY_DESCRIPTIONS } from "../tools/delegate-task/constants"
import { resolveMultipleSkills } from "../features/opencode-skill-loader/skill-content"
import { createBuiltinSkills } from "../features/builtin-skills"
import type { LoadedSkill, SkillScope } from "../features/opencode-skill-loader/types"
import type { BrowserAutomationProvider } from "../config/schema"
type AgentSource = AgentFactory | AgentConfig
@@ -50,7 +51,8 @@ export function buildAgent(
source: AgentSource,
model: string,
categories?: CategoriesConfig,
gitMasterConfig?: GitMasterConfig
gitMasterConfig?: GitMasterConfig,
browserProvider?: BrowserAutomationProvider
): AgentConfig {
const base = isFactory(source) ? source(model) : source
const categoryConfigs: Record<string, CategoryConfig> = categories
@@ -74,7 +76,7 @@ export function buildAgent(
}
if (agentWithCategory.skills?.length) {
const { resolved } = resolveMultipleSkills(agentWithCategory.skills, { gitMasterConfig })
const { resolved } = resolveMultipleSkills(agentWithCategory.skills, { gitMasterConfig, browserProvider })
if (resolved.size > 0) {
const skillContent = Array.from(resolved.values()).join("\n\n")
base.prompt = skillContent + (base.prompt ? "\n\n" + base.prompt : "")
@@ -118,6 +120,33 @@ export function createEnvContext(): string {
</omo-env>`
}
/**
* Expands a category reference from an agent override into concrete config properties.
* Category properties are applied unconditionally (overwriting factory defaults),
* because the user's chosen category should take priority over factory base values.
* Direct override properties applied later via mergeAgentConfig() will supersede these.
*/
function applyCategoryOverride(
config: AgentConfig,
categoryName: string,
mergedCategories: Record<string, CategoryConfig>
): AgentConfig {
const categoryConfig = mergedCategories[categoryName]
if (!categoryConfig) return config
const result = { ...config } as AgentConfig & Record<string, unknown>
if (categoryConfig.model) result.model = categoryConfig.model
if (categoryConfig.variant !== undefined) result.variant = categoryConfig.variant
if (categoryConfig.temperature !== undefined) result.temperature = categoryConfig.temperature
if (categoryConfig.reasoningEffort !== undefined) result.reasoningEffort = categoryConfig.reasoningEffort
if (categoryConfig.textVerbosity !== undefined) result.textVerbosity = categoryConfig.textVerbosity
if (categoryConfig.thinking !== undefined) result.thinking = categoryConfig.thinking
if (categoryConfig.top_p !== undefined) result.top_p = categoryConfig.top_p
if (categoryConfig.maxTokens !== undefined) result.maxTokens = categoryConfig.maxTokens
return result as AgentConfig
}
function mergeAgentConfig(
base: AgentConfig,
override: AgentOverrideConfig
@@ -146,14 +175,14 @@ export async function createBuiltinAgents(
categories?: CategoriesConfig,
gitMasterConfig?: GitMasterConfig,
discoveredSkills: LoadedSkill[] = [],
client?: any
client?: any,
browserProvider?: BrowserAutomationProvider,
uiSelectedModel?: string
): Promise<Record<string, AgentConfig>> {
if (!systemDefaultModel) {
throw new Error("createBuiltinAgents requires systemDefaultModel")
}
// Fetch available models at plugin init
const availableModels = client ? await fetchAvailableModels(client) : new Set<string>()
const connectedProviders = readConnectedProvidersCache()
const availableModels = client
? await fetchAvailableModels(client, { connectedProviders: connectedProviders ?? undefined })
: new Set<string>()
const result: Record<string, AgentConfig> = {}
const availableAgents: AvailableAgent[] = []
@@ -167,7 +196,7 @@ export async function createBuiltinAgents(
description: categories?.[name]?.description ?? CATEGORY_DESCRIPTIONS[name] ?? "General tasks",
}))
const builtinSkills = createBuiltinSkills()
const builtinSkills = createBuiltinSkills({ browserProvider })
const builtinSkillNames = new Set(builtinSkills.map(s => s.name))
const builtinAvailable: AvailableSkill[] = builtinSkills.map((skill) => ({
@@ -196,28 +225,35 @@ export async function createBuiltinAgents(
const override = findCaseInsensitive(agentOverrides, agentName)
const requirement = AGENT_MODEL_REQUIREMENTS[agentName]
// Use resolver to determine model
const { model, variant: resolvedVariant } = resolveModelWithFallback({
const resolution = resolveModelWithFallback({
uiSelectedModel,
userModel: override?.model,
fallbackChain: requirement?.fallbackChain,
availableModels,
systemDefaultModel,
})
if (!resolution) continue
const { model, variant: resolvedVariant } = resolution
let config = buildAgent(source, model, mergedCategories, gitMasterConfig)
let config = buildAgent(source, model, mergedCategories, gitMasterConfig, browserProvider)
// Apply variant from override or resolved fallback chain
if (override?.variant) {
config = { ...config, variant: override.variant }
} else if (resolvedVariant) {
// Apply resolved variant from model fallback chain
if (resolvedVariant) {
config = { ...config, variant: resolvedVariant }
}
// Expand override.category into concrete properties (higher priority than factory/resolved)
const overrideCategory = (override as Record<string, unknown> | undefined)?.category as string | undefined
if (overrideCategory) {
config = applyCategoryOverride(config, overrideCategory, mergedCategories)
}
if (agentName === "librarian" && directory && config.prompt) {
const envContext = createEnvContext()
config = { ...config, prompt: config.prompt + envContext }
}
// Direct override properties take highest priority
if (override) {
config = mergeAgentConfig(config, override)
}
@@ -238,72 +274,84 @@ export async function createBuiltinAgents(
const sisyphusOverride = agentOverrides["sisyphus"]
const sisyphusRequirement = AGENT_MODEL_REQUIREMENTS["sisyphus"]
// Use resolver to determine model
const { model: sisyphusModel, variant: sisyphusResolvedVariant } = resolveModelWithFallback({
const sisyphusResolution = resolveModelWithFallback({
uiSelectedModel,
userModel: sisyphusOverride?.model,
fallbackChain: sisyphusRequirement?.fallbackChain,
availableModels,
systemDefaultModel,
})
let sisyphusConfig = createSisyphusAgent(
sisyphusModel,
availableAgents,
undefined,
availableSkills,
availableCategories
)
// Apply variant from override or resolved fallback chain
if (sisyphusOverride?.variant) {
sisyphusConfig = { ...sisyphusConfig, variant: sisyphusOverride.variant }
} else if (sisyphusResolvedVariant) {
sisyphusConfig = { ...sisyphusConfig, variant: sisyphusResolvedVariant }
}
if (sisyphusResolution) {
const { model: sisyphusModel, variant: sisyphusResolvedVariant } = sisyphusResolution
if (directory && sisyphusConfig.prompt) {
const envContext = createEnvContext()
sisyphusConfig = { ...sisyphusConfig, prompt: sisyphusConfig.prompt + envContext }
}
let sisyphusConfig = createSisyphusAgent(
sisyphusModel,
availableAgents,
undefined,
availableSkills,
availableCategories
)
if (sisyphusResolvedVariant) {
sisyphusConfig = { ...sisyphusConfig, variant: sisyphusResolvedVariant }
}
if (sisyphusOverride) {
sisyphusConfig = mergeAgentConfig(sisyphusConfig, sisyphusOverride)
}
const sisOverrideCategory = (sisyphusOverride as Record<string, unknown> | undefined)?.category as string | undefined
if (sisOverrideCategory) {
sisyphusConfig = applyCategoryOverride(sisyphusConfig, sisOverrideCategory, mergedCategories)
}
result["sisyphus"] = sisyphusConfig
if (directory && sisyphusConfig.prompt) {
const envContext = createEnvContext()
sisyphusConfig = { ...sisyphusConfig, prompt: sisyphusConfig.prompt + envContext }
}
if (sisyphusOverride) {
sisyphusConfig = mergeAgentConfig(sisyphusConfig, sisyphusOverride)
}
result["sisyphus"] = sisyphusConfig
}
}
if (!disabledAgents.includes("atlas")) {
const orchestratorOverride = agentOverrides["atlas"]
const atlasRequirement = AGENT_MODEL_REQUIREMENTS["atlas"]
// Use resolver to determine model
const { model: atlasModel, variant: atlasResolvedVariant } = resolveModelWithFallback({
const atlasResolution = resolveModelWithFallback({
uiSelectedModel,
userModel: orchestratorOverride?.model,
fallbackChain: atlasRequirement?.fallbackChain,
availableModels,
systemDefaultModel,
})
let orchestratorConfig = createAtlasAgent({
model: atlasModel,
availableAgents,
availableSkills,
userCategories: categories,
})
// Apply variant from override or resolved fallback chain
if (orchestratorOverride?.variant) {
orchestratorConfig = { ...orchestratorConfig, variant: orchestratorOverride.variant }
} else if (atlasResolvedVariant) {
orchestratorConfig = { ...orchestratorConfig, variant: atlasResolvedVariant }
}
if (atlasResolution) {
const { model: atlasModel, variant: atlasResolvedVariant } = atlasResolution
if (orchestratorOverride) {
orchestratorConfig = mergeAgentConfig(orchestratorConfig, orchestratorOverride)
}
let orchestratorConfig = createAtlasAgent({
model: atlasModel,
availableAgents,
availableSkills,
userCategories: categories,
})
if (atlasResolvedVariant) {
orchestratorConfig = { ...orchestratorConfig, variant: atlasResolvedVariant }
}
result["atlas"] = orchestratorConfig
const atlasOverrideCategory = (orchestratorOverride as Record<string, unknown> | undefined)?.category as string | undefined
if (atlasOverrideCategory) {
orchestratorConfig = applyCategoryOverride(orchestratorConfig, atlasOverrideCategory, mergedCategories)
}
if (orchestratorOverride) {
orchestratorConfig = mergeAgentConfig(orchestratorConfig, orchestratorOverride)
}
result["atlas"] = orchestratorConfig
}
}
return result

View File

@@ -8,7 +8,7 @@ CLI entry: `bunx oh-my-opencode`. Interactive installer, doctor diagnostics. Com
```
cli/
├── index.ts # Commander.js entry
├── index.ts # Commander.js entry (4 commands)
├── install.ts # Interactive TUI (520 lines)
├── config-manager.ts # JSONC parsing (664 lines)
├── types.ts # InstallArgs, InstallConfig
@@ -18,7 +18,7 @@ cli/
│ ├── runner.ts # Check orchestration
│ ├── formatter.ts # Colored output
│ ├── constants.ts # Check IDs, symbols
│ ├── types.ts # CheckResult, CheckDefinition
│ ├── types.ts # CheckResult, CheckDefinition (114 lines)
│ └── checks/ # 14 checks, 21 files
│ ├── version.ts # OpenCode + plugin version
│ ├── config.ts # JSONC validity, Zod
@@ -38,36 +38,37 @@ cli/
| Command | Purpose |
|---------|---------|
| `install` | Interactive setup |
| `doctor` | 14 health checks |
| `run` | Launch session |
| `get-local-version` | Version check |
| `install` | Interactive setup with provider selection |
| `doctor` | 14 health checks for diagnostics |
| `run` | Launch session with todo enforcement |
| `get-local-version` | Version detection and update check |
## DOCTOR CATEGORIES
## DOCTOR CATEGORIES (14 Checks)
| Category | Checks |
|----------|--------|
| installation | opencode, plugin |
| configuration | config validity, Zod |
| configuration | config validity, Zod, model-resolution |
| authentication | anthropic, openai, google |
| dependencies | ast-grep, comment-checker |
| dependencies | ast-grep, comment-checker, gh-cli |
| tools | LSP, MCP |
| updates | version comparison |
## HOW TO ADD CHECK
1. Create `src/cli/doctor/checks/my-check.ts`
2. Export from `checks/index.ts`
3. Add to `getAllCheckDefinitions()`
2. Export `getXXXCheckDefinition()` factory returning `CheckDefinition`
3. Add to `getAllCheckDefinitions()` in `checks/index.ts`
## TUI FRAMEWORK
- **@clack/prompts**: `select()`, `spinner()`, `intro()`
- **picocolors**: Terminal colors
- **Symbols**: ✓ (pass), ✗ (fail), ⚠ (warn)
- **@clack/prompts**: `select()`, `spinner()`, `intro()`, `outro()`
- **picocolors**: Terminal colors for status and headers
- **Symbols**: ✓ (pass), ✗ (fail), ⚠ (warn), (info)
## ANTI-PATTERNS
- **Blocking in non-TTY**: Check `process.stdout.isTTY`
- **Direct JSON.parse**: Use `parseJsonc()`
- **Silent failures**: Return warn/fail in doctor
- **Blocking in non-TTY**: Always check `process.stdout.isTTY`
- **Direct JSON.parse**: Use `parseJsonc()` from shared utils
- **Silent failures**: Return `warn` or `fail` in doctor instead of throwing
- **Hardcoded paths**: Use `getOpenCodeConfigPaths()` from `config-manager.ts`

View File

@@ -712,7 +712,7 @@ exports[`generateModelConfig fallback providers uses GitHub Copilot models when
"model": "github-copilot/claude-sonnet-4.5",
},
"explore": {
"model": "opencode/gpt-5-nano",
"model": "github-copilot/gpt-5-mini",
},
"librarian": {
"model": "github-copilot/claude-sonnet-4.5",
@@ -776,7 +776,7 @@ exports[`generateModelConfig fallback providers uses GitHub Copilot models with
"model": "github-copilot/claude-sonnet-4.5",
},
"explore": {
"model": "opencode/gpt-5-nano",
"model": "github-copilot/gpt-5-mini",
},
"librarian": {
"model": "github-copilot/claude-sonnet-4.5",
@@ -1022,7 +1022,7 @@ exports[`generateModelConfig mixed provider scenarios uses OpenAI + Copilot comb
"model": "github-copilot/claude-sonnet-4.5",
},
"explore": {
"model": "opencode/gpt-5-nano",
"model": "github-copilot/gpt-5-mini",
},
"librarian": {
"model": "github-copilot/claude-sonnet-4.5",

View File

@@ -8,6 +8,7 @@ import { getDependencyCheckDefinitions } from "./dependencies"
import { getGhCliCheckDefinition } from "./gh"
import { getLspCheckDefinition } from "./lsp"
import { getMcpCheckDefinitions } from "./mcp"
import { getMcpOAuthCheckDefinition } from "./mcp-oauth"
import { getVersionCheckDefinition } from "./version"
export * from "./opencode"
@@ -19,6 +20,7 @@ export * from "./dependencies"
export * from "./gh"
export * from "./lsp"
export * from "./mcp"
export * from "./mcp-oauth"
export * from "./version"
export function getAllCheckDefinitions(): CheckDefinition[] {
@@ -32,6 +34,7 @@ export function getAllCheckDefinitions(): CheckDefinition[] {
getGhCliCheckDefinition(),
getLspCheckDefinition(),
...getMcpCheckDefinitions(),
getMcpOAuthCheckDefinition(),
getVersionCheckDefinition(),
]
}

View File

@@ -0,0 +1,133 @@
import { describe, it, expect, spyOn, afterEach } from "bun:test"
import * as mcpOauth from "./mcp-oauth"
describe("mcp-oauth check", () => {
describe("getMcpOAuthCheckDefinition", () => {
it("returns check definition with correct properties", () => {
// #given
// #when getting definition
const def = mcpOauth.getMcpOAuthCheckDefinition()
// #then should have correct structure
expect(def.id).toBe("mcp-oauth-tokens")
expect(def.name).toBe("MCP OAuth Tokens")
expect(def.category).toBe("tools")
expect(def.critical).toBe(false)
expect(typeof def.check).toBe("function")
})
})
describe("checkMcpOAuthTokens", () => {
let readStoreSpy: ReturnType<typeof spyOn>
afterEach(() => {
readStoreSpy?.mockRestore()
})
it("returns skip when no tokens stored", async () => {
// #given no OAuth tokens configured
readStoreSpy = spyOn(mcpOauth, "readTokenStore").mockReturnValue(null)
// #when checking OAuth tokens
const result = await mcpOauth.checkMcpOAuthTokens()
// #then should skip
expect(result.status).toBe("skip")
expect(result.message).toContain("No OAuth")
})
it("returns pass when all tokens valid", async () => {
// #given valid tokens with future expiry (expiresAt is in epoch seconds)
const futureTime = Math.floor(Date.now() / 1000) + 3600
readStoreSpy = spyOn(mcpOauth, "readTokenStore").mockReturnValue({
"example.com/resource1": {
accessToken: "token1",
expiresAt: futureTime,
},
"example.com/resource2": {
accessToken: "token2",
expiresAt: futureTime,
},
})
// #when checking OAuth tokens
const result = await mcpOauth.checkMcpOAuthTokens()
// #then should pass
expect(result.status).toBe("pass")
expect(result.message).toContain("2")
expect(result.message).toContain("valid")
})
it("returns warn when some tokens expired", async () => {
// #given mix of valid and expired tokens (expiresAt is in epoch seconds)
const futureTime = Math.floor(Date.now() / 1000) + 3600
const pastTime = Math.floor(Date.now() / 1000) - 3600
readStoreSpy = spyOn(mcpOauth, "readTokenStore").mockReturnValue({
"example.com/resource1": {
accessToken: "token1",
expiresAt: futureTime,
},
"example.com/resource2": {
accessToken: "token2",
expiresAt: pastTime,
},
})
// #when checking OAuth tokens
const result = await mcpOauth.checkMcpOAuthTokens()
// #then should warn
expect(result.status).toBe("warn")
expect(result.message).toContain("1")
expect(result.message).toContain("expired")
expect(result.details?.some((d: string) => d.includes("Expired"))).toBe(
true
)
})
it("returns pass when tokens have no expiry", async () => {
// #given tokens without expiry info
readStoreSpy = spyOn(mcpOauth, "readTokenStore").mockReturnValue({
"example.com/resource1": {
accessToken: "token1",
},
})
// #when checking OAuth tokens
const result = await mcpOauth.checkMcpOAuthTokens()
// #then should pass (no expiry = assume valid)
expect(result.status).toBe("pass")
expect(result.message).toContain("1")
})
it("includes token details in output", async () => {
// #given multiple tokens
const futureTime = Math.floor(Date.now() / 1000) + 3600
readStoreSpy = spyOn(mcpOauth, "readTokenStore").mockReturnValue({
"api.example.com/v1": {
accessToken: "token1",
expiresAt: futureTime,
},
"auth.example.com/oauth": {
accessToken: "token2",
expiresAt: futureTime,
},
})
// #when checking OAuth tokens
const result = await mcpOauth.checkMcpOAuthTokens()
// #then should list tokens in details
expect(result.details).toBeDefined()
expect(result.details?.length).toBeGreaterThan(0)
expect(
result.details?.some((d: string) => d.includes("api.example.com"))
).toBe(true)
expect(
result.details?.some((d: string) => d.includes("auth.example.com"))
).toBe(true)
})
})
})

View File

@@ -0,0 +1,80 @@
import type { CheckResult, CheckDefinition } from "../types"
import { CHECK_IDS, CHECK_NAMES } from "../constants"
import { getMcpOauthStoragePath } from "../../../features/mcp-oauth/storage"
import { existsSync, readFileSync } from "node:fs"
interface OAuthTokenData {
accessToken: string
refreshToken?: string
expiresAt?: number
clientInfo?: {
clientId: string
clientSecret?: string
}
}
type TokenStore = Record<string, OAuthTokenData>
export function readTokenStore(): TokenStore | null {
const filePath = getMcpOauthStoragePath()
if (!existsSync(filePath)) {
return null
}
try {
const content = readFileSync(filePath, "utf-8")
return JSON.parse(content) as TokenStore
} catch {
return null
}
}
export async function checkMcpOAuthTokens(): Promise<CheckResult> {
const store = readTokenStore()
if (!store || Object.keys(store).length === 0) {
return {
name: CHECK_NAMES[CHECK_IDS.MCP_OAUTH_TOKENS],
status: "skip",
message: "No OAuth tokens configured",
details: ["Optional: Configure OAuth tokens for MCP servers"],
}
}
const now = Math.floor(Date.now() / 1000)
const tokens = Object.entries(store)
const expiredTokens = tokens.filter(
([, token]) => token.expiresAt && token.expiresAt < now
)
if (expiredTokens.length > 0) {
return {
name: CHECK_NAMES[CHECK_IDS.MCP_OAUTH_TOKENS],
status: "warn",
message: `${expiredTokens.length} of ${tokens.length} token(s) expired`,
details: [
...tokens
.filter(([, token]) => !token.expiresAt || token.expiresAt >= now)
.map(([key]) => `Valid: ${key}`),
...expiredTokens.map(([key]) => `Expired: ${key}`),
],
}
}
return {
name: CHECK_NAMES[CHECK_IDS.MCP_OAUTH_TOKENS],
status: "pass",
message: `${tokens.length} OAuth token(s) valid`,
details: tokens.map(([key]) => `Configured: ${key}`),
}
}
export function getMcpOAuthCheckDefinition(): CheckDefinition {
return {
id: CHECK_IDS.MCP_OAUTH_TOKENS,
name: CHECK_NAMES[CHECK_IDS.MCP_OAUTH_TOKENS],
category: "tools",
check: checkMcpOAuthTokens,
critical: false,
}
}

View File

@@ -199,9 +199,11 @@ function buildDetailsArray(info: ModelResolutionInfo, available: AvailableModels
details.push("═══ Available Models (from cache) ═══")
details.push("")
if (available.cacheExists) {
details.push(` Providers: ${available.providers.length} (${available.providers.slice(0, 8).join(", ")}${available.providers.length > 8 ? "..." : ""})`)
details.push(` Providers in cache: ${available.providers.length}`)
details.push(` Sample: ${available.providers.slice(0, 6).join(", ")}${available.providers.length > 6 ? "..." : ""}`)
details.push(` Total models: ${available.modelCount}`)
details.push(` Cache: ~/.cache/opencode/models.json`)
details.push(` Runtime: only connected providers used`)
details.push(` Refresh: opencode models --refresh`)
} else {
details.push(" ⚠ Cache not found. Run 'opencode' to populate.")

View File

@@ -32,6 +32,7 @@ export const CHECK_IDS = {
LSP_SERVERS: "lsp-servers",
MCP_BUILTIN: "mcp-builtin",
MCP_USER: "mcp-user",
MCP_OAUTH_TOKENS: "mcp-oauth-tokens",
VERSION_STATUS: "version-status",
} as const
@@ -50,6 +51,7 @@ export const CHECK_NAMES: Record<string, string> = {
[CHECK_IDS.LSP_SERVERS]: "LSP Servers",
[CHECK_IDS.MCP_BUILTIN]: "Built-in MCP Servers",
[CHECK_IDS.MCP_USER]: "User MCP Configuration",
[CHECK_IDS.MCP_OAUTH_TOKENS]: "MCP OAuth Tokens",
[CHECK_IDS.VERSION_STATUS]: "Version Status",
} as const

17
src/cli/index.test.ts Normal file
View File

@@ -0,0 +1,17 @@
import { describe, it, expect } from "bun:test"
import packageJson from "../../package.json" with { type: "json" }
describe("CLI version", () => {
it("reads version from package.json as valid semver", () => {
//#given
const semverRegex = /^\d+\.\d+\.\d+(-[\w.]+)?$/
//#when
const version = packageJson.version
//#then
expect(version).toMatch(semverRegex)
expect(typeof version).toBe("string")
expect(version.length).toBeGreaterThan(0)
})
})

View File

@@ -4,6 +4,7 @@ import { install } from "./install"
import { run } from "./run"
import { getLocalVersion } from "./get-local-version"
import { doctor } from "./doctor"
import { createMcpOAuthCommand } from "./mcp-oauth"
import type { InstallArgs } from "./types"
import type { RunOptions } from "./run"
import type { GetLocalVersionOptions } from "./get-local-version/types"
@@ -150,4 +151,6 @@ program
console.log(`oh-my-opencode v${VERSION}`)
})
program.addCommand(createMcpOAuthCommand())
program.parse()

View File

@@ -0,0 +1,123 @@
import { describe, it, expect } from "bun:test"
import { Command } from "commander"
import { createMcpOAuthCommand } from "./index"
describe("mcp oauth command", () => {
describe("command structure", () => {
it("creates mcp command group with oauth subcommand", () => {
// given
const mcpCommand = createMcpOAuthCommand()
// when
const subcommands = mcpCommand.commands.map((cmd: Command) => cmd.name())
// then
expect(subcommands).toContain("oauth")
})
it("oauth subcommand has login, logout, and status subcommands", () => {
// given
const mcpCommand = createMcpOAuthCommand()
const oauthCommand = mcpCommand.commands.find((cmd: Command) => cmd.name() === "oauth")
// when
const subcommands = oauthCommand?.commands.map((cmd: Command) => cmd.name()) ?? []
// then
expect(subcommands).toContain("login")
expect(subcommands).toContain("logout")
expect(subcommands).toContain("status")
})
})
describe("login subcommand", () => {
it("exists and has description", () => {
// given
const mcpCommand = createMcpOAuthCommand()
const oauthCommand = mcpCommand.commands.find((cmd: Command) => cmd.name() === "oauth")
const loginCommand = oauthCommand?.commands.find((cmd: Command) => cmd.name() === "login")
// when
const description = loginCommand?.description() ?? ""
// then
expect(loginCommand).toBeDefined()
expect(description).toContain("OAuth")
})
it("accepts --server-url option", () => {
// given
const mcpCommand = createMcpOAuthCommand()
const oauthCommand = mcpCommand.commands.find((cmd: Command) => cmd.name() === "oauth")
const loginCommand = oauthCommand?.commands.find((cmd: Command) => cmd.name() === "login")
// when
const options = loginCommand?.options ?? []
const serverUrlOption = options.find((opt: { long?: string }) => opt.long === "--server-url")
// then
expect(serverUrlOption).toBeDefined()
})
it("accepts --client-id option", () => {
// given
const mcpCommand = createMcpOAuthCommand()
const oauthCommand = mcpCommand.commands.find((cmd: Command) => cmd.name() === "oauth")
const loginCommand = oauthCommand?.commands.find((cmd: Command) => cmd.name() === "login")
// when
const options = loginCommand?.options ?? []
const clientIdOption = options.find((opt: { long?: string }) => opt.long === "--client-id")
// then
expect(clientIdOption).toBeDefined()
})
it("accepts --scopes option", () => {
// given
const mcpCommand = createMcpOAuthCommand()
const oauthCommand = mcpCommand.commands.find((cmd: Command) => cmd.name() === "oauth")
const loginCommand = oauthCommand?.commands.find((cmd: Command) => cmd.name() === "login")
// when
const options = loginCommand?.options ?? []
const scopesOption = options.find((opt: { long?: string }) => opt.long === "--scopes")
// then
expect(scopesOption).toBeDefined()
})
})
describe("logout subcommand", () => {
it("exists and has description", () => {
// given
const mcpCommand = createMcpOAuthCommand()
const oauthCommand = mcpCommand.commands.find((cmd: Command) => cmd.name() === "oauth")
const logoutCommand = oauthCommand?.commands.find((cmd: Command) => cmd.name() === "logout")
// when
const description = logoutCommand?.description() ?? ""
// then
expect(logoutCommand).toBeDefined()
expect(description).toContain("tokens")
})
})
describe("status subcommand", () => {
it("exists and has description", () => {
// given
const mcpCommand = createMcpOAuthCommand()
const oauthCommand = mcpCommand.commands.find((cmd: Command) => cmd.name() === "oauth")
const statusCommand = oauthCommand?.commands.find((cmd: Command) => cmd.name() === "status")
// when
const description = statusCommand?.description() ?? ""
// then
expect(statusCommand).toBeDefined()
expect(description).toContain("status")
})
})
})

View File

@@ -0,0 +1,43 @@
import { Command } from "commander"
import { login } from "./login"
import { logout } from "./logout"
import { status } from "./status"
export function createMcpOAuthCommand(): Command {
const mcp = new Command("mcp").description("MCP server management")
const oauth = new Command("oauth").description("OAuth token management for MCP servers")
oauth
.command("login <server-name>")
.description("Authenticate with an MCP server using OAuth")
.option("--server-url <url>", "OAuth server URL (required if not in config)")
.option("--client-id <id>", "OAuth client ID (optional, uses DCR if not provided)")
.option("--scopes <scopes...>", "OAuth scopes to request")
.action(async (serverName: string, options) => {
const exitCode = await login(serverName, options)
process.exit(exitCode)
})
oauth
.command("logout <server-name>")
.description("Remove stored OAuth tokens for an MCP server")
.option("--server-url <url>", "OAuth server URL (use if server name differs from URL)")
.action(async (serverName: string, options) => {
const exitCode = await logout(serverName, options)
process.exit(exitCode)
})
oauth
.command("status [server-name]")
.description("Show OAuth token status for MCP servers")
.action(async (serverName: string | undefined) => {
const exitCode = await status(serverName)
process.exit(exitCode)
})
mcp.addCommand(oauth)
return mcp
}
export { login, logout, status }

View File

@@ -0,0 +1,80 @@
import { describe, it, expect, beforeEach, afterEach, mock } from "bun:test"
const mockLogin = mock(() => Promise.resolve({ accessToken: "test-token", expiresAt: 1710000000 }))
mock.module("../../features/mcp-oauth/provider", () => ({
McpOAuthProvider: class MockMcpOAuthProvider {
constructor(public options: { serverUrl: string; clientId?: string; scopes?: string[] }) {}
async login() {
return mockLogin()
}
},
}))
const { login } = await import("./login")
describe("login command", () => {
beforeEach(() => {
mockLogin.mockClear()
})
afterEach(() => {
// cleanup
})
it("returns error code when server-url is not provided", async () => {
// given
const serverName = "test-server"
const options = {}
// when
const exitCode = await login(serverName, options)
// then
expect(exitCode).toBe(1)
})
it("returns success code when login succeeds", async () => {
// given
const serverName = "test-server"
const options = {
serverUrl: "https://oauth.example.com",
}
// when
const exitCode = await login(serverName, options)
// then
expect(exitCode).toBe(0)
expect(mockLogin).toHaveBeenCalledTimes(1)
})
it("returns error code when login throws", async () => {
// given
const serverName = "test-server"
const options = {
serverUrl: "https://oauth.example.com",
}
mockLogin.mockRejectedValueOnce(new Error("Network error"))
// when
const exitCode = await login(serverName, options)
// then
expect(exitCode).toBe(1)
})
it("returns error code when server-url is missing", async () => {
// given
const serverName = "test-server"
const options = {
clientId: "test-client-id",
}
// when
const exitCode = await login(serverName, options)
// then
expect(exitCode).toBe(1)
})
})

View File

@@ -0,0 +1,38 @@
import { McpOAuthProvider } from "../../features/mcp-oauth/provider"
export interface LoginOptions {
serverUrl?: string
clientId?: string
scopes?: string[]
}
export async function login(serverName: string, options: LoginOptions): Promise<number> {
try {
const serverUrl = options.serverUrl
if (!serverUrl) {
console.error(`Error: --server-url is required for server "${serverName}"`)
return 1
}
const provider = new McpOAuthProvider({
serverUrl,
clientId: options.clientId,
scopes: options.scopes,
})
console.log(`Authenticating with ${serverName}...`)
const tokenData = await provider.login()
console.log(`✓ Successfully authenticated with ${serverName}`)
if (tokenData.expiresAt) {
const expiryDate = new Date(tokenData.expiresAt * 1000)
console.log(` Token expires at: ${expiryDate.toISOString()}`)
}
return 0
} catch (error) {
const message = error instanceof Error ? error.message : String(error)
console.error(`Error: Failed to authenticate with ${serverName}: ${message}`)
return 1
}
}

View File

@@ -0,0 +1,65 @@
import { describe, it, expect, beforeEach, afterEach, mock } from "bun:test"
import { existsSync, mkdirSync, rmSync } from "node:fs"
import { join } from "node:path"
import { tmpdir } from "node:os"
import { saveToken } from "../../features/mcp-oauth/storage"
const { logout } = await import("./logout")
describe("logout command", () => {
const TEST_CONFIG_DIR = join(tmpdir(), "mcp-oauth-logout-test-" + Date.now())
let originalConfigDir: string | undefined
beforeEach(() => {
originalConfigDir = process.env.OPENCODE_CONFIG_DIR
process.env.OPENCODE_CONFIG_DIR = TEST_CONFIG_DIR
if (!existsSync(TEST_CONFIG_DIR)) {
mkdirSync(TEST_CONFIG_DIR, { recursive: true })
}
})
afterEach(() => {
if (originalConfigDir === undefined) {
delete process.env.OPENCODE_CONFIG_DIR
} else {
process.env.OPENCODE_CONFIG_DIR = originalConfigDir
}
if (existsSync(TEST_CONFIG_DIR)) {
rmSync(TEST_CONFIG_DIR, { recursive: true, force: true })
}
})
it("returns success code when logout succeeds", async () => {
// given
const serverUrl = "https://test-server.example.com"
saveToken(serverUrl, serverUrl, { accessToken: "test-token" })
// when
const exitCode = await logout("test-server", { serverUrl })
// then
expect(exitCode).toBe(0)
})
it("handles non-existent server gracefully", async () => {
// given
const serverName = "non-existent-server"
// when
const exitCode = await logout(serverName, { serverUrl: "https://nonexistent.example.com" })
// then
expect(exitCode).toBe(0)
})
it("returns error when --server-url is not provided", async () => {
// given
const serverName = "test-server"
// when
const exitCode = await logout(serverName)
// then
expect(exitCode).toBe(1)
})
})

View File

@@ -0,0 +1,30 @@
import { deleteToken } from "../../features/mcp-oauth/storage"
export interface LogoutOptions {
serverUrl?: string
}
export async function logout(serverName: string, options?: LogoutOptions): Promise<number> {
try {
const serverUrl = options?.serverUrl
if (!serverUrl) {
console.error(`Error: --server-url is required for logout. Token storage uses server URLs, not names.`)
console.error(` Usage: mcp oauth logout ${serverName} --server-url https://your-server.example.com`)
return 1
}
const success = deleteToken(serverUrl, serverUrl)
if (success) {
console.log(`✓ Successfully removed tokens for ${serverName}`)
return 0
}
console.error(`Error: Failed to remove tokens for ${serverName}`)
return 1
} catch (error) {
const message = error instanceof Error ? error.message : String(error)
console.error(`Error: Failed to remove tokens for ${serverName}: ${message}`)
return 1
}
}

View File

@@ -0,0 +1,48 @@
import { describe, it, expect, beforeEach, afterEach } from "bun:test"
import { status } from "./status"
describe("status command", () => {
beforeEach(() => {
// setup
})
afterEach(() => {
// cleanup
})
it("returns success code when checking status for specific server", async () => {
// given
const serverName = "test-server"
// when
const exitCode = await status(serverName)
// then
expect(typeof exitCode).toBe("number")
expect(exitCode).toBe(0)
})
it("returns success code when checking status for all servers", async () => {
// given
const serverName = undefined
// when
const exitCode = await status(serverName)
// then
expect(typeof exitCode).toBe("number")
expect(exitCode).toBe(0)
})
it("handles non-existent server gracefully", async () => {
// given
const serverName = "non-existent-server"
// when
const exitCode = await status(serverName)
// then
expect(typeof exitCode).toBe("number")
expect(exitCode).toBe(0)
})
})

View File

@@ -0,0 +1,50 @@
import { listAllTokens, listTokensByHost } from "../../features/mcp-oauth/storage"
export async function status(serverName: string | undefined): Promise<number> {
try {
if (serverName) {
const tokens = listTokensByHost(serverName)
if (Object.keys(tokens).length === 0) {
console.log(`No tokens found for ${serverName}`)
return 0
}
console.log(`OAuth Status for ${serverName}:`)
for (const [key, token] of Object.entries(tokens)) {
console.log(` ${key}:`)
console.log(` Access Token: [REDACTED]`)
if (token.refreshToken) {
console.log(` Refresh Token: [REDACTED]`)
}
if (token.expiresAt) {
const expiryDate = new Date(token.expiresAt * 1000)
const now = Date.now() / 1000
const isExpired = token.expiresAt < now
const tokenStatus = isExpired ? "EXPIRED" : "VALID"
console.log(` Expiry: ${expiryDate.toISOString()} (${tokenStatus})`)
}
}
return 0
}
const tokens = listAllTokens()
if (Object.keys(tokens).length === 0) {
console.log("No OAuth tokens stored")
return 0
}
console.log("Stored OAuth Tokens:")
for (const [key, token] of Object.entries(tokens)) {
const isExpired = token.expiresAt && token.expiresAt < Date.now() / 1000
const tokenStatus = isExpired ? "EXPIRED" : "VALID"
console.log(` ${key}: ${tokenStatus}`)
}
return 0
} catch (error) {
const message = error instanceof Error ? error.message : String(error)
console.error(`Error: Failed to get token status: ${message}`)
return 1
}
}

View File

@@ -353,6 +353,17 @@ describe("generateModelConfig", () => {
// #then explore should use gpt-5-nano (fallback)
expect(result.agents?.explore?.model).toBe("opencode/gpt-5-nano")
})
test("explore uses gpt-5-mini when only Copilot available", () => {
// #given only Copilot is available
const config = createConfig({ hasCopilot: true })
// #when generateModelConfig is called
const result = generateModelConfig(config)
// #then explore should use gpt-5-mini (Copilot fallback)
expect(result.agents?.explore?.model).toBe("github-copilot/gpt-5-mini")
})
})
describe("Sisyphus agent special cases", () => {

View File

@@ -139,12 +139,14 @@ export function generateModelConfig(config: InstallConfig): GeneratedOmoConfig {
continue
}
// Special case: explore uses Claude haiku → OpenCode gpt-5-nano
// Special case: explore uses Claude haiku → GitHub Copilot gpt-5-mini → OpenCode gpt-5-nano
if (role === "explore") {
if (avail.native.claude) {
agents[role] = { model: "anthropic/claude-haiku-4-5" }
} else if (avail.opencodeZen) {
agents[role] = { model: "opencode/claude-haiku-4-5" }
} else if (avail.copilot) {
agents[role] = { model: "github-copilot/gpt-5-mini" }
} else {
agents[role] = { model: "opencode/gpt-5-nano" }
}

View File

@@ -31,8 +31,18 @@ export async function run(options: RunOptions): Promise<number> {
}
try {
// Support custom OpenCode server port via environment variable
// This allows Open Agent and other orchestrators to run multiple
// concurrent missions without port conflicts
const serverPort = process.env.OPENCODE_SERVER_PORT
? parseInt(process.env.OPENCODE_SERVER_PORT, 10)
: undefined
const serverHostname = process.env.OPENCODE_SERVER_HOSTNAME || undefined
const { client, server } = await createOpencode({
signal: abortController.signal,
...(serverPort && !isNaN(serverPort) ? { port: serverPort } : {}),
...(serverHostname ? { hostname: serverHostname } : {}),
})
const cleanup = () => {

View File

@@ -9,6 +9,8 @@ export {
SisyphusAgentConfigSchema,
ExperimentalConfigSchema,
RalphLoopConfigSchema,
TmuxConfigSchema,
TmuxLayoutSchema,
} from "./schema"
export type {
@@ -23,4 +25,6 @@ export type {
ExperimentalConfig,
DynamicContextPruningConfig,
RalphLoopConfig,
TmuxConfig,
TmuxLayout,
} from "./schema"

View File

@@ -1,5 +1,12 @@
import { describe, expect, test } from "bun:test"
import { AgentOverrideConfigSchema, BuiltinCategoryNameSchema, CategoryConfigSchema, OhMyOpenCodeConfigSchema } from "./schema"
import {
AgentOverrideConfigSchema,
BrowserAutomationConfigSchema,
BrowserAutomationProviderSchema,
BuiltinCategoryNameSchema,
CategoryConfigSchema,
OhMyOpenCodeConfigSchema,
} from "./schema"
describe("disabled_mcps schema", () => {
test("should accept built-in MCP names", () => {
@@ -508,3 +515,94 @@ describe("Sisyphus-Junior agent override", () => {
}
})
})
describe("BrowserAutomationProviderSchema", () => {
test("accepts 'playwright' as valid provider", () => {
// #given
const input = "playwright"
// #when
const result = BrowserAutomationProviderSchema.safeParse(input)
// #then
expect(result.success).toBe(true)
expect(result.data).toBe("playwright")
})
test("accepts 'agent-browser' as valid provider", () => {
// #given
const input = "agent-browser"
// #when
const result = BrowserAutomationProviderSchema.safeParse(input)
// #then
expect(result.success).toBe(true)
expect(result.data).toBe("agent-browser")
})
test("rejects invalid provider", () => {
// #given
const input = "invalid-provider"
// #when
const result = BrowserAutomationProviderSchema.safeParse(input)
// #then
expect(result.success).toBe(false)
})
})
describe("BrowserAutomationConfigSchema", () => {
test("defaults provider to 'playwright' when not specified", () => {
// #given
const input = {}
// #when
const result = BrowserAutomationConfigSchema.parse(input)
// #then
expect(result.provider).toBe("playwright")
})
test("accepts agent-browser provider", () => {
// #given
const input = { provider: "agent-browser" }
// #when
const result = BrowserAutomationConfigSchema.parse(input)
// #then
expect(result.provider).toBe("agent-browser")
})
})
describe("OhMyOpenCodeConfigSchema - browser_automation_engine", () => {
test("accepts browser_automation_engine config", () => {
// #given
const input = {
browser_automation_engine: {
provider: "agent-browser",
},
}
// #when
const result = OhMyOpenCodeConfigSchema.safeParse(input)
// #then
expect(result.success).toBe(true)
expect(result.data?.browser_automation_engine?.provider).toBe("agent-browser")
})
test("accepts config without browser_automation_engine", () => {
// #given
const input = {}
// #when
const result = OhMyOpenCodeConfigSchema.safeParse(input)
// #then
expect(result.success).toBe(true)
expect(result.data?.browser_automation_engine).toBeUndefined()
})
})

View File

@@ -30,6 +30,7 @@ export const BuiltinAgentNameSchema = z.enum([
export const BuiltinSkillNameSchema = z.enum([
"playwright",
"agent-browser",
"frontend-ui-ux",
"git-master",
])
@@ -76,6 +77,7 @@ export const HookNameSchema = z.enum([
"thinking-block-validator",
"ralph-loop",
"category-skill-reminder",
"compaction-context-injector",
"claude-code-hooks",
@@ -83,6 +85,7 @@ export const HookNameSchema = z.enum([
"edit-error-recovery",
"delegate-task-retry",
"prometheus-md-only",
"sisyphus-junior-notepad",
"start-work",
"atlas",
])
@@ -113,6 +116,19 @@ export const AgentOverrideConfigSchema = z.object({
.regex(/^#[0-9A-Fa-f]{6}$/)
.optional(),
permission: AgentPermissionSchema.optional(),
/** Maximum tokens for response. Passed directly to OpenCode SDK. */
maxTokens: z.number().optional(),
/** Extended thinking configuration (Anthropic). Overrides category and default settings. */
thinking: z.object({
type: z.enum(["enabled", "disabled"]),
budgetTokens: z.number().optional(),
}).optional(),
/** Reasoning effort level (OpenAI). Overrides category and default settings. */
reasoningEffort: z.enum(["low", "medium", "high", "xhigh"]).optional(),
/** Text verbosity level. */
textVerbosity: z.enum(["low", "medium", "high"]).optional(),
/** Provider-specific options. Passed directly to OpenCode SDK. */
providerOptions: z.record(z.string(), z.unknown()).optional(),
})
export const AgentOverridesSchema = z.object({
@@ -297,6 +313,56 @@ export const GitMasterConfigSchema = z.object({
include_co_authored_by: z.boolean().default(true),
})
export const BrowserAutomationProviderSchema = z.enum(["playwright", "agent-browser", "dev-browser"])
export const BrowserAutomationConfigSchema = z.object({
/**
* Browser automation provider to use for the "playwright" skill.
* - "playwright": Uses Playwright MCP server (@playwright/mcp) - default
* - "agent-browser": Uses Vercel's agent-browser CLI (requires: bun add -g agent-browser)
* - "dev-browser": Uses dev-browser skill with persistent browser state
*/
provider: BrowserAutomationProviderSchema.default("playwright"),
})
export const TmuxLayoutSchema = z.enum([
'main-horizontal', // main pane top, agent panes bottom stack
'main-vertical', // main pane left, agent panes right stack (default)
'tiled', // all panes same size grid
'even-horizontal', // all panes horizontal row
'even-vertical', // all panes vertical stack
])
export const TmuxConfigSchema = z.object({
enabled: z.boolean().default(false),
layout: TmuxLayoutSchema.default('main-vertical'),
main_pane_size: z.number().min(20).max(80).default(60),
main_pane_min_width: z.number().min(40).default(120),
agent_pane_min_width: z.number().min(20).default(40),
})
export const SisyphusTasksConfigSchema = z.object({
/** Enable Sisyphus Tasks system (default: false) */
enabled: z.boolean().default(false),
/** Storage path for tasks (default: .sisyphus/tasks) */
storage_path: z.string().default(".sisyphus/tasks"),
/** Enable Claude Code path compatibility mode */
claude_code_compat: z.boolean().default(false),
})
export const SisyphusSwarmConfigSchema = z.object({
/** Enable Sisyphus Swarm system (default: false) */
enabled: z.boolean().default(false),
/** Storage path for teams (default: .sisyphus/teams) */
storage_path: z.string().default(".sisyphus/teams"),
/** UI mode: toast notifications, tmux panes, or both */
ui_mode: z.enum(["toast", "tmux", "both"]).default("toast"),
})
export const SisyphusConfigSchema = z.object({
tasks: SisyphusTasksConfigSchema.optional(),
swarm: SisyphusSwarmConfigSchema.optional(),
})
export const OhMyOpenCodeConfigSchema = z.object({
$schema: z.string().optional(),
disabled_mcps: z.array(AnyMcpNameSchema).optional(),
@@ -316,6 +382,9 @@ export const OhMyOpenCodeConfigSchema = z.object({
background_task: BackgroundTaskConfigSchema.optional(),
notification: NotificationConfigSchema.optional(),
git_master: GitMasterConfigSchema.optional(),
browser_automation_engine: BrowserAutomationConfigSchema.optional(),
tmux: TmuxConfigSchema.optional(),
sisyphus: SisyphusConfigSchema.optional(),
})
export type OhMyOpenCodeConfig = z.infer<typeof OhMyOpenCodeConfigSchema>
@@ -338,5 +407,12 @@ export type CategoryConfig = z.infer<typeof CategoryConfigSchema>
export type CategoriesConfig = z.infer<typeof CategoriesConfigSchema>
export type BuiltinCategoryName = z.infer<typeof BuiltinCategoryNameSchema>
export type GitMasterConfig = z.infer<typeof GitMasterConfigSchema>
export type BrowserAutomationProvider = z.infer<typeof BrowserAutomationProviderSchema>
export type BrowserAutomationConfig = z.infer<typeof BrowserAutomationConfigSchema>
export type TmuxConfig = z.infer<typeof TmuxConfigSchema>
export type TmuxLayout = z.infer<typeof TmuxLayoutSchema>
export type SisyphusTasksConfig = z.infer<typeof SisyphusTasksConfigSchema>
export type SisyphusSwarmConfig = z.infer<typeof SisyphusSwarmConfigSchema>
export type SisyphusConfig = z.infer<typeof SisyphusConfigSchema>
export { AnyMcpNameSchema, type AnyMcpName, McpNameSchema, type McpName } from "../mcp/types"

View File

@@ -2,34 +2,31 @@
## OVERVIEW
Core feature modules + Claude Code compatibility layer. Background agents, skill MCP, builtin skills/commands, 5 loaders.
Core feature modules + Claude Code compatibility layer. Orchestrates background agents, skill MCPs, builtin skills/commands, and 16 feature modules.
## STRUCTURE
```
features/
├── background-agent/ # Task lifecycle (1335 lines)
├── background-agent/ # Task lifecycle (1377 lines)
│ ├── manager.ts # Launch → poll → complete
── concurrency.ts # Per-provider limits
│ └── types.ts # BackgroundTask, LaunchInput
── skill-mcp-manager/ # MCP client lifecycle (520 lines)
│ ├── manager.ts # Lazy loading, cleanup
│ └── types.ts # SkillMcpConfig
├── builtin-skills/ # Playwright, git-master, frontend-ui-ux
│ └── skills.ts # 1203 lines
├── builtin-commands/ # ralph-loop, refactor, init-deep, start-work, remove-deadcode
│ ├── commands.ts # Command registry
│ └── templates/ # Command templates (4 files)
── concurrency.ts # Per-provider limits
├── builtin-skills/ # Core skills (1729 lines)
│ └── skills.ts # agent-browser, dev-browser, frontend-ui-ux, git-master, typescript-programmer
├── builtin-commands/ # ralph-loop, refactor, ulw-loop, init-deep, start-work, cancel-ralph
├── claude-code-agent-loader/ # ~/.claude/agents/*.md
├── claude-code-command-loader/ # ~/.claude/commands/*.md
├── claude-code-mcp-loader/ # .mcp.json
├── claude-code-mcp-loader/ # .mcp.json with ${VAR} expansion
├── claude-code-plugin-loader/ # installed_plugins.json
├── claude-code-session-state/ # Session persistence
├── opencode-skill-loader/ # Skills from 6 directories
├── context-injector/ # AGENTS.md/README.md injection
├── boulder-state/ # Todo state persistence
├── hook-message-injector/ # Message injection
── task-toast-manager/ # Background task notifications
── task-toast-manager/ # Background task notifications
├── skill-mcp-manager/ # MCP client lifecycle (520 lines)
├── tmux-subagent/ # Tmux session management
└── ... (16 modules total)
```
## LOADER PRIORITY
@@ -44,8 +41,9 @@ features/
- **Lifecycle**: `launch``poll` (2s) → `complete`
- **Stability**: 3 consecutive polls = idle
- **Concurrency**: Per-provider/model limits
- **Concurrency**: Per-provider/model limits via `ConcurrencyManager`
- **Cleanup**: 30m TTL, 3m stale timeout
- **State**: Per-session Maps, cleaned on `session.deleted`
## SKILL MCP
@@ -58,3 +56,4 @@ features/
- **Sequential delegation**: Use `delegate_task` parallel
- **Trust self-reports**: ALWAYS verify
- **Main thread blocks**: No heavy I/O in loader init
- **Direct state mutation**: Use managers for boulder/session state

View File

@@ -170,6 +170,7 @@ function createBackgroundManager(): BackgroundManager {
const client = {
session: {
prompt: async () => ({}),
abort: async () => ({}),
},
}
return new BackgroundManager({ client, directory: tmpdir() } as unknown as PluginInput)
@@ -776,7 +777,7 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
parentModel: { providerID: "old", modelID: "old-model" },
}
const currentMessage: CurrentMessage = {
agent: "Sisyphus",
agent: "sisyphus",
model: { providerID: "anthropic", modelID: "claude-opus-4-5" },
}
@@ -784,7 +785,7 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
const promptBody = buildNotificationPromptBody(task, currentMessage)
// #then - uses currentMessage values, not task.parentModel/parentAgent
expect(promptBody.agent).toBe("Sisyphus")
expect(promptBody.agent).toBe("sisyphus")
expect(promptBody.model).toEqual({ providerID: "anthropic", modelID: "claude-opus-4-5" })
})
@@ -827,11 +828,11 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
status: "completed",
startedAt: new Date(),
completedAt: new Date(),
parentAgent: "Sisyphus",
parentAgent: "sisyphus",
parentModel: { providerID: "anthropic", modelID: "claude-opus" },
}
const currentMessage: CurrentMessage = {
agent: "Sisyphus",
agent: "sisyphus",
model: { providerID: "anthropic" },
}
@@ -839,7 +840,7 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
const promptBody = buildNotificationPromptBody(task, currentMessage)
// #then - model not passed due to incomplete data
expect(promptBody.agent).toBe("Sisyphus")
expect(promptBody.agent).toBe("sisyphus")
expect("model" in promptBody).toBe(false)
})
@@ -856,7 +857,7 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
status: "completed",
startedAt: new Date(),
completedAt: new Date(),
parentAgent: "Sisyphus",
parentAgent: "sisyphus",
parentModel: { providerID: "anthropic", modelID: "claude-opus" },
}
@@ -864,7 +865,7 @@ describe("BackgroundManager.notifyParentSession - dynamic message lookup", () =>
const promptBody = buildNotificationPromptBody(task, null)
// #then - falls back to task.parentAgent, no model
expect(promptBody.agent).toBe("Sisyphus")
expect(promptBody.agent).toBe("sisyphus")
expect("model" in promptBody).toBe(false)
})
})
@@ -1053,6 +1054,7 @@ describe("BackgroundManager.resume model persistence", () => {
promptCalls.push(args)
return {}
},
abort: async () => ({}),
},
}
manager = new BackgroundManager({ client, directory: tmpdir() } as unknown as PluginInput)
@@ -1926,3 +1928,162 @@ describe("BackgroundManager.checkAndInterruptStaleTasks", () => {
})
})
describe("BackgroundManager.shutdown session abort", () => {
test("should call session.abort for all running tasks during shutdown", () => {
// #given
const abortedSessionIDs: string[] = []
const client = {
session: {
prompt: async () => ({}),
abort: async (args: { path: { id: string } }) => {
abortedSessionIDs.push(args.path.id)
return {}
},
},
}
const manager = new BackgroundManager({ client, directory: tmpdir() } as unknown as PluginInput)
const task1: BackgroundTask = {
id: "task-1",
sessionID: "session-1",
parentSessionID: "parent-1",
parentMessageID: "msg-1",
description: "Running task 1",
prompt: "Test",
agent: "test-agent",
status: "running",
startedAt: new Date(),
}
const task2: BackgroundTask = {
id: "task-2",
sessionID: "session-2",
parentSessionID: "parent-2",
parentMessageID: "msg-2",
description: "Running task 2",
prompt: "Test",
agent: "test-agent",
status: "running",
startedAt: new Date(),
}
getTaskMap(manager).set(task1.id, task1)
getTaskMap(manager).set(task2.id, task2)
// #when
manager.shutdown()
// #then
expect(abortedSessionIDs).toContain("session-1")
expect(abortedSessionIDs).toContain("session-2")
expect(abortedSessionIDs).toHaveLength(2)
})
test("should not call session.abort for completed or cancelled tasks", () => {
// #given
const abortedSessionIDs: string[] = []
const client = {
session: {
prompt: async () => ({}),
abort: async (args: { path: { id: string } }) => {
abortedSessionIDs.push(args.path.id)
return {}
},
},
}
const manager = new BackgroundManager({ client, directory: tmpdir() } as unknown as PluginInput)
const completedTask: BackgroundTask = {
id: "task-completed",
sessionID: "session-completed",
parentSessionID: "parent-1",
parentMessageID: "msg-1",
description: "Completed task",
prompt: "Test",
agent: "test-agent",
status: "completed",
startedAt: new Date(),
completedAt: new Date(),
}
const cancelledTask: BackgroundTask = {
id: "task-cancelled",
sessionID: "session-cancelled",
parentSessionID: "parent-2",
parentMessageID: "msg-2",
description: "Cancelled task",
prompt: "Test",
agent: "test-agent",
status: "cancelled",
startedAt: new Date(),
completedAt: new Date(),
}
const pendingTask: BackgroundTask = {
id: "task-pending",
parentSessionID: "parent-3",
parentMessageID: "msg-3",
description: "Pending task",
prompt: "Test",
agent: "test-agent",
status: "pending",
queuedAt: new Date(),
}
getTaskMap(manager).set(completedTask.id, completedTask)
getTaskMap(manager).set(cancelledTask.id, cancelledTask)
getTaskMap(manager).set(pendingTask.id, pendingTask)
// #when
manager.shutdown()
// #then
expect(abortedSessionIDs).toHaveLength(0)
})
test("should call onShutdown callback during shutdown", () => {
// #given
let shutdownCalled = false
const client = {
session: {
prompt: async () => ({}),
abort: async () => ({}),
},
}
const manager = new BackgroundManager(
{ client, directory: tmpdir() } as unknown as PluginInput,
undefined,
{
onShutdown: () => {
shutdownCalled = true
},
}
)
// #when
manager.shutdown()
// #then
expect(shutdownCalled).toBe(true)
})
test("should not throw when onShutdown callback throws", () => {
// #given
const client = {
session: {
prompt: async () => ({}),
abort: async () => ({}),
},
}
const manager = new BackgroundManager(
{ client, directory: tmpdir() } as unknown as PluginInput,
undefined,
{
onShutdown: () => {
throw new Error("cleanup failed")
},
}
)
// #when / #then
expect(() => manager.shutdown()).not.toThrow()
})
})

View File

@@ -7,7 +7,8 @@ import type {
} from "./types"
import { log, getAgentToolRestrictions } from "../../shared"
import { ConcurrencyManager } from "./concurrency"
import type { BackgroundTaskConfig } from "../../config/schema"
import type { BackgroundTaskConfig, TmuxConfig } from "../../config/schema"
import { isInsideTmux } from "../../shared/tmux"
import { subagentSessions } from "../claude-code-session-state"
import { getTaskToastManager } from "../task-toast-manager"
@@ -54,6 +55,14 @@ interface QueueItem {
input: LaunchInput
}
export interface SubagentSessionCreatedEvent {
sessionID: string
parentID: string
title: string
}
export type OnSubagentSessionCreated = (event: SubagentSessionCreatedEvent) => Promise<void>
export class BackgroundManager {
private static cleanupManagers = new Set<BackgroundManager>()
private static cleanupRegistered = false
@@ -68,12 +77,22 @@ export class BackgroundManager {
private concurrencyManager: ConcurrencyManager
private shutdownTriggered = false
private config?: BackgroundTaskConfig
private tmuxEnabled: boolean
private onSubagentSessionCreated?: OnSubagentSessionCreated
private onShutdown?: () => void
private queuesByKey: Map<string, QueueItem[]> = new Map()
private processingKeys: Set<string> = new Set()
constructor(ctx: PluginInput, config?: BackgroundTaskConfig) {
constructor(
ctx: PluginInput,
config?: BackgroundTaskConfig,
options?: {
tmuxConfig?: TmuxConfig
onSubagentSessionCreated?: OnSubagentSessionCreated
onShutdown?: () => void
}
) {
this.tasks = new Map()
this.notifications = new Map()
this.pendingByParent = new Map()
@@ -81,6 +100,9 @@ export class BackgroundManager {
this.directory = ctx.directory
this.concurrencyManager = new ConcurrencyManager(config)
this.config = config
this.tmuxEnabled = options?.tmuxConfig?.enabled ?? false
this.onSubagentSessionCreated = options?.onSubagentSessionCreated
this.onShutdown = options?.onShutdown
this.registerProcessCleanup()
}
@@ -205,7 +227,10 @@ export class BackgroundManager {
body: {
parentID: input.parentSessionID,
title: `Background: ${input.description}`,
},
permission: [
{ permission: "question", action: "deny" as const, pattern: "*" },
],
} as any,
query: {
directory: parentDirectory,
},
@@ -222,6 +247,29 @@ export class BackgroundManager {
const sessionID = createResult.data.id
subagentSessions.add(sessionID)
log("[background-agent] tmux callback check", {
hasCallback: !!this.onSubagentSessionCreated,
tmuxEnabled: this.tmuxEnabled,
isInsideTmux: isInsideTmux(),
sessionID,
parentID: input.parentSessionID,
})
if (this.onSubagentSessionCreated && this.tmuxEnabled && isInsideTmux()) {
log("[background-agent] Invoking tmux callback NOW", { sessionID })
await this.onSubagentSessionCreated({
sessionID,
parentID: input.parentSessionID,
title: input.description,
}).catch((err) => {
log("[background-agent] Failed to spawn tmux pane:", err)
})
log("[background-agent] tmux callback completed, waiting 200ms")
await new Promise(r => setTimeout(r, 200))
} else {
log("[background-agent] SKIP tmux callback - conditions not met")
}
// Update task to running state
task.status = "running"
task.startedAt = new Date()
@@ -252,17 +300,26 @@ export class BackgroundManager {
// Use prompt() instead of promptAsync() to properly initialize agent loop (fire-and-forget)
// Include model if caller provided one (e.g., from Sisyphus category configs)
// IMPORTANT: variant must be a top-level field in the body, NOT nested inside model
// OpenCode's PromptInput schema expects: { model: { providerID, modelID }, variant: "max" }
const launchModel = input.model
? { providerID: input.model.providerID, modelID: input.model.modelID }
: undefined
const launchVariant = input.model?.variant
this.client.session.prompt({
path: { id: sessionID },
body: {
agent: input.agent,
...(input.model ? { model: input.model } : {}),
...(launchModel ? { model: launchModel } : {}),
...(launchVariant ? { variant: launchVariant } : {}),
system: input.skillContent,
tools: {
...getAgentToolRestrictions(input.agent),
task: false,
delegate_task: false,
call_omo_agent: true,
question: false,
},
parts: [{ type: "text", text: input.prompt }],
},
@@ -499,16 +556,24 @@ export class BackgroundManager {
// Use prompt() instead of promptAsync() to properly initialize agent loop
// Include model if task has one (preserved from original launch with category config)
// variant must be top-level in body, not nested inside model (OpenCode PromptInput schema)
const resumeModel = existingTask.model
? { providerID: existingTask.model.providerID, modelID: existingTask.model.modelID }
: undefined
const resumeVariant = existingTask.model?.variant
this.client.session.prompt({
path: { id: existingTask.sessionID },
body: {
agent: existingTask.agent,
...(existingTask.model ? { model: existingTask.model } : {}),
...(resumeModel ? { model: resumeModel } : {}),
...(resumeVariant ? { variant: resumeVariant } : {}),
tools: {
...getAgentToolRestrictions(existingTask.agent),
task: false,
delegate_task: false,
call_omo_agent: true,
question: false,
},
parts: [{ type: "text", text: input.prompt }],
},
@@ -1284,7 +1349,25 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
log("[background-agent] Shutting down BackgroundManager")
this.stopPolling()
// Release concurrency for all running tasks first
// Abort all running sessions to prevent zombie processes (#1240)
for (const task of this.tasks.values()) {
if (task.status === "running" && task.sessionID) {
this.client.session.abort({
path: { id: task.sessionID },
}).catch(() => {})
}
}
// Notify shutdown listeners (e.g., tmux cleanup)
if (this.onShutdown) {
try {
this.onShutdown()
} catch (error) {
log("[background-agent] Error in onShutdown callback:", error)
}
}
// Release concurrency for all running tasks
for (const task of this.tasks.values()) {
if (task.concurrencyKey) {
this.concurrencyManager.release(task.concurrencyKey)

View File

@@ -55,7 +55,6 @@ ${REFACTOR_TEMPLATE}
},
"start-work": {
description: "(builtin) Start Sisyphus work session from Prometheus plan",
agent: "atlas",
template: `<command-instruction>
${START_WORK_TEMPLATE}
</command-instruction>

View File

@@ -25,7 +25,7 @@ export const START_WORK_TEMPLATE = `You are starting a Sisyphus work session.
}
\`\`\`
5. **Read the plan file** and start executing tasks according to Orchestrator Sisyphus workflow
5. **Read the plan file** and start executing tasks according to atlas workflow
## OUTPUT FORMAT
@@ -69,4 +69,4 @@ Reading plan and beginning execution...
- The session_id is injected by the hook - use it directly
- Always update boulder.json BEFORE starting work
- Read the FULL plan file before delegating any tasks
- Follow Orchestrator Sisyphus delegation protocols (7-section format)`
- Follow atlas delegation protocols (7-section format)`

View File

@@ -0,0 +1,336 @@
---
name: agent-browser
description: Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.
---
# Browser Automation with agent-browser
## Quick start
```bash
agent-browser open <url> # Navigate to page
agent-browser snapshot -i # Get interactive elements with refs
agent-browser click @e1 # Click element by ref
agent-browser fill @e2 "text" # Fill input by ref
agent-browser close # Close browser
```
## Core workflow
1. Navigate: `agent-browser open <url>`
2. Snapshot: `agent-browser snapshot -i` (returns elements with refs like `@e1`, `@e2`)
3. Interact using refs from the snapshot
4. Re-snapshot after navigation or significant DOM changes
## Commands
### Navigation
```bash
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser
```
### Snapshot (page analysis)
```bash
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to CSS selector
```
### Interactions (use @refs from snapshot)
```bash
agent-browser click @e1 # Click
agent-browser dblclick @e1 # Double-click
agent-browser focus @e1 # Focus element
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key
agent-browser press Control+a # Key combination
agent-browser keydown Shift # Hold key down
agent-browser keyup Shift # Release key
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value" # Select dropdown
agent-browser scroll down 500 # Scroll page
agent-browser scrollintoview @e1 # Scroll element into view
agent-browser drag @e1 @e2 # Drag and drop
agent-browser upload @e1 file.pdf # Upload files
```
### Get information
```bash
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get innerHTML
agent-browser get value @e1 # Get input value
agent-browser get attr @e1 href # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count ".item" # Count matching elements
agent-browser get box @e1 # Get bounding box
```
### Check state
```bash
agent-browser is visible @e1 # Check if visible
agent-browser is enabled @e1 # Check if enabled
agent-browser is checked @e1 # Check if checked
```
### Screenshots & PDF
```bash
agent-browser screenshot # Screenshot to stdout
agent-browser screenshot path.png # Save to file
agent-browser screenshot --full # Full page
agent-browser pdf output.pdf # Save as PDF
```
### Video recording
```bash
agent-browser record start ./demo.webm # Start recording (uses current URL + state)
agent-browser click @e1 # Perform actions
agent-browser record stop # Stop and save video
agent-browser record restart ./take2.webm # Stop current + start new recording
```
Recording creates a fresh context but preserves cookies/storage from your session.
### Wait
```bash
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text
agent-browser wait --url "**/dashboard" # Wait for URL pattern
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --fn "window.ready" # Wait for JS condition
```
### Mouse control
```bash
agent-browser mouse move 100 200 # Move mouse
agent-browser mouse down left # Press button
agent-browser mouse up left # Release button
agent-browser mouse wheel 100 # Scroll wheel
```
### Semantic locators (alternative to refs)
```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find first ".item" click
agent-browser find nth 2 "a" text
```
### Browser settings
```bash
agent-browser set viewport 1920 1080 # Set viewport size
agent-browser set device "iPhone 14" # Emulate device
agent-browser set geo 37.7749 -122.4194 # Set geolocation
agent-browser set offline on # Toggle offline mode
agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
agent-browser set credentials user pass # HTTP basic auth
agent-browser set media dark # Emulate color scheme
```
### Cookies & Storage
```bash
agent-browser cookies # Get all cookies
agent-browser cookies set name value # Set cookie
agent-browser cookies clear # Clear cookies
agent-browser storage local # Get all localStorage
agent-browser storage local key # Get specific key
agent-browser storage local set k v # Set value
agent-browser storage local clear # Clear all
agent-browser storage session # Get all sessionStorage
agent-browser storage session key # Get specific key
agent-browser storage session set k v # Set value
agent-browser storage session clear # Clear all
```
### Network
```bash
agent-browser network route <url> # Intercept requests
agent-browser network route <url> --abort # Block requests
agent-browser network route <url> --body '{}' # Mock response
agent-browser network unroute [url] # Remove routes
agent-browser network requests # View tracked requests
agent-browser network requests --filter api # Filter requests
```
### Tabs & Windows
```bash
agent-browser tab # List tabs
agent-browser tab new [url] # New tab
agent-browser tab 2 # Switch to tab
agent-browser tab close # Close tab
agent-browser window new # New window
```
### Frames
```bash
agent-browser frame "#iframe" # Switch to iframe
agent-browser frame main # Back to main frame
```
### Dialogs
```bash
agent-browser dialog accept [text] # Accept dialog
agent-browser dialog dismiss # Dismiss dialog
```
### JavaScript
```bash
agent-browser eval "document.title" # Run JavaScript
```
## Global Options
| Option | Description |
|--------|-------------|
| `--session <name>` | Isolated browser session (`AGENT_BROWSER_SESSION` env) |
| `--profile <path>` | Persistent browser profile (`AGENT_BROWSER_PROFILE` env) |
| `--headers <json>` | HTTP headers scoped to URL's origin |
| `--executable-path <path>` | Custom browser binary (`AGENT_BROWSER_EXECUTABLE_PATH` env) |
| `--args <args>` | Browser launch args (`AGENT_BROWSER_ARGS` env) |
| `--user-agent <ua>` | Custom User-Agent (`AGENT_BROWSER_USER_AGENT` env) |
| `--proxy <url>` | Proxy server (`AGENT_BROWSER_PROXY` env) |
| `--proxy-bypass <hosts>` | Hosts to bypass proxy (`AGENT_BROWSER_PROXY_BYPASS` env) |
| `-p, --provider <name>` | Cloud browser provider (`AGENT_BROWSER_PROVIDER` env) |
| `--json` | Machine-readable JSON output |
| `--headed` | Show browser window (not headless) |
| `--cdp <port\|wss://url>` | Connect via Chrome DevTools Protocol |
| `--debug` | Debug output |
## Example: Form submission
```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
```
## Example: Authentication with saved state
```bash
# Login once
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Later sessions: load saved state
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```
### Header-based Auth (Skip login flows)
```bash
# Headers scoped to api.example.com only
agent-browser open api.example.com --headers '{"Authorization": "Bearer <token>"}'
# Navigate to another domain - headers NOT sent (safe)
agent-browser open other-site.com
# Global headers (all domains)
agent-browser set headers '{"X-Custom-Header": "value"}'
```
## Sessions & Persistent Profiles
### Sessions (parallel browsers)
```bash
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list
```
### Persistent Profiles
Persists cookies, localStorage, IndexedDB, service workers, cache, login sessions across browser restarts.
```bash
agent-browser --profile ~/.myapp-profile open myapp.com
# Or via env var
AGENT_BROWSER_PROFILE=~/.myapp-profile agent-browser open myapp.com
```
- Use different profile paths for different projects
- Login once → restart browser → still logged in
- Stores: cookies, localStorage, IndexedDB, service workers, browser cache
## JSON output (for parsing)
Add `--json` for machine-readable output:
```bash
agent-browser snapshot -i --json
agent-browser get text @e1 --json
```
## Debugging
```bash
agent-browser open example.com --headed # Show browser window
agent-browser console # View console messages
agent-browser errors # View page errors
agent-browser record start ./debug.webm # Record from current page
agent-browser record stop # Save recording
agent-browser connect 9222 # Local CDP port
agent-browser --cdp "wss://browser-service.com/cdp?token=..." snapshot # Remote via WebSocket
agent-browser console --clear # Clear console
agent-browser errors --clear # Clear errors
agent-browser highlight @e1 # Highlight element
agent-browser trace start # Start recording trace
agent-browser trace stop trace.zip # Stop and save trace
```
---
## Installation
### Step 1: Install agent-browser CLI
```bash
bun add -g agent-browser
```
### Step 2: Install Playwright browsers
**IMPORTANT**: `agent-browser install` may fail on some platforms (e.g., darwin-arm64) with "No binary found" error. In that case, install Playwright browsers directly:
```bash
# Create a temp project and install playwright
cd /tmp && bun init -y && bun add playwright
# Install Chromium browser
bun playwright install chromium
```
This downloads Chrome for Testing to `~/Library/Caches/ms-playwright/`.
### Verify installation
```bash
agent-browser open https://example.com --headed
```
If the browser opens successfully, installation is complete.
### Troubleshooting
| Error | Solution |
|-------|----------|
| `No binary found for darwin-arm64` | Run `bun playwright install chromium` in a project with playwright dependency |
| `Executable doesn't exist at .../chromium-XXXX` | Re-run `bun playwright install chromium` |
| Browser doesn't open | Ensure `--headed` flag is used for visible browser |
---
Run `agent-browser --help` for all commands. Repo: https://github.com/vercel-labs/agent-browser

View File

@@ -0,0 +1,213 @@
---
name: dev-browser
description: Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.
---
# Dev Browser Skill
Browser automation that maintains page state across script executions. Write small, focused scripts to accomplish tasks incrementally. Once you've proven out part of a workflow and there is repeated work to be done, you can write a script to do the repeated work in a single execution.
## Choosing Your Approach
- **Local/source-available sites**: Read the source code first to write selectors directly
- **Unknown page layouts**: Use `getAISnapshot()` to discover elements and `selectSnapshotRef()` to interact with them
- **Visual feedback**: Take screenshots to see what the user sees
## Setup
> **Installation**: See [references/installation.md](references/installation.md) for detailed setup instructions including Windows support.
Two modes available. Ask the user if unclear which to use.
### Standalone Mode (Default)
Launches a new Chromium browser for fresh automation sessions.
```bash
./skills/dev-browser/server.sh &
```
Add `--headless` flag if user requests it. **Wait for the `Ready` message before running scripts.**
### Extension Mode
Connects to user's existing Chrome browser. Use this when:
- The user is already logged into sites and wants you to do things behind an authed experience that isn't local dev.
- The user asks you to use the extension
**Important**: The core flow is still the same. You create named pages inside of their browser.
**Start the relay server:**
```bash
cd skills/dev-browser && npm i && npm run start-extension &
```
Wait for `Waiting for extension to connect...` followed by `Extension connected` in the console. To know that a client has connected and the browser is ready to be controlled.
**Workflow:**
1. Scripts call `client.page("name")` just like the normal mode to create new pages / connect to existing ones.
2. Automation runs on the user's actual browser session
If the extension hasn't connected yet, tell the user to launch and activate it. Download link: https://github.com/SawyerHood/dev-browser/releases
## Writing Scripts
> **Run all scripts from `skills/dev-browser/` directory.** The `@/` import alias requires this directory's config.
Execute scripts inline using heredocs:
```bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect, waitForPageLoad } from "@/client.js";
const client = await connect();
// Create page with custom viewport size (optional)
const page = await client.page("example", { viewport: { width: 1920, height: 1080 } });
await page.goto("https://example.com");
await waitForPageLoad(page);
console.log({ title: await page.title(), url: page.url() });
await client.disconnect();
EOF
```
**Write to `tmp/` files only when** the script needs reuse, is complex, or user explicitly requests it.
### Key Principles
1. **Small scripts**: Each script does ONE thing (navigate, click, fill, check)
2. **Evaluate state**: Log/return state at the end to decide next steps
3. **Descriptive page names**: Use `"checkout"`, `"login"`, not `"main"`
4. **Disconnect to exit**: `await client.disconnect()` - pages persist on server
5. **Plain JS in evaluate**: `page.evaluate()` runs in browser - no TypeScript syntax
## Workflow Loop
Follow this pattern for complex tasks:
1. **Write a script** to perform one action
2. **Run it** and observe the output
3. **Evaluate** - did it work? What's the current state?
4. **Decide** - is the task complete or do we need another script?
5. **Repeat** until task is done
### No TypeScript in Browser Context
Code passed to `page.evaluate()` runs in the browser, which doesn't understand TypeScript:
```typescript
// ✅ Correct: plain JavaScript
const text = await page.evaluate(() => {
return document.body.innerText;
});
// ❌ Wrong: TypeScript syntax will fail at runtime
const text = await page.evaluate(() => {
const el: HTMLElement = document.body; // Type annotation breaks in browser!
return el.innerText;
});
```
## Scraping Data
For scraping large datasets, intercept and replay network requests rather than scrolling the DOM. See [references/scraping.md](references/scraping.md) for the complete guide covering request capture, schema discovery, and paginated API replay.
## Client API
```typescript
const client = await connect();
// Get or create named page (viewport only applies to new pages)
const page = await client.page("name");
const pageWithSize = await client.page("name", { viewport: { width: 1920, height: 1080 } });
const pages = await client.list(); // List all page names
await client.close("name"); // Close a page
await client.disconnect(); // Disconnect (pages persist)
// ARIA Snapshot methods
const snapshot = await client.getAISnapshot("name"); // Get accessibility tree
const element = await client.selectSnapshotRef("name", "e5"); // Get element by ref
```
The `page` object is a standard Playwright Page.
## Waiting
```typescript
import { waitForPageLoad } from "@/client.js";
await waitForPageLoad(page); // After navigation
await page.waitForSelector(".results"); // For specific elements
await page.waitForURL("**/success"); // For specific URL
```
## Inspecting Page State
### Screenshots
```typescript
await page.screenshot({ path: "tmp/screenshot.png" });
await page.screenshot({ path: "tmp/full.png", fullPage: true });
```
### ARIA Snapshot (Element Discovery)
Use `getAISnapshot()` to discover page elements. Returns YAML-formatted accessibility tree:
```yaml
- banner:
- link "Hacker News" [ref=e1]
- navigation:
- link "new" [ref=e2]
- main:
- list:
- listitem:
- link "Article Title" [ref=e8]
- link "328 comments" [ref=e9]
- contentinfo:
- textbox [ref=e10]
- /placeholder: "Search"
```
**Interpreting refs:**
- `[ref=eN]` - Element reference for interaction (visible, clickable elements only)
- `[checked]`, `[disabled]`, `[expanded]` - Element states
- `[level=N]` - Heading level
- `/url:`, `/placeholder:` - Element properties
**Interacting with refs:**
```typescript
const snapshot = await client.getAISnapshot("hackernews");
console.log(snapshot); // Find the ref you need
const element = await client.selectSnapshotRef("hackernews", "e2");
await element.click();
```
## Error Recovery
Page state persists after failures. Debug with:
```bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect } from "@/client.js";
const client = await connect();
const page = await client.page("hackernews");
await page.screenshot({ path: "tmp/debug.png" });
console.log({
url: page.url(),
title: await page.title(),
bodyText: await page.textContent("body").then((t) => t?.slice(0, 200)),
});
await client.disconnect();
EOF
```

View File

@@ -0,0 +1,193 @@
# Dev Browser Installation Guide
This guide covers installation for all platforms: macOS, Linux, and Windows.
## Prerequisites
- [Node.js](https://nodejs.org) v18 or later with npm
- Git (for cloning the skill)
## Installation
### Step 1: Clone the Skill
```bash
# Clone dev-browser to a temporary location
git clone https://github.com/sawyerhood/dev-browser /tmp/dev-browser-skill
# Copy to skills directory (adjust path as needed)
# For oh-my-opencode: already bundled
# For manual installation:
mkdir -p ~/.config/opencode/skills
cp -r /tmp/dev-browser-skill/skills/dev-browser ~/.config/opencode/skills/dev-browser
# Cleanup
rm -rf /tmp/dev-browser-skill
```
**Windows (PowerShell):**
```powershell
# Clone dev-browser to temp location
git clone https://github.com/sawyerhood/dev-browser $env:TEMP\dev-browser-skill
# Copy to skills directory
New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.config\opencode\skills"
Copy-Item -Recurse "$env:TEMP\dev-browser-skill\skills\dev-browser" "$env:USERPROFILE\.config\opencode\skills\dev-browser"
# Cleanup
Remove-Item -Recurse -Force "$env:TEMP\dev-browser-skill"
```
### Step 2: Install Dependencies
```bash
cd ~/.config/opencode/skills/dev-browser
npm install
```
**Windows (PowerShell):**
```powershell
cd "$env:USERPROFILE\.config\opencode\skills\dev-browser"
npm install
```
### Step 3: Start the Server
#### Standalone Mode (New Browser Instance)
**macOS/Linux:**
```bash
cd ~/.config/opencode/skills/dev-browser
./server.sh &
# Or for headless:
./server.sh --headless &
```
**Windows (PowerShell):**
```powershell
cd "$env:USERPROFILE\.config\opencode\skills\dev-browser"
Start-Process -NoNewWindow -FilePath "node" -ArgumentList "server.js"
# Or for headless:
Start-Process -NoNewWindow -FilePath "node" -ArgumentList "server.js", "--headless"
```
**Windows (CMD):**
```cmd
cd %USERPROFILE%\.config\opencode\skills\dev-browser
start /B node server.js
```
Wait for the `Ready` message before running scripts.
#### Extension Mode (Use Existing Chrome)
**macOS/Linux:**
```bash
cd ~/.config/opencode/skills/dev-browser
npm run start-extension &
```
**Windows (PowerShell):**
```powershell
cd "$env:USERPROFILE\.config\opencode\skills\dev-browser"
Start-Process -NoNewWindow -FilePath "npm" -ArgumentList "run", "start-extension"
```
Wait for `Extension connected` message.
## Chrome Extension Setup (Optional)
The Chrome extension allows controlling your existing Chrome browser with all your logged-in sessions.
### Installation
1. Download `extension.zip` from [latest release](https://github.com/sawyerhood/dev-browser/releases/latest)
2. Extract to a permanent location:
- **macOS/Linux:** `~/.dev-browser-extension`
- **Windows:** `%USERPROFILE%\.dev-browser-extension`
3. Open Chrome → `chrome://extensions`
4. Enable "Developer mode" (toggle in top right)
5. Click "Load unpacked" → select the extracted folder
### Usage
1. Click the Dev Browser extension icon in Chrome toolbar
2. Toggle to "Active"
3. Start the extension relay server (see above)
4. Use dev-browser scripts - they'll control your existing Chrome
## Troubleshooting
### Server Won't Start
**Check Node.js version:**
```bash
node --version # Should be v18+
```
**Check port availability:**
```bash
# macOS/Linux
lsof -i :3000
# Windows
netstat -ano | findstr :3000
```
### Playwright Installation Issues
If Chromium fails to install:
```bash
npx playwright install chromium
```
### Windows-Specific Issues
**Execution Policy:**
If PowerShell scripts are blocked:
```powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```
**Path Issues:**
Use forward slashes or escaped backslashes in paths:
```powershell
# Good
cd "$env:USERPROFILE/.config/opencode/skills/dev-browser"
# Also good
cd "$env:USERPROFILE\.config\opencode\skills\dev-browser"
```
### Extension Not Connecting
1. Ensure extension is "Active" (click icon to toggle)
2. Check relay server is running (`npm run start-extension`)
3. Look for `Extension connected` message in console
4. Try reloading the extension in `chrome://extensions`
## Permissions
To skip permission prompts in Claude Code, add to `~/.claude/settings.json`:
```json
{
"permissions": {
"allow": ["Skill(dev-browser:dev-browser)", "Bash(npx tsx:*)"]
}
}
```
## Updating
```bash
cd ~/.config/opencode/skills/dev-browser
git pull
npm install
```
**Windows:**
```powershell
cd "$env:USERPROFILE\.config\opencode\skills\dev-browser"
git pull
npm install
```

View File

@@ -0,0 +1,155 @@
# Data Scraping Guide
For large datasets (followers, posts, search results), **intercept and replay network requests** rather than scrolling and parsing the DOM. This is faster, more reliable, and handles pagination automatically.
## Why Not Scroll?
Scrolling is slow, unreliable, and wastes time. APIs return structured data with pagination built in. Always prefer API replay.
## Start Small, Then Scale
**Don't try to automate everything at once.** Work incrementally:
1. **Capture one request** - verify you're intercepting the right endpoint
2. **Inspect one response** - understand the schema before writing extraction code
3. **Extract a few items** - make sure your parsing logic works
4. **Then scale up** - add pagination loop only after the basics work
This prevents wasting time debugging a complex script when the issue is a simple path like `data.user.timeline` vs `data.user.result.timeline`.
## Step-by-Step Workflow
### 1. Capture Request Details
First, intercept a request to understand URL structure and required headers:
```typescript
import { connect, waitForPageLoad } from "@/client.js";
import * as fs from "node:fs";
const client = await connect();
const page = await client.page("site");
let capturedRequest = null;
page.on("request", (request) => {
const url = request.url();
// Look for API endpoints (adjust pattern for your target site)
if (url.includes("/api/") || url.includes("/graphql/")) {
capturedRequest = {
url: url,
headers: request.headers(),
method: request.method(),
};
fs.writeFileSync("tmp/request-details.json", JSON.stringify(capturedRequest, null, 2));
console.log("Captured request:", url.substring(0, 80) + "...");
}
});
await page.goto("https://example.com/profile");
await waitForPageLoad(page);
await page.waitForTimeout(3000);
await client.disconnect();
```
### 2. Capture Response to Understand Schema
Save a raw response to inspect the data structure:
```typescript
page.on("response", async (response) => {
const url = response.url();
if (url.includes("UserTweets") || url.includes("/api/data")) {
const json = await response.json();
fs.writeFileSync("tmp/api-response.json", JSON.stringify(json, null, 2));
console.log("Captured response");
}
});
```
Then analyze the structure to find:
- Where the data array lives (e.g., `data.user.result.timeline.instructions[].entries`)
- Where pagination cursors are (e.g., `cursor-bottom` entries)
- What fields you need to extract
### 3. Replay API with Pagination
Once you understand the schema, replay requests directly:
```typescript
import { connect } from "@/client.js";
import * as fs from "node:fs";
const client = await connect();
const page = await client.page("site");
const results = new Map(); // Use Map for deduplication
const headers = JSON.parse(fs.readFileSync("tmp/request-details.json", "utf8")).headers;
const baseUrl = "https://example.com/api/data";
let cursor = null;
let hasMore = true;
while (hasMore) {
// Build URL with pagination cursor
const params = { count: 20 };
if (cursor) params.cursor = cursor;
const url = `${baseUrl}?params=${encodeURIComponent(JSON.stringify(params))}`;
// Execute fetch in browser context (has auth cookies/headers)
const response = await page.evaluate(
async ({ url, headers }) => {
const res = await fetch(url, { headers });
return res.json();
},
{ url, headers }
);
// Extract data and cursor (adjust paths for your API)
const entries = response?.data?.entries || [];
for (const entry of entries) {
if (entry.type === "cursor-bottom") {
cursor = entry.value;
} else if (entry.id && !results.has(entry.id)) {
results.set(entry.id, {
id: entry.id,
text: entry.content,
timestamp: entry.created_at,
});
}
}
console.log(`Fetched page, total: ${results.size}`);
// Check stop conditions
if (!cursor || entries.length === 0) hasMore = false;
// Rate limiting - be respectful
await new Promise((r) => setTimeout(r, 500));
}
// Export results
const data = Array.from(results.values());
fs.writeFileSync("tmp/results.json", JSON.stringify(data, null, 2));
console.log(`Saved ${data.length} items`);
await client.disconnect();
```
## Key Patterns
| Pattern | Description |
| ----------------------- | ------------------------------------------------------ |
| `page.on('request')` | Capture outgoing request URL + headers |
| `page.on('response')` | Capture response data to understand schema |
| `page.evaluate(fetch)` | Replay requests in browser context (inherits auth) |
| `Map` for deduplication | APIs often return overlapping data across pages |
| Cursor-based pagination | Look for `cursor`, `next_token`, `offset` in responses |
## Tips
- **Extension mode**: `page.context().cookies()` doesn't work - capture auth headers from intercepted requests instead
- **Rate limiting**: Add 500ms+ delays between requests to avoid blocks
- **Stop conditions**: Check for empty results, missing cursor, or reaching a date/ID threshold
- **GraphQL APIs**: URL params often include `variables` and `features` JSON objects - capture and reuse them

View File

@@ -1,2 +1,2 @@
export * from "./types"
export { createBuiltinSkills } from "./skills"
export { createBuiltinSkills, type CreateBuiltinSkillsOptions } from "./skills"

View File

@@ -0,0 +1,89 @@
import { describe, test, expect } from "bun:test"
import { createBuiltinSkills } from "./skills"
describe("createBuiltinSkills", () => {
test("returns playwright skill by default", () => {
// #given - no options (default)
// #when
const skills = createBuiltinSkills()
// #then
const browserSkill = skills.find((s) => s.name === "playwright")
expect(browserSkill).toBeDefined()
expect(browserSkill!.description).toContain("browser")
expect(browserSkill!.mcpConfig).toHaveProperty("playwright")
})
test("returns playwright skill when browserProvider is 'playwright'", () => {
// #given
const options = { browserProvider: "playwright" as const }
// #when
const skills = createBuiltinSkills(options)
// #then
const playwrightSkill = skills.find((s) => s.name === "playwright")
const agentBrowserSkill = skills.find((s) => s.name === "agent-browser")
expect(playwrightSkill).toBeDefined()
expect(agentBrowserSkill).toBeUndefined()
})
test("returns agent-browser skill when browserProvider is 'agent-browser'", () => {
// #given
const options = { browserProvider: "agent-browser" as const }
// #when
const skills = createBuiltinSkills(options)
// #then
const agentBrowserSkill = skills.find((s) => s.name === "agent-browser")
const playwrightSkill = skills.find((s) => s.name === "playwright")
expect(agentBrowserSkill).toBeDefined()
expect(agentBrowserSkill!.description).toContain("browser")
expect(agentBrowserSkill!.allowedTools).toContain("Bash(agent-browser:*)")
expect(agentBrowserSkill!.template).toContain("agent-browser")
expect(playwrightSkill).toBeUndefined()
})
test("agent-browser skill template is inlined (not loaded from file)", () => {
// #given
const options = { browserProvider: "agent-browser" as const }
// #when
const skills = createBuiltinSkills(options)
const agentBrowserSkill = skills.find((s) => s.name === "agent-browser")
// #then - template should contain substantial content (inlined, not fallback)
expect(agentBrowserSkill!.template).toContain("## Quick start")
expect(agentBrowserSkill!.template).toContain("## Commands")
expect(agentBrowserSkill!.template).toContain("agent-browser open")
expect(agentBrowserSkill!.template).toContain("agent-browser snapshot")
})
test("always includes frontend-ui-ux and git-master skills", () => {
// #given - both provider options
// #when
const defaultSkills = createBuiltinSkills()
const agentBrowserSkills = createBuiltinSkills({ browserProvider: "agent-browser" })
// #then
for (const skills of [defaultSkills, agentBrowserSkills]) {
expect(skills.find((s) => s.name === "frontend-ui-ux")).toBeDefined()
expect(skills.find((s) => s.name === "git-master")).toBeDefined()
}
})
test("returns exactly 4 skills regardless of provider", () => {
// #given
// #when
const defaultSkills = createBuiltinSkills()
const agentBrowserSkills = createBuiltinSkills({ browserProvider: "agent-browser" })
// #then
expect(defaultSkills).toHaveLength(4)
expect(agentBrowserSkills).toHaveLength(4)
})
})

View File

@@ -1,4 +1,5 @@
import type { BuiltinSkill } from "./types"
import type { BrowserAutomationProvider } from "../../config/schema"
const playwrightSkill: BuiltinSkill = {
name: "playwright",
@@ -14,6 +15,303 @@ This skill provides browser automation capabilities via the Playwright MCP serve
},
}
const agentBrowserSkill: BuiltinSkill = {
name: "agent-browser",
description: "MUST USE for any browser-related tasks. Browser automation via agent-browser CLI - verification, browsing, information gathering, web scraping, testing, screenshots, and all browser interactions.",
template: `# Browser Automation with agent-browser
## Quick start
\`\`\`bash
agent-browser open <url> # Navigate to page
agent-browser snapshot -i # Get interactive elements with refs
agent-browser click @e1 # Click element by ref
agent-browser fill @e2 "text" # Fill input by ref
agent-browser close # Close browser
\`\`\`
## Core workflow
1. Navigate: \`agent-browser open <url>\`
2. Snapshot: \`agent-browser snapshot -i\` (returns elements with refs like \`@e1\`, \`@e2\`)
3. Interact using refs from the snapshot
4. Re-snapshot after navigation or significant DOM changes
## Commands
### Navigation
\`\`\`bash
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser
\`\`\`
### Snapshot (page analysis)
\`\`\`bash
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to CSS selector
\`\`\`
### Interactions (use @refs from snapshot)
\`\`\`bash
agent-browser click @e1 # Click
agent-browser dblclick @e1 # Double-click
agent-browser focus @e1 # Focus element
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key
agent-browser press Control+a # Key combination
agent-browser keydown Shift # Hold key down
agent-browser keyup Shift # Release key
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value" # Select dropdown
agent-browser scroll down 500 # Scroll page
agent-browser scrollintoview @e1 # Scroll element into view
agent-browser drag @e1 @e2 # Drag and drop
agent-browser upload @e1 file.pdf # Upload files
\`\`\`
### Get information
\`\`\`bash
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get innerHTML
agent-browser get value @e1 # Get input value
agent-browser get attr @e1 href # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count ".item" # Count matching elements
agent-browser get box @e1 # Get bounding box
\`\`\`
### Check state
\`\`\`bash
agent-browser is visible @e1 # Check if visible
agent-browser is enabled @e1 # Check if enabled
agent-browser is checked @e1 # Check if checked
\`\`\`
### Screenshots & PDF
\`\`\`bash
agent-browser screenshot # Screenshot to stdout
agent-browser screenshot path.png # Save to file
agent-browser screenshot --full # Full page
agent-browser pdf output.pdf # Save as PDF
\`\`\`
### Video recording
\`\`\`bash
agent-browser record start ./demo.webm # Start recording (uses current URL + state)
agent-browser click @e1 # Perform actions
agent-browser record stop # Stop and save video
agent-browser record restart ./take2.webm # Stop current + start new recording
\`\`\`
Recording creates a fresh context but preserves cookies/storage from your session.
### Wait
\`\`\`bash
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text
agent-browser wait --url "**/dashboard" # Wait for URL pattern
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --fn "window.ready" # Wait for JS condition
\`\`\`
### Mouse control
\`\`\`bash
agent-browser mouse move 100 200 # Move mouse
agent-browser mouse down left # Press button
agent-browser mouse up left # Release button
agent-browser mouse wheel 100 # Scroll wheel
\`\`\`
### Semantic locators (alternative to refs)
\`\`\`bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find first ".item" click
agent-browser find nth 2 "a" text
\`\`\`
### Browser settings
\`\`\`bash
agent-browser set viewport 1920 1080 # Set viewport size
agent-browser set device "iPhone 14" # Emulate device
agent-browser set geo 37.7749 -122.4194 # Set geolocation
agent-browser set offline on # Toggle offline mode
agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
agent-browser set credentials user pass # HTTP basic auth
agent-browser set media dark # Emulate color scheme
\`\`\`
### Cookies & Storage
\`\`\`bash
agent-browser cookies # Get all cookies
agent-browser cookies set name value # Set cookie
agent-browser cookies clear # Clear cookies
agent-browser storage local # Get all localStorage
agent-browser storage local key # Get specific key
agent-browser storage local set k v # Set value
agent-browser storage local clear # Clear all
agent-browser storage session # Get all sessionStorage
agent-browser storage session key # Get specific key
agent-browser storage session set k v # Set value
agent-browser storage session clear # Clear all
\`\`\`
### Network
\`\`\`bash
agent-browser network route <url> # Intercept requests
agent-browser network route <url> --abort # Block requests
agent-browser network route <url> --body '{}' # Mock response
agent-browser network unroute [url] # Remove routes
agent-browser network requests # View tracked requests
agent-browser network requests --filter api # Filter requests
\`\`\`
### Tabs & Windows
\`\`\`bash
agent-browser tab # List tabs
agent-browser tab new [url] # New tab
agent-browser tab 2 # Switch to tab
agent-browser tab close # Close tab
agent-browser window new # New window
\`\`\`
### Frames
\`\`\`bash
agent-browser frame "#iframe" # Switch to iframe
agent-browser frame main # Back to main frame
\`\`\`
### Dialogs
\`\`\`bash
agent-browser dialog accept [text] # Accept dialog
agent-browser dialog dismiss # Dismiss dialog
\`\`\`
### JavaScript
\`\`\`bash
agent-browser eval "document.title" # Run JavaScript
\`\`\`
## Global Options
| Option | Description |
|--------|-------------|
| \`--session <name>\` | Isolated browser session (\`AGENT_BROWSER_SESSION\` env) |
| \`--profile <path>\` | Persistent browser profile (\`AGENT_BROWSER_PROFILE\` env) |
| \`--headers <json>\` | HTTP headers scoped to URL's origin |
| \`--executable-path <path>\` | Custom browser binary (\`AGENT_BROWSER_EXECUTABLE_PATH\` env) |
| \`--args <args>\` | Browser launch args (\`AGENT_BROWSER_ARGS\` env) |
| \`--user-agent <ua>\` | Custom User-Agent (\`AGENT_BROWSER_USER_AGENT\` env) |
| \`--proxy <url>\` | Proxy server (\`AGENT_BROWSER_PROXY\` env) |
| \`--proxy-bypass <hosts>\` | Hosts to bypass proxy (\`AGENT_BROWSER_PROXY_BYPASS\` env) |
| \`-p, --provider <name>\` | Cloud browser provider (\`AGENT_BROWSER_PROVIDER\` env) |
| \`--json\` | Machine-readable JSON output |
| \`--headed\` | Show browser window (not headless) |
| \`--cdp <port\\|wss://url>\` | Connect via Chrome DevTools Protocol |
| \`--debug\` | Debug output |
## Example: Form submission
\`\`\`bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
\`\`\`
## Example: Authentication with saved state
\`\`\`bash
# Login once
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Later sessions: load saved state
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
\`\`\`
### Header-based Auth (Skip login flows)
\`\`\`bash
# Headers scoped to api.example.com only
agent-browser open api.example.com --headers '{"Authorization": "Bearer <token>"}'
# Navigate to another domain - headers NOT sent (safe)
agent-browser open other-site.com
# Global headers (all domains)
agent-browser set headers '{"X-Custom-Header": "value"}'
\`\`\`
## Sessions & Persistent Profiles
### Sessions (parallel browsers)
\`\`\`bash
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list
\`\`\`
### Persistent Profiles
Persists cookies, localStorage, IndexedDB, service workers, cache, login sessions across browser restarts.
\`\`\`bash
agent-browser --profile ~/.myapp-profile open myapp.com
# Or via env var
AGENT_BROWSER_PROFILE=~/.myapp-profile agent-browser open myapp.com
\`\`\`
- Use different profile paths for different projects
- Login once → restart browser → still logged in
- Stores: cookies, localStorage, IndexedDB, service workers, browser cache
## JSON output (for parsing)
Add \`--json\` for machine-readable output:
\`\`\`bash
agent-browser snapshot -i --json
agent-browser get text @e1 --json
\`\`\`
## Debugging
\`\`\`bash
agent-browser open example.com --headed # Show browser window
agent-browser console # View console messages
agent-browser errors # View page errors
agent-browser record start ./debug.webm # Record from current page
agent-browser record stop # Save recording
agent-browser connect 9222 # Local CDP port
agent-browser --cdp "wss://browser-service.com/cdp?token=..." snapshot # Remote via WebSocket
agent-browser console --clear # Clear console
agent-browser errors --clear # Clear errors
agent-browser highlight @e1 # Highlight element
agent-browser trace start # Start recording trace
agent-browser trace stop trace.zip # Stop and save trace
\`\`\`
---
Install: \`bun add -g agent-browser && agent-browser install\`. Run \`agent-browser --help\` for all commands. Repo: https://github.com/vercel-labs/agent-browser`,
allowedTools: ["Bash(agent-browser:*)"],
}
const frontendUiUxSkill: BuiltinSkill = {
name: "frontend-ui-ux",
description: "Designer-turned-developer who crafts stunning UI/UX even without design mockups",
@@ -1198,6 +1496,234 @@ POTENTIAL ACTIONS:
- Bisect without proper good/bad boundaries -> Wasted time`,
}
export function createBuiltinSkills(): BuiltinSkill[] {
return [playwrightSkill, frontendUiUxSkill, gitMasterSkill]
const devBrowserSkill: BuiltinSkill = {
name: "dev-browser",
description:
"Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include 'go to [url]', 'click on', 'fill out the form', 'take a screenshot', 'scrape', 'automate', 'test the website', 'log into', or any browser interaction request.",
template: `# Dev Browser Skill
Browser automation that maintains page state across script executions. Write small, focused scripts to accomplish tasks incrementally. Once you've proven out part of a workflow and there is repeated work to be done, you can write a script to do the repeated work in a single execution.
## Choosing Your Approach
- **Local/source-available sites**: Read the source code first to write selectors directly
- **Unknown page layouts**: Use \`getAISnapshot()\` to discover elements and \`selectSnapshotRef()\` to interact with them
- **Visual feedback**: Take screenshots to see what the user sees
## Setup
**IMPORTANT**: Before using this skill, ensure the server is running. See [references/installation.md](references/installation.md) for platform-specific setup instructions (macOS, Linux, Windows).
Two modes available. Ask the user if unclear which to use.
### Standalone Mode (Default)
Launches a new Chromium browser for fresh automation sessions.
**macOS/Linux:**
\`\`\`bash
./skills/dev-browser/server.sh &
\`\`\`
**Windows (PowerShell):**
\`\`\`powershell
Start-Process -NoNewWindow -FilePath "node" -ArgumentList "skills/dev-browser/server.js"
\`\`\`
Add \`--headless\` flag if user requests it. **Wait for the \`Ready\` message before running scripts.**
### Extension Mode
Connects to user's existing Chrome browser. Use this when:
- The user is already logged into sites and wants you to do things behind an authed experience that isn't local dev.
- The user asks you to use the extension
**Important**: The core flow is still the same. You create named pages inside of their browser.
**Start the relay server:**
**macOS/Linux:**
\`\`\`bash
cd skills/dev-browser && npm i && npm run start-extension &
\`\`\`
**Windows (PowerShell):**
\`\`\`powershell
cd skills/dev-browser; npm i; Start-Process -NoNewWindow -FilePath "npm" -ArgumentList "run", "start-extension"
\`\`\`
Wait for \`Waiting for extension to connect...\` followed by \`Extension connected\` in the console.
If the extension hasn't connected yet, tell the user to launch and activate it. Download link: https://github.com/SawyerHood/dev-browser/releases
## Writing Scripts
> **Run all scripts from \`skills/dev-browser/\` directory.** The \`@/\` import alias requires this directory's config.
Execute scripts inline using heredocs:
**macOS/Linux:**
\`\`\`bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect, waitForPageLoad } from "@/client.js";
const client = await connect();
const page = await client.page("example", { viewport: { width: 1920, height: 1080 } });
await page.goto("https://example.com");
await waitForPageLoad(page);
console.log({ title: await page.title(), url: page.url() });
await client.disconnect();
EOF
\`\`\`
**Windows (PowerShell):**
\`\`\`powershell
cd skills/dev-browser
@"
import { connect, waitForPageLoad } from "@/client.js";
const client = await connect();
const page = await client.page("example", { viewport: { width: 1920, height: 1080 } });
await page.goto("https://example.com");
await waitForPageLoad(page);
console.log({ title: await page.title(), url: page.url() });
await client.disconnect();
"@ | npx tsx --input-type=module
\`\`\`
### Key Principles
1. **Small scripts**: Each script does ONE thing (navigate, click, fill, check)
2. **Evaluate state**: Log/return state at the end to decide next steps
3. **Descriptive page names**: Use \`"checkout"\`, \`"login"\`, not \`"main"\`
4. **Disconnect to exit**: \`await client.disconnect()\` - pages persist on server
5. **Plain JS in evaluate**: \`page.evaluate()\` runs in browser - no TypeScript syntax
## Workflow Loop
1. **Write a script** to perform one action
2. **Run it** and observe the output
3. **Evaluate** - did it work? What's the current state?
4. **Decide** - is the task complete or do we need another script?
5. **Repeat** until task is done
### No TypeScript in Browser Context
Code passed to \`page.evaluate()\` runs in the browser, which doesn't understand TypeScript:
\`\`\`typescript
// Correct: plain JavaScript
const text = await page.evaluate(() => {
return document.body.innerText;
});
// Wrong: TypeScript syntax will fail at runtime
const text = await page.evaluate(() => {
const el: HTMLElement = document.body; // Type annotation breaks in browser!
return el.innerText;
});
\`\`\`
## Scraping Data
For scraping large datasets, intercept and replay network requests rather than scrolling the DOM. See [references/scraping.md](references/scraping.md) for the complete guide.
## Client API
\`\`\`typescript
const client = await connect();
// Get or create named page
const page = await client.page("name");
const pageWithSize = await client.page("name", { viewport: { width: 1920, height: 1080 } });
const pages = await client.list(); // List all page names
await client.close("name"); // Close a page
await client.disconnect(); // Disconnect (pages persist)
// ARIA Snapshot methods
const snapshot = await client.getAISnapshot("name"); // Get accessibility tree
const element = await client.selectSnapshotRef("name", "e5"); // Get element by ref
\`\`\`
## Waiting
\`\`\`typescript
import { waitForPageLoad } from "@/client.js";
await waitForPageLoad(page); // After navigation
await page.waitForSelector(".results"); // For specific elements
await page.waitForURL("**/success"); // For specific URL
\`\`\`
## Screenshots
\`\`\`typescript
await page.screenshot({ path: "tmp/screenshot.png" });
await page.screenshot({ path: "tmp/full.png", fullPage: true });
\`\`\`
## ARIA Snapshot (Element Discovery)
Use \`getAISnapshot()\` to discover page elements. Returns YAML-formatted accessibility tree:
\`\`\`yaml
- banner:
- link "Hacker News" [ref=e1]
- navigation:
- link "new" [ref=e2]
- main:
- list:
- listitem:
- link "Article Title" [ref=e8]
\`\`\`
**Interacting with refs:**
\`\`\`typescript
const snapshot = await client.getAISnapshot("hackernews");
console.log(snapshot); // Find the ref you need
const element = await client.selectSnapshotRef("hackernews", "e2");
await element.click();
\`\`\`
## Error Recovery
Page state persists after failures. Debug with:
\`\`\`bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect } from "@/client.js";
const client = await connect();
const page = await client.page("hackernews");
await page.screenshot({ path: "tmp/debug.png" });
console.log({
url: page.url(),
title: await page.title(),
bodyText: await page.textContent("body").then((t) => t?.slice(0, 200)),
});
await client.disconnect();
EOF
\`\`\``,
}
export interface CreateBuiltinSkillsOptions {
browserProvider?: BrowserAutomationProvider
}
export function createBuiltinSkills(options: CreateBuiltinSkillsOptions = {}): BuiltinSkill[] {
const { browserProvider = "playwright" } = options
const browserSkill = browserProvider === "agent-browser" ? agentBrowserSkill : playwrightSkill
return [browserSkill, frontendUiUxSkill, gitMasterSkill, devBrowserSkill]
}

View File

@@ -7,6 +7,10 @@ export interface ClaudeCodeMcpServer {
args?: string[]
env?: Record<string, string>
headers?: Record<string, string>
oauth?: {
clientId?: string
scopes?: string[]
}
disabled?: boolean
}

View File

@@ -1,4 +1,4 @@
import { describe, test, expect, beforeEach } from "bun:test"
import { describe, test, expect, beforeEach, afterEach } from "bun:test"
import {
setSessionAgent,
getSessionAgent,
@@ -13,9 +13,11 @@ describe("claude-code-session-state", () => {
beforeEach(() => {
// #given - clean state before each test
_resetForTesting()
clearSessionAgent("test-session-1")
clearSessionAgent("test-session-2")
clearSessionAgent("test-prometheus-session")
})
afterEach(() => {
// #then - cleanup after each test to prevent pollution
_resetForTesting()
})
describe("setSessionAgent", () => {
@@ -37,7 +39,7 @@ describe("claude-code-session-state", () => {
setSessionAgent(sessionID, "Prometheus (Planner)")
// #when - try to overwrite
setSessionAgent(sessionID, "Sisyphus")
setSessionAgent(sessionID, "sisyphus")
// #then - first agent preserved
expect(getSessionAgent(sessionID)).toBe("Prometheus (Planner)")
@@ -58,10 +60,10 @@ describe("claude-code-session-state", () => {
setSessionAgent(sessionID, "Prometheus (Planner)")
// #when - force update
updateSessionAgent(sessionID, "Sisyphus")
updateSessionAgent(sessionID, "sisyphus")
// #then
expect(getSessionAgent(sessionID)).toBe("Sisyphus")
expect(getSessionAgent(sessionID)).toBe("sisyphus")
})
})
@@ -92,9 +94,9 @@ describe("claude-code-session-state", () => {
expect(getMainSessionID()).toBe(mainID)
})
test.skip("should return undefined when not set", () => {
// #given - not set
// TODO: Fix flaky test - parallel test execution causes state pollution
test("should return undefined when not set", () => {
// #given - explicit reset to ensure clean state (parallel test isolation)
_resetForTesting()
// #then
expect(getMainSessionID()).toBeUndefined()
})
@@ -129,7 +131,7 @@ describe("claude-code-session-state", () => {
// #given - user switches to custom agent "MyCustomAgent"
const sessionID = "test-session-custom"
const customAgent = "MyCustomAgent"
const defaultAgent = "Sisyphus"
const defaultAgent = "sisyphus"
// User switches to custom agent (via UI)
setSessionAgent(sessionID, customAgent)

View File

@@ -14,6 +14,7 @@ export function getMainSessionID(): string | undefined {
export function _resetForTesting(): void {
_mainSessionID = undefined
subagentSessions.clear()
sessionAgentMap.clear()
}
const sessionAgentMap = new Map<string, string>()

View File

@@ -21,7 +21,7 @@ describe("createContextInjectorMessagesTransformHook", () => {
sessionID,
role,
time: { created: Date.now() },
agent: "Sisyphus",
agent: "sisyphus",
model: { providerID: "test", modelID: "test" },
path: { cwd: "/", root: "/" },
},

View File

@@ -5,7 +5,7 @@ import type { MessageMeta, OriginalMessageContext, TextPart, ToolPermission } fr
export interface StoredMessage {
agent?: string
model?: { providerID?: string; modelID?: string }
model?: { providerID?: string; modelID?: string; variant?: string }
tools?: Record<string, ToolPermission>
}
@@ -141,9 +141,17 @@ export function injectHookMessage(
const resolvedAgent = originalMessage.agent ?? fallback?.agent ?? "general"
const resolvedModel =
originalMessage.model?.providerID && originalMessage.model?.modelID
? { providerID: originalMessage.model.providerID, modelID: originalMessage.model.modelID }
? {
providerID: originalMessage.model.providerID,
modelID: originalMessage.model.modelID,
...(originalMessage.model.variant ? { variant: originalMessage.model.variant } : {})
}
: fallback?.model?.providerID && fallback?.model?.modelID
? { providerID: fallback.model.providerID, modelID: fallback.model.modelID }
? {
providerID: fallback.model.providerID,
modelID: fallback.model.modelID,
...(fallback.model.variant ? { variant: fallback.model.variant } : {})
}
: undefined
const resolvedTools = originalMessage.tools ?? fallback?.tools

View File

@@ -12,6 +12,7 @@ export interface MessageMeta {
model?: {
providerID: string
modelID: string
variant?: string
}
path?: {
cwd: string
@@ -25,6 +26,7 @@ export interface OriginalMessageContext {
model?: {
providerID?: string
modelID?: string
variant?: string
}
path?: {
cwd?: string

View File

@@ -0,0 +1,129 @@
import { afterEach, describe, expect, it } from "bun:test"
import { findAvailablePort, startCallbackServer, type CallbackServer } from "./callback-server"
describe("findAvailablePort", () => {
it("returns the start port when it is available", async () => {
//#given
const startPort = 19877
//#when
const port = await findAvailablePort(startPort)
//#then
expect(port).toBeGreaterThanOrEqual(startPort)
expect(port).toBeLessThan(startPort + 20)
})
it("skips busy ports and returns next available", async () => {
//#given
const blocker = Bun.serve({
port: 19877,
hostname: "127.0.0.1",
fetch: () => new Response(),
})
//#when
const port = await findAvailablePort(19877)
//#then
expect(port).toBeGreaterThan(19877)
blocker.stop(true)
})
})
describe("startCallbackServer", () => {
let server: CallbackServer | null = null
afterEach(() => {
server?.close()
server = null
})
it("starts server and returns port", async () => {
//#given - no preconditions
//#when
server = await startCallbackServer()
//#then
expect(server.port).toBeGreaterThanOrEqual(19877)
expect(typeof server.waitForCallback).toBe("function")
expect(typeof server.close).toBe("function")
})
it("resolves callback with code and state from query params", async () => {
//#given
server = await startCallbackServer()
const callbackUrl = `http://127.0.0.1:${server.port}/oauth/callback?code=test-code&state=test-state`
//#when
const fetchPromise = fetch(callbackUrl)
const result = await server.waitForCallback()
const response = await fetchPromise
//#then
expect(result).toEqual({ code: "test-code", state: "test-state" })
expect(response.status).toBe(200)
const html = await response.text()
expect(html).toContain("Authorization successful")
})
it("returns 404 for non-callback routes", async () => {
//#given
server = await startCallbackServer()
//#when
const response = await fetch(`http://127.0.0.1:${server.port}/other`)
//#then
expect(response.status).toBe(404)
})
it("returns 400 and rejects when code is missing", async () => {
//#given
server = await startCallbackServer()
const callbackRejection = server.waitForCallback().catch((e: Error) => e)
//#when
const response = await fetch(`http://127.0.0.1:${server.port}/oauth/callback?state=s`)
//#then
expect(response.status).toBe(400)
const error = await callbackRejection
expect(error).toBeInstanceOf(Error)
expect((error as Error).message).toContain("missing code or state")
})
it("returns 400 and rejects when state is missing", async () => {
//#given
server = await startCallbackServer()
const callbackRejection = server.waitForCallback().catch((e: Error) => e)
//#when
const response = await fetch(`http://127.0.0.1:${server.port}/oauth/callback?code=c`)
//#then
expect(response.status).toBe(400)
const error = await callbackRejection
expect(error).toBeInstanceOf(Error)
expect((error as Error).message).toContain("missing code or state")
})
it("close stops the server immediately", async () => {
//#given
server = await startCallbackServer()
const port = server.port
//#when
server.close()
server = null
//#then
try {
await fetch(`http://127.0.0.1:${port}/oauth/callback?code=c&state=s`)
expect(true).toBe(false)
} catch (error) {
expect(error).toBeDefined()
}
})
})

View File

@@ -0,0 +1,124 @@
const DEFAULT_PORT = 19877
const MAX_PORT_ATTEMPTS = 20
const TIMEOUT_MS = 5 * 60 * 1000
export type OAuthCallbackResult = {
code: string
state: string
}
export type CallbackServer = {
port: number
waitForCallback: () => Promise<OAuthCallbackResult>
close: () => void
}
const SUCCESS_HTML = `<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>OAuth Authorized</title>
<style>
body { font-family: -apple-system, BlinkMacSystemFont, sans-serif; display: flex; justify-content: center; align-items: center; height: 100vh; margin: 0; background: #0a0a0a; color: #fafafa; }
.container { text-align: center; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
p { color: #888; }
</style>
</head>
<body>
<div class="container">
<h1>Authorization successful</h1>
<p>You can close this window and return to your terminal.</p>
</div>
</body>
</html>`
async function isPortAvailable(port: number): Promise<boolean> {
try {
const server = Bun.serve({
port,
hostname: "127.0.0.1",
fetch: () => new Response(),
})
server.stop(true)
return true
} catch {
return false
}
}
export async function findAvailablePort(startPort: number = DEFAULT_PORT): Promise<number> {
for (let attempt = 0; attempt < MAX_PORT_ATTEMPTS; attempt++) {
const port = startPort + attempt
if (await isPortAvailable(port)) {
return port
}
}
throw new Error(`No available port found in range ${startPort}-${startPort + MAX_PORT_ATTEMPTS - 1}`)
}
export async function startCallbackServer(startPort: number = DEFAULT_PORT): Promise<CallbackServer> {
const port = await findAvailablePort(startPort)
let resolveCallback: ((result: OAuthCallbackResult) => void) | null = null
let rejectCallback: ((error: Error) => void) | null = null
const callbackPromise = new Promise<OAuthCallbackResult>((resolve, reject) => {
resolveCallback = resolve
rejectCallback = reject
})
const timeoutId = setTimeout(() => {
rejectCallback?.(new Error("OAuth callback timed out after 5 minutes"))
server.stop(true)
}, TIMEOUT_MS)
const server = Bun.serve({
port,
hostname: "127.0.0.1",
fetch(request: Request): Response {
const url = new URL(request.url)
if (url.pathname !== "/oauth/callback") {
return new Response("Not Found", { status: 404 })
}
const oauthError = url.searchParams.get("error")
if (oauthError) {
const description = url.searchParams.get("error_description") ?? oauthError
clearTimeout(timeoutId)
rejectCallback?.(new Error(`OAuth authorization failed: ${description}`))
setTimeout(() => server.stop(true), 100)
return new Response(`Authorization failed: ${description}`, { status: 400 })
}
const code = url.searchParams.get("code")
const state = url.searchParams.get("state")
if (!code || !state) {
clearTimeout(timeoutId)
rejectCallback?.(new Error("OAuth callback missing code or state parameter"))
setTimeout(() => server.stop(true), 100)
return new Response("Missing code or state parameter", { status: 400 })
}
resolveCallback?.({ code, state })
clearTimeout(timeoutId)
setTimeout(() => server.stop(true), 100)
return new Response(SUCCESS_HTML, {
headers: { "content-type": "text/html; charset=utf-8" },
})
},
})
return {
port,
waitForCallback: () => callbackPromise,
close: () => {
clearTimeout(timeoutId)
server.stop(true)
},
}
}

View File

@@ -0,0 +1,164 @@
import { describe, expect, it } from "bun:test"
import {
getOrRegisterClient,
type ClientCredentials,
type ClientRegistrationStorage,
type DcrFetch,
} from "./dcr"
function createStorage(initial: ClientCredentials | null):
& ClientRegistrationStorage
& { getLastKey: () => string | null; getLastSet: () => ClientCredentials | null } {
let stored = initial
let lastKey: string | null = null
let lastSet: ClientCredentials | null = null
return {
getClientRegistration: () => stored,
setClientRegistration: (serverIdentifier: string, credentials: ClientCredentials) => {
lastKey = serverIdentifier
lastSet = credentials
stored = credentials
},
getLastKey: () => lastKey,
getLastSet: () => lastSet,
}
}
describe("getOrRegisterClient", () => {
it("returns cached registration when available", async () => {
// #given
const storage = createStorage({
clientId: "cached-client",
clientSecret: "cached-secret",
})
const fetchMock: DcrFetch = async () => {
throw new Error("fetch should not be called")
}
// #when
const result = await getOrRegisterClient({
registrationEndpoint: "https://server.example.com/register",
serverIdentifier: "server-1",
clientName: "Test Client",
redirectUris: ["https://app.example.com/callback"],
tokenEndpointAuthMethod: "client_secret_post",
storage,
fetch: fetchMock,
})
// #then
expect(result).toEqual({
clientId: "cached-client",
clientSecret: "cached-secret",
})
})
it("registers client and stores credentials when endpoint available", async () => {
// #given
const storage = createStorage(null)
let fetchCalled = false
const fetchMock: DcrFetch = async (
input: string,
init?: { method?: string; headers?: Record<string, string>; body?: string }
) => {
fetchCalled = true
expect(input).toBe("https://server.example.com/register")
if (typeof init?.body !== "string") {
throw new Error("Expected request body string")
}
const payload = JSON.parse(init.body)
expect(payload).toEqual({
redirect_uris: ["https://app.example.com/callback"],
client_name: "Test Client",
grant_types: ["authorization_code", "refresh_token"],
response_types: ["code"],
token_endpoint_auth_method: "client_secret_post",
})
return {
ok: true,
json: async () => ({
client_id: "registered-client",
client_secret: "registered-secret",
}),
}
}
// #when
const result = await getOrRegisterClient({
registrationEndpoint: "https://server.example.com/register",
serverIdentifier: "server-2",
clientName: "Test Client",
redirectUris: ["https://app.example.com/callback"],
tokenEndpointAuthMethod: "client_secret_post",
storage,
fetch: fetchMock,
})
// #then
expect(fetchCalled).toBe(true)
expect(result).toEqual({
clientId: "registered-client",
clientSecret: "registered-secret",
})
expect(storage.getLastKey()).toBe("server-2")
expect(storage.getLastSet()).toEqual({
clientId: "registered-client",
clientSecret: "registered-secret",
})
})
it("uses config client id when registration endpoint missing", async () => {
// #given
const storage = createStorage(null)
let fetchCalled = false
const fetchMock: DcrFetch = async () => {
fetchCalled = true
return {
ok: false,
json: async () => ({}),
}
}
// #when
const result = await getOrRegisterClient({
registrationEndpoint: undefined,
serverIdentifier: "server-3",
clientName: "Test Client",
redirectUris: ["https://app.example.com/callback"],
tokenEndpointAuthMethod: "client_secret_post",
clientId: "config-client",
storage,
fetch: fetchMock,
})
// #then
expect(fetchCalled).toBe(false)
expect(result).toEqual({ clientId: "config-client" })
})
it("falls back to config client id when registration fails", async () => {
// #given
const storage = createStorage(null)
const fetchMock: DcrFetch = async () => {
throw new Error("network error")
}
// #when
const result = await getOrRegisterClient({
registrationEndpoint: "https://server.example.com/register",
serverIdentifier: "server-4",
clientName: "Test Client",
redirectUris: ["https://app.example.com/callback"],
tokenEndpointAuthMethod: "client_secret_post",
clientId: "fallback-client",
storage,
fetch: fetchMock,
})
// #then
expect(result).toEqual({ clientId: "fallback-client" })
expect(storage.getLastSet()).toBeNull()
})
})

View File

@@ -0,0 +1,98 @@
export type ClientRegistrationRequest = {
redirect_uris: string[]
client_name: string
grant_types: ["authorization_code", "refresh_token"]
response_types: ["code"]
token_endpoint_auth_method: "none" | "client_secret_post"
}
export type ClientCredentials = {
clientId: string
clientSecret?: string
}
export type ClientRegistrationStorage = {
getClientRegistration: (serverIdentifier: string) => ClientCredentials | null
setClientRegistration: (
serverIdentifier: string,
credentials: ClientCredentials
) => void
}
export type DynamicClientRegistrationOptions = {
registrationEndpoint?: string | null
serverIdentifier?: string
clientName: string
redirectUris: string[]
tokenEndpointAuthMethod: "none" | "client_secret_post"
clientId?: string | null
storage: ClientRegistrationStorage
fetch?: DcrFetch
}
export type DcrFetch = (
input: string,
init?: { method?: string; headers?: Record<string, string>; body?: string }
) => Promise<{ ok: boolean; json: () => Promise<unknown> }>
export async function getOrRegisterClient(
options: DynamicClientRegistrationOptions
): Promise<ClientCredentials | null> {
const serverIdentifier =
options.serverIdentifier ?? options.registrationEndpoint ?? "default"
const existing = options.storage.getClientRegistration(serverIdentifier)
if (existing) return existing
if (!options.registrationEndpoint) {
return options.clientId ? { clientId: options.clientId } : null
}
const fetchImpl = options.fetch ?? globalThis.fetch
const request: ClientRegistrationRequest = {
redirect_uris: options.redirectUris,
client_name: options.clientName,
grant_types: ["authorization_code", "refresh_token"],
response_types: ["code"],
token_endpoint_auth_method: options.tokenEndpointAuthMethod,
}
try {
const response = await fetchImpl(options.registrationEndpoint, {
method: "POST",
headers: { "content-type": "application/json" },
body: JSON.stringify(request),
})
if (!response.ok) {
return options.clientId ? { clientId: options.clientId } : null
}
const data: unknown = await response.json()
const parsed = parseRegistrationResponse(data)
if (!parsed) {
return options.clientId ? { clientId: options.clientId } : null
}
options.storage.setClientRegistration(serverIdentifier, parsed)
return parsed
} catch {
return options.clientId ? { clientId: options.clientId } : null
}
}
function parseRegistrationResponse(data: unknown): ClientCredentials | null {
if (!isRecord(data)) return null
const clientId = data.client_id
if (typeof clientId !== "string" || clientId.length === 0) return null
const clientSecret = data.client_secret
if (typeof clientSecret === "string" && clientSecret.length > 0) {
return { clientId, clientSecret }
}
return { clientId }
}
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null
}

View File

@@ -0,0 +1,175 @@
import { describe, test, expect, beforeEach, afterEach } from "bun:test"
import { discoverOAuthServerMetadata, resetDiscoveryCache } from "./discovery"
describe("discoverOAuthServerMetadata", () => {
const originalFetch = globalThis.fetch
beforeEach(() => {
resetDiscoveryCache()
})
afterEach(() => {
Object.defineProperty(globalThis, "fetch", { value: originalFetch, configurable: true })
})
test("returns endpoints from PRM + AS discovery", () => {
// #given
const resource = "https://mcp.example.com"
const prmUrl = new URL("/.well-known/oauth-protected-resource", resource).toString()
const authServer = "https://auth.example.com"
const asUrl = new URL("/.well-known/oauth-authorization-server", authServer).toString()
const calls: string[] = []
const fetchMock = async (input: string | URL) => {
const url = typeof input === "string" ? input : input.toString()
calls.push(url)
if (url === prmUrl) {
return new Response(JSON.stringify({ authorization_servers: [authServer] }), { status: 200 })
}
if (url === asUrl) {
return new Response(
JSON.stringify({
authorization_endpoint: "https://auth.example.com/authorize",
token_endpoint: "https://auth.example.com/token",
registration_endpoint: "https://auth.example.com/register",
}),
{ status: 200 }
)
}
return new Response("not found", { status: 404 })
}
Object.defineProperty(globalThis, "fetch", { value: fetchMock, configurable: true })
// #when
return discoverOAuthServerMetadata(resource).then((result) => {
// #then
expect(result).toEqual({
authorizationEndpoint: "https://auth.example.com/authorize",
tokenEndpoint: "https://auth.example.com/token",
registrationEndpoint: "https://auth.example.com/register",
resource,
})
expect(calls).toEqual([prmUrl, asUrl])
})
})
test("falls back to RFC 8414 when PRM returns 404", () => {
// #given
const resource = "https://mcp.example.com"
const prmUrl = new URL("/.well-known/oauth-protected-resource", resource).toString()
const asUrl = new URL("/.well-known/oauth-authorization-server", resource).toString()
const calls: string[] = []
const fetchMock = async (input: string | URL) => {
const url = typeof input === "string" ? input : input.toString()
calls.push(url)
if (url === prmUrl) {
return new Response("not found", { status: 404 })
}
if (url === asUrl) {
return new Response(
JSON.stringify({
authorization_endpoint: "https://mcp.example.com/authorize",
token_endpoint: "https://mcp.example.com/token",
}),
{ status: 200 }
)
}
return new Response("not found", { status: 404 })
}
Object.defineProperty(globalThis, "fetch", { value: fetchMock, configurable: true })
// #when
return discoverOAuthServerMetadata(resource).then((result) => {
// #then
expect(result).toEqual({
authorizationEndpoint: "https://mcp.example.com/authorize",
tokenEndpoint: "https://mcp.example.com/token",
registrationEndpoint: undefined,
resource,
})
expect(calls).toEqual([prmUrl, asUrl])
})
})
test("throws when both PRM and AS discovery return 404", () => {
// #given
const resource = "https://mcp.example.com"
const prmUrl = new URL("/.well-known/oauth-protected-resource", resource).toString()
const asUrl = new URL("/.well-known/oauth-authorization-server", resource).toString()
const fetchMock = async (input: string | URL) => {
const url = typeof input === "string" ? input : input.toString()
if (url === prmUrl || url === asUrl) {
return new Response("not found", { status: 404 })
}
return new Response("not found", { status: 404 })
}
Object.defineProperty(globalThis, "fetch", { value: fetchMock, configurable: true })
// #when
const result = discoverOAuthServerMetadata(resource)
// #then
return expect(result).rejects.toThrow("OAuth authorization server metadata not found")
})
test("throws when AS metadata is malformed", () => {
// #given
const resource = "https://mcp.example.com"
const prmUrl = new URL("/.well-known/oauth-protected-resource", resource).toString()
const authServer = "https://auth.example.com"
const asUrl = new URL("/.well-known/oauth-authorization-server", authServer).toString()
const fetchMock = async (input: string | URL) => {
const url = typeof input === "string" ? input : input.toString()
if (url === prmUrl) {
return new Response(JSON.stringify({ authorization_servers: [authServer] }), { status: 200 })
}
if (url === asUrl) {
return new Response(JSON.stringify({ authorization_endpoint: "https://auth.example.com/authorize" }), {
status: 200,
})
}
return new Response("not found", { status: 404 })
}
Object.defineProperty(globalThis, "fetch", { value: fetchMock, configurable: true })
// #when
const result = discoverOAuthServerMetadata(resource)
// #then
return expect(result).rejects.toThrow("token_endpoint")
})
test("caches discovery results per resource URL", () => {
// #given
const resource = "https://mcp.example.com"
const prmUrl = new URL("/.well-known/oauth-protected-resource", resource).toString()
const authServer = "https://auth.example.com"
const asUrl = new URL("/.well-known/oauth-authorization-server", authServer).toString()
const calls: string[] = []
const fetchMock = async (input: string | URL) => {
const url = typeof input === "string" ? input : input.toString()
calls.push(url)
if (url === prmUrl) {
return new Response(JSON.stringify({ authorization_servers: [authServer] }), { status: 200 })
}
if (url === asUrl) {
return new Response(
JSON.stringify({
authorization_endpoint: "https://auth.example.com/authorize",
token_endpoint: "https://auth.example.com/token",
}),
{ status: 200 }
)
}
return new Response("not found", { status: 404 })
}
Object.defineProperty(globalThis, "fetch", { value: fetchMock, configurable: true })
// #when
return discoverOAuthServerMetadata(resource)
.then(() => discoverOAuthServerMetadata(resource))
.then(() => {
// #then
expect(calls).toEqual([prmUrl, asUrl])
})
})
})

View File

@@ -0,0 +1,123 @@
export interface OAuthServerMetadata {
authorizationEndpoint: string
tokenEndpoint: string
registrationEndpoint?: string
resource: string
}
const discoveryCache = new Map<string, OAuthServerMetadata>()
const pendingDiscovery = new Map<string, Promise<OAuthServerMetadata>>()
function parseHttpsUrl(value: string, label: string): URL {
const parsed = new URL(value)
if (parsed.protocol !== "https:") {
throw new Error(`${label} must use https`)
}
return parsed
}
function readStringField(source: Record<string, unknown>, field: string): string {
const value = source[field]
if (typeof value !== "string" || value.length === 0) {
throw new Error(`OAuth metadata missing ${field}`)
}
return value
}
async function fetchMetadata(url: string): Promise<{ ok: true; json: Record<string, unknown> } | { ok: false; status: number }> {
const response = await fetch(url, { headers: { accept: "application/json" } })
if (!response.ok) {
return { ok: false, status: response.status }
}
const json = (await response.json().catch(() => null)) as Record<string, unknown> | null
if (!json || typeof json !== "object") {
throw new Error("OAuth metadata response is not valid JSON")
}
return { ok: true, json }
}
async function fetchAuthorizationServerMetadata(issuer: string, resource: string): Promise<OAuthServerMetadata> {
const issuerUrl = parseHttpsUrl(issuer, "Authorization server URL")
const issuerPath = issuerUrl.pathname.replace(/\/+$/, "")
const metadataUrl = new URL(`/.well-known/oauth-authorization-server${issuerPath}`, issuerUrl).toString()
const metadata = await fetchMetadata(metadataUrl)
if (!metadata.ok) {
if (metadata.status === 404) {
throw new Error("OAuth authorization server metadata not found")
}
throw new Error(`OAuth authorization server metadata fetch failed (${metadata.status})`)
}
const authorizationEndpoint = parseHttpsUrl(
readStringField(metadata.json, "authorization_endpoint"),
"authorization_endpoint"
).toString()
const tokenEndpoint = parseHttpsUrl(
readStringField(metadata.json, "token_endpoint"),
"token_endpoint"
).toString()
const registrationEndpointValue = metadata.json.registration_endpoint
const registrationEndpoint =
typeof registrationEndpointValue === "string" && registrationEndpointValue.length > 0
? parseHttpsUrl(registrationEndpointValue, "registration_endpoint").toString()
: undefined
return {
authorizationEndpoint,
tokenEndpoint,
registrationEndpoint,
resource,
}
}
function parseAuthorizationServers(metadata: Record<string, unknown>): string[] {
const servers = metadata.authorization_servers
if (!Array.isArray(servers)) return []
return servers.filter((server): server is string => typeof server === "string" && server.length > 0)
}
export async function discoverOAuthServerMetadata(resource: string): Promise<OAuthServerMetadata> {
const resourceUrl = parseHttpsUrl(resource, "Resource server URL")
const resourceKey = resourceUrl.toString()
const cached = discoveryCache.get(resourceKey)
if (cached) return cached
const pending = pendingDiscovery.get(resourceKey)
if (pending) return pending
const discoveryPromise = (async () => {
const prmUrl = new URL("/.well-known/oauth-protected-resource", resourceUrl).toString()
const prmResponse = await fetchMetadata(prmUrl)
if (prmResponse.ok) {
const authServers = parseAuthorizationServers(prmResponse.json)
if (authServers.length === 0) {
throw new Error("OAuth protected resource metadata missing authorization_servers")
}
return fetchAuthorizationServerMetadata(authServers[0], resource)
}
if (prmResponse.status !== 404) {
throw new Error(`OAuth protected resource metadata fetch failed (${prmResponse.status})`)
}
return fetchAuthorizationServerMetadata(resourceKey, resource)
})()
pendingDiscovery.set(resourceKey, discoveryPromise)
try {
const result = await discoveryPromise
discoveryCache.set(resourceKey, result)
return result
} finally {
pendingDiscovery.delete(resourceKey)
}
}
export function resetDiscoveryCache(): void {
discoveryCache.clear()
pendingDiscovery.clear()
}

View File

@@ -0,0 +1 @@
export * from "./schema"

View File

@@ -0,0 +1,223 @@
import { describe, expect, it, beforeEach, afterEach, mock } from "bun:test"
import { createHash, randomBytes } from "node:crypto"
import { McpOAuthProvider, generateCodeVerifier, generateCodeChallenge, buildAuthorizationUrl } from "./provider"
import type { OAuthTokenData } from "./storage"
describe("McpOAuthProvider", () => {
describe("generateCodeVerifier", () => {
it("returns a base64url-encoded 32-byte random string", () => {
//#given
const verifier = generateCodeVerifier()
//#when
const decoded = Buffer.from(verifier, "base64url")
//#then
expect(decoded.length).toBe(32)
expect(verifier).toMatch(/^[A-Za-z0-9_-]+$/)
})
it("produces unique values on each call", () => {
//#given
const first = generateCodeVerifier()
//#when
const second = generateCodeVerifier()
//#then
expect(first).not.toBe(second)
})
})
describe("generateCodeChallenge", () => {
it("returns SHA256 base64url digest of the verifier", () => {
//#given
const verifier = "test-verifier-value"
const expected = createHash("sha256").update(verifier).digest("base64url")
//#when
const challenge = generateCodeChallenge(verifier)
//#then
expect(challenge).toBe(expected)
})
})
describe("buildAuthorizationUrl", () => {
it("builds URL with all required PKCE parameters", () => {
//#given
const endpoint = "https://auth.example.com/authorize"
//#when
const url = buildAuthorizationUrl(endpoint, {
clientId: "my-client",
redirectUri: "http://127.0.0.1:8912/callback",
codeChallenge: "challenge-value",
state: "state-value",
scopes: ["openid", "profile"],
resource: "https://mcp.example.com",
})
//#then
const parsed = new URL(url)
expect(parsed.origin + parsed.pathname).toBe("https://auth.example.com/authorize")
expect(parsed.searchParams.get("response_type")).toBe("code")
expect(parsed.searchParams.get("client_id")).toBe("my-client")
expect(parsed.searchParams.get("redirect_uri")).toBe("http://127.0.0.1:8912/callback")
expect(parsed.searchParams.get("code_challenge")).toBe("challenge-value")
expect(parsed.searchParams.get("code_challenge_method")).toBe("S256")
expect(parsed.searchParams.get("state")).toBe("state-value")
expect(parsed.searchParams.get("scope")).toBe("openid profile")
expect(parsed.searchParams.get("resource")).toBe("https://mcp.example.com")
})
it("omits scope when empty", () => {
//#given
const endpoint = "https://auth.example.com/authorize"
//#when
const url = buildAuthorizationUrl(endpoint, {
clientId: "my-client",
redirectUri: "http://127.0.0.1:8912/callback",
codeChallenge: "challenge-value",
state: "state-value",
scopes: [],
})
//#then
const parsed = new URL(url)
expect(parsed.searchParams.has("scope")).toBe(false)
})
it("omits resource when undefined", () => {
//#given
const endpoint = "https://auth.example.com/authorize"
//#when
const url = buildAuthorizationUrl(endpoint, {
clientId: "my-client",
redirectUri: "http://127.0.0.1:8912/callback",
codeChallenge: "challenge-value",
state: "state-value",
})
//#then
const parsed = new URL(url)
expect(parsed.searchParams.has("resource")).toBe(false)
})
})
describe("constructor and basic methods", () => {
it("stores serverUrl and optional clientId and scopes", () => {
//#given
const options = {
serverUrl: "https://mcp.example.com",
clientId: "my-client",
scopes: ["openid"],
}
//#when
const provider = new McpOAuthProvider(options)
//#then
expect(provider.tokens()).toBeNull()
expect(provider.clientInformation()).toBeNull()
expect(provider.codeVerifier()).toBeNull()
})
it("defaults scopes to empty array", () => {
//#given
const options = { serverUrl: "https://mcp.example.com" }
//#when
const provider = new McpOAuthProvider(options)
//#then
expect(provider.redirectUrl()).toBe("http://127.0.0.1:19877/callback")
})
})
describe("saveCodeVerifier / codeVerifier", () => {
it("stores and retrieves code verifier", () => {
//#given
const provider = new McpOAuthProvider({ serverUrl: "https://mcp.example.com" })
//#when
provider.saveCodeVerifier("my-verifier")
//#then
expect(provider.codeVerifier()).toBe("my-verifier")
})
})
describe("saveTokens / tokens", () => {
let originalEnv: string | undefined
beforeEach(() => {
originalEnv = process.env.OPENCODE_CONFIG_DIR
const { mkdirSync } = require("node:fs")
const { tmpdir } = require("node:os")
const { join } = require("node:path")
const testDir = join(tmpdir(), "mcp-oauth-provider-test-" + Date.now())
mkdirSync(testDir, { recursive: true })
process.env.OPENCODE_CONFIG_DIR = testDir
})
afterEach(() => {
if (originalEnv === undefined) {
delete process.env.OPENCODE_CONFIG_DIR
} else {
process.env.OPENCODE_CONFIG_DIR = originalEnv
}
})
it("persists and loads token data via storage", () => {
//#given
const provider = new McpOAuthProvider({ serverUrl: "https://mcp.example.com" })
const tokenData: OAuthTokenData = {
accessToken: "access-token-123",
refreshToken: "refresh-token-456",
expiresAt: 1710000000,
}
//#when
const saved = provider.saveTokens(tokenData)
const loaded = provider.tokens()
//#then
expect(saved).toBe(true)
expect(loaded).toEqual(tokenData)
})
})
describe("redirectToAuthorization", () => {
it("throws when no client information is set", async () => {
//#given
const provider = new McpOAuthProvider({ serverUrl: "https://mcp.example.com" })
const metadata = {
authorizationEndpoint: "https://auth.example.com/authorize",
tokenEndpoint: "https://auth.example.com/token",
resource: "https://mcp.example.com",
}
//#when
const result = provider.redirectToAuthorization(metadata)
//#then
await expect(result).rejects.toThrow("No client information available")
})
})
describe("redirectUrl", () => {
it("returns localhost callback URL with default port", () => {
//#given
const provider = new McpOAuthProvider({ serverUrl: "https://mcp.example.com" })
//#when
const url = provider.redirectUrl()
//#then
expect(url).toBe("http://127.0.0.1:19877/callback")
})
})
})

View File

@@ -0,0 +1,295 @@
import { createHash, randomBytes } from "node:crypto"
import { createServer } from "node:http"
import { spawn } from "node:child_process"
import type { OAuthTokenData } from "./storage"
import { loadToken, saveToken } from "./storage"
import { discoverOAuthServerMetadata } from "./discovery"
import type { OAuthServerMetadata } from "./discovery"
import { getOrRegisterClient } from "./dcr"
import type { ClientCredentials, ClientRegistrationStorage } from "./dcr"
import { findAvailablePort } from "./callback-server"
export type McpOAuthProviderOptions = {
serverUrl: string
clientId?: string
scopes?: string[]
}
type CallbackResult = {
code: string
state: string
}
function generateCodeVerifier(): string {
return randomBytes(32).toString("base64url")
}
function generateCodeChallenge(verifier: string): string {
return createHash("sha256").update(verifier).digest("base64url")
}
function buildAuthorizationUrl(
authorizationEndpoint: string,
options: {
clientId: string
redirectUri: string
codeChallenge: string
state: string
scopes?: string[]
resource?: string
}
): string {
const url = new URL(authorizationEndpoint)
url.searchParams.set("response_type", "code")
url.searchParams.set("client_id", options.clientId)
url.searchParams.set("redirect_uri", options.redirectUri)
url.searchParams.set("code_challenge", options.codeChallenge)
url.searchParams.set("code_challenge_method", "S256")
url.searchParams.set("state", options.state)
if (options.scopes && options.scopes.length > 0) {
url.searchParams.set("scope", options.scopes.join(" "))
}
if (options.resource) {
url.searchParams.set("resource", options.resource)
}
return url.toString()
}
const CALLBACK_TIMEOUT_MS = 5 * 60 * 1000
function startCallbackServer(port: number): Promise<CallbackResult> {
return new Promise((resolve, reject) => {
let timeoutId: ReturnType<typeof setTimeout>
const server = createServer((request, response) => {
clearTimeout(timeoutId)
const requestUrl = new URL(request.url ?? "/", `http://localhost:${port}`)
const code = requestUrl.searchParams.get("code")
const state = requestUrl.searchParams.get("state")
const error = requestUrl.searchParams.get("error")
if (error) {
const errorDescription = requestUrl.searchParams.get("error_description") ?? error
response.writeHead(400, { "content-type": "text/html" })
response.end("<html><body><h1>Authorization failed</h1></body></html>")
server.close()
reject(new Error(`OAuth authorization error: ${errorDescription}`))
return
}
if (!code || !state) {
response.writeHead(400, { "content-type": "text/html" })
response.end("<html><body><h1>Missing code or state</h1></body></html>")
server.close()
reject(new Error("OAuth callback missing code or state parameter"))
return
}
response.writeHead(200, { "content-type": "text/html" })
response.end("<html><body><h1>Authorization successful. You can close this tab.</h1></body></html>")
server.close()
resolve({ code, state })
})
timeoutId = setTimeout(() => {
server.close()
reject(new Error("OAuth callback timed out after 5 minutes"))
}, CALLBACK_TIMEOUT_MS)
server.listen(port, "127.0.0.1")
server.on("error", (err) => {
clearTimeout(timeoutId)
reject(err)
})
})
}
function openBrowser(url: string): void {
const platform = process.platform
let cmd: string
let args: string[]
if (platform === "darwin") {
cmd = "open"
args = [url]
} else if (platform === "win32") {
cmd = "explorer"
args = [url]
} else {
cmd = "xdg-open"
args = [url]
}
try {
const child = spawn(cmd, args, { stdio: "ignore", detached: true })
child.on("error", () => {})
child.unref()
} catch {
// Browser open failed — user must navigate manually
}
}
export class McpOAuthProvider {
private readonly serverUrl: string
private readonly configClientId: string | undefined
private readonly scopes: string[]
private storedCodeVerifier: string | null = null
private storedClientInfo: ClientCredentials | null = null
private callbackPort: number | null = null
constructor(options: McpOAuthProviderOptions) {
this.serverUrl = options.serverUrl
this.configClientId = options.clientId
this.scopes = options.scopes ?? []
}
tokens(): OAuthTokenData | null {
return loadToken(this.serverUrl, this.serverUrl)
}
saveTokens(tokenData: OAuthTokenData): boolean {
return saveToken(this.serverUrl, this.serverUrl, tokenData)
}
clientInformation(): ClientCredentials | null {
if (this.storedClientInfo) return this.storedClientInfo
const tokenData = this.tokens()
if (tokenData?.clientInfo) {
this.storedClientInfo = tokenData.clientInfo
return this.storedClientInfo
}
return null
}
redirectUrl(): string {
return `http://127.0.0.1:${this.callbackPort ?? 19877}/callback`
}
saveCodeVerifier(verifier: string): void {
this.storedCodeVerifier = verifier
}
codeVerifier(): string | null {
return this.storedCodeVerifier
}
async redirectToAuthorization(metadata: OAuthServerMetadata): Promise<CallbackResult> {
const verifier = generateCodeVerifier()
this.saveCodeVerifier(verifier)
const challenge = generateCodeChallenge(verifier)
const state = randomBytes(16).toString("hex")
const clientInfo = this.clientInformation()
if (!clientInfo) {
throw new Error("No client information available. Run login() or register a client first.")
}
if (this.callbackPort === null) {
this.callbackPort = await findAvailablePort()
}
const authUrl = buildAuthorizationUrl(metadata.authorizationEndpoint, {
clientId: clientInfo.clientId,
redirectUri: this.redirectUrl(),
codeChallenge: challenge,
state,
scopes: this.scopes,
resource: metadata.resource,
})
const callbackPromise = startCallbackServer(this.callbackPort)
openBrowser(authUrl)
const result = await callbackPromise
if (result.state !== state) {
throw new Error("OAuth state mismatch")
}
return result
}
async login(): Promise<OAuthTokenData> {
const metadata = await discoverOAuthServerMetadata(this.serverUrl)
const clientRegistrationStorage: ClientRegistrationStorage = {
getClientRegistration: () => this.storedClientInfo,
setClientRegistration: (_serverIdentifier: string, credentials: ClientCredentials) => {
this.storedClientInfo = credentials
},
}
const clientInfo = await getOrRegisterClient({
registrationEndpoint: metadata.registrationEndpoint,
serverIdentifier: this.serverUrl,
clientName: "oh-my-opencode",
redirectUris: [this.redirectUrl()],
tokenEndpointAuthMethod: "none",
clientId: this.configClientId,
storage: clientRegistrationStorage,
})
if (!clientInfo) {
throw new Error("Failed to obtain client credentials. Provide a clientId or ensure the server supports DCR.")
}
this.storedClientInfo = clientInfo
const { code } = await this.redirectToAuthorization(metadata)
const verifier = this.codeVerifier()
if (!verifier) {
throw new Error("Code verifier not found")
}
const tokenResponse = await fetch(metadata.tokenEndpoint, {
method: "POST",
headers: { "content-type": "application/x-www-form-urlencoded" },
body: new URLSearchParams({
grant_type: "authorization_code",
code,
redirect_uri: this.redirectUrl(),
client_id: clientInfo.clientId,
code_verifier: verifier,
...(metadata.resource ? { resource: metadata.resource } : {}),
}).toString(),
})
if (!tokenResponse.ok) {
let errorDetail = `${tokenResponse.status}`
try {
const body = (await tokenResponse.json()) as Record<string, unknown>
if (body.error) {
errorDetail = `${tokenResponse.status} ${body.error}`
if (body.error_description) {
errorDetail += `: ${body.error_description}`
}
}
} catch {
// Response body not JSON
}
throw new Error(`Token exchange failed: ${errorDetail}`)
}
const tokenData = (await tokenResponse.json()) as Record<string, unknown>
const accessToken = tokenData.access_token
if (typeof accessToken !== "string") {
throw new Error("Token response missing access_token")
}
const oauthTokenData: OAuthTokenData = {
accessToken,
refreshToken: typeof tokenData.refresh_token === "string" ? tokenData.refresh_token : undefined,
expiresAt:
typeof tokenData.expires_in === "number" ? Math.floor(Date.now() / 1000) + tokenData.expires_in : undefined,
clientInfo: {
clientId: clientInfo.clientId,
clientSecret: clientInfo.clientSecret,
},
}
this.saveTokens(oauthTokenData)
return oauthTokenData
}
}
export { generateCodeVerifier, generateCodeChallenge, buildAuthorizationUrl, startCallbackServer }

View File

@@ -0,0 +1,121 @@
import { describe, expect, it } from "bun:test"
import { addResourceToParams, getResourceIndicator } from "./resource-indicator"
describe("getResourceIndicator", () => {
it("returns URL unchanged when already normalized", () => {
// #given
const url = "https://mcp.example.com"
// #when
const result = getResourceIndicator(url)
// #then
expect(result).toBe("https://mcp.example.com")
})
it("strips trailing slash", () => {
// #given
const url = "https://mcp.example.com/"
// #when
const result = getResourceIndicator(url)
// #then
expect(result).toBe("https://mcp.example.com")
})
it("strips query parameters", () => {
// #given
const url = "https://mcp.example.com/v1?token=abc&debug=true"
// #when
const result = getResourceIndicator(url)
// #then
expect(result).toBe("https://mcp.example.com/v1")
})
it("strips fragment", () => {
// #given
const url = "https://mcp.example.com/v1#section"
// #when
const result = getResourceIndicator(url)
// #then
expect(result).toBe("https://mcp.example.com/v1")
})
it("strips query and trailing slash together", () => {
// #given
const url = "https://mcp.example.com/api/?key=val"
// #when
const result = getResourceIndicator(url)
// #then
expect(result).toBe("https://mcp.example.com/api")
})
it("preserves path segments", () => {
// #given
const url = "https://mcp.example.com/org/project/v2"
// #when
const result = getResourceIndicator(url)
// #then
expect(result).toBe("https://mcp.example.com/org/project/v2")
})
it("preserves port number", () => {
// #given
const url = "https://mcp.example.com:8443/api/"
// #when
const result = getResourceIndicator(url)
// #then
expect(result).toBe("https://mcp.example.com:8443/api")
})
})
describe("addResourceToParams", () => {
it("sets resource parameter on empty params", () => {
// #given
const params = new URLSearchParams()
const resource = "https://mcp.example.com"
// #when
addResourceToParams(params, resource)
// #then
expect(params.get("resource")).toBe("https://mcp.example.com")
})
it("adds resource alongside existing parameters", () => {
// #given
const params = new URLSearchParams({ grant_type: "authorization_code" })
const resource = "https://mcp.example.com/v1"
// #when
addResourceToParams(params, resource)
// #then
expect(params.get("grant_type")).toBe("authorization_code")
expect(params.get("resource")).toBe("https://mcp.example.com/v1")
})
it("overwrites existing resource parameter", () => {
// #given
const params = new URLSearchParams({ resource: "https://old.example.com" })
const resource = "https://new.example.com"
// #when
addResourceToParams(params, resource)
// #then
expect(params.get("resource")).toBe("https://new.example.com")
expect(params.getAll("resource")).toHaveLength(1)
})
})

View File

@@ -0,0 +1,16 @@
export function getResourceIndicator(url: string): string {
const parsed = new URL(url)
parsed.search = ""
parsed.hash = ""
let normalized = parsed.toString()
if (normalized.endsWith("/")) {
normalized = normalized.slice(0, -1)
}
return normalized
}
export function addResourceToParams(params: URLSearchParams, resource: string): void {
params.set("resource", resource)
}

View File

@@ -0,0 +1,60 @@
/// <reference types="bun-types" />
import { describe, expect, test } from "bun:test"
import { McpOauthSchema } from "./schema"
describe("McpOauthSchema", () => {
test("parses empty oauth config", () => {
//#given
const input = {}
//#when
const result = McpOauthSchema.parse(input)
//#then
expect(result).toEqual({})
})
test("parses oauth config with clientId", () => {
//#given
const input = { clientId: "client-123" }
//#when
const result = McpOauthSchema.parse(input)
//#then
expect(result).toEqual({ clientId: "client-123" })
})
test("parses oauth config with scopes", () => {
//#given
const input = { scopes: ["openid", "profile"] }
//#when
const result = McpOauthSchema.parse(input)
//#then
expect(result).toEqual({ scopes: ["openid", "profile"] })
})
test("rejects non-string clientId", () => {
//#given
const input = { clientId: 123 }
//#when
const result = McpOauthSchema.safeParse(input)
//#then
expect(result.success).toBe(false)
})
test("rejects non-string scopes", () => {
//#given
const input = { scopes: ["openid", 42] }
//#when
const result = McpOauthSchema.safeParse(input)
//#then
expect(result.success).toBe(false)
})
})

View File

@@ -0,0 +1,8 @@
import { z } from "zod"
export const McpOauthSchema = z.object({
clientId: z.string().optional(),
scopes: z.array(z.string()).optional(),
})
export type McpOauth = z.infer<typeof McpOauthSchema>

View File

@@ -0,0 +1,223 @@
import { describe, expect, it } from "bun:test"
import { isStepUpRequired, mergeScopes, parseWwwAuthenticate } from "./step-up"
describe("parseWwwAuthenticate", () => {
it("parses scope from simple Bearer header", () => {
// #given
const header = 'Bearer scope="read write"'
// #when
const result = parseWwwAuthenticate(header)
// #then
expect(result).toEqual({ requiredScopes: ["read", "write"] })
})
it("parses scope with error fields", () => {
// #given
const header = 'Bearer error="insufficient_scope", scope="admin"'
// #when
const result = parseWwwAuthenticate(header)
// #then
expect(result).toEqual({
requiredScopes: ["admin"],
error: "insufficient_scope",
})
})
it("parses all fields including error_description", () => {
// #given
const header =
'Bearer realm="example", error="insufficient_scope", error_description="Need admin access", scope="admin write"'
// #when
const result = parseWwwAuthenticate(header)
// #then
expect(result).toEqual({
requiredScopes: ["admin", "write"],
error: "insufficient_scope",
errorDescription: "Need admin access",
})
})
it("returns null for non-Bearer scheme", () => {
// #given
const header = 'Basic realm="example"'
// #when
const result = parseWwwAuthenticate(header)
// #then
expect(result).toBeNull()
})
it("returns null when no scope parameter present", () => {
// #given
const header = 'Bearer error="invalid_token"'
// #when
const result = parseWwwAuthenticate(header)
// #then
expect(result).toBeNull()
})
it("returns null for empty scope value", () => {
// #given
const header = 'Bearer scope=""'
// #when
const result = parseWwwAuthenticate(header)
// #then
expect(result).toBeNull()
})
it("returns null for bare Bearer with no params", () => {
// #given
const header = "Bearer"
// #when
const result = parseWwwAuthenticate(header)
// #then
expect(result).toBeNull()
})
it("handles case-insensitive Bearer prefix", () => {
// #given
const header = 'bearer scope="read"'
// #when
const result = parseWwwAuthenticate(header)
// #then
expect(result).toEqual({ requiredScopes: ["read"] })
})
it("parses single scope value", () => {
// #given
const header = 'Bearer scope="admin"'
// #when
const result = parseWwwAuthenticate(header)
// #then
expect(result).toEqual({ requiredScopes: ["admin"] })
})
})
describe("mergeScopes", () => {
it("merges new scopes into existing", () => {
// #given
const existing = ["read", "write"]
const required = ["admin", "write"]
// #when
const result = mergeScopes(existing, required)
// #then
expect(result).toEqual(["read", "write", "admin"])
})
it("returns required when existing is empty", () => {
// #given
const existing: string[] = []
const required = ["read", "write"]
// #when
const result = mergeScopes(existing, required)
// #then
expect(result).toEqual(["read", "write"])
})
it("returns existing when required is empty", () => {
// #given
const existing = ["read"]
const required: string[] = []
// #when
const result = mergeScopes(existing, required)
// #then
expect(result).toEqual(["read"])
})
it("deduplicates identical scopes", () => {
// #given
const existing = ["read", "write"]
const required = ["read", "write"]
// #when
const result = mergeScopes(existing, required)
// #then
expect(result).toEqual(["read", "write"])
})
})
describe("isStepUpRequired", () => {
it("returns step-up info for 403 with WWW-Authenticate", () => {
// #given
const statusCode = 403
const headers = { "www-authenticate": 'Bearer scope="admin"' }
// #when
const result = isStepUpRequired(statusCode, headers)
// #then
expect(result).toEqual({ requiredScopes: ["admin"] })
})
it("returns null for non-403 status", () => {
// #given
const statusCode = 401
const headers = { "www-authenticate": 'Bearer scope="admin"' }
// #when
const result = isStepUpRequired(statusCode, headers)
// #then
expect(result).toBeNull()
})
it("returns null when no WWW-Authenticate header", () => {
// #given
const statusCode = 403
const headers = { "content-type": "application/json" }
// #when
const result = isStepUpRequired(statusCode, headers)
// #then
expect(result).toBeNull()
})
it("handles capitalized WWW-Authenticate header", () => {
// #given
const statusCode = 403
const headers = { "WWW-Authenticate": 'Bearer scope="read write"' }
// #when
const result = isStepUpRequired(statusCode, headers)
// #then
expect(result).toEqual({ requiredScopes: ["read", "write"] })
})
it("returns null for 403 with unparseable WWW-Authenticate", () => {
// #given
const statusCode = 403
const headers = { "www-authenticate": 'Basic realm="example"' }
// #when
const result = isStepUpRequired(statusCode, headers)
// #then
expect(result).toBeNull()
})
})

View File

@@ -0,0 +1,79 @@
export interface StepUpInfo {
requiredScopes: string[]
error?: string
errorDescription?: string
}
export function parseWwwAuthenticate(header: string): StepUpInfo | null {
const trimmed = header.trim()
const lowerHeader = trimmed.toLowerCase()
const bearerIndex = lowerHeader.indexOf("bearer")
if (bearerIndex === -1) {
return null
}
const params = trimmed.slice(bearerIndex + "bearer".length).trim()
if (params.length === 0) {
return null
}
const scope = extractParam(params, "scope")
if (scope === null) {
return null
}
const requiredScopes = scope
.split(/\s+/)
.filter((s) => s.length > 0)
if (requiredScopes.length === 0) {
return null
}
const info: StepUpInfo = { requiredScopes }
const error = extractParam(params, "error")
if (error !== null) {
info.error = error
}
const errorDescription = extractParam(params, "error_description")
if (errorDescription !== null) {
info.errorDescription = errorDescription
}
return info
}
function extractParam(params: string, name: string): string | null {
const quotedPattern = new RegExp(`${name}="([^"]*)"`)
const quotedMatch = quotedPattern.exec(params)
if (quotedMatch) {
return quotedMatch[1]
}
const unquotedPattern = new RegExp(`${name}=([^\\s,]+)`)
const unquotedMatch = unquotedPattern.exec(params)
return unquotedMatch?.[1] ?? null
}
export function mergeScopes(existing: string[], required: string[]): string[] {
const set = new Set(existing)
for (const scope of required) {
set.add(scope)
}
return [...set]
}
export function isStepUpRequired(statusCode: number, headers: Record<string, string>): StepUpInfo | null {
if (statusCode !== 403) {
return null
}
const wwwAuth = headers["www-authenticate"] ?? headers["WWW-Authenticate"]
if (!wwwAuth) {
return null
}
return parseWwwAuthenticate(wwwAuth)
}

View File

@@ -0,0 +1,136 @@
import { describe, expect, test, beforeEach, afterEach } from "bun:test"
import { existsSync, mkdirSync, rmSync, readFileSync, statSync, writeFileSync } from "node:fs"
import { join } from "node:path"
import { tmpdir } from "node:os"
import {
deleteToken,
getMcpOauthStoragePath,
listAllTokens,
listTokensByHost,
loadToken,
saveToken,
} from "./storage"
import type { OAuthTokenData } from "./storage"
describe("mcp-oauth storage", () => {
const TEST_CONFIG_DIR = join(tmpdir(), "mcp-oauth-test-" + Date.now())
let originalConfigDir: string | undefined
beforeEach(() => {
originalConfigDir = process.env.OPENCODE_CONFIG_DIR
process.env.OPENCODE_CONFIG_DIR = TEST_CONFIG_DIR
if (!existsSync(TEST_CONFIG_DIR)) {
mkdirSync(TEST_CONFIG_DIR, { recursive: true })
}
})
afterEach(() => {
if (originalConfigDir === undefined) {
delete process.env.OPENCODE_CONFIG_DIR
} else {
process.env.OPENCODE_CONFIG_DIR = originalConfigDir
}
if (existsSync(TEST_CONFIG_DIR)) {
rmSync(TEST_CONFIG_DIR, { recursive: true, force: true })
}
})
test("should save tokens with {host}/{resource} key and set 0600 permissions", () => {
// #given
const token: OAuthTokenData = {
accessToken: "access-1",
refreshToken: "refresh-1",
expiresAt: 1710000000,
clientInfo: { clientId: "client-1", clientSecret: "secret-1" },
}
// #when
const success = saveToken("https://example.com:443", "mcp/v1", token)
const storagePath = getMcpOauthStoragePath()
const parsed = JSON.parse(readFileSync(storagePath, "utf-8")) as Record<string, OAuthTokenData>
const mode = statSync(storagePath).mode & 0o777
// #then
expect(success).toBe(true)
expect(Object.keys(parsed)).toEqual(["example.com/mcp/v1"])
expect(parsed["example.com/mcp/v1"].accessToken).toBe("access-1")
expect(mode).toBe(0o600)
})
test("should load a saved token", () => {
// #given
const token: OAuthTokenData = { accessToken: "access-2", refreshToken: "refresh-2" }
saveToken("api.example.com", "resource-a", token)
// #when
const loaded = loadToken("api.example.com:8443", "resource-a")
// #then
expect(loaded).toEqual(token)
})
test("should delete a token", () => {
// #given
const token: OAuthTokenData = { accessToken: "access-3" }
saveToken("api.example.com", "resource-b", token)
// #when
const success = deleteToken("api.example.com", "resource-b")
const loaded = loadToken("api.example.com", "resource-b")
// #then
expect(success).toBe(true)
expect(loaded).toBeNull()
})
test("should list tokens by host", () => {
// #given
saveToken("api.example.com", "resource-a", { accessToken: "access-a" })
saveToken("api.example.com", "resource-b", { accessToken: "access-b" })
saveToken("other.example.com", "resource-c", { accessToken: "access-c" })
// #when
const entries = listTokensByHost("api.example.com:5555")
// #then
expect(Object.keys(entries).sort()).toEqual([
"api.example.com/resource-a",
"api.example.com/resource-b",
])
expect(entries["api.example.com/resource-a"].accessToken).toBe("access-a")
})
test("should handle missing storage file", () => {
// #given
const storagePath = getMcpOauthStoragePath()
if (existsSync(storagePath)) {
rmSync(storagePath, { force: true })
}
// #when
const loaded = loadToken("api.example.com", "resource-a")
const entries = listTokensByHost("api.example.com")
// #then
expect(loaded).toBeNull()
expect(entries).toEqual({})
})
test("should handle invalid JSON", () => {
// #given
const storagePath = getMcpOauthStoragePath()
const dir = join(storagePath, "..")
if (!existsSync(dir)) {
mkdirSync(dir, { recursive: true })
}
writeFileSync(storagePath, "{not-valid-json", "utf-8")
// #when
const loaded = loadToken("api.example.com", "resource-a")
const entries = listTokensByHost("api.example.com")
// #then
expect(loaded).toBeNull()
expect(entries).toEqual({})
})
})

Some files were not shown because too many files have changed in this diff Show More