The agent parameter was using the raw config key "atlas" but the SDK
expects the display name "Atlas (Plan Executor)". This caused
"Agent not found: 'atlas'" errors when auto-compact tried to
continue boulder execution.
Root cause: injectBoulderContinuation passed the raw agent key to
session.promptAsync, but the SDK's agent matching logic compares
against display names registered in the system.
Fix: Use getAgentDisplayName() to convert the config key to
the expected display name before passing to the SDK.
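A minimal sketch of the key-to-display-name conversion described above. The lookup table and the buildContinuationPrompt helper are hypothetical; only the "atlas" mapping and the getAgentDisplayName name come from this change, and the real function may behave differently for unknown keys.

```typescript
// Hypothetical lookup table. Only the "atlas" entry is taken from the text above.
const AGENT_DISPLAY_NAMES: Record<string, string> = {
  atlas: "Atlas (Plan Executor)",
};

// Convert a config key to the display name the SDK matches against.
// Assumption: unknown keys pass through unchanged.
function getAgentDisplayName(configKey: string): string {
  return AGENT_DISPLAY_NAMES[configKey] ?? configKey;
}

// Hypothetical stand-in for the payload handed to session.promptAsync.
function buildContinuationPrompt(agentKey: string, text: string): { agent: string; text: string } {
  return { agent: getAgentDisplayName(agentKey), text };
}

const prompt = buildContinuationPrompt("atlas", "continue boulder execution");
```

Converting at the call site keeps config files on short keys while the SDK only ever sees the names it registered.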
The server-health module used module-level state for inProcessServerRunning,
which doesn't survive when Bun loads separate module instances in the same
process. Fix: use globalThis with Symbol.for key so the flag is truly
process-global.
- server-health.ts: replace module-level boolean with globalThis[Symbol.for()]
- export markServerRunningInProcess from tmux-utils barrel
- test: verify flag skips HTTP fetch, verify globalThis persistence
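A rough illustration of the globalThis + Symbol.for pattern. The symbol key string and the isServerRunningInProcess reader are made up for this sketch; markServerRunningInProcess is the export named above, but its real body may differ.

```typescript
// Symbol.for() looks the key up in the global symbol registry, so every module
// instance in the process gets the same symbol back. The key string is hypothetical.
const SERVER_RUNNING_KEY = Symbol.for("example.server-health.inProcessServerRunning");

type GlobalFlags = Record<symbol, boolean | undefined>;

// Write the flag onto globalThis instead of a module-level variable.
function markServerRunningInProcess(): void {
  (globalThis as unknown as GlobalFlags)[SERVER_RUNNING_KEY] = true;
}

// Hypothetical reader: true only after the flag has been set somewhere in this process.
function isServerRunningInProcess(): boolean {
  return (globalThis as unknown as GlobalFlags)[SERVER_RUNNING_KEY] === true;
}
```

A second copy of the module would recompute the same symbol and read the same globalThis slot, which a module-level boolean cannot do.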
When .sisyphus/ is gitignored, task state written during worktree execution
is lost when the worktree is removed. Fix:
- add worktree-sync.ts: syncSisyphusStateFromWorktree() copies .sisyphus/
  contents from the worktree to the main repo directory
- update start-work.ts template: document the sync step as CRITICAL when
  worktree_path is set in boulder.json
- update work-with-pr/SKILL.md: add an explicit sync step before worktree removal
- export from boulder-state index
- test: 5 scenarios covering no-.sisyphus, nested dirs, overwrite stale state
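One way the sync step could look, as a sketch only. The signature, the boolean return value, and the copyDirectory helper are assumptions; the actual worktree-sync.ts may be structured differently.

```typescript
import { copyFileSync, existsSync, mkdirSync, mkdtempSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: recursively copy srcDir into destDir, overwriting existing files.
function copyDirectory(srcDir: string, destDir: string): void {
  mkdirSync(destDir, { recursive: true });
  for (const entry of readdirSync(srcDir, { withFileTypes: true })) {
    const src = join(srcDir, entry.name);
    const dest = join(destDir, entry.name);
    if (entry.isDirectory()) {
      copyDirectory(src, dest);
    } else if (entry.isFile()) {
      copyFileSync(src, dest); // copyFileSync overwrites the destination by default
    }
  }
}

// Assumed signature. Returns false when the worktree has no .sisyphus/ to sync.
function syncSisyphusStateFromWorktree(worktreePath: string, mainRepoPath: string): boolean {
  const source = join(worktreePath, ".sisyphus");
  if (!existsSync(source)) return false;
  copyDirectory(source, join(mainRepoPath, ".sisyphus"));
  return true;
}

// Usage: a throwaway worktree with nested state, and a stale copy in the main repo.
const worktree = mkdtempSync(join(tmpdir(), "worktree-"));
const mainRepo = mkdtempSync(join(tmpdir(), "main-"));
mkdirSync(join(worktree, ".sisyphus", "tasks"), { recursive: true });
writeFileSync(join(worktree, ".sisyphus", "tasks", "t1.json"), '{"status":"done"}');
mkdirSync(join(mainRepo, ".sisyphus", "tasks"), { recursive: true });
writeFileSync(join(mainRepo, ".sisyphus", "tasks", "t1.json"), '{"status":"stale"}');

const synced = syncSisyphusStateFromWorktree(worktree, mainRepo);
const finalState = readFileSync(join(mainRepo, ".sisyphus", "tasks", "t1.json"), "utf8");
```

The sync has to run before the worktree is removed, since the gitignored state exists nowhere else.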
- fix(switcher): use lastIndexOf for multi-slash model IDs (e.g. aws/anthropic/claude-sonnet-4)
- fix(model-resolution): same lastIndexOf fix in doctor parseProviderModel
- fix(call-omo-agent): resolve model from agent config and forward to both
  background and sync executors via DelegatedModelConfig
- fix(subagent-resolver): inherit category model/variant when agent uses
  category reference without explicit model override
- test: add model override forwarding tests for call-omo-agent
- test: add multi-slash model ID test for switcher
- fix(context-limit): check modelContextLimitsCache for all Anthropic
  models, not just the GA-model set; user config/cache wins over the 200K
  default (fixes #2836)
- fix(agent-key-remapper): preserve config key aliases alongside display
  names so `opencode run --agent sisyphus` resolves correctly
  (fixes #2858)
- fix(tool-config): respect host permission.skill=deny by disabling
  skill/skill_mcp tools when host denies them (fixes #2873)
- test: update context-limit and agent-key-remapper tests to match new
  behavior
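The lastIndexOf change for multi-slash model IDs listed above can be sketched as follows. The ProviderModel shape and the null return for malformed input are assumptions, as is reading "aws/anthropic/claude-sonnet-4" as provider "aws/anthropic" plus model "claude-sonnet-4".

```typescript
// Hypothetical result shape for a parsed "provider/model" string.
interface ProviderModel {
  providerID: string;
  modelID: string;
}

// Split on the LAST slash so providers containing slashes stay intact.
// Assumption: input with no slash, or with a leading/trailing slash, is rejected.
function parseProviderModel(input: string): ProviderModel | null {
  const slash = input.lastIndexOf("/");
  if (slash <= 0 || slash === input.length - 1) return null;
  return {
    providerID: input.slice(0, slash),
    modelID: input.slice(slash + 1),
  };
}

const parsed = parseProviderModel("aws/anthropic/claude-sonnet-4");
```

Splitting on the first slash instead would yield provider "aws" and model "anthropic/claude-sonnet-4".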
Models frequently hallucinate a 'directory' parameter alongside filePath,
causing hard failures. Instead of rejecting, accept directory as an alias
for filePath and gracefully handle when both are provided (prefer filePath).
This prevents the 'filePath and directory are mutually exclusive' error
that users hit when models pass both parameters.
Fixes model confusion with lsp_diagnostics tool parameters.
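A sketch of the alias handling, assuming a simplified argument shape. DiagnosticsArgs, resolveDiagnosticsTarget, and the error text are hypothetical and not taken from the lsp_diagnostics tool itself.

```typescript
// Hypothetical argument shape: models may send filePath, directory, or both.
interface DiagnosticsArgs {
  filePath?: string;
  directory?: string;
}

// filePath wins when both are present; directory is accepted as an alias otherwise.
function resolveDiagnosticsTarget(args: DiagnosticsArgs): string {
  const target = args.filePath || args.directory;
  if (!target) {
    // Hypothetical message for the case where neither parameter is usable.
    throw new Error("filePath is required");
  }
  return target;
}

const both = resolveDiagnosticsTarget({ filePath: "src/index.ts", directory: "src" });
const aliasOnly = resolveDiagnosticsTarget({ directory: "src" });
```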
Apply the same mock.module() isolation fixes to publish.yml:
- Move shared and session-recovery mock-heavy tests to isolated section
- Use dynamic find + exclusion for remaining src/shared tests
- Include session-recovery tests in remaining batch
Ensures publish workflow has the same test config as main CI run.
Move 4 src/shared tests that use mock.module() to the isolated test section:
- model-capabilities.test.ts (mocks ./connected-providers-cache)
- log-legacy-plugin-startup-warning.test.ts (mocks ./legacy-plugin-warning)
- model-error-classifier.test.ts
- opencode-message-dir.test.ts
Also isolate recover-tool-result-missing.test.ts (mocks ./storage).
Use find + exclusion pattern in remaining tests to dynamically build the
src/shared file list without the isolated mock-heavy files.
Fixes 6 Linux CI failures caused by Bun's mock.module() cache pollution
when test files run in parallel.
Prevent src/shared batch runs from leaking module mocks into later files, which was breaking Linux CI cache metadata and legacy plugin warning assertions.
When recovering missing tool results, the session recovery hook was using
raw part.id (prt_* format) as tool_use_id when callID was absent, causing
ZodError validation failures from the API.
Added isValidToolUseID() guard that only accepts toolu_* and call_* prefixed
IDs, and normalizeMessagePart() that returns null for parts without valid
callIDs. Both the SQLite fallback and stored-parts paths now filter out
invalid entries before constructing tool_result payloads.
Includes 4 regression tests covering both valid/invalid callID paths for
both SQLite and stored-parts backends.
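The guard and normalizer might look roughly like this. StoredPart, the return shape of normalizeMessagePart, and collectToolUseIDs are illustrative assumptions; only the function names and the accepted prefixes come from the description above.

```typescript
// Hypothetical stored part: id is the internal "prt_*" identifier and must
// never be sent to the API as a tool_use_id.
interface StoredPart {
  id: string;
  callID?: string;
}

// Accept only IDs with the toolu_ or call_ prefix.
function isValidToolUseID(id: string | undefined): id is string {
  return typeof id === "string" && (id.startsWith("toolu_") || id.startsWith("call_"));
}

// Return null when the part has no valid callID. There is no fallback to part.id.
function normalizeMessagePart(part: StoredPart): { tool_use_id: string } | null {
  return isValidToolUseID(part.callID) ? { tool_use_id: part.callID } : null;
}

// Hypothetical helper: drop invalid entries before building tool_result payloads.
function collectToolUseIDs(parts: StoredPart[]): string[] {
  return parts
    .map(normalizeMessagePart)
    .filter((p): p is { tool_use_id: string } => p !== null)
    .map((p) => p.tool_use_id);
}

const ids = collectToolUseIDs([
  { id: "prt_1", callID: "toolu_abc" },
  { id: "prt_2" },
  { id: "prt_3", callID: "call_xyz" },
  { id: "prt_4", callID: "prt_4" },
]);
```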
Pass explicit config dir to checkForLegacyPluginEntry instead of relying
on XDG_CONFIG_HOME env var, which gets contaminated by parallel tests on
Linux CI. Also adds missing 'join' import.
The opencode-project-command-discovery test used execFileSync for git init,
which collided with image-converter.test.ts's global execFileSync mock when
running in parallel on Linux CI. Switching to Bun.spawnSync avoids the mock
entirely since spyOn(childProcess, 'execFileSync') doesn't affect Bun APIs.
Fixes CI flake that only reproduced on Linux.
- Add legacy plugin startup warning when an oh-my-opencode config is detected
- Update CLI installer and TUI installer for new package name
- Split monolithic config-manager.test.ts into focused test modules
- Add plugin config detection tests for legacy name fallback
- Update processed-command-store to use plugin-identity constants
- Add claude-code-plugin-loader discovery test for both config names
- Update chat-params and ultrawork-db tests for plugin identity
Part of #2823
Updated across README (all locales), docs/guide/, docs/reference/,
docs/examples/, AGENTS.md files, and test expectations/snapshots.
The deep category and multimodal-looker still use gpt-5.3-codex as
those are separate from the hephaestus agent.
Hephaestus now uses gpt-5.4 as its default model across all providers
(openai, github-copilot, venice, opencode), matching Sisyphus's GPT 5.4
support. The separate gpt-5.3-codex → github-copilot fallback entry is
removed since gpt-5.4 is available on all required providers.
Mock connected-providers-cache in model-capabilities.test.ts to prevent
findProviderModelMetadata from reading disk-cached model metadata.
Without this mock, the 'prefers runtime models.dev cache' test gets
polluted by real cached data from opencode serve runs, causing the
test to receive different maxOutputTokens/supportsTemperature values
than the mock runtime snapshot provides.
This was the last CI-only failure: the test passed locally with a warm
cache and failed on CI without one. It now passes everywhere via mock isolation.
Full suite: 4484 pass, 0 fail.
Sisyphus-authored fixes across 15 files:
- plugin-identity: align CONFIG_BASENAME with actual config file name
- add-plugin-to-opencode-config: handle legacy→canonical name migration
- plugin-detection tests: update expectations for new identity constants
- doctor/system: fix legacy name warning test assertions
- install tests: align with new plugin name
- chat-params tests: fix mock isolation
- model-capabilities tests: fix snapshot expectations
- image-converter: fix platform-dependent test assertions (Linux CI)
- example configs: expanded with more detailed comments
Full suite: 4484 pass, 0 fail, typecheck clean.
- Doctor now detects when opencode.json references 'oh-my-opencode'
  (legacy name) and warns users to switch to 'oh-my-openagent' with
  the exact replacement string.
- Added 3 example config files in docs/examples/:
  - default.jsonc: balanced setup with all agents documented
  - coding-focused.jsonc: Sisyphus + Hephaestus heavy
  - planning-focused.jsonc: Prometheus + Atlas heavy
  All examples include every agent (sisyphus, hephaestus, atlas,
  prometheus, explore, librarian) with model recommendations.
Helps with #2823
When the provider-models cache is cold (first run / cache miss),
resolveModelForDelegateTask returns {skipped: true}. Previously this
caused the subagent resolver to:
1. Ignore the user's explicit model override (e.g. explore.model)
2. Fall through to the hardcoded fallback chain, which may contain
   model IDs that don't exist in the provider catalog
Now:
- subagent-resolver: if resolution is skipped but the user explicitly
  configured a model, use it directly
- subagent-resolver: don't assign the hardcoded fallback chain on skip
- category-resolver: likewise, don't leak the hardcoded chain on skip
- general-agents: if the user model fails resolution, use it as-is
  instead of falling back to the first entry of the hardcoded chain
Closes #2820
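The skip handling can be sketched as below. The types, the chooseDelegateModel name, and the non-skipped branch are assumptions made for illustration; only the {skipped: true} signal and the "use the user's model, no hardcoded chain" rule come from the description above.

```typescript
// Assumed result shape: either resolution was skipped (cold cache) or it produced a model.
type ModelResolution = { skipped: true } | { skipped: false; model: string };

interface DelegateModelChoice {
  model: string | undefined;
  fallbackChain: string[] | undefined;
}

// Hypothetical decision helper for the subagent resolver.
function chooseDelegateModel(
  resolution: ModelResolution,
  userModel: string | undefined,
  hardcodedFallbackChain: string[],
): DelegateModelChoice {
  if (resolution.skipped) {
    // Cold cache: honor the user's explicit model (if any) and do not attach
    // the hardcoded chain, whose IDs may be missing from the provider catalog.
    return { model: userModel, fallbackChain: undefined };
  }
  // Assumption: on a normal resolution the resolved model and chain are used as before.
  return { model: resolution.model, fallbackChain: hardcodedFallbackChain };
}

const coldCache = chooseDelegateModel({ skipped: true }, "provider/user-model", ["provider/default-model"]);
```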