subagent-resolver: when falling back to matchedAgent.model for custom
subagents, verify the model is actually available via fuzzyMatchModel
before setting it as categoryModel. Prevents delegate_task from using
an unavailable model when the user's custom agent config references
a model they don't have access to.
Test updated to include the target model in available models mock.
- completedTaskSummaries now includes status and error info
- notifyParentSession: noReply=false for failed tasks so parent reacts
- Batch notification distinguishes successful vs failed/cancelled tasks
- notification-template updated to show task errors
- task-poller: session-gone tests (85 new lines)
- CI: add Bun shim to PATH for legacy plugin migration tests
When a subagent session disappears from the status registry (process
crashed), the main agent was waiting the full stale timeout before
acting. Fix:
- Add sessionGoneTimeoutMs config option (default 60s, vs 30min normal)
- task-poller: use shorter timeout when session is gone from status
- manager: verify session existence when gone, fail crashed tasks
immediately with descriptive error
- Add legacy-plugin-toast hook for #2823 migration warnings
- Update schema with new config option
- logLegacyPluginStartupWarning now emits console.warn (visible to user,
not just log file) when oh-my-opencode is detected in opencode.json
- Auto-migrates opencode.json plugin entry from oh-my-opencode to
oh-my-openagent (with backup)
- plugin-config.ts: add console.warn when loading legacy config filename
- test: 10 tests covering migration, console output, edge cases
Two issues fixed:
1. process-cleanup.ts used fire-and-forget void Promise for shutdown
handlers — now properly collects and awaits all cleanup promises
via Promise.allSettled, with dedup guard to prevent double cleanup
2. TmuxSessionManager was never registered for process cleanup —
now registered in create-managers.ts via registerManagerForCleanup
Also fixed setTimeout().unref() which could let the process exit
before cleanup completes.
Background subagents (explore/librarian) failed with auth errors because
resolveModelForDelegateTask() always picked the first fallback entry when
availableModels was empty — often an unauthenticated provider like xai
or opencode-go.
Fix: when connectedProvidersCache is populated, iterate fallback chain
and pick the first entry whose provider is in the connected set.
Legacy behavior preserved when cache is null (not yet populated).
- model-selection.ts: use readConnectedProvidersCache to filter fallback chain
- test: 4 new tests for connected-provider-aware resolution
lastCompactionTime was only set on successful compaction. When compaction
failed (rate limit, timeout, etc.), no cooldown was recorded, causing
immediate retries in an infinite loop.
Fix: set lastCompactionTime before the try block so both success and
failure respect the cooldown window.
- test: add failed-compaction cooldown enforcement test
- test: fix timeout retry test to advance past cooldown
The server-health module used module-level state for inProcessServerRunning,
which doesn't survive when Bun loads separate module instances in the same
process. Fix: use globalThis with Symbol.for key so the flag is truly
process-global.
- server-health.ts: replace module-level boolean with globalThis[Symbol.for()]
- export markServerRunningInProcess from tmux-utils barrel
- test: verify flag skips HTTP fetch, verify globalThis persistence
When .sisyphus/ is gitignored, task state written during worktree execution
is lost when the worktree is removed. Fix:
- add worktree-sync.ts: syncSisyphusStateFromWorktree() copies .sisyphus/
contents from worktree to main repo directory
- update start-work.ts template: documents the sync step as CRITICAL when
worktree_path is set in boulder.json
- update work-with-pr/SKILL.md: adds explicit sync step before worktree removal
- export from boulder-state index
- test: 5 scenarios covering no-.sisyphus, nested dirs, overwrite stale state
- fix(switcher): use lastIndexOf for multi-slash model IDs (e.g. aws/anthropic/claude-sonnet-4)
- fix(model-resolution): same lastIndexOf fix in doctor parseProviderModel
- fix(call-omo-agent): resolve model from agent config and forward to both
background and sync executors via DelegatedModelConfig
- fix(subagent-resolver): inherit category model/variant when agent uses
category reference without explicit model override
- test: add model override forwarding tests for call-omo-agent
- test: add multi-slash model ID test for switcher
- fix(context-limit): check modelContextLimitsCache for all Anthropic
models, not just GA-model set; user config/cache wins over 200K default
(fixes#2836)
- fix(agent-key-remapper): preserve config key aliases alongside display
names so `opencode run --agent sisyphus` resolves correctly
(fixes#2858)
- fix(tool-config): respect host permission.skill=deny by disabling
skill/skill_mcp tools when host denies them (fixes#2873)
- test: update context-limit and agent-key-remapper tests to match new
behavior
Models frequently hallucinate a 'directory' parameter alongside filePath,
causing hard failures. Instead of rejecting, accept directory as an alias
for filePath and gracefully handle when both are provided (prefer filePath).
This prevents the 'filePath and directory are mutually exclusive' error
that users hit when models pass both parameters.
Fixes model confusion with lsp_diagnostics tool parameters.
Apply the same mock.module() isolation fixes to publish.yml:
- Move shared and session-recovery mock-heavy tests to isolated section
- Use dynamic find + exclusion for remaining src/shared tests
- Include session-recovery tests in remaining batch
Ensures publish workflow has the same test config as main CI run.
Move 4 src/shared tests that use mock.module() to the isolated test section:
- model-capabilities.test.ts (mocks ./connected-providers-cache)
- log-legacy-plugin-startup-warning.test.ts (mocks ./legacy-plugin-warning)
- model-error-classifier.test.ts
- opencode-message-dir.test.ts
Also isolate recover-tool-result-missing.test.ts (mocks ./storage).
Use find + exclusion pattern in remaining tests to dynamically build the
src/shared file list without the isolated mock-heavy files.
Fixes 6 Linux CI failures caused by bun's mock.module() cache pollution
when running in parallel.
Prevent src/shared batch runs from leaking module mocks into later files, which was breaking Linux CI cache metadata and legacy plugin warning assertions.
When recovering missing tool results, the session recovery hook was using
raw part.id (prt_* format) as tool_use_id when callID was absent, causing
ZodError validation failures from the API.
Added isValidToolUseID() guard that only accepts toolu_* and call_* prefixed
IDs, and normalizeMessagePart() that returns null for parts without valid
callIDs. Both the SQLite fallback and stored-parts paths now filter out
invalid entries before constructing tool_result payloads.
Includes 4 regression tests covering both valid/invalid callID paths for
both SQLite and stored-parts backends.
Pass explicit config dir to checkForLegacyPluginEntry instead of relying
on XDG_CONFIG_HOME env var, which gets contaminated by parallel tests on
Linux CI. Also adds missing 'join' import.
The opencode-project-command-discovery test used execFileSync for git init,
which collided with image-converter.test.ts's global execFileSync mock when
running in parallel on Linux CI. Switching to Bun.spawnSync avoids the mock
entirely since spyOn(childProcess, 'execFileSync') doesn't affect Bun APIs.
Fixes CI flake that only reproduced on Linux.
- Add legacy plugin startup warning when oh-my-opencode config detected
- Update CLI installer and TUI installer for new package name
- Split monolithic config-manager.test.ts into focused test modules
- Add plugin config detection tests for legacy name fallback
- Update processed-command-store to use plugin-identity constants
- Add claude-code-plugin-loader discovery test for both config names
- Update chat-params and ultrawork-db tests for plugin identity
Part of #2823
Updated across README (all locales), docs/guide/, docs/reference/,
docs/examples/, AGENTS.md files, and test expectations/snapshots.
The deep category and multimodal-looker still use gpt-5.3-codex as
those are separate from the hephaestus agent.
Hephaestus now uses gpt-5.4 as its default model across all providers
(openai, github-copilot, venice, opencode), matching Sisyphus's GPT 5.4
support. The separate gpt-5.3-codex → github-copilot fallback entry is
removed since gpt-5.4 is available on all required providers.
Mock connected-providers-cache in model-capabilities.test.ts to prevent
findProviderModelMetadata from reading disk-cached model metadata.
Without this mock, the 'prefers runtime models.dev cache' test gets
polluted by real cached data from opencode serve runs, causing the
test to receive different maxOutputTokens/supportsTemperature values
than the mock runtime snapshot provides.
This was the last CI-only failure — passes locally with cache, fails
on CI without cache, now passes everywhere via mock isolation.
Full suite: 4484 pass, 0 fail.
Sisyphus-authored fixes across 15 files:
- plugin-identity: align CONFIG_BASENAME with actual config file name
- add-plugin-to-opencode-config: handle legacy→canonical name migration
- plugin-detection tests: update expectations for new identity constants
- doctor/system: fix legacy name warning test assertions
- install tests: align with new plugin name
- chat-params tests: fix mock isolation
- model-capabilities tests: fix snapshot expectations
- image-converter: fix platform-dependent test assertions (Linux CI)
- example configs: expanded with more detailed comments
Full suite: 4484 pass, 0 fail, typecheck clean.
- Doctor now detects when opencode.json references 'oh-my-opencode'
(legacy name) and warns users to switch to 'oh-my-openagent' with
the exact replacement string.
- Added 3 example config files in docs/examples/:
- default.jsonc: balanced setup with all agents documented
- coding-focused.jsonc: Sisyphus + Hephaestus heavy
- planning-focused.jsonc: Prometheus + Atlas heavy
All examples include every agent (sisyphus, hephaestus, atlas,
prometheus, explore, librarian) with model recommendations.
Helps with #2823
When provider-models cache is cold (first run / cache miss),
resolveModelForDelegateTask returns {skipped: true}. Previously this
caused the subagent resolver to:
1. Ignore the user's explicit model override (e.g. explore.model)
2. Fall through to the hardcoded fallback chain which may contain
model IDs that don't exist in the provider catalog
Now:
- subagent-resolver: if resolution is skipped but user explicitly
configured a model, use it directly
- subagent-resolver: don't assign hardcoded fallback chain on skip
- category-resolver: same — don't leak hardcoded chain on skip
- general-agents: if user model fails resolution, use it as-is
instead of falling back to hardcoded chain first entry
Closes#2820