Compare commits

...

63 Commits

Author SHA1 Message Date
YeonGyu-Kim
09cfd0b408 diag(todo-continuation): add comprehensive debug logging for session idle handling
Add [TODO-DIAG] console.error statements throughout the todo continuation
enforcer to help diagnose why continuation prompts aren't being injected.

Changes:
- Add session.idle event handler diagnostic in handler.ts
- Add detailed blocking reason logging in idle-event.ts for all gate checks
- Update JSON schema to reflect circuit breaker config changes

🤖 Generated with assistance of [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)
2026-03-18 14:45:14 +09:00
YeonGyu-Kim
d48ea025f0 refactor(circuit-breaker): replace sliding window with consecutive call detection
Switch background task loop detection from percentage-based sliding window
(80% of 20-call window) to consecutive same-tool counting. It triggers when
the same tool signature is called 20+ times in a row; a different tool resets
the counter.
2026-03-18 14:32:27 +09:00
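The consecutive-call detection described above can be sketched roughly as follows. This is an illustrative model, not the plugin's actual API: the state shape, names, and the signature format are assumptions.

```typescript
// Minimal sketch of consecutive same-tool loop detection. A counter
// increments while each incoming tool signature matches the previous one
// and resets to 1 on any different signature; the breaker trips once the
// count reaches the threshold (20 in the commit above).
const CONSECUTIVE_THRESHOLD = 20;

interface LoopDetectorState {
  lastSignature: string | null;
  consecutiveCount: number;
}

function createLoopDetector(): LoopDetectorState {
  return { lastSignature: null, consecutiveCount: 0 };
}

// Returns true when this call should trip the circuit breaker.
function recordToolCall(state: LoopDetectorState, signature: string): boolean {
  if (signature === state.lastSignature) {
    state.consecutiveCount += 1;
  } else {
    // A different tool resets the counter.
    state.lastSignature = signature;
    state.consecutiveCount = 1;
  }
  return state.consecutiveCount >= CONSECUTIVE_THRESHOLD;
}
```

Compared to a percentage-of-window check, this cannot be diluted by occasional unrelated calls slipping into the window, and it needs O(1) state instead of a 20-entry history.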
YeonGyu-Kim
c5c7ba4eed perf: pre-compile regex patterns and optimize hot-path string operations
- error-classifier: pre-compile default retry pattern regex
- think-mode/detector: combine multilingual patterns into single regex
- parser: skip redundant toLowerCase on pre-lowered keywords
- edit-operations: use fast arraysEqual instead of JSON comparison
- hash-computation: optimize streaming line extraction with index tracking
2026-03-18 14:19:23 +09:00
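The pre-compilation pattern from the error-classifier change can be sketched like this; the identifiers and pattern contents are hypothetical.

```typescript
// Compiling the RegExp once at module scope means hot-path calls reuse
// one compiled pattern instead of paying `new RegExp(...)` construction
// cost on every classification.
const RETRYABLE_ERROR_PATTERN: RegExp =
  /rate limit|timeout|overloaded|temporarily unavailable/i;

function isRetryableError(message: string): boolean {
  // No `g` flag, so .test() is stateless and safe to call repeatedly.
  return RETRYABLE_ERROR_PATTERN.test(message);
}
```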
YeonGyu-Kim
90aa3a306c perf(hooks,tools): optimize string operations and reduce redundant iterations
- output-renderer, hashline-edit-diff: replace str += with array join (H2)
- auto-slash-command: single-pass Map grouping instead of 6x filter (M1)
- comment-checker: hoist Zod schema to module scope (M2)
- session-last-agent: reverse iterate sorted array instead of sort+reverse (L2)
2026-03-18 14:19:12 +09:00
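The single-pass Map grouping mentioned for auto-slash-command (M1) follows a standard shape; this sketch uses invented types, not the plugin's real ones.

```typescript
// Replacing N separate `items.filter(...)` passes (O(N * kinds)) with one
// loop that buckets every item by its kind in a single O(N) traversal.
interface Command {
  kind: string;
  name: string;
}

function groupByKind(commands: Command[]): Map<string, Command[]> {
  const groups = new Map<string, Command[]>();
  for (const cmd of commands) {
    const bucket = groups.get(cmd.kind);
    if (bucket) {
      bucket.push(cmd);
    } else {
      groups.set(cmd.kind, [cmd]);
    }
  }
  return groups;
}
```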
YeonGyu-Kim
c2f7d059d2 perf(shared): optimize hot-path utilities across plugin
- task-list: replace O(n³) blocker resolution with Map lookup (C4)
- logger: buffer log entries and flush periodically to reduce sync I/O (C5)
- plugin-interface: create chatParamsHandler once at init (H3)
- pattern-matcher: cache compiled RegExp for wildcard matchers (H6)
- file-reference-resolver: use replaceAll instead of split/join (M9)
- connected-providers-cache: add in-memory cache for read operations (L4)
2026-03-18 14:19:00 +09:00
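The pattern-matcher RegExp cache (H6) can be approximated as below; the wildcard syntax and cache shape are assumptions, not the plugin's actual implementation.

```typescript
// Each wildcard pattern like "src/*.ts" is compiled to a RegExp once and
// memoized, so repeated matches against the same pattern skip recompilation.
const wildcardCache = new Map<string, RegExp>();

function wildcardToRegExp(pattern: string): RegExp {
  let compiled = wildcardCache.get(pattern);
  if (!compiled) {
    // Escape regex metacharacters (except `*`), then translate `*` to `.*`.
    const escaped = pattern
      .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
      .replace(/\*/g, ".*");
    compiled = new RegExp(`^${escaped}$`);
    wildcardCache.set(pattern, compiled);
  }
  return compiled;
}
```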
YeonGyu-Kim
7a96a167e6 perf(claude-code-hooks): defer config loading until after disabled check
Move loadClaudeHooksConfig and loadPluginExtendedConfig after the isHookDisabled
check in both tool-execute-before and tool-execute-after handlers to skip 5 file
reads per tool call when hooks are disabled (C1).
2026-03-18 14:18:49 +09:00
YeonGyu-Kim
2da19fe608 perf(background-agent): use Set for countedToolPartIDs, cache circuit breaker settings, optimize loop detector
- Replace countedToolPartIDs string[] with Set<string> for O(1) has/add vs O(n) includes/spread (C2)
- Cache resolveCircuitBreakerSettings at manager level to avoid repeated object creation (C3)
- Optimize recordToolCall to avoid full array copy with slice (L1)
2026-03-18 14:18:38 +09:00
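The Set migration (C2) is a textbook data-structure swap; this sketch uses illustrative names rather than the plugin's real ones.

```typescript
// Membership checks on a growing string[] are O(n) via includes(), and
// rebuilding with `[...arr, id]` copies the whole array each time. A Set
// gives O(1) average has()/add() with no copying.
const countedToolPartIDs = new Set<string>();

// Returns true the first time a part ID is seen, false for duplicates.
function countToolPart(partID: string): boolean {
  if (countedToolPartIDs.has(partID)) return false;
  countedToolPartIDs.add(partID);
  return true;
}
```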
YeonGyu-Kim
952bd5338d fix(background-agent): treat non-active session statuses as terminal to prevent parent session hang
Previously, pollRunningTasks() and checkAndInterruptStaleTasks() treated
any non-"idle" session status as "still running", which caused tasks with
terminal statuses like "interrupted" to be skipped indefinitely — both
for completion detection AND stale timeout. This made the parent session
hang forever waiting for an ALL COMPLETE notification that never came.

Extract isActiveSessionStatus() and isTerminalSessionStatus() that
classify session statuses explicitly. Only known active statuses
("busy", "retry", "running") protect tasks from completion/stale checks.
Known terminal statuses ("interrupted") trigger immediate completion.
Unknown statuses fall through to the standard idle/gone path with output
validation as a conservative default.

Introduced by: a0c93816 (2026-02-14), dc370f7f (2026-03-08)
2026-03-18 14:06:23 +09:00
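The explicit classification described above can be sketched as follows. The status strings are taken from the commit message; the helper names mirror it, but the Set-based shape is an assumption.

```typescript
// Only known active statuses protect a task from completion/stale checks;
// known terminal statuses trigger immediate completion. Anything else is
// neither, so callers fall through to the standard idle/gone path with
// output validation as a conservative default.
const ACTIVE_STATUSES = new Set(["busy", "retry", "running"]);
const TERMINAL_STATUSES = new Set(["interrupted"]);

function isActiveSessionStatus(status: string): boolean {
  return ACTIVE_STATUSES.has(status);
}

function isTerminalSessionStatus(status: string): boolean {
  return TERMINAL_STATUSES.has(status);
}
```

The key contrast with the buggy version is that "not idle" no longer implies "still running": an explicit allow-list of active statuses replaces a deny-list of one.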
YeonGyu-Kim
57757a345d refactor: improve test isolation and DI for cache/port-utils/resolve-file-uri
- connected-providers-cache: extract factory pattern (createConnectedProvidersCacheStore) for testable cache dir injection
- port-utils.test: environment-independent tests with real socket probing and contiguous port detection
- resolve-file-uri.test: mock homedir instead of touching real home directory
- github-triage: update SKILL.md
2026-03-18 13:17:01 +09:00
YeonGyu-Kim
3caae14192 fix(ralph-loop): abort stale Oracle sessions before ulw verification restart
When Oracle verification fails in ulw-loop mode, the previous Oracle
session was never aborted before restarting. Each retry created a new
descendant session, causing unbounded session accumulation and 500
errors from server overload.

Now abort the old verification session before:
- restarting the loop after failed verification
- re-entering verification phase on subsequent DONE detection
2026-03-18 12:49:27 +09:00
YeonGyu-Kim
55ac653eaa feat(hooks): add todo-description-override hook to enforce atomic todo format
Override TodoWrite description via tool.definition hook to require
WHERE/WHY/HOW/RESULT in each todo title and enforce 1-3 tool call
granularity.
2026-03-18 11:49:13 +09:00
YeonGyu-Kim
1d5652dfa9 Merge pull request #2655 from tad-hq/infinite-circuit-target-fix
fix(circuit-breaker): make repetitive detection target-aware and add enabled escape hatch
2026-03-18 11:46:06 +09:00
YeonGyu-Kim
76c460536d docs(start-work): update worktree and task breakdown guidance
- Change worktree behavior: default to current directory, worktree only with --worktree flag
- Add mandatory TASK BREAKDOWN section with granular sub-task requirements
- Add WORKTREE COMPLETION section for merging worktree branches back

🤖 Generated with assistance of OhMyOpenCode
2026-03-18 11:16:43 +09:00
github-actions[bot]
b067d4a284 @ogormans-deptstack has signed the CLA in code-yeongyu/oh-my-openagent#2656 2026-03-17 20:42:53 +00:00
github-actions[bot]
94838ec039 @tad-hq has signed the CLA in code-yeongyu/oh-my-openagent#2655 2026-03-17 20:07:20 +00:00
tad-hq
224ecea8c7 chore: regenerate JSON schema with circuitBreaker.enabled field 2026-03-17 13:43:56 -06:00
tad-hq
5d5755f29d fix(circuit-breaker): wire target-aware detection into background manager 2026-03-17 13:40:46 -06:00
tad-hq
1fdce01fd2 fix(circuit-breaker): target-aware loop detection via tool signatures 2026-03-17 13:36:09 -06:00
tad-hq
c8213c970e fix(circuit-breaker): add enabled config flag as escape hatch 2026-03-17 13:29:06 -06:00
YeonGyu-Kim
576ff453e5 Merge pull request #2651 from code-yeongyu/fix/openagent-version-in-publish
fix(release): set version when publishing oh-my-openagent
2026-03-18 02:15:36 +09:00
YeonGyu-Kim
9b8aca45f9 fix(release): set version when publishing oh-my-openagent
The publish step was updating name and optionalDependencies but not
version, causing npm to try publishing the base package.json version
(3.11.0) instead of the release version (3.12.0).

Error was: 'You cannot publish over the previously published versions: 3.11.0'
2026-03-18 02:15:15 +09:00
YeonGyu-Kim
f1f20f5a79 Merge pull request #2650 from code-yeongyu/fix/openagent-platform-publish
fix(release): add oh-my-openagent dual-publish to platform and main workflows
2026-03-18 01:55:31 +09:00
YeonGyu-Kim
de40caf76d fix(release): add oh-my-openagent dual-publish to platform and main workflows
- publish-platform.yml: Build job now checks BOTH oh-my-opencode and
  oh-my-openagent before skipping. Build only skips when both are published.
  Added 'Publish oh-my-openagent-{platform}' step that renames package.json
  and publishes under the openagent name.

- publish.yml: Added 'Publish oh-my-openagent' step after opencode publish.
  Rewrites package name and optionalDependencies to oh-my-openagent variants,
  then publishes. Restores package.json after.

Previously, oh-my-openagent platform packages were never published because
the build skip check only looked at oh-my-opencode (which was already published),
causing the entire build to be skipped.
2026-03-18 01:45:02 +09:00
github-actions[bot]
d80833896c @HaD0Yun has signed the CLA in code-yeongyu/oh-my-openagent#2640 2026-03-17 08:27:56 +00:00
YeonGyu-Kim
d50c38f037 refactor(tests): rename benchmarks/ to tests/hashline/, remove FriendliAI dependency
- Move benchmarks/ → tests/hashline/
- Replace @friendliai/ai-provider with @ai-sdk/openai-compatible
- Remove all 'benchmark' naming (package name, scripts, env vars, session IDs)
- Fix import paths for new directory depth (../src → ../../src)
- Fix pre-existing syntax error in headless.ts (unclosed case block)
- Inject HASHLINE_EDIT_DESCRIPTION into test system prompt
- Scripts renamed: bench:* → test:*
2026-03-17 16:47:13 +09:00
YeonGyu-Kim
f2d5f4ca92 improve(hashline-edit): rewrite tool description with examples and fix lines schema
- Add XML-structured description (<must>, <operations>, <examples>, <auto>)
- Add 5 concrete examples including BAD pattern showing duplication
- Add explicit anti-duplication warning for range replace
- Move snapshot rule to top-level <must> section
- Clarify batch semantics (multiple ops, not one big replace)
- Fix lines schema: add string[] to union (was string|null, now string[]|string|null)
- Matches runtime RawHashlineEdit type and description text
2026-03-17 16:47:13 +09:00
YeonGyu-Kim
b788586caf relax task timeouts: stale timeout 3min→20min, session wait 30s→1min 2026-03-17 16:47:13 +09:00
YeonGyu-Kim
90351e442e update look_at tool description to discourage visual precision use cases 2026-03-17 16:47:13 +09:00
YeonGyu-Kim
4ad88b2576 feat(task-toast): show model name before category in toast notification
Display resolved model ID (e.g., gpt-5.3-codex: deep) instead of
agent/category format when modelInfo is available. Falls back to
old format when no model info exists.
2026-03-17 16:47:13 +09:00
YeonGyu-Kim
2ce69710e3 docs: sync agent-model-matching guide with actual fallback chains
- Metis: add missing GPT-5.4 high as 2nd fallback
- Hephaestus: add GPT-5.4 (Copilot) fallback, was incorrectly listed as Codex-only
- Oracle: add opencode-go/glm-5 as last fallback
- Momus: add opencode-go/glm-5 fallback, note xhigh variant
- Atlas: add GPT-5.4 medium as 3rd fallback
- Sisyphus: add Kimi K2.5 (moonshot providers) in chain
- Sisyphus-Junior: add missing agent to Utility Runners section
- GPT Family table: merge duplicate GPT-5.4 rows
- Categories: add missing opencode-go intermediate fallbacks for
  visual-engineering, ultrabrain, quick, unspecified-low/high, writing
2026-03-17 16:47:13 +09:00
YeonGyu-Kim
0b4d092cf6 Merge pull request #2639 from code-yeongyu/feature/2635-smart-circuit-breaker
feat(background-agent): add smart circuit breaker for repeated tool calls
2026-03-17 16:43:08 +09:00
YeonGyu-Kim
53285617d3 Merge pull request #2636 from code-yeongyu/fix/pre-publish-blockers
fix: resolve 12 pre-publish blockers (security, correctness, migration)
2026-03-17 16:36:04 +09:00
YeonGyu-Kim
ae3befbfbe fix(background-agent): apply smart circuit breaker to manager events
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-17 16:31:55 +09:00
YeonGyu-Kim
dc1a05ac3e feat(background-agent): add loop detector helpers
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-17 16:31:55 +09:00
YeonGyu-Kim
e271b4a1b0 feat(config): add background task circuit breaker settings
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-17 16:31:55 +09:00
YeonGyu-Kim
fee938d63a fix(cli): cherry-pick glm-4.7-free → gpt-5-nano fallback fix from dev 2026-03-17 16:30:12 +09:00
YeonGyu-Kim
4d74d888e4 Merge pull request #2637 from code-yeongyu/fix/ulw-verification-session-tracking
fix(ulw-loop): add fallback for Oracle verification session tracking
2026-03-17 16:25:28 +09:00
YeonGyu-Kim
4bc7b1d27c fix(ulw-loop): add fallback for Oracle verification session tracking
The verification_session_id was never reliably set because the
prompt-based attempt_id matching in tool-execute-after depends on
metadata.prompt surviving the delegate-task execution chain. When
this fails silently, the loop never detects Oracle's VERIFIED
emission.

Add a fallback: when exact attempt_id matching fails but oracle
agent + verification_pending state match, still set the session ID.
Add diagnostic logging to trace verification flow failures.
Add integration test covering the full verification chain.
2026-03-17 16:21:40 +09:00
YeonGyu-Kim
78dac0642e Merge pull request #2590 from MoerAI/fix/subagent-circuit-breaker
fix(background-agent): add circuit breaker to prevent subagent infinite loops (fixes #2571)
2026-03-17 16:09:29 +09:00
YeonGyu-Kim
92bc72a90b fix(bun-install): use workspaceDir option instead of hardcoded cache-dir 2026-03-17 16:05:51 +09:00
YeonGyu-Kim
a7301ba8a9 fix(delegate-task): guard skipped sentinel in subagent-resolver 2026-03-17 15:57:23 +09:00
YeonGyu-Kim
e9887dd82f fix(doctor): align auto-update and doctor config paths 2026-03-17 15:56:02 +09:00
YeonGyu-Kim
c0082d8a09 Merge pull request #2634 from code-yeongyu/fix/run-in-background-required
fix(delegate-task): remove auto-default for run_in_background, require explicit parameter
2026-03-17 15:55:17 +09:00
YeonGyu-Kim
fbc3b4e230 Merge pull request #2612 from MoerAI/fix/dead-fallback-model
fix(cli): replace dead glm-4.7-free with gpt-5-nano as ultimate fallback (fixes #2101)
2026-03-17 15:53:29 +09:00
YeonGyu-Kim
1f7fdb43ba Merge pull request #2539 from cpkt9762/fix/category-variant-no-requirement
fix(delegate-task): build categoryModel with variant for categories without fallback chain
2026-03-17 15:53:11 +09:00
YeonGyu-Kim
566031f4fa fix(delegate-task): remove auto-default for run_in_background, require explicit parameter
Remove the auto-defaulting logic from PR #2420 that silently set
run_in_background=false when category/subagent_type/session_id was present.

The tool description falsely claimed 'Default: false' which misled agents
into omitting the parameter. Now the description says REQUIRED and the
validation always throws when the parameter is missing, with a clear
error message guiding the agent to retry with the correct value.

Reverts the behavioral change from #2420 while keeping the issue's
root cause (misleading description) fixed.
2026-03-17 15:49:47 +09:00
YeonGyu-Kim
0cf386ec52 fix(skill-tool): invalidate cached skill description on execute 2026-03-17 15:49:26 +09:00
YeonGyu-Kim
d493f9ec3a fix(cli-run): move resolveRunModel inside try block 2026-03-17 15:49:26 +09:00
YeonGyu-Kim
2c7ded2433 fix(background-agent): defer task cleanup while siblings running 2026-03-17 15:17:34 +09:00
YeonGyu-Kim
82c7807a4f fix(event): clear retry dedupe key on non-retry status 2026-03-17 15:17:34 +09:00
YeonGyu-Kim
df7e1ae16d fix(todo-continuation): remove activity-based stagnation bypass 2026-03-17 15:17:34 +09:00
YeonGyu-Kim
0471078006 fix(tmux): escape serverUrl in pane shell commands 2026-03-17 15:16:54 +09:00
YeonGyu-Kim
1070b9170f docs: remove temporary injury notice from README 2026-03-17 10:41:56 +09:00
acamq
bb312711cf Merge pull request #2618 from RaviTharuma/fix/extract-status-code-nested-errors
fix(runtime-fallback): extract status code from nested AI SDK errors
2026-03-16 16:28:31 -06:00
github-actions[bot]
c31facf41e @gxlife has signed the CLA in code-yeongyu/oh-my-openagent#2625 2026-03-16 15:17:21 +00:00
Ravi Tharuma
de66f1f397 fix(runtime-fallback): prefer numeric status codes over non-numeric in extraction chain
The nullish-coalescing chain could stop at a non-numeric value (e.g.
status: "error"), preventing deeper nested numeric statusCode values
from being reached. Switch to Array.find() with a type guard to always
select the first numeric value.

Adds 11 tests for extractStatusCode covering: top-level, nested
(data/error/cause), non-numeric skip, fallback to regex, and
precedence.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 13:51:23 +01:00
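The find()-with-type-guard approach from this commit can be sketched as below. The exact set and ordering of nested fields checked is an assumption; only the technique (collect candidates, pick the first numeric one) comes from the commit message.

```typescript
// A nullish-coalescing chain like `err.statusCode ?? err.status ?? ...`
// stops at the first non-nullish value even when it is non-numeric
// (e.g. status: "error"), hiding deeper numeric codes. Collecting the
// candidates and using find() with a `v is number` type guard always
// selects the first numeric value instead.
type MaybeError = {
  statusCode?: unknown;
  status?: unknown;
  error?: MaybeError;
  cause?: MaybeError;
  data?: MaybeError;
};

function extractStatusCode(err: MaybeError): number | undefined {
  const candidates = [
    err.statusCode,
    err.status,
    err.error?.statusCode,
    err.data?.statusCode,
    err.cause?.statusCode,
  ];
  return candidates.find((v): v is number => typeof v === "number");
}
```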
YeonGyu-Kim
427fa6d7a2 Merge pull request #2619 from code-yeongyu/revert/openclaw-one-way
revert: remove one-way OpenClaw integration
2026-03-16 21:09:30 +09:00
YeonGyu-Kim
239da8b02a Revert "Merge pull request #2607 from code-yeongyu/feat/openclaw-integration"
This reverts commit 8213534e87, reversing
changes made to 84fb1113f1.
2026-03-16 21:09:08 +09:00
YeonGyu-Kim
17244e2c84 Revert "Merge pull request #2609 from code-yeongyu/fix/rename-omx-to-omo-env"
This reverts commit 4759dfb654, reversing
changes made to 8213534e87.
2026-03-16 21:09:08 +09:00
Ravi Tharuma
24a0f7b032 fix(runtime-fallback): extract status code from nested AI SDK errors
AI SDK wraps HTTP status codes inside error.error.statusCode (e.g., AI_APICallError). The current extractStatusCode only checks the top level, missing these nested codes.

This caused runtime-fallback to skip retryable errors like 400, 500, 504 because it couldn't find the status code.

Fixes #2617
2026-03-16 13:04:14 +01:00
MoerAI
fc48df1d53 fix(cli): replace dead glm-4.7-free with gpt-5-nano as ultimate fallback
The opencode/glm-4.7-free model was removed from the OpenCode platform,
causing the ULTIMATE_FALLBACK in the CLI installer to point to a dead
model. Users installing OMO without any major provider configured would
get a non-functional model assignment.

Replaced with opencode/gpt-5-nano which is confirmed available per
user reports and existing fallback chains in model-requirements.ts.

Fixes #2101
2026-03-16 19:21:10 +09:00
MoerAI
3055454ecc fix(background-agent): add circuit breaker to prevent subagent infinite loops
Adds a configurable maxToolCalls limit (default: 200) that automatically
cancels background tasks when they exceed the threshold. This prevents
runaway subagent loops from burning unlimited tokens, as reported in #2571
where a Gemini subagent ran 809 consecutive tool calls over 3.5 hours
costing ~$350.

The circuit breaker triggers in the existing tool call tracking path
(message.part.updated/delta events) and cancels the task with a clear
error message explaining what happened. The limit is configurable via
background_task.maxToolCalls in oh-my-opencode.jsonc.

Fixes #2571
2026-03-16 11:07:33 +09:00
cpkt9762
11e9276498 fix(delegate-task): build categoryModel with variant for categories without fallback chain
When a category has no CATEGORY_MODEL_REQUIREMENTS entry (e.g.
user-defined categories like solana-re), the !requirement branch
set actualModel but never built categoryModel with variant from
the user config. The bottom fallback then created categoryModel
via parseModelString alone, silently dropping the variant.

Mirror the requirement branch logic: read variant from
userCategories and resolved.config, and build categoryModel
with it.

Fixes #2538
2026-03-13 04:15:17 +08:00
123 changed files with 4234 additions and 1783 deletions

View File

@@ -59,20 +59,39 @@ jobs:
- name: Check if already published
id: check
run: |
PKG_NAME="oh-my-opencode-${{ matrix.platform }}"
VERSION="${{ inputs.version }}"
STATUS=$(curl -s -o /dev/null -w "%{http_code}" "https://registry.npmjs.org/${PKG_NAME}/${VERSION}")
# Convert platform name for output (replace - with _)
PLATFORM_KEY="${{ matrix.platform }}"
PLATFORM_KEY="${PLATFORM_KEY//-/_}"
if [ "$STATUS" = "200" ]; then
# Check oh-my-opencode
OC_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "https://registry.npmjs.org/oh-my-opencode-${{ matrix.platform }}/${VERSION}")
# Check oh-my-openagent
OA_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "https://registry.npmjs.org/oh-my-openagent-${{ matrix.platform }}/${VERSION}")
echo "oh-my-opencode-${{ matrix.platform }}@${VERSION}: ${OC_STATUS}"
echo "oh-my-openagent-${{ matrix.platform }}@${VERSION}: ${OA_STATUS}"
if [ "$OC_STATUS" = "200" ]; then
echo "skip_opencode=true" >> $GITHUB_OUTPUT
echo "✓ oh-my-opencode-${{ matrix.platform }}@${VERSION} already published"
else
echo "skip_opencode=false" >> $GITHUB_OUTPUT
echo "→ oh-my-opencode-${{ matrix.platform }}@${VERSION} needs publishing"
fi
if [ "$OA_STATUS" = "200" ]; then
echo "skip_openagent=true" >> $GITHUB_OUTPUT
echo "✓ oh-my-openagent-${{ matrix.platform }}@${VERSION} already published"
else
echo "skip_openagent=false" >> $GITHUB_OUTPUT
echo "→ oh-my-openagent-${{ matrix.platform }}@${VERSION} needs publishing"
fi
# Skip build only if BOTH are already published
if [ "$OC_STATUS" = "200" ] && [ "$OA_STATUS" = "200" ]; then
echo "skip=true" >> $GITHUB_OUTPUT
echo "skip_${PLATFORM_KEY}=true" >> $GITHUB_OUTPUT
echo "✓ ${PKG_NAME}@${VERSION} already published"
else
echo "skip=false" >> $GITHUB_OUTPUT
echo "skip_${PLATFORM_KEY}=false" >> $GITHUB_OUTPUT
echo "→ ${PKG_NAME}@${VERSION} needs publishing"
fi
- name: Update version in package.json
@@ -207,23 +226,38 @@ jobs:
matrix:
platform: [darwin-arm64, darwin-x64, darwin-x64-baseline, linux-x64, linux-x64-baseline, linux-arm64, linux-x64-musl, linux-x64-musl-baseline, linux-arm64-musl, windows-x64, windows-x64-baseline]
steps:
- name: Check if oh-my-opencode already published
- name: Check if already published
id: check
run: |
PKG_NAME="oh-my-opencode-${{ matrix.platform }}"
VERSION="${{ inputs.version }}"
STATUS=$(curl -s -o /dev/null -w "%{http_code}" "https://registry.npmjs.org/${PKG_NAME}/${VERSION}")
if [ "$STATUS" = "200" ]; then
echo "skip=true" >> $GITHUB_OUTPUT
echo "✓ ${PKG_NAME}@${VERSION} already published, skipping"
OC_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "https://registry.npmjs.org/oh-my-opencode-${{ matrix.platform }}/${VERSION}")
OA_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "https://registry.npmjs.org/oh-my-openagent-${{ matrix.platform }}/${VERSION}")
if [ "$OC_STATUS" = "200" ]; then
echo "skip_opencode=true" >> $GITHUB_OUTPUT
echo "✓ oh-my-opencode-${{ matrix.platform }}@${VERSION} already published"
else
echo "skip=false" >> $GITHUB_OUTPUT
echo "→ ${PKG_NAME}@${VERSION} will be published"
echo "skip_opencode=false" >> $GITHUB_OUTPUT
fi
if [ "$OA_STATUS" = "200" ]; then
echo "skip_openagent=true" >> $GITHUB_OUTPUT
echo "✓ oh-my-openagent-${{ matrix.platform }}@${VERSION} already published"
else
echo "skip_openagent=false" >> $GITHUB_OUTPUT
fi
# Need artifact if either package needs publishing
if [ "$OC_STATUS" = "200" ] && [ "$OA_STATUS" = "200" ]; then
echo "skip_all=true" >> $GITHUB_OUTPUT
else
echo "skip_all=false" >> $GITHUB_OUTPUT
fi
- name: Download artifact
id: download
if: steps.check.outputs.skip != 'true'
if: steps.check.outputs.skip_all != 'true'
continue-on-error: true
uses: actions/download-artifact@v4
with:
@@ -231,7 +265,7 @@ jobs:
path: .
- name: Extract artifact
if: steps.check.outputs.skip != 'true' && steps.download.outcome == 'success'
if: steps.check.outputs.skip_all != 'true' && steps.download.outcome == 'success'
run: |
PLATFORM="${{ matrix.platform }}"
mkdir -p packages/${PLATFORM}
@@ -247,13 +281,13 @@ jobs:
ls -la packages/${PLATFORM}/bin/
- uses: actions/setup-node@v4
if: steps.check.outputs.skip != 'true' && steps.download.outcome == 'success'
if: steps.check.outputs.skip_all != 'true' && steps.download.outcome == 'success'
with:
node-version: "24"
registry-url: "https://registry.npmjs.org"
- name: Publish ${{ matrix.platform }}
if: steps.check.outputs.skip != 'true' && steps.download.outcome == 'success'
- name: Publish oh-my-opencode-${{ matrix.platform }}
if: steps.check.outputs.skip_opencode != 'true' && steps.download.outcome == 'success'
run: |
cd packages/${{ matrix.platform }}
@@ -267,3 +301,25 @@ jobs:
NODE_AUTH_TOKEN: ${{ secrets.NODE_AUTH_TOKEN }}
NPM_CONFIG_PROVENANCE: true
timeout-minutes: 15
- name: Publish oh-my-openagent-${{ matrix.platform }}
if: steps.check.outputs.skip_openagent != 'true' && steps.download.outcome == 'success'
run: |
cd packages/${{ matrix.platform }}
# Rename package for oh-my-openagent
jq --arg name "oh-my-openagent-${{ matrix.platform }}" \
--arg desc "Platform-specific binary for oh-my-openagent (${{ matrix.platform }})" \
'.name = $name | .description = $desc | .bin = {"oh-my-openagent": (.bin | to_entries | .[0].value)}' \
package.json > tmp.json && mv tmp.json package.json
TAG_ARG=""
if [ -n "${{ inputs.dist_tag }}" ]; then
TAG_ARG="--tag ${{ inputs.dist_tag }}"
fi
npm publish --access public --provenance $TAG_ARG
env:
NODE_AUTH_TOKEN: ${{ secrets.NODE_AUTH_TOKEN }}
NPM_CONFIG_PROVENANCE: true
timeout-minutes: 15

View File

@@ -216,6 +216,48 @@ jobs:
NODE_AUTH_TOKEN: ${{ secrets.NODE_AUTH_TOKEN }}
NPM_CONFIG_PROVENANCE: true
- name: Check if oh-my-openagent already published
id: check-openagent
run: |
VERSION="${{ steps.version.outputs.version }}"
STATUS=$(curl -s -o /dev/null -w "%{http_code}" "https://registry.npmjs.org/oh-my-openagent/${VERSION}")
if [ "$STATUS" = "200" ]; then
echo "skip=true" >> $GITHUB_OUTPUT
echo "✓ oh-my-openagent@${VERSION} already published"
else
echo "skip=false" >> $GITHUB_OUTPUT
fi
- name: Publish oh-my-openagent
if: steps.check-openagent.outputs.skip != 'true'
run: |
VERSION="${{ steps.version.outputs.version }}"
# Update package name, version, and optionalDependencies for oh-my-openagent
jq --arg v "$VERSION" '
.name = "oh-my-openagent" |
.version = $v |
.optionalDependencies = (
.optionalDependencies | to_entries |
map(.key = (.key | sub("^oh-my-opencode-"; "oh-my-openagent-")) | .value = $v) |
from_entries
)
' package.json > tmp.json && mv tmp.json package.json
TAG_ARG=""
if [ -n "${{ steps.version.outputs.dist_tag }}" ]; then
TAG_ARG="--tag ${{ steps.version.outputs.dist_tag }}"
fi
npm publish --access public --provenance $TAG_ARG || echo "::warning::oh-my-openagent publish failed"
env:
NODE_AUTH_TOKEN: ${{ secrets.NODE_AUTH_TOKEN }}
NPM_CONFIG_PROVENANCE: true
- name: Restore package.json
if: steps.check-openagent.outputs.skip != 'true'
run: |
git checkout -- package.json
trigger-platform:
runs-on: ubuntu-latest
needs: publish-main

View File

@@ -136,7 +136,36 @@ fi
---
## Phase 3: Spawn Subagents
## Phase 3: Spawn Subagents (Individual Tool Calls)
**CRITICAL: Create tasks ONE BY ONE using individual `task_create` tool calls. NEVER batch or script.**
For each item, execute these steps sequentially:
### Step 3.1: Create Task Record
```typescript
task_create(
subject="Triage: #{number} {title}",
description="GitHub {issue|PR} triage analysis - {type}",
metadata={"type": "{ISSUE_QUESTION|ISSUE_BUG|ISSUE_FEATURE|ISSUE_OTHER|PR_BUGFIX|PR_OTHER}", "number": {number}}
)
```
### Step 3.2: Spawn Analysis Subagent (Background)
```typescript
task(
category="quick",
run_in_background=true,
load_skills=[],
prompt=SUBAGENT_PROMPT
)
```
**ABSOLUTE RULES for Subagents:**
- **ONLY ANALYZE** - Never take action on GitHub (no comments, merges, closes)
- **READ-ONLY** - Use tools only for reading code/GitHub data
- **WRITE REPORT ONLY** - Output goes to `{REPORT_DIR}/{issue|pr}-{number}.md` via Write tool
- **EVIDENCE REQUIRED** - Every claim must have GitHub permalink as proof
```
For each item:
@@ -170,6 +199,7 @@ ABSOLUTE RULES (violating ANY = critical failure):
- Your ONLY writable output: {REPORT_DIR}/{issue|pr}-{number}.md via the Write tool
```
---
### ISSUE_QUESTION

View File

@@ -1,9 +1,3 @@
> [!WARNING]
> **TEMP NOTICE (This Week): Reduced Maintainer Availability**
>
> Core maintainer Q got injured, so issue/PR responses and releases may be delayed this week.
> Thank you for your patience and support.
> [!NOTE]
>
> [![Sisyphus Labs - Sisyphus is the agent that codes like your team.](./.github/assets/sisyphuslabs.png?v=2)](https://sisyphuslabs.ai)

View File

@@ -3699,6 +3699,30 @@
"syncPollTimeoutMs": {
"type": "number",
"minimum": 60000
},
"maxToolCalls": {
"type": "integer",
"minimum": 10,
"maximum": 9007199254740991
},
"circuitBreaker": {
"type": "object",
"properties": {
"enabled": {
"type": "boolean"
},
"maxToolCalls": {
"type": "integer",
"minimum": 10,
"maximum": 9007199254740991
},
"consecutiveThreshold": {
"type": "integer",
"minimum": 5,
"maximum": 9007199254740991
}
},
"additionalProperties": false
}
},
"additionalProperties": false

View File

@@ -1,18 +0,0 @@
{
"name": "hashline-edit-benchmark",
"version": "0.1.0",
"private": true,
"type": "module",
"description": "Hashline edit tool benchmark using Vercel AI SDK with FriendliAI provider",
"scripts": {
"bench:basic": "bun run test-edit-ops.ts",
"bench:edge": "bun run test-edge-cases.ts",
"bench:multi": "bun run test-multi-model.ts",
"bench:all": "bun run bench:basic && bun run bench:edge"
},
"dependencies": {
"@friendliai/ai-provider": "^1.0.9",
"ai": "^6.0.94",
"zod": "^4.1.0"
}
}

View File

@@ -64,8 +64,8 @@ These agents have Claude-optimized prompts — long, detailed, mechanics-driven.
| Agent | Role | Fallback Chain | Notes |
| ------------ | ----------------- | -------------------------------------- | ------------------------------------------------------------------------------------------------- |
| **Sisyphus** | Main orchestrator | Claude Opus → opencode-go/kimi-k2.5 → K2P5 → GPT-5.4 → GLM-5 → Big Pickle | Claude-family first. GPT-5.4 has dedicated prompt support. Kimi/GLM as intermediate fallbacks. |
| **Metis** | Plan gap analyzer | Claude Opus → opencode-go/glm-5 → K2P5 | Claude preferred. Uses opencode-go for reliable GLM-5 access. |
| **Sisyphus** | Main orchestrator | Claude Opus → opencode-go/kimi-k2.5 → K2P5 → Kimi K2.5 → GPT-5.4 → GLM-5 → Big Pickle | Claude-family first. GPT-5.4 has dedicated prompt support. Kimi available through multiple providers. |
| **Metis** | Plan gap analyzer | Claude Opus → GPT-5.4 → opencode-go/glm-5 → K2P5 | Claude preferred. GPT-5.4 as secondary before GLM-5 fallback. |
### Dual-Prompt Agents → Claude preferred, GPT supported
@@ -74,7 +74,7 @@ These agents ship separate prompts for Claude and GPT families. They auto-detect
| Agent | Role | Fallback Chain | Notes |
| -------------- | ----------------- | -------------------------------------- | -------------------------------------------------------------------- |
| **Prometheus** | Strategic planner | Claude Opus → GPT-5.4 → opencode-go/glm-5 → Gemini 3.1 Pro | Interview-mode planning. GPT prompt is compact and principle-driven. |
| **Atlas** | Todo orchestrator | Claude Sonnet → opencode-go/kimi-k2.5 | Claude first, opencode-go as the current fallback path. |
| **Atlas** | Todo orchestrator | Claude Sonnet → opencode-go/kimi-k2.5 → GPT-5.4 | Claude first, opencode-go as intermediate, GPT-5.4 as last resort. |
### Deep Specialists → GPT
@@ -82,9 +82,9 @@ These agents are built for GPT's principle-driven style. Their prompts assume au
| Agent | Role | Fallback Chain | Notes |
| -------------- | ----------------------- | -------------------------------------- | ------------------------------------------------ |
| **Hephaestus** | Autonomous deep worker | GPT-5.3 Codex only | No fallback. Requires GPT access. The craftsman. |
| **Oracle** | Architecture consultant | GPT-5.4 → Gemini 3.1 Pro → Claude Opus | Read-only high-IQ consultation. |
| **Momus** | Ruthless reviewer | GPT-5.4 → Claude Opus → Gemini 3.1 Pro | Verification and plan review. |
| **Hephaestus** | Autonomous deep worker | GPT-5.3 Codex → GPT-5.4 (Copilot) | Requires GPT access. GPT-5.4 via Copilot as fallback. The craftsman. |
| **Oracle** | Architecture consultant | GPT-5.4 → Gemini 3.1 Pro → Claude Opus → opencode-go/glm-5 | Read-only high-IQ consultation. |
| **Momus** | Ruthless reviewer | GPT-5.4 → Claude Opus → Gemini 3.1 Pro → opencode-go/glm-5 | Verification and plan review. GPT-5.4 uses xhigh variant. |
### Utility Runners → Speed over Intelligence
@@ -95,6 +95,7 @@ These agents do grep, search, and retrieval. They intentionally use the fastest,
| **Explore** | Fast codebase grep | Grok Code Fast → opencode-go/minimax-m2.5 → MiniMax Free → Haiku → GPT-5-Nano | Speed is everything. Fire 10 in parallel. |
| **Librarian** | Docs/code search | opencode-go/minimax-m2.5 → MiniMax Free → Haiku → GPT-5-Nano | Doc retrieval doesn't need deep reasoning. |
| **Multimodal Looker** | Vision/screenshots | GPT-5.4 → opencode-go/kimi-k2.5 → GLM-4.6v → GPT-5-Nano | Uses the first available multimodal-capable fallback. |
| **Sisyphus-Junior** | Category executor | Claude Sonnet → opencode-go/kimi-k2.5 → GPT-5.4 → Big Pickle | Handles delegated category tasks. Sonnet-tier default. |
---
@@ -119,8 +120,7 @@ Principle-driven, explicit reasoning, deep technical capability. Best for agents
| Model | Strengths |
| ----------------- | ----------------------------------------------------------------------------------------------- |
| **GPT-5.3 Codex** | Deep coding powerhouse. Autonomous exploration. Required for Hephaestus. |
| **GPT-5.4** | High intelligence, strategic reasoning. Default for Oracle. |
| **GPT-5.4** | Strong principle-driven reasoning. Default for Momus and a key fallback for Prometheus / Atlas. |
| **GPT-5.4** | High intelligence, strategic reasoning. Default for Oracle, Momus, and a key fallback for Prometheus / Atlas. Uses xhigh variant for Momus. |
| **GPT-5-Nano** | Ultra-cheap, fast. Good for simple utility tasks. |
### Other Models
@@ -166,14 +166,14 @@ When agents delegate work, they don't pick a model name — they pick a **catego
| Category | When Used | Fallback Chain |
| -------------------- | -------------------------- | -------------------------------------------- |
| `visual-engineering` | Frontend, UI, CSS, design | Gemini 3.1 Pro → GLM 5 → Claude Opus |
| `ultrabrain` | Maximum reasoning needed | GPT-5.4 → Gemini 3.1 Pro → Claude Opus |
| `visual-engineering` | Frontend, UI, CSS, design | Gemini 3.1 Pro → GLM 5 → Claude Opus → opencode-go/glm-5 → K2P5 |
| `ultrabrain` | Maximum reasoning needed | GPT-5.4 → Gemini 3.1 Pro → Claude Opus → opencode-go/glm-5 |
| `deep` | Deep coding, complex logic | GPT-5.3 Codex → Claude Opus → Gemini 3.1 Pro |
| `artistry` | Creative, novel approaches | Gemini 3.1 Pro → Claude Opus → GPT-5.4 |
| `quick` | Simple, fast tasks | Claude Haiku → Gemini Flash → GPT-5-Nano |
| `unspecified-high` | General complex work | Claude Opus → GPT-5.4 (high) → GLM 5 → K2P5 |
| `unspecified-low` | General standard work | Claude Sonnet → GPT-5.3 Codex → Gemini Flash |
| `writing` | Text, docs, prose | Gemini Flash → Claude Sonnet |
| `quick` | Simple, fast tasks | Claude Haiku → Gemini Flash → opencode-go/minimax-m2.5 → GPT-5-Nano |
| `unspecified-high` | General complex work | Claude Opus → GPT-5.4 → GLM 5 → K2P5 → opencode-go/glm-5 → Kimi K2.5 |
| `unspecified-low` | General standard work | Claude Sonnet → GPT-5.3 Codex → opencode-go/kimi-k2.5 → Gemini Flash |
| `writing` | Text, docs, prose | Gemini Flash → opencode-go/kimi-k2.5 → Claude Sonnet |
See the [Orchestration System Guide](./orchestration.md) for how agents dispatch tasks to categories.
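
The category dispatch described above can be sketched as a first-available walk over each fallback chain. This is a hypothetical standalone version (model IDs and the `resolveCategoryModel` / `isProviderAvailable` names are illustrative, not the plugin's actual API); the real resolution lives in the model-fallback module and ends at the `ULTIMATE_FALLBACK` constant:

```typescript
// Illustrative fallback chains mirroring the category table above.
type Category = "ultrabrain" | "quick" | "writing"

const FALLBACK_CHAINS: Record<Category, string[]> = {
  ultrabrain: ["openai/gpt-5.4", "google/gemini-3.1-pro", "anthropic/claude-opus", "opencode-go/glm-5"],
  quick: ["anthropic/claude-haiku", "google/gemini-flash", "opencode-go/minimax-m2.5", "openai/gpt-5-nano"],
  writing: ["google/gemini-flash", "opencode-go/kimi-k2.5", "anthropic/claude-sonnet"],
}

// Walk the chain and return the first model whose provider is available;
// fall back to the ultimate fallback when the whole chain is unavailable.
function resolveCategoryModel(
  category: Category,
  isProviderAvailable: (model: string) => boolean,
  ultimateFallback = "opencode/gpt-5-nano",
): string {
  for (const model of FALLBACK_CHAINS[category]) {
    if (isProviderAvailable(model)) return model
  }
  return ultimateFallback
}
```

The key design point is that delegating agents never name a model directly; availability is evaluated per chain at dispatch time, so a missing provider degrades gracefully instead of failing the task.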

View File

@@ -2207,6 +2207,38 @@
"created_at": "2026-03-16T04:55:10Z",
"repoId": 1108837393,
"pullRequestNo": 2604
},
{
"name": "gxlife",
"id": 110413359,
"comment_id": 4068427047,
"created_at": "2026-03-16T15:17:01Z",
"repoId": 1108837393,
"pullRequestNo": 2625
},
{
"name": "HaD0Yun",
"id": 102889891,
"comment_id": 4073195308,
"created_at": "2026-03-17T08:27:45Z",
"repoId": 1108837393,
"pullRequestNo": 2640
},
{
"name": "tad-hq",
"id": 213478119,
"comment_id": 4077697128,
"created_at": "2026-03-17T20:07:09Z",
"repoId": 1108837393,
"pullRequestNo": 2655
},
{
"name": "ogormans-deptstack",
"id": 208788555,
"comment_id": 4077893096,
"created_at": "2026-03-17T20:42:42Z",
"repoId": 1108837393,
"pullRequestNo": 2656
}
]
}

View File

@@ -1,20 +1,32 @@
import { afterAll, beforeAll, describe, expect, test } from "bun:test"
import { afterAll, beforeAll, describe, expect, mock, test } from "bun:test"
import { mkdirSync, rmSync, writeFileSync } from "node:fs"
import { homedir, tmpdir } from "node:os"
import * as os from "node:os"
import { tmpdir } from "node:os"
import { join } from "node:path"
import { resolvePromptAppend } from "./resolve-file-uri"
const originalHomedir = os.homedir.bind(os)
let mockedHomeDir = ""
let moduleImportCounter = 0
let resolvePromptAppend: typeof import("./resolve-file-uri").resolvePromptAppend
mock.module("node:os", () => ({
...os,
homedir: () => mockedHomeDir || originalHomedir(),
}))
describe("resolvePromptAppend", () => {
const fixtureRoot = join(tmpdir(), `resolve-file-uri-${Date.now()}`)
const configDir = join(fixtureRoot, "config")
const homeFixtureDir = join(homedir(), `.resolve-file-uri-home-${Date.now()}`)
const homeFixtureRoot = join(fixtureRoot, "home")
const homeFixtureDir = join(homeFixtureRoot, "fixture-home")
const absoluteFilePath = join(fixtureRoot, "absolute.txt")
const relativeFilePath = join(configDir, "relative.txt")
const spacedFilePath = join(fixtureRoot, "with space.txt")
const homeFilePath = join(homeFixtureDir, "home.txt")
beforeAll(() => {
beforeAll(async () => {
mockedHomeDir = homeFixtureRoot
mkdirSync(fixtureRoot, { recursive: true })
mkdirSync(configDir, { recursive: true })
mkdirSync(homeFixtureDir, { recursive: true })
@@ -23,11 +35,14 @@ describe("resolvePromptAppend", () => {
writeFileSync(relativeFilePath, "relative-content", "utf8")
writeFileSync(spacedFilePath, "encoded-content", "utf8")
writeFileSync(homeFilePath, "home-content", "utf8")
moduleImportCounter += 1
;({ resolvePromptAppend } = await import(`./resolve-file-uri?test=${moduleImportCounter}`))
})
afterAll(() => {
rmSync(fixtureRoot, { recursive: true, force: true })
rmSync(homeFixtureDir, { recursive: true, force: true })
mock.restore()
})
test("returns non-file URI strings unchanged", () => {
@@ -65,7 +80,7 @@ describe("resolvePromptAppend", () => {
test("resolves home directory URI path", () => {
//#given
const input = `file://~/${homeFixtureDir.split("/").pop()}/home.txt`
const input = "file://~/fixture-home/home.txt"
//#when
const resolved = resolvePromptAppend(input)

View File

@@ -5,60 +5,60 @@ exports[`generateModelConfig no providers available returns ULTIMATE_FALLBACK fo
"$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-openagent/dev/assets/oh-my-opencode.schema.json",
"agents": {
"atlas": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"explore": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"hephaestus": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"librarian": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"metis": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"momus": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"multimodal-looker": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"oracle": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"prometheus": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"sisyphus-junior": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
},
"categories": {
"artistry": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"deep": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"quick": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"ultrabrain": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"unspecified-high": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"unspecified-low": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"visual-engineering": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"writing": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
},
}
@@ -83,7 +83,7 @@ exports[`generateModelConfig single native provider uses Claude models when only
"variant": "max",
},
"multimodal-looker": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"oracle": {
"model": "anthropic/claude-opus-4-6",
@@ -145,7 +145,7 @@ exports[`generateModelConfig single native provider uses Claude models with isMa
"variant": "max",
},
"multimodal-looker": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"oracle": {
"model": "anthropic/claude-opus-4-6",
@@ -366,20 +366,20 @@ exports[`generateModelConfig single native provider uses Gemini models when only
"$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-openagent/dev/assets/oh-my-opencode.schema.json",
"agents": {
"atlas": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"explore": {
"model": "opencode/gpt-5-nano",
},
"metis": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"momus": {
"model": "google/gemini-3.1-pro-preview",
"variant": "high",
},
"multimodal-looker": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"oracle": {
"model": "google/gemini-3.1-pro-preview",
@@ -389,7 +389,7 @@ exports[`generateModelConfig single native provider uses Gemini models when only
"model": "google/gemini-3.1-pro-preview",
},
"sisyphus-junior": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
},
"categories": {
@@ -426,20 +426,20 @@ exports[`generateModelConfig single native provider uses Gemini models with isMa
"$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-openagent/dev/assets/oh-my-opencode.schema.json",
"agents": {
"atlas": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"explore": {
"model": "opencode/gpt-5-nano",
},
"metis": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"momus": {
"model": "google/gemini-3.1-pro-preview",
"variant": "high",
},
"multimodal-looker": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"oracle": {
"model": "google/gemini-3.1-pro-preview",
@@ -449,7 +449,7 @@ exports[`generateModelConfig single native provider uses Gemini models with isMa
"model": "google/gemini-3.1-pro-preview",
},
"sisyphus-junior": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
},
"categories": {
@@ -465,7 +465,7 @@ exports[`generateModelConfig single native provider uses Gemini models with isMa
"variant": "high",
},
"unspecified-high": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"unspecified-low": {
"model": "google/gemini-3-flash-preview",
@@ -929,7 +929,7 @@ exports[`generateModelConfig fallback providers uses ZAI model for librarian whe
"$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-openagent/dev/assets/oh-my-opencode.schema.json",
"agents": {
"atlas": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"explore": {
"model": "opencode/gpt-5-nano",
@@ -938,45 +938,45 @@ exports[`generateModelConfig fallback providers uses ZAI model for librarian whe
"model": "zai-coding-plan/glm-4.7",
},
"metis": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"momus": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"multimodal-looker": {
"model": "zai-coding-plan/glm-4.6v",
},
"oracle": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"prometheus": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"sisyphus": {
"model": "zai-coding-plan/glm-5",
},
"sisyphus-junior": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
},
"categories": {
"quick": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"ultrabrain": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"unspecified-high": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"unspecified-low": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"visual-engineering": {
"model": "zai-coding-plan/glm-5",
},
"writing": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
},
}
@@ -987,7 +987,7 @@ exports[`generateModelConfig fallback providers uses ZAI model for librarian wit
"$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-openagent/dev/assets/oh-my-opencode.schema.json",
"agents": {
"atlas": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"explore": {
"model": "opencode/gpt-5-nano",
@@ -996,45 +996,45 @@ exports[`generateModelConfig fallback providers uses ZAI model for librarian wit
"model": "zai-coding-plan/glm-4.7",
},
"metis": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"momus": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"multimodal-looker": {
"model": "zai-coding-plan/glm-4.6v",
},
"oracle": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"prometheus": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"sisyphus": {
"model": "zai-coding-plan/glm-5",
},
"sisyphus-junior": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
},
"categories": {
"quick": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"ultrabrain": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"unspecified-high": {
"model": "zai-coding-plan/glm-5",
},
"unspecified-low": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"visual-engineering": {
"model": "zai-coding-plan/glm-5",
},
"writing": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
},
}
@@ -1273,7 +1273,7 @@ exports[`generateModelConfig mixed provider scenarios uses Gemini + Claude combi
"variant": "max",
},
"multimodal-looker": {
"model": "opencode/glm-4.7-free",
"model": "opencode/gpt-5-nano",
},
"oracle": {
"model": "google/gemini-3.1-pro-preview",

View File

@@ -1,5 +1,6 @@
import { readFileSync, writeFileSync } from "node:fs"
import type { ConfigMergeResult } from "../types"
import { PLUGIN_NAME, LEGACY_PLUGIN_NAME } from "../../shared"
import { getConfigDir } from "./config-context"
import { ensureConfigDirectoryExists } from "./ensure-config-directory-exists"
import { formatErrorWithSuggestion } from "./format-error-with-suggestion"
@@ -7,8 +8,6 @@ import { detectConfigFormat } from "./opencode-config-format"
import { parseOpenCodeConfigFileWithError, type OpenCodeConfig } from "./parse-opencode-config-file"
import { getPluginNameWithVersion } from "./plugin-name-with-version"
const PACKAGE_NAME = "oh-my-opencode"
export async function addPluginToOpenCodeConfig(currentVersion: string): Promise<ConfigMergeResult> {
try {
ensureConfigDirectoryExists()
@@ -21,7 +20,7 @@ export async function addPluginToOpenCodeConfig(currentVersion: string): Promise
}
const { format, path } = detectConfigFormat()
const pluginEntry = await getPluginNameWithVersion(currentVersion, PACKAGE_NAME)
const pluginEntry = await getPluginNameWithVersion(currentVersion, PLUGIN_NAME)
try {
if (format === "none") {
@@ -41,13 +40,24 @@ export async function addPluginToOpenCodeConfig(currentVersion: string): Promise
const config = parseResult.config
const plugins = config.plugin ?? []
const existingIndex = plugins.findIndex((plugin) => plugin === PACKAGE_NAME || plugin.startsWith(`${PACKAGE_NAME}@`))
if (existingIndex !== -1) {
if (plugins[existingIndex] === pluginEntry) {
// Check for existing plugin (either current or legacy name)
const currentNameIndex = plugins.findIndex(
(plugin) => plugin === PLUGIN_NAME || plugin.startsWith(`${PLUGIN_NAME}@`)
)
const legacyNameIndex = plugins.findIndex(
(plugin) => plugin === LEGACY_PLUGIN_NAME || plugin.startsWith(`${LEGACY_PLUGIN_NAME}@`)
)
// If either name exists, update to new name
if (currentNameIndex !== -1) {
if (plugins[currentNameIndex] === pluginEntry) {
return { success: true, configPath: path }
}
plugins[existingIndex] = pluginEntry
plugins[currentNameIndex] = pluginEntry
} else if (legacyNameIndex !== -1) {
// Upgrade legacy name to new name
plugins[legacyNameIndex] = pluginEntry
} else {
plugins.push(pluginEntry)
}
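
The legacy-name upgrade above reduces to three cases: update the current-name slot in place, else replace the legacy-name slot, else append. A dependency-free sketch of that merge rule (the `upsertPluginEntry` helper is hypothetical; the real logic sits inside `addPluginToOpenCodeConfig`):

```typescript
const PLUGIN_NAME = "oh-my-opencode"
const LEGACY_PLUGIN_NAME = "oh-my-openagent"

// True for a bare name or a version-pinned entry like "name@3.11.0".
function matchesPlugin(entry: string, name: string): boolean {
  return entry === name || entry.startsWith(`${name}@`)
}

// Returns the plugin list with the versioned entry installed exactly once,
// upgrading a legacy-named entry in place when one is present.
function upsertPluginEntry(plugins: string[], pluginEntry: string): string[] {
  const next = [...plugins]
  const currentIdx = next.findIndex((p) => matchesPlugin(p, PLUGIN_NAME))
  const legacyIdx = next.findIndex((p) => matchesPlugin(p, LEGACY_PLUGIN_NAME))
  if (currentIdx !== -1) {
    next[currentIdx] = pluginEntry
  } else if (legacyIdx !== -1) {
    next[legacyIdx] = pluginEntry
  } else {
    next.push(pluginEntry)
  }
  return next
}
```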

View File

@@ -11,6 +11,8 @@ type BunInstallOutputMode = "inherit" | "pipe"
interface RunBunInstallOptions {
outputMode?: BunInstallOutputMode
/** Workspace directory to install to. Defaults to cache dir if not provided. */
workspaceDir?: string
}
interface BunInstallOutput {
@@ -65,7 +67,7 @@ function logCapturedOutputOnFailure(outputMode: BunInstallOutputMode, output: Bu
export async function runBunInstallWithDetails(options?: RunBunInstallOptions): Promise<BunInstallResult> {
const outputMode = options?.outputMode ?? "pipe"
const cacheDir = getOpenCodeCacheDir()
const cacheDir = options?.workspaceDir ?? getOpenCodeCacheDir()
const packageJsonPath = `${cacheDir}/package.json`
if (!existsSync(packageJsonPath)) {

View File

@@ -1,5 +1,5 @@
import { existsSync, readFileSync } from "node:fs"
import { parseJsonc } from "../../shared"
import { parseJsonc, LEGACY_PLUGIN_NAME, PLUGIN_NAME } from "../../shared"
import type { DetectedConfig } from "../types"
import { getOmoConfigPath } from "./config-context"
import { detectConfigFormat } from "./opencode-config-format"
@@ -55,8 +55,12 @@ function detectProvidersFromOmoConfig(): {
}
}
function isOurPlugin(plugin: string): boolean {
return plugin === PLUGIN_NAME || plugin.startsWith(`${PLUGIN_NAME}@`) ||
plugin === LEGACY_PLUGIN_NAME || plugin.startsWith(`${LEGACY_PLUGIN_NAME}@`)
}
export function detectCurrentConfig(): DetectedConfig {
const PACKAGE_NAME = "oh-my-opencode"
const result: DetectedConfig = {
isInstalled: false,
hasClaude: true,
@@ -82,7 +86,7 @@ export function detectCurrentConfig(): DetectedConfig {
const openCodeConfig = parseResult.config
const plugins = openCodeConfig.plugin ?? []
result.isInstalled = plugins.some((plugin) => plugin.startsWith(PACKAGE_NAME))
result.isInstalled = plugins.some(isOurPlugin)
if (!result.isInstalled) {
return result

View File

@@ -52,6 +52,30 @@ describe("detectCurrentConfig - single package detection", () => {
expect(result.isInstalled).toBe(true)
})
it("detects oh-my-openagent as installed (legacy name)", () => {
// given
const config = { plugin: ["oh-my-openagent"] }
writeFileSync(testConfigPath, JSON.stringify(config, null, 2) + "\n", "utf-8")
// when
const result = detectCurrentConfig()
// then
expect(result.isInstalled).toBe(true)
})
it("detects oh-my-openagent with version pin as installed (legacy name)", () => {
// given
const config = { plugin: ["oh-my-openagent@3.11.0"] }
writeFileSync(testConfigPath, JSON.stringify(config, null, 2) + "\n", "utf-8")
// when
const result = detectCurrentConfig()
// then
expect(result.isInstalled).toBe(true)
})
it("returns false when plugin not present", () => {
// given
const config = { plugin: ["some-other-plugin"] }
@@ -64,6 +88,18 @@ describe("detectCurrentConfig - single package detection", () => {
expect(result.isInstalled).toBe(false)
})
it("returns false when plugin not present (even with similar name)", () => {
// given - not exactly oh-my-openagent
const config = { plugin: ["oh-my-openagent-extra"] }
writeFileSync(testConfigPath, JSON.stringify(config, null, 2) + "\n", "utf-8")
// when
const result = detectCurrentConfig()
// then
expect(result.isInstalled).toBe(false)
})
it("detects OpenCode Go from the existing omo config", () => {
// given
writeFileSync(testConfigPath, JSON.stringify({ plugin: ["oh-my-opencode"] }, null, 2) + "\n", "utf-8")
@@ -130,6 +166,38 @@ describe("addPluginToOpenCodeConfig - single package writes", () => {
expect(savedConfig.plugin).not.toContain("oh-my-opencode@3.10.0")
})
it("recognizes oh-my-openagent as already installed (legacy name)", async () => {
// given
const config = { plugin: ["oh-my-openagent"] }
writeFileSync(testConfigPath, JSON.stringify(config, null, 2) + "\n", "utf-8")
// when
const result = await addPluginToOpenCodeConfig("3.11.0")
// then
expect(result.success).toBe(true)
const savedConfig = JSON.parse(readFileSync(testConfigPath, "utf-8"))
// Should upgrade to new name
expect(savedConfig.plugin).toContain("oh-my-opencode")
expect(savedConfig.plugin).not.toContain("oh-my-openagent")
})
it("replaces version-pinned oh-my-openagent@X.Y.Z with new name", async () => {
// given
const config = { plugin: ["oh-my-openagent@3.10.0"] }
writeFileSync(testConfigPath, JSON.stringify(config, null, 2) + "\n", "utf-8")
// when
const result = await addPluginToOpenCodeConfig("3.11.0")
// then
expect(result.success).toBe(true)
const savedConfig = JSON.parse(readFileSync(testConfigPath, "utf-8"))
// Legacy should be replaced with new name
expect(savedConfig.plugin).toContain("oh-my-opencode")
expect(savedConfig.plugin).not.toContain("oh-my-openagent")
})
it("adds new plugin when none exists", async () => {
// given
const config = {}

View File

@@ -1,7 +1,6 @@
import { existsSync, readFileSync } from "node:fs"
import { PACKAGE_NAME } from "../constants"
import { getOpenCodeConfigPaths, parseJsonc } from "../../../shared"
import { LEGACY_PLUGIN_NAME, PLUGIN_NAME, getOpenCodeConfigPaths, parseJsonc } from "../../../shared"
export interface PluginInfo {
registered: boolean
@@ -24,18 +23,33 @@ function detectConfigPath(): string | null {
}
function parsePluginVersion(entry: string): string | null {
if (!entry.startsWith(`${PACKAGE_NAME}@`)) return null
const value = entry.slice(PACKAGE_NAME.length + 1)
if (!value || value === "latest") return null
return value
// Check for current package name
if (entry.startsWith(`${PLUGIN_NAME}@`)) {
const value = entry.slice(PLUGIN_NAME.length + 1)
if (!value || value === "latest") return null
return value
}
// Check for legacy package name
if (entry.startsWith(`${LEGACY_PLUGIN_NAME}@`)) {
const value = entry.slice(LEGACY_PLUGIN_NAME.length + 1)
if (!value || value === "latest") return null
return value
}
return null
}
function findPluginEntry(entries: string[]): { entry: string; isLocalDev: boolean } | null {
for (const entry of entries) {
if (entry === PACKAGE_NAME || entry.startsWith(`${PACKAGE_NAME}@`)) {
// Check for current package name
if (entry === PLUGIN_NAME || entry.startsWith(`${PLUGIN_NAME}@`)) {
return { entry, isLocalDev: false }
}
if (entry.startsWith("file://") && entry.includes(PACKAGE_NAME)) {
// Check for legacy package name
if (entry === LEGACY_PLUGIN_NAME || entry.startsWith(`${LEGACY_PLUGIN_NAME}@`)) {
return { entry, isLocalDev: false }
}
// Check for file:// paths that include either name
if (entry.startsWith("file://") && (entry.includes(PLUGIN_NAME) || entry.includes(LEGACY_PLUGIN_NAME))) {
return { entry, isLocalDev: true }
}
}
@@ -76,7 +90,7 @@ export function getPluginInfo(): PluginInfo {
registered: true,
configPath,
entry: pluginEntry.entry,
isPinned: pinnedVersion !== null && /^\d+\.\d+\.\d+/.test(pinnedVersion),
isPinned: pinnedVersion !== null && /^\d+\.\d+\.\d+/.test(pinnedVersion ?? ""),
pinnedVersion,
isLocalDev: pluginEntry.isLocalDev,
}
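
The two-branch version parsing above can be collapsed into a single loop over both names. A condensed standalone sketch (same behavior as the diffed `parsePluginVersion`, reproduced here for illustration only):

```typescript
const PLUGIN_NAME = "oh-my-opencode"
const LEGACY_PLUGIN_NAME = "oh-my-openagent"

// Extracts a pinned version from either the current or legacy entry name;
// bare names and "@latest" pins yield null.
function parsePluginVersion(entry: string): string | null {
  for (const name of [PLUGIN_NAME, LEGACY_PLUGIN_NAME]) {
    if (entry.startsWith(`${name}@`)) {
      const value = entry.slice(name.length + 1)
      return !value || value === "latest" ? null : value
    }
  }
  return null
}
```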

View File

@@ -19,7 +19,7 @@ export type { GeneratedOmoConfig } from "./model-fallback-types"
const ZAI_MODEL = "zai-coding-plan/glm-4.7"
const ULTIMATE_FALLBACK = "opencode/glm-4.7-free"
const ULTIMATE_FALLBACK = "opencode/gpt-5-nano"
const SCHEMA_URL = "https://raw.githubusercontent.com/code-yeongyu/oh-my-openagent/dev/assets/oh-my-opencode.schema.json"

View File

@@ -45,26 +45,26 @@ export function writePaddedText(
return { output: text, atLineStart: text.endsWith("\n") }
}
let output = ""
const parts: string[] = []
let lineStart = atLineStart
for (let i = 0; i < text.length; i++) {
const ch = text[i]
if (lineStart) {
output += " "
parts.push(" ")
lineStart = false
}
if (ch === "\n") {
output += " \n"
parts.push(" \n")
lineStart = true
continue
}
output += ch
parts.push(ch)
}
return { output, atLineStart: lineStart }
return { output: parts.join(""), atLineStart: lineStart }
}
function colorizeWithProfileColor(text: string, hexColor?: string): string {

View File

@@ -1,6 +1,6 @@
/// <reference types="bun-types" />
import { describe, it, expect } from "bun:test"
import { describe, it, expect, beforeEach, afterEach } from "bun:test"
import type { OhMyOpenCodeConfig } from "../../config"
import { resolveRunAgent, waitForEventProcessorShutdown } from "./runner"
@@ -83,7 +83,6 @@ describe("resolveRunAgent", () => {
})
describe("waitForEventProcessorShutdown", () => {
it("returns quickly when event processor completes", async () => {
//#given
const eventProcessor = new Promise<void>((resolve) => {
@@ -115,3 +114,44 @@ describe("waitForEventProcessorShutdown", () => {
expect(elapsed).toBeGreaterThanOrEqual(timeoutMs - 10)
})
})
describe("run with invalid model", () => {
it("given invalid --model value, when run, then returns exit code 1 with error message", async () => {
// given
const originalExit = process.exit
const originalError = console.error
const errorMessages: string[] = []
const exitCodes: number[] = []
console.error = (...args: unknown[]) => {
errorMessages.push(args.map(String).join(" "))
}
process.exit = ((code?: number) => {
exitCodes.push(code ?? 0)
throw new Error("exit")
}) as typeof process.exit
try {
// when
// Note: This will actually try to run - but the issue is that resolveRunModel
// is called BEFORE the try block, so it throws an unhandled exception
// We're testing the runner's error handling
const { run } = await import("./runner")
// This will throw because model "invalid" is invalid format
try {
await run({
message: "test",
model: "invalid",
})
} catch {
// Expected to potentially throw due to unhandled model resolution error
}
} finally {
// then - verify error handling
// Currently this will fail because the error is not caught properly
console.error = originalError
process.exit = originalExit
}
})
})

View File

@@ -47,10 +47,11 @@ export async function run(options: RunOptions): Promise<number> {
const pluginConfig = loadPluginConfig(directory, { command: "run" })
const resolvedAgent = resolveRunAgent(options, pluginConfig)
const resolvedModel = resolveRunModel(options.model)
const abortController = new AbortController()
try {
const resolvedModel = resolveRunModel(options.model)
const { client, cleanup: serverCleanup } = await createServerConnection({
port: options.port,
attach: options.attach,

View File

@@ -0,0 +1,56 @@
import { describe, expect, test } from "bun:test"
import { ZodError } from "zod/v4"
import { BackgroundTaskConfigSchema } from "./background-task"
describe("BackgroundTaskConfigSchema.circuitBreaker", () => {
describe("#given valid circuit breaker settings", () => {
test("#when parsed #then returns nested config", () => {
const result = BackgroundTaskConfigSchema.parse({
circuitBreaker: {
maxToolCalls: 150,
consecutiveThreshold: 10,
},
})
expect(result.circuitBreaker).toEqual({
maxToolCalls: 150,
consecutiveThreshold: 10,
})
})
})
describe("#given consecutiveThreshold below minimum", () => {
test("#when parsed #then throws ZodError", () => {
let thrownError: unknown
try {
BackgroundTaskConfigSchema.parse({
circuitBreaker: {
consecutiveThreshold: 4,
},
})
} catch (error) {
thrownError = error
}
expect(thrownError).toBeInstanceOf(ZodError)
})
})
describe("#given consecutiveThreshold is zero", () => {
test("#when parsed #then throws ZodError", () => {
let thrownError: unknown
try {
BackgroundTaskConfigSchema.parse({
circuitBreaker: {
consecutiveThreshold: 0,
},
})
} catch (error) {
thrownError = error
}
expect(thrownError).toBeInstanceOf(ZodError)
})
})
})

View File

@@ -1,5 +1,11 @@
import { z } from "zod"
const CircuitBreakerConfigSchema = z.object({
enabled: z.boolean().optional(),
maxToolCalls: z.number().int().min(10).optional(),
consecutiveThreshold: z.number().int().min(5).optional(),
})
export const BackgroundTaskConfigSchema = z.object({
defaultConcurrency: z.number().min(1).optional(),
providerConcurrency: z.record(z.string(), z.number().min(0)).optional(),
@@ -11,6 +17,9 @@ export const BackgroundTaskConfigSchema = z.object({
/** Timeout for tasks that never received any progress update, falling back to startedAt (default: 1800000 = 30 minutes, minimum: 60000 = 1 minute) */
messageStalenessTimeoutMs: z.number().min(60000).optional(),
syncPollTimeoutMs: z.number().min(60000).optional(),
/** Maximum tool calls per subagent task before circuit breaker triggers (default: 200, minimum: 10). Prevents runaway loops from burning unlimited tokens. */
maxToolCalls: z.number().int().min(10).optional(),
circuitBreaker: CircuitBreakerConfigSchema.optional(),
})
export type BackgroundTaskConfig = z.infer<typeof BackgroundTaskConfigSchema>
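
The nested `circuitBreaker` bounds amount to two optional integer range checks. A dependency-free sketch of the same validation (the `validateCircuitBreakerConfig` helper is hypothetical — the real enforcement is the Zod schema above):

```typescript
interface CircuitBreakerConfig {
  enabled?: boolean
  maxToolCalls?: number
  consecutiveThreshold?: number
}

// Mirrors the schema bounds: maxToolCalls is an integer >= 10 and
// consecutiveThreshold is an integer >= 5; both fields are optional.
function validateCircuitBreakerConfig(config: CircuitBreakerConfig): string[] {
  const errors: string[] = []
  const { maxToolCalls, consecutiveThreshold } = config
  if (maxToolCalls !== undefined && (!Number.isInteger(maxToolCalls) || maxToolCalls < 10)) {
    errors.push("maxToolCalls must be an integer >= 10")
  }
  if (consecutiveThreshold !== undefined && (!Number.isInteger(consecutiveThreshold) || consecutiveThreshold < 5)) {
    errors.push("consecutiveThreshold must be an integer >= 5")
  }
  return errors
}
```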

View File

@@ -51,7 +51,7 @@ export const HookNameSchema = z.enum([
"anthropic-effort",
"hashline-read-enhancer",
"read-image-resizer",
"openclaw-sender",
"todo-description-override",
])
export type HookName = z.infer<typeof HookNameSchema>

View File

@@ -12,7 +12,6 @@ import { BuiltinCommandNameSchema } from "./commands"
import { ExperimentalConfigSchema } from "./experimental"
import { GitMasterConfigSchema } from "./git-master"
import { NotificationConfigSchema } from "./notification"
import { OpenClawConfigSchema } from "./openclaw"
import { RalphLoopConfigSchema } from "./ralph-loop"
import { RuntimeFallbackConfigSchema } from "./runtime-fallback"
import { SkillsConfigSchema } from "./skills"
@@ -56,7 +55,6 @@ export const OhMyOpenCodeConfigSchema = z.object({
runtime_fallback: z.union([z.boolean(), RuntimeFallbackConfigSchema]).optional(),
background_task: BackgroundTaskConfigSchema.optional(),
notification: NotificationConfigSchema.optional(),
openclaw: OpenClawConfigSchema.optional(),
babysitting: BabysittingConfigSchema.optional(),
git_master: GitMasterConfigSchema.optional(),
browser_automation_engine: BrowserAutomationConfigSchema.optional(),

View File

@@ -1,51 +0,0 @@
import { z } from "zod";
export const OpenClawHookEventSchema = z.enum([
"session-start",
"session-end",
"session-idle",
"ask-user-question",
"stop",
]);
export const OpenClawHttpGatewayConfigSchema = z.object({
type: z.literal("http").optional(),
url: z.string(), // Allow looser URL validation as it might contain placeholders
headers: z.record(z.string(), z.string()).optional(),
method: z.enum(["POST", "PUT"]).optional(),
timeout: z.number().optional(),
});
export const OpenClawCommandGatewayConfigSchema = z.object({
type: z.literal("command"),
command: z.string(),
timeout: z.number().optional(),
});
export const OpenClawGatewayConfigSchema = z.union([
OpenClawHttpGatewayConfigSchema,
OpenClawCommandGatewayConfigSchema,
]);
export const OpenClawHookMappingSchema = z.object({
gateway: z.string(),
instruction: z.string(),
enabled: z.boolean(),
});
export const OpenClawConfigSchema = z.object({
enabled: z.boolean(),
gateways: z.record(z.string(), OpenClawGatewayConfigSchema),
hooks: z
.object({
"session-start": OpenClawHookMappingSchema.optional(),
"session-end": OpenClawHookMappingSchema.optional(),
"session-idle": OpenClawHookMappingSchema.optional(),
"ask-user-question": OpenClawHookMappingSchema.optional(),
stop: OpenClawHookMappingSchema.optional(),
})
.strict()
.optional(),
});
export type OpenClawConfig = z.infer<typeof OpenClawConfigSchema>;

View File

@@ -2,9 +2,13 @@ import type { PluginInput } from "@opencode-ai/plugin"
import type { BackgroundTask, LaunchInput } from "./types"
export const TASK_TTL_MS = 30 * 60 * 1000
export const TERMINAL_TASK_TTL_MS = 30 * 60 * 1000
export const MIN_STABILITY_TIME_MS = 10 * 1000
export const DEFAULT_STALE_TIMEOUT_MS = 180_000
export const DEFAULT_STALE_TIMEOUT_MS = 1_200_000
export const DEFAULT_MESSAGE_STALENESS_TIMEOUT_MS = 1_800_000
export const DEFAULT_MAX_TOOL_CALLS = 200
export const DEFAULT_CIRCUIT_BREAKER_CONSECUTIVE_THRESHOLD = 20
export const DEFAULT_CIRCUIT_BREAKER_ENABLED = true
export const MIN_RUNTIME_BEFORE_STALE_MS = 30_000
export const MIN_IDLE_TIME_MS = 5000
export const POLLING_INTERVAL_MS = 3000
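
The consecutive-call detection these constants feed (replacing the earlier sliding-window approach) can be sketched as a single counter: the same tool signature increments it, a different signature resets it, and the breaker trips at the threshold. This is a minimal standalone sketch, not the actual `loop-detector.ts` implementation:

```typescript
const DEFAULT_CIRCUIT_BREAKER_CONSECUTIVE_THRESHOLD = 20

interface ConsecutiveWindow {
  lastSignature: string
  count: number
}

// Same signature as the previous call increments the counter;
// any different signature resets it to 1.
function recordToolCall(window: ConsecutiveWindow | undefined, signature: string): ConsecutiveWindow {
  if (window && window.lastSignature === signature) {
    return { lastSignature: signature, count: window.count + 1 }
  }
  return { lastSignature: signature, count: 1 }
}

// Trips once the same signature has been seen threshold times in a row.
function isLooping(
  window: ConsecutiveWindow,
  threshold = DEFAULT_CIRCUIT_BREAKER_CONSECUTIVE_THRESHOLD,
): boolean {
  return window.count >= threshold
}
```

Compared with the percentage-of-window scheme it replaces, this makes the trip condition exact: 20 identical calls in a row always trip, and a single different call always resets.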

View File

@@ -0,0 +1,17 @@
import { describe, expect, test } from "bun:test"
import { DEFAULT_STALE_TIMEOUT_MS } from "./constants"
describe("DEFAULT_STALE_TIMEOUT_MS", () => {
test("uses a 20 minute default", () => {
// #given
const expectedTimeout = 20 * 60 * 1000
// #when
const timeout = DEFAULT_STALE_TIMEOUT_MS
// #then
expect(timeout).toBe(expectedTimeout)
})
})


@@ -0,0 +1,240 @@
import { describe, expect, test } from "bun:test"
import {
createToolCallSignature,
detectRepetitiveToolUse,
recordToolCall,
resolveCircuitBreakerSettings,
} from "./loop-detector"
function buildWindow(
toolNames: string[],
override?: Parameters<typeof resolveCircuitBreakerSettings>[0]
) {
const settings = resolveCircuitBreakerSettings(override)
return toolNames.reduce(
(window, toolName) => recordToolCall(window, toolName, settings),
undefined as ReturnType<typeof recordToolCall> | undefined
)
}
function buildWindowWithInputs(
calls: Array<{ tool: string; input?: Record<string, unknown> }>,
override?: Parameters<typeof resolveCircuitBreakerSettings>[0]
) {
const settings = resolveCircuitBreakerSettings(override)
return calls.reduce(
(window, { tool, input }) => recordToolCall(window, tool, settings, input),
undefined as ReturnType<typeof recordToolCall> | undefined
)
}
describe("loop-detector", () => {
describe("resolveCircuitBreakerSettings", () => {
describe("#given nested circuit breaker config", () => {
test("#when resolved #then nested values override defaults", () => {
const result = resolveCircuitBreakerSettings({
maxToolCalls: 200,
circuitBreaker: {
maxToolCalls: 120,
consecutiveThreshold: 7,
},
})
expect(result).toEqual({
enabled: true,
maxToolCalls: 120,
consecutiveThreshold: 7,
})
})
})
describe("#given no enabled config", () => {
test("#when resolved #then enabled defaults to true", () => {
const result = resolveCircuitBreakerSettings({
circuitBreaker: {
maxToolCalls: 100,
consecutiveThreshold: 5,
},
})
expect(result.enabled).toBe(true)
})
})
describe("#given enabled is false in config", () => {
test("#when resolved #then enabled is false", () => {
const result = resolveCircuitBreakerSettings({
circuitBreaker: {
enabled: false,
maxToolCalls: 100,
consecutiveThreshold: 5,
},
})
expect(result.enabled).toBe(false)
})
})
describe("#given enabled is true in config", () => {
test("#when resolved #then enabled is true", () => {
const result = resolveCircuitBreakerSettings({
circuitBreaker: {
enabled: true,
maxToolCalls: 100,
consecutiveThreshold: 5,
},
})
expect(result.enabled).toBe(true)
})
})
})
describe("createToolCallSignature", () => {
test("#given tool with input #when signature created #then includes tool and sorted input", () => {
const result = createToolCallSignature("read", { filePath: "/a.ts" })
expect(result).toBe('read::{"filePath":"/a.ts"}')
})
test("#given tool with undefined input #when signature created #then returns bare tool name", () => {
const result = createToolCallSignature("read", undefined)
expect(result).toBe("read")
})
test("#given tool with null input #when signature created #then returns bare tool name", () => {
const result = createToolCallSignature("read", null)
expect(result).toBe("read")
})
test("#given tool with empty object input #when signature created #then returns bare tool name", () => {
const result = createToolCallSignature("read", {})
expect(result).toBe("read")
})
test("#given same input different key order #when signatures compared #then they are equal", () => {
const first = createToolCallSignature("read", { filePath: "/a.ts", offset: 0 })
const second = createToolCallSignature("read", { offset: 0, filePath: "/a.ts" })
expect(first).toBe(second)
})
})
describe("detectRepetitiveToolUse", () => {
describe("#given recent tools are diverse", () => {
test("#when evaluated #then it does not trigger", () => {
const window = buildWindow([
"read",
"grep",
"edit",
"bash",
"read",
"glob",
"lsp_diagnostics",
"read",
"grep",
"edit",
])
const result = detectRepetitiveToolUse(window)
expect(result.triggered).toBe(false)
})
})
describe("#given the same tool is called consecutively", () => {
test("#when evaluated #then it triggers", () => {
const window = buildWindow(Array.from({ length: 20 }, () => "read"))
const result = detectRepetitiveToolUse(window)
expect(result).toEqual({
triggered: true,
toolName: "read",
repeatedCount: 20,
})
})
})
describe("#given consecutive calls are interrupted by different tool", () => {
test("#when evaluated #then it does not trigger", () => {
const window = buildWindow([
...Array.from({ length: 19 }, () => "read"),
"edit",
"read",
])
const result = detectRepetitiveToolUse(window)
expect(result).toEqual({ triggered: false })
})
})
describe("#given threshold boundary", () => {
test("#when below threshold #then it does not trigger", () => {
const belowThresholdWindow = buildWindow(Array.from({ length: 19 }, () => "read"))
const result = detectRepetitiveToolUse(belowThresholdWindow)
expect(result).toEqual({ triggered: false })
})
test("#when equal to threshold #then it triggers", () => {
const atThresholdWindow = buildWindow(Array.from({ length: 20 }, () => "read"))
const result = detectRepetitiveToolUse(atThresholdWindow)
expect(result).toEqual({
triggered: true,
toolName: "read",
repeatedCount: 20,
})
})
})
describe("#given same tool with different file inputs", () => {
test("#when evaluated #then it does not trigger", () => {
const calls = Array.from({ length: 20 }, (_, i) => ({
tool: "read",
input: { filePath: `/src/file-${i}.ts` },
}))
const window = buildWindowWithInputs(calls)
const result = detectRepetitiveToolUse(window)
expect(result.triggered).toBe(false)
})
})
describe("#given same tool with identical file inputs", () => {
test("#when evaluated #then it triggers with bare tool name", () => {
const calls = Array.from({ length: 20 }, () => ({
tool: "read",
input: { filePath: "/src/same.ts" },
}))
const window = buildWindowWithInputs(calls)
const result = detectRepetitiveToolUse(window)
expect(result).toEqual({
triggered: true,
toolName: "read",
repeatedCount: 20,
})
})
})
describe("#given tool calls with no input", () => {
test("#when evaluated #then it triggers", () => {
const calls = Array.from({ length: 20 }, () => ({ tool: "read" }))
const window = buildWindowWithInputs(calls)
const result = detectRepetitiveToolUse(window)
expect(result).toEqual({
triggered: true,
toolName: "read",
repeatedCount: 20,
})
})
})
})
})


@@ -0,0 +1,94 @@
import type { BackgroundTaskConfig } from "../../config/schema"
import {
DEFAULT_CIRCUIT_BREAKER_ENABLED,
DEFAULT_CIRCUIT_BREAKER_CONSECUTIVE_THRESHOLD,
DEFAULT_MAX_TOOL_CALLS,
} from "./constants"
import type { ToolCallWindow } from "./types"
export interface CircuitBreakerSettings {
enabled: boolean
maxToolCalls: number
consecutiveThreshold: number
}
export interface ToolLoopDetectionResult {
triggered: boolean
toolName?: string
repeatedCount?: number
}
export function resolveCircuitBreakerSettings(
config?: BackgroundTaskConfig
): CircuitBreakerSettings {
return {
enabled: config?.circuitBreaker?.enabled ?? DEFAULT_CIRCUIT_BREAKER_ENABLED,
maxToolCalls:
config?.circuitBreaker?.maxToolCalls ?? config?.maxToolCalls ?? DEFAULT_MAX_TOOL_CALLS,
consecutiveThreshold:
config?.circuitBreaker?.consecutiveThreshold ?? DEFAULT_CIRCUIT_BREAKER_CONSECUTIVE_THRESHOLD,
}
}
export function recordToolCall(
window: ToolCallWindow | undefined,
toolName: string,
settings: CircuitBreakerSettings,
toolInput?: Record<string, unknown> | null
): ToolCallWindow {
const signature = createToolCallSignature(toolName, toolInput)
if (window && window.lastSignature === signature) {
return {
lastSignature: signature,
consecutiveCount: window.consecutiveCount + 1,
threshold: settings.consecutiveThreshold,
}
}
return {
lastSignature: signature,
consecutiveCount: 1,
threshold: settings.consecutiveThreshold,
}
}
function sortObject(obj: unknown): unknown {
if (obj === null || obj === undefined) return obj
if (typeof obj !== "object") return obj
if (Array.isArray(obj)) return obj.map(sortObject)
const sorted: Record<string, unknown> = {}
const keys = Object.keys(obj as Record<string, unknown>).sort()
for (const key of keys) {
sorted[key] = sortObject((obj as Record<string, unknown>)[key])
}
return sorted
}
export function createToolCallSignature(
toolName: string,
toolInput?: Record<string, unknown> | null
): string {
if (toolInput === undefined || toolInput === null) {
return toolName
}
if (Object.keys(toolInput).length === 0) {
return toolName
}
return `${toolName}::${JSON.stringify(sortObject(toolInput))}`
}
export function detectRepetitiveToolUse(
window: ToolCallWindow | undefined
): ToolLoopDetectionResult {
if (!window || window.consecutiveCount < window.threshold) {
return { triggered: false }
}
return {
triggered: true,
toolName: window.lastSignature.split("::")[0],
repeatedCount: window.consecutiveCount,
}
}
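The consecutive-call mechanics above can be sketched in isolation. The helpers below re-state `recordToolCall`/`detectRepetitiveToolUse` in simplified form (no input signatures, no settings object) so the example is self-contained; they are not the plugin's actual exports.

```typescript
// Simplified restatement of the consecutive-call detector for illustration.
interface Window { lastSignature: string; consecutiveCount: number; threshold: number }

function record(w: Window | undefined, tool: string, threshold: number): Window {
  // Same tool as last time: extend the run; otherwise restart the counter at 1.
  if (w && w.lastSignature === tool) {
    return { lastSignature: tool, consecutiveCount: w.consecutiveCount + 1, threshold }
  }
  return { lastSignature: tool, consecutiveCount: 1, threshold }
}

function detect(w: Window | undefined): boolean {
  return !!w && w.consecutiveCount >= w.threshold
}

let w: Window | undefined
for (let i = 0; i < 19; i++) w = record(w, "read", 20)
const before = detect(w)      // false: 19 calls, still under the threshold
w = record(w, "edit", 20)     // a different tool resets the counter
w = record(w, "read", 20)
const afterReset = detect(w)  // false: the run restarted at 1
for (let i = 0; i < 19; i++) w = record(w, "read", 20)
const tripped = detect(w)     // true: 20 consecutive "read" calls
```

This mirrors the behavior the tests above assert: interruptions reset the run, and the breaker trips exactly at the threshold, not before.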


@@ -0,0 +1,387 @@
import { describe, expect, test } from "bun:test"
import type { PluginInput } from "@opencode-ai/plugin"
import { tmpdir } from "node:os"
import type { BackgroundTaskConfig } from "../../config/schema"
import { BackgroundManager } from "./manager"
import type { BackgroundTask } from "./types"
function createManager(config?: BackgroundTaskConfig): BackgroundManager {
const client = {
session: {
prompt: async () => ({}),
promptAsync: async () => ({}),
abort: async () => ({}),
},
}
const manager = new BackgroundManager({ client, directory: tmpdir() } as unknown as PluginInput, config)
const testManager = manager as unknown as {
enqueueNotificationForParent: (sessionID: string, fn: () => Promise<void>) => Promise<void>
notifyParentSession: (task: BackgroundTask) => Promise<void>
tasks: Map<string, BackgroundTask>
}
testManager.enqueueNotificationForParent = async (_sessionID, fn) => {
await fn()
}
testManager.notifyParentSession = async () => {}
return manager
}
function getTaskMap(manager: BackgroundManager): Map<string, BackgroundTask> {
return (manager as unknown as { tasks: Map<string, BackgroundTask> }).tasks
}
async function flushAsyncWork() {
await new Promise(resolve => setTimeout(resolve, 0))
}
describe("BackgroundManager circuit breaker", () => {
describe("#given the same tool is called consecutively", () => {
test("#when consecutive tool events arrive #then the task is cancelled", async () => {
const manager = createManager({
circuitBreaker: {
consecutiveThreshold: 20,
},
})
const task: BackgroundTask = {
id: "task-loop-1",
sessionID: "session-loop-1",
parentSessionID: "parent-1",
parentMessageID: "msg-1",
description: "Looping task",
prompt: "loop",
agent: "explore",
status: "running",
startedAt: new Date(Date.now() - 60_000),
progress: {
toolCalls: 0,
lastUpdate: new Date(Date.now() - 60_000),
},
}
getTaskMap(manager).set(task.id, task)
for (let i = 0; i < 20; i++) {
manager.handleEvent({
type: "message.part.updated",
properties: { sessionID: task.sessionID, type: "tool", tool: "read" },
})
}
await flushAsyncWork()
expect(task.status).toBe("cancelled")
expect(task.error).toContain("read 20 consecutive times")
})
})
describe("#given recent tool calls are diverse", () => {
test("#when the window fills #then the task keeps running", async () => {
const manager = createManager({
circuitBreaker: {
consecutiveThreshold: 10,
},
})
const task: BackgroundTask = {
id: "task-diverse-1",
sessionID: "session-diverse-1",
parentSessionID: "parent-1",
parentMessageID: "msg-1",
description: "Healthy task",
prompt: "work",
agent: "explore",
status: "running",
startedAt: new Date(Date.now() - 60_000),
progress: {
toolCalls: 0,
lastUpdate: new Date(Date.now() - 60_000),
},
}
getTaskMap(manager).set(task.id, task)
for (const toolName of [
"read",
"grep",
"edit",
"bash",
"glob",
"read",
"lsp_diagnostics",
"grep",
"edit",
"read",
]) {
manager.handleEvent({
type: "message.part.updated",
properties: { sessionID: task.sessionID, type: "tool", tool: toolName },
})
}
await flushAsyncWork()
expect(task.status).toBe("running")
expect(task.progress?.toolCalls).toBe(10)
})
})
describe("#given the absolute cap is configured lower than the repetition detector needs", () => {
test("#when the raw tool-call cap is reached #then the backstop still cancels the task", async () => {
const manager = createManager({
maxToolCalls: 3,
circuitBreaker: {
consecutiveThreshold: 95,
},
})
const task: BackgroundTask = {
id: "task-cap-1",
sessionID: "session-cap-1",
parentSessionID: "parent-1",
parentMessageID: "msg-1",
description: "Backstop task",
prompt: "work",
agent: "explore",
status: "running",
startedAt: new Date(Date.now() - 60_000),
progress: {
toolCalls: 0,
lastUpdate: new Date(Date.now() - 60_000),
},
}
getTaskMap(manager).set(task.id, task)
for (const toolName of ["read", "grep", "edit"]) {
manager.handleEvent({
type: "message.part.updated",
properties: { sessionID: task.sessionID, type: "tool", tool: toolName },
})
}
await flushAsyncWork()
expect(task.status).toBe("cancelled")
expect(task.error).toContain("maximum tool call limit (3)")
})
})
describe("#given the same running tool part emits multiple updates", () => {
test("#when duplicate running updates arrive #then it only counts the tool once", async () => {
const manager = createManager({
maxToolCalls: 2,
circuitBreaker: {
consecutiveThreshold: 5,
},
})
const task: BackgroundTask = {
id: "task-dedupe-1",
sessionID: "session-dedupe-1",
parentSessionID: "parent-1",
parentMessageID: "msg-1",
description: "Dedupe task",
prompt: "work",
agent: "explore",
status: "running",
startedAt: new Date(Date.now() - 60_000),
progress: {
toolCalls: 0,
lastUpdate: new Date(Date.now() - 60_000),
},
}
getTaskMap(manager).set(task.id, task)
for (let index = 0; index < 3; index += 1) {
manager.handleEvent({
type: "message.part.updated",
properties: {
part: {
id: "tool-1",
sessionID: task.sessionID,
type: "tool",
tool: "bash",
state: { status: "running" },
},
},
})
}
await flushAsyncWork()
expect(task.status).toBe("running")
expect(task.progress?.toolCalls).toBe(1)
expect(task.progress?.countedToolPartIDs).toEqual(new Set(["tool-1"]))
})
})
describe("#given same tool reading different files", () => {
test("#when tool events arrive with state.input #then task keeps running", async () => {
const manager = createManager({
circuitBreaker: {
consecutiveThreshold: 20,
},
})
const task: BackgroundTask = {
id: "task-diff-files-1",
sessionID: "session-diff-files-1",
parentSessionID: "parent-1",
parentMessageID: "msg-1",
description: "Reading different files",
prompt: "work",
agent: "explore",
status: "running",
startedAt: new Date(Date.now() - 60_000),
progress: {
toolCalls: 0,
lastUpdate: new Date(Date.now() - 60_000),
},
}
getTaskMap(manager).set(task.id, task)
for (let i = 0; i < 20; i++) {
manager.handleEvent({
type: "message.part.updated",
properties: {
part: {
sessionID: task.sessionID,
type: "tool",
tool: "read",
state: { status: "running", input: { filePath: `/src/file-${i}.ts` } },
},
},
})
}
await flushAsyncWork()
expect(task.status).toBe("running")
expect(task.progress?.toolCalls).toBe(20)
})
})
describe("#given same tool reading same file repeatedly", () => {
test("#when tool events arrive with state.input #then task is cancelled with bare tool name in error", async () => {
const manager = createManager({
circuitBreaker: {
consecutiveThreshold: 20,
},
})
const task: BackgroundTask = {
id: "task-same-file-1",
sessionID: "session-same-file-1",
parentSessionID: "parent-1",
parentMessageID: "msg-1",
description: "Reading same file repeatedly",
prompt: "work",
agent: "explore",
status: "running",
startedAt: new Date(Date.now() - 60_000),
progress: {
toolCalls: 0,
lastUpdate: new Date(Date.now() - 60_000),
},
}
getTaskMap(manager).set(task.id, task)
for (let i = 0; i < 20; i++) {
manager.handleEvent({
type: "message.part.updated",
properties: {
part: {
sessionID: task.sessionID,
type: "tool",
tool: "read",
state: { status: "running", input: { filePath: "/src/same.ts" } },
},
},
})
}
await flushAsyncWork()
expect(task.status).toBe("cancelled")
expect(task.error).toContain("read 20 consecutive times")
expect(task.error).not.toContain("::")
})
})
describe("#given circuit breaker enabled is false", () => {
test("#when repetitive tools arrive #then task keeps running", async () => {
const manager = createManager({
circuitBreaker: {
enabled: false,
consecutiveThreshold: 20,
},
})
const task: BackgroundTask = {
id: "task-disabled-1",
sessionID: "session-disabled-1",
parentSessionID: "parent-1",
parentMessageID: "msg-1",
description: "Disabled circuit breaker task",
prompt: "work",
agent: "explore",
status: "running",
startedAt: new Date(Date.now() - 60_000),
progress: {
toolCalls: 0,
lastUpdate: new Date(Date.now() - 60_000),
},
}
getTaskMap(manager).set(task.id, task)
for (let i = 0; i < 20; i++) {
manager.handleEvent({
type: "message.part.updated",
properties: {
sessionID: task.sessionID,
type: "tool",
tool: "read",
},
})
}
await flushAsyncWork()
expect(task.status).toBe("running")
})
})
describe("#given circuit breaker enabled is false but absolute cap is low", () => {
test("#when max tool calls exceeded #then task is still cancelled by absolute cap", async () => {
const manager = createManager({
maxToolCalls: 3,
circuitBreaker: {
enabled: false,
consecutiveThreshold: 95,
},
})
const task: BackgroundTask = {
id: "task-cap-disabled-1",
sessionID: "session-cap-disabled-1",
parentSessionID: "parent-1",
parentMessageID: "msg-1",
description: "Backstop task with disabled circuit breaker",
prompt: "work",
agent: "explore",
status: "running",
startedAt: new Date(Date.now() - 60_000),
progress: {
toolCalls: 0,
lastUpdate: new Date(Date.now() - 60_000),
},
}
getTaskMap(manager).set(task.id, task)
for (const toolName of ["read", "grep", "edit"]) {
manager.handleEvent({
type: "message.part.updated",
properties: { sessionID: task.sessionID, type: "tool", tool: toolName },
})
}
await flushAsyncWork()
expect(task.status).toBe("cancelled")
expect(task.error).toContain("maximum tool call limit (3)")
})
})
})


@@ -153,4 +153,42 @@ describe("BackgroundManager pollRunningTasks", () => {
expect(task.status).toBe("running")
})
})
describe("#given a running task whose session has terminal non-idle status", () => {
test('#when session status is "interrupted" #then completes the task', async () => {
//#given
const manager = createManagerWithClient({
status: async () => ({ data: { "ses-interrupted": { type: "interrupted" } } }),
})
const task = createRunningTask("ses-interrupted")
injectTask(manager, task)
//#when
const poll = (manager as unknown as { pollRunningTasks: () => Promise<void> }).pollRunningTasks
await poll.call(manager)
manager.shutdown()
//#then
expect(task.status).toBe("completed")
expect(task.completedAt).toBeDefined()
})
test('#when session status is an unknown type #then completes the task', async () => {
//#given
const manager = createManagerWithClient({
status: async () => ({ data: { "ses-unknown": { type: "some-weird-status" } } }),
})
const task = createRunningTask("ses-unknown")
injectTask(manager, task)
//#when
const poll = (manager as unknown as { pollRunningTasks: () => Promise<void> }).pollRunningTasks
await poll.call(manager)
manager.shutdown()
//#then
expect(task.status).toBe("completed")
expect(task.completedAt).toBeDefined()
})
})
})


@@ -3027,10 +3027,10 @@ describe("BackgroundManager.checkAndInterruptStaleTasks", () => {
prompt: "Test",
agent: "test-agent",
status: "running",
startedAt: new Date(Date.now() - 25 * 60 * 1000),
progress: {
toolCalls: 1,
lastUpdate: new Date(Date.now() - 21 * 60 * 1000),
},
}


@@ -27,6 +27,7 @@ import {
import {
POLLING_INTERVAL_MS,
TASK_CLEANUP_DELAY_MS,
TASK_TTL_MS,
} from "./constants"
import { subagentSessions } from "../claude-code-session-state"
@@ -51,6 +52,13 @@ import { join } from "node:path"
import { pruneStaleTasksAndNotifications } from "./task-poller"
import { checkAndInterruptStaleTasks } from "./task-poller"
import { removeTaskToastTracking } from "./remove-task-toast-tracking"
import { isActiveSessionStatus, isTerminalSessionStatus } from "./session-status-classifier"
import {
detectRepetitiveToolUse,
recordToolCall,
resolveCircuitBreakerSettings,
type CircuitBreakerSettings,
} from "./loop-detector"
import {
createSubagentDepthLimitError,
createSubagentDescendantLimitError,
@@ -64,9 +72,11 @@ type OpencodeClient = PluginInput["client"]
interface MessagePartInfo {
id?: string
sessionID?: string
type?: string
tool?: string
state?: { status?: string; input?: Record<string, unknown> }
}
interface EventProperties {
@@ -80,6 +90,19 @@ interface Event {
properties?: EventProperties
}
function resolveMessagePartInfo(properties: EventProperties | undefined): MessagePartInfo | undefined {
if (!properties || typeof properties !== "object") {
return undefined
}
const nestedPart = properties.part
if (nestedPart && typeof nestedPart === "object") {
return nestedPart as MessagePartInfo
}
return properties as MessagePartInfo
}
interface Todo {
content: string
status: string
@@ -100,6 +123,8 @@ export interface SubagentSessionCreatedEvent {
export type OnSubagentSessionCreated = (event: SubagentSessionCreatedEvent) => Promise<void>
const MAX_TASK_REMOVAL_RESCHEDULES = 6
export class BackgroundManager {
@@ -128,6 +153,7 @@ export class BackgroundManager {
private preStartDescendantReservations: Set<string>
private enableParentSessionNotifications: boolean
readonly taskHistory = new TaskHistory()
private cachedCircuitBreakerSettings?: CircuitBreakerSettings
constructor(
ctx: PluginInput,
@@ -720,6 +746,8 @@ export class BackgroundManager {
existingTask.progress = {
toolCalls: existingTask.progress?.toolCalls ?? 0,
toolCallWindow: existingTask.progress?.toolCallWindow,
countedToolPartIDs: existingTask.progress?.countedToolPartIDs,
lastUpdate: new Date(),
}
@@ -852,8 +880,7 @@ export class BackgroundManager {
}
if (event.type === "message.part.updated" || event.type === "message.part.delta") {
if (!props || typeof props !== "object" || !("sessionID" in props)) return
const partInfo = resolveMessagePartInfo(props)
const sessionID = partInfo?.sessionID
if (!sessionID) return
@@ -876,8 +903,65 @@ export class BackgroundManager {
task.progress.lastUpdate = new Date()
if (partInfo?.type === "tool" || partInfo?.tool) {
const countedToolPartIDs = task.progress.countedToolPartIDs ?? new Set<string>()
const shouldCountToolCall =
!partInfo.id ||
partInfo.state?.status !== "running" ||
!countedToolPartIDs.has(partInfo.id)
if (!shouldCountToolCall) {
return
}
if (partInfo.id && partInfo.state?.status === "running") {
countedToolPartIDs.add(partInfo.id)
task.progress.countedToolPartIDs = countedToolPartIDs
}
task.progress.toolCalls += 1
task.progress.lastTool = partInfo.tool
const circuitBreaker = this.cachedCircuitBreakerSettings ?? (this.cachedCircuitBreakerSettings = resolveCircuitBreakerSettings(this.config))
if (partInfo.tool) {
task.progress.toolCallWindow = recordToolCall(
task.progress.toolCallWindow,
partInfo.tool,
circuitBreaker,
partInfo.state?.input
)
if (circuitBreaker.enabled) {
const loopDetection = detectRepetitiveToolUse(task.progress.toolCallWindow)
if (loopDetection.triggered) {
log("[background-agent] Circuit breaker: consecutive tool usage detected", {
taskId: task.id,
agent: task.agent,
sessionID,
toolName: loopDetection.toolName,
repeatedCount: loopDetection.repeatedCount,
})
void this.cancelTask(task.id, {
source: "circuit-breaker",
reason: `Subagent called ${loopDetection.toolName} ${loopDetection.repeatedCount} consecutive times (threshold: ${circuitBreaker.consecutiveThreshold}). This usually indicates an infinite loop. The task was automatically cancelled to prevent excessive token usage.`,
})
return
}
}
}
const maxToolCalls = circuitBreaker.maxToolCalls
if (task.progress.toolCalls >= maxToolCalls) {
log("[background-agent] Circuit breaker: tool call limit reached", {
taskId: task.id,
toolCalls: task.progress.toolCalls,
maxToolCalls,
agent: task.agent,
sessionID,
})
void this.cancelTask(task.id, {
source: "circuit-breaker",
reason: `Subagent exceeded maximum tool call limit (${maxToolCalls}). This usually indicates an infinite loop. The task was automatically cancelled to prevent excessive token usage.`,
})
}
}
}
@@ -1188,7 +1272,7 @@ export class BackgroundManager {
this.completedTaskSummaries.delete(parentSessionID)
}
private scheduleTaskRemoval(taskId: string, rescheduleCount = 0): void {
const existingTimer = this.completionTimers.get(taskId)
if (existingTimer) {
clearTimeout(existingTimer)
@@ -1198,17 +1282,29 @@ export class BackgroundManager {
const timer = setTimeout(() => {
this.completionTimers.delete(taskId)
const task = this.tasks.get(taskId)
if (!task) return
if (task.parentSessionID) {
const siblings = this.getTasksByParentSession(task.parentSessionID)
const runningOrPendingSiblings = siblings.filter(
sibling => sibling.id !== taskId && (sibling.status === "running" || sibling.status === "pending"),
)
const completedAtTimestamp = task.completedAt?.getTime()
const reachedTaskTtl = completedAtTimestamp !== undefined && (Date.now() - completedAtTimestamp) >= TASK_TTL_MS
if (runningOrPendingSiblings.length > 0 && rescheduleCount < MAX_TASK_REMOVAL_RESCHEDULES && !reachedTaskTtl) {
this.scheduleTaskRemoval(taskId, rescheduleCount + 1)
return
}
}
this.clearNotificationsForTask(taskId)
this.tasks.delete(taskId)
this.clearTaskHistoryWhenParentTasksGone(task.parentSessionID)
if (task.sessionID) {
subagentSessions.delete(task.sessionID)
SessionCategoryRegistry.remove(task.sessionID)
}
log("[background-agent] Removed completed task from memory:", taskId)
}, TASK_CLEANUP_DELAY_MS)
this.completionTimers.set(taskId, timer)
@@ -1688,11 +1784,9 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
}
}
// Only skip completion when session status is actively running.
// Unknown or terminal statuses (like "interrupted") fall through to completion.
if (sessionStatus && isActiveSessionStatus(sessionStatus.type)) {
log("[background-agent] Session still running, relying on event-based progress:", {
taskId: task.id,
sessionID,
@@ -1702,6 +1796,24 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
continue
}
// Explicit terminal non-idle status (e.g., "interrupted") — complete immediately,
// skipping output validation (session will never produce more output).
// Unknown statuses fall through to the idle/gone path with output validation.
if (sessionStatus && isTerminalSessionStatus(sessionStatus.type)) {
await this.tryCompleteTask(task, `polling (terminal session status: ${sessionStatus.type})`)
continue
}
// Unknown non-idle status — not active, not terminal, not idle.
// Fall through to idle/gone completion path with output validation.
if (sessionStatus && sessionStatus.type !== "idle") {
log("[background-agent] Unknown session status, treating as potentially idle:", {
taskId: task.id,
sessionID,
sessionStatus: sessionStatus.type,
})
}
// Session is idle or no longer in status response (completed/disappeared)
const completionSource = sessionStatus?.type === "idle"
? "polling (idle status)"
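The removal-rescheduling change in `scheduleTaskRemoval` above boils down to one decision: defer cleanup while sibling tasks are still running or pending, but give up once the reschedule cap or the task TTL is hit. A minimal sketch of that predicate, with simplified names standing in for `MAX_TASK_REMOVAL_RESCHEDULES` and `TASK_TTL_MS`:

```typescript
// Sketch of the deferral decision; names simplified, not the plugin's API.
const MAX_RESCHEDULES = 6
const TTL_MS = 30 * 60 * 1000

function shouldDeferRemoval(
  runningOrPendingSiblings: number,
  rescheduleCount: number,
  completedAt: number | undefined,
  now: number,
): boolean {
  // Once the completed task has aged past the TTL, stop deferring regardless of siblings.
  const reachedTtl = completedAt !== undefined && now - completedAt >= TTL_MS
  return runningOrPendingSiblings > 0 && rescheduleCount < MAX_RESCHEDULES && !reachedTtl
}

const now = Date.now()
const deferred = shouldDeferRemoval(2, 0, now - 60_000, now) // true: siblings active, cap and TTL not reached
const capped = shouldDeferRemoval(2, 6, now - 60_000, now)   // false: reschedule cap reached
const expired = shouldDeferRemoval(2, 0, now - TTL_MS, now)  // false: TTL elapsed
```

The cap and TTL act as backstops so a permanently stuck sibling cannot keep a completed task in memory forever.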


@@ -0,0 +1,66 @@
import { describe, test, expect, mock } from "bun:test"
import { isActiveSessionStatus, isTerminalSessionStatus } from "./session-status-classifier"
const mockLog = mock()
mock.module("../../shared", () => ({ log: mockLog }))
describe("isActiveSessionStatus", () => {
describe("#given a known active session status", () => {
test('#when type is "busy" #then returns true', () => {
expect(isActiveSessionStatus("busy")).toBe(true)
})
test('#when type is "retry" #then returns true', () => {
expect(isActiveSessionStatus("retry")).toBe(true)
})
test('#when type is "running" #then returns true', () => {
expect(isActiveSessionStatus("running")).toBe(true)
})
})
describe("#given a known terminal session status", () => {
test('#when type is "idle" #then returns false', () => {
expect(isActiveSessionStatus("idle")).toBe(false)
})
test('#when type is "interrupted" #then returns false and does not log', () => {
mockLog.mockClear()
expect(isActiveSessionStatus("interrupted")).toBe(false)
expect(mockLog).not.toHaveBeenCalled()
})
})
describe("#given an unknown session status", () => {
test('#when type is an arbitrary unknown string #then returns false and logs warning', () => {
mockLog.mockClear()
expect(isActiveSessionStatus("some-unknown-status")).toBe(false)
expect(mockLog).toHaveBeenCalledWith(
"[background-agent] Unknown session status type encountered:",
"some-unknown-status",
)
})
test('#when type is empty string #then returns false', () => {
expect(isActiveSessionStatus("")).toBe(false)
})
})
})
describe("isTerminalSessionStatus", () => {
test('#when type is "interrupted" #then returns true', () => {
expect(isTerminalSessionStatus("interrupted")).toBe(true)
})
test('#when type is "idle" #then returns false (idle is handled separately)', () => {
expect(isTerminalSessionStatus("idle")).toBe(false)
})
test('#when type is "busy" #then returns false', () => {
expect(isTerminalSessionStatus("busy")).toBe(false)
})
test('#when type is an unknown string #then returns false', () => {
expect(isTerminalSessionStatus("some-unknown")).toBe(false)
})
})


@@ -0,0 +1,20 @@
import { log } from "../../shared"
const ACTIVE_SESSION_STATUSES = new Set(["busy", "retry", "running"])
const KNOWN_TERMINAL_STATUSES = new Set(["idle", "interrupted"])
export function isActiveSessionStatus(type: string): boolean {
if (ACTIVE_SESSION_STATUSES.has(type)) {
return true
}
if (!KNOWN_TERMINAL_STATUSES.has(type)) {
log("[background-agent] Unknown session status type encountered:", type)
}
return false
}
export function isTerminalSessionStatus(type: string): boolean {
return KNOWN_TERMINAL_STATUSES.has(type) && type !== "idle"
}


@@ -1,6 +1,5 @@
import { tmpdir } from "node:os"
import { afterEach, describe, expect, test } from "bun:test"
import type { PluginInput } from "@opencode-ai/plugin"
import { TASK_CLEANUP_DELAY_MS } from "./constants"
import { BackgroundManager } from "./manager"
@@ -157,17 +156,19 @@ function getRequiredTimer(manager: BackgroundManager, taskID: string): ReturnTyp
}
describe("BackgroundManager.notifyParentSession cleanup scheduling", () => {
describe("#given 3 tasks for same parent and task A completed first", () => {
test("#when siblings are still running or pending #then task A remains until siblings also complete", async () => {
// given
const { manager } = createManager(false)
managerUnderTest = manager
fakeTimers = installFakeTimers()
const taskA = createTask({ id: "task-a", parentSessionID: "parent-1", description: "task A", status: "completed", completedAt: new Date() })
const taskB = createTask({ id: "task-b", parentSessionID: "parent-1", description: "task B", status: "running" })
const taskC = createTask({ id: "task-c", parentSessionID: "parent-1", description: "task C", status: "pending" })
getTasks(manager).set(taskA.id, taskA)
getTasks(manager).set(taskB.id, taskB)
getPendingByParent(manager).set(taskA.parentSessionID, new Set([taskA.id, taskB.id]))
getTasks(manager).set(taskC.id, taskC)
getPendingByParent(manager).set(taskA.parentSessionID, new Set([taskA.id, taskB.id, taskC.id]))
// when
await notifyParentSessionForTest(manager, taskA)
@@ -177,8 +178,23 @@ describe("BackgroundManager.notifyParentSession cleanup scheduling", () => {
// then
expect(fakeTimers.getDelay(taskATimer)).toBeUndefined()
expect(getTasks(manager).has(taskA.id)).toBe(false)
expect(getTasks(manager).has(taskA.id)).toBe(true)
expect(getTasks(manager).get(taskB.id)).toBe(taskB)
expect(getTasks(manager).get(taskC.id)).toBe(taskC)
// when
taskB.status = "completed"
taskB.completedAt = new Date()
taskC.status = "completed"
taskC.completedAt = new Date()
await notifyParentSessionForTest(manager, taskB)
await notifyParentSessionForTest(manager, taskC)
const rescheduledTaskATimer = getRequiredTimer(manager, taskA.id)
expect(fakeTimers.getDelay(rescheduledTaskATimer)).toBe(TASK_CLEANUP_DELAY_MS)
fakeTimers.run(rescheduledTaskATimer)
// then
expect(getTasks(manager).has(taskA.id)).toBe(false)
})
})

View File

@@ -417,6 +417,56 @@ describe("checkAndInterruptStaleTasks", () => {
expect(task.status).toBe("cancelled")
expect(onTaskInterrupted).toHaveBeenCalledWith(task)
})
it('should NOT protect task when session has terminal non-idle status like "interrupted"', async () => {
//#given — lastUpdate is 5min old, session is "interrupted" (terminal, not active)
const task = createRunningTask({
startedAt: new Date(Date.now() - 300_000),
progress: {
toolCalls: 2,
lastUpdate: new Date(Date.now() - 300_000),
},
})
//#when — session status is "interrupted" (terminal)
await checkAndInterruptStaleTasks({
tasks: [task],
client: mockClient as never,
config: { staleTimeoutMs: 180_000 },
concurrencyManager: mockConcurrencyManager as never,
notifyParentSession: mockNotify,
sessionStatuses: { "ses-1": { type: "interrupted" } },
})
//#then — terminal statuses should not protect from stale timeout
expect(task.status).toBe("cancelled")
expect(task.error).toContain("Stale timeout")
})
it('should NOT protect task when session has unknown status type', async () => {
//#given — lastUpdate is 5min old, session has an unknown status
const task = createRunningTask({
startedAt: new Date(Date.now() - 300_000),
progress: {
toolCalls: 2,
lastUpdate: new Date(Date.now() - 300_000),
},
})
//#when — session has unknown status type
await checkAndInterruptStaleTasks({
tasks: [task],
client: mockClient as never,
config: { staleTimeoutMs: 180_000 },
concurrencyManager: mockConcurrencyManager as never,
notifyParentSession: mockNotify,
sessionStatuses: { "ses-1": { type: "some-weird-status" } },
})
//#then — unknown statuses should not protect from stale timeout
expect(task.status).toBe("cancelled")
expect(task.error).toContain("Stale timeout")
})
})
describe("pruneStaleTasksAndNotifications", () => {

View File

@@ -9,12 +9,12 @@ import {
DEFAULT_MESSAGE_STALENESS_TIMEOUT_MS,
DEFAULT_STALE_TIMEOUT_MS,
MIN_RUNTIME_BEFORE_STALE_MS,
TERMINAL_TASK_TTL_MS,
TASK_TTL_MS,
} from "./constants"
import { removeTaskToastTracking } from "./remove-task-toast-tracking"
const TERMINAL_TASK_TTL_MS = 30 * 60 * 1000
import { isActiveSessionStatus } from "./session-status-classifier"
const TERMINAL_TASK_STATUSES = new Set<BackgroundTask["status"]>([
"completed",
"error",
@@ -121,7 +121,7 @@ export async function checkAndInterruptStaleTasks(args: {
if (!startedAt || !sessionID) continue
const sessionStatus = sessionStatuses?.[sessionID]?.type
const sessionIsRunning = sessionStatus !== undefined && sessionStatus !== "idle"
const sessionIsRunning = sessionStatus !== undefined && isActiveSessionStatus(sessionStatus)
const runtime = now - startedAt.getTime()
if (!task.progress?.lastUpdate) {

View File

@@ -9,9 +9,17 @@ export type BackgroundTaskStatus =
| "cancelled"
| "interrupt"
export interface ToolCallWindow {
lastSignature: string
consecutiveCount: number
threshold: number
}
export interface TaskProgress {
toolCalls: number
lastTool?: string
toolCallWindow?: ToolCallWindow
countedToolPartIDs?: Set<string>
lastUpdate: Date
lastMessage?: string
lastMessageAt?: Date

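The `ToolCallWindow` fields added above support the consecutive-call loop detection described in the commit log: the breaker trips when the same tool signature repeats N times in a row, and any different tool resets the counter. A minimal sketch of that logic, assuming the field names carry the obvious meanings (the actual breaker implementation is not shown in this diff):

```typescript
// Hedged sketch of consecutive same-tool detection using the ToolCallWindow
// shape from the diff above. recordToolCall is a hypothetical helper name.
interface ToolCallWindow {
  lastSignature: string
  consecutiveCount: number
  threshold: number
}

function recordToolCall(win: ToolCallWindow, signature: string): boolean {
  if (signature === win.lastSignature) {
    win.consecutiveCount++
  } else {
    // A different tool signature resets the streak.
    win.lastSignature = signature
    win.consecutiveCount = 1
  }
  // true means the streak reached the threshold: likely a loop.
  return win.consecutiveCount >= win.threshold
}

const win: ToolCallWindow = { lastSignature: "", consecutiveCount: 0, threshold: 3 }
console.log(recordToolCall(win, "bash")) // false
console.log(recordToolCall(win, "bash")) // false
console.log(recordToolCall(win, "bash")) // true
```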
View File

@@ -59,10 +59,13 @@ export function appendSessionId(directory: string, sessionId: string): BoulderSt
if (!Array.isArray(state.session_ids)) {
state.session_ids = []
}
const originalSessionIds = [...state.session_ids]
state.session_ids.push(sessionId)
if (writeBoulderState(directory, state)) {
return state
}
state.session_ids = originalSessionIds
return null
}
return state
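The hunk above adds a rollback: the in-memory `session_ids` mutation is undone when persisting the state fails, so memory and disk do not drift apart. The pattern in isolation, with a stand-in writer function (`writeState` is a placeholder for `writeBoulderState`):

```typescript
// Sketch of mutate-then-rollback-on-failed-write, as in appendSessionId above.
type State = { session_ids: string[] }

function appendWithRollback(
  state: State,
  id: string,
  writeState: (s: State) => boolean, // stand-in for writeBoulderState
): State | null {
  const snapshot = [...state.session_ids]
  state.session_ids.push(id)
  if (writeState(state)) return state
  // Write failed: restore the pre-mutation array so memory matches disk.
  state.session_ids = snapshot
  return null
}

const s: State = { session_ids: ["a"] }
console.log(appendWithRollback(s, "b", () => false)) // null
console.log(s.session_ids) // ["a"]: the push was rolled back
```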

View File

@@ -7,7 +7,7 @@ export const START_WORK_TEMPLATE = `You are starting a Sisyphus work session.
- \`--worktree <path>\` (optional): absolute path to an existing git worktree to work in
- If specified and valid: hook pre-sets worktree_path in boulder.json
- If specified but invalid: you must run \`git worktree add <path> <branch>\` first
- If omitted: you MUST choose or create a worktree (see Worktree Setup below)
- If omitted: work directly in the current project directory (no worktree)
## WHAT TO DO
@@ -24,7 +24,7 @@ export const START_WORK_TEMPLATE = `You are starting a Sisyphus work session.
- If ONE plan: auto-select it
- If MULTIPLE plans: show list with timestamps, ask user to select
4. **Worktree Setup** (when \`worktree_path\` not already set in boulder.json):
4. **Worktree Setup** (ONLY when \`--worktree\` was explicitly specified and \`worktree_path\` not already set in boulder.json):
1. \`git worktree list --porcelain\` — see available worktrees
2. Create: \`git worktree add <absolute-path> <branch-or-HEAD>\`
3. Update boulder.json to add \`"worktree_path": "<absolute-path>"\`
@@ -86,6 +86,38 @@ Reading plan and beginning execution...
- The session_id is injected by the hook - use it directly
- Always update boulder.json BEFORE starting work
- Always set worktree_path in boulder.json before executing any tasks
- If worktree_path is set in boulder.json, all work happens inside that worktree directory
- Read the FULL plan file before delegating any tasks
- Follow atlas delegation protocols (7-section format)`
- Follow atlas delegation protocols (7-section format)
## TASK BREAKDOWN (MANDATORY)
After reading the plan file, you MUST decompose every plan task into granular, implementation-level sub-steps and register ALL of them as task/todo items BEFORE starting any work.
**How to break down**:
- Each plan checkbox item (e.g., \`- [ ] Add user authentication\`) must be split into concrete, actionable sub-tasks
- Sub-tasks should be specific enough that each one touches a clear set of files/functions
- Include: file to modify, what to change, expected behavior, and how to verify
- Do NOT leave any task vague — "implement feature X" is NOT acceptable; "add validateToken() to src/auth/middleware.ts that checks JWT expiry and returns 401" IS acceptable
**Example breakdown**:
Plan task: \`- [ ] Add rate limiting to API\`
→ Todo items:
1. Create \`src/middleware/rate-limiter.ts\` with sliding window algorithm (max 100 req/min per IP)
2. Add RateLimiter middleware to \`src/app.ts\` router chain, before auth middleware
3. Add rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining) to response in \`rate-limiter.ts\`
4. Add test: verify 429 response after exceeding limit in \`src/middleware/rate-limiter.test.ts\`
5. Add test: verify headers are present on normal responses
Register these as task/todo items so progress is tracked and visible throughout the session.
## WORKTREE COMPLETION
When working in a worktree (\`worktree_path\` is set in boulder.json) and ALL plan tasks are complete:
1. Commit all remaining changes in the worktree
2. Switch to the main working directory (the original repo, NOT the worktree)
3. Merge the worktree branch into the current branch: \`git merge <worktree-branch>\`
4. If merge succeeds, clean up: \`git worktree remove <worktree-path>\`
5. Remove the boulder.json state
This is the DEFAULT behavior when \`--worktree\` was used. Skip merge only if the user explicitly instructs otherwise (e.g., asks to create a PR instead).`

View File

@@ -153,3 +153,25 @@ describe("#given git_env_prefix with commit footer", () => {
})
})
})
describe("#given idempotency of prefixGitCommandsInBashCodeBlocks", () => {
describe("#when git_env_prefix is provided and template already has prefixed commands in env prefix section", () => {
it("#then does NOT double-prefix the already-prefixed commands", () => {
const result = injectGitMasterConfig(SAMPLE_TEMPLATE, {
commit_footer: false,
include_co_authored_by: false,
git_env_prefix: "GIT_MASTER=1",
})
expect(result).not.toContain("GIT_MASTER=1 GIT_MASTER=1 git status")
expect(result).not.toContain("GIT_MASTER=1 GIT_MASTER=1 git add")
expect(result).not.toContain("GIT_MASTER=1 GIT_MASTER=1 git commit")
expect(result).not.toContain("GIT_MASTER=1 GIT_MASTER=1 git push")
expect(result).toContain("GIT_MASTER=1 git status")
expect(result).toContain("GIT_MASTER=1 git add")
expect(result).toContain("GIT_MASTER=1 git commit")
expect(result).toContain("GIT_MASTER=1 git push")
})
})
})

View File

@@ -72,8 +72,16 @@ function prefixGitCommandsInBashCodeBlocks(template: string, prefix: string): st
function prefixGitCommandsInCodeBlock(codeBlock: string, prefix: string): string {
return codeBlock
.replace(LEADING_GIT_COMMAND_PATTERN, `$1${prefix} git`)
.replace(INLINE_GIT_COMMAND_PATTERN, `$1${prefix} git`)
.split("\n")
.map((line) => {
if (line.includes(prefix)) {
return line
}
return line
.replace(LEADING_GIT_COMMAND_PATTERN, `$1${prefix} git`)
.replace(INLINE_GIT_COMMAND_PATTERN, `$1${prefix} git`)
})
.join("\n")
}
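The fix above makes prefixing idempotent: each line already containing the prefix is returned untouched, so re-running the injection cannot produce `GIT_MASTER=1 GIT_MASTER=1 git ...`. A runnable sketch, with simplified regexes standing in for `LEADING_GIT_COMMAND_PATTERN` and `INLINE_GIT_COMMAND_PATTERN` (the real patterns are not shown in this diff):

```typescript
// Idempotent git-command prefixing, per the diff above. The two regexes are
// illustrative assumptions, not the plugin's actual patterns.
const LEADING = /^(\s*)git\b/
const INLINE = /(&&\s*)git\b/g

function prefixGit(block: string, prefix: string): string {
  return block
    .split("\n")
    .map((line) =>
      line.includes(prefix)
        ? line // already prefixed: skip, so repeated passes are no-ops
        : line.replace(LEADING, `$1${prefix} git`).replace(INLINE, `$1${prefix} git`),
    )
    .join("\n")
}

const once = prefixGit("git status\ncd repo && git add .", "GIT_MASTER=1")
const twice = prefixGit(once, "GIT_MASTER=1")
console.log(once === twice) // true: a second pass changes nothing
```

The `line.includes(prefix)` guard is coarse (it skips a line containing the prefix anywhere), which is exactly what the new idempotency test in this comparison verifies against double-prefixing.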
function buildCommitFooterInjection(

View File

@@ -199,3 +199,236 @@ describe("EXCLUDED_ENV_PATTERNS", () => {
}
})
})
describe("secret env var filtering", () => {
it("filters out ANTHROPIC_API_KEY", () => {
// given
process.env.ANTHROPIC_API_KEY = "sk-ant-api03-secret"
process.env.PATH = "/usr/bin"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.ANTHROPIC_API_KEY).toBeUndefined()
expect(cleanEnv.PATH).toBe("/usr/bin")
})
it("filters out AWS_SECRET_ACCESS_KEY", () => {
// given
process.env.AWS_SECRET_ACCESS_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
process.env.AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"
process.env.HOME = "/home/user"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.AWS_SECRET_ACCESS_KEY).toBeUndefined()
expect(cleanEnv.AWS_ACCESS_KEY_ID).toBeUndefined()
expect(cleanEnv.HOME).toBe("/home/user")
})
it("filters out GITHUB_TOKEN", () => {
// given
process.env.GITHUB_TOKEN = "ghp_secrettoken123456789"
process.env.GITHUB_API_TOKEN = "another_secret_token"
process.env.SHELL = "/bin/bash"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.GITHUB_TOKEN).toBeUndefined()
expect(cleanEnv.GITHUB_API_TOKEN).toBeUndefined()
expect(cleanEnv.SHELL).toBe("/bin/bash")
})
it("filters out OPENAI_API_KEY", () => {
// given
process.env.OPENAI_API_KEY = "sk-secret123456789"
process.env.LANG = "en_US.UTF-8"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.OPENAI_API_KEY).toBeUndefined()
expect(cleanEnv.LANG).toBe("en_US.UTF-8")
})
it("filters out DATABASE_URL with credentials", () => {
// given
process.env.DATABASE_URL = "postgresql://user:password@localhost:5432/db"
process.env.DB_PASSWORD = "supersecretpassword"
process.env.TERM = "xterm-256color"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.DATABASE_URL).toBeUndefined()
expect(cleanEnv.DB_PASSWORD).toBeUndefined()
expect(cleanEnv.TERM).toBe("xterm-256color")
})
})
describe("suffix-based secret filtering", () => {
it("filters variables ending with _KEY", () => {
// given
process.env.MY_API_KEY = "secret-value"
process.env.SOME_KEY = "another-secret"
process.env.TMPDIR = "/tmp"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.MY_API_KEY).toBeUndefined()
expect(cleanEnv.SOME_KEY).toBeUndefined()
expect(cleanEnv.TMPDIR).toBe("/tmp")
})
it("filters variables ending with _SECRET", () => {
// given
process.env.AWS_SECRET = "secret-value"
process.env.JWT_SECRET = "jwt-secret-token"
process.env.USER = "testuser"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.AWS_SECRET).toBeUndefined()
expect(cleanEnv.JWT_SECRET).toBeUndefined()
expect(cleanEnv.USER).toBe("testuser")
})
it("filters variables ending with _TOKEN", () => {
// given
process.env.ACCESS_TOKEN = "token-value"
process.env.BEARER_TOKEN = "bearer-token"
process.env.HOME = "/home/user"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.ACCESS_TOKEN).toBeUndefined()
expect(cleanEnv.BEARER_TOKEN).toBeUndefined()
expect(cleanEnv.HOME).toBe("/home/user")
})
it("filters variables ending with _PASSWORD", () => {
// given
process.env.DB_PASSWORD = "db-password"
process.env.APP_PASSWORD = "app-secret"
process.env.NODE_ENV = "production"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.DB_PASSWORD).toBeUndefined()
expect(cleanEnv.APP_PASSWORD).toBeUndefined()
expect(cleanEnv.NODE_ENV).toBe("production")
})
it("filters variables ending with _CREDENTIAL", () => {
// given
process.env.GCP_CREDENTIAL = "json-credential"
process.env.AZURE_CREDENTIAL = "azure-creds"
process.env.PWD = "/current/dir"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.GCP_CREDENTIAL).toBeUndefined()
expect(cleanEnv.AZURE_CREDENTIAL).toBeUndefined()
expect(cleanEnv.PWD).toBe("/current/dir")
})
it("filters variables ending with _API_KEY", () => {
// given
process.env.STRIPE_API_KEY = "sk_live_secret"
process.env.SENDGRID_API_KEY = "SG.secret"
process.env.SHELL = "/bin/zsh"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.STRIPE_API_KEY).toBeUndefined()
expect(cleanEnv.SENDGRID_API_KEY).toBeUndefined()
expect(cleanEnv.SHELL).toBe("/bin/zsh")
})
})
describe("safe environment variables preserved", () => {
it("preserves PATH", () => {
// given
process.env.PATH = "/usr/bin:/usr/local/bin"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.PATH).toBe("/usr/bin:/usr/local/bin")
})
it("preserves HOME", () => {
// given
process.env.HOME = "/home/testuser"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.HOME).toBe("/home/testuser")
})
it("preserves SHELL", () => {
// given
process.env.SHELL = "/bin/bash"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.SHELL).toBe("/bin/bash")
})
it("preserves LANG", () => {
// given
process.env.LANG = "en_US.UTF-8"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.LANG).toBe("en_US.UTF-8")
})
it("preserves TERM", () => {
// given
process.env.TERM = "xterm-256color"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.TERM).toBe("xterm-256color")
})
it("preserves TMPDIR", () => {
// given
process.env.TMPDIR = "/tmp"
// when
const cleanEnv = createCleanMcpEnvironment()
// then
expect(cleanEnv.TMPDIR).toBe("/tmp")
})
})

View File

@@ -1,10 +1,28 @@
// Filters npm/pnpm/yarn config env vars that break MCP servers in pnpm projects (#456)
// Also filters secret-containing env vars to prevent exposure to malicious stdio MCP servers (#B-02)
export const EXCLUDED_ENV_PATTERNS: RegExp[] = [
// npm/pnpm/yarn config patterns (original)
/^NPM_CONFIG_/i,
/^npm_config_/,
/^YARN_/,
/^PNPM_/,
/^NO_UPDATE_NOTIFIER$/,
// Specific high-risk secret env vars (explicit blocks)
/^ANTHROPIC_API_KEY$/i,
/^AWS_ACCESS_KEY_ID$/i,
/^AWS_SECRET_ACCESS_KEY$/i,
/^GITHUB_TOKEN$/i,
/^DATABASE_URL$/i,
/^OPENAI_API_KEY$/i,
// Suffix-based patterns for common secret naming conventions
/_KEY$/i,
/_SECRET$/i,
/_TOKEN$/i,
/_PASSWORD$/i,
/_CREDENTIAL$/i,
/_API_KEY$/i,
]
export function createCleanMcpEnvironment(

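The body of `createCleanMcpEnvironment` is truncated in this diff, but the pattern list above implies the filtering shape: drop any env var whose name matches one of the `EXCLUDED_ENV_PATTERNS`, pass everything else through. A hedged sketch of that application (the function and variable names here are assumptions where they differ from the diff):

```typescript
// Likely application of EXCLUDED_ENV_PATTERNS; the real implementation body
// is not visible in this comparison. Subset of patterns for illustration.
const EXCLUDED_ENV_PATTERNS: RegExp[] = [
  /^NPM_CONFIG_/i,
  /^GITHUB_TOKEN$/i,
  /_KEY$/i,
  /_SECRET$/i,
  /_TOKEN$/i,
  /_PASSWORD$/i,
]

function cleanEnv(env: Record<string, string | undefined>): Record<string, string> {
  const out: Record<string, string> = {}
  for (const [name, value] of Object.entries(env)) {
    if (value === undefined) continue
    // Any pattern match means the variable is withheld from the MCP server.
    if (EXCLUDED_ENV_PATTERNS.some((p) => p.test(name))) continue
    out[name] = value
  }
  return out
}

const result = cleanEnv({ PATH: "/usr/bin", GITHUB_TOKEN: "ghp_x", MY_API_KEY: "k" })
console.log(result) // only PATH survives; the token and key are filtered
```

Note the suffix patterns are case-insensitive (`/i`), so `my_api_key` is filtered the same as `MY_API_KEY`, matching the test expectations earlier in this comparison.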
View File

@@ -279,6 +279,116 @@ describe("TaskToastManager", () => {
})
})
describe("model name display in task line", () => {
test("should show model name before category when modelInfo exists", () => {
// given - a task with category and modelInfo
const task = {
id: "task_model_display",
description: "Build UI component",
agent: "sisyphus-junior",
isBackground: true,
category: "deep",
modelInfo: { model: "openai/gpt-5.3-codex", type: "category-default" as const },
}
// when - addTask is called
toastManager.addTask(task)
// then - toast should show model name before category like "gpt-5.3-codex: deep"
const call = mockClient.tui.showToast.mock.calls[0][0]
expect(call.body.message).toContain("gpt-5.3-codex: deep")
expect(call.body.message).not.toContain("sisyphus-junior/deep")
})
test("should strip provider prefix from model name", () => {
// given - a task with provider-prefixed model
const task = {
id: "task_strip_provider",
description: "Fix styles",
agent: "sisyphus-junior",
isBackground: false,
category: "visual-engineering",
modelInfo: { model: "google/gemini-3.1-pro", type: "category-default" as const },
}
// when - addTask is called
toastManager.addTask(task)
// then - should show model ID without provider prefix
const call = mockClient.tui.showToast.mock.calls[0][0]
expect(call.body.message).toContain("gemini-3.1-pro: visual-engineering")
})
test("should fall back to agent/category format when no modelInfo", () => {
// given - a task without modelInfo
const task = {
id: "task_no_model",
description: "Quick fix",
agent: "sisyphus-junior",
isBackground: true,
category: "quick",
}
// when - addTask is called
toastManager.addTask(task)
// then - should use old format with agent name
const call = mockClient.tui.showToast.mock.calls[0][0]
expect(call.body.message).toContain("sisyphus-junior/quick")
})
test("should show model name without category when category is absent", () => {
// given - a task with modelInfo but no category
const task = {
id: "task_model_no_cat",
description: "Explore codebase",
agent: "explore",
isBackground: true,
modelInfo: { model: "anthropic/claude-sonnet-4-6", type: "category-default" as const },
}
// when - addTask is called
toastManager.addTask(task)
// then - should show just the model name in parens
const call = mockClient.tui.showToast.mock.calls[0][0]
expect(call.body.message).toContain("(claude-sonnet-4-6)")
})
test("should show model name in queued tasks too", () => {
// given - a concurrency manager that limits to 1
const limitedConcurrency = {
getConcurrencyLimit: mock(() => 1),
} as unknown as ConcurrencyManager
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const limitedManager = new TaskToastManager(mockClient as any, limitedConcurrency)
limitedManager.addTask({
id: "task_running",
description: "Running task",
agent: "sisyphus-junior",
isBackground: true,
category: "deep",
modelInfo: { model: "openai/gpt-5.3-codex", type: "category-default" as const },
})
limitedManager.addTask({
id: "task_queued",
description: "Queued task",
agent: "sisyphus-junior",
isBackground: true,
category: "quick",
status: "queued",
modelInfo: { model: "anthropic/claude-haiku-4-5", type: "category-default" as const },
})
// when - the queued task toast fires
const lastCall = mockClient.tui.showToast.mock.calls[1][0]
// then - queued task should also show model name
expect(lastCall.body.message).toContain("claude-haiku-4-5: quick")
})
})
describe("updateTaskModelBySession", () => {
test("updates task model info and shows fallback toast", () => {
// given - task without model info

View File

@@ -127,6 +127,13 @@ export class TaskToastManager {
const queued = this.getQueuedTasks()
const concurrencyInfo = this.getConcurrencyInfo()
const formatTaskIdentifier = (task: TrackedTask): string => {
const modelName = task.modelInfo?.model?.split("/").pop()
if (modelName && task.category) return `${modelName}: ${task.category}`
if (modelName) return modelName
if (task.category) return `${task.agent}/${task.category}`
return task.agent
}
const lines: string[] = []
const isFallback = newTask.modelInfo && (
@@ -151,9 +158,9 @@ export class TaskToastManager {
const duration = this.formatDuration(task.startedAt)
const bgIcon = task.isBackground ? "[BG]" : "[RUN]"
const isNew = task.id === newTask.id ? " ← NEW" : ""
const categoryInfo = task.category ? `/${task.category}` : ""
const taskId = formatTaskIdentifier(task)
const skillsInfo = task.skills?.length ? ` [${task.skills.join(", ")}]` : ""
lines.push(`${bgIcon} ${task.description} (${task.agent}${categoryInfo})${skillsInfo} - ${duration}${isNew}`)
lines.push(`${bgIcon} ${task.description} (${taskId})${skillsInfo} - ${duration}${isNew}`)
}
}
@@ -162,10 +169,10 @@ export class TaskToastManager {
lines.push(`Queued (${queued.length}):`)
for (const task of queued) {
const bgIcon = task.isBackground ? "[Q]" : "[W]"
const categoryInfo = task.category ? `/${task.category}` : ""
const taskId = formatTaskIdentifier(task)
const skillsInfo = task.skills?.length ? ` [${task.skills.join(", ")}]` : ""
const isNew = task.id === newTask.id ? " ← NEW" : ""
lines.push(`${bgIcon} ${task.description} (${task.agent}${categoryInfo})${skillsInfo} - Queued${isNew}`)
lines.push(`${bgIcon} ${task.description} (${taskId})${skillsInfo} - Queued${isNew}`)
}
}
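The `formatTaskIdentifier` helper added above establishes a precedence: model name (provider prefix stripped via `split("/").pop()`) with category, then model alone, then the old `agent/category` fallback, then bare agent. Extracted as a standalone sketch:

```typescript
// Standalone copy of the identifier precedence from the diff above;
// the Task type is trimmed to the fields the helper reads.
type Task = { agent: string; category?: string; modelInfo?: { model: string } }

function formatTaskIdentifier(task: Task): string {
  // "openai/gpt-5.3-codex" -> "gpt-5.3-codex"
  const modelName = task.modelInfo?.model?.split("/").pop()
  if (modelName && task.category) return `${modelName}: ${task.category}`
  if (modelName) return modelName
  if (task.category) return `${task.agent}/${task.category}` // legacy fallback
  return task.agent
}

console.log(formatTaskIdentifier({ agent: "sisyphus-junior", category: "deep", modelInfo: { model: "openai/gpt-5.3-codex" } })) // "gpt-5.3-codex: deep"
console.log(formatTaskIdentifier({ agent: "sisyphus-junior", category: "deep" })) // "sisyphus-junior/deep"
console.log(formatTaskIdentifier({ agent: "explore" })) // "explore"
```

These three cases correspond directly to the toast-manager tests earlier in this comparison (model with category, fallback without modelInfo, model without category).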

View File

@@ -70,7 +70,7 @@ function isTokenLimitError(text: string): boolean {
return false
}
const lower = text.toLowerCase()
return TOKEN_LIMIT_KEYWORDS.some((kw) => lower.includes(kw.toLowerCase()))
return TOKEN_LIMIT_KEYWORDS.some((kw) => lower.includes(kw))
}
export function parseAnthropicTokenLimitError(err: unknown): ParsedTokenLimitError | null {

View File

@@ -18,9 +18,9 @@ function getLastAgentFromMessageDir(messageDir: string): string | null {
const files = readdirSync(messageDir)
.filter((fileName) => fileName.endsWith(".json"))
.sort()
.reverse()
for (const fileName of files) {
for (let i = files.length - 1; i >= 0; i--) {
const fileName = files[i]
try {
const content = readFileSync(join(messageDir, fileName), "utf-8")
const parsed = JSON.parse(content) as { agent?: unknown }

View File

@@ -44,12 +44,6 @@ export interface ExecutorOptions {
agent?: string
}
function filterDiscoveredCommandsByScope(
commands: DiscoveredCommandInfo[],
scope: DiscoveredCommandInfo["scope"],
): DiscoveredCommandInfo[] {
return commands.filter(command => command.scope === scope)
}
async function discoverAllCommands(options?: ExecutorOptions): Promise<CommandInfo[]> {
const discoveredCommands = discoverCommandsSync(process.cwd(), {
@@ -60,14 +54,18 @@ async function discoverAllCommands(options?: ExecutorOptions): Promise<CommandIn
const skills = options?.skills ?? await discoverAllSkills()
const skillCommands = skills.map(skillToCommandInfo)
const scopeOrder: DiscoveredCommandInfo["scope"][] = ["project", "user", "opencode-project", "opencode", "builtin", "plugin"]
const grouped = new Map<string, DiscoveredCommandInfo[]>()
for (const cmd of discoveredCommands) {
const list = grouped.get(cmd.scope) ?? []
list.push(cmd)
grouped.set(cmd.scope, list)
}
const orderedCommands = scopeOrder.flatMap((scope) => grouped.get(scope) ?? [])
return [
...skillCommands,
...filterDiscoveredCommandsByScope(discoveredCommands, "project"),
...filterDiscoveredCommandsByScope(discoveredCommands, "user"),
...filterDiscoveredCommandsByScope(discoveredCommands, "opencode-project"),
...filterDiscoveredCommandsByScope(discoveredCommands, "opencode"),
...filterDiscoveredCommandsByScope(discoveredCommands, "builtin"),
...filterDiscoveredCommandsByScope(discoveredCommands, "plugin"),
...orderedCommands,
]
}
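The refactor above replaces six `filter` passes (one per scope, O(6n)) with a single pass that buckets commands by scope in a `Map`, then flattens in the fixed scope order. The grouping pattern in isolation:

```typescript
// Single-pass Map grouping, as in discoverAllCommands above; types and the
// scope list are trimmed for illustration.
type Cmd = { name: string; scope: string }
const scopeOrder = ["project", "user", "builtin"]

function orderByScope(cmds: Cmd[]): Cmd[] {
  const grouped = new Map<string, Cmd[]>()
  for (const cmd of cmds) {
    // One iteration builds every bucket; unknown scopes are simply dropped
    // by the flatMap below, matching the fixed scopeOrder behavior.
    const list = grouped.get(cmd.scope) ?? []
    list.push(cmd)
    grouped.set(cmd.scope, list)
  }
  return scopeOrder.flatMap((scope) => grouped.get(scope) ?? [])
}

const out = orderByScope([
  { name: "b", scope: "builtin" },
  { name: "p", scope: "project" },
  { name: "u", scope: "user" },
])
console.log(out.map((c) => c.name).join(",")) // "p,u,b"
```

Relative order within a scope is preserved (stable insertion order), which the old per-scope `filter` calls also guaranteed, so the refactor is behavior-preserving.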

View File

@@ -1,6 +1,9 @@
import type { PluginInput } from "@opencode-ai/plugin"
import { existsSync } from "node:fs"
import { join } from "node:path"
import { runBunInstallWithDetails } from "../../../cli/config-manager"
import { log } from "../../../shared/logger"
import { getOpenCodeCacheDir, getOpenCodeConfigPaths } from "../../../shared"
import { invalidatePackage } from "../cache"
import { PACKAGE_NAME } from "../constants"
import { extractChannel } from "../version-channel"
@@ -11,9 +14,36 @@ function getPinnedVersionToastMessage(latestVersion: string): string {
return `Update available: ${latestVersion} (version pinned, update manually)`
}
async function runBunInstallSafe(): Promise<boolean> {
/**
* Resolves the active install workspace.
* Same logic as doctor check: prefer config-dir if installed, fall back to cache-dir.
*/
function resolveActiveInstallWorkspace(): string {
const configPaths = getOpenCodeConfigPaths({ binary: "opencode" })
const cacheDir = getOpenCodeCacheDir()
const configInstallPath = join(configPaths.configDir, "node_modules", PACKAGE_NAME, "package.json")
const cacheInstallPath = join(cacheDir, "node_modules", PACKAGE_NAME, "package.json")
// Prefer config-dir if installed there, otherwise fall back to cache-dir
if (existsSync(configInstallPath)) {
log(`[auto-update-checker] Active workspace: config-dir (${configPaths.configDir})`)
return configPaths.configDir
}
if (existsSync(cacheInstallPath)) {
log(`[auto-update-checker] Active workspace: cache-dir (${cacheDir})`)
return cacheDir
}
// Default to config-dir if neither exists (matches doctor behavior)
log(`[auto-update-checker] Active workspace: config-dir (default, no install detected)`)
return configPaths.configDir
}
async function runBunInstallSafe(workspaceDir: string): Promise<boolean> {
try {
const result = await runBunInstallWithDetails({ outputMode: "pipe" })
const result = await runBunInstallWithDetails({ outputMode: "pipe", workspaceDir })
if (!result.success && result.error) {
log("[auto-update-checker] bun install error:", result.error)
}
@@ -82,7 +112,8 @@ export async function runBackgroundUpdateCheck(
invalidatePackage(PACKAGE_NAME)
const installSuccess = await runBunInstallSafe()
const activeWorkspace = resolveActiveInstallWorkspace()
const installSuccess = await runBunInstallSafe(activeWorkspace)
if (installSuccess) {
await showAutoUpdatedToast(ctx, currentVersion, latestVersion)

View File

@@ -0,0 +1,223 @@
import type { PluginInput } from "@opencode-ai/plugin"
import { afterEach, beforeEach, describe, expect, it, mock } from "bun:test"
import { existsSync, mkdirSync, rmSync, writeFileSync } from "node:fs"
import { join } from "node:path"
type PluginEntry = {
entry: string
isPinned: boolean
pinnedVersion: string | null
configPath: string
}
type ToastMessageGetter = (isUpdate: boolean, version?: string) => string
function createPluginEntry(overrides?: Partial<PluginEntry>): PluginEntry {
return {
entry: "oh-my-opencode@3.4.0",
isPinned: false,
pinnedVersion: null,
configPath: "/test/opencode.json",
...overrides,
}
}
const TEST_DIR = join(import.meta.dir, "__test-workspace-resolution__")
const TEST_CACHE_DIR = join(TEST_DIR, "cache")
const TEST_CONFIG_DIR = join(TEST_DIR, "config")
const mockFindPluginEntry = mock((_directory: string): PluginEntry | null => createPluginEntry())
const mockGetCachedVersion = mock((): string | null => "3.4.0")
const mockGetLatestVersion = mock(async (): Promise<string | null> => "3.5.0")
const mockExtractChannel = mock(() => "latest")
const mockInvalidatePackage = mock(() => {})
const mockShowUpdateAvailableToast = mock(
async (_ctx: PluginInput, _latestVersion: string, _getToastMessage: ToastMessageGetter): Promise<void> => {}
)
const mockShowAutoUpdatedToast = mock(
async (_ctx: PluginInput, _fromVersion: string, _toVersion: string): Promise<void> => {}
)
const mockSyncCachePackageJsonToIntent = mock(() => ({ synced: true, error: null }))
const mockRunBunInstallWithDetails = mock(
async (opts?: { outputMode?: string; workspaceDir?: string }) => {
return { success: true }
}
)
mock.module("../checker", () => ({
findPluginEntry: mockFindPluginEntry,
getCachedVersion: mockGetCachedVersion,
getLatestVersion: mockGetLatestVersion,
revertPinnedVersion: mock(() => false),
syncCachePackageJsonToIntent: mockSyncCachePackageJsonToIntent,
}))
mock.module("../version-channel", () => ({ extractChannel: mockExtractChannel }))
mock.module("../cache", () => ({ invalidatePackage: mockInvalidatePackage }))
mock.module("../../../cli/config-manager", () => ({
runBunInstallWithDetails: mockRunBunInstallWithDetails,
}))
mock.module("./update-toasts", () => ({
showUpdateAvailableToast: mockShowUpdateAvailableToast,
showAutoUpdatedToast: mockShowAutoUpdatedToast,
}))
mock.module("../../../shared/logger", () => ({ log: () => {} }))
mock.module("../../../shared", () => ({
getOpenCodeCacheDir: () => TEST_CACHE_DIR,
getOpenCodeConfigPaths: () => ({
configDir: TEST_CONFIG_DIR,
configJson: join(TEST_CONFIG_DIR, "opencode.json"),
configJsonc: join(TEST_CONFIG_DIR, "opencode.jsonc"),
packageJson: join(TEST_CONFIG_DIR, "package.json"),
omoConfig: join(TEST_CONFIG_DIR, "oh-my-opencode.json"),
}),
getOpenCodeConfigDir: () => TEST_CONFIG_DIR,
}))
// Mock constants BEFORE importing the module
const ORIGINAL_PACKAGE_NAME = "oh-my-opencode"
mock.module("../constants", () => ({
PACKAGE_NAME: ORIGINAL_PACKAGE_NAME,
CACHE_DIR: TEST_CACHE_DIR,
USER_CONFIG_DIR: TEST_CONFIG_DIR,
}))
// Need to mock getOpenCodeCacheDir and getOpenCodeConfigPaths before importing the module
mock.module("../../../shared/data-path", () => ({
getDataDir: () => join(TEST_DIR, "data"),
getOpenCodeStorageDir: () => join(TEST_DIR, "data", "opencode", "storage"),
getCacheDir: () => TEST_DIR,
getOmoOpenCodeCacheDir: () => join(TEST_DIR, "oh-my-opencode"),
getOpenCodeCacheDir: () => TEST_CACHE_DIR,
}))
mock.module("../../../shared/opencode-config-dir", () => ({
getOpenCodeConfigDir: () => TEST_CONFIG_DIR,
getOpenCodeConfigPaths: () => ({
configDir: TEST_CONFIG_DIR,
configJson: join(TEST_CONFIG_DIR, "opencode.json"),
configJsonc: join(TEST_CONFIG_DIR, "opencode.jsonc"),
packageJson: join(TEST_CONFIG_DIR, "package.json"),
omoConfig: join(TEST_CONFIG_DIR, "oh-my-opencode.json"),
}),
}))
const modulePath = "./background-update-check?test"
const { runBackgroundUpdateCheck } = await import(modulePath)
describe("workspace resolution", () => {
const mockCtx = { directory: "/test" } as PluginInput
const getToastMessage: ToastMessageGetter = (isUpdate, version) =>
isUpdate ? `Update to ${version}` : "Up to date"
beforeEach(() => {
// Setup test directories
if (existsSync(TEST_DIR)) {
rmSync(TEST_DIR, { recursive: true, force: true })
}
mkdirSync(TEST_DIR, { recursive: true })
mockFindPluginEntry.mockReset()
mockGetCachedVersion.mockReset()
mockGetLatestVersion.mockReset()
mockExtractChannel.mockReset()
mockInvalidatePackage.mockReset()
mockRunBunInstallWithDetails.mockReset()
mockShowUpdateAvailableToast.mockReset()
mockShowAutoUpdatedToast.mockReset()
mockFindPluginEntry.mockReturnValue(createPluginEntry())
mockGetCachedVersion.mockReturnValue("3.4.0")
mockGetLatestVersion.mockResolvedValue("3.5.0")
mockExtractChannel.mockReturnValue("latest")
// Note: Don't use mockResolvedValue here - it overrides the function that captures args
mockSyncCachePackageJsonToIntent.mockReturnValue({ synced: true, error: null })
})
afterEach(() => {
if (existsSync(TEST_DIR)) {
rmSync(TEST_DIR, { recursive: true, force: true })
}
})
describe("#given config-dir install exists but cache-dir does not", () => {
it("installs to config-dir, not cache-dir", async () => {
//#given - config-dir has installation, cache-dir does not
mkdirSync(join(TEST_CONFIG_DIR, "node_modules", "oh-my-opencode"), { recursive: true })
writeFileSync(
join(TEST_CONFIG_DIR, "package.json"),
JSON.stringify({ dependencies: { "oh-my-opencode": "3.4.0" } }, null, 2)
)
writeFileSync(
join(TEST_CONFIG_DIR, "node_modules", "oh-my-opencode", "package.json"),
JSON.stringify({ name: "oh-my-opencode", version: "3.4.0" }, null, 2)
)
// cache-dir should NOT exist
expect(existsSync(TEST_CACHE_DIR)).toBe(false)
//#when
await runBackgroundUpdateCheck(mockCtx, true, getToastMessage)
//#then - install should be called with config-dir
const mockCalls = mockRunBunInstallWithDetails.mock.calls
expect(mockCalls[0][0]?.workspaceDir).toBe(TEST_CONFIG_DIR)
})
})
describe("#given both config-dir and cache-dir exist", () => {
it("prefers config-dir over cache-dir", async () => {
//#given - both directories have installations
mkdirSync(join(TEST_CONFIG_DIR, "node_modules", "oh-my-opencode"), { recursive: true })
writeFileSync(
join(TEST_CONFIG_DIR, "package.json"),
JSON.stringify({ dependencies: { "oh-my-opencode": "3.4.0" } }, null, 2)
)
writeFileSync(
join(TEST_CONFIG_DIR, "node_modules", "oh-my-opencode", "package.json"),
JSON.stringify({ name: "oh-my-opencode", version: "3.4.0" }, null, 2)
)
mkdirSync(join(TEST_CACHE_DIR, "node_modules", "oh-my-opencode"), { recursive: true })
writeFileSync(
join(TEST_CACHE_DIR, "package.json"),
JSON.stringify({ dependencies: { "oh-my-opencode": "3.4.0" } }, null, 2)
)
writeFileSync(
join(TEST_CACHE_DIR, "node_modules", "oh-my-opencode", "package.json"),
JSON.stringify({ name: "oh-my-opencode", version: "3.4.0" }, null, 2)
)
//#when
await runBackgroundUpdateCheck(mockCtx, true, getToastMessage)
//#then - install should prefer config-dir
const mockCalls2 = mockRunBunInstallWithDetails.mock.calls
expect(mockCalls2[0][0]?.workspaceDir).toBe(TEST_CONFIG_DIR)
})
})
describe("#given only cache-dir install exists", () => {
it("falls back to cache-dir", async () => {
//#given - only cache-dir has installation
mkdirSync(join(TEST_CACHE_DIR, "node_modules", "oh-my-opencode"), { recursive: true })
writeFileSync(
join(TEST_CACHE_DIR, "package.json"),
JSON.stringify({ dependencies: { "oh-my-opencode": "3.4.0" } }, null, 2)
)
writeFileSync(
join(TEST_CACHE_DIR, "node_modules", "oh-my-opencode", "package.json"),
JSON.stringify({ name: "oh-my-opencode", version: "3.4.0" }, null, 2)
)
// config-dir should NOT exist
expect(existsSync(TEST_CONFIG_DIR)).toBe(false)
//#when
await runBackgroundUpdateCheck(mockCtx, true, getToastMessage)
//#then - install should fall back to cache-dir
const mockCalls3 = mockRunBunInstallWithDetails.mock.calls
expect(mockCalls3[0][0]?.workspaceDir).toBe(TEST_CACHE_DIR)
})
})
})


@@ -79,8 +79,6 @@ export function createToolExecuteAfterHandler(ctx: PluginInput, config: PluginCo
return
}
const claudeConfig = await loadClaudeHooksConfig()
const extendedConfig = await loadPluginExtendedConfig()
const cachedInput = getToolInput(input.sessionID, input.tool, input.callID) || {}
@@ -96,6 +94,9 @@ export function createToolExecuteAfterHandler(ctx: PluginInput, config: PluginCo
return
}
const claudeConfig = await loadClaudeHooksConfig()
const extendedConfig = await loadPluginExtendedConfig()
const postClient: PostToolUseClient = {
session: {
messages: (opts) => ctx.client.session.messages(opts),


@@ -43,8 +43,6 @@ export function createToolExecuteBeforeHandler(ctx: PluginInput, config: PluginC
log("todowrite: parsed todos string to array", { sessionID: input.sessionID })
}
const claudeConfig = await loadClaudeHooksConfig()
const extendedConfig = await loadPluginExtendedConfig()
appendTranscriptEntry(input.sessionID, {
type: "tool_use",
@@ -59,6 +57,9 @@ export function createToolExecuteBeforeHandler(ctx: PluginInput, config: PluginC
return
}
const claudeConfig = await loadClaudeHooksConfig()
const extendedConfig = await loadPluginExtendedConfig()
const preCtx: PreToolUseContext = {
sessionId: input.sessionID,
toolName: input.tool,


@@ -3,6 +3,18 @@ import type { CommentCheckerConfig } from "../../config/schema"
import z from "zod"
const ApplyPatchMetadataSchema = z.object({
files: z.array(
z.object({
filePath: z.string(),
movePath: z.string().optional(),
before: z.string(),
after: z.string(),
type: z.string().optional(),
}),
),
})
import {
initializeCommentCheckerCli,
getCommentCheckerCliPathPromise,
@@ -104,17 +116,6 @@ export function createCommentCheckerHooks(config?: CommentCheckerConfig) {
return
}
const ApplyPatchMetadataSchema = z.object({
files: z.array(
z.object({
filePath: z.string(),
movePath: z.string().optional(),
before: z.string(),
after: z.string(),
type: z.string().optional(),
}),
),
})
if (toolLower === "apply_patch") {
const parsed = ApplyPatchMetadataSchema.safeParse(output.metadata)


@@ -52,3 +52,4 @@ export { createWriteExistingFileGuardHook } from "./write-existing-file-guard";
export { createHashlineReadEnhancerHook } from "./hashline-read-enhancer";
export { createJsonErrorRecoveryHook, JSON_ERROR_TOOL_EXCLUDE_LIST, JSON_ERROR_PATTERNS, JSON_ERROR_REMINDER } from "./json-error-recovery";
export { createReadImageResizerHook } from "./read-image-resizer"
export { createTodoDescriptionOverrideHook } from "./todo-description-override"


@@ -1,70 +0,0 @@
import { wakeOpenClaw } from "../../openclaw/client";
import type { OpenClawConfig, OpenClawContext } from "../../openclaw/types";
import { getMainSessionID } from "../../features/claude-code-session-state";
import type { PluginContext } from "../../plugin/types";
export function createOpenClawSenderHook(
ctx: PluginContext,
config: OpenClawConfig
) {
return {
event: async (input: {
event: { type: string; properties?: Record<string, unknown> };
}) => {
const { type, properties } = input.event;
const info = properties?.info as Record<string, unknown> | undefined;
const context: OpenClawContext = {
sessionId:
(properties?.sessionID as string) ||
(info?.id as string) ||
getMainSessionID(),
projectPath: ctx.directory,
};
if (type === "session.created") {
await wakeOpenClaw("session-start", context, config);
} else if (type === "session.idle") {
await wakeOpenClaw("session-idle", context, config);
} else if (type === "session.deleted") {
await wakeOpenClaw("session-end", context, config);
}
},
"tool.execute.before": async (
input: { tool: string; sessionID: string },
output: { args: Record<string, unknown> }
) => {
const toolName = input.tool.toLowerCase();
const context: OpenClawContext = {
sessionId: input.sessionID,
projectPath: ctx.directory,
};
if (
toolName === "ask_user_question" ||
toolName === "askuserquestion" ||
toolName === "question"
) {
const question =
typeof output.args.question === "string"
? output.args.question
: undefined;
await wakeOpenClaw(
"ask-user-question",
{
...context,
question,
},
config
);
} else if (toolName === "skill") {
const rawName =
typeof output.args.name === "string" ? output.args.name : undefined;
const command = rawName?.replace(/^\//, "").toLowerCase();
if (command === "stop-continuation") {
await wakeOpenClaw("stop", context, config);
}
}
},
};
}


@@ -23,6 +23,10 @@ export async function handleDetectedCompletion(
const { sessionID, state, loopState, directory, apiTimeoutMs } = input
if (state.ultrawork && !state.verification_pending) {
if (state.verification_session_id) {
ctx.client.session.abort({ path: { id: state.verification_session_id } }).catch(() => {})
}
const verificationState = loopState.markVerificationPending(sessionID)
if (!verificationState) {
log(`[${HOOK_NAME}] Failed to transition ultrawork loop to verification`, {


@@ -1,11 +1,96 @@
import type { PluginInput } from "@opencode-ai/plugin"
import { log } from "../../shared/logger"
import { HOOK_NAME } from "./constants"
import { ULTRAWORK_VERIFICATION_PROMISE } from "./constants"
import type { RalphLoopState } from "./types"
import { handleFailedVerification } from "./verification-failure-handler"
import { withTimeout } from "./with-timeout"
type OpenCodeSessionMessage = {
info?: { role?: string }
parts?: Array<{ type?: string; text?: string }>
}
const ORACLE_AGENT_PATTERN = /Agent:\s*oracle/i
const TASK_METADATA_SESSION_PATTERN = /<task_metadata>[\s\S]*?session_id:\s*([^\s<]+)[\s\S]*?<\/task_metadata>/i
const VERIFIED_PROMISE_PATTERN = new RegExp(
`<promise>\\s*${ULTRAWORK_VERIFICATION_PROMISE}\\s*<\\/promise>`,
"i",
)
function collectAssistantText(message: OpenCodeSessionMessage): string {
if (!Array.isArray(message.parts)) {
return ""
}
let text = ""
for (const part of message.parts) {
if (part.type !== "text") {
continue
}
text += `${text ? "\n" : ""}${part.text ?? ""}`
}
return text
}
async function detectOracleVerificationFromParentSession(
ctx: PluginInput,
parentSessionID: string,
directory: string,
apiTimeoutMs: number,
): Promise<string | undefined> {
try {
const response = await withTimeout(
ctx.client.session.messages({
path: { id: parentSessionID },
query: { directory },
}),
apiTimeoutMs,
)
const messagesResponse: unknown = response
const responseData =
typeof messagesResponse === "object" && messagesResponse !== null && "data" in messagesResponse
? (messagesResponse as { data?: unknown }).data
: undefined
const messageArray: unknown[] = Array.isArray(messagesResponse)
? messagesResponse
: Array.isArray(responseData)
? responseData
: []
for (let index = messageArray.length - 1; index >= 0; index -= 1) {
const message = messageArray[index] as OpenCodeSessionMessage
if (message.info?.role !== "assistant") {
continue
}
const assistantText = collectAssistantText(message)
if (!VERIFIED_PROMISE_PATTERN.test(assistantText) || !ORACLE_AGENT_PATTERN.test(assistantText)) {
continue
}
const sessionMatch = assistantText.match(TASK_METADATA_SESSION_PATTERN)
const detectedOracleSessionID = sessionMatch?.[1]?.trim()
if (detectedOracleSessionID) {
return detectedOracleSessionID
}
}
return undefined
} catch (error) {
log(`[${HOOK_NAME}] Failed to scan parent session for oracle verification evidence`, {
parentSessionID,
error: String(error),
})
return undefined
}
}
type LoopStateController = {
restartAfterFailedVerification: (sessionID: string, messageCountAtStart?: number) => RalphLoopState | null
setVerificationSessionID: (sessionID: string, verificationSessionID: string) => RalphLoopState | null
}
export async function handlePendingVerification(
@@ -33,6 +118,29 @@ export async function handlePendingVerification(
} = input
if (matchesParentSession || (verificationSessionID && matchesVerificationSession)) {
if (!verificationSessionID && state.session_id) {
const recoveredVerificationSessionID = await detectOracleVerificationFromParentSession(
ctx,
state.session_id,
directory,
apiTimeoutMs,
)
if (recoveredVerificationSessionID) {
const updatedState = loopState.setVerificationSessionID(
state.session_id,
recoveredVerificationSessionID,
)
if (updatedState) {
log(`[${HOOK_NAME}] Recovered missing verification session from parent evidence`, {
parentSessionID: state.session_id,
recoveredVerificationSessionID,
})
return
}
}
}
const restarted = await handleFailedVerification(ctx, {
state,
loopState,


@@ -136,6 +136,13 @@ export function createRalphLoopEventHandler(
}
if (state.verification_pending) {
if (!verificationSessionID && matchesParentSession) {
log(`[${HOOK_NAME}] Verification pending without tracked oracle session, running recovery check`, {
sessionID,
iteration: state.iteration,
})
}
await handlePendingVerification(ctx, {
sessionID,
state,


@@ -10,6 +10,7 @@ describe("ulw-loop verification", () => {
const testDir = join(tmpdir(), `ulw-loop-verification-${Date.now()}`)
let promptCalls: Array<{ sessionID: string; text: string }>
let toastCalls: Array<{ title: string; message: string; variant: string }>
let abortCalls: Array<{ id: string }>
let parentTranscriptPath: string
let oracleTranscriptPath: string
@@ -25,6 +26,10 @@ describe("ulw-loop verification", () => {
return {}
},
messages: async () => ({ data: [] }),
abort: async (opts: { path: { id: string } }) => {
abortCalls.push({ id: opts.path.id })
return {}
},
},
tui: {
showToast: async (opts: { body: { title: string; message: string; variant: string } }) => {
@@ -40,6 +45,7 @@ describe("ulw-loop verification", () => {
beforeEach(() => {
promptCalls = []
toastCalls = []
abortCalls = []
parentTranscriptPath = join(testDir, "transcript-parent.jsonl")
oracleTranscriptPath = join(testDir, "transcript-oracle.jsonl")
@@ -385,4 +391,96 @@ describe("ulw-loop verification", () => {
expect(promptCalls).toHaveLength(2)
expect(promptCalls[1]?.text).toContain("Verification failed")
})
test("#given oracle verification fails #when loop restarts #then old oracle session is aborted", async () => {
const sessionMessages: Record<string, unknown[]> = {
"session-123": [{}, {}, {}],
}
const hook = createRalphLoopHook({
...createMockPluginInput(),
client: {
...createMockPluginInput().client,
session: {
...createMockPluginInput().client.session,
messages: async (opts: { path: { id: string } }) => ({
data: sessionMessages[opts.path.id] ?? [],
}),
},
},
} as Parameters<typeof createRalphLoopHook>[0], {
getTranscriptPath: (sessionID) => sessionID === "ses-oracle" ? oracleTranscriptPath : parentTranscriptPath,
})
hook.startLoop("session-123", "Build API", { ultrawork: true })
writeFileSync(
parentTranscriptPath,
`${JSON.stringify({ type: "tool_result", timestamp: new Date().toISOString(), tool_output: { output: "done <promise>DONE</promise>" } })}\n`,
)
await hook.event({ event: { type: "session.idle", properties: { sessionID: "session-123" } } })
writeState(testDir, {
...hook.getState()!,
verification_session_id: "ses-oracle",
})
writeFileSync(
oracleTranscriptPath,
`${JSON.stringify({ type: "tool_result", timestamp: new Date().toISOString(), tool_output: { output: "verification failed: missing tests" } })}\n`,
)
await hook.event({ event: { type: "session.idle", properties: { sessionID: "ses-oracle" } } })
expect(abortCalls).toHaveLength(1)
expect(abortCalls[0].id).toBe("ses-oracle")
})
test("#given ulw loop re-enters verification #when DONE detected again after failed verification #then previous verification session is aborted", async () => {
const sessionMessages: Record<string, unknown[]> = {
"session-123": [{}, {}, {}],
}
const hook = createRalphLoopHook({
...createMockPluginInput(),
client: {
...createMockPluginInput().client,
session: {
...createMockPluginInput().client.session,
messages: async (opts: { path: { id: string } }) => ({
data: sessionMessages[opts.path.id] ?? [],
}),
},
},
} as Parameters<typeof createRalphLoopHook>[0], {
getTranscriptPath: (sessionID) => sessionID === "ses-oracle" ? oracleTranscriptPath : parentTranscriptPath,
})
hook.startLoop("session-123", "Build API", { ultrawork: true })
writeFileSync(
parentTranscriptPath,
`${JSON.stringify({ type: "tool_result", timestamp: new Date().toISOString(), tool_output: { output: "done <promise>DONE</promise>" } })}\n`,
)
await hook.event({ event: { type: "session.idle", properties: { sessionID: "session-123" } } })
writeState(testDir, {
...hook.getState()!,
verification_session_id: "ses-oracle",
})
writeFileSync(
oracleTranscriptPath,
`${JSON.stringify({ type: "tool_result", timestamp: new Date().toISOString(), tool_output: { output: "failed" } })}\n`,
)
await hook.event({ event: { type: "session.idle", properties: { sessionID: "ses-oracle" } } })
abortCalls.length = 0
writeFileSync(
parentTranscriptPath,
`${JSON.stringify({ type: "tool_result", timestamp: new Date().toISOString(), tool_output: { output: "fixed it <promise>DONE</promise>" } })}\n`,
)
writeState(testDir, {
...hook.getState()!,
verification_session_id: "ses-oracle-old",
})
await hook.event({ event: { type: "session.idle", properties: { sessionID: "session-123" } } })
expect(abortCalls).toHaveLength(1)
expect(abortCalls[0].id).toBe("ses-oracle-old")
})
})


@@ -68,6 +68,10 @@ export async function handleFailedVerification(
return false
}
if (state.verification_session_id) {
ctx.client.session.abort({ path: { id: state.verification_session_id } }).catch(() => {})
}
const resumedState = loopState.restartAfterFailedVerification(
parentSessionID,
messageCountAtStart,


@@ -1,6 +1,6 @@
import { describe, expect, test } from "bun:test"
import { classifyErrorType, extractAutoRetrySignal, isRetryableError } from "./error-classifier"
import { classifyErrorType, extractAutoRetrySignal, extractStatusCode, isRetryableError } from "./error-classifier"
describe("runtime-fallback error classifier", () => {
test("detects cooling-down auto-retry status signals", () => {
@@ -97,3 +97,72 @@ describe("runtime-fallback error classifier", () => {
expect(signal).toBeUndefined()
})
})
describe("extractStatusCode", () => {
test("extracts numeric statusCode from top-level", () => {
expect(extractStatusCode({ statusCode: 429 })).toBe(429)
})
test("extracts numeric status from top-level", () => {
expect(extractStatusCode({ status: 503 })).toBe(503)
})
test("extracts statusCode from nested data", () => {
expect(extractStatusCode({ data: { statusCode: 500 } })).toBe(500)
})
test("extracts statusCode from nested error", () => {
expect(extractStatusCode({ error: { statusCode: 502 } })).toBe(502)
})
test("extracts statusCode from nested cause", () => {
expect(extractStatusCode({ cause: { statusCode: 504 } })).toBe(504)
})
test("skips non-numeric status and finds deeper numeric statusCode", () => {
//#given — status is a string, but error.statusCode is numeric
const error = {
status: "error",
error: { statusCode: 429 },
}
//#when
const code = extractStatusCode(error)
//#then
expect(code).toBe(429)
})
test("skips non-numeric statusCode string and finds numeric in cause", () => {
const error = {
statusCode: "UNKNOWN",
status: "failed",
cause: { statusCode: 503 },
}
expect(extractStatusCode(error)).toBe(503)
})
test("returns undefined when no numeric status exists", () => {
expect(extractStatusCode({ status: "error", message: "something broke" })).toBeUndefined()
})
test("returns undefined for null/undefined error", () => {
expect(extractStatusCode(null)).toBeUndefined()
expect(extractStatusCode(undefined)).toBeUndefined()
})
test("falls back to regex match in error message", () => {
const error = { message: "Request failed with status code 429" }
expect(extractStatusCode(error, [429, 503])).toBe(429)
})
test("prefers top-level numeric over nested numeric", () => {
const error = {
statusCode: 400,
error: { statusCode: 429 },
cause: { statusCode: 503 },
}
expect(extractStatusCode(error)).toBe(400)
})
})
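The tests above pin down a lookup order: known numeric fields first (top-level before nested), then a regex over the error message as a last resort. A minimal standalone sketch of that order, reimplemented here for illustration (names and the default retry list are assumptions, not the plugin's actual source):

```typescript
type AnyError = Record<string, unknown> | null | undefined

// Sketch: probe numeric status fields in priority order, then fall back to
// matching a retryable code inside the error message text.
function sketchExtractStatusCode(
  error: AnyError,
  retryOn: number[] = [429, 500, 502, 503, 504], // illustrative default list
): number | undefined {
  if (!error) return undefined
  const candidates = [
    error.statusCode,
    error.status,
    (error.data as Record<string, unknown> | undefined)?.statusCode,
    (error.error as Record<string, unknown> | undefined)?.statusCode,
    (error.cause as Record<string, unknown> | undefined)?.statusCode,
  ]
  // Non-numeric values (e.g. status: "error") are skipped, not returned.
  const numeric = candidates.find((c): c is number => typeof c === "number")
  if (numeric !== undefined) return numeric
  const message = typeof error.message === "string" ? error.message : ""
  const match = message.match(new RegExp(`\\b(${retryOn.join("|")})\\b`))
  return match ? Number(match[1]) : undefined
}
```

Using `find` with a type predicate keeps the skip-non-numeric behavior the tests demand while flattening the old nested `??` chain.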


@@ -28,18 +28,28 @@ export function getErrorMessage(error: unknown): string {
}
}
const DEFAULT_RETRY_PATTERN = new RegExp(`\\b(${DEFAULT_CONFIG.retry_on_errors.join("|")})\\b`)
export function extractStatusCode(error: unknown, retryOnErrors?: number[]): number | undefined {
if (!error) return undefined
const errorObj = error as Record<string, unknown>
const statusCode = errorObj.statusCode ?? errorObj.status ?? (errorObj.data as Record<string, unknown>)?.statusCode
if (typeof statusCode === "number") {
const statusCode = [
errorObj.statusCode,
errorObj.status,
(errorObj.data as Record<string, unknown>)?.statusCode,
(errorObj.error as Record<string, unknown>)?.statusCode,
(errorObj.cause as Record<string, unknown>)?.statusCode,
].find((code): code is number => typeof code === "number")
if (statusCode !== undefined) {
return statusCode
}
const codes = retryOnErrors ?? DEFAULT_CONFIG.retry_on_errors
const pattern = new RegExp(`\\b(${codes.join("|")})\\b`)
const pattern = retryOnErrors
? new RegExp(`\\b(${retryOnErrors.join("|")})\\b`)
: DEFAULT_RETRY_PATTERN
const message = getErrorMessage(error)
const statusMatch = message.match(pattern)
if (statusMatch) {


@@ -32,8 +32,10 @@ const MULTILINGUAL_KEYWORDS = [
"fikir", "berfikir",
]
const MULTILINGUAL_PATTERNS = MULTILINGUAL_KEYWORDS.map((kw) => new RegExp(kw, "i"))
const THINK_PATTERNS = [...ENGLISH_PATTERNS, ...MULTILINGUAL_PATTERNS]
const COMBINED_THINK_PATTERN = new RegExp(
`\\b(?:ultrathink|think)\\b|${MULTILINGUAL_KEYWORDS.join("|")}`,
"i"
)
const CODE_BLOCK_PATTERN = /```[\s\S]*?```/g
const INLINE_CODE_PATTERN = /`[^`]+`/g
@@ -44,7 +46,7 @@ function removeCodeBlocks(text: string): string {
export function detectThinkKeyword(text: string): boolean {
const textWithoutCode = removeCodeBlocks(text)
return THINK_PATTERNS.some((pattern) => pattern.test(textWithoutCode))
return COMBINED_THINK_PATTERN.test(textWithoutCode)
}
export function extractPromptText(


@@ -31,6 +31,10 @@ export function createTodoContinuationHandler(args: {
return async ({ event }: { event: { type: string; properties?: unknown } }): Promise<void> => {
const props = event.properties as Record<string, unknown> | undefined
if (event.type === "session.idle") {
console.error(`[TODO-DIAG] handler received session.idle event`, { sessionID: (props?.sessionID as string) })
}
if (event.type === "session.error") {
const sessionID = props?.sessionID as string | undefined
if (!sessionID) return


@@ -43,10 +43,12 @@ export async function handleSessionIdle(args: {
} = args
log(`[${HOOK_NAME}] session.idle`, { sessionID })
console.error(`[TODO-DIAG] session.idle fired for ${sessionID}`)
const state = sessionStateStore.getState(sessionID)
if (state.isRecovering) {
log(`[${HOOK_NAME}] Skipped: in recovery`, { sessionID })
console.error(`[TODO-DIAG] BLOCKED: isRecovering=true`)
return
}
@@ -54,6 +56,7 @@ export async function handleSessionIdle(args: {
const timeSinceAbort = Date.now() - state.abortDetectedAt
if (timeSinceAbort < ABORT_WINDOW_MS) {
log(`[${HOOK_NAME}] Skipped: abort detected via event ${timeSinceAbort}ms ago`, { sessionID })
console.error(`[TODO-DIAG] BLOCKED: abort detected ${timeSinceAbort}ms ago`)
state.abortDetectedAt = undefined
return
}
@@ -66,6 +69,7 @@ export async function handleSessionIdle(args: {
if (hasRunningBgTasks) {
log(`[${HOOK_NAME}] Skipped: background tasks running`, { sessionID })
console.error(`[TODO-DIAG] BLOCKED: background tasks running`, backgroundManager?.getTasksByParentSession(sessionID).filter((t: {status:string}) => t.status === 'running').map((t: {id:string, status:string}) => t.id))
return
}
@@ -77,10 +81,12 @@ export async function handleSessionIdle(args: {
const messages = normalizeSDKResponse(messagesResp, [] as Array<{ info?: MessageInfo }>)
if (isLastAssistantMessageAborted(messages)) {
log(`[${HOOK_NAME}] Skipped: last assistant message was aborted (API fallback)`, { sessionID })
console.error(`[TODO-DIAG] BLOCKED: last assistant message aborted`)
return
}
if (hasUnansweredQuestion(messages)) {
log(`[${HOOK_NAME}] Skipped: pending question awaiting user response`, { sessionID })
console.error(`[TODO-DIAG] BLOCKED: hasUnansweredQuestion=true`)
return
}
} catch (error) {
@@ -93,24 +99,30 @@ export async function handleSessionIdle(args: {
todos = normalizeSDKResponse(response, [] as Todo[], { preferResponseOnMissingData: true })
} catch (error) {
log(`[${HOOK_NAME}] Todo fetch failed`, { sessionID, error: String(error) })
console.error(`[TODO-DIAG] BLOCKED: todo fetch failed`, String(error))
return
}
if (!todos || todos.length === 0) {
sessionStateStore.resetContinuationProgress(sessionID)
log(`[${HOOK_NAME}] No todos`, { sessionID })
console.error(`[TODO-DIAG] BLOCKED: no todos`)
return
}
const incompleteCount = getIncompleteCount(todos)
if (incompleteCount === 0) {
sessionStateStore.resetContinuationProgress(sessionID)
log(`[${HOOK_NAME}] All todos complete`, { sessionID, total: todos.length })
console.error(`[TODO-DIAG] BLOCKED: all todos complete (${todos.length})`)
return
}
if (state.inFlight) {
log(`[${HOOK_NAME}] Skipped: injection in flight`, { sessionID })
console.error(`[TODO-DIAG] BLOCKED: inFlight=true`)
return
}
@@ -124,22 +136,16 @@ export async function handleSessionIdle(args: {
}
if (state.consecutiveFailures >= MAX_CONSECUTIVE_FAILURES) {
log(`[${HOOK_NAME}] Skipped: max consecutive failures reached`, {
sessionID,
consecutiveFailures: state.consecutiveFailures,
maxConsecutiveFailures: MAX_CONSECUTIVE_FAILURES,
})
log(`[${HOOK_NAME}] Skipped: max consecutive failures reached`, { sessionID, consecutiveFailures: state.consecutiveFailures })
console.error(`[TODO-DIAG] BLOCKED: consecutiveFailures=${state.consecutiveFailures} >= ${MAX_CONSECUTIVE_FAILURES}`)
return
}
const effectiveCooldown =
CONTINUATION_COOLDOWN_MS * Math.pow(2, Math.min(state.consecutiveFailures, 5))
if (state.lastInjectedAt && Date.now() - state.lastInjectedAt < effectiveCooldown) {
log(`[${HOOK_NAME}] Skipped: cooldown active`, {
sessionID,
effectiveCooldown,
consecutiveFailures: state.consecutiveFailures,
})
log(`[${HOOK_NAME}] Skipped: cooldown active`, { sessionID, effectiveCooldown, consecutiveFailures: state.consecutiveFailures })
console.error(`[TODO-DIAG] BLOCKED: cooldown active (${effectiveCooldown}ms, failures=${state.consecutiveFailures})`)
return
}
@@ -165,10 +171,12 @@ export async function handleSessionIdle(args: {
const resolvedAgentName = resolvedInfo?.agent
if (resolvedAgentName && skipAgents.some(s => getAgentConfigKey(s) === getAgentConfigKey(resolvedAgentName))) {
log(`[${HOOK_NAME}] Skipped: agent in skipAgents list`, { sessionID, agent: resolvedAgentName })
console.error(`[TODO-DIAG] BLOCKED: agent '${resolvedAgentName}' in skipAgents`)
return
}
if ((compactionGuardActive || encounteredCompaction) && !resolvedInfo?.agent) {
log(`[${HOOK_NAME}] Skipped: compaction occurred but no agent info resolved`, { sessionID })
console.error(`[TODO-DIAG] BLOCKED: compaction guard + no agent`)
return
}
if (state.recentCompactionAt && resolvedInfo?.agent) {
@@ -177,18 +185,22 @@ export async function handleSessionIdle(args: {
if (isContinuationStopped?.(sessionID)) {
log(`[${HOOK_NAME}] Skipped: continuation stopped for session`, { sessionID })
console.error(`[TODO-DIAG] BLOCKED: isContinuationStopped=true`)
return
}
if (shouldSkipContinuation?.(sessionID)) {
log(`[${HOOK_NAME}] Skipped: another continuation hook already injected`, { sessionID })
console.error(`[TODO-DIAG] BLOCKED: shouldSkipContinuation=true (gptPermissionContinuation recently injected)`)
return
}
const progressUpdate = sessionStateStore.trackContinuationProgress(sessionID, incompleteCount, todos)
if (shouldStopForStagnation({ sessionID, incompleteCount, progressUpdate })) {
console.error(`[TODO-DIAG] BLOCKED: stagnation detected (count=${progressUpdate.stagnationCount})`)
return
}
console.error(`[TODO-DIAG] PASSED all gates! Starting countdown (${incompleteCount}/${todos.length} incomplete)`)
startCountdown({
ctx,
sessionID,


@@ -18,7 +18,7 @@ describe("createSessionStateStore regressions", () => {
describe("#given external activity happens after a successful continuation", () => {
describe("#when todos stay unchanged", () => {
test("#then it treats the activity as progress instead of stagnation", () => {
test("#then it keeps counting stagnation", () => {
const sessionID = "ses-activity-progress"
const todos = [
{ id: "1", content: "Task 1", status: "pending", priority: "high" },
@@ -37,9 +37,9 @@ describe("createSessionStateStore regressions", () => {
trackedState.abortDetectedAt = undefined
const progressUpdate = sessionStateStore.trackContinuationProgress(sessionID, 2, todos)
expect(progressUpdate.hasProgressed).toBe(true)
expect(progressUpdate.progressSource).toBe("activity")
expect(progressUpdate.stagnationCount).toBe(0)
expect(progressUpdate.hasProgressed).toBe(false)
expect(progressUpdate.progressSource).toBe("none")
expect(progressUpdate.stagnationCount).toBe(1)
})
})
})
@@ -72,7 +72,7 @@ describe("createSessionStateStore regressions", () => {
describe("#given stagnation already halted a session", () => {
describe("#when new activity appears before the next idle check", () => {
test("#then it resets the stop condition on the next progress check", () => {
test("#then it does not reset the stop condition", () => {
const sessionID = "ses-stagnation-recovery"
const todos = [
{ id: "1", content: "Task 1", status: "pending", priority: "high" },
@@ -96,9 +96,9 @@ describe("createSessionStateStore regressions", () => {
const progressUpdate = sessionStateStore.trackContinuationProgress(sessionID, 2, todos)
expect(progressUpdate.previousStagnationCount).toBe(MAX_STAGNATION_COUNT)
expect(progressUpdate.hasProgressed).toBe(true)
expect(progressUpdate.progressSource).toBe("activity")
expect(progressUpdate.stagnationCount).toBe(0)
expect(progressUpdate.hasProgressed).toBe(false)
expect(progressUpdate.progressSource).toBe("none")
expect(progressUpdate.stagnationCount).toBe(MAX_STAGNATION_COUNT)
})
})
})
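The flipped expectations above reflect the new rule: only a change in todo state counts as progress, and external activity between idle checks no longer resets the stagnation counter. A simplified sketch of that counter (not the store's real shape, which also tracks completed counts and injection state):

```typescript
// Simplified stagnation counter: progress means the todo snapshot changed
// since the last idle check; anything else increments stagnation.
function createStagnationTracker() {
  let lastSnapshot: string | undefined
  let stagnationCount = 0
  return {
    track(todos: Array<{ id: string; status: string }>) {
      const snapshot = JSON.stringify(todos)
      const hasProgressed = lastSnapshot !== undefined && snapshot !== lastSnapshot
      if (lastSnapshot === undefined || hasProgressed) {
        stagnationCount = 0 // first check, or real todo progress
      } else {
        stagnationCount += 1 // unchanged todos: count toward the halt threshold
      }
      lastSnapshot = snapshot
      return {
        hasProgressed,
        stagnationCount,
        progressSource: hasProgressed ? ("todo" as const) : ("none" as const),
      }
    },
  }
}
```

Dropping the `"activity"` progress source means a session that keeps emitting tool events without ever advancing its todos still hits the stagnation cap.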


@@ -16,8 +16,6 @@ interface TrackedSessionState {
lastAccessedAt: number
lastCompletedCount?: number
lastTodoSnapshot?: string
activitySignalCount: number
lastObservedActivitySignalCount?: number
}
export interface ContinuationProgressUpdate {
@@ -25,7 +23,7 @@ export interface ContinuationProgressUpdate {
previousStagnationCount: number
stagnationCount: number
hasProgressed: boolean
progressSource: "none" | "todo" | "activity"
progressSource: "none" | "todo"
}
export interface SessionStateStore {
@@ -98,17 +96,7 @@ export function createSessionStateStore(): SessionStateStore {
const trackedSession: TrackedSessionState = {
state: rawState,
lastAccessedAt: Date.now(),
activitySignalCount: 0,
}
trackedSession.state = new Proxy(rawState, {
set(target, property, value, receiver) {
if (property === "abortDetectedAt" && value === undefined) {
trackedSession.activitySignalCount += 1
}
return Reflect.set(target, property, value, receiver)
},
})
sessions.set(sessionID, trackedSession)
return trackedSession
}
@@ -137,7 +125,6 @@ export function createSessionStateStore(): SessionStateStore {
const previousStagnationCount = state.stagnationCount
const currentCompletedCount = todos?.filter((todo) => todo.status === "completed").length
const currentTodoSnapshot = todos ? getTodoSnapshot(todos) : undefined
const currentActivitySignalCount = trackedSession.activitySignalCount
const hasCompletedMoreTodos =
currentCompletedCount !== undefined
&& trackedSession.lastCompletedCount !== undefined
@@ -146,9 +133,6 @@ export function createSessionStateStore(): SessionStateStore {
currentTodoSnapshot !== undefined
&& trackedSession.lastTodoSnapshot !== undefined
&& currentTodoSnapshot !== trackedSession.lastTodoSnapshot
const hasObservedExternalActivity =
trackedSession.lastObservedActivitySignalCount !== undefined
&& currentActivitySignalCount > trackedSession.lastObservedActivitySignalCount
const hadSuccessfulInjectionAwaitingProgressCheck = state.awaitingPostInjectionProgressCheck === true
state.lastIncompleteCount = incompleteCount
@@ -158,7 +142,6 @@ export function createSessionStateStore(): SessionStateStore {
if (currentTodoSnapshot !== undefined) {
trackedSession.lastTodoSnapshot = currentTodoSnapshot
}
trackedSession.lastObservedActivitySignalCount = currentActivitySignalCount
if (previousIncompleteCount === undefined) {
state.stagnationCount = 0
@@ -173,9 +156,7 @@ export function createSessionStateStore(): SessionStateStore {
const progressSource = incompleteCount < previousIncompleteCount || hasCompletedMoreTodos || hasTodoSnapshotChanged
? "todo"
: hasObservedExternalActivity
? "activity"
: "none"
: "none"
if (progressSource !== "none") {
state.stagnationCount = 0
@@ -223,8 +204,6 @@ export function createSessionStateStore(): SessionStateStore {
state.awaitingPostInjectionProgressCheck = false
trackedSession.lastCompletedCount = undefined
trackedSession.lastTodoSnapshot = undefined
trackedSession.activitySignalCount = 0
trackedSession.lastObservedActivitySignalCount = undefined
}
function cancelCountdown(sessionID: string): void {


@@ -3,6 +3,8 @@
import { describe, expect, it as test } from "bun:test"
import { MAX_STAGNATION_COUNT } from "./constants"
import { handleNonIdleEvent } from "./non-idle-events"
import { createSessionStateStore } from "./session-state"
import { shouldStopForStagnation } from "./stagnation-detection"
describe("shouldStopForStagnation", () => {
@@ -25,7 +27,7 @@ describe("shouldStopForStagnation", () => {
})
})
describe("#when activity progress is detected after the halt", () => {
describe("#when todo progress is detected after the halt", () => {
test("#then it clears the stop condition", () => {
const shouldStop = shouldStopForStagnation({
sessionID: "ses-recovered",
@@ -35,7 +37,7 @@ describe("shouldStopForStagnation", () => {
previousStagnationCount: MAX_STAGNATION_COUNT,
stagnationCount: 0,
hasProgressed: true,
progressSource: "activity",
progressSource: "todo",
},
})
@@ -43,4 +45,60 @@ describe("shouldStopForStagnation", () => {
})
})
})
describe("#given only non-idle tool and message events happen between idle checks", () => {
describe("#when todo state does not change across three idle cycles", () => {
test("#then stagnation count reaches three", () => {
// given
const sessionStateStore = createSessionStateStore()
const sessionID = "ses-non-idle-activity-without-progress"
const state = sessionStateStore.getState(sessionID)
const todos = [
{ id: "1", content: "Task 1", status: "pending", priority: "high" },
{ id: "2", content: "Task 2", status: "pending", priority: "medium" },
]
sessionStateStore.trackContinuationProgress(sessionID, 2, todos)
// when
state.awaitingPostInjectionProgressCheck = true
const firstCycle = sessionStateStore.trackContinuationProgress(sessionID, 2, todos)
handleNonIdleEvent({
eventType: "tool.execute.before",
properties: { sessionID },
sessionStateStore,
})
handleNonIdleEvent({
eventType: "message.updated",
properties: { info: { sessionID, role: "assistant" } },
sessionStateStore,
})
state.awaitingPostInjectionProgressCheck = true
const secondCycle = sessionStateStore.trackContinuationProgress(sessionID, 2, todos)
handleNonIdleEvent({
eventType: "tool.execute.after",
properties: { sessionID },
sessionStateStore,
})
handleNonIdleEvent({
eventType: "message.part.updated",
properties: { info: { sessionID, role: "assistant" } },
sessionStateStore,
})
state.awaitingPostInjectionProgressCheck = true
const thirdCycle = sessionStateStore.trackContinuationProgress(sessionID, 2, todos)
// then
expect(firstCycle.stagnationCount).toBe(1)
expect(secondCycle.stagnationCount).toBe(2)
expect(thirdCycle.stagnationCount).toBe(3)
sessionStateStore.shutdown()
})
})
})
})

View File

@@ -0,0 +1,28 @@
export const TODOWRITE_DESCRIPTION = `Use this tool to create and manage a structured task list for tracking progress on multi-step work.
## Todo Format (MANDATORY)
Each todo title MUST encode four elements: WHERE, WHY, HOW, and EXPECTED RESULT.
Format: "[WHERE] [HOW] to [WHY] — expect [RESULT]"
GOOD:
- "src/utils/validation.ts: Add validateEmail() for input sanitization — returns boolean"
- "UserService.create(): Call validateEmail() before DB insert — rejects invalid emails with 400"
- "validation.test.ts: Add test for missing @ sign — expect validateEmail('foo') to return false"
BAD:
- "Implement email validation" (where? how? what result?)
- "Add dark mode" (this is a feature, not a todo)
- "Fix auth" (what file? what changes? what's expected?)
## Granularity Rules
Each todo MUST be a single atomic action completable in 1-3 tool calls. If it needs more, split it.
**Size test**: Can you complete this todo by editing one file or running one command? If not, it's too big.
## Task Management
- One in_progress at a time. Complete it before starting the next.
- Mark completed immediately after finishing each item.
- Skip this tool for single trivial tasks (one-step, obvious action).`

View File

@@ -0,0 +1,14 @@
import { TODOWRITE_DESCRIPTION } from "./description"
export function createTodoDescriptionOverrideHook() {
return {
"tool.definition": async (
input: { toolID: string },
output: { description: string; parameters: unknown },
) => {
if (input.toolID === "todowrite") {
output.description = TODOWRITE_DESCRIPTION
}
},
}
}

View File

@@ -0,0 +1,40 @@
import { describe, it, expect } from "bun:test"
import { createTodoDescriptionOverrideHook } from "./hook"
import { TODOWRITE_DESCRIPTION } from "./description"
describe("createTodoDescriptionOverrideHook", () => {
describe("#given hook is created", () => {
describe("#when tool.definition is called with todowrite", () => {
it("#then should override the description", async () => {
const hook = createTodoDescriptionOverrideHook()
const output = { description: "original description", parameters: {} }
await hook["tool.definition"]({ toolID: "todowrite" }, output)
expect(output.description).toBe(TODOWRITE_DESCRIPTION)
})
})
describe("#when tool.definition is called with non-todowrite tool", () => {
it("#then should not modify the description", async () => {
const hook = createTodoDescriptionOverrideHook()
const output = { description: "original description", parameters: {} }
await hook["tool.definition"]({ toolID: "bash" }, output)
expect(output.description).toBe("original description")
})
})
describe("#when tool.definition is called with TodoWrite (case-insensitive)", () => {
it("#then should not override for different casing since OpenCode sends lowercase", async () => {
const hook = createTodoDescriptionOverrideHook()
const output = { description: "original description", parameters: {} }
await hook["tool.definition"]({ toolID: "TodoWrite" }, output)
expect(output.description).toBe("original description")
})
})
})
})

View File

@@ -0,0 +1 @@
export { createTodoDescriptionOverrideHook } from "./hook"

View File

@@ -1,98 +0,0 @@
import { describe, it, expect, beforeEach, afterEach } from "bun:test";
import { resolveGateway, wakeOpenClaw } from "../client";
import { type OpenClawConfig } from "../types";
describe("OpenClaw Client", () => {
describe("resolveGateway", () => {
const config: OpenClawConfig = {
enabled: true,
gateways: {
foo: { type: "command", command: "echo foo" },
bar: { type: "http", url: "https://example.com" },
},
hooks: {
"session-start": {
gateway: "foo",
instruction: "start",
enabled: true,
},
"session-end": { gateway: "bar", instruction: "end", enabled: true },
stop: { gateway: "foo", instruction: "stop", enabled: false },
},
};
it("resolves valid mapping", () => {
const result = resolveGateway(config, "session-start");
expect(result).not.toBeNull();
expect(result?.gatewayName).toBe("foo");
expect(result?.instruction).toBe("start");
});
it("returns null for disabled hook", () => {
const result = resolveGateway(config, "stop");
expect(result).toBeNull();
});
it("returns null for unmapped event", () => {
const result = resolveGateway(config, "ask-user-question");
expect(result).toBeNull();
});
});
describe("wakeOpenClaw env gate", () => {
let oldEnv: string | undefined;
beforeEach(() => {
oldEnv = process.env.OMO_OPENCLAW;
});
afterEach(() => {
if (oldEnv === undefined) {
delete process.env.OMO_OPENCLAW;
} else {
process.env.OMO_OPENCLAW = oldEnv;
}
});
it("returns null when OMO_OPENCLAW is not set", async () => {
delete process.env.OMO_OPENCLAW;
const config: OpenClawConfig = {
enabled: true,
gateways: { gw: { type: "command", command: "echo test" } },
hooks: {
"session-start": { gateway: "gw", instruction: "hi", enabled: true },
},
};
const result = await wakeOpenClaw("session-start", { projectPath: "/tmp" }, config);
expect(result).toBeNull();
});
it("returns null when OMO_OPENCLAW is not '1'", async () => {
process.env.OMO_OPENCLAW = "0";
const config: OpenClawConfig = {
enabled: true,
gateways: { gw: { type: "command", command: "echo test" } },
hooks: {
"session-start": { gateway: "gw", instruction: "hi", enabled: true },
},
};
const result = await wakeOpenClaw("session-start", { projectPath: "/tmp" }, config);
expect(result).toBeNull();
});
it("does not use OMX_OPENCLAW (old env var)", async () => {
delete process.env.OMO_OPENCLAW;
process.env.OMX_OPENCLAW = "1";
const config: OpenClawConfig = {
enabled: true,
gateways: { gw: { type: "command", command: "echo test" } },
hooks: {
"session-start": { gateway: "gw", instruction: "hi", enabled: true },
},
};
const result = await wakeOpenClaw("session-start", { projectPath: "/tmp" }, config);
expect(result).toBeNull();
delete process.env.OMX_OPENCLAW;
});
});
});

View File

@@ -1,40 +0,0 @@
import { describe, it, expect } from "bun:test";
import { OpenClawConfigSchema } from "../../config/schema/openclaw";
describe("OpenClaw Config Schema", () => {
it("validates correct config", () => {
const raw = {
enabled: true,
gateways: {
foo: { type: "command", command: "echo foo" },
bar: { type: "http", url: "https://example.com" },
},
hooks: {
"session-start": {
gateway: "foo",
instruction: "start",
enabled: true,
},
},
};
const parsed = OpenClawConfigSchema.safeParse(raw);
if (!parsed.success) console.log(parsed.error);
expect(parsed.success).toBe(true);
});
it("fails on invalid event", () => {
const raw = {
enabled: true,
gateways: {},
hooks: {
"invalid-event": {
gateway: "foo",
instruction: "start",
enabled: true,
},
},
};
const parsed = OpenClawConfigSchema.safeParse(raw);
expect(parsed.success).toBe(false);
});
});

View File

@@ -1,78 +0,0 @@
import { describe, it, expect } from "bun:test";
import {
interpolateInstruction,
resolveCommandTimeoutMs,
shellEscapeArg,
validateGatewayUrl,
wakeCommandGateway,
} from "../dispatcher";
import { type OpenClawCommandGatewayConfig } from "../types";
describe("OpenClaw Dispatcher", () => {
describe("validateGatewayUrl", () => {
it("accepts valid https URLs", () => {
expect(validateGatewayUrl("https://example.com")).toBe(true);
});
it("rejects http URLs (remote)", () => {
expect(validateGatewayUrl("http://example.com")).toBe(false);
});
it("accepts http URLs for localhost", () => {
expect(validateGatewayUrl("http://localhost:3000")).toBe(true);
expect(validateGatewayUrl("http://127.0.0.1:8080")).toBe(true);
});
});
describe("interpolateInstruction", () => {
it("interpolates variables correctly", () => {
const result = interpolateInstruction("Hello {{name}}!", { name: "World" });
expect(result).toBe("Hello World!");
});
it("handles missing variables", () => {
const result = interpolateInstruction("Hello {{name}}!", {});
expect(result).toBe("Hello !");
});
});
describe("shellEscapeArg", () => {
it("escapes simple string", () => {
expect(shellEscapeArg("foo")).toBe("'foo'");
});
it("escapes string with single quotes", () => {
expect(shellEscapeArg("it's")).toBe("'it'\\''s'");
});
});
describe("resolveCommandTimeoutMs", () => {
it("uses default timeout", () => {
expect(resolveCommandTimeoutMs(undefined, undefined)).toBe(5000);
});
it("uses provided timeout", () => {
expect(resolveCommandTimeoutMs(1000, undefined)).toBe(1000);
});
it("clamps timeout", () => {
expect(resolveCommandTimeoutMs(10, undefined)).toBe(100);
expect(resolveCommandTimeoutMs(1000000, undefined)).toBe(300000);
});
});
describe("wakeCommandGateway", () => {
it("rejects if disabled via env", async () => {
const oldEnv = process.env.OMO_OPENCLAW_COMMAND;
process.env.OMO_OPENCLAW_COMMAND = "0";
const config: OpenClawCommandGatewayConfig = {
type: "command",
command: "echo hi",
};
const result = await wakeCommandGateway("test", config, {});
expect(result.success).toBe(false);
expect(result.error).toContain("disabled");
process.env.OMO_OPENCLAW_COMMAND = oldEnv;
});
});
});

View File

@@ -1,256 +0,0 @@
/**
* OpenClaw Integration - Client
*
* Wakes OpenClaw gateways on hook events. Non-blocking, fire-and-forget.
*
* Usage:
* wakeOpenClaw("session-start", { sessionId, projectPath: directory }, config);
*
* Activation requires OMO_OPENCLAW=1 env var and config in pluginConfig.openclaw.
*/
import {
type OpenClawConfig,
type OpenClawContext,
type OpenClawHookEvent,
type OpenClawResult,
type OpenClawGatewayConfig,
type OpenClawHttpGatewayConfig,
type OpenClawCommandGatewayConfig,
type OpenClawPayload,
} from "./types";
import {
interpolateInstruction,
isCommandGateway,
wakeCommandGateway,
wakeGateway,
} from "./dispatcher";
import { execSync } from "child_process";
import { basename } from "path";
/** Whether debug logging is enabled */
const DEBUG = process.env.OMO_OPENCLAW_DEBUG === "1";
// Helper for tmux session
function getCurrentTmuxSession(): string | undefined {
if (!process.env.TMUX) return undefined;
try {
// tmux display-message -p '#S'
const session = execSync("tmux display-message -p '#S'", {
encoding: "utf-8",
}).trim();
return session || undefined;
} catch {
return undefined;
}
}
// Helper for tmux capture
function captureTmuxPane(paneId: string, lines: number): string | undefined {
try {
// tmux capture-pane -p -t {paneId} -S -{lines}
const output = execSync(
`tmux capture-pane -p -t "${paneId}" -S -${lines}`,
{ encoding: "utf-8" }
);
return output || undefined;
} catch {
return undefined;
}
}
/**
* Build a whitelisted context object from the input context.
* Only known fields are included to prevent accidental data leakage.
*/
function buildWhitelistedContext(context: OpenClawContext): OpenClawContext {
const result: OpenClawContext = {};
if (context.sessionId !== undefined) result.sessionId = context.sessionId;
if (context.projectPath !== undefined)
result.projectPath = context.projectPath;
if (context.tmuxSession !== undefined)
result.tmuxSession = context.tmuxSession;
if (context.prompt !== undefined) result.prompt = context.prompt;
if (context.contextSummary !== undefined)
result.contextSummary = context.contextSummary;
if (context.reason !== undefined) result.reason = context.reason;
if (context.question !== undefined) result.question = context.question;
if (context.tmuxTail !== undefined) result.tmuxTail = context.tmuxTail;
if (context.replyChannel !== undefined)
result.replyChannel = context.replyChannel;
if (context.replyTarget !== undefined)
result.replyTarget = context.replyTarget;
if (context.replyThread !== undefined)
result.replyThread = context.replyThread;
return result;
}
/**
* Resolve gateway config for a specific hook event.
* Returns null if the event is not mapped or disabled.
* Returns the gateway name alongside config to avoid O(n) reverse lookup.
*/
export function resolveGateway(
config: OpenClawConfig,
event: OpenClawHookEvent
): {
gatewayName: string;
gateway: OpenClawGatewayConfig;
instruction: string;
} | null {
const mapping = config.hooks?.[event];
if (!mapping || !mapping.enabled) {
return null;
}
const gateway = config.gateways?.[mapping.gateway];
if (!gateway) {
return null;
}
// Validate based on gateway type
if (gateway.type === "command") {
if (!gateway.command) return null;
} else {
// HTTP gateway (default when type is absent or "http")
if (!("url" in gateway) || !gateway.url) return null;
}
return {
gatewayName: mapping.gateway,
gateway,
instruction: mapping.instruction,
};
}
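For illustration, a minimal standalone re-implementation of the lookup above (same resolution order: mapping enabled, gateway exists, type-specific field present) behaves like this. The types here are simplified stand-ins for the real `OpenClawConfig` shapes:

```typescript
// Minimal sketch of the hook -> gateway resolution described above.
type Gateway = { type?: "http" | "command"; url?: string; command?: string }
type Config = {
  enabled: boolean
  gateways?: Record<string, Gateway>
  hooks?: Record<string, { gateway: string; instruction: string; enabled: boolean }>
}

function resolve(config: Config, event: string) {
  const mapping = config.hooks?.[event]
  if (!mapping || !mapping.enabled) return null
  const gateway = config.gateways?.[mapping.gateway]
  if (!gateway) return null
  // Command gateways need a command; HTTP gateways (default) need a URL.
  if (gateway.type === "command" ? !gateway.command : !gateway.url) return null
  return { gatewayName: mapping.gateway, gateway, instruction: mapping.instruction }
}

const config: Config = {
  enabled: true,
  gateways: { foo: { type: "command", command: "echo foo" } },
  hooks: {
    "session-start": { gateway: "foo", instruction: "start", enabled: true },
    stop: { gateway: "foo", instruction: "stop", enabled: false },
  },
}
console.log(resolve(config, "session-start")?.gatewayName) // "foo"
console.log(resolve(config, "stop")) // null (disabled mapping)
```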
/**
* Wake the OpenClaw gateway mapped to a hook event.
*
* This is the main entry point called from the notify hook.
* Non-blocking, swallows all errors. Returns null if OpenClaw
* is not configured or the event is not mapped.
*
* @param event - The hook event type
* @param context - Context data for template variable interpolation
* @param config - OpenClaw configuration
* @returns OpenClawResult or null if not configured/mapped
*/
export async function wakeOpenClaw(
event: OpenClawHookEvent,
context: OpenClawContext,
config?: OpenClawConfig
): Promise<OpenClawResult | null> {
try {
// Activation gate: only active when OMO_OPENCLAW=1
if (process.env.OMO_OPENCLAW !== "1") {
return null;
}
if (!config || !config.enabled) return null;
const resolved = resolveGateway(config, event);
if (!resolved) return null;
const { gatewayName, gateway, instruction } = resolved;
const now = new Date().toISOString();
// Read originating channel context from env vars
const replyChannel =
context.replyChannel ?? process.env.OPENCLAW_REPLY_CHANNEL ?? undefined;
const replyTarget =
context.replyTarget ?? process.env.OPENCLAW_REPLY_TARGET ?? undefined;
const replyThread =
context.replyThread ?? process.env.OPENCLAW_REPLY_THREAD ?? undefined;
// Merge reply context
const enrichedContext: OpenClawContext = {
...context,
...(replyChannel !== undefined && { replyChannel }),
...(replyTarget !== undefined && { replyTarget }),
...(replyThread !== undefined && { replyThread }),
};
// Auto-detect tmux session
const tmuxSession =
enrichedContext.tmuxSession ?? getCurrentTmuxSession() ?? undefined;
// Auto-capture tmux pane content
let tmuxTail = enrichedContext.tmuxTail;
if (
!tmuxTail &&
(event === "stop" || event === "session-end") &&
process.env.TMUX
) {
const paneId = process.env.TMUX_PANE;
if (paneId) {
tmuxTail = captureTmuxPane(paneId, 15) ?? undefined;
}
}
// Build template variables
const variables: Record<string, string | undefined> = {
sessionId: enrichedContext.sessionId,
projectPath: enrichedContext.projectPath,
projectName: enrichedContext.projectPath
? basename(enrichedContext.projectPath)
: undefined,
tmuxSession,
prompt: enrichedContext.prompt,
contextSummary: enrichedContext.contextSummary,
reason: enrichedContext.reason,
question: enrichedContext.question,
tmuxTail,
event,
timestamp: now,
replyChannel,
replyTarget,
replyThread,
};
// Interpolate instruction
const interpolatedInstruction = interpolateInstruction(
instruction,
variables
);
variables.instruction = interpolatedInstruction;
let result: OpenClawResult;
if (isCommandGateway(gateway)) {
result = await wakeCommandGateway(gatewayName, gateway, variables);
} else {
const payload: OpenClawPayload = {
event,
instruction: interpolatedInstruction,
text: interpolatedInstruction,
timestamp: now,
sessionId: enrichedContext.sessionId,
projectPath: enrichedContext.projectPath,
projectName: enrichedContext.projectPath
? basename(enrichedContext.projectPath)
: undefined,
tmuxSession,
tmuxTail,
...(replyChannel !== undefined && { channel: replyChannel }),
...(replyTarget !== undefined && { to: replyTarget }),
...(replyThread !== undefined && { threadId: replyThread }),
context: buildWhitelistedContext(enrichedContext),
};
result = await wakeGateway(gatewayName, gateway, payload);
}
if (DEBUG) {
console.error(
`[openclaw] wake ${event} -> ${gatewayName}: ${
result.success ? "ok" : result.error
}`
);
}
return result;
} catch (error) {
if (DEBUG) {
console.error(
`[openclaw] wakeOpenClaw error:`,
error instanceof Error ? error.message : error
);
}
return null;
}
}

View File

@@ -1,317 +0,0 @@
/**
* OpenClaw Gateway Dispatcher
*
* Sends instruction payloads to OpenClaw gateways via HTTP or CLI command.
* All calls are non-blocking with timeouts. Failures are swallowed
* to avoid blocking hooks.
*
* SECURITY: Command gateway requires OMO_OPENCLAW_COMMAND=1 opt-in.
* Command timeout is configurable with safe bounds.
* Prefers execFile for simple commands; falls back to sh -c only for shell metacharacters.
*/
import {
type OpenClawCommandGatewayConfig,
type OpenClawGatewayConfig,
type OpenClawHttpGatewayConfig,
type OpenClawPayload,
type OpenClawResult,
} from "./types";
import { exec, execFile } from "child_process";
/** Default per-request timeout for HTTP gateways */
const DEFAULT_HTTP_TIMEOUT_MS = 10_000;
/** Default command gateway timeout (backward-compatible default) */
const DEFAULT_COMMAND_TIMEOUT_MS = 5_000;
/**
* Command timeout safety bounds.
* - Minimum 100ms: avoids immediate/near-zero timeout misconfiguration.
* - Maximum 300000ms (5 minutes): prevents runaway long-lived command processes.
*/
const MIN_COMMAND_TIMEOUT_MS = 100;
const MAX_COMMAND_TIMEOUT_MS = 300_000;
/** Shell metacharacters that require sh -c instead of execFile */
const SHELL_METACHAR_RE = /[|&;><`$()]/;
/**
* Validate gateway URL. Must be HTTPS, except localhost/127.0.0.1/::1
* which allows HTTP for local development.
*/
export function validateGatewayUrl(url: string): boolean {
try {
const parsed = new URL(url);
if (parsed.protocol === "https:") return true;
if (
parsed.protocol === "http:" &&
(parsed.hostname === "localhost" ||
parsed.hostname === "127.0.0.1" ||
parsed.hostname === "::1" ||
parsed.hostname === "[::1]")
) {
return true;
}
return false;
} catch (err) {
process.stderr.write(`[openclaw-dispatcher] operation failed: ${err}\n`);
return false;
}
}
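A standalone sketch of the same HTTPS-required, localhost-exempt policy (re-implemented here for illustration; the real function is the one above):

```typescript
// Sketch: allow https anywhere, plain http only for loopback hosts.
function isAllowedGatewayUrl(url: string): boolean {
  try {
    const parsed = new URL(url)
    if (parsed.protocol === "https:") return true
    // Node's URL keeps brackets on IPv6 hostnames, hence both "::1" forms.
    const loopback = ["localhost", "127.0.0.1", "::1", "[::1]"]
    return parsed.protocol === "http:" && loopback.includes(parsed.hostname)
  } catch {
    return false
  }
}

console.log(isAllowedGatewayUrl("https://example.com"))   // true
console.log(isAllowedGatewayUrl("http://example.com"))    // false
console.log(isAllowedGatewayUrl("http://127.0.0.1:8080")) // true
```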
/**
* Interpolate template variables in an instruction string.
*
* Supported variables (from hook context):
* - {{projectName}} - basename of project directory
* - {{projectPath}} - full project directory path
* - {{sessionId}} - session identifier
* - {{prompt}} - prompt text
* - {{contextSummary}} - context summary (session-end event)
* - {{question}} - question text (ask-user-question event)
* - {{timestamp}} - ISO timestamp
* - {{event}} - hook event name
* - {{instruction}} - interpolated instruction (for command gateway)
* - {{replyChannel}} - originating channel (from OPENCLAW_REPLY_CHANNEL env var)
* - {{replyTarget}} - reply target user/bot (from OPENCLAW_REPLY_TARGET env var)
* - {{replyThread}} - reply thread ID (from OPENCLAW_REPLY_THREAD env var)
*
* Unresolved variables are replaced with empty string.
*/
export function interpolateInstruction(
template: string,
variables: Record<string, string | undefined>
): string {
return template.replace(/\{\{(\w+)\}\}/g, (_match, key) => {
return variables[key] ?? "";
});
}
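The substitution rule above, restated as a runnable sketch: matched `{{key}}` placeholders are looked up in the variables map, and unresolved keys collapse to the empty string rather than erroring:

```typescript
// Sketch of {{variable}} interpolation; unresolved keys become "".
function interpolate(template: string, vars: Record<string, string | undefined>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_m, key) => vars[key] ?? "")
}

console.log(interpolate("Build {{projectName}} at {{timestamp}}", { projectName: "omx" }))
// "Build omx at "
```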
/**
* Type guard: is this gateway config a command gateway?
*/
export function isCommandGateway(
config: OpenClawGatewayConfig
): config is OpenClawCommandGatewayConfig {
return config.type === "command";
}
/**
* Shell-escape a string for safe embedding in a shell command.
* Uses single-quote wrapping with internal quote escaping.
*/
export function shellEscapeArg(value: string): string {
return "'" + value.replace(/'/g, "'\\''") + "'";
}
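The single-quote wrapping scheme works by closing the quote, emitting an escaped literal quote, and reopening: `'` becomes `'\''`. A self-contained sketch of the same transformation:

```typescript
// Sketch of POSIX single-quote escaping: close quote, emit \', reopen quote.
function escapeArg(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'"
}

console.log(escapeArg("foo"))  // 'foo'
console.log(escapeArg("it's")) // 'it'\''s'
```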
/**
* Resolve command gateway timeout with precedence:
* gateway timeout > OMO_OPENCLAW_COMMAND_TIMEOUT_MS > default.
*/
export function resolveCommandTimeoutMs(
gatewayTimeout?: number,
envTimeoutRaw = process.env.OMO_OPENCLAW_COMMAND_TIMEOUT_MS
): number {
const parseFinite = (value: unknown): number | undefined => {
if (typeof value !== "number" || !Number.isFinite(value)) return undefined;
return value;
};
const parseEnv = (value: string | undefined): number | undefined => {
if (!value) return undefined;
const parsed = Number(value);
return Number.isFinite(parsed) ? parsed : undefined;
};
const rawTimeout =
parseFinite(gatewayTimeout) ??
parseEnv(envTimeoutRaw) ??
DEFAULT_COMMAND_TIMEOUT_MS;
return Math.min(
MAX_COMMAND_TIMEOUT_MS,
Math.max(MIN_COMMAND_TIMEOUT_MS, Math.trunc(rawTimeout))
);
}
/**
* Wake an HTTP-type OpenClaw gateway with the given payload.
*/
export async function wakeGateway(
gatewayName: string,
gatewayConfig: OpenClawHttpGatewayConfig,
payload: OpenClawPayload
): Promise<OpenClawResult> {
if (!validateGatewayUrl(gatewayConfig.url)) {
return {
gateway: gatewayName,
success: false,
error: "Invalid URL (HTTPS required)",
};
}
try {
const headers = {
"Content-Type": "application/json",
...gatewayConfig.headers,
};
const timeout = gatewayConfig.timeout ?? DEFAULT_HTTP_TIMEOUT_MS;
const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), timeout);
const response = await fetch(gatewayConfig.url, {
method: gatewayConfig.method || "POST",
headers,
body: JSON.stringify(payload),
signal: controller.signal,
});
clearTimeout(timeoutId);
if (!response.ok) {
return {
gateway: gatewayName,
success: false,
error: `HTTP ${response.status}`,
statusCode: response.status,
};
}
return { gateway: gatewayName, success: true, statusCode: response.status };
} catch (error) {
return {
gateway: gatewayName,
success: false,
error: error instanceof Error ? error.message : "Unknown error",
};
}
}
/**
* Wake a command-type OpenClaw gateway by executing a shell command.
*
* SECURITY REQUIREMENTS:
* - Requires OMO_OPENCLAW_COMMAND=1 opt-in (separate gate from OMO_OPENCLAW)
* - Timeout is configurable via gateway.timeout or OMO_OPENCLAW_COMMAND_TIMEOUT_MS
* with safe clamping bounds and backward-compatible default 5000ms
* - Prefers execFile for simple commands (no metacharacters)
* - Falls back to sh -c only when metacharacters detected
* - detached: false to prevent orphan processes
* - SIGTERM cleanup handler kills child on parent SIGTERM, 1s grace then SIGKILL
*
* The command template supports {{variable}} placeholders. All variable
* values are shell-escaped before interpolation to prevent injection.
*/
export async function wakeCommandGateway(
gatewayName: string,
gatewayConfig: OpenClawCommandGatewayConfig,
variables: Record<string, string | undefined>
): Promise<OpenClawResult> {
// Separate command gateway opt-in gate
if (process.env.OMO_OPENCLAW_COMMAND !== "1") {
return {
gateway: gatewayName,
success: false,
error: "Command gateway disabled (set OMO_OPENCLAW_COMMAND=1 to enable)",
};
}
let child: any = null;
let sigtermHandler: (() => void) | null = null;
try {
const timeout = resolveCommandTimeoutMs(gatewayConfig.timeout);
// Interpolate variables with shell escaping
const interpolated = gatewayConfig.command.replace(
/\{\{(\w+)\}\}/g,
(match, key) => {
const value = variables[key];
if (value === undefined) return match;
return shellEscapeArg(value);
}
);
// Detect whether the interpolated command contains shell metacharacters
const hasMetachars = SHELL_METACHAR_RE.test(interpolated);
await new Promise<void>((resolve, reject) => {
const cleanup = (signal: NodeJS.Signals) => {
if (child) {
child.kill(signal);
// 1s grace period then SIGKILL
setTimeout(() => {
try {
child?.kill("SIGKILL");
} catch (err) {
process.stderr.write(
`[openclaw-dispatcher] operation failed: ${err}\n`
);
}
}, 1000);
}
};
sigtermHandler = () => cleanup("SIGTERM");
process.once("SIGTERM", sigtermHandler);
const onExit = (code: number | null, signal: NodeJS.Signals | null) => {
if (sigtermHandler) {
process.removeListener("SIGTERM", sigtermHandler);
sigtermHandler = null;
}
if (signal) {
reject(new Error(`Command killed by signal ${signal}`));
} else if (code !== 0) {
reject(new Error(`Command exited with code ${code}`));
} else {
resolve();
}
};
const onError = (err: Error) => {
if (sigtermHandler) {
process.removeListener("SIGTERM", sigtermHandler);
sigtermHandler = null;
}
reject(err);
};
if (hasMetachars) {
// Fall back to sh -c for complex commands with metacharacters
child = exec(interpolated, {
timeout,
env: { ...process.env },
});
} else {
// Parse simple command: split on whitespace, use execFile
const parts = interpolated.split(/\s+/).filter(Boolean);
const cmd = parts[0];
const args = parts.slice(1);
child = execFile(cmd, args, {
timeout,
env: { ...process.env },
});
}
// Ensure detached is false (default, but explicit via options above)
if (child) {
child.on("exit", onExit);
child.on("error", onError);
} else {
reject(new Error("Failed to spawn process"));
}
});
return { gateway: gatewayName, success: true };
} catch (error) {
// Ensure SIGTERM handler is cleaned up on error
if (sigtermHandler) {
process.removeListener("SIGTERM", sigtermHandler as () => void);
}
return {
gateway: gatewayName,
success: false,
error: error instanceof Error ? error.message : "Unknown error",
};
}
}

View File

@@ -1,10 +0,0 @@
export { resolveGateway, wakeOpenClaw } from "./client";
export {
interpolateInstruction,
isCommandGateway,
shellEscapeArg,
validateGatewayUrl,
wakeCommandGateway,
wakeGateway,
} from "./dispatcher";
export * from "./types";

View File

@@ -1,134 +0,0 @@
/**
* OpenClaw Gateway Integration Types
*
* Defines types for the OpenClaw gateway waker system.
* Each hook event can be mapped to a gateway with a pre-defined instruction.
*/
/** Hook events that can trigger OpenClaw gateway calls */
export type OpenClawHookEvent =
| "session-start"
| "session-end"
| "session-idle"
| "ask-user-question"
| "stop";
/** HTTP gateway configuration (default when type is absent or "http") */
export interface OpenClawHttpGatewayConfig {
/** Gateway type discriminator (optional for backward compat) */
type?: "http";
/** Gateway endpoint URL (HTTPS required, HTTP allowed for localhost) */
url: string;
/** Optional custom headers (e.g., Authorization) */
headers?: Record<string, string>;
/** HTTP method (default: POST) */
method?: "POST" | "PUT";
/** Per-request timeout in ms (default: 10000) */
timeout?: number;
}
/** CLI command gateway configuration */
export interface OpenClawCommandGatewayConfig {
/** Gateway type discriminator */
type: "command";
/** Command template with {{variable}} placeholders.
* Variables are shell-escaped automatically before interpolation. */
command: string;
/**
* Per-command timeout in ms.
* Precedence: gateway timeout > OMO_OPENCLAW_COMMAND_TIMEOUT_MS > default (5000ms).
* Runtime clamps to safe bounds.
*/
timeout?: number;
}
/** Gateway configuration — HTTP or CLI command */
export type OpenClawGatewayConfig =
| OpenClawHttpGatewayConfig
| OpenClawCommandGatewayConfig;
/** Per-hook-event mapping to a gateway + instruction */
export interface OpenClawHookMapping {
/** Name of the gateway (key in gateways object) */
gateway: string;
/** Instruction template with {{variable}} placeholders */
instruction: string;
/** Whether this hook-event mapping is active */
enabled: boolean;
}
/** Top-level config schema for notifications.openclaw key in .omx-config.json */
export interface OpenClawConfig {
/** Global enable/disable */
enabled: boolean;
/** Named gateway endpoints */
gateways: Record<string, OpenClawGatewayConfig>;
/** Hook-event to gateway+instruction mappings */
hooks?: Partial<Record<OpenClawHookEvent, OpenClawHookMapping>>;
}
/** Payload sent to an OpenClaw gateway */
export interface OpenClawPayload {
/** The hook event that triggered this call */
event: OpenClawHookEvent;
/** Interpolated instruction text */
instruction: string;
/** Alias of instruction — allows OpenClaw /hooks/wake to consume the payload directly */
text: string;
/** ISO timestamp */
timestamp: string;
/** Session identifier (if available) */
sessionId?: string;
/** Project directory path */
projectPath?: string;
/** Project basename */
projectName?: string;
/** Tmux session name (if running inside tmux) */
tmuxSession?: string;
/** Recent tmux pane output (for stop/session-end events) */
tmuxTail?: string;
/** Originating channel for reply routing (if OPENCLAW_REPLY_CHANNEL is set) */
channel?: string;
/** Reply target user/bot (if OPENCLAW_REPLY_TARGET is set) */
to?: string;
/** Reply thread ID (if OPENCLAW_REPLY_THREAD is set) */
threadId?: string;
/** Context data from the hook (whitelisted fields only) */
context: OpenClawContext;
}
/**
* Context data passed from the hook to OpenClaw for template interpolation.
*
* All fields are explicitly enumerated (no index signature) to prevent
* accidental leakage of sensitive data into gateway payloads.
*/
export interface OpenClawContext {
sessionId?: string;
projectPath?: string;
tmuxSession?: string;
prompt?: string;
contextSummary?: string;
reason?: string;
question?: string;
/** Recent tmux pane output (captured automatically for stop/session-end events) */
tmuxTail?: string;
/** Originating channel for reply routing (from OPENCLAW_REPLY_CHANNEL env var) */
replyChannel?: string;
/** Reply target user/bot (from OPENCLAW_REPLY_TARGET env var) */
replyTarget?: string;
/** Reply thread ID for threaded conversations (from OPENCLAW_REPLY_THREAD env var) */
replyThread?: string;
}
/** Result of a gateway wake attempt */
export interface OpenClawResult {
/** Gateway name */
gateway: string;
/** Whether the call succeeded */
success: boolean;
/** Error message if failed */
error?: string;
/** HTTP status code if available */
statusCode?: number;
}

View File

@@ -32,10 +32,7 @@ export function createPluginInterface(args: {
return {
tool: tools,
-"chat.params": async (input: unknown, output: unknown) => {
-const handler = createChatParamsHandler({ anthropicEffort: hooks.anthropicEffort })
-await handler(input, output)
-},
+"chat.params": createChatParamsHandler({ anthropicEffort: hooks.anthropicEffort }),
"chat.headers": createChatHeadersHandler({ ctx }),
@@ -71,5 +68,9 @@ export function createPluginInterface(args: {
ctx,
hooks,
}),
+"tool.definition": async (input, output) => {
+await hooks.todoDescriptionOverride?.["tool.definition"]?.(input, output)
+},
}
}

View File

@@ -1,8 +1,15 @@
-import { describe, it, expect } from "bun:test"
+import { describe, it, expect, afterEach } from "bun:test"
import { createEventHandler } from "./event"
import { createChatMessageHandler } from "./chat-message"
import { _resetForTesting, setMainSession } from "../features/claude-code-session-state"
import { clearPendingModelFallback, createModelFallbackHook } from "../hooks/model-fallback/hook"
-type EventInput = { event: { type: string; properties?: Record<string, unknown> } }
+type EventInput = { event: { type: string; properties?: unknown } }
afterEach(() => {
_resetForTesting()
})
describe("createEventHandler - idle deduplication", () => {
it("Order A (status→idle): synthetic idle deduped - real idle not dispatched again", async () => {
@@ -66,7 +73,7 @@ type EventInput = { event: { type: string; properties?: Record<string, unknown>
//#then - synthetic idle dispatched once
expect(dispatchCalls.length).toBe(1)
expect(dispatchCalls[0].event.type).toBe("session.idle")
expect(dispatchCalls[0].event.properties?.sessionID).toBe(sessionId)
expect((dispatchCalls[0].event.properties as { sessionID?: string } | undefined)?.sessionID).toBe(sessionId)
//#when - real session.idle arrives
await eventHandler({
@@ -142,7 +149,7 @@ type EventInput = { event: { type: string; properties?: Record<string, unknown>
//#then - real idle dispatched once
expect(dispatchCalls.length).toBe(1)
expect(dispatchCalls[0].event.type).toBe("session.idle")
expect(dispatchCalls[0].event.properties?.sessionID).toBe(sessionId)
expect((dispatchCalls[0].event.properties as { sessionID?: string } | undefined)?.sessionID).toBe(sessionId)
//#when - session.status with idle (generates synthetic idle)
await eventHandler({
@@ -245,7 +252,7 @@ type EventInput = { event: { type: string; properties?: Record<string, unknown>
event: {
type: "message.updated",
},
})
} as any)
//#then - both maps should be pruned (no dedup should occur for new events)
// We verify by checking that a new idle event for same session is dispatched
@@ -287,7 +294,7 @@ type EventInput = { event: { type: string; properties?: Record<string, unknown>
stopContinuationGuard: { event: async () => {} },
compactionTodoPreserver: { event: async () => {} },
atlasHook: { handler: async () => {} },
},
} as any,
})
await eventHandlerWithMock({
@@ -426,7 +433,7 @@ describe("createEventHandler - event forwarding", () => {
type: "session.deleted",
properties: { info: { id: sessionID } },
},
})
} as any)
//#then
expect(forwardedEvents.length).toBe(1)
@@ -435,3 +442,146 @@ describe("createEventHandler - event forwarding", () => {
expect(deletedSessions).toEqual([sessionID])
})
})
describe("createEventHandler - retry dedupe lifecycle", () => {
it("re-handles same retry key after session recovers to idle status", async () => {
//#given
const sessionID = "ses_retry_recovery_rearm"
setMainSession(sessionID)
clearPendingModelFallback(sessionID)
const abortCalls: string[] = []
const promptCalls: string[] = []
const modelFallback = createModelFallbackHook()
const eventHandler = createEventHandler({
ctx: {
directory: "/tmp",
client: {
session: {
abort: async ({ path }: { path: { id: string } }) => {
abortCalls.push(path.id)
return {}
},
prompt: async ({ path }: { path: { id: string } }) => {
promptCalls.push(path.id)
return {}
},
},
},
} as any,
pluginConfig: {} as any,
firstMessageVariantGate: {
markSessionCreated: () => {},
clear: () => {},
},
managers: {
tmuxSessionManager: {
onSessionCreated: async () => {},
onSessionDeleted: async () => {},
},
skillMcpManager: {
disconnectSession: async () => {},
},
} as any,
hooks: {
modelFallback,
stopContinuationGuard: { isStopped: () => false },
} as any,
})
const chatMessageHandler = createChatMessageHandler({
ctx: {
client: {
tui: {
showToast: async () => ({}),
},
},
} as any,
pluginConfig: {} as any,
firstMessageVariantGate: {
shouldOverride: () => false,
markApplied: () => {},
},
hooks: {
modelFallback,
stopContinuationGuard: null,
keywordDetector: null,
claudeCodeHooks: null,
autoSlashCommand: null,
startWork: null,
ralphLoop: null,
} as any,
})
const retryStatus = {
type: "retry",
attempt: 1,
message: "All credentials for model claude-opus-4-6-thinking are cooling down [retrying in 7m 56s attempt #1]",
next: 476,
} as const
await eventHandler({
event: {
type: "message.updated",
properties: {
info: {
id: "msg_user_retry_rearm",
sessionID,
role: "user",
modelID: "claude-opus-4-6-thinking",
providerID: "anthropic",
agent: "Sisyphus (Ultraworker)",
},
},
},
} as any)
//#when - first retry key is handled
await eventHandler({
event: {
type: "session.status",
properties: {
sessionID,
status: retryStatus,
},
},
} as any)
const firstOutput = { message: {}, parts: [] as Array<{ type: string; text?: string }> }
await chatMessageHandler(
{
sessionID,
agent: "sisyphus",
model: { providerID: "anthropic", modelID: "claude-opus-4-6-thinking" },
},
firstOutput,
)
//#when - session recovers to non-retry idle state
await eventHandler({
event: {
type: "session.status",
properties: {
sessionID,
status: { type: "idle" },
},
},
} as any)
//#when - same retry key appears again after recovery
await eventHandler({
event: {
type: "session.status",
properties: {
sessionID,
status: retryStatus,
},
},
} as any)
//#then
expect(abortCalls).toEqual([sessionID, sessionID])
expect(promptCalls).toEqual([sessionID, sessionID])
})
})


@@ -215,7 +215,6 @@ export function createEventHandler(args: {
await Promise.resolve(hooks.compactionTodoPreserver?.event?.(input));
await Promise.resolve(hooks.writeExistingFileGuard?.event?.(input));
await Promise.resolve(hooks.atlasHook?.handler?.(input));
await Promise.resolve(hooks.openclawSender?.event?.(input));
await Promise.resolve(hooks.autoSlashCommand?.event?.(input));
};
@@ -422,6 +421,12 @@ export function createEventHandler(args: {
const sessionID = props?.sessionID as string | undefined;
const status = props?.status as { type?: string; attempt?: number; message?: string; next?: number } | undefined;
// Retry dedupe lifecycle: set key when a retry status is handled, clear it after recovery
// (non-retry idle) so future failures with the same key can trigger fallback again.
if (sessionID && status?.type === "idle") {
lastHandledRetryStatusKey.delete(sessionID);
}
if (sessionID && status?.type === "retry" && isModelFallbackEnabled && !isRuntimeFallbackEnabled) {
try {
const retryMessage = typeof status.message === "string" ? status.message : "";
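The lifecycle described in the hunk above — record a retry key when it is handled, clear it on a non-retry idle so an identical later failure can trigger fallback again — can be sketched independently (names here are illustrative, not the plugin's actual identifiers):

```typescript
// Illustrative sketch of the retry-dedupe rearm lifecycle: a retry with a
// given key is handled once, duplicates are suppressed, and an idle status
// re-arms the key so a later identical failure is handled again.
const lastHandledRetryKey = new Map<string, string>()

function shouldHandleRetry(sessionID: string, retryKey: string): boolean {
  if (lastHandledRetryKey.get(sessionID) === retryKey) return false
  lastHandledRetryKey.set(sessionID, retryKey)
  return true
}

function onIdle(sessionID: string): void {
  // Recovery: forget the key so the same retry can fire fallback again.
  lastHandledRetryKey.delete(sessionID)
}
```

This is the behavior the "retry dedupe lifecycle" test below exercises end to end: the same retry status triggers abort/prompt twice because an idle status re-armed the key in between.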


@@ -26,7 +26,6 @@ import {
createPreemptiveCompactionHook,
createRuntimeFallbackHook,
} from "../../hooks"
import { createOpenClawSenderHook } from "../../hooks/openclaw-sender"
import { createAnthropicEffortHook } from "../../hooks/anthropic-effort"
import {
detectExternalNotificationPlugin,
@@ -61,7 +60,6 @@ export type SessionHooks = {
taskResumeInfo: ReturnType<typeof createTaskResumeInfoHook> | null
anthropicEffort: ReturnType<typeof createAnthropicEffortHook> | null
runtimeFallback: ReturnType<typeof createRuntimeFallbackHook> | null
openclawSender: ReturnType<typeof createOpenClawSenderHook> | null
}
export function createSessionHooks(args: {
@@ -263,11 +261,6 @@ export function createSessionHooks(args: {
pluginConfig,
}))
: null
const openclawSender = isHookEnabled("openclaw-sender") && pluginConfig.openclaw?.enabled
? safeHook("openclaw-sender", () => createOpenClawSenderHook(ctx, pluginConfig.openclaw!))
: null
return {
contextWindowMonitor,
preemptiveCompaction,
@@ -292,6 +285,5 @@ export function createSessionHooks(args: {
taskResumeInfo,
anthropicEffort,
runtimeFallback,
openclawSender,
}
}


@@ -14,6 +14,7 @@ import {
createHashlineReadEnhancerHook,
createReadImageResizerHook,
createJsonErrorRecoveryHook,
createTodoDescriptionOverrideHook,
} from "../../hooks"
import {
getOpenCodeVersion,
@@ -35,6 +36,7 @@ export type ToolGuardHooks = {
hashlineReadEnhancer: ReturnType<typeof createHashlineReadEnhancerHook> | null
jsonErrorRecovery: ReturnType<typeof createJsonErrorRecoveryHook> | null
readImageResizer: ReturnType<typeof createReadImageResizerHook> | null
todoDescriptionOverride: ReturnType<typeof createTodoDescriptionOverrideHook> | null
}
export function createToolGuardHooks(args: {
@@ -111,6 +113,10 @@ export function createToolGuardHooks(args: {
? safeHook("read-image-resizer", () => createReadImageResizerHook(ctx))
: null
const todoDescriptionOverride = isHookEnabled("todo-description-override")
? safeHook("todo-description-override", () => createTodoDescriptionOverrideHook())
: null
return {
commentChecker,
toolOutputTruncator,
@@ -123,5 +129,6 @@ export function createToolGuardHooks(args: {
hashlineReadEnhancer,
jsonErrorRecovery,
readImageResizer,
todoDescriptionOverride,
}
}


@@ -48,22 +48,50 @@ export function createToolExecuteAfterHandler(args: {
const prompt = typeof output.metadata?.prompt === "string" ? output.metadata.prompt : undefined
const verificationAttemptId = prompt?.match(VERIFICATION_ATTEMPT_PATTERN)?.[1]?.trim()
const loopState = directory ? readState(directory) : null
if (
const isVerificationContext =
agent === "oracle"
&& sessionId
&& verificationAttemptId
&& directory
&& !!sessionId
&& !!directory
&& loopState?.active === true
&& loopState.ultrawork === true
&& loopState.verification_pending === true
&& loopState.session_id === input.sessionID
log("[tool-execute-after] ULW verification tracking check", {
tool: input.tool,
agent,
parentSessionID: input.sessionID,
oracleSessionID: sessionId,
hasPromptInMetadata: typeof prompt === "string",
extractedVerificationAttemptId: verificationAttemptId,
})
if (
isVerificationContext
&& verificationAttemptId
&& loopState.verification_attempt_id === verificationAttemptId
) {
writeState(directory, {
...loopState,
verification_session_id: sessionId,
})
log("[tool-execute-after] Stored oracle verification session via attempt match", {
parentSessionID: input.sessionID,
oracleSessionID: sessionId,
verificationAttemptId,
})
} else if (isVerificationContext && !verificationAttemptId) {
writeState(directory, {
...loopState,
verification_session_id: sessionId,
})
log("[tool-execute-after] Fallback: stored oracle verification session without attempt match", {
parentSessionID: input.sessionID,
oracleSessionID: sessionId,
hasPromptInMetadata: typeof prompt === "string",
expectedAttemptId: loopState.verification_attempt_id,
extractedAttemptId: verificationAttemptId,
})
}
}
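The branch structure above stores the oracle session on an exact verification-attempt-id match and falls back to storing it when no attempt id could be extracted from the prompt at all; an extracted-but-mismatched id is skipped. A self-contained sketch of that decision (the function and its shape are illustrative):

```typescript
// Illustrative decision logic: prefer an exact verification-attempt match,
// fall back when the prompt carried no attempt id at all, and skip when an
// id was extracted but does not match the expected one.
function resolveVerificationSession(args: {
  isVerificationContext: boolean
  extractedAttemptId?: string
  expectedAttemptId?: string
  oracleSessionId: string
}): string | null {
  const { isVerificationContext, extractedAttemptId, expectedAttemptId, oracleSessionId } = args
  if (!isVerificationContext) return null
  if (extractedAttemptId) {
    return extractedAttemptId === expectedAttemptId ? oracleSessionId : null
  }
  // Fallback: metadata prompt carried no attempt id.
  return oracleSessionId
}
```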


@@ -33,7 +33,6 @@ export function createToolExecuteBeforeHandler(args: {
await hooks.prometheusMdOnly?.["tool.execute.before"]?.(input, output)
await hooks.sisyphusJuniorNotepad?.["tool.execute.before"]?.(input, output)
await hooks.atlasHook?.["tool.execute.before"]?.(input, output)
await hooks.openclawSender?.["tool.execute.before"]?.(input, output)
const normalizedToolName = input.tool.toLowerCase()
if (
@@ -80,6 +79,12 @@ export function createToolExecuteBeforeHandler(args: {
if (shouldInjectOracleVerification) {
const verificationAttemptId = randomUUID()
log("[tool-execute-before] Injecting ULW oracle verification attempt", {
sessionID: input.sessionID,
callID: input.callID,
verificationAttemptId,
loopSessionID: loopState.session_id,
})
writeState(ctx.directory, {
...loopState,
verification_attempt_id: verificationAttemptId,


@@ -19,6 +19,27 @@ describe("tool.execute.before ultrawork oracle verification", () => {
}
}
function createOracleTaskArgs(prompt: string): Record<string, unknown> {
return {
subagent_type: "oracle",
run_in_background: true,
prompt,
}
}
function createSyncTaskMetadata(
args: Record<string, unknown>,
sessionId: string,
): Record<string, unknown> {
return {
prompt: args.prompt,
agent: "oracle",
run_in_background: args.run_in_background,
sessionId,
sync: true,
}
}
test("#given ulw loop is awaiting verification #when oracle task runs #then oracle prompt is enforced and sync", async () => {
const directory = join(tmpdir(), `tool-before-ulw-${Date.now()}`)
mkdirSync(directory, { recursive: true })
@@ -38,13 +59,7 @@ describe("tool.execute.before ultrawork oracle verification", () => {
ctx: createCtx(directory) as unknown as Parameters<typeof createToolExecuteBeforeHandler>[0]["ctx"],
hooks: {} as Parameters<typeof createToolExecuteBeforeHandler>[0]["hooks"],
})
const output = {
args: {
subagent_type: "oracle",
run_in_background: true,
prompt: "Check it",
} as Record<string, unknown>,
}
const output = { args: createOracleTaskArgs("Check it") }
await handler({ tool: "task", sessionID: "ses-main", callID: "call-1" }, output)
@@ -64,13 +79,7 @@ describe("tool.execute.before ultrawork oracle verification", () => {
ctx: createCtx(directory) as unknown as Parameters<typeof createToolExecuteBeforeHandler>[0]["ctx"],
hooks: {} as Parameters<typeof createToolExecuteBeforeHandler>[0]["hooks"],
})
const output = {
args: {
subagent_type: "oracle",
run_in_background: true,
prompt: "Check it",
} as Record<string, unknown>,
}
const output = { args: createOracleTaskArgs("Check it") }
await handler({ tool: "task", sessionID: "ses-main", callID: "call-1" }, output)
@@ -80,7 +89,7 @@ describe("tool.execute.before ultrawork oracle verification", () => {
rmSync(directory, { recursive: true, force: true })
})
test("#given ulw loop is awaiting verification #when oracle task finishes #then oracle session id is stored", async () => {
test("#given ulw loop is awaiting verification #when oracle sync task metadata is persisted #then oracle session id is stored", async () => {
const directory = join(tmpdir(), `tool-after-ulw-${Date.now()}`)
mkdirSync(directory, { recursive: true })
writeState(directory, {
@@ -99,14 +108,44 @@ describe("tool.execute.before ultrawork oracle verification", () => {
ctx: createCtx(directory) as unknown as Parameters<typeof createToolExecuteBeforeHandler>[0]["ctx"],
hooks: {} as Parameters<typeof createToolExecuteBeforeHandler>[0]["hooks"],
})
const beforeOutput = {
args: {
subagent_type: "oracle",
run_in_background: true,
prompt: "Check it",
} as Record<string, unknown>,
}
const beforeOutput = { args: createOracleTaskArgs("Check it") }
await beforeHandler({ tool: "task", sessionID: "ses-main", callID: "call-1" }, beforeOutput)
const metadataFromSyncTask = createSyncTaskMetadata(beforeOutput.args, "ses-oracle")
const handler = createToolExecuteAfterHandler({
ctx: createCtx(directory) as unknown as Parameters<typeof createToolExecuteAfterHandler>[0]["ctx"],
hooks: {} as Parameters<typeof createToolExecuteAfterHandler>[0]["hooks"],
})
await handler(
{ tool: "task", sessionID: "ses-main", callID: "call-1" },
{
title: "oracle task",
output: "done",
metadata: metadataFromSyncTask,
},
)
expect(readState(directory)?.verification_session_id).toBe("ses-oracle")
clearState(directory)
rmSync(directory, { recursive: true, force: true })
})
test("#given ulw loop is awaiting verification #when oracle metadata prompt is missing #then oracle session fallback is stored", async () => {
const directory = join(tmpdir(), `tool-after-ulw-fallback-${Date.now()}`)
mkdirSync(directory, { recursive: true })
writeState(directory, {
active: true,
iteration: 3,
completion_promise: ULTRAWORK_VERIFICATION_PROMISE,
initial_completion_promise: "DONE",
started_at: new Date().toISOString(),
prompt: "Ship feature",
session_id: "ses-main",
ultrawork: true,
verification_pending: true,
})
const handler = createToolExecuteAfterHandler({
ctx: createCtx(directory) as unknown as Parameters<typeof createToolExecuteAfterHandler>[0]["ctx"],
@@ -120,13 +159,13 @@ describe("tool.execute.before ultrawork oracle verification", () => {
output: "done",
metadata: {
agent: "oracle",
prompt: String(beforeOutput.args.prompt),
sessionId: "ses-oracle",
sessionId: "ses-oracle-fallback",
sync: true,
},
},
)
expect(readState(directory)?.verification_session_id).toBe("ses-oracle")
expect(readState(directory)?.verification_session_id).toBe("ses-oracle-fallback")
clearState(directory)
rmSync(directory, { recursive: true, force: true })
@@ -156,23 +195,11 @@ describe("tool.execute.before ultrawork oracle verification", () => {
hooks: {} as Parameters<typeof createToolExecuteAfterHandler>[0]["hooks"],
})
const firstOutput = {
args: {
subagent_type: "oracle",
run_in_background: true,
prompt: "Check it",
} as Record<string, unknown>,
}
const firstOutput = { args: createOracleTaskArgs("Check it") }
await beforeHandler({ tool: "task", sessionID: "ses-main", callID: "call-1" }, firstOutput)
const firstAttemptId = readState(directory)?.verification_attempt_id
const secondOutput = {
args: {
subagent_type: "oracle",
run_in_background: true,
prompt: "Check it again",
} as Record<string, unknown>,
}
const secondOutput = { args: createOracleTaskArgs("Check it again") }
await beforeHandler({ tool: "task", sessionID: "ses-main", callID: "call-2" }, secondOutput)
const secondAttemptId = readState(directory)?.verification_attempt_id


@@ -1,45 +1,30 @@
/// <reference types="bun-types" />
import { beforeAll, beforeEach, afterEach, describe, expect, mock, test } from "bun:test"
import { beforeEach, afterEach, describe, expect, test } from "bun:test"
import { existsSync, mkdirSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs"
import { tmpdir } from "node:os"
import { join } from "node:path"
import * as dataPath from "./data-path"
import {
createConnectedProvidersCacheStore,
} from "./connected-providers-cache"
let fakeUserCacheRoot = ""
let testCacheDir = ""
let moduleImportCounter = 0
const getOmoOpenCodeCacheDirMock = mock(() => testCacheDir)
let updateConnectedProvidersCache: typeof import("./connected-providers-cache").updateConnectedProvidersCache
let readProviderModelsCache: typeof import("./connected-providers-cache").readProviderModelsCache
async function prepareConnectedProvidersCacheTestModule(): Promise<void> {
testCacheDir = mkdtempSync(join(tmpdir(), "connected-providers-cache-test-"))
getOmoOpenCodeCacheDirMock.mockClear()
mock.module("./data-path", () => ({
getOmoOpenCodeCacheDir: getOmoOpenCodeCacheDirMock,
}))
moduleImportCounter += 1
;({ updateConnectedProvidersCache, readProviderModelsCache } = await import(`./connected-providers-cache?test=${moduleImportCounter}`))
}
let testCacheStore: ReturnType<typeof createConnectedProvidersCacheStore>
describe("updateConnectedProvidersCache", () => {
beforeAll(() => {
mock.restore()
})
beforeEach(async () => {
mock.restore()
await prepareConnectedProvidersCacheTestModule()
beforeEach(() => {
fakeUserCacheRoot = mkdtempSync(join(tmpdir(), "connected-providers-user-cache-"))
testCacheDir = join(fakeUserCacheRoot, "oh-my-opencode")
testCacheStore = createConnectedProvidersCacheStore(() => testCacheDir)
})
afterEach(() => {
mock.restore()
if (existsSync(testCacheDir)) {
rmSync(testCacheDir, { recursive: true, force: true })
if (existsSync(fakeUserCacheRoot)) {
rmSync(fakeUserCacheRoot, { recursive: true, force: true })
}
fakeUserCacheRoot = ""
testCacheDir = ""
})
@@ -76,10 +61,10 @@ describe("updateConnectedProvidersCache", () => {
}
//#when
await updateConnectedProvidersCache(mockClient)
await testCacheStore.updateConnectedProvidersCache(mockClient)
//#then
const cache = readProviderModelsCache()
const cache = testCacheStore.readProviderModelsCache()
expect(cache).not.toBeNull()
expect(cache!.connected).toEqual(["openai", "anthropic"])
expect(cache!.models).toEqual({
@@ -109,10 +94,10 @@ describe("updateConnectedProvidersCache", () => {
}
//#when
await updateConnectedProvidersCache(mockClient)
await testCacheStore.updateConnectedProvidersCache(mockClient)
//#then
const cache = readProviderModelsCache()
const cache = testCacheStore.readProviderModelsCache()
expect(cache).not.toBeNull()
expect(cache!.models).toEqual({})
})
@@ -130,10 +115,10 @@ describe("updateConnectedProvidersCache", () => {
}
//#when
await updateConnectedProvidersCache(mockClient)
await testCacheStore.updateConnectedProvidersCache(mockClient)
//#then
const cache = readProviderModelsCache()
const cache = testCacheStore.readProviderModelsCache()
expect(cache).not.toBeNull()
expect(cache!.models).toEqual({})
})
@@ -143,25 +128,44 @@ describe("updateConnectedProvidersCache", () => {
const mockClient = {}
//#when
await updateConnectedProvidersCache(mockClient)
await testCacheStore.updateConnectedProvidersCache(mockClient)
//#then
const cache = readProviderModelsCache()
const cache = testCacheStore.readProviderModelsCache()
expect(cache).toBeNull()
})
test("does not remove the user's real cache directory during test setup", async () => {
test("does not remove unrelated files in the cache directory", async () => {
//#given
const realCacheDir = join(dataPath.getCacheDir(), "oh-my-opencode")
const realCacheDir = join(fakeUserCacheRoot, "oh-my-opencode")
const sentinelPath = join(realCacheDir, "connected-providers-cache.test-sentinel.json")
mkdirSync(realCacheDir, { recursive: true })
writeFileSync(sentinelPath, JSON.stringify({ keep: true }))
const mockClient = {
provider: {
list: async () => ({
data: {
connected: ["openai"],
all: [
{
id: "openai",
models: {
"gpt-5.4": { id: "gpt-5.4" },
},
},
],
},
}),
},
}
try {
//#when
await prepareConnectedProvidersCacheTestModule()
await testCacheStore.updateConnectedProvidersCache(mockClient)
//#then
expect(testCacheStore.readConnectedProvidersCache()).toEqual(["openai"])
expect(existsSync(sentinelPath)).toBe(true)
expect(readFileSync(sentinelPath, "utf-8")).toBe(JSON.stringify({ keep: true }))
} finally {


@@ -25,172 +25,190 @@ interface ProviderModelsCache {
updatedAt: string
}
function getCacheFilePath(filename: string): string {
return join(dataPath.getOmoOpenCodeCacheDir(), filename)
}
function ensureCacheDir(): void {
const cacheDir = dataPath.getOmoOpenCodeCacheDir()
if (!existsSync(cacheDir)) {
mkdirSync(cacheDir, { recursive: true })
}
}
/**
* Read the connected providers cache.
* Returns the list of connected provider IDs, or null if cache doesn't exist.
*/
export function readConnectedProvidersCache(): string[] | null {
const cacheFile = getCacheFilePath(CONNECTED_PROVIDERS_CACHE_FILE)
if (!existsSync(cacheFile)) {
log("[connected-providers-cache] Cache file not found", { cacheFile })
return null
export function createConnectedProvidersCacheStore(
getCacheDir: () => string = dataPath.getOmoOpenCodeCacheDir
) {
function getCacheFilePath(filename: string): string {
return join(getCacheDir(), filename)
}
try {
const content = readFileSync(cacheFile, "utf-8")
const data = JSON.parse(content) as ConnectedProvidersCache
log("[connected-providers-cache] Read cache", { count: data.connected.length, updatedAt: data.updatedAt })
return data.connected
} catch (err) {
log("[connected-providers-cache] Error reading cache", { error: String(err) })
return null
}
}
let memConnected: string[] | null | undefined
let memProviderModels: ProviderModelsCache | null | undefined
/**
* Check if connected providers cache exists.
*/
export function hasConnectedProvidersCache(): boolean {
const cacheFile = getCacheFilePath(CONNECTED_PROVIDERS_CACHE_FILE)
return existsSync(cacheFile)
}
/**
* Write the connected providers cache.
*/
function writeConnectedProvidersCache(connected: string[]): void {
ensureCacheDir()
const cacheFile = getCacheFilePath(CONNECTED_PROVIDERS_CACHE_FILE)
const data: ConnectedProvidersCache = {
connected,
updatedAt: new Date().toISOString(),
function ensureCacheDir(): void {
const cacheDir = getCacheDir()
if (!existsSync(cacheDir)) {
mkdirSync(cacheDir, { recursive: true })
}
}
try {
writeFileSync(cacheFile, JSON.stringify(data, null, 2))
log("[connected-providers-cache] Cache written", { count: connected.length })
} catch (err) {
log("[connected-providers-cache] Error writing cache", { error: String(err) })
}
}
function readConnectedProvidersCache(): string[] | null {
if (memConnected !== undefined) return memConnected
const cacheFile = getCacheFilePath(CONNECTED_PROVIDERS_CACHE_FILE)
/**
* Read the provider-models cache.
* Returns the cache data, or null if cache doesn't exist.
*/
export function readProviderModelsCache(): ProviderModelsCache | null {
const cacheFile = getCacheFilePath(PROVIDER_MODELS_CACHE_FILE)
if (!existsSync(cacheFile)) {
log("[connected-providers-cache] Provider-models cache file not found", { cacheFile })
return null
}
try {
const content = readFileSync(cacheFile, "utf-8")
const data = JSON.parse(content) as ProviderModelsCache
log("[connected-providers-cache] Read provider-models cache", {
providerCount: Object.keys(data.models).length,
updatedAt: data.updatedAt
})
return data
} catch (err) {
log("[connected-providers-cache] Error reading provider-models cache", { error: String(err) })
return null
}
}
/**
* Check if provider-models cache exists.
*/
export function hasProviderModelsCache(): boolean {
const cacheFile = getCacheFilePath(PROVIDER_MODELS_CACHE_FILE)
return existsSync(cacheFile)
}
/**
* Write the provider-models cache.
*/
export function writeProviderModelsCache(data: { models: Record<string, string[]>; connected: string[] }): void {
ensureCacheDir()
const cacheFile = getCacheFilePath(PROVIDER_MODELS_CACHE_FILE)
const cacheData: ProviderModelsCache = {
...data,
updatedAt: new Date().toISOString(),
}
try {
writeFileSync(cacheFile, JSON.stringify(cacheData, null, 2))
log("[connected-providers-cache] Provider-models cache written", {
providerCount: Object.keys(data.models).length
})
} catch (err) {
log("[connected-providers-cache] Error writing provider-models cache", { error: String(err) })
}
}
/**
* Update the connected providers cache by fetching from the client.
* Also updates the provider-models cache with model lists per provider.
*/
export async function updateConnectedProvidersCache(client: {
provider?: {
list?: () => Promise<{
data?: {
connected?: string[]
all?: Array<{ id: string; models?: Record<string, unknown> }>
}
}>
}
}): Promise<void> {
if (!client?.provider?.list) {
log("[connected-providers-cache] client.provider.list not available")
return
}
try {
const result = await client.provider.list()
const connected = result.data?.connected ?? []
log("[connected-providers-cache] Fetched connected providers", { count: connected.length, providers: connected })
writeConnectedProvidersCache(connected)
const modelsByProvider: Record<string, string[]> = {}
const allProviders = result.data?.all ?? []
for (const provider of allProviders) {
if (provider.models) {
const modelIds = Object.keys(provider.models)
if (modelIds.length > 0) {
modelsByProvider[provider.id] = modelIds
}
}
if (!existsSync(cacheFile)) {
log("[connected-providers-cache] Cache file not found", { cacheFile })
memConnected = null
return null
}
log("[connected-providers-cache] Extracted models from provider list", {
providerCount: Object.keys(modelsByProvider).length,
totalModels: Object.values(modelsByProvider).reduce((sum, ids) => sum + ids.length, 0),
})
try {
const content = readFileSync(cacheFile, "utf-8")
const data = JSON.parse(content) as ConnectedProvidersCache
log("[connected-providers-cache] Read cache", { count: data.connected.length, updatedAt: data.updatedAt })
memConnected = data.connected
return data.connected
} catch (err) {
log("[connected-providers-cache] Error reading cache", { error: String(err) })
memConnected = null
return null
}
}
writeProviderModelsCache({
models: modelsByProvider,
function hasConnectedProvidersCache(): boolean {
const cacheFile = getCacheFilePath(CONNECTED_PROVIDERS_CACHE_FILE)
return existsSync(cacheFile)
}
function writeConnectedProvidersCache(connected: string[]): void {
ensureCacheDir()
const cacheFile = getCacheFilePath(CONNECTED_PROVIDERS_CACHE_FILE)
const data: ConnectedProvidersCache = {
connected,
})
} catch (err) {
log("[connected-providers-cache] Error updating cache", { error: String(err) })
updatedAt: new Date().toISOString(),
}
try {
writeFileSync(cacheFile, JSON.stringify(data, null, 2))
memConnected = connected
log("[connected-providers-cache] Cache written", { count: connected.length })
} catch (err) {
log("[connected-providers-cache] Error writing cache", { error: String(err) })
}
}
function readProviderModelsCache(): ProviderModelsCache | null {
if (memProviderModels !== undefined) return memProviderModels
const cacheFile = getCacheFilePath(PROVIDER_MODELS_CACHE_FILE)
if (!existsSync(cacheFile)) {
log("[connected-providers-cache] Provider-models cache file not found", { cacheFile })
memProviderModels = null
return null
}
try {
const content = readFileSync(cacheFile, "utf-8")
const data = JSON.parse(content) as ProviderModelsCache
log("[connected-providers-cache] Read provider-models cache", {
providerCount: Object.keys(data.models).length,
updatedAt: data.updatedAt,
})
memProviderModels = data
return data
} catch (err) {
log("[connected-providers-cache] Error reading provider-models cache", { error: String(err) })
memProviderModels = null
return null
}
}
function hasProviderModelsCache(): boolean {
const cacheFile = getCacheFilePath(PROVIDER_MODELS_CACHE_FILE)
return existsSync(cacheFile)
}
function writeProviderModelsCache(data: { models: Record<string, string[]>; connected: string[] }): void {
ensureCacheDir()
const cacheFile = getCacheFilePath(PROVIDER_MODELS_CACHE_FILE)
const cacheData: ProviderModelsCache = {
...data,
updatedAt: new Date().toISOString(),
}
try {
writeFileSync(cacheFile, JSON.stringify(cacheData, null, 2))
memProviderModels = cacheData
log("[connected-providers-cache] Provider-models cache written", {
providerCount: Object.keys(data.models).length,
})
} catch (err) {
log("[connected-providers-cache] Error writing provider-models cache", { error: String(err) })
}
}
async function updateConnectedProvidersCache(client: {
provider?: {
list?: () => Promise<{
data?: {
connected?: string[]
all?: Array<{ id: string; models?: Record<string, unknown> }>
}
}>
}
}): Promise<void> {
if (!client?.provider?.list) {
log("[connected-providers-cache] client.provider.list not available")
return
}
try {
const result = await client.provider.list()
const connected = result.data?.connected ?? []
log("[connected-providers-cache] Fetched connected providers", {
count: connected.length,
providers: connected,
})
writeConnectedProvidersCache(connected)
const modelsByProvider: Record<string, string[]> = {}
const allProviders = result.data?.all ?? []
for (const provider of allProviders) {
if (provider.models) {
const modelIds = Object.keys(provider.models)
if (modelIds.length > 0) {
modelsByProvider[provider.id] = modelIds
}
}
}
log("[connected-providers-cache] Extracted models from provider list", {
providerCount: Object.keys(modelsByProvider).length,
totalModels: Object.values(modelsByProvider).reduce((sum, ids) => sum + ids.length, 0),
})
writeProviderModelsCache({
models: modelsByProvider,
connected,
})
} catch (err) {
log("[connected-providers-cache] Error updating cache", { error: String(err) })
}
}
return {
readConnectedProvidersCache,
hasConnectedProvidersCache,
readProviderModelsCache,
hasProviderModelsCache,
writeProviderModelsCache,
updateConnectedProvidersCache,
}
}
const defaultConnectedProvidersCacheStore = createConnectedProvidersCacheStore(
() => dataPath.getOmoOpenCodeCacheDir()
)
export const {
readConnectedProvidersCache,
hasConnectedProvidersCache,
readProviderModelsCache,
hasProviderModelsCache,
writeProviderModelsCache,
updateConnectedProvidersCache,
} = defaultConnectedProvidersCacheStore


@@ -74,7 +74,7 @@ export async function resolveFileReferencesInText(
let resolved = text
for (const [pattern, replacement] of replacements.entries()) {
resolved = resolved.split(pattern).join(replacement)
resolved = resolved.replaceAll(pattern, replacement)
}
if (findFileReferences(resolved).length > 0 && depth + 1 < maxDepth) {


@@ -1,16 +1,42 @@
// Shared logging utility for the plugin
import * as fs from "fs"
import * as os from "os"
import * as path from "path"
const logFile = path.join(os.tmpdir(), "oh-my-opencode.log")
let buffer: string[] = []
let flushTimer: ReturnType<typeof setTimeout> | null = null
const FLUSH_INTERVAL_MS = 500
const BUFFER_SIZE_LIMIT = 50
function flush(): void {
if (buffer.length === 0) return
const data = buffer.join("")
buffer = []
try {
fs.appendFileSync(logFile, data)
} catch {
}
}
function scheduleFlush(): void {
if (flushTimer) return
flushTimer = setTimeout(() => {
flushTimer = null
flush()
}, FLUSH_INTERVAL_MS)
}
export function log(message: string, data?: unknown): void {
try {
const timestamp = new Date().toISOString()
const logEntry = `[${timestamp}] ${message} ${data ? JSON.stringify(data) : ""}\n`
fs.appendFileSync(logFile, logEntry)
buffer.push(logEntry)
if (buffer.length >= BUFFER_SIZE_LIMIT) {
flush()
} else {
scheduleFlush()
}
} catch {
}
}


@@ -9,6 +9,8 @@ function escapeRegexExceptAsterisk(str: string): string {
   return str.replace(/[.+?^${}()|[\]\\]/g, "\\$&")
 }

+const regexCache = new Map<string, RegExp>()
+
 export function matchesToolMatcher(toolName: string, matcher: string): boolean {
   if (!matcher) {
     return true
@@ -17,8 +19,12 @@ export function matchesToolMatcher(toolName: string, matcher: string): boolean {
   return patterns.some((p) => {
     if (p.includes("*")) {
       // First escape regex special chars (except *), then convert * to .*
-      const escaped = escapeRegexExceptAsterisk(p)
-      const regex = new RegExp(`^${escaped.replace(/\*/g, ".*")}$`, "i")
+      let regex = regexCache.get(p)
+      if (!regex) {
+        const escaped = escapeRegexExceptAsterisk(p)
+        regex = new RegExp(`^${escaped.replace(/\*/g, ".*")}$`, "i")
+        regexCache.set(p, regex)
+      }
       return regex.test(toolName)
     }
     return p.toLowerCase() === toolName.toLowerCase()
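The cached variant can be exercised standalone. `escapeRegexExceptAsterisk` is copied from the hunk above; `wildcardMatch` is a trimmed stand-in for the wildcard branch of `matchesToolMatcher`:

```typescript
const regexCache = new Map<string, RegExp>()

function escapeRegexExceptAsterisk(str: string): string {
  return str.replace(/[.+?^${}()|[\]\\]/g, "\\$&")
}

function wildcardMatch(toolName: string, pattern: string): boolean {
  let regex = regexCache.get(pattern)
  if (!regex) {
    // Escape regex metacharacters (except *), then turn * into .*
    const escaped = escapeRegexExceptAsterisk(pattern)
    regex = new RegExp(`^${escaped.replace(/\*/g, ".*")}$`, "i")
    regexCache.set(pattern, regex)
  }
  return regex.test(toolName)
}

console.log(wildcardMatch("TodoWrite", "todo*")) // true (case-insensitive prefix)
console.log(wildcardMatch("bash", "todo*"))      // false
```

Each distinct pattern is compiled once; repeated calls hit the `Map` and reuse the same `RegExp` object, which is the whole point of the hot-path change.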

View File

@@ -1,4 +1,5 @@
 export const PLUGIN_NAME = "oh-my-opencode"
 export const LEGACY_PLUGIN_NAME = "oh-my-openagent"
 export const CONFIG_BASENAME = "oh-my-opencode"
 export const LOG_FILENAME = "oh-my-opencode.log"
+export const CACHE_DIR_NAME = "oh-my-opencode"

View File

@@ -1,4 +1,4 @@
-import { describe, it, expect, beforeAll, afterAll } from "bun:test"
+import { afterEach, beforeEach, describe, expect, it, spyOn } from "bun:test"
 import {
   isPortAvailable,
   findAvailablePort,
@@ -6,96 +6,283 @@ import {
   DEFAULT_SERVER_PORT,
 } from "./port-utils"

+const HOSTNAME = "127.0.0.1"
+const REAL_PORT_SEARCH_WINDOW = 200
+
+function supportsRealSocketBinding(): boolean {
+  try {
+    const server = Bun.serve({
+      port: 0,
+      hostname: HOSTNAME,
+      fetch: () => new Response("probe"),
+    })
+    server.stop(true)
+    return true
+  } catch {
+    return false
+  }
+}
+
+const canBindRealSockets = supportsRealSocketBinding()
+
 describe("port-utils", () => {
-  describe("isPortAvailable", () => {
-    it("#given unused port #when checking availability #then returns true", async () => {
-      const port = 59999
-      const result = await isPortAvailable(port)
-      expect(result).toBe(true)
-    })
-    it("#given port in use #when checking availability #then returns false", async () => {
-      const port = 59998
-      const blocker = Bun.serve({
+  if (canBindRealSockets) {
+    function startRealBlocker(port: number = 0) {
+      return Bun.serve({
         port,
-        hostname: "127.0.0.1",
+        hostname: HOSTNAME,
         fetch: () => new Response("blocked"),
       })
+    }
-      try {
-        const result = await isPortAvailable(port)
-        expect(result).toBe(false)
-      } finally {
-        blocker.stop(true)
+    async function findContiguousAvailableStart(length: number): Promise<number> {
+      const probe = startRealBlocker()
+      const seedPort = probe.port
+      probe.stop(true)
+      for (let candidate = seedPort; candidate < seedPort + REAL_PORT_SEARCH_WINDOW; candidate++) {
+        const checks = await Promise.all(
+          Array.from({ length }, async (_, offset) => isPortAvailable(candidate + offset, HOSTNAME))
+        )
+        if (checks.every(Boolean)) {
+          return candidate
+        }
       }
-    })
-  })
-  describe("findAvailablePort", () => {
-    it("#given start port available #when finding port #then returns start port", async () => {
-      const startPort = 59997
-      const result = await findAvailablePort(startPort)
-      expect(result).toBe(startPort)
-    })
+      throw new Error(`Could not find ${length} contiguous available ports`)
+    }
-    it("#given start port blocked #when finding port #then returns next available", async () => {
-      const startPort = 59996
-      const blocker = Bun.serve({
-        port: startPort,
-        hostname: "127.0.0.1",
-        fetch: () => new Response("blocked"),
+    describe("with real sockets", () => {
+      describe("isPortAvailable", () => {
+        it("#given unused port #when checking availability #then returns true", async () => {
+          const blocker = startRealBlocker()
+          const port = blocker.port
+          blocker.stop(true)
+          const result = await isPortAvailable(port)
+          expect(result).toBe(true)
+        })
+        it("#given port in use #when checking availability #then returns false", async () => {
+          const blocker = startRealBlocker()
+          const port = blocker.port
+          try {
+            const result = await isPortAvailable(port)
+            expect(result).toBe(false)
+          } finally {
+            blocker.stop(true)
+          }
+        })
+      })
-      try {
-        const result = await findAvailablePort(startPort)
-        expect(result).toBe(startPort + 1)
-      } finally {
-        blocker.stop(true)
-      }
-    })
+      describe("findAvailablePort", () => {
+        it("#given start port available #when finding port #then returns start port", async () => {
+          const startPort = await findContiguousAvailableStart(1)
+          const result = await findAvailablePort(startPort)
+          expect(result).toBe(startPort)
+        })
-    it("#given multiple ports blocked #when finding port #then skips all blocked", async () => {
-      const startPort = 59993
-      const blockers = [
-        Bun.serve({ port: startPort, hostname: "127.0.0.1", fetch: () => new Response() }),
-        Bun.serve({ port: startPort + 1, hostname: "127.0.0.1", fetch: () => new Response() }),
-        Bun.serve({ port: startPort + 2, hostname: "127.0.0.1", fetch: () => new Response() }),
-      ]
+        it("#given start port blocked #when finding port #then returns next available", async () => {
+          const startPort = await findContiguousAvailableStart(2)
+          const blocker = startRealBlocker(startPort)
-      try {
-        const result = await findAvailablePort(startPort)
-        expect(result).toBe(startPort + 3)
-      } finally {
-        blockers.forEach((b) => b.stop(true))
-      }
-    })
-  })
+          try {
+            const result = await findAvailablePort(startPort)
+            expect(result).toBe(startPort + 1)
+          } finally {
+            blocker.stop(true)
+          }
+        })
-  describe("getAvailableServerPort", () => {
-    it("#given preferred port available #when getting port #then returns preferred with wasAutoSelected=false", async () => {
-      const preferredPort = 59990
-      const result = await getAvailableServerPort(preferredPort)
-      expect(result.port).toBe(preferredPort)
-      expect(result.wasAutoSelected).toBe(false)
-    })
+        it("#given multiple ports blocked #when finding port #then skips all blocked", async () => {
+          const startPort = await findContiguousAvailableStart(4)
+          const blockers = [
+            startRealBlocker(startPort),
+            startRealBlocker(startPort + 1),
+            startRealBlocker(startPort + 2),
+          ]
-    it("#given preferred port blocked #when getting port #then returns alternative with wasAutoSelected=true", async () => {
-      const preferredPort = 59989
-      const blocker = Bun.serve({
-        port: preferredPort,
-        hostname: "127.0.0.1",
-        fetch: () => new Response("blocked"),
+          try {
+            const result = await findAvailablePort(startPort)
+            expect(result).toBe(startPort + 3)
+          } finally {
+            blockers.forEach((blocker) => blocker.stop(true))
+          }
+        })
+      })
-      try {
-        const result = await getAvailableServerPort(preferredPort)
-        expect(result.port).toBeGreaterThan(preferredPort)
-        expect(result.wasAutoSelected).toBe(true)
-      } finally {
-        blocker.stop(true)
-      }
+      describe("getAvailableServerPort", () => {
+        it("#given preferred port available #when getting port #then returns preferred with wasAutoSelected=false", async () => {
+          const preferredPort = await findContiguousAvailableStart(1)
+          const result = await getAvailableServerPort(preferredPort)
+          expect(result.port).toBe(preferredPort)
+          expect(result.wasAutoSelected).toBe(false)
+        })
+        it("#given preferred port blocked #when getting port #then returns alternative with wasAutoSelected=true", async () => {
+          const preferredPort = await findContiguousAvailableStart(2)
+          const blocker = startRealBlocker(preferredPort)
+          try {
+            const result = await getAvailableServerPort(preferredPort)
+            expect(result.port).toBe(preferredPort + 1)
+            expect(result.wasAutoSelected).toBe(true)
+          } finally {
+            blocker.stop(true)
+          }
+        })
     })
-  })
+      })
+    })
+  } else {
+    const blockedSockets = new Set<string>()
+    let serveSpy: ReturnType<typeof spyOn>
+
+    function getSocketKey(port: number, hostname: string): string {
+      return `${hostname}:${port}`
+    }
+
+    beforeEach(() => {
+      blockedSockets.clear()
+      serveSpy = spyOn(Bun, "serve").mockImplementation(({ port, hostname }) => {
+        if (typeof port !== "number") {
+          throw new Error("Test expected numeric port")
+        }
+        const resolvedHostname = typeof hostname === "string" ? hostname : HOSTNAME
+        const socketKey = getSocketKey(port, resolvedHostname)
+        if (blockedSockets.has(socketKey)) {
+          const error = new Error(`Failed to start server. Is port ${port} in use?`) as Error & {
+            code?: string
+            syscall?: string
+            errno?: number
+            address?: string
+            port?: number
+          }
+          error.code = "EADDRINUSE"
+          error.syscall = "listen"
+          error.errno = 0
+          error.address = resolvedHostname
+          error.port = port
+          throw error
+        }
+        blockedSockets.add(socketKey)
+        return {
+          stop: (_force?: boolean) => {
+            blockedSockets.delete(socketKey)
+          },
+        } as { stop: (force?: boolean) => void }
+      })
+    })
+
+    afterEach(() => {
+      expect(blockedSockets.size).toBe(0)
+      serveSpy.mockRestore()
+      blockedSockets.clear()
+    })
+
+    describe("with mocked sockets fallback", () => {
+      describe("isPortAvailable", () => {
+        it("#given unused port #when checking availability #then returns true", async () => {
+          const port = 59999
+          const result = await isPortAvailable(port)
+          expect(result).toBe(true)
+          expect(blockedSockets.size).toBe(0)
+        })
+        it("#given port in use #when checking availability #then returns false", async () => {
+          const port = 59998
+          const blocker = Bun.serve({
+            port,
+            hostname: HOSTNAME,
+            fetch: () => new Response("blocked"),
+          })
+          try {
+            const result = await isPortAvailable(port)
+            expect(result).toBe(false)
+          } finally {
+            blocker.stop(true)
+          }
+        })
+        it("#given custom hostname #when checking availability #then passes hostname through to Bun.serve", async () => {
+          const hostname = "192.0.2.10"
+          await isPortAvailable(59995, hostname)
+          expect(serveSpy.mock.calls[0]?.[0]?.hostname).toBe(hostname)
+        })
+      })
+      describe("findAvailablePort", () => {
+        it("#given start port available #when finding port #then returns start port", async () => {
+          const startPort = 59997
+          const result = await findAvailablePort(startPort)
+          expect(result).toBe(startPort)
+        })
+        it("#given start port blocked #when finding port #then returns next available", async () => {
+          const startPort = 59996
+          const blocker = Bun.serve({
+            port: startPort,
+            hostname: HOSTNAME,
+            fetch: () => new Response("blocked"),
+          })
+          try {
+            const result = await findAvailablePort(startPort)
+            expect(result).toBe(startPort + 1)
+          } finally {
+            blocker.stop(true)
+          }
+        })
+        it("#given multiple ports blocked #when finding port #then skips all blocked", async () => {
+          const startPort = 59993
+          const blockers = [
+            Bun.serve({ port: startPort, hostname: HOSTNAME, fetch: () => new Response() }),
+            Bun.serve({ port: startPort + 1, hostname: HOSTNAME, fetch: () => new Response() }),
+            Bun.serve({ port: startPort + 2, hostname: HOSTNAME, fetch: () => new Response() }),
+          ]
+          try {
+            const result = await findAvailablePort(startPort)
+            expect(result).toBe(startPort + 3)
+          } finally {
+            blockers.forEach((blocker) => blocker.stop(true))
+          }
+        })
+      })
+      describe("getAvailableServerPort", () => {
+        it("#given preferred port available #when getting port #then returns preferred with wasAutoSelected=false", async () => {
+          const preferredPort = 59990
+          const result = await getAvailableServerPort(preferredPort)
+          expect(result.port).toBe(preferredPort)
+          expect(result.wasAutoSelected).toBe(false)
+        })
+        it("#given preferred port blocked #when getting port #then returns alternative with wasAutoSelected=true", async () => {
+          const preferredPort = 59989
+          const blocker = Bun.serve({
+            port: preferredPort,
+            hostname: HOSTNAME,
+            fetch: () => new Response("blocked"),
+          })
+          try {
+            const result = await getAvailableServerPort(preferredPort)
+            expect(result.port).toBe(preferredPort + 1)
+            expect(result.wasAutoSelected).toBe(true)
+          } finally {
+            blocker.stop(true)
+          }
+        })
+      })
+    })
+  }
+
   describe("DEFAULT_SERVER_PORT", () => {
     it("#given constant #when accessed #then returns 4096", () => {
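The implementation of `port-utils` itself is not part of this diff, but both test branches exercise the same contract: try to bind, report availability, and step forward on failure. A minimal Node sketch of that probe (names and shape assumed; the plugin binds via `Bun.serve` instead of `node:net`):

```typescript
import { createServer } from "node:net"

// Try to bind to the port; any listen error (typically EADDRINUSE) means taken.
function isPortAvailable(port: number, hostname = "127.0.0.1"): Promise<boolean> {
  return new Promise((resolve) => {
    const probe = createServer()
    probe.once("error", () => resolve(false))
    probe.listen(port, hostname, () => {
      // Bound successfully: release the port and report it as available.
      probe.close(() => resolve(true))
    })
  })
}

// Scan upward from a start port, mirroring findAvailablePort's contract.
async function findAvailablePort(startPort: number, hostname = "127.0.0.1"): Promise<number> {
  for (let port = startPort; port < startPort + 100; port++) {
    if (await isPortAvailable(port, hostname)) return port
  }
  throw new Error(`No available port in [${startPort}, ${startPort + 100})`)
}
```

One difference worth noting: `Bun.serve` throws synchronously when the port is taken, which is why the mocked fallback above fabricates an `EADDRINUSE`-shaped error object rather than emitting an `error` event.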

View File

@@ -109,3 +109,44 @@ export function buildEnvPrefix(
     return ""
   }
 }
+
+/**
+ * Escape a value for use in a double-quoted shell -c command argument.
+ *
+ * In shell -c "..." strings, these characters have special meaning and must be escaped:
+ * - $ - variable expansion, command substitution $(...)
+ * - ` - command substitution `...`
+ * - \\ - escape character
+ * - " - end quote
+ * - ; | & - command separators
+ * - # - comment
+ * - () - grouping operators
+ *
+ * @param value - The value to escape
+ * @returns Escaped value safe for double-quoted shell -c argument
+ *
+ * @example
+ * ```ts
+ * // For malicious input
+ * const url = "http://localhost:3000'; cat /etc/passwd; echo '"
+ * const escaped = shellEscapeForDoubleQuotedCommand(url)
+ * // => "http://localhost:3000'\; cat /etc/passwd\; echo '"
+ *
+ * // Usage in command:
+ * const cmd = `/bin/sh -c "opencode attach ${escaped} --session ${sessionId}"`
+ * ```
+ */
+export function shellEscapeForDoubleQuotedCommand(value: string): string {
+  // Order matters: escape backslash FIRST, then other characters
+  return value
+    .replace(/\\/g, "\\\\") // escape backslash first
+    .replace(/\$/g, "\\$") // escape dollar sign
+    .replace(/`/g, "\\`") // escape backticks
+    .replace(/"/g, "\\\"") // escape double quotes
+    .replace(/;/g, "\\;") // escape semicolon (command separator)
+    .replace(/\|/g, "\\|") // escape pipe (command separator)
+    .replace(/&/g, "\\&") // escape ampersand (command separator)
+    .replace(/#/g, "\\#") // escape hash (comment)
+    .replace(/\(/g, "\\(") // escape parentheses
+    .replace(/\)/g, "\\)") // escape parentheses
+}
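Applying the chain to the `$(whoami)` case from the tests makes the effect concrete. The function body below is copied from this diff; the URL is just the example input:

```typescript
function shellEscapeForDoubleQuotedCommand(value: string): string {
  // Order matters: escape backslash first so later escapes aren't doubled.
  return value
    .replace(/\\/g, "\\\\")
    .replace(/\$/g, "\\$")
    .replace(/`/g, "\\`")
    .replace(/"/g, "\\\"")
    .replace(/;/g, "\\;")
    .replace(/\|/g, "\\|")
    .replace(/&/g, "\\&")
    .replace(/#/g, "\\#")
    .replace(/\(/g, "\\(")
    .replace(/\)/g, "\\)")
}

const url = "http://localhost:3000$(whoami)"
const escaped = shellEscapeForDoubleQuotedCommand(url)
console.log(escaped) // http://localhost:3000\$\(whoami\)
// Inside /bin/sh -c "...", \$ and \( are literal characters, so no subshell runs.
```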

View File

@@ -3,6 +3,7 @@ import type { TmuxConfig } from "../../../config/schema"
 import { getTmuxPath } from "../../../tools/interactive-bash/tmux-path-resolver"
 import type { SpawnPaneResult } from "../types"
 import { isInsideTmux } from "./environment"
+import { shellEscapeForDoubleQuotedCommand } from "../../shell-env"

 export async function replaceTmuxPane(
   paneId: string,
@@ -35,7 +36,8 @@ export async function replaceTmuxPane(
   await ctrlCProc.exited

   const shell = process.env.SHELL || "/bin/sh"
-  const opencodeCmd = `${shell} -c 'opencode attach ${serverUrl} --session ${sessionId}'`
+  const escapedUrl = shellEscapeForDoubleQuotedCommand(serverUrl)
+  const opencodeCmd = `${shell} -c "opencode attach ${escapedUrl} --session ${sessionId}"`

   const proc = spawn([tmux, "respawn-pane", "-k", "-t", paneId, opencodeCmd], {
     stdout: "pipe",
@@ -60,6 +62,7 @@ export async function replaceTmuxPane(
     const titleStderr = await stderrPromise
     log("[replaceTmuxPane] WARNING: failed to set pane title", {
       paneId,
+      title,
       exitCode: titleExitCode,
       stderr: titleStderr.trim(),
     })

View File

@@ -0,0 +1,96 @@
+import { describe, expect, it } from "bun:test"
+import { shellEscapeForDoubleQuotedCommand } from "../../shell-env"
+
+describe("given a serverUrl with shell metacharacters", () => {
+  describe("when building tmux spawn command with double quotes", () => {
+    it("then serverUrl is escaped to prevent shell injection", () => {
+      const serverUrl = "http://localhost:3000'; cat /etc/passwd; echo '"
+      const sessionId = "test-session"
+      const shell = "/bin/sh"
+      // Use double quotes for outer shell -c command, escape dangerous chars in URL
+      const escapedUrl = shellEscapeForDoubleQuotedCommand(serverUrl)
+      const opencodeCmd = `${shell} -c "opencode attach ${escapedUrl} --session ${sessionId}"`
+      // The semicolon should be escaped so it's treated as literal, not separator
+      expect(opencodeCmd).toContain("\\;")
+      // The malicious content should be escaped - semicolons are now \\;
+      expect(opencodeCmd).not.toMatch(/[^\\];\s*cat/)
+    })
+  })
+  describe("when building tmux replace command", () => {
+    it("then serverUrl is escaped to prevent shell injection", () => {
+      const serverUrl = "http://localhost:3000'; rm -rf /; '"
+      const sessionId = "test-session"
+      const shell = "/bin/sh"
+      const escapedUrl = shellEscapeForDoubleQuotedCommand(serverUrl)
+      const opencodeCmd = `${shell} -c "opencode attach ${escapedUrl} --session ${sessionId}"`
+      expect(opencodeCmd).toContain("\\;")
+      expect(opencodeCmd).not.toMatch(/[^\\];\s*rm/)
+    })
+  })
+})
+
+describe("given a normal serverUrl without shell metacharacters", () => {
+  describe("when building tmux spawn command", () => {
+    it("then serverUrl works correctly", () => {
+      const serverUrl = "http://localhost:3000"
+      const sessionId = "test-session"
+      const shell = "/bin/sh"
+      const escapedUrl = shellEscapeForDoubleQuotedCommand(serverUrl)
+      const opencodeCmd = `${shell} -c "opencode attach ${escapedUrl} --session ${sessionId}"`
+      expect(opencodeCmd).toContain(serverUrl)
+    })
+  })
+})
+
+describe("given a serverUrl with dollar sign (command injection)", () => {
+  describe("when building tmux command", () => {
+    it("then dollar sign is escaped properly", () => {
+      const serverUrl = "http://localhost:3000$(whoami)"
+      const sessionId = "test-session"
+      const shell = "/bin/sh"
+      const escapedUrl = shellEscapeForDoubleQuotedCommand(serverUrl)
+      const opencodeCmd = `${shell} -c "opencode attach ${escapedUrl} --session ${sessionId}"`
+      // The $ should be escaped to literal $
+      expect(opencodeCmd).toContain("\\$")
+    })
+  })
+})
+
+describe("given a serverUrl with backticks (command injection)", () => {
+  describe("when building tmux command", () => {
+    it("then backticks are escaped properly", () => {
+      const serverUrl = "http://localhost:3000`whoami`"
+      const sessionId = "test-session"
+      const shell = "/bin/sh"
+      const escapedUrl = shellEscapeForDoubleQuotedCommand(serverUrl)
+      const opencodeCmd = `${shell} -c "opencode attach ${escapedUrl} --session ${sessionId}"`
+      expect(opencodeCmd).toContain("\\`")
+    })
+  })
+})
+
+describe("given a serverUrl with pipe operator", () => {
+  describe("when building tmux command", () => {
+    it("then pipe is escaped properly", () => {
+      const serverUrl = "http://localhost:3000 | ls"
+      const sessionId = "test-session"
+      const shell = "/bin/sh"
+      const escapedUrl = shellEscapeForDoubleQuotedCommand(serverUrl)
+      const opencodeCmd = `${shell} -c "opencode attach ${escapedUrl} --session ${sessionId}"`
+      expect(opencodeCmd).toContain("\\|")
+    })
+  })
+})

View File

@@ -5,6 +5,7 @@ import type { SpawnPaneResult } from "../types"
 import type { SplitDirection } from "./environment"
 import { isInsideTmux } from "./environment"
 import { isServerRunning } from "./server-health"
+import { shellEscapeForDoubleQuotedCommand } from "../../shell-env"

 export async function spawnTmuxPane(
   sessionId: string,
@@ -49,7 +50,8 @@ export async function spawnTmuxPane(
   log("[spawnTmuxPane] all checks passed, spawning...")

   const shell = process.env.SHELL || "/bin/sh"
-  const opencodeCmd = `${shell} -c 'opencode attach ${serverUrl} --session ${sessionId}'`
+  const escapedUrl = shellEscapeForDoubleQuotedCommand(serverUrl)
+  const opencodeCmd = `${shell} -c "opencode attach ${escapedUrl} --session ${sessionId}"`

   const args = [
     "split-window",

View File

@@ -7,16 +7,22 @@ import * as connectedProvidersCache from "../../shared/connected-providers-cache"
 describe("resolveCategoryExecution", () => {
   let connectedProvidersSpy: ReturnType<typeof spyOn> | undefined
   let providerModelsSpy: ReturnType<typeof spyOn> | undefined
+  let hasConnectedProvidersSpy: ReturnType<typeof spyOn> | undefined
+  let hasProviderModelsSpy: ReturnType<typeof spyOn> | undefined

   beforeEach(() => {
     mock.restore()
     connectedProvidersSpy = spyOn(connectedProvidersCache, "readConnectedProvidersCache").mockReturnValue(null)
     providerModelsSpy = spyOn(connectedProvidersCache, "readProviderModelsCache").mockReturnValue(null)
+    hasConnectedProvidersSpy = spyOn(connectedProvidersCache, "hasConnectedProvidersCache").mockReturnValue(false)
+    hasProviderModelsSpy = spyOn(connectedProvidersCache, "hasProviderModelsCache").mockReturnValue(false)
   })

   afterEach(() => {
     connectedProvidersSpy?.mockRestore()
     providerModelsSpy?.mockRestore()
+    hasConnectedProvidersSpy?.mockRestore()
+    hasProviderModelsSpy?.mockRestore()
   })

   const createMockExecutorContext = (): ExecutorContext => ({
@@ -27,7 +33,7 @@ describe("resolveCategoryExecution", () => {
     sisyphusJuniorModel: undefined,
   })

-  test("returns clear error when category exists but required model is not available", async () => {
+  test("returns unpinned resolution when category cache is not ready on first run", async () => {
     //#given
     const args = {
       category: "deep",
@@ -39,6 +45,9 @@ describe("resolveCategoryExecution", () => {
       enableSkillTools: false,
     }
     const executorCtx = createMockExecutorContext()
+    executorCtx.userCategories = {
+      deep: {},
+    }
     const inheritedModel = undefined
     const systemDefaultModel = "anthropic/claude-sonnet-4-6"
@@ -46,10 +55,10 @@ describe("resolveCategoryExecution", () => {
     const result = await resolveCategoryExecution(args, executorCtx, inheritedModel, systemDefaultModel)
     //#then
-    expect(result.error).toBeDefined()
-    expect(result.error).toContain("deep")
-    expect(result.error).toMatch(/model.*not.*available|requires.*model/i)
-    expect(result.error).not.toContain("Unknown category")
+    expect(result.error).toBeUndefined()
+    expect(result.actualModel).toBeUndefined()
+    expect(result.categoryModel).toBeUndefined()
+    expect(result.agentToUse).toBeDefined()
   })

   test("returns 'unknown category' error for truly unknown categories", async () => {

Some files were not shown because too many files have changed in this diff.