Commit Graph

742 Commits

Author SHA1 Message Date
justsisyphus
e23ce11df9 feat: allow Sisyphus-Junior to call sisyphus_task 2026-01-16 17:14:31 +09:00
justsisyphus
f1cdb3bce1 feat: global sisyphus_task deny with orchestrator exceptions
- Add sisyphus_task: deny to global config.permission
- Add sisyphus_task: allow exception for orchestrator-sisyphus, Sisyphus, and Prometheus (Planner)
- Ensures only orchestrator agents can spawn sisyphus_task subagents
2026-01-16 17:13:08 +09:00
justsisyphus
83cbc56709 refactor: remove legacy tools format, use permission only
BREAKING: Requires OpenCode 1.1.1+

- Remove supportsNewPermissionSystem/usesLegacyToolsSystem checks
- Simplify permission-compat.ts to permission format only
- Unify explore/librarian deny lists: write, edit, task, sisyphus_task, call_omo_agent
- Add sisyphus_task to oracle deny list
- Update agent-tool-restrictions.ts with correct per-agent restrictions
- Clean config-handler.ts conditional version checks
- Update tests for simplified API
2026-01-16 17:11:34 +09:00
justsisyphus
ede9abceb3 feat(multimodal-looker): restrict to read-only tool access
Use createAgentToolAllowlist to allow only 'read' tool for multimodal-looker agent.
Previously denied write/edit/bash but allowed other tools.
Now uses wildcard deny pattern (*: deny) with explicit read allow.

- Add createAgentToolAllowlist function for allowlist-based restrictions
- Support legacy fallback for older OpenCode versions
- Add 4 test cases covering both permission systems
2026-01-16 15:02:55 +09:00
justsisyphus
27ef9fa8df feat(orchestrator): emphasize project-level lsp_diagnostics and QA verification
- Add mandatory PROJECT-LEVEL code checks (lsp_diagnostics at src/ or . level)
- Strengthen verification duties with explicit QA checklist
- Add 'SUBAGENTS LIE - VERIFY EVERYTHING' reminders throughout
- Emphasize that only orchestrator sees full picture of cross-file impacts
2026-01-16 14:11:56 +09:00
justsisyphus
333db56172 refactor(agents): remove lsp_diagnostics from Sisyphus and Sisyphus-Junior prompts
Orchestrator Sisyphus will handle project-level code validation instead of
having each subagent run file-level lsp_diagnostics.
2026-01-16 14:09:28 +09:00
justsisyphus
1ecb2bafdf fix(hooks): prevent start-work false trigger from command description
- Remove 'Start Sisyphus work session' text check, keep only <session-context> tag
- Update interactive_bash description with WARNING: TMUX ONLY emphasis
- Update tests to use <session-context> wrapper
2026-01-16 14:01:29 +09:00
justsisyphus
d00c2e7439 fix(hooks): extract model from assistant messages with flat modelID/providerID
OpenCode API returns different structures for user vs assistant messages:
- User: info.model = { providerID, modelID } (nested)
- Assistant: info.modelID, info.providerID (flat top-level)

Previous code only checked nested format, causing model info loss when
continuation hooks fired after assistant messages.

Files modified:
- todo-continuation-enforcer.ts
- ralph-loop/index.ts
- sisyphus-task/tools.ts
- background-agent/manager.ts

Added test for assistant message model extraction.
2026-01-16 13:54:22 +09:00
justsisyphus
8d545723dc refactor(orchestrator): restructure post-verification workflow as Step 4-6
- Unified verification (Step 1-3) and post-verification (Step 4-6) into continuous workflow
- Step 4: Immediate plan file marking after verification passes
- Step 5: Commit atomic unit
- Step 6: Proceed to next task
- Emphasized immediacy: 'RIGHT NOW - Do not delay'
- Applied to both boulder state and standalone reminder contexts
2026-01-16 13:48:18 +09:00
justsisyphus
e737477fbe feat(prometheus): strengthen plan-mode constraints with constraint-first architecture
- Move Turn Termination Rules inside <system-reminder> block (from line 488 to ~186)
- Add Final Constraint Reminder at end of prompt (constraint sandwich pattern)
- Preserve all existing interview mode detail and strategies

Applies OpenCode's effective constraint patterns to prevent plan-mode agents
from offering to implement work instead of staying in consultation mode.
2026-01-16 13:36:46 +09:00
justsisyphus
aa859f8cdd feat(sisyphus-task): require explicit skills parameter - reject empty array []
- Change skills type from string[] to string[] | null
- Empty array [] now returns error with available skills list
- null is allowed for tasks that genuinely need no skills
- Updated tests to use skills: null instead of skills: []
- Forces explicit decision: either specify skills or justify with null
2026-01-16 13:12:48 +09:00
justsisyphus
c282244439 fix: store session agent in chat.message for prometheus-md-only hook
The prometheus-md-only hook was not enforcing file restrictions because
getSessionAgent() returned undefined - setSessionAgent was only called
in message.updated event which doesn't always provide agent info.

- Add setSessionAgent call in chat.message hook when input.agent exists
- Add session state tests for setSessionAgent/getSessionAgent
- Add clearSessionAgent cleanup to prometheus-md-only tests

This ensures prometheus-md-only hook can reliably identify Prometheus
sessions and enforce .sisyphus/*.md write restrictions.
2026-01-16 11:35:37 +09:00
justsisyphus
75925d5433 fix: clear session agent on /start-work to allow mode transition from Prometheus
When transitioning from Prometheus (Planner) to Sisyphus via /start-work,
the session agent was not being cleared. This caused prometheus-md-only
hook to continue injecting READ-ONLY constraints into sisyphus_task calls.

- Add clearSessionAgent() call when start-work command is detected
- Add TDD test verifying clearSessionAgent is called with sessionID
2026-01-16 11:35:37 +09:00
justsisyphus
c7ca608b38 refactor: unify system directive prefix for keyword-detector filtering
- Add shared/system-directive.ts with SYSTEM_DIRECTIVE_PREFIX constant
- Unify all system message prefixes to [SYSTEM DIRECTIVE: OH-MY-OPENCODE - ...]
- Add isSystemDirective() filter to keyword-detector to skip system messages
- Update prometheus-md-only tests to use new prefix constants
2026-01-16 11:35:37 +09:00
justsisyphus
b933992e36 refactor: remove dcp_for_compaction and preemptive_compaction features
- Delete src/hooks/preemptive-compaction/ entirely
- Remove dcp_for_compaction from schema and executor
- Clean up related imports, options, and test code
- Update READMEs to remove experimental options docs
2026-01-16 11:35:37 +09:00
justsisyphus
bf28b3e711 fix: ensure Sisyphus agent has call_omo_agent disabled
The tools restriction was defined in sisyphus.ts but not enforced in
config-handler.ts like other agents (orchestrator-sisyphus, Prometheus).
Added explicit tools setting to guarantee call_omo_agent is disabled.
2026-01-16 11:35:37 +09:00
justsisyphus
9363324e0e refactor(lsp): clean up lsp_servers references and update prompts to use PascalCase
- Remove dead lsp_servers function from tools.ts
- Update utils.ts to reference LspServers (OpenCode built-in)
- Update AGENTS.md: 7 tools → 3 tools
- Update init-deep.ts prompts to use PascalCase OpenCode tools
- Update refactor.ts prompts to use PascalCase OpenCode tools
2026-01-16 11:35:37 +09:00
Kenny
f888da8848 Merge pull request #833 from KNN-07/fix/git-master-watermark-injection
fix(git-master): inject watermark only when enabled instead of overriding defaults
2026-01-15 20:36:18 -05:00
justsisyphus
584aecf266 refactor(config): disable unused OpenCode built-in LSP tools
LspHover, LspCodeActions, LspCodeActionResolve are disabled globally as they are not needed when using oh-my-opencode's curated LSP tools.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-16 10:33:44 +09:00
justsisyphus
848b2e3faa refactor(lsp): remove lsp_servers - duplicates OpenCode's LspServers
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-16 10:33:35 +09:00
Nguyen Khac Trung Kien
e36385e671 fix(git-master): inject watermark only when enabled instead of overriding defaults
The watermark (commit footer and co-author) was inconsistently applied because:
1. The skill tool didn't receive gitMasterConfig
2. The approach was 'default ON, inject DISABLED override' which LLMs sometimes ignored

This refactors to 'inject only when enabled' approach:
- Remove hardcoded watermark section from base templates
- Dynamically inject section 5.5 based on config values
- Default is still ON (both true when no config)
- When both disabled, no injection occurs (clean prompt)

Also fixes missing config propagation to skill tool and createBuiltinAgents.
2026-01-16 08:01:04 +07:00
Jeremy Gollehon
837176d947 Merge pull request #803 from GollyJer/concurrency-hardening
feat(concurrency): prevent background task races and leaks

Summary
Fixes race conditions and memory leaks in the background task system that could cause "all tasks complete" notifications to never fire, leaving parent sessions waiting indefinitely.

Why This Change?
When background tasks are tracked for completion notifications, the system maintains a pendingByParent map to know when all tasks for a parent session are done. Several edge cases caused "stale entries" to accumulate in this map:
1. Re-registering completed tasks added them back to pending tracking, but they'd never complete again
2. Changing a task's parent session left orphan entries in the old parent's tracking set
3. Concurrent task operations could cause double-acquisition of concurrency slots
These bugs meant the system would sometimes wait forever for tasks that were already done.

What Changed
- Concurrency management: Added proper acquire/release lifecycle with cleanup on process exit (SIGINT, SIGTERM)
- Parent session tracking: Fixed cleanup order. Now clears old parent's tracking before updating parent ID
- Stale entry prevention: Only tracks tasks that are actually running; actively cleans up completed tasks
- Renamed registerExternalTask → trackTask: Clearer name (the old name implied external API consumers, but it's internal)
2026-01-15 11:09:52 -08:00
Jeremy Gollehon
8e2410f1a0 refactor(background-agent): rename registerExternalTask to trackTask
Update BackgroundManager to rename the method for tracking external tasks, improving clarity and consistency in task management. Adjust related tests to reflect the new method name.
2026-01-15 10:53:08 -08:00
justsisyphus
e264cd5078 test: skip flaky timeout test in CI 2026-01-16 02:45:41 +09:00
justsisyphus
c9f762f980 fix(hooks): use API instead of filesystem to resolve model info for session.prompt
Previously, continuation hooks (todo-continuation, boulder-continuation, ralph-loop)
and background tasks resolved model info from filesystem cache, which could be stale
or missing. This caused session.prompt to fallback to default model (Sonnet) instead
of using the originally configured model (e.g., Opus).

Now all session.prompt calls first try API (session.messages) to get current model
info, with filesystem as fallback if API fails.

Affected files:
- todo-continuation-enforcer.ts
- sisyphus-orchestrator/index.ts
- ralph-loop/index.ts
- background-agent/manager.ts
- sisyphus-task/tools.ts
- hook-message-injector/index.ts (export ToolPermission type)
2026-01-16 02:33:44 +09:00
justsisyphus
48167a6920 refactor(lsp): remove duplicate LSP tools already provided by OpenCode
Remove lsp_goto_definition, lsp_find_references, lsp_symbols as they
duplicate OpenCode's built-in LspGotoDefinition, LspFindReferences,
LspDocumentSymbols, and LspWorkspaceSymbols.

Keep oh-my-opencode-specific tools:
- lsp_diagnostics (OpenCode lacks pull diagnostics)
- lsp_servers (server management)
- lsp_prepare_rename / lsp_rename (OpenCode lacks rename)

Clean up associated dead code:
- Client methods: hover, definition, references, symbols, codeAction
- Utils: formatLocation, formatSymbolKind, formatDocumentSymbol, etc.
- Types: Location, LocationLink, SymbolInfo, DocumentSymbol, etc.
- Constants: SYMBOL_KIND_MAP, DEFAULT_MAX_REFERENCES/SYMBOLS

-418 lines removed.
2026-01-16 02:17:00 +09:00
justsisyphus
207a39b17a fix(skill): unify skill resolution to support user custom skills
sisyphus_task was only loading builtin skills via resolveMultipleSkills().
Now uses resolveMultipleSkillsAsync() which merges discoverSkills() + builtin skills.

- Add getAllSkills(), extractSkillTemplate(), resolveMultipleSkillsAsync()
- Update sisyphus_task to use async skill resolution
- Refactor skill tool to reuse unified getAllSkills()
- Add async skill resolution tests
2026-01-16 01:57:57 +09:00
Jeon Suyeol
c38b078c12 fix(test): isolate environment in non-interactive-env hook tests (#822)
- Delete PSModulePath in beforeEach() to prevent CI cross-platform detection
- Set SHELL=/bin/bash to ensure tests start with clean Unix-like environment
- Fixes flaky test failures on GitHub Actions CI runners
- Tests can still override these values for PowerShell-specific behavior
2026-01-16 00:54:53 +09:00
Kenny
5e44996746 Merge pull request #813 from KNN-07/fix/start-work-ultrawork-plan-confusion
fix(start-work): honor explicit plan name and strip ultrawork keywords
2026-01-15 10:42:45 -05:00
Kenny
c67ca8275e feat: Bun single-file executable distribution (#819)
* feat: add Bun single-file executable distribution

- Add 7 platform packages for standalone CLI binaries
- Add bin/platform.js for shared platform detection
- Add bin/oh-my-opencode.js ESM wrapper
- Add postinstall.mjs for binary verification
- Add script/build-binaries.ts for cross-compilation
- Update publish workflow for multi-package publishing
- Add CI guard against @ast-grep/napi in CLI
- Add unit tests for platform detection (12 tests)
- Update README to remove Bun runtime requirement

Platforms supported:
- macOS ARM64 & x64
- Linux x64 & ARM64 (glibc)
- Linux x64 & ARM64 (musl/Alpine)
- Windows x64

Closes #816

* chore: remove unnecessary @ast-grep/napi CI check

* chore: gitignore compiled platform binaries

* fix: use require() instead of top-level await import() for Bun compile compatibility

* refactor: use static ESM import for package.json instead of require()
2026-01-16 00:33:07 +09:00
Kenny
b8a8cc95e2 fix(sisyphus-orchestrator): update test assertions to match new prompt text
Update 5 test assertions to use stable substrings following section-markers
best practice:
- "MANDATORY VERIFICATION" → "MANDATORY:" (2 places)
- "SUBAGENTS LIE" → "LIE" (1 place)
- "0 left" → "0 remaining" (1 place)
- "2 left" → "2 remaining" (1 place)

Fixes test failures introduced in 9bed597.
2026-01-15 08:40:28 -05:00
justsisyphus
9bed597e46 feat(prompts): strengthen post-task reminders with actionable guidance
- Rewrite VERIFICATION_REMINDER with 3-step action flow (verify → determine QA → add to todo)
- Add explicit BLOCKING directive to prevent premature task progression
- Enhance buildOrchestratorReminder with clear post-verification actions
- Improve capture-pane block message with concrete Bash examples
2026-01-15 19:40:50 +09:00
justsisyphus
74f355322a feat(sisyphus_task): enhance error messages with detailed context
Add formatDetailedError helper that includes:
- Full args dump (description, category, agent, skills)
- Session ID and agent info
- Stack trace (first 10 lines)

Applied to all catch blocks for better debugging.
2026-01-15 19:02:28 +09:00
Nguyen Khac Trung Kien
e925ed0009 fix(start-work): honor explicit plan name and strip ultrawork keywords
When user types '/start-work my-plan ultrawork', the hook now:

1. Extracts plan name from <user-request> section
2. Strips ultrawork/ulw keywords from the plan name
3. Searches for matching plan (exact then partial match)
4. Uses the matched plan instead of resuming stale boulder state

This fixes the bug where '/start-work [PLAN] ultrawork' would:
- Include 'ultrawork' as part of the plan name argument
- Ignore the explicit plan and resume an old stale plan from boulder.json

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-15 16:55:44 +07:00
Jeremy Gollehon
b5bd837025 fix(background-agent): improve parent session ID handling in task management
Enhance the BackgroundManager to properly clean up pending tasks when the parent session ID changes. This prevents stale entries in the pending notifications and ensures that the cleanup process is only executed when necessary, improving overall task management reliability.
2026-01-15 00:16:35 -08:00
justsisyphus
abc4a34ce4 fix(sisyphus): enforce HARD BLOCK for frontend visual changes
Restore zero-tolerance policy for visual/styling changes in frontend files.
Visual keyword detection now triggers mandatory delegation to frontend-ui-ux-engineer.
2026-01-15 17:11:30 +09:00
Jeremy Gollehon
7168c2d904 fix(background-agent): prevent stale entries in pending notifications
Update BackgroundManager to track batched notifications only for running tasks. Implement cleanup for completed or cancelled tasks to avoid stale entries in pending notifications. Enhance logging to include task status for better debugging.
2026-01-14 23:51:19 -08:00
Jeremy Gollehon
7050d447cd feat(background-agent): implement process cleanup for BackgroundManager
Add functionality to manage process cleanup by registering and unregistering signal listeners. This ensures that BackgroundManager instances properly shut down and remove their listeners on process exit. Introduce tests to verify listener removal after shutdown.
2026-01-14 23:11:38 -08:00
Sisyphus
d6499cbe31 fix(non-interactive-env): add Windows/PowerShell support (#573)
* fix(non-interactive-env): add Windows/PowerShell support

- Create shared shell-env utility with cross-platform shell detection
- Detect shell type via PSModulePath, SHELL env vars, platform fallback
- Support Unix (export), PowerShell ($env:), and cmd.exe (set) syntax
- Add 41 comprehensive unit tests for shell-env utilities
- Add 5 cross-platform integration tests for hook behavior
- All 696 tests pass, type checking passes, build succeeds

Closes #566

* fix: address review feedback - add isNonInteractive check and cmd.exe % escaping

- Add isNonInteractive() check to only apply env vars in CI/non-interactive contexts (Issue #566)
- Fix cmd.exe percent sign escaping to prevent environment variable expansion
- Update test expectations for correct % escaping behavior

Resolves feedback from @greptile-apps and @cubic-dev-ai

---------

Co-authored-by: sisyphus-dev-ai <sisyphus-dev-ai@users.noreply.github.com>
2026-01-15 16:04:06 +09:00
justsisyphus
a38dc28e40 docs: update AGENTS documentation metadata and agent model references
- Update generated timestamp and commit hash metadata
- Normalize agent model names with provider prefixes (anthropic/, opencode/, google/)
- Remove deprecated Google OAuth/Antigravity references
- Update line counts and complexity hotspot entries
- Adjust test count from 82 to 80+ files and assertions from 2559+ to 2500+

🤖 Generated with assistance of oh-my-opencode
2026-01-15 15:53:51 +09:00
Jeremy Gollehon
4ac0fa7bb0 fix(background-agent): preserve external concurrency keys
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-14 22:40:16 -08:00
Jeremy Gollehon
c1246f61d1 feat(background-agent): add concurrency group field
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-14 22:40:14 -08:00
justsisyphus
89fa9ff167 fix(look-at): add path alias and validation for LLM compatibility
LLMs often call look_at with 'path' instead of 'file_path' parameter,
causing TypeError and infinite retry loops.

- Add normalizeArgs() to accept both 'path' and 'file_path'
- Add validateArgs() with clear error messages showing correct usage
- Add tests for normalization and validation

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-15 14:49:56 +09:00
Jeremy Gollehon
03871262b2 feat(concurrency): prevent background task races and leaks
Ensure queue waiters settle once, centralize completion with status guards, and release slots before async work so shutdown and cancellations don’t leak concurrency. Internal hardening only.
2026-01-14 21:35:01 -08:00
justsisyphus
4c22d6de76 fix(todo-continuation): preserve model when injecting continuation prompt 2026-01-15 14:30:48 +09:00
justsisyphus
1dd369fda5 perf(comment-checker): lazy init for CLI path resolution
Remove COMMENT_CHECKER_CLI_PATH constant that was blocking on module import.
Replace with getCommentCheckerPathSync() lazy function call.

- Defer file system sync calls (existsSync, require.resolve) to first use
- Add cli.test.ts with 4 BDD tests for lazy behavior verification
2026-01-15 14:23:15 +09:00
Kenny
84e97ba900 Merge pull request #801 from stranger2904/fix/session-cursor-output
fix: add session cursor for tool output
2026-01-14 21:25:40 -05:00
Kenny
ef65f405e8 fix: clean up session cursor state on session deletion
Add resetMessageCursor call in session.deleted handler to prevent
unbounded memory growth from orphaned cursor entries.
2026-01-14 21:17:41 -05:00
Kenny
3de559ff87 refactor: rename getNewMessages to consumeNewMessages
Rename to signal mutation behavior - the function advances the cursor
as a side effect, so 'consume' better reflects that calling it twice
with the same input yields different results.
2026-01-14 21:06:26 -05:00
Aleksey Bragin
acb16bcb27 fix: reset cursor when history changes 2026-01-14 19:58:56 -05:00