YeonGyu-Kim
dd11d5df1b
refactor: compress plan template while recovering lost specificity guidelines
...
Reduce plan-template from 541 to 335 lines by removing redundant verbose
examples while recovering 3 lost context items: tool-type mapping table in
QA Policy, scenario specificity requirements (selectors/data/assertions/
timing/negative) in TODO template, and structured output format hints for
each Final Verification agent.
2026-02-16 15:46:00 +09:00
YeonGyu-Kim
130aaaf910
enhance: enforce mandatory per-task QA scenarios and add Final Verification Wave
...
Strengthen TODO template to make QA scenarios non-optional with explicit
rejection warning. Add Final Verification Wave with 4 parallel review
agents: oracle (plan compliance audit), unspecified-high (code quality),
unspecified-high (real manual QA), deep (scope fidelity check) — each
with detailed verification steps and structured output format.
2026-02-16 15:46:00 +09:00
YeonGyu-Kim
abfab1a78a
enhance: calibrate Prometheus plan granularity to 5-8 parallel tasks per wave
...
Add Maximum Parallelism Principle as a top-level constraint and replace
small-scale plan template examples (6 tasks, 3 waves) with production-scale
examples (24 tasks, 4 waves, max 7 concurrent) to steer the model toward
generating fine-grained, dependency-minimized plans by default.
2026-02-16 15:14:25 +09:00
YeonGyu-Kim
a691a3ac0a
refactor: migrate delegate_task to task tool with metadata fixes
...
- Rename delegate_task tool to task across codebase (100 files)
- Update model references: claude-opus-4-6 → 4-5, gpt-5.3-codex → 5.2-codex
- Add tool-metadata-store to restore metadata overwritten by fromPlugin()
- Add session ID polling for BackgroundManager task sessions
- Await async ctx.metadata() calls in tool executors
- Add ses_ prefix guard to getMessageDir for performance
- Harden BackgroundManager with idle deferral and error handling
- Fix duplicate task key in sisyphus-junior test object literals
- Fix unawaited showOutputToUser in ast_grep_replace
- Fix background=true → run_in_background=true in ultrawork prompt
- Fix duplicate task/task references in docs and comments
2026-02-06 21:35:30 +09:00
YeonGyu-Kim
ac9e22cce5
fix(prompts): add missing run_in_background and load_skills params to examples
...
All delegate_task examples now include required parameters to prevent
model confusion about parameter omission.
Fixes #1403
2026-02-03 10:50:26 +09:00
YeonGyu-Kim
e969ca5573
refactor(prometheus): replace binary verification with layered agent-executed QA
...
Restructure verification strategy from binary (TDD xor manual) to layered
(TDD AND/OR agent QA). Elevate zero-human-intervention as universal principle,
require per-scenario ultra-detailed QA format with named scenarios, negative
cases, and evidence capture. Remove ambiguous 'manual QA' terminology.
2026-02-02 14:18:01 +09:00
YeonGyu-Kim
f146aeff0f
refactor: major codebase cleanup - BDD comments, file splitting, bug fixes ( #1350 )
...
* style(tests): normalize BDD comments from '// #given' to '// given'
- Replace 4,668 Python-style BDD comments across 107 test files
- Patterns changed: // #given -> // given, // #when -> // when, // #then -> // then
- Also handles no-space variants: //#given -> // given
* fix(rules-injector): prefer output.metadata.filePath over output.title
- Extract file path resolution to dedicated output-path.ts module
- Prefer metadata.filePath which contains actual file path
- Fall back to output.title only when metadata unavailable
- Fixes issue where rules weren't injected when tool output title was a label
* feat(slashcommand): add optional user_message parameter
- Add user_message optional parameter for command arguments
- Model can now call: command='publish' user_message='patch'
- Improves error messages with clearer format guidance
- Helps LLMs understand correct parameter usage
* feat(hooks): restore compaction-context-injector hook
- Restore hook deleted in cbbc7bd0 for session compaction context
- Injects 7 mandatory sections: User Requests, Final Goal, Work Completed,
Remaining Tasks, Active Working Context, MUST NOT Do, Agent Verification State
- Re-register in hooks/index.ts and main plugin entry
* refactor(background-agent): split manager.ts into focused modules
- Extract constants.ts for TTL values and internal types (52 lines)
- Extract state.ts for TaskStateManager class (204 lines)
- Extract spawner.ts for task creation logic (244 lines)
- Extract result-handler.ts for completion handling (265 lines)
- Reduce manager.ts from 1377 to 755 lines (45% reduction)
- Maintain backward compatible exports
* refactor(agents): split prometheus-prompt.ts into subdirectory
- Move 1196-line prometheus-prompt.ts to prometheus/ subdirectory
- Organize prompt sections into separate files for maintainability
- Update agents/index.ts exports
* refactor(delegate-task): split tools.ts into focused modules
- Extract categories.ts for category definitions and routing
- Extract executor.ts for task execution logic
- Extract helpers.ts for utility functions
- Extract prompt-builder.ts for prompt construction
- Reduce tools.ts complexity with cleaner separation of concerns
* refactor(builtin-skills): split skills.ts into individual skill files
- Move each skill to dedicated file in skills/ subdirectory
- Create barrel export for backward compatibility
- Improve maintainability with focused skill modules
* chore: update import paths and lockfile
- Update prometheus import path after refactor
- Update bun.lock
* fix(tests): complete BDD comment normalization
- Fix remaining #when/#then patterns missed by initial sed
- Affected: state.test.ts, events.test.ts
---------
Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com >
2026-02-01 16:47:50 +09:00