/**
 * Prometheus Plan Template
 *
 * The markdown template structure for work plans generated by Prometheus.
 * Includes TL;DR, context, objectives, verification strategy, TODOs, and success criteria.
 */

export const PROMETHEUS_PLAN_TEMPLATE = `## Plan Structure

Generate plan to: \`.sisyphus/plans/{name}.md\`

\`\`\`markdown
# {Plan Title}

## TL;DR

> **Quick Summary**: [1-2 sentences capturing the core objective and approach]
>
> **Deliverables**: [Bullet list of concrete outputs]
> - [Output 1]
> - [Output 2]
>
> **Estimated Effort**: [Quick | Short | Medium | Large | XL]
> **Parallel Execution**: [YES - N waves | NO - sequential]
> **Critical Path**: [Task X → Task Y → Task Z]

---

## Context

### Original Request
[User's initial description]

### Interview Summary
**Key Discussions**:
- [Point 1]: [User's decision/preference]
- [Point 2]: [Agreed approach]

**Research Findings**:
- [Finding 1]: [Implication]
- [Finding 2]: [Recommendation]

### Metis Review
**Identified Gaps** (addressed):
- [Gap 1]: [How resolved]
- [Gap 2]: [How resolved]

## Work Objectives

### Core Objective
[1-2 sentences: what we're achieving]

### Concrete Deliverables
- [Exact file/endpoint/feature]

### Definition of Done
- [ ] [Verifiable condition with command]

### Must Have
- [Non-negotiable requirement]

### Must NOT Have (Guardrails)
- [Explicit exclusion from Metis review]
- [AI slop pattern to avoid]
- [Scope boundary]

---

## Verification Strategy (MANDATORY)

> **UNIVERSAL RULE: ZERO HUMAN INTERVENTION**
>
> ALL tasks in this plan MUST be verifiable WITHOUT any human action.
> This is NOT conditional — it applies to EVERY task, regardless of test strategy.
>
> **FORBIDDEN** — acceptance criteria that require:
> - "User manually tests..." / "사용자가 직접 테스트..."
> - "User visually confirms..." / "사용자가 눈으로 확인..."
> - "User interacts with..." / "사용자가 직접 조작..."
> - "Ask user to verify..." / "사용자에게 확인 요청..."
> - ANY step where a human must perform an action
>
> **ALL verification is executed by the agent** using tools (Playwright, interactive_bash, curl, etc.). No exceptions.

### Test Decision
- **Infrastructure exists**: [YES/NO]
- **Automated tests**: [TDD / Tests-after / None]
- **Framework**: [bun test / vitest / jest / pytest / none]

### If TDD Enabled

Each TODO follows RED-GREEN-REFACTOR:

**Task Structure:**
1. **RED**: Write failing test first
   - Test file: \`[path].test.ts\`
   - Test command: \`bun test [file]\`
   - Expected: FAIL (test exists, implementation doesn't)
2. **GREEN**: Implement minimum code to pass
   - Command: \`bun test [file]\`
   - Expected: PASS
3. **REFACTOR**: Clean up while keeping green
   - Command: \`bun test [file]\`
   - Expected: PASS (still)

**Test Setup Task (if infrastructure doesn't exist):**
- [ ] 0. Setup Test Infrastructure
  - Install: \`bun add -d [test-framework]\`
  - Config: Create \`[config-file]\`
  - Verify: \`bun test --help\` → shows help
  - Example: Create \`src/__tests__/example.test.ts\`
  - Verify: \`bun test\` → 1 test passes

### Agent-Executed QA Scenarios (MANDATORY — ALL tasks)

> Whether TDD is enabled or not, EVERY task MUST include Agent-Executed QA Scenarios.
> - **With TDD**: QA scenarios complement unit tests at integration/E2E level
> - **Without TDD**: QA scenarios are the PRIMARY verification method
>
> These describe how the executing agent DIRECTLY verifies the deliverable
> by running it — opening browsers, executing commands, sending API requests.
> The agent performs what a human tester would do, but automated via tools.

**Verification Tool by Deliverable Type:**

| Type | Tool | How Agent Verifies |
|------|------|-------------------|
| **Frontend/UI** | Playwright (playwright skill) | Navigate, interact, assert DOM, screenshot |
| **TUI/CLI** | interactive_bash (tmux) | Run command, send keystrokes, validate output |
| **API/Backend** | Bash (curl/httpie) | Send requests, parse responses, assert fields |
| **Library/Module** | Bash (bun/node REPL) | Import, call functions, compare output |
| **Config/Infra** | Bash (shell commands) | Apply config, run state checks, validate |

**Each Scenario MUST Follow This Format:**

\`\`\`
Scenario: [Descriptive name — what user action/flow is being verified]
Tool: [Playwright / interactive_bash / Bash]
Preconditions: [What must be true before this scenario runs]
Steps:
1. [Exact action with specific selector/command/endpoint]
2. [Next action with expected intermediate state]
3. [Assertion with exact expected value]
Expected Result: [Concrete, observable outcome]
Failure Indicators: [What would indicate failure]
Evidence: [Screenshot path / output capture / response body path]
\`\`\`

**Scenario Detail Requirements:**
- **Selectors**: Specific CSS selectors (\`.login-button\`, not "the login button")
- **Data**: Concrete test data (\`"test@example.com"\`, not \`"[email]"\`)
- **Assertions**: Exact values (\`text contains "Welcome back"\`, not "verify it works")
- **Timing**: Include wait conditions where relevant (\`Wait for .dashboard (timeout: 10s)\`)
- **Negative Scenarios**: At least ONE failure/error scenario per feature
- **Evidence Paths**: Specific file paths (\`.sisyphus/evidence/task-N-scenario-name.png\`)

**Anti-patterns (NEVER write scenarios like this):**
- ❌ "Verify the login page works correctly"
- ❌ "Check that the API returns the right data"
- ❌ "Test the form validation"
- ❌ "User opens browser and confirms..."

**Write scenarios like this instead:**
- ✅ \`Navigate to /login → Fill input[name="email"] with "test@example.com" → Fill input[name="password"] with "Pass123!" → Click button[type="submit"] → Wait for /dashboard → Assert h1 contains "Welcome"\`
- ✅ \`POST /api/users {"name":"Test","email":"new@test.com"} → Assert status 201 → Assert response.id is UUID → GET /api/users/{id} → Assert name equals "Test"\`
- ✅ \`Run ./cli --config test.yaml → Wait for "Loaded" in stdout → Send "q" → Assert exit code 0 → Assert stdout contains "Goodbye"\`

**Evidence Requirements:**
- Screenshots: \`.sisyphus/evidence/\` for all UI verifications
- Terminal output: Captured for CLI/TUI verifications
- Response bodies: Saved for API verifications
- All evidence referenced by specific file path in acceptance criteria

---

## Execution Strategy

### Parallel Execution Waves

> Maximize throughput by grouping independent tasks into parallel waves.
> Each wave completes before the next begins.

\`\`\`
Wave 1 (Start Immediately):
├── Task 1: [no dependencies]
└── Task 5: [no dependencies]

Wave 2 (After Wave 1):
├── Task 2: [depends: 1]
├── Task 3: [depends: 1]
└── Task 6: [depends: 5]

Wave 3 (After Wave 2):
└── Task 4: [depends: 2, 3]

Critical Path: Task 1 → Task 2 → Task 4
Parallel Speedup: ~40% faster than sequential
\`\`\`

### Dependency Matrix

| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|---------------------|
| 1 | None | 2, 3 | 5 |
| 2 | 1 | 4 | 3, 6 |
| 3 | 1 | 4 | 2, 6 |
| 4 | 2, 3 | None | None (final) |
| 5 | None | 6 | 1 |
| 6 | 5 | None | 2, 3 |

### Agent Dispatch Summary

| Wave | Tasks | Recommended Agents |
|------|-------|-------------------|
| 1 | 1, 5 | task(category="...", load_skills=[...], run_in_background=false) |
| 2 | 2, 3, 6 | dispatch parallel after Wave 1 completes |
| 3 | 4 | final integration task |

---

## TODOs

> Implementation + Test = ONE Task. Never separate.
> EVERY task MUST have: Recommended Agent Profile + Parallelization info.

- [ ] 1. [Task Title]

**What to do**:
- [Clear implementation steps]
- [Test cases to cover]

**Must NOT do**:
- [Specific exclusions from guardrails]

**Recommended Agent Profile**:
> Select category + skills based on task domain. Justify each choice.
- **Category**: \`[visual-engineering | ultrabrain | artistry | quick | unspecified-low | unspecified-high | writing]\`
  - Reason: [Why this category fits the task domain]
- **Skills**: [\`skill-1\`, \`skill-2\`]
  - \`skill-1\`: [Why needed - domain overlap explanation]
  - \`skill-2\`: [Why needed - domain overlap explanation]
- **Skills Evaluated but Omitted**:
  - \`omitted-skill\`: [Why domain doesn't overlap]

**Parallelization**:
- **Can Run In Parallel**: YES | NO
- **Parallel Group**: Wave N (with Tasks X, Y) | Sequential
- **Blocks**: [Tasks that depend on this task completing]
- **Blocked By**: [Tasks this depends on] | None (can start immediately)

**References** (CRITICAL - Be Exhaustive):

> The executor has NO context from your interview. References are their ONLY guide.
> Each reference must answer: "What should I look at and WHY?"

**Pattern References** (existing code to follow):
- \`src/services/auth.ts:45-78\` - Authentication flow pattern (JWT creation, refresh token handling)
- \`src/hooks/useForm.ts:12-34\` - Form validation pattern (Zod schema + react-hook-form integration)

**API/Type References** (contracts to implement against):
- \`src/types/user.ts:UserDTO\` - Response shape for user endpoints
- \`src/api/schema.ts:createUserSchema\` - Request validation schema

**Test References** (testing patterns to follow):
- \`src/__tests__/auth.test.ts:describe("login")\` - Test structure and mocking patterns

**Documentation References** (specs and requirements):
- \`docs/api-spec.md#authentication\` - API contract details
- \`ARCHITECTURE.md:Database Layer\` - Database access patterns

**External References** (libraries and frameworks):
- Official docs: \`https://zod.dev/?id=basic-usage\` - Zod validation syntax
- Example repo: \`github.com/example/project/src/auth\` - Reference implementation

**WHY Each Reference Matters** (explain the relevance):
- Don't just list files - explain what pattern/information the executor should extract
- Bad: \`src/utils.ts\` (vague, which utils? why?)
- Good: \`src/utils/validation.ts:sanitizeInput()\` - Use this sanitization pattern for user input

**Acceptance Criteria**:

> **AGENT-EXECUTABLE VERIFICATION ONLY** — No human action permitted.
> Every criterion MUST be verifiable by running a command or using a tool.
> REPLACE all placeholders with actual values from task context.

**If TDD (tests enabled):**
- [ ] Test file created: src/auth/login.test.ts
- [ ] Test covers: successful login returns JWT token
- [ ] bun test src/auth/login.test.ts → PASS (3 tests, 0 failures)

**Agent-Executed QA Scenarios (MANDATORY — per-scenario, ultra-detailed):**

> Write MULTIPLE named scenarios per task: happy path AND failure cases.
> Each scenario = exact tool + steps with real selectors/data + evidence path.

**Example — Frontend/UI (Playwright):**

\\\`\\\`\\\`
Scenario: Successful login redirects to dashboard
Tool: Playwright (playwright skill)
Preconditions: Dev server running on localhost:3000, test user exists
Steps:
1. Navigate to: http://localhost:3000/login
2. Wait for: input[name="email"] visible (timeout: 5s)
3. Fill: input[name="email"] → "test@example.com"
4. Fill: input[name="password"] → "ValidPass123!"
5. Click: button[type="submit"]
6. Wait for: navigation to /dashboard (timeout: 10s)
7. Assert: h1 text contains "Welcome back"
8. Assert: cookie "session_token" exists
9. Screenshot: .sisyphus/evidence/task-1-login-success.png
Expected Result: Dashboard loads with welcome message
Evidence: .sisyphus/evidence/task-1-login-success.png

Scenario: Login fails with invalid credentials
Tool: Playwright (playwright skill)
Preconditions: Dev server running, no valid user with these credentials
Steps:
1. Navigate to: http://localhost:3000/login
2. Fill: input[name="email"] → "wrong@example.com"
3. Fill: input[name="password"] → "WrongPass"
4. Click: button[type="submit"]
5. Wait for: .error-message visible (timeout: 5s)
6. Assert: .error-message text contains "Invalid credentials"
7. Assert: URL is still /login (no redirect)
8. Screenshot: .sisyphus/evidence/task-1-login-failure.png
Expected Result: Error message shown, stays on login page
Evidence: .sisyphus/evidence/task-1-login-failure.png
\\\`\\\`\\\`

**Example — API/Backend (curl):**

\\\`\\\`\\\`
Scenario: Create user returns 201 with UUID
Tool: Bash (curl)
Preconditions: Server running on localhost:8080
Steps:
1. curl -s -w "\\n%{http_code}" -X POST http://localhost:8080/api/users \\
     -H "Content-Type: application/json" \\
     -d '{"email":"new@test.com","name":"Test User"}'
2. Assert: HTTP status is 201
3. Assert: response.id matches UUID format
4. GET /api/users/{returned-id} → Assert name equals "Test User"
Expected Result: User created and retrievable
Evidence: Response bodies captured

Scenario: Duplicate email returns 409
Tool: Bash (curl)
Preconditions: User with email "new@test.com" already exists
Steps:
1. Repeat POST with same email
2. Assert: HTTP status is 409
3. Assert: response.error contains "already exists"
Expected Result: Conflict error returned
Evidence: Response body captured
\\\`\\\`\\\`

**Example — TUI/CLI (interactive_bash):**

\\\`\\\`\\\`
Scenario: CLI loads config and displays menu
Tool: interactive_bash (tmux)
Preconditions: Binary built, test config at ./test.yaml
Steps:
1. tmux new-session: ./my-cli --config test.yaml
2. Wait for: "Configuration loaded" in output (timeout: 5s)
3. Assert: Menu items visible ("1. Create", "2. List", "3. Exit")
4. Send keys: "3" then Enter
5. Assert: "Goodbye" in output
6. Assert: Process exited with code 0
Expected Result: CLI starts, shows menu, exits cleanly
Evidence: Terminal output captured

Scenario: CLI handles missing config gracefully
Tool: interactive_bash (tmux)
Preconditions: No config file at ./nonexistent.yaml
Steps:
1. tmux new-session: ./my-cli --config nonexistent.yaml
2. Wait for: output (timeout: 3s)
3. Assert: stderr contains "Config file not found"
4. Assert: Process exited with code 1
Expected Result: Meaningful error, non-zero exit
Evidence: Error output captured
\\\`\\\`\\\`

**Evidence to Capture:**
- [ ] Screenshots in .sisyphus/evidence/ for UI scenarios
- [ ] Terminal output for CLI/TUI scenarios
- [ ] Response bodies for API scenarios
- [ ] Each evidence file named: task-{N}-{scenario-slug}.{ext}

**Commit**: YES | NO (groups with N)
- Message: \`type(scope): desc\`
- Files: \`path/to/file\`
- Pre-commit: \`test command\`

---

## Commit Strategy

| After Task | Message | Files | Verification |
|------------|---------|-------|--------------|
| 1 | \`type(scope): desc\` | file.ts | npm test |

---

## Success Criteria

### Verification Commands
\`\`\`bash
command # Expected: output
\`\`\`

### Final Checklist
- [ ] All "Must Have" present
- [ ] All "Must NOT Have" absent
- [ ] All tests pass
\`\`\`

---
`
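The template's "Generate plan to" line points at `.sisyphus/plans/{name}.md`. As a minimal sketch of how a caller might derive that path from a plan title, assuming hypothetical `slugify`/`planPath` helpers (they are illustrations only, not exports of this module):

```typescript
// Illustrative sketch only — slugify and planPath are assumed helpers, not part
// of this module. They mirror the ".sisyphus/plans/{name}.md" convention that
// the template's "Generate plan to" line describes.

// Lower-case a title and collapse runs of non-alphanumeric characters into
// single hyphens, trimming any leading/trailing hyphens.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Resolve the plan file path for a given plan title.
function planPath(title: string): string {
  return `.sisyphus/plans/${slugify(title)}.md`;
}

console.log(planPath("User Auth Revamp"));
// → .sisyphus/plans/user-auth-revamp.md
```

A caller would then write the rendered plan (title heading plus the sections above) to `planPath(title)` before dispatching executor agents.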