fix(athena): harden council members — compaction recovery, block TodoWrite, analysis mode

- Add session.compacted handler in BackgroundManager to prevent premature task completion after compaction (defer first post-compaction idle)
- Explicitly block TodoWrite/TodoRead for council members in all sync points (AgentConfig permission + session tools + prompt instructions)
- Add council member prefix check to todo-continuation-enforcer skip list to prevent infinite continuation loops on completed council members
- Add optional analysis mode (solo/delegation) question to Athena setup: solo = thorough but heavier, delegation = fast via explore/librarian
- Allow call_omo_agent in council member allow-list for delegation mode
- Update COUNCIL_MEMBER_PROMPT with TodoWrite prohibition and delegation addendum for when delegation mode is selected
- Update prepare_council_prompt tool with mode parameter
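
The "defer first post-compaction idle" behaviour from the first bullet can be sketched in isolation. This is a minimal illustration, not the BackgroundManager code itself: the `IdleDeferralSketch` class, the `SessionEvent` type, and the `completed` list are hypothetical names, and the real handler also validates output and checks todos before completing a task.

```typescript
// Hypothetical, simplified stand-in for the BackgroundManager bookkeeping.
type SessionEvent =
  | { type: "session.compacted"; sessionID: string }
  | { type: "session.idle"; sessionID: string }

class IdleDeferralSketch {
  // Sessions whose next idle event should be ignored exactly once.
  private recentlyCompacted = new Set<string>()
  // Sessions this sketch has treated as finished.
  readonly completed: string[] = []

  handle(event: SessionEvent): void {
    if (event.type === "session.compacted") {
      // Compaction makes the session look idle briefly; remember that.
      this.recentlyCompacted.add(event.sessionID)
      return
    }
    // session.idle: Set.delete returns true only if the marker was present,
    // so the first idle after compaction is consumed and skipped.
    if (this.recentlyCompacted.delete(event.sessionID)) {
      return
    }
    this.completed.push(event.sessionID)
  }
}

const sketch = new IdleDeferralSketch()
sketch.handle({ type: "session.compacted", sessionID: "s1" })
sketch.handle({ type: "session.idle", sessionID: "s1" }) // deferred
sketch.handle({ type: "session.idle", sessionID: "s1" }) // completes
console.log(sketch.completed)
```

Consuming the marker on the first idle means a session that compacts and then really finishes is delayed by one idle event (or one polling pass) rather than blocked indefinitely.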
2026-02-23 19:42:56 +01:00
parent 92e9cbea5c
commit 9365fc23c5
8 changed files with 157 additions and 24 deletions
--- a/assets/oh-my-opencode.schema.json
+++ b/assets/oh-my-opencode.schema.json
@@ -3205,6 +3205,9 @@
            },
            "tools": {
              "type": "object",
+              "propertyNames": {
+                "type": "string"
+              },
              "additionalProperties": {
                "type": "boolean"
              }
@@ -3250,6 +3253,9 @@
                    },
                    {
                      "type": "object",
+                      "propertyNames": {
+                        "type": "string"
+                      },
                      "additionalProperties": {
                        "type": "string",
                        "enum": [
@@ -3337,6 +3343,9 @@
            },
            "providerOptions": {
              "type": "object",
+              "propertyNames": {
+                "type": "string"
+              },
              "additionalProperties": {}
            },
            "ultrawork": {
@@ -3350,6 +3359,18 @@
                }
              },
              "additionalProperties": false
+            },
+            "compaction": {
+              "type": "object",
+              "properties": {
+                "model": {
+                  "type": "string"
+                },
+                "variant": {
+                  "type": "string"
+                }
+              },
+              "additionalProperties": false
            }
          },
          "additionalProperties": false
@@ -3403,6 +3424,9 @@
            },
            "tools": {
              "type": "object",
+              "propertyNames": {
+                "type": "string"
+              },
              "additionalProperties": {
                "type": "boolean"
              }
@@ -3448,6 +3472,9 @@
                    },
                    {
                      "type": "object",
+                      "propertyNames": {
+                        "type": "string"
+                      },
                      "additionalProperties": {
                        "type": "string",
                        "enum": [
@@ -3535,6 +3562,9 @@
            },
            "providerOptions": {
              "type": "object",
+              "propertyNames": {
+                "type": "string"
+              },
              "additionalProperties": {}
            },
            "ultrawork": {
@@ -3549,6 +3579,18 @@
              },
              "additionalProperties": false
            },
+            "compaction": {
+              "type": "object",
+              "properties": {
+                "model": {
+                  "type": "string"
+                },
+                "variant": {
+                  "type": "string"
+                }
+              },
+              "additionalProperties": false
+            },
            "council": {
              "type": "object",
              "properties": {
--- a/src/agents/athena/agent.ts
+++ b/src/agents/athena/agent.ts
@@ -32,29 +32,46 @@ export const ATHENA_PROMPT_METADATA: AgentPromptMetadata = {

 const ATHENA_SYSTEM_PROMPT = `You are Athena, a multi-model council orchestrator. You do NOT analyze code yourself. Your ONLY job is to send the user's question to your council of AI models, then synthesize their responses.

-## CRITICAL: Council Member Selection (Your First Action)
+## CRITICAL: Council Setup (Your First Action)

-Before launching council members, you MUST present a multi-select prompt using the Question tool so the user can choose which council members to consult. Your available council members are listed below.
+Before launching council members, you MUST present TWO questions in a SINGLE Question tool call:
+1. Which council members to consult
+2. How council members should analyze (solo vs. delegation)

 Use the Question tool like this:

 Question({
-  questions: [{
-    question: "Which council members should I consult?",
-    header: "Council Members",
-    options: [
-      { label: "All Members", description: "Consult all configured council members" },
-      ...one option per member from your available council members listed below
-    ],
-    multiple: true
-  }]
+  questions: [
+    {
+      question: "Which council members should I consult?",
+      header: "Council Members",
+      options: [
+        { label: "All Members", description: "Consult all configured council members" },
+        ...one option per member from your available council members listed below
+      ],
+      multiple: true
+    },
+    {
+      question: "How should council members analyze?",
+      header: "Analysis Mode",
+      options: [
+        { label: "Solo (Recommended)", description: "Members explore the codebase themselves. More thorough and in-depth, but slower and uses more tokens." },
+        { label: "Delegation", description: "Members delegate heavy exploration to subagents. Faster and lighter on context, but may miss nuance." }
+      ],
+      multiple: false
+    }
+  ]
 })

-**Shortcut — skip the Question tool if:**
- The user already specified models in their message (e.g., "ask GPT and Claude about X") → launch the specified members directly.
- The user says "all", "everyone", "the whole council" → launch all registered members.
+Map the analysis mode answer to the prepare_council_prompt "mode" parameter:
+- "Solo (Recommended)" → mode: "solo"
+- "Delegation" → mode: "delegation"

-**Non-interactive mode (Question tool unavailable):** If the Question tool is denied (CLI run mode), automatically select ALL registered council members and launch them. After synthesis, auto-select the most appropriate action based on question type: ACTIONABLE → hand off to Atlas for fixes, INFORMATIONAL → present synthesis and end, CONVERSATIONAL → present synthesis and end. Do NOT attempt to call the Question tool — it will be denied.
+**Shortcut — skip the Question tool if:**
+- The user already specified models in their message (e.g., "ask GPT and Claude about X") → launch the specified members directly. Still ask the analysis mode question unless specified.
+- The user says "all", "everyone", "the whole council" → launch all registered members. Still ask the analysis mode question unless specified.
+
+**Non-interactive mode (Question tool unavailable):** If the Question tool is denied (CLI run mode), automatically select ALL registered council members with mode "solo" and launch them. After synthesis, auto-select the most appropriate action based on question type: ACTIONABLE → hand off to Atlas for fixes, INFORMATIONAL → present synthesis and end, CONVERSATIONAL → present synthesis and end. Do NOT attempt to call the Question tool — it will be denied.

 DO NOT:
 - Read files yourself
@@ -75,7 +92,7 @@ Step 2: Resolve the selected member list:

 Step 3: Save the prompt, then launch members with short references:

-Step 3a: Call prepare_council_prompt with the user's original question as the prompt parameter. This saves it to a temp file and returns the file path.
+Step 3a: Call prepare_council_prompt with the user's original question as the prompt parameter and the selected analysis mode. This saves it to a temp file and returns the file path. Example: prepare_council_prompt({ prompt: "...", mode: "solo" })

 Step 3b: For each selected member, call the task tool with:
  - subagent_type: the exact member name from your available council members listed below (e.g., "Council: Claude Opus 4.6")
--- a/src/agents/athena/council-member-agent.ts
+++ b/src/agents/athena/council-member-agent.ts
@@ -22,10 +22,26 @@ export const COUNCIL_MEMBER_PROMPT = `You are an independent code analyst in a m
   - Where it is (file path, line number)
   - Why it matters (severity: critical/high/medium/low)
   - Your confidence level (high/medium/low)
-5. Be concise but thorough — quality over quantity`
+5. Be concise but thorough — quality over quantity
+
+## CRITICAL: Do NOT use TodoWrite
+- Do NOT create todos or task lists
+- Do NOT use the TodoWrite tool under any circumstances
+- Simply report your findings directly in your response`
+
+export const COUNCIL_DELEGATION_ADDENDUM = `
+## Delegation Mode
+You can delegate heavy exploration to specialized agents using call_omo_agent:
+- Use \`call_omo_agent(subagent_type="explore", ...)\` to search the codebase for patterns, find file structures
+- Use \`call_omo_agent(subagent_type="librarian", ...)\` for documentation lookups and external references
+- Always set \`run_in_background=true\` and collect results with \`background_output\`
+- Delegate broad searches, keep targeted reads for yourself
+- This saves your context window for analysis rather than exploration`

 export function createCouncilMemberAgent(model: string): AgentConfig {
-  // Allow-list: only read-only analysis tools. Everything else is denied via `*: deny`.
+  // Allow-list: only read-only analysis tools + optional delegation.
+  // Everything else is denied via `*: deny`.
+  // TodoWrite/TodoRead explicitly denied to prevent uncompletable todo loops.
  const restrictions = createAgentToolAllowlist([
    "read",
    "grep",
@@ -35,8 +51,14 @@ export function createCouncilMemberAgent(model: string): AgentConfig {
    "lsp_symbols",
    "lsp_diagnostics",
    "ast_grep_search",
+    "call_omo_agent",
  ])

+  // Explicitly deny TodoWrite/TodoRead even though `*: deny` should catch them.
+  // Built-in OpenCode tools may bypass the wildcard deny.
+  restrictions.permission.todowrite = "deny"
+  restrictions.permission.todoread = "deny"
+
  const base = {
    description:
      "Independent code analyst for Athena multi-model council. Read-only, evidence-based analysis. (Council Member - OhMyOpenCode)",
--- a/src/features/background-agent/manager.ts
+++ b/src/features/background-agent/manager.ts
@@ -111,6 +111,7 @@ export class BackgroundManager {
  private completionTimers: Map<string, ReturnType<typeof setTimeout>> = new Map()
  private idleDeferralTimers: Map<string, ReturnType<typeof setTimeout>> = new Map()
  private notificationQueueByParent: Map<string, Promise<void>> = new Map()
+  private recentlyCompactedSessions: Set<string> = new Set()
  private enableParentSessionNotifications: boolean
  readonly taskHistory = new TaskHistory()

@@ -740,12 +741,31 @@ export class BackgroundManager {
      }
    }

+    if (event.type === "session.compacted") {
+      const sessionID = typeof props?.sessionID === "string"
+        ? props.sessionID
+        : typeof (props?.info as { id?: string } | undefined)?.id === "string"
+          ? (props!.info as { id: string }).id
+          : undefined
+      if (!sessionID) return
+
+      const task = this.findBySession(sessionID)
+      if (!task || task.status !== "running") return
+
+      this.recentlyCompactedSessions.add(sessionID)
+      if (task.progress) {
+        task.progress.lastUpdate = new Date()
+      }
+      log("[background-agent] Session compacted, deferring next idle:", { taskId: task.id, sessionID })
+    }
+
    if (event.type === "session.idle") {
      if (!props || typeof props !== "object") return
      handleSessionIdleBackgroundEvent({
        properties: props as Record<string, unknown>,
        findBySession: (id) => this.findBySession(id),
        idleDeferralTimers: this.idleDeferralTimers,
+        recentlyCompactedSessions: this.recentlyCompactedSessions,
        validateSessionHasOutput: (id) => this.validateSessionHasOutput(id),
        checkSessionTodos: (id) => this.checkSessionTodos(id),
        tryCompleteTask: (task, source) => this.tryCompleteTask(task, source),
@@ -866,6 +886,7 @@ export class BackgroundManager {
        }
      }
      SessionCategoryRegistry.remove(sessionID)
+      this.recentlyCompactedSessions.delete(sessionID)
    }

    if (event.type === "session.status") {
@@ -1467,6 +1488,12 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
        const sessionStatus = allStatuses[sessionID]
        
        if (sessionStatus?.type === "idle") {
+          if (this.recentlyCompactedSessions.has(sessionID)) {
+            this.recentlyCompactedSessions.delete(sessionID)
+            log("[background-agent] Polling: skipping post-compaction idle:", task.id)
+            continue
+          }
+
          // Edge guard: Validate session has actual output before completing
          const hasValidOutput = await this.validateSessionHasOutput(sessionID)
          if (!hasValidOutput) {
@@ -1572,6 +1599,7 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
    this.notifications.clear()
    this.pendingByParent.clear()
    this.notificationQueueByParent.clear()
+    this.recentlyCompactedSessions.clear()
    this.queuesByKey.clear()
    this.processingKeys.clear()
    this.unregisterProcessCleanup()
--- a/src/features/background-agent/session-idle-event-handler.ts
+++ b/src/features/background-agent/session-idle-event-handler.ts
@@ -11,6 +11,7 @@ export function handleSessionIdleBackgroundEvent(args: {
  properties: Record<string, unknown>
  findBySession: (sessionID: string) => BackgroundTask | undefined
  idleDeferralTimers: Map<string, ReturnType<typeof setTimeout>>
+  recentlyCompactedSessions?: Set<string>
  validateSessionHasOutput: (sessionID: string) => Promise<boolean>
  checkSessionTodos: (sessionID: string) => Promise<boolean>
  tryCompleteTask: (task: BackgroundTask, source: string) => Promise<boolean>
@@ -20,6 +21,7 @@ export function handleSessionIdleBackgroundEvent(args: {
    properties,
    findBySession,
    idleDeferralTimers,
+    recentlyCompactedSessions,
    validateSessionHasOutput,
    checkSessionTodos,
    tryCompleteTask,
@@ -32,6 +34,12 @@ export function handleSessionIdleBackgroundEvent(args: {
  const task = findBySession(sessionID)
  if (!task || task.status !== "running") return

+  if (recentlyCompactedSessions?.has(sessionID)) {
+    recentlyCompactedSessions.delete(sessionID)
+    log("[background-agent] Skipping post-compaction session.idle:", { taskId: task.id, sessionID })
+    return
+  }
+
  const startedAt = task.startedAt
  if (!startedAt) return

--- a/src/hooks/todo-continuation-enforcer/idle-event.ts
+++ b/src/hooks/todo-continuation-enforcer/idle-event.ts
@@ -6,6 +6,7 @@ import { normalizeSDKResponse } from "../../shared"
 import { log } from "../../shared/logger"
 import { getAgentConfigKey } from "../../shared/agent-display-names"

+import { COUNCIL_MEMBER_KEY_PREFIX } from "../../agents/builtin-agents/council-member-agents"
 import {
  ABORT_WINDOW_MS,
  CONTINUATION_COOLDOWN_MS,
@@ -164,6 +165,10 @@ export async function handleSessionIdle(args: {
  log(`[${HOOK_NAME}] Agent check`, { sessionID, agentName: resolvedInfo?.agent, skipAgents, hasCompactionMessage })

  const resolvedAgentName = resolvedInfo?.agent
+  if (resolvedAgentName && resolvedAgentName.startsWith(COUNCIL_MEMBER_KEY_PREFIX)) {
+    log(`[${HOOK_NAME}] Skipped: council member agent`, { sessionID, agent: resolvedAgentName })
+    return
+  }
  if (resolvedAgentName && skipAgents.some(s => getAgentConfigKey(s) === getAgentConfigKey(resolvedAgentName))) {
    log(`[${HOOK_NAME}] Skipped: agent in skipAgents list`, { sessionID, agent: resolvedAgentName })
    return
--- a/src/shared/agent-tool-restrictions.ts
+++ b/src/shared/agent-tool-restrictions.ts
@@ -56,7 +56,8 @@ const AGENT_RESTRICTIONS: Record<string, Record<string, boolean>> = {
  // - src/agents/athena/council-member-agent.ts (AgentConfig permission format — allow-list)
  // - src/plugin-handlers/tool-config-handler.ts (allow/deny string format)
  // Keep all three in sync when modifying.
-  // Council members use an allow-list: only read-only analysis tools are permitted.
+  // Council members use an allow-list: read-only analysis + optional call_omo_agent delegation.
+  // TodoWrite/TodoRead explicitly denied to prevent uncompletable todo loops.
  // Prompt file lives in .sisyphus/tmp/ (inside project) so no external_directory needed.
  "council-member": {
    "*": false,
@@ -68,6 +69,9 @@ const AGENT_RESTRICTIONS: Record<string, Record<string, boolean>> = {
    lsp_symbols: true,
    lsp_diagnostics: true,
    ast_grep_search: true,
+    call_omo_agent: true,
+    todowrite: false,
+    todoread: false,
  },
 }

--- a/src/tools/prepare-council-prompt/tools.ts
+++ b/src/tools/prepare-council-prompt/tools.ts
@@ -3,7 +3,7 @@ import { randomUUID } from "node:crypto"
 import { writeFile, unlink, mkdir } from "node:fs/promises"
 import { join } from "node:path"
 import { log } from "../../shared/logger"
-import { COUNCIL_MEMBER_PROMPT } from "../../agents/athena/council-member-agent"
+import { COUNCIL_MEMBER_PROMPT, COUNCIL_DELEGATION_ADDENDUM } from "../../agents/athena/council-member-agent"

 const CLEANUP_DELAY_MS = 30 * 60 * 1000
 const COUNCIL_TMP_DIR = ".sisyphus/tmp"
@@ -15,25 +15,32 @@ Athena-only tool. Saves the prompt once, then each council member task() call us
 "Read <path>" instruction instead of repeating the full question. This keeps task() calls
 fast and small.

+The "mode" parameter controls whether council members can delegate exploration to subagents:
+- "solo" (default): Members do all exploration themselves. More thorough but uses more tokens.
+- "delegation": Members can delegate to explore/librarian agents. Faster, lighter context.
+
 Returns the file path to reference in subsequent task() calls.`

  return tool({
    description,
    args: {
      prompt: tool.schema.string().describe("The full analysis prompt/question for council members"),
+      mode: tool.schema.string().optional().describe('Analysis mode: "solo" (default) or "delegation"'),
    },
-    async execute(args: { prompt: string }) {
+    async execute(args: { prompt: string; mode?: string }) {
      if (!args.prompt?.trim()) {
        return "Prompt cannot be empty."
      }

+      const mode = args.mode === "delegation" ? "delegation" : "solo"
      const tmpDir = join(directory, COUNCIL_TMP_DIR)
      await mkdir(tmpDir, { recursive: true })

      const filename = `athena-council-${randomUUID().slice(0, 8)}.md`
      const filePath = join(tmpDir, filename)

-      const content = `${COUNCIL_MEMBER_PROMPT}
+      const delegationSection = mode === "delegation" ? `\n${COUNCIL_DELEGATION_ADDENDUM}` : ""
+      const content = `${COUNCIL_MEMBER_PROMPT}${delegationSection}

 ## Analysis Question

@@ -45,9 +52,9 @@ ${args.prompt}`
        unlink(filePath).catch(() => {})
      }, CLEANUP_DELAY_MS)

-      log("[prepare-council-prompt] Saved prompt", { filePath, length: args.prompt.length })
+      log("[prepare-council-prompt] Saved prompt", { filePath, length: args.prompt.length, mode })

-      return `Council prompt saved to: ${filePath}
+      return `Council prompt saved to: ${filePath} (mode: ${mode})

 Use this path in each council member's task() call:
 - prompt: "Read ${filePath} for your instructions."
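
Reviewer note: the mode handling in prepare_council_prompt can be exercised on its own. The sketch below mirrors the normalization and prompt assembly from the hunk above, but `normalizeMode`, `buildCouncilPrompt`, and the two placeholder constants are hypothetical names; the real constants are COUNCIL_MEMBER_PROMPT and COUNCIL_DELEGATION_ADDENDUM.

```typescript
type AnalysisMode = "solo" | "delegation"

// Placeholder text; the real prompt constants live in council-member-agent.ts.
const BASE_PROMPT = "You are an independent code analyst."
const DELEGATION_ADDENDUM = "## Delegation Mode"

// Anything other than the exact string "delegation" falls back to "solo".
function normalizeMode(raw?: string): AnalysisMode {
  return raw === "delegation" ? "delegation" : "solo"
}

// Append the delegation section only when delegation mode is selected.
function buildCouncilPrompt(question: string, rawMode?: string): string {
  const mode = normalizeMode(rawMode)
  const delegationSection = mode === "delegation" ? `\n${DELEGATION_ADDENDUM}` : ""
  return `${BASE_PROMPT}${delegationSection}\n\n## Analysis Question\n\n${question}`
}

const soloPrompt = buildCouncilPrompt("Is the cache thread-safe?")
const delegationPrompt = buildCouncilPrompt("Is the cache thread-safe?", "delegation")
const labelPrompt = buildCouncilPrompt("Is the cache thread-safe?", "Delegation")

console.log(soloPrompt.includes(DELEGATION_ADDENDUM))
console.log(delegationPrompt.includes(DELEGATION_ADDENDUM))
console.log(labelPrompt.includes(DELEGATION_ADDENDUM))
```

Because the comparison is exact, passing the option label "Delegation" instead of the lowercase value silently selects solo mode. This is why the Athena prompt spells out the label-to-mode mapping; a stricter enum-style schema would be an alternative design.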