fix(athena): harden council members — compaction recovery, block TodoWrite, analysis mode

- Add session.compacted handler in BackgroundManager to prevent premature
  task completion after compaction (defer first post-compaction idle)
- Explicitly block TodoWrite/TodoRead for council members in all sync
  points (AgentConfig permission + session tools + prompt instructions)
- Add council member prefix check to todo-continuation-enforcer skip list
  to prevent infinite continuation loops on completed council members
- Add optional analysis mode (solo/delegation) question to Athena setup:
  solo = thorough but heavier, delegation = fast via explore/librarian
- Allow call_omo_agent in council member allow-list for delegation mode
- Update COUNCIL_MEMBER_PROMPT with TodoWrite prohibition and delegation
  addendum for when delegation mode is selected
- Update prepare_council_prompt tool with mode parameter
ismeth
2026-02-23 19:42:56 +01:00
committed by YeonGyu-Kim
parent 92e9cbea5c
commit 9365fc23c5
8 changed files with 157 additions and 24 deletions


@@ -3205,6 +3205,9 @@
 },
 "tools": {
 "type": "object",
+"propertyNames": {
+"type": "string"
+},
 "additionalProperties": {
 "type": "boolean"
 }
@@ -3250,6 +3253,9 @@
 },
 {
 "type": "object",
+"propertyNames": {
+"type": "string"
+},
 "additionalProperties": {
 "type": "string",
 "enum": [
@@ -3337,6 +3343,9 @@
 },
 "providerOptions": {
 "type": "object",
+"propertyNames": {
+"type": "string"
+},
 "additionalProperties": {}
 },
 "ultrawork": {
@@ -3350,6 +3359,18 @@
 }
 },
 "additionalProperties": false
+},
+"compaction": {
+"type": "object",
+"properties": {
+"model": {
+"type": "string"
+},
+"variant": {
+"type": "string"
+}
+},
+"additionalProperties": false
 }
 },
 "additionalProperties": false
@@ -3403,6 +3424,9 @@
 },
 "tools": {
 "type": "object",
+"propertyNames": {
+"type": "string"
+},
 "additionalProperties": {
 "type": "boolean"
 }
@@ -3448,6 +3472,9 @@
 },
 {
 "type": "object",
+"propertyNames": {
+"type": "string"
+},
 "additionalProperties": {
 "type": "string",
 "enum": [
@@ -3535,6 +3562,9 @@
 },
 "providerOptions": {
 "type": "object",
+"propertyNames": {
+"type": "string"
+},
 "additionalProperties": {}
 },
 "ultrawork": {
@@ -3549,6 +3579,18 @@
 },
 "additionalProperties": false
 },
+"compaction": {
+"type": "object",
+"properties": {
+"model": {
+"type": "string"
+},
+"variant": {
+"type": "string"
+}
+},
+"additionalProperties": false
+},
 "council": {
 "type": "object",
 "properties": {
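
The schema hunks above add a `compaction` object with two optional string fields (`model`, `variant`) and `additionalProperties: false`. As a rough illustration of what that shape accepts, here is a minimal sketch of an equivalent runtime check. The names `CompactionConfig` and `isCompactionConfig` are hypothetical and do not appear in the commit.

```typescript
// Hypothetical mirror of the `compaction` schema entry:
// two optional string fields and no other keys allowed.
interface CompactionConfig {
  model?: string
  variant?: string
}

const ALLOWED_COMPACTION_KEYS = new Set(["model", "variant"])

function isCompactionConfig(value: unknown): value is CompactionConfig {
  if (typeof value !== "object" || value === null || Array.isArray(value)) return false
  for (const [key, entry] of Object.entries(value)) {
    // Mirrors "additionalProperties": false.
    if (!ALLOWED_COMPACTION_KEYS.has(key)) return false
    // Mirrors "type": "string" on both properties.
    if (typeof entry !== "string") return false
  }
  return true
}
```

Under these assumptions an empty object passes, because neither field is listed as required in the schema shown.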


@@ -32,29 +32,46 @@ export const ATHENA_PROMPT_METADATA: AgentPromptMetadata = {
 const ATHENA_SYSTEM_PROMPT = `You are Athena, a multi-model council orchestrator. You do NOT analyze code yourself. Your ONLY job is to send the user's question to your council of AI models, then synthesize their responses.
-## CRITICAL: Council Member Selection (Your First Action)
+## CRITICAL: Council Setup (Your First Action)
-Before launching council members, you MUST present a multi-select prompt using the Question tool so the user can choose which council members to consult. Your available council members are listed below.
+Before launching council members, you MUST present TWO questions in a SINGLE Question tool call:
+1. Which council members to consult
+2. How council members should analyze (solo vs. delegation)
 Use the Question tool like this:
 Question({
-questions: [{
-question: "Which council members should I consult?",
-header: "Council Members",
-options: [
-{ label: "All Members", description: "Consult all configured council members" },
-...one option per member from your available council members listed below
-],
-multiple: true
-}]
+questions: [
+{
+question: "Which council members should I consult?",
+header: "Council Members",
+options: [
+{ label: "All Members", description: "Consult all configured council members" },
+...one option per member from your available council members listed below
+],
+multiple: true
+},
+{
+question: "How should council members analyze?",
+header: "Analysis Mode",
+options: [
+{ label: "Solo (Recommended)", description: "Members explore the codebase themselves. More thorough and in-depth, but slower and uses more tokens." },
+{ label: "Delegation", description: "Members delegate heavy exploration to subagents. Faster and lighter on context, but may miss nuance." }
+],
+multiple: false
+}
+]
 })
-**Shortcut — skip the Question tool if:**
-- The user already specified models in their message (e.g., "ask GPT and Claude about X") → launch the specified members directly.
-- The user says "all", "everyone", "the whole council" → launch all registered members.
+Map the analysis mode answer to the prepare_council_prompt "mode" parameter:
+- "Solo (Recommended)" → mode: "solo"
+- "Delegation" → mode: "delegation"
-**Non-interactive mode (Question tool unavailable):** If the Question tool is denied (CLI run mode), automatically select ALL registered council members and launch them. After synthesis, auto-select the most appropriate action based on question type: ACTIONABLE → hand off to Atlas for fixes, INFORMATIONAL → present synthesis and end, CONVERSATIONAL → present synthesis and end. Do NOT attempt to call the Question tool — it will be denied.
+**Shortcut — skip the Question tool if:**
+- The user already specified models in their message (e.g., "ask GPT and Claude about X") → launch the specified members directly. Still ask the analysis mode question unless specified.
+- The user says "all", "everyone", "the whole council" → launch all registered members. Still ask the analysis mode question unless specified.
+**Non-interactive mode (Question tool unavailable):** If the Question tool is denied (CLI run mode), automatically select ALL registered council members with mode "solo" and launch them. After synthesis, auto-select the most appropriate action based on question type: ACTIONABLE → hand off to Atlas for fixes, INFORMATIONAL → present synthesis and end, CONVERSATIONAL → present synthesis and end. Do NOT attempt to call the Question tool — it will be denied.
 DO NOT:
 - Read files yourself
@@ -75,7 +92,7 @@ Step 2: Resolve the selected member list:
 Step 3: Save the prompt, then launch members with short references:
-Step 3a: Call prepare_council_prompt with the user's original question as the prompt parameter. This saves it to a temp file and returns the file path.
+Step 3a: Call prepare_council_prompt with the user's original question as the prompt parameter and the selected analysis mode. This saves it to a temp file and returns the file path. Example: prepare_council_prompt({ prompt: "...", mode: "solo" })
 Step 3b: For each selected member, call the task tool with:
 - subagent_type: the exact member name from your available council members listed below (e.g., "Council: Claude Opus 4.6")
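
The prompt above asks the model to map the Analysis Mode answer onto the `mode` parameter and to fall back to "solo" in non-interactive runs. In the commit this mapping is carried out by the model following the prompt, not by code. The sketch below only restates the rule in TypeScript; `resolveAnalysisMode` and `MODE_BY_LABEL` are hypothetical names.

```typescript
type AnalysisMode = "solo" | "delegation"

// Labels copied from the "Analysis Mode" question in the prompt.
const MODE_BY_LABEL = new Map<string, AnalysisMode>([
  ["Solo (Recommended)", "solo"],
  ["Delegation", "delegation"],
])

// `answer` is undefined when the Question tool is unavailable
// (the non-interactive case), which the prompt maps to "solo".
function resolveAnalysisMode(answer: string | undefined): AnalysisMode {
  if (answer === undefined) return "solo"
  // Unknown labels also fall back to "solo" in this sketch.
  return MODE_BY_LABEL.get(answer) ?? "solo"
}
```

Falling back to "solo" for unrecognized labels is a choice made for this sketch; it matches the default that `prepare_council_prompt` applies to its own `mode` argument.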


@@ -22,10 +22,26 @@ export const COUNCIL_MEMBER_PROMPT = `You are an independent code analyst in a m
 - Where it is (file path, line number)
 - Why it matters (severity: critical/high/medium/low)
 - Your confidence level (high/medium/low)
-5. Be concise but thorough — quality over quantity`
+5. Be concise but thorough — quality over quantity
+## CRITICAL: Do NOT use TodoWrite
+- Do NOT create todos or task lists
+- Do NOT use the TodoWrite tool under any circumstances
+- Simply report your findings directly in your response`
+export const COUNCIL_DELEGATION_ADDENDUM = `
+## Delegation Mode
+You can delegate heavy exploration to specialized agents using call_omo_agent:
+- Use \`call_omo_agent(subagent_type="explore", ...)\` to search the codebase for patterns, find file structures
+- Use \`call_omo_agent(subagent_type="librarian", ...)\` for documentation lookups and external references
+- Always set \`run_in_background=true\` and collect results with \`background_output\`
+- Delegate broad searches, keep targeted reads for yourself
+- This saves your context window for analysis rather than exploration`
 export function createCouncilMemberAgent(model: string): AgentConfig {
-// Allow-list: only read-only analysis tools. Everything else is denied via `*: deny`.
+// Allow-list: only read-only analysis tools + optional delegation.
+// Everything else is denied via `*: deny`.
+// TodoWrite/TodoRead explicitly denied to prevent uncompletable todo loops.
 const restrictions = createAgentToolAllowlist([
 "read",
 "grep",
@@ -35,8 +51,14 @@ export function createCouncilMemberAgent(model: string): AgentConfig {
 "lsp_symbols",
 "lsp_diagnostics",
 "ast_grep_search",
+"call_omo_agent",
 ])
+// Explicitly deny TodoWrite/TodoRead even though `*: deny` should catch them.
+// Built-in OpenCode tools may bypass the wildcard deny.
+restrictions.permission.todowrite = "deny"
+restrictions.permission.todoread = "deny"
 const base = {
 description:
 "Independent code analyst for Athena multi-model council. Read-only, evidence-based analysis. (Council Member - OhMyOpenCode)",
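
The hunk above combines a wildcard deny with an allow-list, then adds explicit `todowrite`/`todoread` denies on top. A minimal sketch of that layering follows. `buildAllowlist` is a hypothetical stand-in for `createAgentToolAllowlist`, whose real implementation is not shown in this commit, and `resolvePermission` is an assumed lookup rule (exact key first, then `*`), not necessarily how OpenCode resolves permissions.

```typescript
type PermissionValue = "allow" | "deny"

interface ToolRestrictions {
  permission: Record<string, PermissionValue>
}

// Hypothetical stand-in for createAgentToolAllowlist:
// deny everything via "*", then allow only the listed tools.
function buildAllowlist(allowed: readonly string[]): ToolRestrictions {
  const permission: Record<string, PermissionValue> = { "*": "deny" }
  for (const tool of allowed) permission[tool] = "allow"
  return { permission }
}

// Assumed resolution order: an exact key wins, otherwise fall back to "*".
function resolvePermission(r: ToolRestrictions, tool: string): PermissionValue {
  const hasExact = Object.prototype.hasOwnProperty.call(r.permission, tool)
  const exact = hasExact ? r.permission[tool] : undefined
  return exact ?? r.permission["*"] ?? "deny"
}

const restrictions = buildAllowlist(["read", "grep", "call_omo_agent"])
// Explicit denies, as in the commit: redundant under the wildcard,
// but they keep working if a tool is ever resolved without consulting "*".
restrictions.permission["todowrite"] = "deny"
restrictions.permission["todoread"] = "deny"
```

With this lookup rule the explicit entries change nothing, which is why the commit's comment frames them as a guard against built-in tools that may bypass the wildcard.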


@@ -111,6 +111,7 @@ export class BackgroundManager {
 private completionTimers: Map<string, ReturnType<typeof setTimeout>> = new Map()
 private idleDeferralTimers: Map<string, ReturnType<typeof setTimeout>> = new Map()
 private notificationQueueByParent: Map<string, Promise<void>> = new Map()
+private recentlyCompactedSessions: Set<string> = new Set()
 private enableParentSessionNotifications: boolean
 readonly taskHistory = new TaskHistory()
@@ -740,12 +741,31 @@ export class BackgroundManager {
 }
 }
+if (event.type === "session.compacted") {
+const sessionID = typeof props?.sessionID === "string"
+? props.sessionID
+: typeof (props?.info as { id?: string } | undefined)?.id === "string"
+? (props!.info as { id: string }).id
+: undefined
+if (!sessionID) return
+const task = this.findBySession(sessionID)
+if (!task || task.status !== "running") return
+this.recentlyCompactedSessions.add(sessionID)
+if (task.progress) {
+task.progress.lastUpdate = new Date()
+}
+log("[background-agent] Session compacted, deferring next idle:", { taskId: task.id, sessionID })
+}
 if (event.type === "session.idle") {
 if (!props || typeof props !== "object") return
 handleSessionIdleBackgroundEvent({
 properties: props as Record<string, unknown>,
 findBySession: (id) => this.findBySession(id),
 idleDeferralTimers: this.idleDeferralTimers,
+recentlyCompactedSessions: this.recentlyCompactedSessions,
 validateSessionHasOutput: (id) => this.validateSessionHasOutput(id),
 checkSessionTodos: (id) => this.checkSessionTodos(id),
 tryCompleteTask: (task, source) => this.tryCompleteTask(task, source),
@@ -866,6 +886,7 @@ export class BackgroundManager {
 }
 }
 SessionCategoryRegistry.remove(sessionID)
+this.recentlyCompactedSessions.delete(sessionID)
 }
 if (event.type === "session.status") {
@@ -1467,6 +1488,12 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
 const sessionStatus = allStatuses[sessionID]
 if (sessionStatus?.type === "idle") {
+if (this.recentlyCompactedSessions.has(sessionID)) {
+this.recentlyCompactedSessions.delete(sessionID)
+log("[background-agent] Polling: skipping post-compaction idle:", task.id)
+continue
+}
 // Edge guard: Validate session has actual output before completing
 const hasValidOutput = await this.validateSessionHasOutput(sessionID)
 if (!hasValidOutput) {
@@ -1572,6 +1599,7 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
 this.notifications.clear()
 this.pendingByParent.clear()
 this.notificationQueueByParent.clear()
+this.recentlyCompactedSessions.clear()
 this.queuesByKey.clear()
 this.processingKeys.clear()
 this.unregisterProcessCleanup()
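
The `session.compacted` handler above reads the session ID from either `props.sessionID` or `props.info.id` using a nested conditional with type assertions. The same lookup can be written as a small helper with explicit narrowing. This is only a sketch; `extractSessionID` is a hypothetical name and is not part of the commit.

```typescript
// Hypothetical helper: returns the session ID from an event's properties,
// checking `sessionID` first and `info.id` second, as the handler does.
function extractSessionID(props: Record<string, unknown> | undefined): string | undefined {
  if (!props) return undefined
  if (typeof props.sessionID === "string") return props.sessionID

  const info = props.info
  if (typeof info === "object" && info !== null) {
    const id = (info as { id?: unknown }).id
    if (typeof id === "string") return id
  }
  return undefined
}
```

Narrowing `info` before reading `id` avoids the non-null assertion (`props!.info`) used in the inline version, at the cost of a few more lines.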


@@ -11,6 +11,7 @@ export function handleSessionIdleBackgroundEvent(args: {
 properties: Record<string, unknown>
 findBySession: (sessionID: string) => BackgroundTask | undefined
 idleDeferralTimers: Map<string, ReturnType<typeof setTimeout>>
+recentlyCompactedSessions?: Set<string>
 validateSessionHasOutput: (sessionID: string) => Promise<boolean>
 checkSessionTodos: (sessionID: string) => Promise<boolean>
 tryCompleteTask: (task: BackgroundTask, source: string) => Promise<boolean>
@@ -20,6 +21,7 @@ export function handleSessionIdleBackgroundEvent(args: {
 properties,
 findBySession,
 idleDeferralTimers,
+recentlyCompactedSessions,
 validateSessionHasOutput,
 checkSessionTodos,
 tryCompleteTask,
@@ -32,6 +34,12 @@ export function handleSessionIdleBackgroundEvent(args: {
 const task = findBySession(sessionID)
 if (!task || task.status !== "running") return
+if (recentlyCompactedSessions?.has(sessionID)) {
+recentlyCompactedSessions.delete(sessionID)
+log("[background-agent] Skipping post-compaction session.idle:", { taskId: task.id, sessionID })
+return
+}
 const startedAt = task.startedAt
 if (!startedAt) return
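
The idle handler above skips exactly one `session.idle` after a compaction: the marker is removed when it is consumed, so the next idle is processed normally. The sketch below isolates that one-shot behavior. `CompactionGate` and its method names are hypothetical; the commit keeps a plain `Set<string>` on `BackgroundManager` instead.

```typescript
// Hypothetical wrapper around the Set used in the commit.
class CompactionGate {
  private readonly recentlyCompacted = new Set<string>()

  // Called when a session.compacted event arrives for a running task.
  markCompacted(sessionID: string): void {
    this.recentlyCompacted.add(sessionID)
  }

  // Returns true if this idle event should be ignored.
  // Set.delete returns whether the entry existed, so the check and the
  // removal happen in one step and only the first idle is skipped.
  shouldSkipIdle(sessionID: string): boolean {
    return this.recentlyCompacted.delete(sessionID)
  }

  // Called on session deletion so stale markers do not accumulate.
  forget(sessionID: string): void {
    this.recentlyCompacted.delete(sessionID)
  }
}
```

Usage follows the event order in the diff: `markCompacted` on `session.compacted`, `shouldSkipIdle` at the top of both the event-driven and polling idle paths, and `forget` when the session is removed.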


@@ -6,6 +6,7 @@ import { normalizeSDKResponse } from "../../shared"
 import { log } from "../../shared/logger"
 import { getAgentConfigKey } from "../../shared/agent-display-names"
+import { COUNCIL_MEMBER_KEY_PREFIX } from "../../agents/builtin-agents/council-member-agents"
 import {
 ABORT_WINDOW_MS,
 CONTINUATION_COOLDOWN_MS,
@@ -164,6 +165,10 @@ export async function handleSessionIdle(args: {
 log(`[${HOOK_NAME}] Agent check`, { sessionID, agentName: resolvedInfo?.agent, skipAgents, hasCompactionMessage })
 const resolvedAgentName = resolvedInfo?.agent
+if (resolvedAgentName && resolvedAgentName.startsWith(COUNCIL_MEMBER_KEY_PREFIX)) {
+log(`[${HOOK_NAME}] Skipped: council member agent`, { sessionID, agent: resolvedAgentName })
+return
+}
 if (resolvedAgentName && skipAgents.some(s => getAgentConfigKey(s) === getAgentConfigKey(resolvedAgentName))) {
 log(`[${HOOK_NAME}] Skipped: agent in skipAgents list`, { sessionID, agent: resolvedAgentName })
 return
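
The enforcer hunk above adds a prefix check ahead of the existing `skipAgents` comparison, so dynamically named council members are skipped without listing each one. A minimal sketch of the combined decision follows. The prefix value `"Council: "` is an assumption for illustration (the real `COUNCIL_MEMBER_KEY_PREFIX` is not shown in this commit), and lowercase comparison is a simplified stand-in for `getAgentConfigKey`.

```typescript
// Assumed value for illustration only; the real constant is imported
// from council-member-agents and its value is not shown in the diff.
const COUNCIL_MEMBER_KEY_PREFIX = "Council: "

// Simplified stand-in for getAgentConfigKey: case-insensitive comparison.
function normalizeAgentKey(name: string): string {
  return name.trim().toLowerCase()
}

function shouldSkipContinuation(
  agentName: string | undefined,
  skipAgents: readonly string[],
): boolean {
  if (!agentName) return false
  // Prefix check first: council member names are generated per model,
  // so they cannot be enumerated in a static skip list.
  if (agentName.startsWith(COUNCIL_MEMBER_KEY_PREFIX)) return true
  const key = normalizeAgentKey(agentName)
  return skipAgents.some((s) => normalizeAgentKey(s) === key)
}
```

The agent names used to exercise this sketch are placeholders and are not taken from the project's configuration.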


@@ -56,7 +56,8 @@ const AGENT_RESTRICTIONS: Record<string, Record<string, boolean>> = {
 // - src/agents/athena/council-member-agent.ts (AgentConfig permission format — allow-list)
 // - src/plugin-handlers/tool-config-handler.ts (allow/deny string format)
 // Keep all three in sync when modifying.
-// Council members use an allow-list: only read-only analysis tools are permitted.
+// Council members use an allow-list: read-only analysis + optional call_omo_agent delegation.
+// TodoWrite/TodoRead explicitly denied to prevent uncompletable todo loops.
 // Prompt file lives in .sisyphus/tmp/ (inside project) so no external_directory needed.
 "council-member": {
 "*": false,
@@ -68,6 +69,9 @@ const AGENT_RESTRICTIONS: Record<string, Record<string, boolean>> = {
 lsp_symbols: true,
 lsp_diagnostics: true,
 ast_grep_search: true,
+call_omo_agent: true,
+todowrite: false,
+todoread: false,
 },
 }
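
The comment in the hunk above notes that the council member restrictions live in three places with different formats (boolean map here, allow/deny strings elsewhere) and must be kept in sync by hand. One way to catch drift is a small comparison helper, sketched below. `toPermissionMap` and `findDrift` are hypothetical names and are not part of the commit.

```typescript
type BooleanRestrictions = Record<string, boolean>
type PermissionMap = Record<string, "allow" | "deny">

// Converts the boolean format (true/false) to the allow/deny string format.
function toPermissionMap(restrictions: BooleanRestrictions): PermissionMap {
  const out: PermissionMap = {}
  for (const [tool, allowed] of Object.entries(restrictions)) {
    out[tool] = allowed ? "allow" : "deny"
  }
  return out
}

// Lists tools whose setting differs between the two formats,
// including tools present in one map but missing from the other.
function findDrift(booleans: BooleanRestrictions, permissions: PermissionMap): string[] {
  const expected = toPermissionMap(booleans)
  const tools = new Set<string>(Object.keys(expected).concat(Object.keys(permissions)))
  return Array.from(tools)
    .filter((tool) => expected[tool] !== permissions[tool])
    .sort()
}
```

A check like this could run in a unit test against the real maps; for example, a `todowrite: false` entry in the boolean map with no matching entry in the permission map would be reported as drift.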


@@ -3,7 +3,7 @@ import { randomUUID } from "node:crypto"
 import { writeFile, unlink, mkdir } from "node:fs/promises"
 import { join } from "node:path"
 import { log } from "../../shared/logger"
-import { COUNCIL_MEMBER_PROMPT } from "../../agents/athena/council-member-agent"
+import { COUNCIL_MEMBER_PROMPT, COUNCIL_DELEGATION_ADDENDUM } from "../../agents/athena/council-member-agent"
 const CLEANUP_DELAY_MS = 30 * 60 * 1000
 const COUNCIL_TMP_DIR = ".sisyphus/tmp"
@@ -15,25 +15,32 @@ Athena-only tool. Saves the prompt once, then each council member task() call us
 "Read <path>" instruction instead of repeating the full question. This keeps task() calls
 fast and small.
+The "mode" parameter controls whether council members can delegate exploration to subagents:
+- "solo" (default): Members do all exploration themselves. More thorough but uses more tokens.
+- "delegation": Members can delegate to explore/librarian agents. Faster, lighter context.
 Returns the file path to reference in subsequent task() calls.`
 return tool({
 description,
 args: {
 prompt: tool.schema.string().describe("The full analysis prompt/question for council members"),
+mode: tool.schema.string().optional().describe('Analysis mode: "solo" (default) or "delegation"'),
 },
-async execute(args: { prompt: string }) {
+async execute(args: { prompt: string; mode?: string }) {
 if (!args.prompt?.trim()) {
 return "Prompt cannot be empty."
 }
+const mode = args.mode === "delegation" ? "delegation" : "solo"
 const tmpDir = join(directory, COUNCIL_TMP_DIR)
 await mkdir(tmpDir, { recursive: true })
 const filename = `athena-council-${randomUUID().slice(0, 8)}.md`
 const filePath = join(tmpDir, filename)
-const content = `${COUNCIL_MEMBER_PROMPT}
+const delegationSection = mode === "delegation" ? `\n${COUNCIL_DELEGATION_ADDENDUM}` : ""
+const content = `${COUNCIL_MEMBER_PROMPT}${delegationSection}
 ## Analysis Question
@@ -45,9 +52,9 @@ ${args.prompt}`
 unlink(filePath).catch(() => {})
 }, CLEANUP_DELAY_MS)
-log("[prepare-council-prompt] Saved prompt", { filePath, length: args.prompt.length })
+log("[prepare-council-prompt] Saved prompt", { filePath, length: args.prompt.length, mode })
-return `Council prompt saved to: ${filePath}
+return `Council prompt saved to: ${filePath} (mode: ${mode})
 Use this path in each council member's task() call:
 - prompt: "Read ${filePath} for your instructions."
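
The tool above treats any `mode` value other than the exact string "delegation" as "solo", and appends the delegation addendum only in delegation mode. The sketch below isolates those two steps without the file I/O. The placeholder strings stand in for `COUNCIL_MEMBER_PROMPT` and `COUNCIL_DELEGATION_ADDENDUM`, the function names are hypothetical, and the blank-line spacing in the template is an assumption because the diff view does not preserve it.

```typescript
// Placeholders for the real prompt constants imported in the commit.
const MEMBER_PROMPT = "<base council member prompt>"
const DELEGATION_ADDENDUM = "\n## Delegation Mode\n<delegation instructions>"

type CouncilMode = "solo" | "delegation"

// Same rule as the commit: only the exact string "delegation" enables
// delegation; undefined, typos, and other casing all fall back to "solo".
function normalizeMode(raw: string | undefined): CouncilMode {
  return raw === "delegation" ? "delegation" : "solo"
}

// Builds the file content: base prompt, optional addendum, then the question.
function buildCouncilPrompt(question: string, rawMode?: string): string {
  const mode = normalizeMode(rawMode)
  const delegationSection = mode === "delegation" ? `\n${DELEGATION_ADDENDUM}` : ""
  return `${MEMBER_PROMPT}${delegationSection}\n\n## Analysis Question\n\n${question}`
}
```

Because the argument is declared as a free-form optional string, a misspelled mode silently becomes "solo". Restricting the argument to the two accepted values at the schema level would surface such mistakes, though whether that is preferable here is a design choice for the authors.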