Compare commits
6 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
96886f18ac | ||
|
|
a3938e8c25 | ||
|
|
821b0b8e9f | ||
|
|
356bd1dff3 | ||
|
|
f2b070cd0b | ||
|
|
1323443c85 |
1
.github/workflows/publish.yml
vendored
1
.github/workflows/publish.yml
vendored
@@ -26,6 +26,7 @@ permissions:
|
||||
jobs:
|
||||
publish:
|
||||
runs-on: ubuntu-latest
|
||||
if: github.repository == 'code-yeongyu/oh-my-opencode'
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
|
||||
19
README.ko.md
19
README.ko.md
@@ -166,6 +166,18 @@ opencode auth login
|
||||
# 브라우저에서 OAuth 플로우 완료
|
||||
```
|
||||
|
||||
**⚠️ 알려진 이슈**: 현재 공식 npm 패키지에 400 에러(`"No tool call found for function call output with call_id"`)를 유발하는 버그가 있습니다. 수정 버전이 배포될 때까지 **핫픽스 브랜치 사용을 권장합니다**. `~/.config/opencode/package.json`을 수정하세요:
|
||||
|
||||
```json
|
||||
{
|
||||
"dependencies": {
|
||||
"opencode-openai-codex-auth": "code-yeongyu/opencode-openai-codex-auth#fix/orphaned-function-call-output-with-tools"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
그 후 `cd ~/.config/opencode && bun i`를 실행하세요. `opencode.json`에서는 버전 없이 `"opencode-openai-codex-auth"`로 사용합니다 (`@4.1.0` 제외).
|
||||
|
||||
#### 4.4 대안: 프록시 기반 인증
|
||||
|
||||
프록시 기반 인증을 선호하는 사용자를 위해 [VibeProxy](https://github.com/automazeio/vibeproxy) (macOS) 또는 [CLIProxyAPI](https://github.com/router-for-me/CLIProxyAPI)를 대안으로 사용할 수 있습니다.
|
||||
@@ -206,6 +218,7 @@ OpenCode 는 아주 확장가능하고 아주 커스터마이저블합니다.
|
||||
- **explore** (`opencode/grok-code`): 빠른 코드베이스 탐색, 파일 패턴 매칭. Claude Code는 Haiku를 쓰지만, 우리는 Grok을 씁니다. 현재 무료이고, 극도로 빠르며, 파일 탐색 작업에 충분한 지능을 갖췄기 때문입니다. Claude Code 에서 영감을 받았습니다.
|
||||
- **frontend-ui-ux-engineer** (`google/gemini-3-pro-preview`): 개발자로 전향한 디자이너라는 설정을 갖고 있습니다. 멋진 UI를 만듭니다. 아름답고 창의적인 UI 코드를 생성하는 데 탁월한 Gemini를 사용합니다.
|
||||
- **document-writer** (`google/gemini-3-pro-preview`): 기술 문서 전문가라는 설정을 갖고 있습니다. Gemini 는 문학가입니다. 글을 기가막히게 씁니다.
|
||||
- **multimodal-looker** (`google/gemini-2.5-flash`): 시각적 콘텐츠 해석을 위한 전문 에이전트. PDF, 이미지, 다이어그램을 분석하여 정보를 추출합니다.
|
||||
|
||||
각 에이전트는 메인 에이전트가 알아서 호출하지만, 명시적으로 요청할 수도 있습니다:
|
||||
|
||||
@@ -258,6 +271,12 @@ OpenCode 는 아주 확장가능하고 아주 커스터마이저블합니다.
|
||||
- 기본 `glob`은 타임아웃이 없습니다. ripgrep이 멈추면 무한정 대기합니다.
|
||||
- 이 도구는 타임아웃을 강제하고 만료 시 프로세스를 종료합니다.
|
||||
|
||||
#### 내장 멀티모달 도구 (Built-in Multimodal Tools)
|
||||
|
||||
- **look_at**: 시각적 해석이 필요한 미디어 파일(PDF, 이미지, 다이어그램 등)을 Gemini 2.5 Flash를 사용하여 분석합니다. Sourcegraph Ampcode의 `look_at` 도구에서 영감을 받았습니다.
|
||||
- 파라미터: `file_path` (절대 경로), `goal` (추출할 정보)
|
||||
- 사용 사례: PDF 텍스트 추출, 이미지 설명, 다이어그램 분석
|
||||
|
||||
#### 내장 MCPs
|
||||
|
||||
- **websearch_exa**: Exa AI 웹 검색. 실시간 웹 검색과 콘텐츠 스크래핑을 수행합니다. 관련 웹사이트에서 LLM에 최적화된 컨텍스트를 반환합니다.
|
||||
|
||||
19
README.md
19
README.md
@@ -165,6 +165,18 @@ opencode auth login
|
||||
# Complete OAuth flow in browser
|
||||
```
|
||||
|
||||
**⚠️ Known Issue**: The official npm package currently has a bug that causes 400 errors (`"No tool call found for function call output with call_id"`). Until a fix is released, **use the hotfix branch instead**. Modify `~/.config/opencode/package.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"dependencies": {
|
||||
"opencode-openai-codex-auth": "code-yeongyu/opencode-openai-codex-auth#fix/orphaned-function-call-output-with-tools"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Then run `cd ~/.config/opencode && bun i`. In your `opencode.json`, use the plugin name without a version: `"opencode-openai-codex-auth"` (not `@4.1.0`).
|
||||
|
||||
#### 4.4 Alternative: Proxy-based Authentication
|
||||
|
||||
For users who prefer proxy-based authentication, [VibeProxy](https://github.com/automazeio/vibeproxy) (macOS) or [CLIProxyAPI](https://github.com/router-for-me/CLIProxyAPI) remain available as alternatives.
|
||||
@@ -203,6 +215,7 @@ I believe in the right tool for the job. For your wallet's sake, use CLIProxyAPI
|
||||
- **explore** (`opencode/grok-code`): Fast exploration and pattern matching. Claude Code uses Haiku; we use Grok. It is currently free, blazing fast, and intelligent enough for file traversal. Inspired by Claude Code.
|
||||
- **frontend-ui-ux-engineer** (`google/gemini-3-pro-preview`): A designer turned developer. Creates stunning UIs. Uses Gemini because its creativity and UI code generation are superior.
|
||||
- **document-writer** (`google/gemini-3-pro-preview`): A technical writing expert. Gemini is a wordsmith; it writes prose that flows naturally.
|
||||
- **multimodal-looker** (`google/gemini-2.5-flash`): Specialized agent for visual content interpretation. Analyzes PDFs, images, and diagrams to extract information.
|
||||
|
||||
Each agent is automatically invoked by the main agent, but you can also explicitly request them:
|
||||
|
||||
@@ -257,6 +270,12 @@ The features you use in your editor—other agents cannot access them. Oh My Ope
|
||||
- The default `glob` lacks timeout. If ripgrep hangs, it waits indefinitely.
|
||||
- This tool enforces timeouts and kills the process on expiration.
|
||||
|
||||
#### Built-in Multimodal Tools
|
||||
|
||||
- **look_at**: Analyzes media files (PDFs, images, diagrams) that require visual interpretation using Gemini 2.5 Flash. Inspired by Sourcegraph Ampcode's `look_at` tool.
|
||||
- Parameters: `file_path` (absolute path), `goal` (what to extract)
|
||||
- Use cases: PDF text extraction, image description, diagram analysis
|
||||
|
||||
#### Built-in MCPs
|
||||
|
||||
- **websearch_exa**: Exa AI web search. Performs real-time web searches and can scrape content from specific URLs. Returns LLM-optimized context from relevant websites.
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "oh-my-opencode",
|
||||
"version": "1.0.1",
|
||||
"version": "1.0.2",
|
||||
"description": "OpenCode plugin - custom agents (oracle, librarian) and enhanced features",
|
||||
"main": "dist/index.js",
|
||||
"types": "dist/index.d.ts",
|
||||
|
||||
@@ -4,6 +4,7 @@ import { librarianAgent } from "./librarian"
|
||||
import { exploreAgent } from "./explore"
|
||||
import { frontendUiUxEngineerAgent } from "./frontend-ui-ux-engineer"
|
||||
import { documentWriterAgent } from "./document-writer"
|
||||
import { multimodalLookerAgent } from "./multimodal-looker"
|
||||
|
||||
export const builtinAgents: Record<string, AgentConfig> = {
|
||||
oracle: oracleAgent,
|
||||
@@ -11,6 +12,7 @@ export const builtinAgents: Record<string, AgentConfig> = {
|
||||
explore: exploreAgent,
|
||||
"frontend-ui-ux-engineer": frontendUiUxEngineerAgent,
|
||||
"document-writer": documentWriterAgent,
|
||||
"multimodal-looker": multimodalLookerAgent,
|
||||
}
|
||||
|
||||
export * from "./types"
|
||||
|
||||
42
src/agents/multimodal-looker.ts
Normal file
42
src/agents/multimodal-looker.ts
Normal file
@@ -0,0 +1,42 @@
|
||||
import type { AgentConfig } from "@opencode-ai/sdk"
|
||||
|
||||
export const multimodalLookerAgent: AgentConfig = {
|
||||
description:
|
||||
"Analyze media files (PDFs, images, diagrams) that require interpretation beyond raw text. Extracts specific information or summaries from documents, describes visual content. Use when you need analyzed/extracted data rather than literal file contents.",
|
||||
mode: "subagent",
|
||||
model: "google/gemini-2.5-flash",
|
||||
temperature: 0.1,
|
||||
tools: { Read: true },
|
||||
prompt: `You interpret media files that cannot be read as plain text.
|
||||
|
||||
Your job: examine the attached file and extract ONLY what was requested.
|
||||
|
||||
When to use you:
|
||||
- Media files the Read tool cannot interpret
|
||||
- Extracting specific information or summaries from documents
|
||||
- Describing visual content in images or diagrams
|
||||
- When analyzed/extracted data is needed, not raw file contents
|
||||
|
||||
When NOT to use you:
|
||||
- Source code or plain text files needing exact contents (use Read)
|
||||
- Files that need editing afterward (need literal content from Read)
|
||||
- Simple file reading where no interpretation is needed
|
||||
|
||||
How you work:
|
||||
1. Receive a file path and a goal describing what to extract
|
||||
2. Read and analyze the file deeply
|
||||
3. Return ONLY the relevant extracted information
|
||||
4. The main agent never processes the raw file - you save context tokens
|
||||
|
||||
For PDFs: extract text, structure, tables, data from specific sections
|
||||
For images: describe layouts, UI elements, text, diagrams, charts
|
||||
For diagrams: explain relationships, flows, architecture depicted
|
||||
|
||||
Response rules:
|
||||
- Return extracted information directly, no preamble
|
||||
- If info not found, state clearly what's missing
|
||||
- Match the language of the request
|
||||
- Be thorough on the goal, concise on everything else
|
||||
|
||||
Your output goes straight to the main agent for continued work.`,
|
||||
}
|
||||
@@ -6,6 +6,7 @@ export type AgentName =
|
||||
| "explore"
|
||||
| "frontend-ui-ux-engineer"
|
||||
| "document-writer"
|
||||
| "multimodal-looker"
|
||||
|
||||
export type AgentOverrideConfig = Partial<AgentConfig>
|
||||
|
||||
|
||||
@@ -5,6 +5,7 @@ import { librarianAgent } from "./librarian"
|
||||
import { exploreAgent } from "./explore"
|
||||
import { frontendUiUxEngineerAgent } from "./frontend-ui-ux-engineer"
|
||||
import { documentWriterAgent } from "./document-writer"
|
||||
import { multimodalLookerAgent } from "./multimodal-looker"
|
||||
import { deepMerge } from "../shared"
|
||||
|
||||
const allBuiltinAgents: Record<AgentName, AgentConfig> = {
|
||||
@@ -13,6 +14,7 @@ const allBuiltinAgents: Record<AgentName, AgentConfig> = {
|
||||
explore: exploreAgent,
|
||||
"frontend-ui-ux-engineer": frontendUiUxEngineerAgent,
|
||||
"document-writer": documentWriterAgent,
|
||||
"multimodal-looker": multimodalLookerAgent,
|
||||
}
|
||||
|
||||
function mergeAgentConfig(
|
||||
|
||||
@@ -3,6 +3,7 @@ import { homedir } from "os"
|
||||
import { join, basename } from "path"
|
||||
import type { AgentConfig } from "@opencode-ai/sdk"
|
||||
import { parseFrontmatter } from "../../shared/frontmatter"
|
||||
import { isMarkdownFile } from "../../shared/file-utils"
|
||||
import type { AgentScope, AgentFrontmatter, LoadedAgent } from "./types"
|
||||
|
||||
function parseToolsConfig(toolsStr?: string): Record<string, boolean> | undefined {
|
||||
@@ -18,10 +19,6 @@ function parseToolsConfig(toolsStr?: string): Record<string, boolean> | undefine
|
||||
return result
|
||||
}
|
||||
|
||||
function isMarkdownFile(entry: { name: string; isFile: () => boolean }): boolean {
|
||||
return !entry.name.startsWith(".") && entry.name.endsWith(".md") && entry.isFile()
|
||||
}
|
||||
|
||||
function loadAgentsFromDir(agentsDir: string, scope: AgentScope): LoadedAgent[] {
|
||||
if (!existsSync(agentsDir)) {
|
||||
return []
|
||||
|
||||
@@ -3,12 +3,9 @@ import { homedir } from "os"
|
||||
import { join, basename } from "path"
|
||||
import { parseFrontmatter } from "../../shared/frontmatter"
|
||||
import { sanitizeModelField } from "../../shared/model-sanitizer"
|
||||
import { isMarkdownFile } from "../../shared/file-utils"
|
||||
import type { CommandScope, CommandDefinition, CommandFrontmatter, LoadedCommand } from "./types"
|
||||
|
||||
function isMarkdownFile(entry: { name: string; isFile: () => boolean }): boolean {
|
||||
return !entry.name.startsWith(".") && entry.name.endsWith(".md") && entry.isFile()
|
||||
}
|
||||
|
||||
function loadCommandsFromDir(commandsDir: string, scope: CommandScope): LoadedCommand[] {
|
||||
if (!existsSync(commandsDir)) {
|
||||
return []
|
||||
|
||||
@@ -1,8 +1,9 @@
|
||||
import { existsSync, readdirSync, readFileSync, lstatSync, readlinkSync } from "fs"
|
||||
import { existsSync, readdirSync, readFileSync } from "fs"
|
||||
import { homedir } from "os"
|
||||
import { join, resolve } from "path"
|
||||
import { join } from "path"
|
||||
import { parseFrontmatter } from "../../shared/frontmatter"
|
||||
import { sanitizeModelField } from "../../shared/model-sanitizer"
|
||||
import { resolveSymlink } from "../../shared/file-utils"
|
||||
import type { CommandDefinition } from "../claude-code-command-loader/types"
|
||||
import type { SkillScope, SkillMetadata, LoadedSkillAsCommand } from "./types"
|
||||
|
||||
@@ -21,14 +22,7 @@ function loadSkillsFromDir(skillsDir: string, scope: SkillScope): LoadedSkillAsC
|
||||
|
||||
if (!entry.isDirectory() && !entry.isSymbolicLink()) continue
|
||||
|
||||
let resolvedPath = skillPath
|
||||
try {
|
||||
if (lstatSync(skillPath, { throwIfNoEntry: false })?.isSymbolicLink()) {
|
||||
resolvedPath = resolve(skillPath, "..", readlinkSync(skillPath))
|
||||
}
|
||||
} catch {
|
||||
continue
|
||||
}
|
||||
const resolvedPath = resolveSymlink(skillPath)
|
||||
|
||||
const skillMdPath = join(resolvedPath, "SKILL.md")
|
||||
if (!existsSync(skillMdPath)) continue
|
||||
|
||||
12
src/index.ts
12
src/index.ts
@@ -41,7 +41,7 @@ import {
|
||||
getCurrentSessionTitle,
|
||||
} from "./features/claude-code-session-state";
|
||||
import { updateTerminalTitle } from "./features/terminal";
|
||||
import { builtinTools, createCallOmoAgent, createBackgroundTools } from "./tools";
|
||||
import { builtinTools, createCallOmoAgent, createBackgroundTools, createLookAt } from "./tools";
|
||||
import { BackgroundManager } from "./features/background-agent";
|
||||
import { createBuiltinMcps } from "./mcp";
|
||||
import { OhMyOpenCodeConfigSchema, type OhMyOpenCodeConfig, type HookName } from "./config";
|
||||
@@ -218,6 +218,7 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
|
||||
const backgroundTools = createBackgroundTools(backgroundManager, ctx.client);
|
||||
|
||||
const callOmoAgent = createCallOmoAgent(ctx, backgroundManager);
|
||||
const lookAt = createLookAt(ctx);
|
||||
|
||||
const googleAuthHooks = pluginConfig.google_auth
|
||||
? await createGoogleAntigravityAuthPlugin(ctx)
|
||||
@@ -230,6 +231,7 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
|
||||
...builtinTools,
|
||||
...backgroundTools,
|
||||
call_omo_agent: callOmoAgent,
|
||||
look_at: lookAt,
|
||||
},
|
||||
|
||||
"chat.message": async (input, output) => {
|
||||
@@ -268,6 +270,14 @@ const OhMyOpenCodePlugin: Plugin = async (ctx) => {
|
||||
call_omo_agent: false,
|
||||
};
|
||||
}
|
||||
if (config.agent["multimodal-looker"]) {
|
||||
config.agent["multimodal-looker"].tools = {
|
||||
...config.agent["multimodal-looker"].tools,
|
||||
task: false,
|
||||
call_omo_agent: false,
|
||||
look_at: false,
|
||||
};
|
||||
}
|
||||
|
||||
const mcpResult = (pluginConfig.claude_code?.mcp ?? true)
|
||||
? await loadMcpConfigs()
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
const DANGEROUS_KEYS = new Set(["__proto__", "constructor", "prototype"]);
|
||||
const MAX_DEPTH = 50;
|
||||
|
||||
function isPlainObject(value: unknown): value is Record<string, unknown> {
|
||||
export function isPlainObject(value: unknown): value is Record<string, unknown> {
|
||||
return (
|
||||
typeof value === "object" &&
|
||||
value !== null &&
|
||||
|
||||
26
src/shared/file-utils.ts
Normal file
26
src/shared/file-utils.ts
Normal file
@@ -0,0 +1,26 @@
|
||||
import { lstatSync, readlinkSync } from "fs"
|
||||
import { resolve } from "path"
|
||||
|
||||
export function isMarkdownFile(entry: { name: string; isFile: () => boolean }): boolean {
|
||||
return !entry.name.startsWith(".") && entry.name.endsWith(".md") && entry.isFile()
|
||||
}
|
||||
|
||||
export function isSymbolicLink(filePath: string): boolean {
|
||||
try {
|
||||
return lstatSync(filePath, { throwIfNoEntry: false })?.isSymbolicLink() ?? false
|
||||
} catch {
|
||||
return false
|
||||
}
|
||||
}
|
||||
|
||||
export function resolveSymlink(filePath: string): string {
|
||||
try {
|
||||
const stats = lstatSync(filePath, { throwIfNoEntry: false })
|
||||
if (stats?.isSymbolicLink()) {
|
||||
return resolve(filePath, "..", readlinkSync(filePath))
|
||||
}
|
||||
return filePath
|
||||
} catch {
|
||||
return filePath
|
||||
}
|
||||
}
|
||||
@@ -8,3 +8,4 @@ export * from "./tool-name"
|
||||
export * from "./pattern-matcher"
|
||||
export * from "./hook-disabled"
|
||||
export * from "./deep-merge"
|
||||
export * from "./file-utils"
|
||||
|
||||
@@ -1,3 +1,5 @@
|
||||
import { isPlainObject } from "./deep-merge"
|
||||
|
||||
export function camelToSnake(str: string): string {
|
||||
return str.replace(/[A-Z]/g, (letter) => `_${letter.toLowerCase()}`)
|
||||
}
|
||||
@@ -6,10 +8,6 @@ export function snakeToCamel(str: string): string {
|
||||
return str.replace(/_([a-z])/g, (_, letter) => letter.toUpperCase())
|
||||
}
|
||||
|
||||
function isPlainObject(value: unknown): value is Record<string, unknown> {
|
||||
return typeof value === "object" && value !== null && !Array.isArray(value)
|
||||
}
|
||||
|
||||
export function objectToSnakeCase(
|
||||
obj: Record<string, unknown>,
|
||||
deep: boolean = true
|
||||
|
||||
@@ -34,6 +34,7 @@ import type { BackgroundManager } from "../features/background-agent"
|
||||
type OpencodeClient = PluginInput["client"]
|
||||
|
||||
export { createCallOmoAgent } from "./call-omo-agent"
|
||||
export { createLookAt } from "./look-at"
|
||||
|
||||
export function createBackgroundTools(manager: BackgroundManager, client: OpencodeClient) {
|
||||
return {
|
||||
|
||||
23
src/tools/look-at/constants.ts
Normal file
23
src/tools/look-at/constants.ts
Normal file
@@ -0,0 +1,23 @@
|
||||
export const MULTIMODAL_LOOKER_AGENT = "multimodal-looker" as const
|
||||
|
||||
export const LOOK_AT_DESCRIPTION = `Analyze media files (PDFs, images, diagrams) that require visual interpretation.
|
||||
|
||||
Use this tool to extract specific information from files that cannot be processed as plain text:
|
||||
- PDF documents: extract text, tables, structure, specific sections
|
||||
- Images: describe layouts, UI elements, text content, diagrams
|
||||
- Charts/Graphs: explain data, trends, relationships
|
||||
- Screenshots: identify UI components, text, visual elements
|
||||
- Architecture diagrams: explain flows, connections, components
|
||||
|
||||
Parameters:
|
||||
- file_path: Absolute path to the file to analyze
|
||||
- goal: What specific information to extract (be specific for better results)
|
||||
|
||||
Examples:
|
||||
- "Extract all API endpoints from this OpenAPI spec PDF"
|
||||
- "Describe the UI layout and components in this screenshot"
|
||||
- "Explain the data flow in this architecture diagram"
|
||||
- "List all table data from page 3 of this PDF"
|
||||
|
||||
This tool uses a separate context window with Gemini 2.5 Flash for multimodal analysis,
|
||||
saving tokens in the main conversation while providing accurate visual interpretation.`
|
||||
3
src/tools/look-at/index.ts
Normal file
3
src/tools/look-at/index.ts
Normal file
@@ -0,0 +1,3 @@
|
||||
export * from "./types"
|
||||
export * from "./constants"
|
||||
export { createLookAt } from "./tools"
|
||||
91
src/tools/look-at/tools.ts
Normal file
91
src/tools/look-at/tools.ts
Normal file
@@ -0,0 +1,91 @@
|
||||
import { tool, type PluginInput } from "@opencode-ai/plugin"
|
||||
import { LOOK_AT_DESCRIPTION, MULTIMODAL_LOOKER_AGENT } from "./constants"
|
||||
import type { LookAtArgs } from "./types"
|
||||
import { log } from "../../shared/logger"
|
||||
|
||||
export function createLookAt(ctx: PluginInput) {
|
||||
return tool({
|
||||
description: LOOK_AT_DESCRIPTION,
|
||||
args: {
|
||||
file_path: tool.schema.string().describe("Absolute path to the file to analyze"),
|
||||
goal: tool.schema.string().describe("What specific information to extract from the file"),
|
||||
},
|
||||
async execute(args: LookAtArgs, toolContext) {
|
||||
log(`[look_at] Analyzing file: ${args.file_path}, goal: ${args.goal}`)
|
||||
|
||||
const prompt = `Analyze this file and extract the requested information.
|
||||
|
||||
File path: ${args.file_path}
|
||||
Goal: ${args.goal}
|
||||
|
||||
Read the file using the Read tool, then provide ONLY the extracted information that matches the goal.
|
||||
Be thorough on what was requested, concise on everything else.
|
||||
If the requested information is not found, clearly state what is missing.`
|
||||
|
||||
log(`[look_at] Creating session with parent: ${toolContext.sessionID}`)
|
||||
const createResult = await ctx.client.session.create({
|
||||
body: {
|
||||
parentID: toolContext.sessionID,
|
||||
title: `look_at: ${args.goal.substring(0, 50)}`,
|
||||
},
|
||||
})
|
||||
|
||||
if (createResult.error) {
|
||||
log(`[look_at] Session create error:`, createResult.error)
|
||||
return `Error: Failed to create session: ${createResult.error}`
|
||||
}
|
||||
|
||||
const sessionID = createResult.data.id
|
||||
log(`[look_at] Created session: ${sessionID}`)
|
||||
|
||||
log(`[look_at] Sending prompt to session ${sessionID}`)
|
||||
await ctx.client.session.prompt({
|
||||
path: { id: sessionID },
|
||||
body: {
|
||||
agent: MULTIMODAL_LOOKER_AGENT,
|
||||
tools: {
|
||||
task: false,
|
||||
call_omo_agent: false,
|
||||
look_at: false,
|
||||
},
|
||||
parts: [{ type: "text", text: prompt }],
|
||||
},
|
||||
})
|
||||
|
||||
log(`[look_at] Prompt sent, fetching messages...`)
|
||||
|
||||
const messagesResult = await ctx.client.session.messages({
|
||||
path: { id: sessionID },
|
||||
})
|
||||
|
||||
if (messagesResult.error) {
|
||||
log(`[look_at] Messages error:`, messagesResult.error)
|
||||
return `Error: Failed to get messages: ${messagesResult.error}`
|
||||
}
|
||||
|
||||
const messages = messagesResult.data
|
||||
log(`[look_at] Got ${messages.length} messages`)
|
||||
|
||||
// eslint-disable-next-line @typescript-eslint/no-explicit-any
|
||||
const lastAssistantMessage = messages
|
||||
.filter((m: any) => m.info.role === "assistant")
|
||||
.sort((a: any, b: any) => (b.info.time?.created || 0) - (a.info.time?.created || 0))[0]
|
||||
|
||||
if (!lastAssistantMessage) {
|
||||
log(`[look_at] No assistant message found`)
|
||||
return `Error: No response from multimodal-looker agent`
|
||||
}
|
||||
|
||||
log(`[look_at] Found assistant message with ${lastAssistantMessage.parts.length} parts`)
|
||||
|
||||
// eslint-disable-next-line @typescript-eslint/no-explicit-any
|
||||
const textParts = lastAssistantMessage.parts.filter((p: any) => p.type === "text")
|
||||
// eslint-disable-next-line @typescript-eslint/no-explicit-any
|
||||
const responseText = textParts.map((p: any) => p.text).join("\n")
|
||||
|
||||
log(`[look_at] Got response, length: ${responseText.length}`)
|
||||
|
||||
return responseText
|
||||
},
|
||||
})
|
||||
}
|
||||
4
src/tools/look-at/types.ts
Normal file
4
src/tools/look-at/types.ts
Normal file
@@ -0,0 +1,4 @@
|
||||
export interface LookAtArgs {
|
||||
file_path: string
|
||||
goal: string
|
||||
}
|
||||
@@ -1,9 +1,10 @@
|
||||
import { tool } from "@opencode-ai/plugin"
|
||||
import { existsSync, readdirSync, lstatSync, readlinkSync, readFileSync } from "fs"
|
||||
import { existsSync, readdirSync, readFileSync } from "fs"
|
||||
import { homedir } from "os"
|
||||
import { join, resolve, basename } from "path"
|
||||
import { join, basename } from "path"
|
||||
import { z } from "zod/v4"
|
||||
import { parseFrontmatter, resolveCommandsInText } from "../../shared"
|
||||
import { resolveSymlink } from "../../shared/file-utils"
|
||||
import { SkillFrontmatterSchema } from "./types"
|
||||
import type { SkillScope, SkillMetadata, SkillInfo, LoadedSkill, SkillFrontmatter } from "./types"
|
||||
|
||||
@@ -37,15 +38,7 @@ function discoverSkillsFromDir(
|
||||
const skillPath = join(skillsDir, entry.name)
|
||||
|
||||
if (entry.isDirectory() || entry.isSymbolicLink()) {
|
||||
let resolvedPath = skillPath
|
||||
try {
|
||||
const stats = lstatSync(skillPath, { throwIfNoEntry: false })
|
||||
if (stats?.isSymbolicLink()) {
|
||||
resolvedPath = resolve(skillPath, "..", readlinkSync(skillPath))
|
||||
}
|
||||
} catch {
|
||||
continue
|
||||
}
|
||||
const resolvedPath = resolveSymlink(skillPath)
|
||||
|
||||
const skillMdPath = join(resolvedPath, "SKILL.md")
|
||||
if (!existsSync(skillMdPath)) continue
|
||||
@@ -83,18 +76,6 @@ const skillListForDescription = availableSkills
|
||||
.map((s) => `- ${s.name}: ${s.description} (${s.scope})`)
|
||||
.join("\n")
|
||||
|
||||
function resolveSymlink(skillPath: string): string {
|
||||
try {
|
||||
const stats = lstatSync(skillPath, { throwIfNoEntry: false })
|
||||
if (stats?.isSymbolicLink()) {
|
||||
return resolve(skillPath, "..", readlinkSync(skillPath))
|
||||
}
|
||||
return skillPath
|
||||
} catch {
|
||||
return skillPath
|
||||
}
|
||||
}
|
||||
|
||||
async function parseSkillMd(skillPath: string): Promise<SkillInfo | null> {
|
||||
const resolvedPath = resolveSymlink(skillPath)
|
||||
const skillMdPath = join(resolvedPath, "SKILL.md")
|
||||
|
||||
@@ -3,6 +3,7 @@ import { existsSync, readdirSync, readFileSync } from "fs"
|
||||
import { homedir } from "os"
|
||||
import { join, basename, dirname } from "path"
|
||||
import { parseFrontmatter, resolveCommandsInText, resolveFileReferencesInText, sanitizeModelField } from "../../shared"
|
||||
import { isMarkdownFile } from "../../shared/file-utils"
|
||||
import type { CommandScope, CommandMetadata, CommandInfo } from "./types"
|
||||
|
||||
function discoverCommandsFromDir(commandsDir: string, scope: CommandScope): CommandInfo[] {
|
||||
@@ -14,9 +15,7 @@ function discoverCommandsFromDir(commandsDir: string, scope: CommandScope): Comm
|
||||
const commands: CommandInfo[] = []
|
||||
|
||||
for (const entry of entries) {
|
||||
if (entry.name.startsWith(".")) continue
|
||||
if (!entry.name.endsWith(".md")) continue
|
||||
if (!entry.isFile()) continue
|
||||
if (!isMarkdownFile(entry)) continue
|
||||
|
||||
const commandPath = join(commandsDir, entry.name)
|
||||
const commandName = basename(entry.name, ".md")
|
||||
|
||||
Reference in New Issue
Block a user