Compare commits

10 Commits

Author SHA1 Message Date
chungyeong
5cf9ad131a feat(conversation): cheap-default DeepSeek + Enter-send + model pill
- default-interactive@1 model: claude-haiku-4-5 → deepseek/deepseek-chat
  (input $0.28/$1.12 per 1M; ~75% cheaper than haiku).  Fallback swapped to haiku.
- conversation textarea keydown:
  - Enter → send (ignored during IME composition)
  - Shift+Enter → newline
  - Cmd/Ctrl+Enter → send (backward compatible)
  - Placeholder hint updated.
- Added a model pill to the conversation top-bar (#session-model-pill) that shows
  the current session's active model as a monospace badge.  Resolves the
  recurring "which model is this?" confusion.
- Added .conv-model-pill (gray pill) to style.css.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 02:02:19 +09:00
chungyeong
9a02f22acb feat(my-deepagent): v0.4 chat UX boost + A/B live verification
Raises the chat experience to Claude Code parity and verifies 7 core flows against live OpenRouter.

A - Live verification (scripts/live_verify.py, 7 PASS, about $0.02):
- A1 1-turn chat (CLI-eq) → Haiku 4.5 replies in Korean
- A2 sessions resume → re-submitting the same session_id restores LangGraph state
- A3 /skill <name> system inject → SKILL.md ("Korean haiku, 3 lines") produces exactly
  a 3-line Korean poem (strong evidence that the skill controls LLM behavior)
- A4 /plan → /approve → LLM emits plan markdown only, no blocked-tool attempts
- A5 /agents spawn → real sub-agent ainvoke + parent stream push
- A6 auto-compaction → 14 messages → 4 archived + 77-token summary
- A7 /workflow wiring → pre-validates role↔persona matching

B1 - Markdown rendering:
- app.js pure-JS mini parser: code fences / ATX headers / ul/ol / `code`/**bold**/
  *italic*/[link](url)
- XSS policy preserved: createElement + textContent only.  Link hrefs are
  restricted to the http(s): scheme.

B2 - System event card (collapsible):
- _classifySystemMessage classifies messages by prefix ([sub-agent .../workflow .../Earlier
  conversation history/당신은 plan mode/The user APPROVED/skill]) and renders each
  as a <details> card.

B3 - Token streaming via AsyncCallbackHandler:
- ChatOpenAI(streaming=True)
- _StreamingChunkPusher (AsyncCallbackHandler) → asyncio.Queue per session.
- SSE _session_event_stream drains the queue → event: chunk SSE.  100ms poll.
- Ordering guarantee: chunk drain → then message rows are yielded (tokens are
  visible before the placeholder is replaced by the message).
- Live: 5 chunk events + 1 final message; "안녕하세요, / 무 / 엇을 도와드 /
  릴까요?" ("Hello, how can I help you?") pushed token by token.

B4 - Cancel mid-turn:
- POST /api/sessions/{id}/abort + app.state.pending_per_session index.
- When a new user message arrives, the previous in-flight task is cancelled automatically.
- "■ Stop" button: visible while waiting, hidden on completion/cancel.

B5 - IME composition-safe Enter:
- compositionstart/compositionend flag: the Enter that commits a Korean IME candidate is ignored.
- Cmd/Ctrl+Enter always sends.

DB hot-fix:
- Database.__init__ pool_pre_ping=True: fixes the 500s under SSE load caused by
  stale Postgres asyncpg connections.

Misc:
- createNewSession repo_path: "" → "." (passes min_length=1 validation).
- test_conversation_gui.py fake_invoke updated to accept the chunk_queue kwarg.

Gates:
- ruff / format / mypy: PASS (143 source files)
- pytest -q --ignore=tests/integration/test_e2e_workflow.py
  --ignore=tests/integration/test_openrouter_smoke.py: 709 passed

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 01:08:40 +09:00
chungyeong
6d371afadd feat(my-deepagent): v0.4 — workflow generator UI + hot-reload + UX polish
Create new workflow templates in the browser without writing YAML, and register them immediately.
Also tidies nav, copy, and empty states across /new.html / index.html /
new-workflow.html / runs.html / conversation.html.

A. /new.html UX:
- Title "New Run" → "Run workflow (advanced)"
- Top info-box: "Free-form chat is not here → main page"
- One-line hint on every field
- Persona override collapsed in <details>

B. Nav reordering (5 pages):
- "Chat" is nav-primary; the rest are nav-secondary (small and dim)

C. Main-page notice + CSS:
- Added a "👋 my-deepagent" info-box to the main page /
- New styles: .info-box / .nav-primary / .nav-secondary / .wf-*

D. Workflow hot-reload:
- api/deps.py get_workflows checks an mtime tuple on every request and reloads on change
- lifespan now uses _load_workflows_combined so the user dir is included as well

E. Workflow generator:
- POST /api/workflows: CreateWorkflowRequest → WorkflowTemplate validate →
  saved to <data_dir>/workflows/<name>@<version>.yaml.  Duplicate 409, validation 422.
- static/new-workflow.html: basic info / Roles / Phases / YAML preview
- app.js bootstrapWorkflowGenerator: capability chip toggle, dynamic role select,
  live YAML preview, XSS policy preserved

Tests (test_workflow_generator.py, 7 new):
- Page 200 + markup
- POST happy / 422 (empty roles) / 422 (unknown role) / 409 (dup)
- GET hot-reload after POST
- GET hot-reload after external file drop

Gates:
- ruff / format / mypy: PASS (142 source files)
- pytest -q --ignore=tests/integration/test_e2e_workflow.py
  --ignore=tests/integration/test_openrouter_smoke.py: 709 passed (+7 new)
- Live smoke: / / new.html / new-workflow.html all return 200, screenshot OK

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 00:38:46 +09:00
chungyeong
40ef833ad3 fix(db): enable pool_pre_ping on async engine — 500 on stale Postgres connection
Symptom:
- During live smoke, while the SSE poll loop was borrowing a connection every 0.5s,
  the asyncpg pool handed out a stale connection whose socket had been closed
  by an idle/network blip.  The next request (GET /api/sessions) failed with
  `sqlalchemy.exc.InterfaceError: connection is closed` and returned 500.

Cause:
- `create_async_engine(database_url, poolclass=None, echo=False)`:
  pool_pre_ping was not set, so SQLAlchemy did not check that a connection
  was alive at checkout.

Fix:
- Added one line, `pool_pre_ping=True`.  SQLAlchemy sends a quick
  SELECT 1 (a protocol-level ping for asyncpg) right before each checkout and, on
  failure, invalidates the connection in the pool and issues a new one.  This is the
  standard pattern recommended by SQLAlchemy.
- Verified under load (SSE 0.5s polling + REST): after restart, consecutive
  GET /api/sessions calls all return 200.

Tests:
- ruff / mypy: PASS (141 files)
- pytest tests/integration/test_persistence.py: 20 passed (no regressions)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 00:24:24 +09:00
chungyeong
96c8849e2c fix(my-deepagent): v0.3 plan-conformance — 18-item gap fix across PR #2-#9
Fixes 18 omissions/spec violations found by comparing the first v0.3 implementation against plan-v0.3.
Three self-review rounds (omissions·incomplete work / errors·edge cases / over-optimization) all PASS.

PR #5 plan-mode (3 items):
- Added write_todos to BLOCKED_TOOLS_IN_PLAN_MODE
- System message injected on /plan (_PLAN_MODE_SYSTEM_PROMPT)
- On /approve, the last assistant message is injected as an "approved plan" system message
- New InteractiveSession._pending_system_messages infrastructure

PR #2 compaction (1 item):
- Added CompactionResult.summary_text, injected into the first ainvoke of the next thread

PR #3 auto-memory (6 items):
- global memory dir + bootstrap
- Formal frontmatter name/description/type + MemoryEntry/MemoryType
- _infer_memory_type (keyword heuristic, no LLM)
- _scrub_secrets (OpenRouter/Anthropic/OpenAI/AWS/Bearer redaction)
- /memory show <name> subcommand
- /remember [--global] / /forget [--global] scope toggle

PR #4 skills (3 items):
- project_skills_dir + two scopes (global / project) merged with last-wins
- /skill <name> injects the body (queue_system_message); previously REPL output only
- Separate /skills show <name> subcommand

PR #6 sub-agent (4 items):
- budget.py `session:<uuid>` scope + automatic pass-through from CostMiddleware
- resolve_root_session_id walk-up (cycle guard) + sub-agents charge the root
- run_subagent_to_completion performs a real ainvoke + pushes the result to the parent
- /agents subcommand structure (list / spawn / show) + parent system msg on spawn

PR #7 governance (1 item):
- bootstrap_user_dirs: idempotent bootstrap of instructions + global/memory + skills +
  projects in a single call

PR #8 Web GUI (1 item):
- index.html → session list, runs.html (new) → workflow archive
- conversation.html ?session=<id> deep-link

PR #9 workflow integration (2 items):
- /workflow runs WorkflowEngine.run in the background + accumulates progress messages in the stream
- /binding show <workflow-name[@version]> argument support

Tests (+17, 685 → 702 passed):
- test_plan_mode: write_todos blocking + blocklist sanity
- test_memory: scrub + type inference + override
- test_skills: project override + find_skill + resolve_skill_sources(pk)
- test_subagents: resolve_root_session_id chain + missing fallback
- test_budget: session: scope accumulation
- test_instructions: governance bootstrap + idempotency
- test_api_static: new runs.html + restructured index.html

Gates:
- ruff check / format --check / mypy: PASS (141 source files)
- pytest -q --ignore=tests/integration/test_e2e_workflow.py
  --ignore=tests/integration/test_openrouter_smoke.py: 702 passed

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 00:03:08 +09:00
chungyeong
361d6d7636 feat(my-deepagent): v0.3 PR #9 — workflow optionization + user dir wiring
Demotes the workflow engine from the main path to an "option": it is active only when
the user explicitly calls `/workflow <name>`.  In exchange, users can drop YAML files into
`<data_dir>/personas/` and `<data_dir>/workflows/` to register their own
personas and workflows (seed override is possible).

Core behavior:
- `ensure_user_dirs_initialized(config)`: `mkdir -p` for both user directories,
  idempotent.  Called at every REPL start.
- `load_combined_personas(config, seed_dir)`: merges seed (strict) + user
  (best-effort per-file skip).  Dedupe key `(name, version)`,
  user-overrides-seed.  A single broken user YAML cannot kill the REPL.
- `load_combined_workflows(config, seed_dir)`: same for workflows.

Data / library:
- `user_dirs.py` (new): `user_personas_dir`, `user_workflows_dir`,
  `ensure_user_dirs_initialized`, `load_combined_personas`,
  `load_combined_workflows`, `_safe_load_personas`, `_safe_load_workflows`.

REPL integration (`cli/interactive.py`):
- `InteractiveSession(..., workflows=...)` signature extended.
- `_interactive_loop_async` now bootstraps the user dirs and uses the combined loaders.
- 4 new slash commands:
  - `/personas`: list of loaded personas (active one marked)
  - `/workflows`: list of loaded workflow templates (phase/role counts, file name)
  - `/workflow <name>`: points to the `mydeepagent run` command (for now this only
    prints guidance; the actual kick-off is left to a separate PR or the `mydeepagent run` CLI)
  - `/binding show`: shows required_capabilities per role for each workflow
- To avoid the complexity warning (C901) in `_register_workflow_slash`, the print helpers
  (`_print_personas` etc.) were extracted to module level.

Tests (`tests/integration/test_user_dirs.py`, 10 cases):
- Bootstrap idempotency
- persona seed-only / seed+user / user-overrides-seed / malformed-user-skip
- The same 4 for workflows
- Empty user directory handling

Gates:
- ruff check / format --check / mypy: PASS
- pytest -q --ignore=tests/integration/test_e2e_workflow.py
  --ignore=tests/integration/test_openrouter_smoke.py: 685 passed (including 10 new)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 21:11:19 +09:00
chungyeong
e326c07dcb feat(my-deepagent): v0.3 PR #8 — conversation-centric Web GUI (/conversation.html)
Demotes the workflow run page to an archive and makes a chat-style conversation
thread the first screen users see.  Same UX as Claude Code's Web GUI.

Core behavior:
- On the new `/conversation.html` page, pick a session in the picker or create one
  with the "New chat" button, then type a message.  Send with Cmd/Ctrl+Enter.
- POST /api/sessions/{id}/messages persists the user MessageRow, responds 200 immediately,
  and fires the background invoke via `asyncio.create_task(invoke_session_agent(...))`.
- The background task reuses the LangGraph saver opened once in the lifespan and handles
  agent.ainvoke → assistant MessageRow persistence → automatic compaction.
- The existing SSE stream (`/api/sessions/{id}/stream`) pushes new messages, and
  the frontend `EventSource` receives them and renders them in the thread.

New / modified files:
- `static/conversation.html` (new): chat UI markup.  data-page="conversation".
- `static/app.js`: new page handler `bootstrapConversationPage` +
  session picker + message thread rendering + SSE subscription + Cmd/Ctrl+Enter shortcut.
  Same XSS policy: all user content uses `textContent` only.
- `static/style.css`: chat UI styles such as `.messages-thread`, `.msg-bubble`,
  `.conv-topbar`, `.conv-input-bar`.
- `api/app.py`: the lifespan opens the LangGraph saver once and stores it in
  `app.state.saver` (Postgres only).
- `api/agent_runner.py` (new): `invoke_session_agent(...)` rebuilds the same stack as the
  REPL's `InteractiveSession + _invoke_and_stream` for the HTTP background
  context.  Failures are logged and the function returns.
- `api/routes/sessions.py`: POST /messages fires the background task and keeps the ref in
  the `app.state.pending_invocations` set (RUF006 / prevents GC drop).

Tests (`tests/integration/test_conversation_gui.py`, 4 cases):
- GET /conversation.html → 200 + required markup
- POST /messages → 200 + user row persisted + stub runner call verified
- The background task ref is held in `pending_invocations` and discarded automatically on completion
- The stub runner persists the assistant row → user + assistant sequence verified

Gates:
- ruff check / format --check / mypy: PASS
- pytest -q --ignore=tests/integration/test_e2e_workflow.py
  --ignore=tests/integration/test_openrouter_smoke.py: 675 passed (including 4 new)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 21:03:09 +09:00
chungyeong
61b34af0e4 feat(my-deepagent): v0.3 PR #7 — MYDEEPAGENT.md global+project hierarchy
Equivalent of Claude Code's CLAUDE.md global/project layering.  At session start two
files are loaded automatically and injected into the system prompt:
- Global: <config.data_dir>/MYDEEPAGENT.md (template auto-created, idempotent)
- Project: <repo>/MYDEEPAGENT.md (loaded only if present, never auto-created)

The order is [global → project → MEMORY.md → entry .md], so under deepagents
`MemoryMiddleware`'s "later overrides earlier" rule, later files can override
general guidance with more specific context.

Data / library:
- `instructions.py` (new):
  - `global_instructions_path(config)`, `project_instructions_path(repo_root)`
  - `ensure_global_instructions_initialized(config)`: creates the global template once.
    Seeds a Korean-default collaboration and code style guide.  Idempotent (user edits are preserved).
  - `resolve_instruction_paths(config, repo_root)`: returns only the files that exist, as
    absolute paths, in global → project order.

REPL integration (`cli/interactive.py`):
- `InteractiveSession.__init__` calls `ensure_global_instructions_initialized`.
- `build_agent_if_needed` builds `memory_paths_override` in `[*instructions, *memory]`
  order → propagated through to the deepagents memory= kwarg.

Tests (`tests/integration/test_instructions.py`, 6 cases):
- Global bootstrap + idempotency (manual edits preserved)
- The project file is never auto-created
- `resolve_instruction_paths` return order verified with 0/1/2 files present
- The global path is located under data_dir
- **integration**: `build_agent` passes the combined list to
  `create_deep_agent(memory=...)` unchanged

Gates:
- ruff check / format --check / mypy: PASS
- pytest -q --ignore=tests/integration/test_e2e_workflow.py
  --ignore=tests/integration/test_openrouter_smoke.py: 671 passed (including 6 new)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 20:55:06 +09:00
chungyeong
5e9656e8a3 feat(my-deepagent): v0.3 PR #6 — sub-agent session linkage (/agents, /spawn)
Separate from deepagents' langchain-internal `task` tool, this implements my-deepagent's own
**persisted** session forking.  A child has its own `InteractiveSessionRow`, so it can be
resumed independently with `mydeepagent --session <id>` and browsed in the Web GUI tree.
It inherits the parent's `project_key` unchanged and so shares the memory and skills directories.
Depth limit = MAX_SUBAGENT_DEPTH = 3.

Core behavior:
- `spawn_subagent_session(db, parent_session_id, persona, initial_title)`,
  as a single transaction:
  (1) check that the parent exists and `state == "active"`
  (2) `depth = parent.depth + 1`; if exceeded, `MyDeepAgentError(human_required)`
  (3) `AgentPersonaRow` upsert (reused when compute_hash matches)
  (4) inherit the parent's `project_key` + set `parent_session_id`, `depth`
  → returns the new `child_id`.
- `list_subagents(db, parent_session_id)`: direct children only (ordered by `started_at`);
  the caller walks the tree for grandchildren.

Data / library:
- `subagents.py` (new): the two functions above + `MAX_SUBAGENT_DEPTH = 3`.

REPL integration (`cli/interactive.py`):
- `_register_subagent_slash`: `/agents` (list of direct children), `/spawn <persona>`
  (creates a child + prints resume guidance).

Tests (`tests/integration/test_subagents.py`, 8 cases):
- Happy path (project_key inherited, depth=1)
- Two children under the same parent → both depth=1
- After 3 spawns in a depth chain, the 4th is rejected (`subagent_depth_exceeded`)
- Nonexistent parent → `parent_session_missing`
- Parent state="ended" → `parent_session_ended`
- `list_subagents` direct only (grandchildren excluded)
- No children → empty list
- Same persona hash → same persona_id reused

Gates:
- ruff check / format --check / mypy: PASS
- pytest -q --ignore=tests/integration/test_e2e_workflow.py
  --ignore=tests/integration/test_openrouter_smoke.py: 665 passed (including 8 new)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 20:52:00 +09:00
chungyeong
fb7e67fd20 feat(my-deepagent): v0.3 PR #5 — plan mode (/plan, /approve, /reject)
Equivalent of Claude Code's plan mode.  After `/plan`, the write_file / edit_file /
execute / bash / task (sub-agent) tools are blocked and only read_file / glob / grep /
ls / write_todos are allowed.

Core behavior:
- `PlanModeMiddleware(is_active: Callable[[], bool])` returns a synthetic
  `ToolMessage(status="error")` from `awrap_tool_call` / `wrap_tool_call` when plan
  mode is active and the tool is blocked.  It does not raise: the LLM sees the block
  message and switches to another tool or goes back to refining the plan on its own.
- `is_active` is a closure, so no agent rebuild is needed after a slash toggle.
- `InteractiveSessionRow.plan_mode` is persisted and restored on resume.

Data / library:
- `middleware/plan_mode.py` (new):
  - `BLOCKED_TOOLS_IN_PLAN_MODE = write_file / edit_file / bash / execute /
    run_command / shell / task`.
  - `PlanModeMiddleware` implemented for both async and sync.

REPL integration (`cli/interactive.py`):
- `InteractiveSession._plan_mode: bool` + async `set_plan_mode(enabled)` →
  flag toggle + `thread_suffix` bump + row persisted.
- The resume path restores it with `sess._plan_mode = row.plan_mode`.
- `_register_plan_mode_slash`: registers `/plan`, `/approve`, `/reject`.
- `/reject` also resets the thread, discarding the plan thread.

Tests (`tests/integration/test_plan_mode.py`, 9 cases):
- When inactive, all tools pass through
- When active, write_file / execute / task are blocked (status=error,
  tool_call_id preserved, message contains the tool name + "Plan-mode")
- When active, read_file / glob / grep / ls / write_todos are allowed
- Behavior changes via the closure toggle (without rebuild)
- The sync wrap_tool_call behaves the same
- BLOCKED_TOOLS_IN_PLAN_MODE constant sanity

Gates:
- ruff check / format --check / mypy: PASS
- pytest -q --ignore=tests/integration/test_e2e_workflow.py
  --ignore=tests/integration/test_openrouter_smoke.py: 657 passed (including 9 new)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 20:47:30 +09:00
40 changed files with 6412 additions and 208 deletions

@@ -2,6 +2,385 @@
## [Unreleased]
### Added
- **v0.4 chat UX boost + A/B live verification**: raises the chat experience
to Claude Code parity and verifies 7 core flows against live OpenRouter.
**A - Live verification (`scripts/live_verify.py`, 7 PASS)**:
- A1 1-turn chat (CLI-eq) → Anthropic Haiku 4.5 replies in Korean
- A2 sessions resume → re-submitting the same session_id restores the LangGraph thread state
- A3 `/skill <name>` system inject → SKILL.md ("Korean haiku, 3 lines")
actually controls LLM behavior (the output is exactly a 3-line Korean poem)
- A4 `/plan → /approve` → the LLM produces plan markdown only, with no blocked-tool attempts
- A5 `/agents spawn` → real sub-agent ainvoke + result pushed to the parent stream
- A6 auto-compaction → 14 messages → 4 archived + 77-token summary
- A7 `/workflow` wiring → pre-validates role↔persona matching
- Total cost about \$0.02.
**B1 - Markdown rendering** in conversation.html:
- Pure-JS mini markdown parser in `app.js`: supports code fences, ATX headers, ul/ol,
and inline `code`/**bold**/*italic*/[link](url).
- XSS policy preserved: only `createElement + textContent` is used; `innerHTML`
is forbidden. Link hrefs are restricted to the `http(s):` scheme.
- Added `.md-p`, `.md-h`, `.md-ul`, `.md-ol`, `.md-code` styles to `style.css`.
Code and links stay readable inside the user bubble (brown background) too.
**B2 - System event card** (collapsible):
- `_classifySystemMessage` inspects the prefix of the system content and
classifies it as "Sub-agent result / Workflow started / Compaction summary / Plan mode
activated / Approved plan / Skill activated" and so on, then renders it as a `<details>`
card. This keeps the chat thread from being flooded with event messages.
**B3 - Token streaming via AsyncCallbackHandler**:
- `session.py:resolve_model_instance`: `ChatOpenAI(streaming=True)`.
- `api/agent_runner._StreamingChunkPusher` (AsyncCallbackHandler) pushes
`{"type":"delta","text":...}` onto an `asyncio.Queue` on every
`on_llm_new_token`.
- `api/routes/sessions._session_event_stream` drains the queue and sends it as SSE
`event: chunk`. Poll interval 100ms. Ordering guarantee: chunks are drained
first and message rows are yielded afterwards (so that tokens visibly flow
before the placeholder is replaced by the message).
- On the frontend, `appendStreamDelta` in `app.js` accumulates chunks into the
placeholder; when the final `message` SSE arrives it is replaced by a markdown-rendered bubble.
- Live verify: 5 chunk events + 1 final message; confirmed token-level push of the OpenRouter Haiku reply
"안녕하세요, / 무 / 엇을 도와드 / 릴까요?" ("Hello, how can I help you?").
**B4 - Cancel mid-turn** (`POST /api/sessions/{id}/abort`):
- `app.state.pending_per_session: dict[session_id, Task]` index +
`_remove_from_session_map` done-callback.
- When a new user message arrives, the previous in-flight task is cancelled automatically (Claude Code parity).
- "**■ Stop**" button at the bottom right of the frontend: visible while waiting, hidden on completion/cancel.
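The per-session index and automatic cancel described for B4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `start_turn` and `abort` are hypothetical helper names, and a module-level dict stands in for `app.state.pending_per_session`.

```python
import asyncio

# Stand-in for app.state.pending_per_session (assumption: one task per session).
pending_per_session: dict = {}


def start_turn(session_id: str, coro) -> asyncio.Task:
    """Cancel the session's previous in-flight task, then track the new one."""
    previous = pending_per_session.get(session_id)
    if previous is not None and not previous.done():
        previous.cancel()
    task = asyncio.create_task(coro)
    pending_per_session[session_id] = task

    def _remove_from_session_map(done: asyncio.Task) -> None:
        # Remove the entry only if it still points at this task, so a
        # cancelled older task cannot evict its replacement.
        if pending_per_session.get(session_id) is done:
            del pending_per_session[session_id]

    task.add_done_callback(_remove_from_session_map)
    return task


def abort(session_id: str) -> bool:
    """What an abort endpoint could do: cancel if something is in flight."""
    task = pending_per_session.get(session_id)
    if task is None or task.done():
        return False
    task.cancel()
    return True


async def demo():
    first = start_turn("s1", asyncio.sleep(10))
    second = start_turn("s1", asyncio.sleep(0, result="done"))
    result = await second
    await asyncio.sleep(0)  # give pending done-callbacks a chance to run
    return first.cancelled(), "s1" in pending_per_session, result


first_cancelled, still_tracked, result = asyncio.run(demo())
```

The identity check in the done-callback is the design point: without it, the callback of the cancelled first task could delete the entry that now belongs to the second task.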
**B5 - IME composition-safe Enter**:
- A `compositionstart` / `compositionend` flag prevents sending when Enter is
used to commit a candidate during Korean/Japanese/Chinese IME input. Only
plain Enter is ignored; Cmd/Ctrl+Enter takes precedence.
**DB hot-fix** (found during the v0.4 chat UX round):
- `pool_pre_ping=True` in `Database.__init__`: fixes the Postgres asyncpg pool
handing out stale connections after an idle/network blip (which caused 500s
under SSE 0.5s poll load).
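The actual fix is the single `pool_pre_ping=True` argument. The toy pool below only illustrates the pre-ping idea (check liveness at checkout, discard stale connections) using the standard library. `FakeConnection` and `PrePingPool` are hypothetical names and do not reflect SQLAlchemy internals.

```python
from collections import deque


class FakeConnection:
    """Stand-in for a DB connection; `closed` simulates a dropped socket."""

    def __init__(self) -> None:
        self.closed = False

    def ping(self) -> bool:
        return not self.closed


class PrePingPool:
    """Tiny pool that pings an idle connection before handing it out."""

    def __init__(self, connect) -> None:
        self._connect = connect
        self._idle = deque()

    def checkout(self) -> FakeConnection:
        while self._idle:
            conn = self._idle.popleft()
            if conn.ping():
                return conn
            # Stale connection: drop it and try the next idle one.
        return self._connect()

    def checkin(self, conn: FakeConnection) -> None:
        self._idle.append(conn)


pool = PrePingPool(FakeConnection)
stale = pool.checkout()
pool.checkin(stale)
stale.closed = True          # simulate an idle/network blip
fresh = pool.checkout()      # the stale connection is never returned
```

Without the ping, `checkout` would return `stale` and the caller would only discover the closed socket on its first query, which matches the 500 described above.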
**New tests** (synced to the exact invoke signature + existing integration tests kept):
- The `fake_invoke` stub in `tests/integration/test_conversation_gui.py` was updated
to also accept the `chunk_queue` kwarg.
- Full regression: 709 passed (no new failures).
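The chunk-before-message ordering described under B3 can be sketched like this. It is a simplified illustration: `drain_chunks` and `build_sse_frames` are hypothetical names, and one function call stands in for one 100ms poll tick of the SSE generator.

```python
import asyncio
import json


def drain_chunks(queue: asyncio.Queue) -> list:
    """Pop every chunk that is already queued, without blocking."""
    drained = []
    while True:
        try:
            drained.append(queue.get_nowait())
        except asyncio.QueueEmpty:
            return drained


def build_sse_frames(queue: asyncio.Queue, new_messages: list) -> list:
    """One poll tick: chunk frames first, then message frames."""
    frames = [f"event: chunk\ndata: {json.dumps(c)}\n\n" for c in drain_chunks(queue)]
    frames += [f"event: message\ndata: {json.dumps(m)}\n\n" for m in new_messages]
    return frames


async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    for token in ("Hel", "lo"):
        await queue.put({"type": "delta", "text": token})
    # The final assistant row becomes available in the same tick.
    return build_sse_frames(queue, [{"role": "assistant", "content": "Hello"}])


frames = asyncio.run(demo())
```

Emitting the drained chunks before the message rows means a client always sees the streamed tokens before the event that replaces the placeholder.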
### Added
- **v0.4: Workflow generator UI + hot-reload + UX polish**. Lets users create
new workflow templates in the browser and run them immediately, without
writing YAML by hand. Nav, copy, and empty states on the main page / new.html /
runs.html / new-workflow.html were tidied at the same time.
- **A - `/new.html` UX patch** (HTML/CSS only):
- Title "Start new Run" → "Run workflow (advanced feature)".
- Top `info-box`: "Free-form chat is not here → main page" notice and a
"+ Create template" link.
- One-line hint on every field (e.g. `absolute repo path: location of the git repository to work in`).
- Persona override is collapsed in `<details>` so first-time users are not
overwhelmed.
- **B - nav reordering** (`/`, `/runs.html`, `/new.html`, `/run.html`,
`/conversation.html`):
- "Chat" is `nav-primary` (large font + dark color).
- "Runs" / "Run workflow" / "+ Create template" are `nav-secondary`
(small font + 65% opacity, 100% on hover).
- **C - Main-page notice** + CSS:
- Added an `info-box` to the main `/` ("👋 my-deepagent: a Claude Code-style
multi-turn agent running on cost-effective OpenRouter models").
- Added new styles to `style.css`: `.info-box`, `.nav-primary`/`.nav-secondary`,
`.wf-row-card`, `.wf-chip`, and others.
- **D - Workflow hot-reload**:
- `get_workflows` in `api/deps.py` computes
`_workflow_dir_signature(config)` (a tuple of mtimes for the seed + user directories)
on every request and calls `load_combined_workflows` again when it differs from the cached signature.
stat alone is enough, with no file watcher / inotify (the directories are small).
- The lifespan's `_load_seed_workflows` was replaced with `_load_workflows_combined`
so the user dir is also loaded automatically at first boot.
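The stat-based hot-reload in D can be sketched as follows. This is a minimal illustration, assuming the signature covers each directory plus its `*.yaml` files. `dir_signature` and `WorkflowCache` are hypothetical names, and the loader here just lists file names instead of parsing templates.

```python
import tempfile
from pathlib import Path


def dir_signature(dirs: list) -> tuple:
    """(path, mtime_ns) for each directory and each YAML file inside it."""
    entries = []
    for d in dirs:
        if not d.is_dir():
            continue
        entries.append((str(d), d.stat().st_mtime_ns))
        for f in sorted(d.glob("*.yaml")):
            entries.append((str(f), f.stat().st_mtime_ns))
    return tuple(entries)


class WorkflowCache:
    """Reload only when the directory signature changes."""

    def __init__(self, dirs: list, loader) -> None:
        self._dirs, self._loader = dirs, loader
        self._signature = None
        self._workflows = []
        self.reloads = 0

    def get(self) -> list:
        signature = dir_signature(self._dirs)
        if signature != self._signature:
            self._workflows = self._loader(self._dirs)
            self._signature = signature
            self.reloads += 1
        return self._workflows


def list_names(dirs: list) -> list:
    return sorted(f.name for d in dirs for f in d.glob("*.yaml"))


with tempfile.TemporaryDirectory() as tmp:
    user_dir = Path(tmp)
    cache = WorkflowCache([user_dir], list_names)
    first = cache.get()
    cache.get()                                   # unchanged: no reload
    reloads_when_unchanged = cache.reloads
    (user_dir / "demo@1.yaml").write_text("name: demo\n")
    second = cache.get()                          # new file: reload
    total_reloads = cache.reloads
```

Each request pays only a few `stat` calls, which is why this approach works without a watcher as long as the directories stay small.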
- **E - Workflow generator UI**:
- **API**: new `POST /api/workflows`. Body = `CreateWorkflowRequest`
pydantic model (name / version / description / roles / phases / default_gates /
max_total_budget_usd). Strictly validated with `WorkflowTemplate.model_validate`;
on failure it returns 422 (flattened to loc:msg format). On success the YAML is saved to
`<data_dir>/workflows/<name>@<version>.yaml` (`yaml.safe_dump
allow_unicode=True, sort_keys=False`). A duplicate (name, version) returns 409.
- **HTML**: `static/new-workflow.html` (new). Basic info → Roles →
Phases → YAML preview → save button.
- **JS**: added `bootstrapWorkflowGenerator` and `WF_STATE` to `app.js`.
Capabilities per role are toggled by clicking chips, and the role select in each Phase
is generated dynamically from the current Role list. Live YAML preview. XSS policy
preserved (all user input goes through textContent).
- **New tests** (`tests/integration/test_workflow_generator.py`, 7 cases):
- `/new-workflow.html` 200 + markup
- POST happy path → yaml file persisted + path / name / version verified
- POST roles=[] → 422
- POST with a nonexistent phase.role → 422 + role id included in the message
- POST duplicate (name, version) → 409
- GET hot-reload: the new entry appears in the GET response after POST
- GET hot-reload: a YAML file dropped in externally is also detected via mtime
### Changed
- **v0.3 plan-conformance fixes**: fixes 18 omissions/spec violations found by comparing
the first implementation against plan-v0.3. Three self-review rounds (omissions·incomplete work /
errors·edge cases / over-optimization) all PASS.
- **PR #5 (plan mode), 3 items**:
- Added `write_todos` to `PlanModeMiddleware.BLOCKED_TOOLS_IN_PLAN_MODE`
(per the plan spec: todos are part of the plan markdown and stay blocked until /approve).
- System message injected on entering `/plan` (`_PLAN_MODE_SYSTEM_PROMPT`,
"당신은 plan mode입니다 …", i.e. "You are in plan mode …"). The new `InteractiveSession.enter_plan_mode()`
handles the flag toggle + pending system message queue + thread bump in one step.
- On `/approve`, the last assistant message (= plan markdown) is extracted and
injected into the next turn as an "approved plan" system message. New
`InteractiveSession.approve_plan() / reject_plan()`.
- New infrastructure: `InteractiveSession._pending_system_messages: list[str]` and
`queue_system_message()` / `consume_pending_system_messages()`;
`_invoke_and_stream` prepends them at the start of every turn (including MessageRow persistence).
- **PR #2 (compaction), 1 item**:
- Added the `CompactionResult.summary_text` field. The summary from automatic/manual compaction
is injected into the first ainvoke of the next thread via `sess.queue_system_message()`,
satisfying the plan spec ("the new thread gets system + 1 summary + the most recent K
messages").
- **PR #3 (auto-memory), 6 items**:
- `global_memory_dir(config)` + `<data_dir>/global/memory/` bootstrap.
`list_memory_paths` passes both global + project to deepagents memory=.
- Formal frontmatter `name / description / type` (`type` ∈ user /
feedback / project / reference). `MemoryType` Literal + `MemoryEntry`
dataclass + `read_entry` / `read_index_entries`.
- `_infer_memory_type`: keyword-based deterministic classifier (no LLM).
Can be overridden with `/remember --type=feedback <text>`.
- `_scrub_secrets`: regex-matches OpenRouter/Anthropic/OpenAI/AWS/Bearer tokens
and replaces them with `<redacted:...>`. `WriteResult.scrubbed` flag + REPL
warning.
- `/memory show <name>` slash subcommand: prints the body + (type, scope).
- `/remember [--global]` / `/forget [--global]`: explicit toggle between the two scopes.
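The redaction step in `_scrub_secrets` could look roughly like the sketch below. The regexes are assumptions chosen for illustration (the real patterns are not shown in this entry), and `scrub_secrets` returning a `(text, scrubbed)` pair is a hypothetical simplification of `WriteResult.scrubbed`.

```python
import re

# Hypothetical patterns; the project's actual regexes are not shown here.
# Order matters: the more specific "sk-or-" / "sk-ant-" prefixes run before
# the generic "sk-" pattern.
_SECRET_PATTERNS = [
    ("openrouter", re.compile(r"sk-or-[A-Za-z0-9_-]{16,}")),
    ("anthropic", re.compile(r"sk-ant-[A-Za-z0-9_-]{16,}")),
    ("openai", re.compile(r"sk-[A-Za-z0-9_-]{20,}")),
    ("aws", re.compile(r"AKIA[0-9A-Z]{16}")),
    ("bearer", re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/=-]{16,}")),
]


def scrub_secrets(text: str):
    """Replace anything that looks like a credential; report whether we did."""
    scrubbed = False
    for label, pattern in _SECRET_PATTERNS:
        text, count = pattern.subn(f"<redacted:{label}>", text)
        scrubbed = scrubbed or count > 0
    return text, scrubbed


fake_key = "sk-or-v1-" + "a" * 32      # synthetic value, not a real key
clean, flagged = scrub_secrets(f"my key is {fake_key}")
untouched, untouched_flag = scrub_secrets("prefer tabs over spaces")
```

Running the scrub before the memory entry is written means the flag can drive the REPL warning mentioned above.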
- **PR #4 (skills), 3 items**:
- `project_skills_dir(config, project_key)`: `<data_dir>/projects/<key>/
skills/`. `resolve_skill_sources(config, project_key)` returns both the global and
project paths → project overrides global (deepagents
"later-wins").
- `list_all_skills(config, project_key)` + `find_skill(config,
project_key, name)`: merge the two scopes + show a scope label.
- `/skill <name>` behavior corrected: instead of only printing the body to the REPL,
it is now injected (queued) as a system message, satisfying the plan spec "inject the body
(for the whole turn)". New separate `/skills show <name>` subcommand (no inject,
inspection only).
- **PR #6 (sub-agent), 4 items**:
- `budget.py`: added the `session:<uuid>` scope. `_scopes_for` takes
`session_id` and accumulates in the ledger. `CostMiddleware` passes `interactive_session_id`
automatically. Satisfies the plan spec "sub-agents count toward the root session's
limit".
- `subagents.resolve_root_session_id`: walks up the `parent_session_id` chain
(cycle guard). The sub-agent CostMiddleware charges the ROOT session.
- `subagents.run_subagent_to_completion`: real ainvoke + pushes the result summary
to the parent session as a `[sub-agent <id> result]` system message + automatically
marks the sub-session `ended`.
- `/agents` slash subcommand structure (list / spawn / show): the previous plain list
plus separate /spawn was restructured per the plan spec. On spawn, a
`[sub-agent <id> spawned]` system message is inserted into the parent session automatically.
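The walk-up with a cycle guard in `resolve_root_session_id` can be sketched as below. This is an illustration under assumptions: a plain dict of `session_id -> parent_session_id` stands in for the database, and falling back to the last known session when a parent row is missing is a guess at the "missing fallback" behavior.

```python
from typing import Optional


def resolve_root(session_id: str, parents: dict) -> str:
    """Follow parent links to the root; stop on a missing row or a cycle."""
    seen = set()
    current = session_id
    while True:
        seen.add(current)
        parent: Optional[str] = parents.get(current)
        # Stop at the root, at a dangling parent id, or when the chain loops.
        if parent is None or parent not in parents or parent in seen:
            return current
        current = parent


chain = {"root": None, "child": "root", "grandchild": "child"}
root_of_grandchild = resolve_root("grandchild", chain)

cyclic = {"a": "b", "b": "a"}
cycle_result = resolve_root("a", cyclic)      # terminates instead of looping

dangling = {"orphan": "deleted-parent"}
dangling_result = resolve_root("orphan", dangling)
```

Charging cost to the id this returns is what makes a sub-agent's spend count against the root session's budget.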
- **PR #7 (governance), 1 item**:
- `governance.bootstrap_user_dirs(config)`: idempotently bootstraps the global MYDEEPAGENT.md,
`global/memory/MEMORY.md`, `skills/`, and `projects/` in a single call.
Called from `InteractiveSession.__init__` (previously only
`ensure_global_instructions_initialized` was called, so global memory was
not prepared).
- **PR #8 (Web GUI), 1 item**:
- `static/index.html` was restructured from the workflow runs page into the **session list page**.
New `static/runs.html` (archive location for the existing runs list).
Nav links: Chat → / / Runs (archive) → /runs.html / New Workflow Run →
/new.html. Added the `renderSessionsList` handler and `data-page="runs"`
routing to `app.js`. The conversation page supports `?session=<id>` query deep-links.
- **PR #9 (workflow), 2 items**:
- `/workflow <name> [--repo=<path>] [--base=<branch>]` now actually launches
`WorkflowEngine.run` in the background (`_run_workflow_background`).
Progress (started / ended / failed + final_report_path) accumulates in the main session
as `MessageRow(role="system")` → pushed to the GUI automatically via SSE.
- `/binding show <workflow-name[@version]>` argument support: previews the
role → eligible persona matching for a specific workflow (no execution).
- **New/updated tests** (+17 cases, 685 → **702 passed**):
- test_plan_mode: write_todos blocking + blocklist sanity hardened
- test_memory: scrub redaction + frontmatter `type` inference + explicit
override, 3 cases
- test_skills: project-scope override + find_skill resolution +
resolve_skill_sources(project_key), 4 cases
- test_subagents: resolve_root_session_id chain walk + missing fallback, 2
cases
- test_budget: session: scope accumulation + empty ledger when session_id is not
passed, 2 cases
- test_instructions: governance bootstrap full-skeleton + idempotency, 2
cases
- test_api_static: new runs.html + restructured index.html, 2 cases
### Added
- **v0.3 PR #9: Workflow optionization + user directory wiring**. The workflow engine is
demoted from the main path to an "option" (active only when the user explicitly calls
`/workflow <name>`). In exchange, users can drop YAML files into `<data_dir>/personas/` and
`<data_dir>/workflows/` to register their own personas and workflows.
- `user_dirs.py` (new):
- `user_personas_dir(config)`, `user_workflows_dir(config)`: path helpers.
- `ensure_user_dirs_initialized(config)`: `mkdir -p`, idempotent.
- `load_combined_personas(config, seed_dir)`: merges seed (strict) + user
(best-effort per-file skip on malformed). Dedupe key
`(name, version)`, user-overrides-seed. A single broken user YAML cannot
kill the REPL.
- `load_combined_workflows(config, seed_dir)`: same for workflows.
- `cli/interactive.py`:
- `InteractiveSession(..., workflows=...)` signature extended: the session remembers
the loaded workflow list.
- `_interactive_loop_async` calls `ensure_user_dirs_initialized` and uses
`load_combined_personas` / `load_combined_workflows`.
- 4 new slash commands:
- `/personas`: lists all loaded personas (active one marked)
- `/workflows`: lists all loaded workflow templates (phase/role counts, file name)
- `/workflow <name[@version]>`: directs the user to the `mydeepagent run` command
(the actual background invoke is a separate PR; for now it only prints guidance)
- `/binding show`: shows required_capabilities per role for each workflow
- `tests/integration/test_user_dirs.py` (new, 10 cases):
- Bootstrap idempotency
- seed-only / seed+user / user-overrides-seed / malformed-user-skip (persona)
- The same 4 for workflows
- Empty user directory handling
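The merge rule used by `load_combined_personas` / `load_combined_workflows` (strict seed, best-effort user, user-overrides-seed keyed on `(name, version)`) can be sketched as follows. To stay within the standard library this illustration parses JSON strings instead of YAML files, and `load_combined` is a hypothetical name.

```python
import json


def load_combined(seed_docs: list, user_docs: list, parse=json.loads) -> dict:
    """Merge parsed documents keyed by (name, version); user entries win."""
    merged = {}
    for raw in seed_docs:
        # Strict: a broken seed document is a packaging bug, so let it raise.
        item = parse(raw)
        merged[(item["name"], item["version"])] = item
    for raw in user_docs:
        # Best-effort: skip a malformed user file instead of failing startup.
        try:
            item = parse(raw)
            key = (item["name"], item["version"])
        except (ValueError, KeyError, TypeError):
            continue
        merged[key] = item
    return merged


seed = ['{"name": "reviewer", "version": "1", "source": "seed"}']
user = [
    '{"name": "reviewer", "version": "1", "source": "user"}',   # overrides seed
    "{this is not valid",                                        # skipped
    '{"name": "planner", "version": "1", "source": "user"}',    # added
]
combined = load_combined(seed, user)
```

Processing the seed first and the user documents second is what gives user files the last word for a given `(name, version)` key.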
### Added
- **v0.3 PR #8: Conversation-centric Web GUI (`/conversation.html`)**.
The workflow run page is demoted to an archive; the first screen users see is a chat-style
conversation thread. Same usability as Claude Code's Web GUI.
- `static/conversation.html` (new): session picker + message thread +
input box. data-page="conversation".
- `static/app.js`:
- Added the new page handler `bootstrapConversationPage`.
- `loadSessionList()` → GET /api/sessions, fills the picker.
- `loadAndAttachSession(sid)` → GET /api/sessions/{id}, renders the message thread,
starts the SSE subscription.
- `attachEventSource` → handles the existing SSE message/done events. When a new user
message is sent a `pending` bubble is shown and replaced when the assistant message arrives.
- `createNewSession` → POST /api/sessions with the default-interactive persona.
- Same XSS policy: all user content uses `textContent` only.
- `static/style.css`: added chat UI styles such as `.messages-thread`, `.msg-bubble`,
`.conv-topbar`, `.conv-input-bar`.
- `api/app.py`:
- The lifespan opens the LangGraph saver once with `from_conn_string` and stores it in
`app.state.saver` (Postgres only; None for SQLite tests).
The background invoke reuses it. `__aexit__` is called on shutdown.
- `api/agent_runner.py` (new):
- `invoke_session_agent(db, config, personas, session_id, user_message, saver=...)`:
loads the session row → resolves the persona → bootstraps directories (memory / skills /
MYDEEPAGENT.md) → builds the middleware stack (plan-mode + cost + audit) →
`build_agent` → `ainvoke` → persists the assistant MessageRow → automatic compaction.
- All failures are logged and the function returns (no raise): the HTTP response is
already 200 and SSE shows the progress.
- `api/routes/sessions.py`:
- `POST /api/sessions/{id}/messages` persists the user row and then fires the background
invoke via `asyncio.create_task(invoke_session_agent(...))`.
The task ref is kept in the `app.state.pending_invocations` set (RUF006 + prevents GC
drop) and `discard`ed on completion.
- `tests/integration/test_conversation_gui.py` (new, 4 cases):
- GET /conversation.html → 200 + required markup
- POST /messages → 200 + user row persisted + background invoke called
- The background task ref is held in `app.state.pending_invocations` and discarded
automatically on completion
- The stub runner persists the assistant row → user + assistant sequence verified
### Added
- **v0.3 PR #7: MYDEEPAGENT.md instruction-file hierarchy**. Equivalent of Claude Code's
CLAUDE.md global/project layering. At session start the following two files are loaded
automatically and injected together into the system prompt:
- **Global** : `<config.data_dir>/MYDEEPAGENT.md`, template auto-created at boot
- **Project** : `<repo>/MYDEEPAGENT.md`, loaded only when it exists. It is never
created inside the user's repo (to avoid invasive behavior).
These are injected *before* Memory / MEMORY.md / individual entries, so under deepagents
`MemoryMiddleware`'s "later overrides earlier" rule, more specific
context can override general guidance.
- `instructions.py` (new):
- `global_instructions_path(config)`, `project_instructions_path(repo_root)`
- `ensure_global_instructions_initialized(config)`: creates the global template once,
idempotent. Seeds a Korean-default collaboration and code style guide.
- `resolve_instruction_paths(config, repo_root)`: returns only the files that exist, as
absolute paths, in global→project order.
- `cli/interactive.py`:
- `InteractiveSession.__init__` calls `ensure_global_instructions_initialized`.
- `build_agent_if_needed` builds memory_paths_override in
`[*instruction_paths, *memory_paths]` order.
- `tests/integration/test_instructions.py` (new, 6 cases):
- Global bootstrap + idempotency (manual edits preserved)
- The project file is never auto-created
- `resolve_instruction_paths` return order verified with 0/1/2 files present
- The global path is located under `data_dir`
- **integration**: `build_agent` passes the combined [instructions, memory] list
unchanged to `create_deep_agent(memory=...)`
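The ordering contract of `resolve_instruction_paths` (existing files only, global before project, instructions ahead of memory) can be sketched as follows. In this illustration the function takes a `data_dir` path directly rather than the `config` object, which is an assumption made to keep the example self-contained.

```python
import tempfile
from pathlib import Path

FILENAME = "MYDEEPAGENT.md"


def resolve_instruction_paths(data_dir: Path, repo_root: Path) -> list:
    """Existing instruction files only, as absolute paths, global then project."""
    candidates = [data_dir / FILENAME, repo_root / FILENAME]
    return [p.resolve() for p in candidates if p.is_file()]


with tempfile.TemporaryDirectory() as data, tempfile.TemporaryDirectory() as repo:
    data_dir, repo_root = Path(data), Path(repo)
    none_yet = resolve_instruction_paths(data_dir, repo_root)

    (data_dir / FILENAME).write_text("# global guidance\n")
    global_only = resolve_instruction_paths(data_dir, repo_root)

    (repo_root / FILENAME).write_text("# project guidance\n")
    both = resolve_instruction_paths(data_dir, repo_root)

    # Instructions go first, memory files after, mirroring
    # [*instruction_paths, *memory_paths]; the memory path is a placeholder.
    memory_paths = [data_dir / "MEMORY.md"]
    combined = [*both, *memory_paths]
    order_ok = (
        both[0].samefile(data_dir / FILENAME)
        and both[1].samefile(repo_root / FILENAME)
    )
```

Because later entries override earlier ones, placing the project file after the global one lets a repository refine the global defaults.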
### Added
- **v0.3 PR #6 — Sub-agent session linkage (`/agents` / `/spawn <persona>`)**.
Claude Code의 sub-agent (task tool) 와 별개로, my-deepagent 만의 **persisted**
session forking. 부모 session 의 thread 컨텍스트에 langchain-internal 로
spawn 되는 deepagents `task` 도구와 달리, 여기서 만든 child 는 자체
`InteractiveSessionRow` 를 가지고 `mydeepagent --session <id>` 로 별도
resume / Web GUI 트리 탐색이 가능. 부모의 `project_key` 를 그대로 상속해
memory · skills 디렉터리 공유. depth limit = `MAX_SUBAGENT_DEPTH = 3`.
  - `subagents.py` (new):
    - `spawn_subagent_session(db, parent_session_id, persona, initial_title)`:
      a single transactional unit:
      (1) verify the parent exists and `state == "active"`
      (2) `depth = parent.depth + 1`; if the limit is exceeded, raise
      `MyDeepAgentError(human_required, "subagent_depth_exceeded")`
      (3) upsert `AgentPersonaRow` (reused when `compute_hash` matches)
      (4) inherit the parent's `project_key` unchanged and set `parent_session_id`, `depth`
      → returns the new `child_id`.
    - `list_subagents(db, parent_session_id)`: returns direct children only (ordered by
      `started_at`). Grandchildren are not included (the caller walks the tree).
  - `cli/interactive.py`:
    - `_register_subagent_slash`: registers `/agents` (list direct children) and
      `/spawn <persona>` (create a child + print a resume hint).
  - `tests/integration/test_subagents.py` (new, 8 cases):
    - Happy path: child row created; `parent_session_id`/`depth=1`/`project_key`
      inheritance verified
    - Two children of the same parent → both depth=1
    - Depth chain: 3 spawns → the 4th is rejected (`subagent_depth_exceeded`)
    - Nonexistent parent → `parent_session_missing`
    - Parent state="ended" → `parent_session_ended`
    - `list_subagents`: direct only, no grandchild
    - Empty parent → empty list
    - Same persona hash → same `persona_id` reused
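
The parent checks and the depth limit can be illustrated without a database. The sketch below is an assumption-laden stand-in: `SessionSketch`, `SpawnRejected`, and `spawn_child` are hypothetical names, and the `depth > MAX_SUBAGENT_DEPTH` comparison is inferred from the test description (three chained spawns succeed, the fourth is rejected).

```python
from dataclasses import dataclass
from typing import Optional

MAX_SUBAGENT_DEPTH = 3


class SpawnRejected(Exception):
    """Hypothetical stand-in for MyDeepAgentError; carries only an error code."""


@dataclass
class SessionSketch:
    id: str
    project_key: str
    depth: int = 0
    parent_session_id: Optional[str] = None
    state: str = "active"


def spawn_child(parent: Optional[SessionSketch], child_id: str) -> SessionSketch:
    """Validate the parent, enforce the depth limit, and inherit project_key."""
    if parent is None:
        raise SpawnRejected("parent_session_missing")
    if parent.state != "active":
        raise SpawnRejected("parent_session_ended")
    depth = parent.depth + 1
    if depth > MAX_SUBAGENT_DEPTH:
        raise SpawnRejected("subagent_depth_exceeded")
    return SessionSketch(
        id=child_id,
        project_key=parent.project_key,
        depth=depth,
        parent_session_id=parent.id,
    )


root = SessionSketch(id="root", project_key="abc123")
chain = [root]
for i in range(MAX_SUBAGENT_DEPTH):
    chain.append(spawn_child(chain[-1], f"child-{i + 1}"))

try:
    spawn_child(chain[-1], "child-4")
    rejected_code = None
except SpawnRejected as exc:
    rejected_code = str(exc)

print([s.depth for s in chain], rejected_code)
```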
### Added
- **v0.3 PR #5: Plan mode (`/plan` / `/approve` / `/reject`)**. The equivalent of
  Claude Code's plan mode. On entering `/plan`, the `write_file` / `edit_file` /
  `execute` / `bash` / `task` (sub-agent) tools are blocked and only `read_file` /
  `glob` / `grep` / `ls` / `write_todos` are allowed. If the LLM calls a blocked tool it
  receives a `ToolMessage(status="error")`, which steers it to keep refining the plan
  on its own. `/approve` re-enables writes; `/reject` resets the thread and re-enables writes.
  - `middleware/plan_mode.py` (new):
    - `PlanModeMiddleware(is_active: Callable[[], bool])`: in `awrap_tool_call` /
      `wrap_tool_call`, when plan mode is active and the tool is blocked, returns a
      synthetic `ToolMessage(status="error", content=...)`. It does not raise
      (so the LLM can switch to another tool without looping forever).
    - `BLOCKED_TOOLS_IN_PLAN_MODE` constant: write_file / edit_file / bash /
      execute / run_command / shell / task. Safe tools such as read_file and
      write_todos are whitelisted.
  - `cli/interactive.py`:
    - `InteractiveSession._plan_mode: bool`. `set_plan_mode(enabled)` (async) →
      toggles the flag + bumps thread_suffix + persists `InteractiveSessionRow.plan_mode`
      (the column was already added in PR #1). On resume it is restored from row.plan_mode.
    - `build_agent_if_needed` inserts `PlanModeMiddleware(is_active=lambda: ...)`
      at the head of the middleware list; the closure reads self._plan_mode, so
      no agent rebuild is needed after a slash toggle.
    - `_register_plan_mode_slash`: registers `/plan`, `/approve`, `/reject`.
  - `tests/integration/test_plan_mode.py` (new, 9 cases):
    - inactive → all tools pass through
    - active → write_file / execute / task blocked (status=error, tool_call_id
      preserved, message contains the tool name + "Plan-mode")
    - active → read_file / glob / grep / ls / write_todos allowed
    - toggling the closure changes behavior (without a rebuild)
    - the synchronous wrap_tool_call behaves the same
    - BLOCKED_TOOLS_IN_PLAN_MODE constant sanity
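
The "return an error result instead of raising" idea, together with the closure toggle, can be sketched in a few lines. This is not the real middleware: `gate_tool_call` is a hypothetical helper, a plain dict stands in for `ToolMessage`, and the actual `wrap_tool_call` hook signature is not reproduced here.

```python
from typing import Any, Callable

BLOCKED_TOOLS_IN_PLAN_MODE = frozenset(
    {"write_file", "edit_file", "bash", "execute", "run_command", "shell", "task"}
)


def gate_tool_call(
    tool_name: str,
    tool_call_id: str,
    is_active: Callable[[], bool],
    handler: Callable[[], dict[str, Any]],
) -> dict[str, Any]:
    """Return a synthetic error result for blocked tools; otherwise run the handler."""
    if is_active() and tool_name in BLOCKED_TOOLS_IN_PLAN_MODE:
        return {
            "status": "error",
            "tool_call_id": tool_call_id,
            "content": f"Plan-mode is active: `{tool_name}` is blocked until /approve.",
        }
    return handler()


state = {"plan_mode": True}


def is_active() -> bool:
    # The closure re-reads the flag on every call, so no rebuild is needed.
    return state["plan_mode"]


def run_tool() -> dict[str, Any]:
    return {"status": "success", "content": "done"}


blocked = gate_tool_call("write_file", "call-1", is_active, run_tool)
allowed = gate_tool_call("read_file", "call-2", is_active, run_tool)
state["plan_mode"] = False  # what /approve would flip
after_approve = gate_tool_call("write_file", "call-3", is_active, run_tool)

print(blocked["status"], allowed["status"], after_approve["status"])
```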
### Added
- **v0.3 PR #4: Agent Skills (LLM-routing, no embeddings)**. A progressive-disclosure
  pattern that follows the Anthropic Agent Skills spec as-is. deepagents


@@ -2,8 +2,8 @@ name: default-interactive
version: 1
description: "interactive 모드 만능 어시스턴트. 탐색·수정·실행 모두 지원."
backend: openrouter
-model: "openrouter:anthropic/claude-haiku-4-5"
-provider_origin: "US/Anthropic"
+model: "openrouter:deepseek/deepseek-chat"
+provider_origin: "CN/DeepSeek"
capabilities:
- spec_write
- code_edit
@@ -42,7 +42,7 @@ allowed_tools:
- write_todos
- task
deepagents_backend: local_shell
-fallback_model: "openrouter:deepseek/deepseek-chat"
+fallback_model: "openrouter:anthropic/claude-haiku-4-5"
max_cost_per_call_usd: 0.05
model_params:
max_tokens: 2048
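
A rough worked example of what the model swap means for the `max_cost_per_call_usd: 0.05` cap. The prices are the seed values that appear in the `ModelPrice(...)` entries of this change set; that they are USD per 1,000 tokens is an assumption, and the prompt size is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceSketch:
    """Assumed unit: USD per 1,000 tokens."""

    input_per_1k: float
    output_per_1k: float


HAIKU = PriceSketch(0.001, 0.005)
DEEPSEEK = PriceSketch(0.00028, 0.00112)
MAX_COST_PER_CALL_USD = 0.05
MAX_OUTPUT_TOKENS = 2048  # model_params.max_tokens


def call_cost(price: PriceSketch, input_tokens: int, output_tokens: int) -> float:
    """Worst-case cost of one call: input plus a fully used output budget."""
    return (
        (input_tokens / 1000) * price.input_per_1k
        + (output_tokens / 1000) * price.output_per_1k
    )


input_tokens = 4000  # hypothetical prompt size
haiku_cost = call_cost(HAIKU, input_tokens, MAX_OUTPUT_TOKENS)
deepseek_cost = call_cost(DEEPSEEK, input_tokens, MAX_OUTPUT_TOKENS)
saving = 1 - deepseek_cost / haiku_cost

print(f"haiku=${haiku_cost:.5f} deepseek=${deepseek_cost:.5f} saving={saving:.0%}")
```

Under these assumptions both models stay below the per-call cap, and the saving lands in the 70 to 80 percent range.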


@@ -0,0 +1,458 @@
"""v0.4 live verification — runs 7 Claude-Code-equivalent flows against real
OpenRouter. Run with::

    uv run python scripts/live_verify.py

Each scenario prints PASS / FAIL with a short summary. Total cost should be
under $0.10 (we use Anthropic Haiku 4.5 via OpenRouter, single-turn responses).

Scenarios:
  1. CLI-equivalent 1-turn chat (InteractiveSession + ainvoke direct)
  2. Sessions resume (same session_id, thread state restored)
  3. /skill <name> queues SKILL.md body as system message → LLM acknowledges
  4. /plan → LLM produces plan markdown only (no writes) → /approve queues
  5. /agents spawn → sub-agent runs to completion → result pushed to parent
  6. Auto-compaction (compact_session invoked manually on a padded session)
  7. /workflow wiring (template resolution + role eligibility check; no real
     WorkflowEngine.run is triggered)

A scenario whose checks fail is recorded and the run continues (an uncaught
exception would still abort it); the script exits 0 only if all PASS.
"""
from __future__ import annotations

import asyncio
import os
import sys
import uuid
from datetime import UTC, datetime
from pathlib import Path
from typing import Any

# Ensure repo paths import-correctly when run via `uv run python …`
sys.path.insert(0, str(Path(__file__).resolve().parents[1] / "src"))

from sqlalchemy import select

from my_deepagent.cli.interactive import (
    InteractiveSession,
    _invoke_and_stream,
)
from my_deepagent.compaction import compact_session
from my_deepagent.config import load_config
from my_deepagent.governance import bootstrap_user_dirs, record_consent
from my_deepagent.hash import sha256
from my_deepagent.persistence.checkpointer import get_checkpointer_ctx
from my_deepagent.persistence.db import Database
from my_deepagent.persistence.models import InteractiveSessionRow, MessageRow
from my_deepagent.subagents import run_subagent_to_completion, spawn_subagent_session
from my_deepagent.user_dirs import (
    ensure_user_dirs_initialized,
    load_combined_personas,
    load_combined_workflows,
)

_SEED = Path(__file__).resolve().parents[1] / "docs" / "schemas"
_RESULTS: list[tuple[str, bool, str]] = []


def _now() -> str:
    return datetime.now(UTC).isoformat(timespec="seconds")


def _record(name: str, ok: bool, note: str) -> None:
    _RESULTS.append((name, ok, note))
    marker = "✅ PASS" if ok else "❌ FAIL"
    print(f" {marker} {name}: {note}", flush=True)


def _pricing() -> Any:
    from my_deepagent.monitoring.pricing import ModelPrice, PricingCache

    pc = PricingCache()
    pc.set(
        [
            ModelPrice("anthropic/claude-haiku-4-5", 0.001, 0.005, 200_000),
            ModelPrice("deepseek/deepseek-chat", 0.00028, 0.00112, 64_000),
        ]
    )
    return pc


async def _mk_session(
    db: Database, config: Any, personas: Any, saver: Any, session_id: uuid.UUID
) -> InteractiveSession:
    """Persist a fresh InteractiveSessionRow + return the in-mem InteractiveSession."""
    from uuid import uuid4

    from my_deepagent.persistence.models import AgentPersonaRow

    persona = next((p for p in personas if p.name == "default-interactive"), personas[0])
    project_key = sha256(str(Path.cwd().resolve()))[:16]
    async with db.session() as s:
        ph = persona.compute_hash()
        existing_pr = (
            await s.execute(select(AgentPersonaRow).where(AgentPersonaRow.hash == ph))
        ).scalar_one_or_none()
        if existing_pr is None:
            existing_pr = AgentPersonaRow(
                id=str(uuid4()),
                name=persona.name,
                version=persona.version,
                hash=ph,
                definition=persona.model_dump(by_alias=True),
                created_at=_now(),
            )
            s.add(existing_pr)
            await s.flush()
        existing_row = await s.get(InteractiveSessionRow, str(session_id))
        if existing_row is None:
            s.add(
                InteractiveSessionRow(
                    id=str(session_id),
                    persona_id=existing_pr.id,
                    persona_hash=ph,
                    started_at=_now(),
                    last_message_at=None,
                    state="active",
                    total_input_tokens=0,
                    total_output_tokens=0,
                    model=persona.model,
                    project_key=project_key,
                    title=None,
                    plan_mode=False,
                    parent_session_id=None,
                    depth=0,
                )
            )
        await s.commit()
    return InteractiveSession(
        config,
        personas,
        db,
        _pricing(),
        Path.cwd(),
        session_id,
        saver,
        project_key,
        workflows=load_combined_workflows(config, _SEED / "workflows"),
    )


async def scenario_1_basic_chat(db: Database, config: Any, personas: Any, saver: Any) -> uuid.UUID:
    """1-turn message + assistant response persisted + token counters bumped."""
    print("\n[A1] CLI-equivalent 1-turn chat")
    sid = uuid.uuid4()
    sess = await _mk_session(db, config, personas, saver, sid)
    agent = sess.build_agent_if_needed()
    await _invoke_and_stream(agent, "한국어로 한 줄로만 인사해 (10단어 이내)", sess)
    async with db.session() as s:
        msgs = (
            (
                await s.execute(
                    select(MessageRow)
                    .where(MessageRow.session_id == str(sid))
                    .order_by(MessageRow.seq)
                )
            )
            .scalars()
            .all()
        )
        row = await s.get(InteractiveSessionRow, str(sid))
    ok = (
        len(msgs) == 2
        and msgs[0].role == "user"
        and msgs[1].role == "assistant"
        and bool(msgs[1].content.strip())
        and row is not None
        and row.total_output_tokens > 0
    )
    summary = f"messages={len(msgs)} out_tokens={row.total_output_tokens if row else 0}"
    _record("A1 basic chat", ok, summary)
    return sid


async def scenario_2_resume(
    db: Database, config: Any, personas: Any, saver: Any, sid: uuid.UUID
) -> None:
    """Same session_id → second InteractiveSession picks up persisted state."""
    print("\n[A2] Sessions resume")
    sess2 = await _mk_session(db, config, personas, saver, sid)
    agent = sess2.build_agent_if_needed()
    await _invoke_and_stream(agent, "내가 방금 너한테 한 첫 메시지가 뭐였지? 한 줄로만.", sess2)
    async with db.session() as s:
        msgs = (
            (
                await s.execute(
                    select(MessageRow)
                    .where(MessageRow.session_id == str(sid))
                    .where(MessageRow.archived.is_(False))
                    .order_by(MessageRow.seq)
                )
            )
            .scalars()
            .all()
        )
    last_assistant = msgs[-1].content if msgs else ""
    ok = bool(last_assistant) and (
        "인사" in last_assistant or "한국" in last_assistant or "안녕" in last_assistant
    )
    _record("A2 resume", ok, f"messages={len(msgs)} last_hint='{last_assistant[:60]}'")


async def scenario_3_skill(db: Database, config: Any, personas: Any, saver: Any) -> None:
    """Drop a SKILL.md, /skill queues body, next turn LLM acknowledges it."""
    print("\n[A3] /skill <name> system-inject")
    from my_deepagent.skills import ensure_skills_initialized, find_skill, user_skills_dir

    sd = user_skills_dir(config)
    ensure_skills_initialized(sd)
    skill_dir = sd / "korean-haiku"
    skill_dir.mkdir(parents=True, exist_ok=True)
    (skill_dir / "SKILL.md").write_text(
        """---
name: korean-haiku
description: Respond as a korean haiku poet — always 3 short lines, only Korean.
---
You are now a Korean haiku poet. Every response MUST be exactly 3 lines, all
in Korean, total under 30 chars. No prose, no explanation.
""",
        encoding="utf-8",
    )
    sid = uuid.uuid4()
    sess = await _mk_session(db, config, personas, saver, sid)
    skill = find_skill(config, sess.project_key, "korean-haiku")
    assert skill is not None, "skill not loaded"
    body = skill.path.read_text(encoding="utf-8")
    sess.queue_system_message(
        f"The user requested skill `{skill.name}`. Apply this SKILL.md for this turn:\n\n{body}"
    )
    agent = sess.build_agent_if_needed()
    await _invoke_and_stream(agent, "봄을 주제로 시 한 편 써줘.", sess)
    async with db.session() as s:
        msgs = (
            (
                await s.execute(
                    select(MessageRow)
                    .where(MessageRow.session_id == str(sid))
                    .where(MessageRow.role == "assistant")
                    .order_by(MessageRow.seq.desc())
                )
            )
            .scalars()
            .all()
        )
    assistant = msgs[0].content if msgs else ""
    line_count = len([line for line in assistant.split("\n") if line.strip()])
    ok = 2 <= line_count <= 6  # 3 ± slack
    _record("A3 skill inject", ok, f"lines={line_count} body[:60]='{assistant[:60]}'")


async def scenario_4_plan_mode(db: Database, config: Any, personas: Any, saver: Any) -> None:
    """/plan blocks write tools → LLM produces plan markdown. /approve queues
    the plan as system message for next turn."""
    print("\n[A4] /plan → plan markdown → /approve")
    sid = uuid.uuid4()
    sess = await _mk_session(db, config, personas, saver, sid)
    await sess.enter_plan_mode()
    agent = sess.build_agent_if_needed()
    await _invoke_and_stream(
        agent,
        "Python으로 wordcount CLI를 만들 plan 을 마크다운으로 짧게 (10줄 이내) 답해.",
        sess,
    )
    # Verify last assistant is plan markdown shape
    async with db.session() as s:
        msgs = (
            (
                await s.execute(
                    select(MessageRow)
                    .where(MessageRow.session_id == str(sid))
                    .where(MessageRow.role == "assistant")
                    .order_by(MessageRow.seq.desc())
                )
            )
            .scalars()
            .all()
        )
    plan_text = msgs[0].content if msgs else ""
    has_markdown_hint = any(
        token in plan_text for token in ("##", "###", "- ", "1.", "Phase", "단계")
    )
    ok_plan = bool(plan_text) and has_markdown_hint
    await sess.approve_plan()
    queued = sess.consume_pending_system_messages()
    ok_approve = any("APPROVED" in q and plan_text[:20] in q for q in queued)
    # Re-queue so future scenarios see clean state
    for q in queued:
        sess.queue_system_message(q)
    sess.consume_pending_system_messages()  # discard now
    _record(
        "A4 plan mode",
        ok_plan and ok_approve,
        f"markdown={ok_plan} approve_queued={ok_approve} plan[:50]='{plan_text[:50]}'",
    )


async def scenario_5_subagent(db: Database, config: Any, personas: Any, saver: Any) -> None:
    """spawn_subagent_session + run_subagent_to_completion → result on parent."""
    print("\n[A5] /agents spawn live")
    parent_sid = uuid.uuid4()
    sess = await _mk_session(db, config, personas, saver, parent_sid)
    persona = sess.persona
    child_id = await spawn_subagent_session(
        db,
        parent_session_id=parent_sid,
        persona=persona,
        initial_title="haiku helper",
    )
    summary = await run_subagent_to_completion(
        db, config, parent_sid, child_id, persona, "한국어로 짧게 인사해.", saver=None
    )
    async with db.session() as s:
        parent_msgs = (
            (
                await s.execute(
                    select(MessageRow)
                    .where(MessageRow.session_id == str(parent_sid))
                    .order_by(MessageRow.seq)
                )
            )
            .scalars()
            .all()
        )
        child_row = await s.get(InteractiveSessionRow, str(child_id))
    pushed = any(f"sub-agent {str(child_id)[:8]} result" in m.content for m in parent_msgs)
    ok = bool(summary) and pushed and child_row is not None and child_row.state == "ended"
    state = child_row.state if child_row else "NONE"
    _record(
        "A5 sub-agent",
        ok,
        f"summary[:40]='{summary[:40]}' parent_push={pushed} child_ended={state}",
    )


async def scenario_6_compaction(db: Database, config: Any, personas: Any, saver: Any) -> None:
    """Manually invoke compact_session on a session padded with enough messages."""
    print("\n[A6] Auto-compaction trigger")
    sid = uuid.uuid4()
    await _mk_session(db, config, personas, saver, sid)
    # Pad 14 active messages so compactor archives 4 + summary at seq=1.
    async with db.session() as s:
        for i in range(14):
            s.add(
                MessageRow(
                    session_id=str(sid),
                    seq=i + 1,
                    role="user" if i % 2 == 0 else "assistant",
                    content=f"padding message #{i} — talking about wordcount CLI design",
                    tool_calls=None,
                    token_count=10,
                    is_summary=False,
                    archived=False,
                    ts=_now(),
                )
            )
        await s.commit()
    result = await compact_session(db, config, str(sid))
    ok = (
        result.compacted
        and result.archived == 4
        and bool(result.summary_text)
        and result.summary_tokens > 0
    )
    _record(
        "A6 compaction",
        ok,
        f"archived={result.archived} summary_tokens={result.summary_tokens} "
        f"summary[:50]='{result.summary_text[:50]}'",
    )


async def scenario_7_workflow_background(
    db: Database, config: Any, personas: Any, saver: Any
) -> None:
    """We do NOT trigger a full WorkflowEngine.run (~$0.05) here; that's
    covered by `tests/integration/test_e2e_workflow.py`. Instead we verify the
    /workflow background dispatch path is wired correctly by checking template
    resolution + binding preview."""
    print("\n[A7] /workflow background dispatch wiring")
    from my_deepagent.binding import is_persona_eligible_for_role

    sess = await _mk_session(db, config, personas, saver, uuid.uuid4())
    workflows = sess.workflows
    if not workflows:
        _record("A7 workflow wiring", False, "no workflows loaded")
        return
    _path, tpl = workflows[0]
    # Verify every role has at least one eligible persona (same logic as
    # `_print_binding_for_template`).
    role_resolutions = {}
    for role in tpl.roles:
        eligible = [p for p in sess.personas if is_persona_eligible_for_role(p, role, tpl)[0]]
        role_resolutions[role.id] = len(eligible)
    ok = all(n > 0 for n in role_resolutions.values())
    _record(
        "A7 workflow wiring",
        ok,
        f"template={tpl.name}@{tpl.version} role_eligibles={role_resolutions}",
    )


async def main() -> int:
    config = load_config()
    if not os.environ.get("OPENROUTER_API_KEY") and "openrouter" not in str(
        config.openrouter_base_url
    ):
        # API key may come from keyring; resolve_openrouter_api_key handles it
        pass
    # Ensure consent recorded for this run (smoke pollution we tolerated earlier).
    record_consent(config.data_dir)
    bootstrap_user_dirs(config)
    ensure_user_dirs_initialized(config)
    db = Database(config.database_url)
    await db.init_schema()
    personas = load_combined_personas(config, _SEED / "personas")
    print(f"[live_verify] config.data_dir={config.data_dir}")
    print(f"[live_verify] db={config.database_url}")
    print(f"[live_verify] personas loaded: {len(personas)}")
    print("[live_verify] running 7 scenarios against real OpenRouter (~$0.05 total)")
    saver_ctx = get_checkpointer_ctx(config.database_url)
    try:
        # The checkpointer context is only entered for Postgres URLs.
        if config.database_url.startswith("postgresql"):
            saver = await saver_ctx.__aenter__()
        else:
            saver = None
        try:
            chat_sid = await scenario_1_basic_chat(db, config, personas, saver)
            await scenario_2_resume(db, config, personas, saver, chat_sid)
            await scenario_3_skill(db, config, personas, saver)
            await scenario_4_plan_mode(db, config, personas, saver)
            await scenario_5_subagent(db, config, personas, saver)
            await scenario_6_compaction(db, config, personas, saver)
            await scenario_7_workflow_background(db, config, personas, saver)
        finally:
            if saver is not None:
                await saver_ctx.__aexit__(None, None, None)
    finally:
        await db.dispose()
    print("\n[summary]")
    passed = sum(1 for _, ok, _ in _RESULTS if ok)
    print(f" {passed}/{len(_RESULTS)} PASS")
    for name, ok, note in _RESULTS:
        marker = "✅" if ok else "❌"
        print(f" {marker} {name}: {note}")
    return 0 if passed == len(_RESULTS) else 1


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
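
The scenarios in `main()` run back to back inside a single `try` block, so an uncaught exception in one of them (a failed `assert`, a network error) would skip the rest. One possible way to isolate them is sketched below; `run_guarded` and the demo scenarios are hypothetical and not part of the script.

```python
import asyncio
from typing import Awaitable, Callable

results: list[tuple[str, bool, str]] = []


async def run_guarded(name: str, scenario: Callable[[], Awaitable[str]]) -> None:
    """Run one scenario; turn an uncaught exception into a recorded FAIL."""
    try:
        note = await scenario()
    except Exception as exc:  # deliberately broad: one scenario must not abort the rest
        results.append((name, False, f"{type(exc).__name__}: {exc}"))
    else:
        results.append((name, True, note))


async def passing() -> str:
    return "ok"


async def crashing() -> str:
    raise RuntimeError("network down")


async def demo() -> int:
    await run_guarded("first", passing)
    await run_guarded("second", crashing)
    await run_guarded("third", passing)  # still runs after the crash
    passed = sum(1 for _, ok, _ in results if ok)
    return 0 if passed == len(results) else 1


exit_code = asyncio.run(demo())
print(exit_code, [(name, ok) for name, ok, _ in results])
```

`asyncio.CancelledError` derives from `BaseException`, so cancellation still propagates through this wrapper.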


@@ -0,0 +1,313 @@
"""Background agent invocation for the Web GUI (v0.3 PR #8 + v0.4 B3 streaming).
The Web GUI POSTs user messages to ``/api/sessions/{id}/messages`` and expects
an assistant response to appear via the SSE stream shortly after. The route
handler persists the user message and kicks off this runner as a fire-and-
forget asyncio task — same fundamentals as :mod:`cli.interactive` but without
the prompt-toolkit REPL loop.
v0.4 B3 adds token streaming: a ``chunk_queue`` (per-session ``asyncio.Queue``)
can be passed in. We attach an ``AsyncCallbackHandler`` to the ainvoke
config so every new token the LLM emits lands on the queue as
``{"type": "delta", "text": "..."}``. The SSE stream loop drains the queue
and pushes each chunk as an ``event: chunk`` SSE.
This runner is **single-uvicorn-worker** by design (see ``api/app.py``'s
docstring): the saver is held on ``app.state.saver`` and shared across all
background invocations. Multi-worker support would require Postgres
``LISTEN/NOTIFY`` fanout — deferred per plan.
"""
from __future__ import annotations
import asyncio
import logging
from typing import Any
from uuid import UUID, uuid4
from langchain_core.callbacks import AsyncCallbackHandler
from sqlalchemy import desc, select
from ..audit import make_audit_recorder
from ..budget import make_budget_tracker_from_config
from ..compaction import compact_session, should_compact
from ..config import Config
from ..hash import sha256
from ..instructions import (
    ensure_global_instructions_initialized,
    resolve_instruction_paths,
)
from ..memory import (
    ensure_memory_initialized,
    list_memory_paths,
    project_memory_dir,
)
from ..middleware.audit import AuditToolMiddleware
from ..middleware.cost import CostMiddleware
from ..middleware.plan_mode import PlanModeMiddleware
from ..monitoring.pricing import ModelPrice, PricingCache
from ..monitoring.token_budget import count_tokens
from ..persistence.db import Database
from ..persistence.models import InteractiveSessionRow, MessageRow
from ..persona import Persona
from ..session import build_agent
from ..skills import ensure_skills_initialized, resolve_skill_sources, user_skills_dir
_LOG = logging.getLogger(__name__)


def _static_pricing_seed() -> PricingCache:
    """Minimal seed identical to the REPL's _static_pricing_seed."""
    cache = PricingCache()
    cache.set(
        [
            ModelPrice("anthropic/claude-sonnet-4-6", 0.003, 0.015, 200_000),
            ModelPrice("anthropic/claude-haiku-4-5", 0.001, 0.005, 200_000),
            ModelPrice("anthropic/claude-opus-4-1", 0.015, 0.075, 200_000),
            ModelPrice("deepseek/deepseek-chat", 0.00028, 0.00112, 64_000),
        ]
    )
    return cache


def _flatten_assistant_content(message: Any) -> str:
    """Convert a langchain assistant message's content into a plain string.

    LangChain may return a list of content blocks (text + tool_use); we
    concatenate the text-bearing pieces. Falls back to ``str(content)`` if
    the shape is unexpected.
    """
    content = getattr(message, "content", "") or ""
    if isinstance(content, list):
        parts: list[str] = []
        for block in content:
            if isinstance(block, dict):
                parts.append(block.get("text", "") or "")
            else:
                parts.append(str(block))
        return "\n".join(p for p in parts if p)
    return str(content)


async def _bootstrap_session_dirs(config: Config, project_key: str) -> None:
    """Ensure memory + skills + global instruction dirs exist for the session.

    Mirrors :class:`cli.interactive.InteractiveSession.__init__`. Idempotent
    so repeated background invocations are cheap.
    """
    ensure_memory_initialized(project_memory_dir(config, project_key))
    ensure_skills_initialized(user_skills_dir(config))
    ensure_global_instructions_initialized(config)


class _StreamingChunkPusher(AsyncCallbackHandler):
    """Push every `on_llm_new_token` onto a session-bound asyncio.Queue.

    The SSE stream consumes the queue and pushes each chunk as an SSE
    ``event: chunk`` so the browser can render typing-style streaming.
    """

    def __init__(self, queue: asyncio.Queue[dict[str, Any]]) -> None:
        self._queue = queue

    async def on_llm_new_token(self, token: str, **_kwargs: Any) -> None:
        if not token:
            return
        try:
            await self._queue.put({"type": "delta", "text": token})
        except Exception:
            # Never let a streaming push failure abort the LLM call.
            _LOG.debug("chunk-queue put failed (queue likely closed)", exc_info=True)


def _build_session_agent(
    db: Database,
    config: Config,
    persona: Persona,
    session_id: UUID,
    row: InteractiveSessionRow,
    *,
    saver: Any | None,
) -> Any:
    """Assemble the deepagents CompiledStateGraph for one session invocation.

    Extracted from :func:`invoke_session_agent` to keep that function under
    the C901 complexity threshold. Pure construction: no side effects on the
    DB beyond what `build_agent` itself does.
    """
    pricing = _static_pricing_seed()
    budget = make_budget_tracker_from_config(db, config)
    cost_mw = CostMiddleware(
        pricing=pricing,
        model_name=row.model or persona.model,
        interactive_session_id=session_id,
        persona_name=persona.name,
        budget_tracker=budget,
    )
    audit_mw = AuditToolMiddleware(
        interactive_session_id=session_id,
        file_recorder=make_audit_recorder(config.state_dir),
    )
    # plan_mode is snapshotted once per invocation from the persisted row.
    is_plan = bool(row.plan_mode)
    plan_mw = PlanModeMiddleware(is_active=lambda: is_plan)
    project_key = row.project_key or sha256(str(config.workspace_root.resolve()))[:16]
    memory_dir = project_memory_dir(config, project_key)
    instruction_paths = resolve_instruction_paths(config, config.workspace_root)
    memory_paths = list_memory_paths(memory_dir)
    skill_sources = resolve_skill_sources(config)
    return build_agent(
        persona,
        config,
        root_dir=config.workspace_root,
        middleware=[plan_mw, cost_mw, audit_mw],
        checkpointer=saver,
        memory_paths_override=[*instruction_paths, *memory_paths],
        skills_sources_override=skill_sources,
    )


async def invoke_session_agent(
    db: Database,
    config: Config,
    personas: list[Persona],
    session_id: UUID,
    user_message: str,
    *,
    saver: Any | None = None,
    chunk_queue: asyncio.Queue[dict[str, Any]] | None = None,
) -> None:
    """Run one ainvoke + persist the assistant reply for the given session.

    The user message is assumed to be ALREADY persisted by the HTTP handler
    (POST /api/sessions/{id}/messages). This runner only adds the assistant
    response and runs the post-turn auto-compaction check.

    Failures are logged but never raised: the route handler returned 200 as
    soon as the user message was persisted, and the SSE stream is how the
    client observes success or absence of progress.
    """
    async with db.session() as s:
        row = await s.get(InteractiveSessionRow, str(session_id))
    if row is None:
        _LOG.warning("invoke_session_agent: session %s not found", session_id)
        return
    persona = _resolve_persona(personas, row.persona_hash)
    if persona is None:
        _LOG.warning(
            "invoke_session_agent: persona hash %s not in loaded personas", row.persona_hash
        )
        return
    project_key = row.project_key or sha256(str(config.workspace_root.resolve()))[:16]
    await _bootstrap_session_dirs(config, project_key)
    agent = _build_session_agent(db, config, persona, session_id, row, saver=saver)
    thread_id = f"{session_id}:0"
    result = await _run_ainvoke(agent, user_message, thread_id, chunk_queue, session_id)
    if result is None:
        return
    messages = result.get("messages", []) if isinstance(result, dict) else []
    if not messages:
        return
    assistant_text = _flatten_assistant_content(messages[-1])
    if not assistant_text:
        return
    await _persist_assistant_message(db, session_id, assistant_text, row.model or persona.model)
    # Post-turn auto-compaction (mirrors REPL behaviour).
    async with db.session() as s:
        refreshed = await s.get(InteractiveSessionRow, str(session_id))
    if refreshed is not None and should_compact(refreshed):
        await compact_session(db, config, str(session_id))


async def _run_ainvoke(
    agent: Any,
    user_message: str,
    thread_id: str,
    chunk_queue: asyncio.Queue[dict[str, Any]] | None,
    session_id: UUID,
) -> dict[str, Any] | None:
    """Wrapper around agent.ainvoke that emits chunk_queue lifecycle events.

    Returns the raw result dict on success, ``None`` on any failure (logged).
    Re-raises ``CancelledError`` so the asyncio task is correctly marked
    cancelled and the route's done-callback can clean up.
    """
    invoke_config: dict[str, Any] = {"configurable": {"thread_id": thread_id}}
    if chunk_queue is not None:
        invoke_config["callbacks"] = [_StreamingChunkPusher(chunk_queue)]
    try:
        return await agent.ainvoke(  # type: ignore[no-any-return]
            {"messages": [{"role": "user", "content": user_message}]},
            config=invoke_config,
        )
    except asyncio.CancelledError:
        _LOG.info("agent.ainvoke cancelled for session %s", session_id)
        if chunk_queue is not None:
            await chunk_queue.put({"type": "cancelled"})
        raise
    except Exception:
        _LOG.exception("agent.ainvoke failed for session %s", session_id)
        if chunk_queue is not None:
            await chunk_queue.put({"type": "error"})
        return None
    finally:
        # "done" is always the last event, including after "cancelled" or "error".
        if chunk_queue is not None:
            await chunk_queue.put({"type": "done"})


def _resolve_persona(personas: list[Persona], persona_hash: str) -> Persona | None:
    for p in personas:
        if p.compute_hash() == persona_hash:
            return p
    return None


async def _persist_assistant_message(
    db: Database,
    session_id: UUID,
    content: str,
    model: str,
) -> None:
    token_count = count_tokens(content, model)
    from datetime import UTC, datetime

    now = datetime.now(UTC).isoformat(timespec="seconds")
    async with db.session() as s:
        last_seq = (
            await s.execute(
                select(MessageRow.seq)
                .where(MessageRow.session_id == str(session_id))
                .order_by(desc(MessageRow.seq))
                .limit(1)
            )
        ).scalar_one_or_none() or 0
        s.add(
            MessageRow(
                session_id=str(session_id),
                seq=last_seq + 1,
                role="assistant",
                content=content,
                tool_calls=None,
                token_count=token_count,
                is_summary=False,
                archived=False,
                ts=now,
            )
        )
        row = await s.get(InteractiveSessionRow, str(session_id))
        if row is not None:
            row.last_message_at = now
            row.total_output_tokens += token_count
        await s.commit()


# Re-exported for tests that want to construct a fresh persona+session row
# without going through the HTTP layer.
__all__ = ["invoke_session_agent", "uuid4"]
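
The producer/consumer contract between the callback handler and the SSE loop can be tried out with nothing but `asyncio`. The sketch below makes several assumptions: `produce` stands in for the LLM callback, `drain_as_sse` for the SSE stream loop (whose real implementation is not shown in this file), and the frame layout follows the generic Server-Sent Events text format.

```python
from __future__ import annotations

import asyncio
import json
from typing import Any


async def produce(queue: asyncio.Queue[dict[str, Any]], tokens: list[str]) -> None:
    """Stand-in for the LLM callback: one delta per token, then a terminal event."""
    try:
        for token in tokens:
            if token:  # empty tokens are skipped, as in on_llm_new_token
                await queue.put({"type": "delta", "text": token})
    finally:
        await queue.put({"type": "done"})


async def drain_as_sse(queue: asyncio.Queue[dict[str, Any]]) -> list[str]:
    """Format queue items as `event: chunk` frames until `done` arrives."""
    frames: list[str] = []
    while True:
        item = await queue.get()
        if item["type"] == "done":
            return frames
        frames.append(f"event: chunk\ndata: {json.dumps(item)}\n\n")


async def demo() -> list[str]:
    queue: asyncio.Queue[dict[str, Any]] = asyncio.Queue()
    producer = asyncio.create_task(produce(queue, ["Hel", "", "lo"]))
    frames = await drain_as_sse(queue)
    await producer
    return frames


frames = asyncio.run(demo())
for frame in frames:
    print(frame, end="")
```

Putting `done` in a `finally` block, as `_run_ainvoke` does, means the consumer always sees a terminal event and never waits on the queue forever.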


@@ -17,9 +17,11 @@ from fastapi.staticfiles import StaticFiles
from starlette.responses import FileResponse
from ..config import Config, load_config
from ..persistence.checkpointer import get_checkpointer_ctx
from ..persistence.db import Database
from ..persona import load_personas_from_dir
from ..workflow import WorkflowTemplate, load_workflow_yaml
from ..user_dirs import load_combined_workflows
from ..workflow import WorkflowTemplate
from .routes import budget as budget_routes
from .routes import personas as personas_routes
from .routes import runs as runs_routes
@@ -32,29 +34,23 @@ _STATIC_ROOT = Path(__file__).resolve().parents[3] / "static"
_LOG = logging.getLogger(__name__)
def _load_seed_workflows() -> list[tuple[Path, WorkflowTemplate]]:
"""Return (path, WorkflowTemplate) for every YAML in docs/schemas/workflows/.
Malformed YAMLs are logged and skipped — the API should still come up
cleanly even if one seed is broken.
"""
wf_dir = _DOCS_SCHEMAS / "workflows"
if not wf_dir.is_dir():
return []
out: list[tuple[Path, WorkflowTemplate]] = []
for p in sorted(wf_dir.glob("*.yaml")):
try:
tpl = load_workflow_yaml(p)
except Exception as e:
_LOG.warning("skipping malformed workflow yaml %s: %s", p, e)
continue
out.append((p, tpl))
return out
def _load_workflows_combined(config: Config) -> list[tuple[Path, WorkflowTemplate]]:
"""Seed + user workflows. Malformed YAMLs are logged + skipped — the
API still comes up cleanly even if one file is broken. Per-request
hot-reload (`deps.get_workflows`) reuses the same loader."""
return load_combined_workflows(config, _DOCS_SCHEMAS / "workflows")
@asynccontextmanager
async def _lifespan(app: FastAPI) -> AsyncIterator[None]:
"""Initialize the shared Database, personas, workflows on startup; dispose on shutdown."""
"""Initialize the shared Database, personas, workflows, LangGraph saver on
startup; dispose on shutdown.
The saver is opened once per app lifetime and reused by background agent
invocations from POST /api/sessions/{id}/messages (v0.3 PR #8). Opening
per-request would be too expensive (each open establishes a Postgres
connection + verifies the checkpoint schema).
"""
config: Config = app.state.config or load_config()
db = Database(config.database_url)
# init_schema is a no-op against an already-migrated DB; cheap to call.
@@ -62,10 +58,26 @@ async def _lifespan(app: FastAPI) -> AsyncIterator[None]:
app.state.config = config
app.state.db = db
app.state.personas = load_personas_from_dir(_DOCS_SCHEMAS / "personas")
app.state.workflows = _load_seed_workflows()
app.state.workflows = _load_workflows_combined(config)
# Hot-reload signature — `deps.get_workflows` re-checks per request.
app.state.workflows_sig = None
saver_ctx = get_checkpointer_ctx(config.database_url)
try:
# AsyncPostgresSaver.from_conn_string only works for postgres; for sqlite
# tests we silently fall back to None and let background ainvoke run
# without checkpointing (acceptable: tests stub agents anyway).
if config.database_url.startswith("postgresql"):
saver = await saver_ctx.__aenter__()
app.state.saver = saver
else:
app.state.saver = None
yield
finally:
if app.state.saver is not None:
try:
await saver_ctx.__aexit__(None, None, None)
except Exception:
_LOG.exception("saver context exit failed during shutdown")
await db.dispose()


@@ -3,6 +3,12 @@
Pulls singletons stashed in `app.state` by the lifespan handler. Database is
created ONCE per uvicorn process; per-request creation would defeat
connection pooling.
Workflows are different — they live in YAML files that the user can edit /
create at runtime via the workflow generator UI. `get_workflows` does a
cheap mtime check on every call and reloads when any file in the seed or
user workflow directory has changed. No file watcher / inotify needed —
the directories are tiny (≤ dozens of files) and stat is cheap.
"""
from __future__ import annotations
@@ -14,6 +20,7 @@ from fastapi import Request
from ..config import Config
from ..persistence.db import Database
from ..user_dirs import load_combined_workflows, user_workflows_dir
if TYPE_CHECKING:
from ..persona import Persona
@@ -36,9 +43,41 @@ def get_personas(request: Request) -> list[Persona]:
return request.app.state.personas # type: ignore[no-any-return]


def _workflow_dir_signature(config: Config) -> tuple[tuple[str, float], ...]:
    """Cheap mtime-tuple fingerprint of all YAMLs in seed + user dirs.

    One stat call per file; the fingerprint changes when any file is
    created / modified / deleted. Used as the cache key for
    :func:`get_workflows` so the API picks up new templates without a
    process restart.
    """
    sig: list[tuple[str, float]] = []
    for d in (_DOCS_SCHEMAS / "workflows", user_workflows_dir(config)):
        if not d.is_dir():
            continue
        for p in sorted(d.glob("*.yaml")):
            try:
                sig.append((str(p), p.stat().st_mtime))
            except OSError:
                # File vanished between glob and stat; skip it.
                continue
    return tuple(sig)


def get_workflows(request: Request) -> list[tuple[Path, WorkflowTemplate]]:
"""Return a list of (yaml_path, WorkflowTemplate) for all seed workflows."""
return request.app.state.workflows # type: ignore[no-any-return]
"""Return (path, template) list with mtime-based hot-reload.
On every request, computes the mtime fingerprint of the workflow dirs.
If it differs from the cached signature, calls
:func:`load_combined_workflows` again to pick up new / edited files.
"""
app = request.app
config: Config = app.state.config
current_sig = _workflow_dir_signature(config)
cached_sig: tuple[tuple[str, float], ...] | None = getattr(app.state, "workflows_sig", None)
if cached_sig != current_sig:
app.state.workflows = load_combined_workflows(config, _DOCS_SCHEMAS / "workflows")
app.state.workflows_sig = current_sig
return app.state.workflows # type: ignore[no-any-return]
def seed_root() -> Path:
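The mtime-fingerprint reload used by `get_workflows` can be tried in isolation. The sketch below is a minimal stand-alone version under stated assumptions: `ReloadingCache` and `dir_signature` are hypothetical names, and the loader just lists file names instead of parsing templates.

```python
from __future__ import annotations

import tempfile
from pathlib import Path
from typing import Callable


def dir_signature(*dirs: Path) -> tuple[tuple[str, float], ...]:
    """Fingerprint every *.yaml under the given dirs as (path, mtime) pairs."""
    sig: list[tuple[str, float]] = []
    for d in dirs:
        if not d.is_dir():
            continue
        for p in sorted(d.glob("*.yaml")):
            try:
                sig.append((str(p), p.stat().st_mtime))
            except OSError:
                continue  # file vanished between glob and stat
    return tuple(sig)


class ReloadingCache:
    """Hypothetical helper: re-run `loader` only when the fingerprint changes."""

    def __init__(self, loader: Callable[[], list[str]], *dirs: Path) -> None:
        self._loader = loader
        self._dirs = dirs
        self._sig: tuple[tuple[str, float], ...] | None = None
        self._value: list[str] = []
        self.loads = 0

    def get(self) -> list[str]:
        sig = dir_signature(*self._dirs)
        if sig != self._sig:
            self._value = self._loader()
            self._sig = sig
            self.loads += 1
        return self._value


def demo() -> tuple[int, int, int]:
    """Return the load count after the first call, a repeat call, and a new file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        cache = ReloadingCache(lambda: sorted(p.name for p in root.glob("*.yaml")), root)
        cache.get()
        first = cache.loads
        cache.get()
        second = cache.loads
        (root / "review@1.yaml").write_text("name: review\n", encoding="utf-8")
        cache.get()
        third = cache.loads
        return first, second, third
```

Because the initial signature is `None`, the first call always loads, even for an empty directory. Later calls cost one `glob` plus one `stat` per file.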


@@ -128,6 +128,60 @@ class WorkflowSummary(_Strict):
phases: list[WorkflowPhaseSummary]
# v0.4 — workflow generator UI (POST /api/workflows)
class WorkflowRoleSpec(_Strict):
"""Input shape for one role inside a CreateWorkflowRequest."""
id: str = Field(min_length=1, pattern=r"^[a-z][a-z0-9_]*$")
required_capabilities: list[str] = Field(min_length=1)
preferred_backends: list[str] = Field(default_factory=list)
fallback_personas: list[str] = Field(default_factory=list)
class WorkflowArtifactSpec(_Strict):
"""Input shape for one phase's expected_artifact (optional)."""
path: str = Field(min_length=1)
# YAML key is `schema`; pydantic attribute aliased to avoid BaseModel.schema clash
schema_id: str = Field(min_length=1, alias="schema")
class WorkflowPhaseSpec(_Strict):
"""Input shape for one phase inside a CreateWorkflowRequest."""
key: str = Field(min_length=1, pattern=r"^[a-z][a-z0-9_]*$")
title: str = Field(min_length=1)
risk: str = Field(min_length=1) # low|medium|high — validated by WorkflowTemplate
role: str = Field(min_length=1)
instructions: str = Field(min_length=10)
expected_artifact: WorkflowArtifactSpec | None = None
gates: list[str] = Field(default_factory=list)
timeout_seconds: int | None = Field(default=None, ge=1)
max_budget_usd: float | None = Field(default=None, ge=0)
class CreateWorkflowRequest(_Strict):
"""Body for POST /api/workflows — saves a new template YAML on disk."""
name: str = Field(min_length=1)
version: int = Field(ge=1)
description: str | None = None
roles: list[WorkflowRoleSpec] = Field(min_length=1)
phases: list[WorkflowPhaseSpec] = Field(min_length=1)
default_gates: list[str] = Field(default_factory=list)
max_total_budget_usd: float | None = Field(default=None, ge=0)
class CreateWorkflowResponse(_Strict):
"""Returned by POST /api/workflows."""
path: str # absolute path of the saved YAML
name: str
version: int
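Role ids and phase keys share the pattern `^[a-z][a-z0-9_]*$`. The sketch below checks a few values against that pattern with the standard `re` module only. `is_valid_id` is a hypothetical helper for illustration and is not part of the request models.

```python
import re

# Same pattern the request models apply to role ids and phase keys.
ID_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")


def is_valid_id(value: str) -> bool:
    """True when `value` is lowercase, snake-style, and starts with a letter."""
    return ID_PATTERN.fullmatch(value) is not None
```

Under this pattern `code_review` is accepted, while `Review`, `2nd_pass`, `plan-phase`, and the empty string are rejected.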
# ---------------------------------------------------------------------------
# /api/budget
# ---------------------------------------------------------------------------


@@ -30,6 +30,7 @@ from ...persistence.models import (
MessageRow,
)
from ...persona import Persona
from ..agent_runner import invoke_session_agent
from ..deps import get_config, get_db, get_personas
from ..models import (
CreateSessionRequest,
@@ -41,7 +42,9 @@ from ..models import (
)
_LOG = logging.getLogger(__name__)
_POLL_INTERVAL_S: float = 0.5
# v0.4 B3: 100ms poll keeps token-streaming UX snappy. At idle the loop just
# does two cheap selects — well within asyncpg + SSE budgets.
_POLL_INTERVAL_S: float = 0.1
_TERMINAL_STATES: frozenset[str] = frozenset({"ended"})
router = APIRouter()
@@ -218,8 +221,18 @@ async def create_session(
async def post_message(
session_id: str,
body: PostMessageRequest,
request: Request,
db: DbDep,
config: ConfigDep,
personas: PersonasDep,
) -> SessionAck:
"""Persist a user message + fire the agent invocation in the background.
v0.3 PR #8: returns immediately after the user message is durably
persisted. The background task fetches the saver from ``app.state`` (set
up by the lifespan) and emits the assistant reply via the same SSE stream
that the client is already subscribed to.
"""
async with db.session() as s:
row = await s.get(InteractiveSessionRow, session_id)
if row is None:
@@ -257,19 +270,124 @@ async def post_message(
row.title = body.content[:50]
await s.commit()
# Fire-and-forget background invocation. We do NOT await it — the route
# returns 200 immediately and the SSE stream picks up the assistant reply.
# Hold a reference on app.state so RUF006 + GC don't kill the task mid-flight.
saver = getattr(request.app.state, "saver", None)
from uuid import UUID
# v0.4 B3: per-session token chunk queue. agent_runner pushes deltas
# via AsyncCallbackHandler; the SSE stream below drains the queue.
chunk_queues: dict[str, asyncio.Queue[Any]] = getattr(
request.app.state, "token_chunk_queues", {}
)
queue: asyncio.Queue[Any] = asyncio.Queue()
chunk_queues[session_id] = queue
request.app.state.token_chunk_queues = chunk_queues
task = asyncio.create_task(
invoke_session_agent(
db,
config,
personas,
UUID(session_id),
body.content,
saver=saver,
chunk_queue=queue,
)
)
pending: set[asyncio.Task[Any]] = getattr(request.app.state, "pending_invocations", set())
pending.add(task)
request.app.state.pending_invocations = pending
task.add_done_callback(pending.discard)
# v0.4 B4: index the task by session_id so a subsequent POST /abort can
# cancel mid-flight. We deliberately overwrite an earlier task if one is
# still in flight — the new user message implicitly cancels the previous
# turn (Claude Code parity).
per_session: dict[str, asyncio.Task[Any]] = getattr(
request.app.state, "pending_per_session", {}
)
prev = per_session.get(session_id)
if prev is not None and not prev.done():
prev.cancel()
per_session[session_id] = task
request.app.state.pending_per_session = per_session
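def _remove_from_session_map(_t: asyncio.Task[Any], sid: str = session_id) -> None:
# Only drop the entry if it still points at this task. A newer turn may
# have replaced it after cancelling this one, and must stay abortable.
if per_session.get(sid) is _t:
per_session.pop(sid, None)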
task.add_done_callback(_remove_from_session_map)
return SessionAck(session_id=session_id, state="active", message="queued")
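The per-session bookkeeping above (cancel the previous turn, index the new task, clean up on completion) can be sketched on its own. `TurnRegistry` is a hypothetical helper and not code from this diff. It assumes the cleanup callback should compare task identity so that a superseded task cannot evict its successor.

```python
from __future__ import annotations

import asyncio
from typing import Any, Coroutine


class TurnRegistry:
    """Hypothetical helper: at most one in-flight task per session id."""

    def __init__(self) -> None:
        self._tasks: dict[str, asyncio.Task[Any]] = {}

    def start(self, session_id: str, coro: Coroutine[Any, Any, Any]) -> asyncio.Task[Any]:
        prev = self._tasks.get(session_id)
        if prev is not None and not prev.done():
            prev.cancel()  # a new turn supersedes the old one
        task = asyncio.create_task(coro)
        self._tasks[session_id] = task

        def _cleanup(done: asyncio.Task[Any], sid: str = session_id) -> None:
            # Identity check: a superseded task must not evict its successor.
            if self._tasks.get(sid) is done:
                del self._tasks[sid]

        task.add_done_callback(_cleanup)
        return task

    def abort(self, session_id: str) -> bool:
        task = self._tasks.get(session_id)
        if task is not None and not task.done():
            task.cancel()
            return True
        return False

    def current(self, session_id: str) -> asyncio.Task[Any] | None:
        return self._tasks.get(session_id)


async def demo() -> tuple[bool, bool, bool, bool, bool]:
    reg = TurnRegistry()
    first = reg.start("s1", asyncio.sleep(10))
    second = reg.start("s1", asyncio.sleep(10))
    await asyncio.wait({first})
    superseded = first.cancelled() and reg.current("s1") is second
    aborted = reg.abort("s1")
    await asyncio.wait({second})
    return superseded, aborted, second.cancelled(), reg.current("s1") is None, reg.abort("s1")
```

In the demo, the second `start` cancels the first task, yet the registry still points at the second task once the first finishes. A later `abort` cancels the second task, and repeating the abort is a no-op.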
# ---------------------------------------------------------------------------
# POST /api/sessions/{id}/abort — cancel an in-flight turn (v0.4 B4)
# ---------------------------------------------------------------------------
@router.post("/{session_id}/abort", response_model=SessionAck)
async def abort_turn(session_id: str, request: Request, db: DbDep) -> SessionAck:
"""Cancel the in-flight ainvoke for this session, if any.
Idempotent — returns ok even when no task is in flight. The cancelled
task's ``finally`` clauses still run, so the LangGraph checkpoint stays
consistent. The next POST /messages reuses the same thread.
"""
async with db.session() as s:
row = await s.get(InteractiveSessionRow, session_id)
if row is None:
raise HTTPException(status_code=404, detail=f"session {session_id} not found")
per_session: dict[str, asyncio.Task[Any]] = getattr(
request.app.state, "pending_per_session", {}
)
task = per_session.get(session_id)
if task is not None and not task.done():
task.cancel()
return SessionAck(session_id=session_id, state="active", message="aborted")
return SessionAck(session_id=session_id, state="active", message="no-in-flight-task")
# ---------------------------------------------------------------------------
# GET /api/sessions/{id}/stream — SSE
# ---------------------------------------------------------------------------
async def _session_event_stream(db: Database, session_id: str, last_seq: int = 0) -> Any:
"""Yield ServerSentEvent per new MessageRow. Closes when session ends."""
async def _session_event_stream(
db: Database, session_id: str, last_seq: int = 0, app: Any = None
) -> Any:
"""Yield ServerSentEvent per new MessageRow + token chunk. Closes on terminal.
Three event types emitted:
- ``message`` (existing): one row per new MessageRow.
- ``chunk`` (v0.4 B3): token delta from the in-flight ainvoke. Drained
from ``app.state.token_chunk_queues[session_id]`` if present.
- ``done`` (existing): session terminal or deleted.
"""
seen = last_seq
while True:
# v0.4 B3: drain queued token chunks FIRST so streaming visibly
# precedes the final `message` SSE. Without this the placeholder
# is replaced by the persisted MessageRow before any chunk reaches
# the browser — Claude-Code-style typing would never appear.
if app is not None:
queues: dict[str, asyncio.Queue[Any]] = getattr(app.state, "token_chunk_queues", {})
queue = queues.get(session_id)
if queue is not None:
drained = 0
while not queue.empty() and drained < 200:
chunk = queue.get_nowait()
yield ServerSentEvent(
data=json.dumps(chunk, ensure_ascii=False),
event="chunk",
)
drained += 1
if chunk.get("type") in ("done", "cancelled", "error"):
queues.pop(session_id, None)
break
async with db.session() as s:
message_rows = (
(
@@ -333,7 +451,7 @@ async def stream_session(
row = await s.get(InteractiveSessionRow, session_id)
if row is None:
raise HTTPException(status_code=404, detail=f"session {session_id} not found")
return EventSourceResponse(_session_event_stream(db, session_id, last_seq))
return EventSourceResponse(_session_event_stream(db, session_id, last_seq, app=request.app))
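The bounded, non-blocking drain at the top of the stream loop can be isolated as a small function. This is a sketch under stated assumptions: `drain_chunks` is a hypothetical name and the `{"type": ..., "text": ...}` chunk shape is assumed. The diff itself only shows that a chunk's `type` may be `done`, `cancelled`, or `error`.

```python
from __future__ import annotations

import asyncio
from typing import Any

TERMINAL_TYPES = frozenset({"done", "cancelled", "error"})


def drain_chunks(
    queue: asyncio.Queue[dict[str, Any]], limit: int = 200
) -> tuple[list[dict[str, Any]], bool]:
    """Pop up to `limit` queued chunks without blocking.

    Returns the chunks plus a flag saying whether a terminal chunk was seen.
    """
    out: list[dict[str, Any]] = []
    while not queue.empty() and len(out) < limit:
        chunk = queue.get_nowait()
        out.append(chunk)
        if chunk.get("type") in TERMINAL_TYPES:
            return out, True
    return out, False


def demo() -> tuple[int, bool, int]:
    queue: asyncio.Queue[dict[str, Any]] = asyncio.Queue()
    for text in ("Hel", "lo", "!"):
        queue.put_nowait({"type": "token", "text": text})  # chunk shape is assumed
    queue.put_nowait({"type": "done"})
    queue.put_nowait({"type": "token", "text": "late"})
    chunks, finished = drain_chunks(queue)
    return len(chunks), finished, queue.qsize()
```

The `limit` keeps one poll iteration from starving the database query that follows. A terminal chunk stops the drain early, so anything queued after it stays in the queue.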
# ---------------------------------------------------------------------------


@@ -1,19 +1,35 @@
"""GET /api/workflows — list seed workflow templates."""
"""GET /api/workflows — list seed + user templates (hot-reloaded).
v0.4: POST /api/workflows persists a new template YAML under
``<config.data_dir>/workflows/<name>@<version>.yaml`` so the workflow
generator UI can create templates without leaving the browser.
"""
from __future__ import annotations
from pathlib import Path
from typing import Annotated
from fastapi import APIRouter, Depends
import yaml
from fastapi import APIRouter, Depends, HTTPException, status
from pydantic import ValidationError
from ...config import Config
from ...user_dirs import user_workflows_dir
from ...workflow import WorkflowTemplate
from ..deps import get_workflows, seed_root
from ..models import WorkflowPhaseSummary, WorkflowRoleSummary, WorkflowSummary
from ..deps import get_config, get_workflows, seed_root
from ..models import (
CreateWorkflowRequest,
CreateWorkflowResponse,
WorkflowPhaseSummary,
WorkflowRoleSummary,
WorkflowSummary,
)
router = APIRouter()
WorkflowsDep = Annotated[list[tuple[Path, WorkflowTemplate]], Depends(get_workflows)]
ConfigDep = Annotated[Config, Depends(get_config)]
@router.get("", response_model=list[WorkflowSummary])
@@ -50,3 +66,57 @@ async def list_workflows(workflows: WorkflowsDep) -> list[WorkflowSummary]:
)
)
return out
@router.post(
"",
response_model=CreateWorkflowResponse,
status_code=status.HTTP_201_CREATED,
)
async def create_workflow(body: CreateWorkflowRequest, config: ConfigDep) -> CreateWorkflowResponse:
"""Persist a new WorkflowTemplate YAML under the user workflows dir.
Pipeline:
1. Convert request → dict (field aliases preserved: ``schema``
not ``schema_id``, so the file round-trips through
:func:`load_workflow_yaml`).
2. Hand it to :class:`WorkflowTemplate.model_validate` — same strict
schema the YAML loader uses. ValidationError → 422.
3. Write to ``<user_workflows>/<name>@<version>.yaml``.
Refuse to overwrite an existing user template with the same key
(use a new version).
4. The hot-reload signature on the next GET picks the file up
automatically — no restart needed.
"""
raw = body.model_dump(by_alias=True)
try:
tpl = WorkflowTemplate.model_validate(raw)
except ValidationError as e:
# `e.errors()` may put `ValueError` objects inside `ctx` (Pydantic
# convention when a validator raises) — those don't JSON-serialise.
# Flatten to a list[str] so the 422 body is always safe to dump.
msgs = [
f"{'.'.join(str(p) for p in err.get('loc', ()))}: {err.get('msg', '')}"
for err in e.errors()
]
raise HTTPException(status_code=422, detail=msgs) from e
target_dir = user_workflows_dir(config)
target_dir.mkdir(parents=True, exist_ok=True)
target = target_dir / f"{tpl.name}@{tpl.version}.yaml"
if target.exists():
raise HTTPException(
status_code=409,
detail=(
f"workflow {tpl.name}@{tpl.version} already exists at "
f"{target}. Bump the version or delete the file first."
),
)
serialised = yaml.safe_dump(
tpl.model_dump(by_alias=True, mode="json"),
allow_unicode=True,
sort_keys=False,
)
target.write_text(serialised, encoding="utf-8")
return CreateWorkflowResponse(path=str(target), name=tpl.name, version=tpl.version)
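The handler checks `target.exists()` and then calls `write_text`, which are two separate steps. One way to fold "refuse to overwrite" and "create" into a single operation is exclusive-create mode. The sketch below is one possible alternative and not the code in this diff. `write_new_file` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def write_new_file(target: Path, text: str) -> bool:
    """Create `target` with `text`; return False if it already exists.

    Mode "x" makes the existence check and the creation a single step.
    """
    target.parent.mkdir(parents=True, exist_ok=True)
    try:
        with target.open("x", encoding="utf-8") as f:
            f.write(text)
    except FileExistsError:
        return False
    return True


def demo() -> tuple[bool, bool, str]:
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "workflows" / "review@1.yaml"
        created = write_new_file(target, "name: review\nversion: 1\n")
        again = write_new_file(target, "name: clobbered\n")
        return created, again, target.read_text(encoding="utf-8")
```

A route built on this helper could map a `False` return to the same 409 response the handler already raises.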


@@ -95,9 +95,10 @@ class BudgetTracker:
run_id: UUID | None,
persona_name: str | None,
estimated_cost_usd: float,
session_id: UUID | None = None,
) -> BudgetCheck:
"""Check if a call of estimated_cost can proceed. May raise BudgetExhaustedError."""
scopes = self._scopes_for(run_id, persona_name)
scopes = self._scopes_for(run_id, persona_name, session_id)
async with self._db.session() as s:
for scope in scopes:
cap = self._cap_for_scope(scope)
@@ -120,11 +121,12 @@ class BudgetTracker:
run_id: UUID | None,
persona_name: str | None,
actual_cost_usd: float,
session_id: UUID | None = None,
) -> None:
"""Persist the actual cost into all relevant scopes."""
if actual_cost_usd == 0:
return
scopes = self._scopes_for(run_id, persona_name)
scopes = self._scopes_for(run_id, persona_name, session_id)
async with self._db.session() as s:
for scope in scopes:
await self._upsert_spend(s, scope, actual_cost_usd, self._cap_for_scope(scope))
@@ -145,11 +147,22 @@ class BudgetTracker:
# ----- internals ----------------------------------------------------------
def _scopes_for(self, run_id: UUID | None, persona_name: str | None) -> list[str]:
def _scopes_for(
self,
run_id: UUID | None,
persona_name: str | None,
session_id: UUID | None = None,
) -> list[str]:
today = _today_utc()
out = [f"day:{today}"]
if run_id is not None:
out.append(f"run:{run_id}")
if session_id is not None:
# v0.3 PR #6: sub-agent invocations charge their cost against this
# scope so the root interactive session can roll up everything that
# ran under it. Cap is the same as run cap (single user, single
# session ≈ single run for budget purposes).
out.append(f"session:{session_id}")
if persona_name:
out.append(f"persona:{persona_name}:day:{today}")
return out
@@ -159,6 +172,8 @@ class BudgetTracker:
return self._daily_cap
if scope.startswith("run:"):
return self._run_cap
if scope.startswith("session:"):
return self._run_cap # reuse run-cap for interactive sessions
if scope.startswith("persona:") and ":day:" in scope:
return self._daily_cap # per-persona daily uses day cap unless overridden
return None
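The scope list and cap lookup can be exercised as plain functions. This sketch mirrors the logic shown above but takes `today` and the caps as explicit parameters. That signature is an assumption made so the example is deterministic, and it is not the `BudgetTracker` API.

```python
from __future__ import annotations

from uuid import UUID


def scopes_for(
    run_id: UUID | None,
    persona_name: str | None,
    session_id: UUID | None = None,
    *,
    today: str,
) -> list[str]:
    """Build the budget scopes a single call is charged against."""
    out = [f"day:{today}"]
    if run_id is not None:
        out.append(f"run:{run_id}")
    if session_id is not None:
        out.append(f"session:{session_id}")
    if persona_name:
        out.append(f"persona:{persona_name}:day:{today}")
    return out


def cap_for_scope(scope: str, *, daily_cap: float, run_cap: float) -> float | None:
    """Pick the cap for a scope; sessions reuse the run cap."""
    if scope.startswith("day:"):
        return daily_cap
    if scope.startswith(("run:", "session:")):
        return run_cap
    if scope.startswith("persona:") and ":day:" in scope:
        return daily_cap
    return None
```

For an interactive sub-agent call with no `run_id`, the cost lands in three scopes: the day, the session, and the persona's day.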

File diff suppressed because it is too large


@@ -64,11 +64,13 @@ class CompactionResult:
compacted: bool,
archived: int = 0,
summary_tokens: int = 0,
summary_text: str = "",
reason: str = "",
) -> None:
self.compacted = compacted
self.archived = archived
self.summary_tokens = summary_tokens
self.summary_text = summary_text
self.reason = reason
def __repr__(self) -> str:
@@ -280,6 +282,7 @@ async def compact_session(
compacted=True,
archived=len(archive_ids),
summary_tokens=summary_tokens,
summary_text=summary_text,
reason="ok",
)


@@ -1,13 +1,29 @@
"""Governance consent for sending user code to external LLM providers."""
"""Governance consent + first-run filesystem bootstrap.
v0.3 PR #7 extends this module to provision the user-wide skeleton on first
run: ``<data_dir>/MYDEEPAGENT.md``, ``<data_dir>/global/memory/MEMORY.md``,
``<data_dir>/skills/``, ``<data_dir>/projects/``. All steps are idempotent
so repeated calls do nothing destructive.
The bootstrap is invoked at REPL/API startup so users always see the dirs
even before they first use a `/remember` or `/skill` slash command.
"""
from __future__ import annotations
import json
import logging
import os
from datetime import UTC, datetime
from pathlib import Path
from .config import Config
from .errors import MyDeepAgentError
from .instructions import ensure_global_instructions_initialized
from .memory import ensure_memory_initialized, global_memory_dir
from .skills import ensure_skills_initialized, user_skills_dir
_LOG = logging.getLogger(__name__)
def consent_path(data_dir: Path) -> Path:
@@ -39,3 +55,25 @@ def require_consent(data_dir: Path) -> None:
message="governance consent not recorded",
recovery_hint="run `mydeepagent init` and accept the data-governance prompt",
)
def bootstrap_user_dirs(config: Config) -> None:
"""Provision the full user-wide skeleton. Idempotent.
Creates (if missing):
- ``<data_dir>/MYDEEPAGENT.md`` (global instructions w/ template)
- ``<data_dir>/global/memory/MEMORY.md`` (empty index for cross-project memory)
- ``<data_dir>/skills/`` (user-wide skills root)
- ``<data_dir>/projects/`` (parent of per-project subtrees)
Per-project subdirs (``projects/<project_key>/memory|skills``) are still
created lazily by :class:`InteractiveSession` since they depend on the
repo path; the parent ``projects/`` is materialised here so users see the
expected layout even before opening their first session.
"""
data_dir = Path(config.data_dir)
data_dir.mkdir(parents=True, exist_ok=True)
ensure_global_instructions_initialized(config)
ensure_memory_initialized(global_memory_dir(config))
ensure_skills_initialized(user_skills_dir(config))
(data_dir / "projects").mkdir(parents=True, exist_ok=True)


@@ -0,0 +1,98 @@
"""MYDEEPAGENT.md instruction-file hierarchy (v0.3 PR #7).
Two scopes (mirrors Claude Code's CLAUDE.md global/project layering):
- **Global** : ``<config.data_dir>/MYDEEPAGENT.md``
User-wide preferences that apply to every project. Bootstrapped with a
template on first session if missing.
- **Project** : ``<repo>/MYDEEPAGENT.md``
Repo-specific overrides. Picked up at session start when present; we do NOT
auto-create it (creating a file inside the user's repo would be invasive).
Both files are passed to ``deepagents.MemoryMiddleware`` via the ``memory=``
kwarg of ``create_deep_agent`` — same mechanism as auto-memory. Order in the
list:
[global MYDEEPAGENT.md, project MYDEEPAGENT.md, MEMORY.md, ...entry .md]
So later files (project + auto-memory) can override earlier ones at the same
filesystem path, matching the standard CLAUDE.md precedence.
"""
from __future__ import annotations
from pathlib import Path
from .config import Config
#: Filename for both global and project instruction files.
INSTRUCTION_FILENAME = "MYDEEPAGENT.md"
#: Initial body written to the global file when it does not exist.
_GLOBAL_TEMPLATE = """# MYDEEPAGENT.md (global)
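This file defines user preferences that apply to every project.
It is included in the system prompt automatically at session start and affects every conversation.
For project-specific settings, create a `MYDEEPAGENT.md` file with the same name in the root of
that repo; it is loaded alongside this one automatically (the project file overrides the global one).
## Collaboration style
- Converse in Korean. Keep code in English.
- Write a numbered plan before starting work.
- Keep changes minimal: only what was requested.
## Code style
- Read the existing patterns before creating a new file.
- Add short comments only when the reason is not self-evident.
- Do not leave TODO/FIXME/pass/NotImplementedError in the final output.
## Careful review
- Before declaring done: every item implemented / static analysis passing / output read directly at least once.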
"""
def global_instructions_path(config: Config) -> Path:
"""Return the absolute path of the global MYDEEPAGENT.md file."""
return Path(config.data_dir) / INSTRUCTION_FILENAME
def project_instructions_path(repo_root: Path) -> Path:
"""Return the absolute path of the project MYDEEPAGENT.md file (may not exist)."""
return Path(repo_root) / INSTRUCTION_FILENAME
def ensure_global_instructions_initialized(config: Config) -> Path:
"""Create the global instructions file with a template if missing.
Idempotent: repeated calls are no-ops once initialised. Returns the
absolute path. Called during REPL startup so users see the file the
first time they look in ``<data_dir>``.
"""
p = global_instructions_path(config)
p.parent.mkdir(parents=True, exist_ok=True)
if not p.exists():
p.write_text(_GLOBAL_TEMPLATE, encoding="utf-8")
return p
def resolve_instruction_paths(config: Config, repo_root: Path) -> list[str]:
"""Return absolute paths to existing MYDEEPAGENT.md files, global-first.
- Global is bootstrapped (always exists after a session has started)
- Project is included only if the file actually exists in the repo —
we never write into the user's repo automatically.
The returned list is suitable for ``memory_paths_override`` passed to
:func:`session.build_agent` (the ``deepagents.MemoryMiddleware`` then
concatenates them in order — later files override earlier).
"""
paths: list[str] = []
g = global_instructions_path(config)
if g.is_file():
paths.append(str(g.resolve()))
p = project_instructions_path(repo_root)
if p.is_file():
paths.append(str(p.resolve()))
return paths
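The global-first resolution can be tried against a temporary directory layout. This is a simplified sketch: `existing_instruction_paths` is a hypothetical stand-in for `resolve_instruction_paths`, taking two directories directly instead of a `Config`.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

FILENAME = "MYDEEPAGENT.md"


def existing_instruction_paths(data_dir: Path, repo_root: Path) -> list[Path]:
    """Return the instruction files that exist, global first, project second."""
    candidates = [data_dir / FILENAME, repo_root / FILENAME]
    return [p for p in candidates if p.is_file()]


def demo() -> tuple[list[str], list[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp) / "data"
        repo = Path(tmp) / "repo"
        data_dir.mkdir()
        repo.mkdir()
        (data_dir / FILENAME).write_text("# global\n", encoding="utf-8")
        before = [p.parent.name for p in existing_instruction_paths(data_dir, repo)]
        (repo / FILENAME).write_text("# project\n", encoding="utf-8")
        after = [p.parent.name for p in existing_instruction_paths(data_dir, repo)]
        return before, after
```

The project file only appears in the list once someone has created it in the repo, and it always follows the global file.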


@@ -1,10 +1,13 @@
"""Auto-memory (v0.3 PR #3) — project-scoped persistent context.
"""Auto-memory (v0.3 PR #3) — project-scoped + global persistent context.
Layout::
<config.data_dir>/projects/<project_key>/memory/
<config.data_dir>/projects/<project_key>/memory/ # project-scoped
MEMORY.md # index — one line per entry: "- [Title](file.md) — hook"
<slug>.md # individual memory entries (with optional frontmatter)
<slug>.md # individual memory entries with frontmatter
<config.data_dir>/global/memory/ # global (every project)
MEMORY.md
<slug>.md
The deepagents `MemoryMiddleware` reads every path we pass via the `memory=`
kwarg of `create_deep_agent` and injects them (concatenated) into the system
@@ -13,19 +16,42 @@ every turn, so updates take effect on the next user message — no agent
rebuild required.
`/remember <text>` appends a new entry file and updates the index. `/forget
<slug>` deletes the entry file and prunes the index. Both are project-scoped
(via `project_key`) so different repos keep separate memory.
<slug>` deletes the entry file and prunes the index. Both default to the
project scope; pass ``scope="global"`` to write into the global directory.
Frontmatter follows the Claude Code auto-memory convention:
---
name: <slug>
description: <one-line hook>
type: user | feedback | project | reference
---
<body>
Type inference uses simple keyword heuristics (deterministic, no LLM call)
so `/remember` works offline. Callers can override with ``--type=feedback``
on the slash command if the heuristic picks the wrong bucket.
API keys / OpenRouter / Anthropic tokens are scrubbed at write time via
:func:`_scrub_secrets` — the user gets a single warning + a placeholder.
"""
from __future__ import annotations
import re
from dataclasses import dataclass
from datetime import UTC, datetime
from pathlib import Path
from typing import Literal
import yaml
from .config import Config
#: Filename of the index file inside each project memory dir.
MemoryType = Literal["user", "feedback", "project", "reference"]
_MEMORY_TYPES: tuple[MemoryType, ...] = ("user", "feedback", "project", "reference")
#: Filename of the index file inside each memory dir (project or global).
INDEX_FILENAME = "MEMORY.md"
#: Slug character set — kept conservative for filesystem portability.
@@ -34,13 +60,69 @@ _SLUG_RE = re.compile(r"[^a-z0-9_-]+")
#: Initial index body when bootstrapping a fresh memory directory.
_INITIAL_INDEX = """# Auto-memory
This file is an index of stored memories for this project. Each entry below
points to a sibling `*.md` file. Entries are auto-managed by `/remember` and
`/forget` slash commands — edit by hand if you need finer control.
This file is an index of stored memories. Each entry below points to a
sibling `*.md` file. Entries are auto-managed by `/remember` and `/forget`
slash commands — edit by hand if you need finer control.
## Entries
"""
#: Regexes used by `_scrub_secrets`. Each redacts a recognisable secret
#: shape: OpenRouter / Anthropic / OpenAI keys + bearer tokens + AWS keys.
_SECRET_PATTERNS: tuple[tuple[re.Pattern[str], str], ...] = (
(re.compile(r"sk-or-[A-Za-z0-9_-]{16,}"), "<redacted:openrouter-key>"),
(re.compile(r"sk-ant-[A-Za-z0-9_-]{16,}"), "<redacted:anthropic-key>"),
(re.compile(r"sk-[A-Za-z0-9_-]{20,}"), "<redacted:openai-key>"),
(re.compile(r"Bearer\s+[A-Za-z0-9._-]{16,}"), "<redacted:bearer-token>"),
(re.compile(r"AKIA[0-9A-Z]{16}"), "<redacted:aws-access-key>"),
)
@dataclass(frozen=True)
class MemoryEntry:
"""One stored memory. Parsed from a `<slug>.md` frontmatter + body."""
name: str
description: str
memory_type: MemoryType
content: str
file_path: Path
def _scrub_secrets(text: str) -> tuple[str, bool]:
"""Return ``(scrubbed_text, was_modified)``.
Iterates `_SECRET_PATTERNS` and replaces every match with a labelled
placeholder. Conservative: any pattern hit redacts the whole match.
"""
out = text
modified = False
for pat, placeholder in _SECRET_PATTERNS:
new = pat.sub(placeholder, out)
if new != out:
modified = True
out = new
return out, modified
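The scrubber depends on pattern order: the specific `sk-or-` and `sk-ant-` shapes must run before the generic `sk-` shape, or every key would be labelled as an OpenAI key. The sketch below copies the regexes from this diff. The `patterns` parameter is a hypothetical addition so the ordering effect can be shown.

```python
from __future__ import annotations

import re

SECRET_PATTERNS: tuple[tuple[re.Pattern[str], str], ...] = (
    (re.compile(r"sk-or-[A-Za-z0-9_-]{16,}"), "<redacted:openrouter-key>"),
    (re.compile(r"sk-ant-[A-Za-z0-9_-]{16,}"), "<redacted:anthropic-key>"),
    (re.compile(r"sk-[A-Za-z0-9_-]{20,}"), "<redacted:openai-key>"),
    (re.compile(r"Bearer\s+[A-Za-z0-9._-]{16,}"), "<redacted:bearer-token>"),
    (re.compile(r"AKIA[0-9A-Z]{16}"), "<redacted:aws-access-key>"),
)


def scrub(text: str, patterns=SECRET_PATTERNS) -> tuple[str, bool]:
    """Replace every secret-shaped match; report whether anything changed."""
    out = text
    modified = False
    for pat, placeholder in patterns:
        new = pat.sub(placeholder, out)
        if new != out:
            modified = True
            out = new
    return out, modified
```

Passing `tuple(reversed(SECRET_PATTERNS))` shows the mislabelling: an OpenRouter-shaped key is then redacted by the generic pattern first.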
def _infer_memory_type(content: str, explicit: MemoryType | None = None) -> MemoryType:
"""Deterministic keyword-based classifier (no LLM call).
Falls back to ``project`` when nothing matches. Designed to be cheap +
predictable — `/remember "fish shell"` always lands in ``user``,
`/remember "don't mock the database"` in ``feedback``, etc.
"""
if explicit is not None:
return explicit
text = content.lower()
if any(k in text for k in ("don't ", "dont ", "stop ", "never ", "no longer ", "instead of")):
return "feedback"
if any(k in text for k in ("i ", "i'm ", "i am ", "my ", "prefer", "fish shell", "user is")):
return "user"
if any(k in text for k in ("see http", "linear ", "github.com", "channel ", "dashboard")):
return "reference"
return "project"
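The classifier is plain substring matching evaluated in a fixed order, so its behaviour is easy to probe. The sketch below restates the heuristic from this diff as a stand-alone function. `infer_type` is a hypothetical name, and the `wiki page layout` input is an illustrative probe rather than a documented case.

```python
FEEDBACK_KEYS = ("don't ", "dont ", "stop ", "never ", "no longer ", "instead of")
USER_KEYS = ("i ", "i'm ", "i am ", "my ", "prefer", "fish shell", "user is")
REFERENCE_KEYS = ("see http", "linear ", "github.com", "channel ", "dashboard")


def infer_type(content: str) -> str:
    """Return the first bucket whose keyword appears as a substring."""
    text = content.lower()
    if any(k in text for k in FEEDBACK_KEYS):
        return "feedback"
    if any(k in text for k in USER_KEYS):
        return "user"
    if any(k in text for k in REFERENCE_KEYS):
        return "reference"
    return "project"
```

Because the keys are substrings and not whole words, `"i "` also matches inside `"wiki page layout"`, which therefore lands in `user`. An explicit type override is the escape hatch for cases like that.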
def project_memory_dir(config: Config, project_key: str) -> Path:
"""Return the absolute directory path for this project's memory."""
@@ -49,6 +131,11 @@ def project_memory_dir(config: Config, project_key: str) -> Path:
return Path(config.data_dir) / "projects" / project_key / "memory"
def global_memory_dir(config: Config) -> Path:
"""Return the absolute directory path for the user's global memory."""
return Path(config.data_dir) / "global" / "memory"
def ensure_memory_initialized(memory_dir: Path) -> Path:
"""Create the memory directory + initial MEMORY.md if missing.
@@ -95,27 +182,43 @@ def _now_iso() -> str:
return datetime.now(UTC).isoformat(timespec="seconds")
@dataclass(frozen=True)
class WriteResult:
"""Outcome of `add_memory_entry`. Carries the file path + whether
secret-scrubbing kicked in (so the slash handler can warn the user)."""
path: Path
memory_type: MemoryType
scrubbed: bool
def add_memory_entry(
memory_dir: Path,
content: str,
*,
name: str | None = None,
) -> Path:
description: str | None = None,
memory_type: MemoryType | None = None,
) -> WriteResult:
"""Write a new memory file + append pointer to the index.
- ``name`` (optional): explicit slug. If omitted, derived from the first
line of ``content`` via :func:`_slugify`.
- File names collide → ``-2``, ``-3``, … suffix is appended until unique.
- ``name`` (optional): explicit slug. Default = slugified first line.
- ``description`` (optional): one-line hook for the index pointer.
Default = first line of content (no leading ``#``).
- ``memory_type`` (optional): override the heuristic classifier.
Returns the absolute path to the newly written file. Raises
``ValueError`` for empty content.
Secret-shaped substrings (OpenRouter/Anthropic/OpenAI keys, AWS access
keys, bearer tokens) are redacted via :func:`_scrub_secrets` before
write — the ``WriteResult.scrubbed`` flag tells the caller to warn the
user. Empty/whitespace content raises ``ValueError``.
"""
if not content or not content.strip():
raise ValueError("memory content must be non-empty")
ensure_memory_initialized(memory_dir)
safe_content, scrubbed = _scrub_secrets(content.strip())
first_line = content.strip().splitlines()[0]
first_line = safe_content.splitlines()[0]
slug_base = _slugify(name or first_line)
candidate = memory_dir / f"{slug_base}.md"
n = 2
@@ -123,20 +226,82 @@ def add_memory_entry(
candidate = memory_dir / f"{slug_base}-{n}.md"
n += 1
# File body: short frontmatter + content. The frontmatter is informational
# for human readers; the deepagents middleware does not parse it.
body = f"---\nslug: {candidate.stem}\ncreated: {_now_iso()}\n---\n\n{content.strip()}\n"
inferred_type = _infer_memory_type(safe_content, memory_type)
hook = (description or first_line.strip().lstrip("# ").strip())[:120] or candidate.stem
body = (
f"---\n"
f"name: {candidate.stem}\n"
f"description: {hook}\n"
f"type: {inferred_type}\n"
f"created: {_now_iso()}\n"
f"---\n\n"
f"{safe_content}\n"
)
candidate.write_text(body, encoding="utf-8")
# Append a one-line pointer to the index — first line of content is the
# title, truncated to keep the index scannable.
title = first_line.strip().lstrip("# ").strip()[:80] or candidate.stem
pointer = f"- [{title}]({candidate.name}) — {_now_iso()}\n"
pointer = f"- [{hook}]({candidate.name}) — type:{inferred_type}\n"
index_path = memory_dir / INDEX_FILENAME
with index_path.open("a", encoding="utf-8") as f:
f.write(pointer)
return candidate
return WriteResult(path=candidate, memory_type=inferred_type, scrubbed=scrubbed)
def read_entry(file_path: Path) -> MemoryEntry | None:
"""Parse a single ``<slug>.md`` file into a :class:`MemoryEntry`.
Returns None for files with malformed/missing frontmatter — the caller
can decide whether to surface the issue. Falls back to ``project`` when
`type` is missing or unrecognised.
"""
if not file_path.is_file():
return None
try:
raw = file_path.read_text(encoding="utf-8")
except OSError:
return None
if not raw.startswith("---"):
return None
parts = raw.split("---", 2)
if len(parts) < 3:
return None
try:
meta = yaml.safe_load(parts[1]) or {}
except yaml.YAMLError:
return None
if not isinstance(meta, dict):
return None
name = str(meta.get("name", file_path.stem)).strip()
description = str(meta.get("description", "")).strip() or "(no description)"
raw_type = str(meta.get("type", "project")).strip().lower()
mt: MemoryType = "project"
for known in _MEMORY_TYPES:
if raw_type == known:
mt = known
break
return MemoryEntry(
name=name,
description=description,
memory_type=mt,
content=parts[2].lstrip("\n"),
file_path=file_path,
)
def read_index_entries(memory_dir: Path) -> list[MemoryEntry]:
"""Return parsed :class:`MemoryEntry` for every `*.md` in the dir except
``MEMORY.md`` itself. Sorted by filename. Malformed files are skipped."""
if not memory_dir.is_dir():
return []
out: list[MemoryEntry] = []
for p in sorted(memory_dir.glob("*.md")):
if p.name == INDEX_FILENAME:
continue
entry = read_entry(p)
if entry is not None:
out.append(entry)
return out
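The `split("---", 2)` step in `read_entry` is what keeps a later `---` in the body from being mistaken for frontmatter. The sketch below shows only that step using the standard library. It parses `key: value` lines by hand as a labelled simplification, whereas the real function hands the block to `yaml.safe_load`. `split_frontmatter` is a hypothetical name.

```python
from __future__ import annotations


def split_frontmatter(raw: str) -> tuple[dict[str, str], str] | None:
    """Split a `---` delimited header from the body; None if malformed."""
    if not raw.startswith("---"):
        return None
    parts = raw.split("---", 2)  # ['', header, body]; body keeps any later '---'
    if len(parts) < 3:
        return None
    meta: dict[str, str] = {}
    for line in parts[1].strip().splitlines():
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, parts[2].lstrip("\n")
```

With `maxsplit=2`, a horizontal rule inside the memory body survives intact, and a file with only an opening `---` is reported as malformed.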
def remove_memory_entry(memory_dir: Path, slug_or_filename: str) -> bool:


@@ -56,6 +56,7 @@ class CostMiddleware(AgentMiddleware):
run_id=self.run_id,
persona_name=self.persona_name,
estimated_cost_usd=estimated,
session_id=self.interactive_session_id,
)
started = time.perf_counter()
try:
@@ -104,6 +105,7 @@ class CostMiddleware(AgentMiddleware):
run_id=self.run_id,
persona_name=self.persona_name,
actual_cost_usd=actual,
session_id=self.interactive_session_id,
)
return response


@@ -0,0 +1,114 @@
"""PlanModeMiddleware (v0.3 PR #5) — block write tools when plan-mode is active.
Claude Code's plan mode lets the user say "design this, don't write code": the
agent can read, search, and write its plan as markdown in the reply, but cannot
mutate the filesystem, run shell commands, spawn sub-agents, or commit a TODO
list via `write_todos` until the user `/approve`s.
Implementation strategy:
- A callable ``is_active()`` is passed in at construction time. The REPL flips
a flag on/off via slash commands; the middleware re-reads on every tool call.
This avoids rebuilding the agent on every `/plan` / `/approve` toggle.
- When plan-mode is on and the LLM calls a blocked tool, we return a synthetic
``ToolMessage(status="error", ...)`` so the LLM sees feedback and can adjust
("ok, I'll keep planning instead"). We do NOT raise — that would crash the
turn and the user would lose the partial response.
Blocked tools (matches Claude Code's ExitPlanMode-required tool set):
- ``write_file``, ``edit_file`` — fs mutation
- ``bash`` / ``execute`` / ``run_command`` / ``shell`` — shell exec
- ``task`` — sub-agent spawn (a sub-agent could bypass plan mode)
- ``write_todos`` — todos are PART of the plan markdown. Plan-mode
forbids commits to the agent's TODO list; the user reviews the plan
first, then /approve unlocks both writes and the TODO list.
"""
from __future__ import annotations
from collections.abc import Callable
from typing import Any
from langchain.agents.middleware import AgentMiddleware
from langchain_core.messages import ToolMessage
#: Tool names that mutate the filesystem.
_FS_WRITE_TOOLS: frozenset[str] = frozenset({"write_file", "edit_file"})
#: Tool names that execute shell commands.
_SHELL_TOOLS: frozenset[str] = frozenset({"bash", "execute", "run_command", "shell"})
#: Tool names that spawn sub-agents (which would bypass plan mode in the parent).
_SUBAGENT_TOOLS: frozenset[str] = frozenset({"task"})
#: Plan-mode forbids committing to a TODO list — todos are part of the
#: plan markdown that the user reviews before /approve.
_PLANNING_TOOLS: frozenset[str] = frozenset({"write_todos"})
#: Full blocklist applied while plan mode is on.
BLOCKED_TOOLS_IN_PLAN_MODE: frozenset[str] = (
_FS_WRITE_TOOLS | _SHELL_TOOLS | _SUBAGENT_TOOLS | _PLANNING_TOOLS
)
def _block_message(tool_name: str) -> str:
return (
f"Plan-mode is active — `{tool_name}` is blocked. "
"Keep planning with read_file / glob / grep, "
"or ask the user to `/approve` to leave plan mode."
)
class PlanModeMiddleware(AgentMiddleware):
"""Block mutating tool calls while plan-mode is active.
Construction takes an ``is_active`` callable that returns the current plan
mode state. The REPL toggles this state via slash commands without
rebuilding the agent — the middleware reads it fresh per tool call.
Read-only tools (``read_file``, ``glob``, ``grep``, ``ls``) are always allowed
in plan mode. ``write_todos`` is blocked along with the other entries in
``BLOCKED_TOOLS_IN_PLAN_MODE``.
"""
def __init__(self, *, is_active: Callable[[], bool]) -> None:
self._is_active = is_active
async def awrap_tool_call(self, request: Any, handler: Any) -> Any:
if not self._is_active():
return await handler(request)
name = _tool_name(request)
if name in BLOCKED_TOOLS_IN_PLAN_MODE:
return ToolMessage(
content=_block_message(name),
tool_call_id=_tool_call_id(request),
name=name,
status="error",
)
return await handler(request)
def wrap_tool_call(self, request: Any, handler: Any) -> Any:
# Sync path mirrors the async one for parity (e.g. when the agent is
# invoked synchronously in unit tests). Real REPL/Web paths are async.
if not self._is_active():
return handler(request)
name = _tool_name(request)
if name in BLOCKED_TOOLS_IN_PLAN_MODE:
return ToolMessage(
content=_block_message(name),
tool_call_id=_tool_call_id(request),
name=name,
status="error",
)
return handler(request)
def _tool_name(request: Any) -> str:
tool_call = getattr(request, "tool_call", None)
if isinstance(tool_call, dict):
return str(tool_call.get("name") or "")
return str(getattr(request, "name", "") or "")
def _tool_call_id(request: Any) -> str:
tool_call = getattr(request, "tool_call", None)
if isinstance(tool_call, dict):
return str(tool_call.get("id") or "")
return str(getattr(request, "id", "") or "")
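
The gating idea in this file can be sketched without LangChain. The snippet below is a minimal illustration, not the middleware itself: `FakeToolMessage`, `PlanFlag`, `gate_tool_call`, and `run_tool` are hypothetical stand-ins, and the blocklist is a shortened copy of the one above. It shows the two properties the docstring relies on: the flag is re-read on every call, and a blocked call returns an error message instead of raising.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FakeToolMessage:
    """Hypothetical stand-in for a tool result; not the LangChain class."""
    content: str
    tool_call_id: str
    name: str
    status: str = "success"


# Shortened copy of the blocklist, for illustration only.
BLOCKED = frozenset({"write_file", "edit_file", "bash", "task", "write_todos"})


class PlanFlag:
    """Mutable flag that a REPL could flip on /plan and /approve."""

    def __init__(self) -> None:
        self.active = False


def gate_tool_call(
    is_active: Callable[[], bool],
    tool_call: dict[str, Any],
    handler: Callable[[dict[str, Any]], FakeToolMessage],
) -> FakeToolMessage:
    """Re-read the flag on every call; block with an error result, never raise."""
    name = str(tool_call.get("name") or "")
    if is_active() and name in BLOCKED:
        return FakeToolMessage(
            content=f"Plan-mode is active: `{name}` is blocked.",
            tool_call_id=str(tool_call.get("id") or ""),
            name=name,
            status="error",
        )
    return handler(tool_call)


def run_tool(tool_call: dict[str, Any]) -> FakeToolMessage:
    """Pretend tool executor that always succeeds."""
    return FakeToolMessage(content="ok", tool_call_id=tool_call["id"], name=tool_call["name"])


flag = PlanFlag()
call = {"name": "write_file", "id": "call-1"}
before = gate_tool_call(lambda: flag.active, call, run_tool)   # plan mode off
flag.active = True                                             # user typed /plan
during = gate_tool_call(lambda: flag.active, call, run_tool)   # same gate, now blocked
read = gate_tool_call(lambda: flag.active, {"name": "read_file", "id": "call-2"}, run_tool)
```

Because the gate takes a callable rather than a boolean, the same gate object keeps working across `/plan` and `/approve` toggles without being rebuilt.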


@@ -75,10 +75,17 @@ class Database:
"""
def __init__(self, database_url: str) -> None:
# v0.3 hotfix: Postgres asyncpg pool occasionally hands out stale
# connections whose underlying socket was closed by the server (idle
# timeout, container restart, network blip, …). `pool_pre_ping`
# adds a fast ping before each checkout and invalidates dead
# connections so the next acquire dials a fresh one — fixes the
# "InterfaceError: connection is closed" 500 seen under SSE load.
self._engine: AsyncEngine = create_async_engine(
database_url,
poolclass=None,
echo=False,
pool_pre_ping=True,
)
_attach_dialect_pragmas(self._engine)
self._session_factory: async_sessionmaker[AsyncSession] = async_sessionmaker(
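
To make the pre-ping behaviour described in the comment concrete, here is a toy pool written with the standard library only. It is a sketch of the general idea (test a connection at checkout, discard it if dead, dial a fresh one), not SQLAlchemy's implementation; `FakeConnection` and `PrePingPool` are hypothetical names.

```python
from collections import deque


class FakeConnection:
    """Hypothetical connection whose socket may be closed by the server."""

    def __init__(self, conn_id: int) -> None:
        self.conn_id = conn_id
        self.closed = False

    def ping(self) -> bool:
        # A real driver would send a cheap round-trip here.
        return not self.closed


class PrePingPool:
    """Toy pool that pings idle connections before handing them out."""

    def __init__(self) -> None:
        self._idle = deque()
        self._next_id = 0

    def _dial(self) -> FakeConnection:
        self._next_id += 1
        return FakeConnection(self._next_id)

    def release(self, conn: FakeConnection) -> None:
        self._idle.append(conn)

    def acquire(self) -> FakeConnection:
        while self._idle:
            conn = self._idle.popleft()
            if conn.ping():
                return conn
            # Stale connection: drop it and try the next idle one.
        return self._dial()


pool = PrePingPool()
first = pool.acquire()
pool.release(first)
first.closed = True       # simulate a server-side idle timeout
second = pool.acquire()   # the stale connection is discarded, a new one is dialled
```

The cost of this design is one extra round-trip per checkout, traded for not surfacing a dead socket to the caller.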


@@ -153,6 +153,9 @@ def resolve_model_instance(
max_tokens=params.get("max_tokens", 4096),
temperature=params.get("temperature", 0.2),
top_p=params.get("top_p", 1.0),
# v0.4 B3: enable token streaming so AsyncCallbackHandler.on_llm_new_token
# receives chunks during ainvoke. Final response is unchanged.
streaming=True,
)
return model_spec
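
The commit message describes a callback handler that pushes streamed tokens into a per-session `asyncio.Queue`. The sketch below mimics that shape with the standard library only. `QueueChunkPusher`, `fake_streaming_model`, and `stream_once` are hypothetical; the handler copies the name of the `on_llm_new_token` hook but does not subclass any LangChain class.

```python
import asyncio


class QueueChunkPusher:
    """Hypothetical handler: forwards each streamed token into a queue."""

    def __init__(self, queue: asyncio.Queue) -> None:
        self._queue = queue

    async def on_llm_new_token(self, token: str, **kwargs) -> None:
        await self._queue.put(token)


async def fake_streaming_model(handler: QueueChunkPusher, text: str) -> str:
    """Stand-in for a streaming model call: emits one token per word."""
    for word in text.split(" "):
        await handler.on_llm_new_token(word + " ")
    return text  # the final response is the same with or without streaming


async def stream_once(text: str) -> tuple[list[str], str]:
    """Run one fake call and drain the queue, as an SSE loop might do."""
    queue: asyncio.Queue = asyncio.Queue()
    handler = QueueChunkPusher(queue)
    final = await fake_streaming_model(handler, text)
    chunks: list[str] = []
    while not queue.empty():
        chunks.append(queue.get_nowait())
    return chunks, final
```

Calling `asyncio.run(stream_once("hello streaming world"))` yields the three chunks plus the unchanged final text. A real consumer would drain the queue concurrently with the model call rather than afterwards.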


@@ -1,12 +1,11 @@
"""Agent Skills (v0.3 PR #4) — LLM-routed progressive disclosure.
Layout::
Layout (two scopes, mirrors Claude Code's `~/.claude/skills/` + repo overlay):
<config.data_dir>/skills/<skill-name>/SKILL.md
[optional supporting files]
<config.data_dir>/skills/<name>/SKILL.md # global / user
<config.data_dir>/projects/<project_key>/skills/<name>/SKILL.md # project
We mount this single directory as a source for ``deepagents.SkillsMiddleware``
which:
We mount both directories as sources for ``deepagents.SkillsMiddleware`` which:
1. Parses every ``SKILL.md`` YAML frontmatter (``name``, ``description``, …)
2. Injects an index of ``(name, description)`` pairs into the system prompt
@@ -14,15 +13,18 @@ which:
``read_file`` — no embeddings, no per-token vector lookup, no custom
routing logic. Anthropic's Agent Skills specification verbatim.
The skill name in the YAML frontmatter must match the parent directory name
(``deepagents`` enforces this) — e.g. a skill directory ``web-research/``
needs ``name: web-research`` inside its ``SKILL.md``.
Per ``deepagents.SkillsMiddleware`` semantics, later sources override earlier
ones at the same skill name — so project-scope wins over global-scope, which
matches the Claude Code precedence.
PR #4 keeps the surface area small: we mount one user-scope source and expose
``/skills`` (list) and ``/skill <name>`` (show full body for inspection)
slashes. Project-scope skills (``<repo>/.mydeepagent/skills/``) are NOT wired
in this PR — call sites can later layer them by passing additional sources
through ``build_agent(skills_sources_override=...)``.
The skill name in the YAML frontmatter must match the parent directory name.
PR #4 slashes:
- ``/skills``: list installed skills (project + global, with scope label)
- ``/skills show <name>``: REPL output only (inspection)
- ``/skill <name>``: inject the SKILL.md body as a one-shot system message
on the next ainvoke (the LLM treats it as an explicit "use this skill"
directive for this turn).
"""
from __future__ import annotations
@@ -46,22 +48,34 @@ _MAX_SKILL_READ_BYTES = 10 * 1024 * 1024
class SkillInfo:
"""Lightweight summary of one installed skill — used by `/skills` slash.
Fields are derived from the YAML frontmatter inside ``SKILL.md``:
- ``name``: directory name (also enforced inside frontmatter by deepagents)
- ``description``: 1-line summary, truncated if very long
- ``path``: absolute path of the ``SKILL.md`` for `/skill <name>` body display
- ``path``: absolute path of the ``SKILL.md`` for body display
- ``scope``: ``"project"`` (repo-local) or ``"global"`` (user-wide)
"""
name: str
description: str
path: Path
scope: str = "global"
def user_skills_dir(config: Config) -> Path:
"""Return the user-scope skills directory (``<data_dir>/skills``)."""
"""Return the global / user-wide skills directory (``<data_dir>/skills``)."""
return Path(config.data_dir) / "skills"
def project_skills_dir(config: Config, project_key: str) -> Path:
"""Return the project-scope skills directory.
Stored under ``<data_dir>/projects/<project_key>/skills/`` to keep all
project-scoped artefacts (memory, skills) under a single parent path.
"""
if not project_key:
raise ValueError("project_key must be non-empty")
return Path(config.data_dir) / "projects" / project_key / "skills"
def ensure_skills_initialized(skills_dir: Path) -> None:
"""Create the skills directory if missing.
@@ -116,7 +130,7 @@ def _parse_skill_md(path: Path) -> SkillInfo | None:
return SkillInfo(name=name, description=description, path=path)
def list_installed_skills(skills_dir: Path) -> list[SkillInfo]:
def list_installed_skills(skills_dir: Path, *, scope: str = "global") -> list[SkillInfo]:
"""Scan the directory for ``<name>/SKILL.md`` entries and return summaries.
- Sorted by name for deterministic UX
@@ -136,10 +150,33 @@ def list_installed_skills(skills_dir: Path) -> list[SkillInfo]:
continue
info = _parse_skill_md(skill_md)
if info is not None:
found.append(info)
found.append(
SkillInfo(name=info.name, description=info.description, path=info.path, scope=scope)
)
return found
def list_all_skills(config: Config, project_key: str) -> list[SkillInfo]:
"""Merged project + global skill list. Project wins on name collision."""
global_skills = list_installed_skills(user_skills_dir(config), scope="global")
project_skills = list_installed_skills(project_skills_dir(config, project_key), scope="project")
project_names = {s.name for s in project_skills}
merged = [s for s in global_skills if s.name not in project_names]
merged.extend(project_skills)
merged.sort(key=lambda s: s.name)
return merged
def find_skill(config: Config, project_key: str, name: str) -> SkillInfo | None:
"""Resolve a skill by name, preferring project-scope over global."""
if not name:
return None
for skill in list_all_skills(config, project_key):
if skill.name == name:
return skill
return None
def read_skill_body(skills_dir: Path, name: str) -> str | None:
"""Return the full SKILL.md content for the named skill, or None if missing.
@@ -160,11 +197,16 @@ def read_skill_body(skills_dir: Path, name: str) -> str | None:
return None
def resolve_skill_sources(config: Config) -> list[str]:
def resolve_skill_sources(config: Config, project_key: str | None = None) -> list[str]:
"""Build the list of skill-directory sources to pass to deepagents.
Currently a single-entry list (user-scope). Designed to be extended with
project-scope and team-scope sources in later PRs without changing the
caller interface.
Order: global first, then project. ``deepagents.SkillsMiddleware`` applies
later-wins precedence, so a project skill overrides a global one with the
same name.
Returns absolute paths. Project source is omitted when no
``project_key`` is supplied (e.g. workflow-engine call sites that don't
have a project context).
"""
return [str(user_skills_dir(config).resolve())]
sources = [str(user_skills_dir(config).resolve())]
if project_key:
sources.append(str(project_skills_dir(config, project_key).resolve()))
return sources
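
The later-wins rule for skill sources can be illustrated with a small directory scan. This is a sketch under the layout described in the docstring (`<name>/SKILL.md` under each source); `merge_skill_dirs` and `demo` are hypothetical helpers and do not reproduce how `deepagents` resolves skills internally.

```python
import tempfile
from pathlib import Path


def merge_skill_dirs(sources: list[Path]) -> dict[str, Path]:
    """Map skill name to SKILL.md path; later sources overwrite earlier ones."""
    merged: dict[str, Path] = {}
    for source in sources:
        if not source.is_dir():
            continue
        for skill_md in sorted(source.glob("*/SKILL.md")):
            merged[skill_md.parent.name] = skill_md
    return merged


def demo() -> dict[str, str]:
    """Build a throwaway global + project tree and report which scope wins."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        global_dir = root / "skills"
        project_dir = root / "projects" / "demo" / "skills"
        layout = ((global_dir, ["web-research", "haiku"]), (project_dir, ["haiku"]))
        for base, names in layout:
            for name in names:
                (base / name).mkdir(parents=True)
                (base / name / "SKILL.md").write_text(
                    f"---\nname: {name}\n---\n", encoding="utf-8"
                )
        # Global first, then project, so the project copy of "haiku" wins.
        merged = merge_skill_dirs([global_dir, project_dir])
        return {
            name: ("project" if project_dir in path.parents else "global")
            for name, path in merged.items()
        }
```

Passing the sources in the opposite order would make the global copy win, which is why the order returned by `resolve_skill_sources` matters.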


@@ -0,0 +1,389 @@
"""Sub-agent session linkage + runner (v0.3 PR #6).
PR #1 already added `parent_session_id` + `depth` columns to
`InteractiveSessionRow`. This module provides:
- :func:`spawn_subagent_session` — atomically creates a child row inheriting
``project_key`` from the parent, sets ``parent_session_id`` + ``depth =
parent.depth + 1``, rejects when depth would exceed
:data:`MAX_SUBAGENT_DEPTH`.
- :func:`list_subagents` — direct children for ``/agents`` listings.
- :func:`resolve_root_session_id` — walk the parent chain to find the root.
- :func:`run_subagent_to_completion` — actually invoke a sub-agent's
``ainvoke`` with isolation + LangGraph thread + summary push to parent.
Cost rollup: each sub-agent's CostMiddleware is wired with the ROOT session
id so all LLM calls under that session tree charge a single ``session:<uuid>``
scope, matching the plan ("sub-agent usage is counted against the root session's limit").
"""
from __future__ import annotations
import logging
from collections.abc import Sequence
from datetime import UTC, datetime
from typing import Any
from uuid import UUID, uuid4
from sqlalchemy import desc, select
from .audit import make_audit_recorder
from .budget import make_budget_tracker_from_config
from .compaction import compact_session
from .config import Config
from .errors import MyDeepAgentError
from .memory import (
ensure_memory_initialized,
global_memory_dir,
list_memory_paths,
project_memory_dir,
)
from .middleware.audit import AuditToolMiddleware
from .middleware.cost import CostMiddleware
from .middleware.plan_mode import PlanModeMiddleware
from .monitoring.pricing import ModelPrice, PricingCache
from .monitoring.token_budget import count_tokens
from .persistence.db import Database
from .persistence.models import AgentPersonaRow, InteractiveSessionRow, MessageRow
from .persona import Persona
from .session import build_agent
from .skills import (
ensure_skills_initialized,
project_skills_dir,
resolve_skill_sources,
user_skills_dir,
)
_LOG = logging.getLogger(__name__)
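#: Maximum sub-agent nesting depth. Above this we refuse to spawn. The cap
#: follows Claude Code's `task` tool, which keeps agent stacks to roughly
#: 3 levels, so budgets and audit trails stay legible. The root session has
#: depth 0, so the deepest allowed chain is Main → A → B → C (child depth 3).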
MAX_SUBAGENT_DEPTH: int = 3
def _now_iso() -> str:
return datetime.now(UTC).isoformat(timespec="seconds")
async def spawn_subagent_session(
db: Database,
*,
parent_session_id: UUID,
persona: Persona,
initial_title: str | None = None,
) -> UUID:
"""Create a child :class:`InteractiveSessionRow` linked to ``parent_session_id``.
The child inherits ``project_key`` from the parent — same memory dir,
same skill dir. ``depth`` is incremented by 1; if that would exceed
:data:`MAX_SUBAGENT_DEPTH` we raise ``MyDeepAgentError.human_required(...)``
so the caller (REPL slash / API endpoint) can surface a clean message.
The persona may be different from the parent's (callers often want a
specialised role for the child), so ``persona`` is required. We upsert
its ``AgentPersonaRow`` for the FK exactly like
:func:`cli.interactive._load_or_create_session_row` does for root rows.
Returns the new child ``session_id``.
"""
async with db.session() as s:
parent = await s.get(InteractiveSessionRow, str(parent_session_id))
if parent is None:
raise MyDeepAgentError.human_required(
"parent_session_missing",
message=f"cannot spawn sub-agent: parent session {parent_session_id} not found",
recovery_hint="confirm the parent session id; sub-agents require a live parent",
)
if parent.state == "ended":
raise MyDeepAgentError.human_required(
"parent_session_ended",
message=f"cannot spawn sub-agent: parent {parent.id} is ended",
recovery_hint="resume the parent session first or pick a different parent",
)
new_depth = (parent.depth or 0) + 1
if new_depth > MAX_SUBAGENT_DEPTH:
raise MyDeepAgentError.human_required(
"subagent_depth_exceeded",
message=(
f"sub-agent depth limit reached: parent depth={parent.depth}, "
f"max={MAX_SUBAGENT_DEPTH}"
),
recovery_hint=(
f"flatten the agent stack (max {MAX_SUBAGENT_DEPTH} levels) "
"or close intermediate sub-agents first"
),
)
# Upsert AgentPersonaRow for the persona we're spawning with.
ph = persona.compute_hash()
persona_row = (
await s.execute(select(AgentPersonaRow).where(AgentPersonaRow.hash == ph))
).scalar_one_or_none()
if persona_row is None:
persona_row = AgentPersonaRow(
id=str(uuid4()),
name=persona.name,
version=persona.version,
hash=ph,
definition=persona.model_dump(by_alias=True),
created_at=_now_iso(),
)
s.add(persona_row)
await s.flush()
child_id = uuid4()
child = InteractiveSessionRow(
id=str(child_id),
persona_id=persona_row.id,
persona_hash=ph,
started_at=_now_iso(),
last_message_at=None,
state="active",
total_input_tokens=0,
total_output_tokens=0,
model=persona.model,
project_key=parent.project_key, # inherit so memory is shared
title=initial_title,
plan_mode=False,
parent_session_id=parent.id,
depth=new_depth,
)
s.add(child)
await s.commit()
return child_id
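
The depth rule applied above is small enough to check in isolation. The sketch below assumes the same convention as the function (root depth 0 or `None`, child depth = parent depth + 1, reject when the result exceeds the cap). `DepthExceeded`, `child_depth`, and `can_spawn` are hypothetical names.

```python
from typing import Optional

MAX_DEPTH = 3  # mirrors MAX_SUBAGENT_DEPTH in this module


class DepthExceeded(Exception):
    """Raised when a spawn would nest deeper than the cap allows."""


def child_depth(parent_depth: Optional[int], max_depth: int = MAX_DEPTH) -> int:
    """Return the depth a new child would get, or raise if it is too deep."""
    new_depth = (parent_depth or 0) + 1
    if new_depth > max_depth:
        raise DepthExceeded(f"parent depth={parent_depth}, max={max_depth}")
    return new_depth


def can_spawn(parent_depth: Optional[int], max_depth: int = MAX_DEPTH) -> bool:
    """Boolean wrapper, convenient for UI checks before calling spawn."""
    try:
        child_depth(parent_depth, max_depth)
    except DepthExceeded:
        return False
    return True
```

With the cap at 3, a parent at depth 2 may still spawn (the child lands at depth 3), while a parent at depth 3 may not.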
async def list_subagents(db: Database, parent_session_id: UUID) -> list[InteractiveSessionRow]:
"""Return all direct children of ``parent_session_id``, oldest first.
Used by the ``/agents`` slash and the Web GUI session tree. Does NOT
recurse — callers that want the full tree must walk it themselves.
"""
async with db.session() as s:
rows: Sequence[InteractiveSessionRow] = (
(
await s.execute(
select(InteractiveSessionRow)
.where(InteractiveSessionRow.parent_session_id == str(parent_session_id))
.order_by(InteractiveSessionRow.started_at)
)
)
.scalars()
.all()
)
return list(rows)
async def resolve_root_session_id(db: Database, session_id: UUID) -> UUID:
"""Walk ``parent_session_id`` until we reach a session with ``parent=None``.
Guarded against cycles, which can only occur if the depth column is wrong:
the walk is capped at ``MAX_SUBAGENT_DEPTH + 2`` iterations. Returns the input
id when the session has no parent or its row cannot be found.
"""
current = str(session_id)
for _ in range(MAX_SUBAGENT_DEPTH + 2):
async with db.session() as s:
row = await s.get(InteractiveSessionRow, current)
if row is None:
return session_id
if row.parent_session_id is None:
return UUID(row.id)
current = row.parent_session_id
# Cycle detected — return the latest hop as a graceful fallback.
return UUID(current)
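
The parent-chain walk can be exercised against an in-memory mapping instead of a database. This is a sketch of the same control flow, assuming a plain `dict` from session id to parent id; `resolve_root` is a hypothetical helper, not the function above.

```python
from typing import Optional


def resolve_root(
    parents: dict[str, Optional[str]], session_id: str, max_depth: int = 3
) -> str:
    """Follow parent links to the root, with a hard cap on the number of hops."""
    current = session_id
    for _ in range(max_depth + 2):
        if current not in parents:
            # Unknown row: fall back to the input id, as the code above does.
            return session_id
        parent = parents[current]
        if parent is None:
            return current
        current = parent
    # Cap reached (likely a cycle): return the latest hop instead of looping.
    return current


tree = {"root": None, "a": "root", "b": "a"}
cycle = {"x": "y", "y": "x"}
```

Here `resolve_root(tree, "b")` walks `b → a → root`, and the `cycle` mapping terminates after the cap rather than spinning forever.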
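# English gloss of the Korean prompt below: "You are a sub-agent. Finish the
# task the user requested and, in a single response, lay out (1) the conclusion
# you reached, (2) files changed / artefacts created, and (3) a key summary to
# hand to the parent session (≤ 400 words). No further turns will take place."
_SUBAGENT_SUMMARY_INSTRUCTION = (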
"당신은 sub-agent 입니다. 사용자가 요청한 과제를 마치고 한 번의 응답 안에 "
"(1) 도달한 결론, (2) 변경한 파일/생성한 산출물, (3) 부모 세션에 전달할 핵심 "
"요약 (≤ 400 단어) 을 정리하세요. 추가 turn 은 일어나지 않습니다."
)
def _static_pricing_seed() -> PricingCache:
cache = PricingCache()
cache.set(
[
ModelPrice("anthropic/claude-sonnet-4-6", 0.003, 0.015, 200_000),
ModelPrice("anthropic/claude-haiku-4-5", 0.001, 0.005, 200_000),
ModelPrice("anthropic/claude-opus-4-1", 0.015, 0.075, 200_000),
ModelPrice("deepseek/deepseek-chat", 0.00028, 0.00112, 64_000),
]
)
return cache
def _flatten_assistant_content(msg: Any) -> str:
content = getattr(msg, "content", "") or ""
if isinstance(content, list):
return "\n".join(
(b.get("text", str(b)) if isinstance(b, dict) else str(b)) for b in content
)
return str(content)
async def _persist_message(
db: Database, session_id: UUID, role: str, content: str, *, model: str
) -> None:
"""Insert one MessageRow + bump last_message_at + token totals.
Mirrors the REPL's ``_append_message`` but lives in subagents.py so the
runner stays self-contained.
"""
token_count = count_tokens(content, model)
now = datetime.now(UTC).isoformat(timespec="seconds")
async with db.session() as s:
last_seq = (
await s.execute(
select(MessageRow.seq)
.where(MessageRow.session_id == str(session_id))
.order_by(desc(MessageRow.seq))
.limit(1)
)
).scalar_one_or_none() or 0
s.add(
MessageRow(
session_id=str(session_id),
seq=last_seq + 1,
role=role,
content=content,
tool_calls=None,
token_count=token_count,
is_summary=False,
archived=False,
ts=now,
)
)
row = await s.get(InteractiveSessionRow, str(session_id))
if row is not None:
row.last_message_at = now
if role == "user":
row.total_input_tokens += token_count
elif role == "assistant":
row.total_output_tokens += token_count
await s.commit()
async def run_subagent_to_completion(
db: Database,
config: Config,
parent_session_id: UUID,
sub_session_id: UUID,
persona: Persona,
prompt: str,
*,
saver: Any | None = None,
) -> str:
"""Invoke the sub-agent ONCE with the supplied prompt and return its summary.
- Loads the sub-session row to read ``project_key`` (inherited from parent)
- Resolves the root session id and wires CostMiddleware to charge that
single ``session:<root_uuid>`` scope so the whole agent tree shares one
budget envelope (per plan: "sub-agent usage is counted against the root session's limit").
- Persists user prompt + assistant summary to the sub-session.
- Pushes a "[sub-agent <id> result] …" system message to the parent so
the user sees the outcome in the main thread.
- Marks the sub-session ``ended`` on completion.
Failures are logged + propagated as a synthetic assistant message in the
sub-session, and an error system message in the parent.
"""
async with db.session() as s:
sub_row = await s.get(InteractiveSessionRow, str(sub_session_id))
if sub_row is None:
raise MyDeepAgentError.fatal(
"subagent_session_missing",
message=f"sub-agent session {sub_session_id} not found",
recovery_hint="call spawn_subagent_session before run_subagent_to_completion",
)
project_key = sub_row.project_key or ""
root_session_id = await resolve_root_session_id(db, parent_session_id)
# Bootstrap shared memory + skills dirs (idempotent).
if project_key:
ensure_memory_initialized(project_memory_dir(config, project_key))
ensure_skills_initialized(project_skills_dir(config, project_key))
ensure_memory_initialized(global_memory_dir(config))
ensure_skills_initialized(user_skills_dir(config))
pricing = _static_pricing_seed()
budget = make_budget_tracker_from_config(db, config)
cost_mw = CostMiddleware(
pricing=pricing,
model_name=persona.model,
interactive_session_id=root_session_id,
persona_name=persona.name,
budget_tracker=budget,
)
audit_mw = AuditToolMiddleware(
interactive_session_id=sub_session_id,
file_recorder=make_audit_recorder(config.state_dir),
)
plan_mw = PlanModeMiddleware(is_active=lambda: False)
memory_paths = list_memory_paths(global_memory_dir(config))
if project_key:
memory_paths += list_memory_paths(project_memory_dir(config, project_key))
skill_sources = resolve_skill_sources(config, project_key or None)
agent = build_agent(
persona,
config,
root_dir=config.workspace_root,
middleware=[plan_mw, cost_mw, audit_mw],
checkpointer=saver,
memory_paths_override=memory_paths,
skills_sources_override=skill_sources,
)
full_prompt = f"{prompt.strip()}\n\n---\n\n{_SUBAGENT_SUMMARY_INSTRUCTION}"
await _persist_message(db, sub_session_id, "user", full_prompt, model=persona.model)
thread_id = f"{sub_session_id}:0"
try:
result = await agent.ainvoke(
{"messages": [{"role": "user", "content": full_prompt}]},
config={"configurable": {"thread_id": thread_id}},
)
except Exception as e:
_LOG.exception("sub-agent ainvoke failed for session %s", sub_session_id)
error_msg = f"sub-agent failed: {type(e).__name__}: {e}"
await _persist_message(db, sub_session_id, "assistant", error_msg, model=persona.model)
await _persist_message(
db,
parent_session_id,
"system",
f"[sub-agent {str(sub_session_id)[:8]} error] {error_msg}",
model=persona.model,
)
await _mark_session_ended(db, sub_session_id)
return error_msg
messages = result.get("messages", []) if isinstance(result, dict) else []
summary = _flatten_assistant_content(messages[-1]) if messages else "(empty response)"
await _persist_message(db, sub_session_id, "assistant", summary, model=persona.model)
await _persist_message(
db,
parent_session_id,
"system",
f"[sub-agent {str(sub_session_id)[:8]} result]\n{summary}",
model=persona.model,
)
# Compact the sub-session if it grew too big (rare for single-turn but
# the helper is idempotent + cheap to call).
await compact_session(db, config, str(sub_session_id))
await _mark_session_ended(db, sub_session_id)
return summary
async def _mark_session_ended(db: Database, session_id: UUID) -> None:
async with db.session() as s:
row = await s.get(InteractiveSessionRow, str(session_id))
if row is not None and row.state != "ended":
row.state = "ended"
row.ended_at = datetime.now(UTC).isoformat(timespec="seconds")
await s.commit()
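
The cost rollup described in the module docstring amounts to charging every call in a session tree to one scope key. The sketch below is an illustration only: `ScopeLedger`, `scope_for`, and `call_cost_usd` are hypothetical, and it assumes the seeded prices are USD per 1,000 tokens (an inference from the commit message quoting $0.28 / $1.12 per 1M for DeepSeek against the 0.00028 / 0.00112 seed values).

```python
from collections import defaultdict


def call_cost_usd(
    input_tokens: int, output_tokens: int, input_per_1k: float, output_per_1k: float
) -> float:
    """Cost of one call, assuming prices are quoted per 1,000 tokens."""
    return input_tokens / 1000 * input_per_1k + output_tokens / 1000 * output_per_1k


class ScopeLedger:
    """Toy budget ledger keyed by scope string."""

    def __init__(self) -> None:
        self._spent = defaultdict(float)

    def charge(self, scope: str, usd: float) -> None:
        self._spent[scope] += usd

    def spent(self, scope: str) -> float:
        return self._spent.get(scope, 0.0)


def scope_for(root_session_id: str) -> str:
    """All sessions in one tree share the root's scope key."""
    return f"session:{root_session_id}"


ledger = ScopeLedger()
root_scope = scope_for("root-uuid")
# Parent turn on the DeepSeek seed prices.
ledger.charge(root_scope, call_cost_usd(10_000, 2_000, 0.00028, 0.00112))
# Sub-agent turn on the Haiku seed prices, charged to the same root scope.
ledger.charge(root_scope, call_cost_usd(1_000, 500, 0.001, 0.005))
```

Because the sub-agent charges the root's key rather than its own, a single limit check on `session:<root>` covers the whole tree.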


@@ -0,0 +1,115 @@
"""User-scope persona / workflow directories (v0.3 PR #9).
Existing personas live at ``docs/schemas/personas/`` (seeded with the
my-deepagent install). Users can drop additional YAML files into
``<config.data_dir>/personas/`` and ``<config.data_dir>/workflows/`` to
register their own — these are layered ON TOP of the seed (user version
wins on `(name, version)` collision).
This module exposes:
- :func:`user_personas_dir` / :func:`user_workflows_dir` — path helpers.
- :func:`ensure_user_dirs_initialized` — `mkdir -p` for both, idempotent.
- :func:`load_combined_personas` — seed + user, deduplicated by (name, version)
with user-overrides-seed semantics.
- :func:`load_combined_workflows` — seed + user, deduplicated by (name, version).
"""
from __future__ import annotations
import logging
from pathlib import Path
from .config import Config
from .persona import Persona, load_personas_from_dir
from .workflow import WorkflowTemplate, load_workflow_yaml
_LOG = logging.getLogger(__name__)
def user_personas_dir(config: Config) -> Path:
return Path(config.data_dir) / "personas"
def user_workflows_dir(config: Config) -> Path:
return Path(config.data_dir) / "workflows"
def ensure_user_dirs_initialized(config: Config) -> None:
"""`mkdir -p` for both user directories. Idempotent."""
user_personas_dir(config).mkdir(parents=True, exist_ok=True)
user_workflows_dir(config).mkdir(parents=True, exist_ok=True)
def load_combined_personas(config: Config, seed_dir: Path) -> list[Persona]:
"""Combine seeded + user personas with user-overrides-seed precedence.
Returns non-overridden seed entries first, followed by every user entry
(overrides included), which is convenient for CLI listings. Dedupe is keyed on
``(name, version)``. The seed dir uses strict loading (we want to know
if a shipped YAML is broken). The user dir uses best-effort per-file
loading so a single broken file cannot break the REPL.
"""
seed = load_personas_from_dir(seed_dir)
user_dir = user_personas_dir(config)
user = _safe_load_personas(user_dir) if user_dir.is_dir() else []
return _merge_with_user_override(seed, user)
def _safe_load_personas(directory: Path) -> list[Persona]:
"""Best-effort load — skip individual malformed files."""
from .persona import load_persona_yaml
out: list[Persona] = []
for p in sorted(directory.glob("*.yaml")):
try:
out.append(load_persona_yaml(p))
except Exception as e:
_LOG.warning("skipping invalid persona file %s: %s", p, e)
return out
def _merge_with_user_override(seed: list[Persona], user: list[Persona]) -> list[Persona]:
"""User-wins on `(name, version)`. Keeps seed order for entries that are not
overridden, then appends all user entries (overrides and user-only) in their
own order. Duplicates within the user list itself are not removed."""
user_keys = {(p.name, p.version) for p in user}
merged: list[Persona] = [p for p in seed if (p.name, p.version) not in user_keys]
merged.extend(user)
return merged
def load_combined_workflows(config: Config, seed_dir: Path) -> list[tuple[Path, WorkflowTemplate]]:
"""Combine seeded + user workflows with user-overrides-seed precedence.
Returns `[(path, WorkflowTemplate), ...]`. Malformed YAMLs (seed or user)
are logged and skipped, so a broken file cannot break the REPL. Order is
non-overridden seed entries first, then all user entries.
"""
seed = _safe_load_workflows(seed_dir)
user_dir = user_workflows_dir(config)
user = _safe_load_workflows(user_dir) if user_dir.is_dir() else []
return _merge_workflows_with_user_override(seed, user)
def _safe_load_workflows(directory: Path) -> list[tuple[Path, WorkflowTemplate]]:
if not directory.is_dir():
return []
out: list[tuple[Path, WorkflowTemplate]] = []
for p in sorted(directory.glob("*.yaml")):
try:
out.append((p, load_workflow_yaml(p)))
except Exception as e:
_LOG.warning("skipping invalid workflow file %s: %s", p, e)
return out
def _merge_workflows_with_user_override(
seed: list[tuple[Path, WorkflowTemplate]],
user: list[tuple[Path, WorkflowTemplate]],
) -> list[tuple[Path, WorkflowTemplate]]:
user_keys = {(t.name, t.version) for (_p, t) in user}
merged: list[tuple[Path, WorkflowTemplate]] = [
(p, t) for (p, t) in seed if (t.name, t.version) not in user_keys
]
merged.extend(user)
return merged
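
The merge helpers above share one pattern: drop seed entries whose `(name, version)` key also appears in the user list, then append the user list. A minimal sketch with a hypothetical `Item` record shows the resulting order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for a persona or workflow template."""
    name: str
    version: int
    origin: str


def merge_with_user_override(seed: list[Item], user: list[Item]) -> list[Item]:
    """Keep non-overridden seed entries in order, then append every user entry."""
    user_keys = {(item.name, item.version) for item in user}
    merged = [item for item in seed if (item.name, item.version) not in user_keys]
    merged.extend(user)
    return merged


seed_items = [Item("planner", 1, "seed"), Item("coder", 1, "seed")]
user_items = [Item("coder", 1, "user"), Item("reviewer", 1, "user")]
merged_items = merge_with_user_override(seed_items, user_items)
```

Note that the overriding `coder` entry moves to the user section of the list instead of staying in the seed position, which matters if a listing relies on order.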


@@ -417,11 +417,926 @@ async function resumeRun() {
}
}
// =============== conversation page (v0.3 PR #8) ===============
const CONV_STATE = {
sessionId: null,
eventSource: null,
lastSeq: 0,
awaitingReply: false,
streamBuffer: "", // v0.4 B3: accumulated token deltas while streaming
};
function $conv(sel) { return document.querySelector(sel); }
function setSendDisabled(disabled) {
$conv("#message-input").disabled = disabled;
$conv("#send-btn").disabled = disabled;
}
// v0.4 B4: toggle the abort button's visibility based on in-flight state.
// Pass `visible = true` only while a reply is pending, i.e. after
// setSendDisabled(true) has been called for the current turn.
function setAbortVisible(visible) {
const btn = $conv("#abort-btn");
if (!btn) return;
btn.style.display = visible ? "inline-block" : "none";
btn.disabled = !visible;
}
function clearMessages() {
const list = $conv("#messages");
list.replaceChildren();
}
function showConversationEmpty(show, text) {
let el = $conv("#conv-empty");
if (!el && show) {
el = document.createElement("div");
el.id = "conv-empty";
el.className = "conv-empty";
$conv("#messages").appendChild(el);
}
if (el) {
if (show) {
el.textContent = text || "대화를 시작하세요."; // default text: "Start a conversation."
el.style.display = "";
} else {
el.remove();
}
}
}
// v0.4 B1: minimal markdown renderer for assistant messages.
// SECURITY: we ONLY emit DOM nodes built via createElement + textContent.
// No innerHTML, no insertAdjacentHTML. This is a tiny subset of Markdown
// chosen for chat readability — anything we don't understand is rendered as
// literal text (textContent fallback in the default case).
function _mdRenderInto(target, raw) {
// Code-fence-aware splitter — we walk the input line-by-line and group
// lines into blocks (paragraph, code-fence, h#, list).
const lines = raw.split("\n");
let i = 0;
while (i < lines.length) {
const line = lines[i];
// Fenced code block: ```lang
const fence = line.match(/^```\s*([\w.-]*)\s*$/);
if (fence) {
const lang = fence[1];
const codeLines = [];
i++;
while (i < lines.length && !/^```\s*$/.test(lines[i])) {
codeLines.push(lines[i]);
i++;
}
if (i < lines.length) i++; // consume closing ```
const pre = document.createElement("pre");
pre.className = "md-code";
const code = document.createElement("code");
if (lang) code.className = `language-${lang}`;
code.textContent = codeLines.join("\n");
pre.appendChild(code);
target.appendChild(pre);
continue;
}
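// ATX header: # through ######. Levels are demoted by two (# renders as
// <h3>) and capped at <h6> so headings stay small inside a chat bubble.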
const hdr = line.match(/^(#{1,6})\s+(.*)$/);
if (hdr) {
const level = hdr[1].length;
const h = document.createElement(`h${level + 2 > 6 ? 6 : level + 2}`);
h.className = "md-h";
_mdInline(h, hdr[2]);
target.appendChild(h);
i++;
continue;
}
// Unordered list block — consecutive "- " or "* "
if (/^[-*]\s+/.test(line)) {
const ul = document.createElement("ul");
ul.className = "md-ul";
while (i < lines.length && /^[-*]\s+/.test(lines[i])) {
const li = document.createElement("li");
_mdInline(li, lines[i].replace(/^[-*]\s+/, ""));
ul.appendChild(li);
i++;
}
target.appendChild(ul);
continue;
}
// Ordered list: "1. ", "2. ", …
if (/^\d+\.\s+/.test(line)) {
const ol = document.createElement("ol");
ol.className = "md-ol";
while (i < lines.length && /^\d+\.\s+/.test(lines[i])) {
const li = document.createElement("li");
_mdInline(li, lines[i].replace(/^\d+\.\s+/, ""));
ol.appendChild(li);
i++;
}
target.appendChild(ol);
continue;
}
// Blank line — paragraph separator; skip.
if (line.trim() === "") {
i++;
continue;
}
// Paragraph: greedily consume until blank or block-start.
const paraLines = [line];
i++;
while (
i < lines.length
&& lines[i].trim() !== ""
&& !/^```/.test(lines[i])
&& !/^#{1,6}\s+/.test(lines[i])
&& !/^[-*]\s+/.test(lines[i])
&& !/^\d+\.\s+/.test(lines[i])
) {
paraLines.push(lines[i]);
i++;
}
const p = document.createElement("p");
p.className = "md-p";
_mdInline(p, paraLines.join("\n"));
target.appendChild(p);
}
}
// Inline parser: handles `code`, **bold**, *italic*, [link](url).
// Emits DOM nodes; never innerHTML.
function _mdInline(target, text) {
// Walk the string, matching the earliest-occurring inline pattern.
let remaining = text;
while (remaining.length > 0) {
const matches = [
{ re: /`([^`]+)`/, tag: "code" },
{ re: /\*\*([^*\n]+)\*\*/, tag: "strong" },
{ re: /(?<!\*)\*([^*\n]+)\*(?!\*)/, tag: "em" },
{ re: /\[([^\]]+)\]\(([^)\s]+)\)/, tag: "a" },
];
let best = null;
for (const m of matches) {
const hit = remaining.match(m.re);
if (hit && (best === null || hit.index < best.hit.index)) {
best = { hit, tag: m.tag };
}
}
if (best === null) {
target.appendChild(document.createTextNode(remaining));
return;
}
if (best.hit.index > 0) {
target.appendChild(document.createTextNode(remaining.slice(0, best.hit.index)));
}
const el = document.createElement(best.tag);
if (best.tag === "a") {
// Link: cap protocol to http/https to avoid javascript: scheme escapes.
const href = best.hit[2];
if (/^https?:\/\//.test(href)) el.href = href;
el.rel = "noopener noreferrer";
el.target = "_blank";
el.textContent = best.hit[1];
} else {
el.textContent = best.hit[1];
}
target.appendChild(el);
remaining = remaining.slice(best.hit.index + best.hit[0].length);
}
}
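
The earliest-match scan used by `_mdInline` is language-independent, so it can be checked outside the browser. The sketch below is a Python re-statement of the same loop that emits tokens instead of DOM nodes. `tokenize_inline` is a hypothetical helper; the patterns are copied from the JavaScript above, and the `http(s)` allow-list for links is applied the same way.

```python
import re

# Same four patterns as the JavaScript version, in the same priority order.
_PATTERNS = [
    ("code", re.compile(r"`([^`]+)`")),
    ("strong", re.compile(r"\*\*([^*\n]+)\*\*")),
    ("em", re.compile(r"(?<!\*)\*([^*\n]+)\*(?!\*)")),
    ("a", re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")),
]


def tokenize_inline(text: str) -> list[tuple]:
    """Split text into (tag, content, href) tokens; href is set only for safe links."""
    tokens = []
    remaining = text
    while remaining:
        best = None
        for tag, pattern in _PATTERNS:
            hit = pattern.search(remaining)
            # Strict "<" keeps the earlier-listed pattern on ties.
            if hit and (best is None or hit.start() < best[1].start()):
                best = (tag, hit)
        if best is None:
            tokens.append(("text", remaining, None))
            break
        tag, hit = best
        if hit.start() > 0:
            tokens.append(("text", remaining[: hit.start()], None))
        href = None
        if tag == "a" and re.match(r"https?://", hit.group(2)):
            href = hit.group(2)  # anything else (e.g. javascript:) gets no href
        tokens.append((tag, hit.group(1), href))
        remaining = remaining[hit.end():]
    return tokens
```

Anything that matches no pattern falls through as a plain `text` token, which corresponds to the `createTextNode` fallback in the JavaScript.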
// v0.4 B2: classify system messages into collapsible "event cards" so the
// chat thread doesn't drown in [sub-agent ... spawned] / [workflow ... started]
// notices. Returns { label, icon, open } (open = expanded by default), or
// null when the content matches no known prefix.
function _classifySystemMessage(content) {
if (content.startsWith("[sub-agent")) {
if (content.includes("result]")) return { label: "Sub-agent result", icon: "🤖", open: true };
if (content.includes("error]")) return { label: "Sub-agent error", icon: "⚠️", open: true };
return { label: "Sub-agent spawned", icon: "🚀", open: false };
}
if (content.startsWith("[workflow")) {
if (content.includes("started]")) return { label: "Workflow started", icon: "🛠️", open: false };
if (content.includes("failed]")) return { label: "Workflow failed", icon: "❌", open: true };
return { label: "Workflow event", icon: "✅", open: true };
}
if (content.startsWith("Earlier conversation history")) {
return { label: "Compaction summary", icon: "📝", open: false };
}
// Korean prompt prefix, roughly "You are in plan mode".
if (content.startsWith("당신은 plan mode")) {
return { label: "Plan mode activated", icon: "🧭", open: false };
}
if (content.startsWith("The user APPROVED")) {
return { label: "Approved plan", icon: "✅", open: false };
}
if (content.startsWith("The user requested skill")) {
return { label: "Skill activated", icon: "🪄", open: false };
}
return null;
}
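Review note: the prefix checks above can also be read as an ordered rule table where the first match wins. The sketch below is an illustration only; `SYSTEM_EVENT_RULES` and `classifyByRules` are hypothetical names, and the labels, icons and open flags are copied from `_classifySystemMessage`.

```javascript
// Hypothetical table-driven variant (not in app.js). Order matters:
// the more specific "contains" rules come before the bare-prefix fallback.
const SYSTEM_EVENT_RULES = [
  { prefix: "[sub-agent", contains: "result]", label: "Sub-agent result", icon: "🤖", open: true },
  { prefix: "[sub-agent", contains: "error]", label: "Sub-agent error", icon: "⚠️", open: true },
  { prefix: "[sub-agent", label: "Sub-agent spawned", icon: "🚀", open: false },
  { prefix: "[workflow", contains: "started]", label: "Workflow started", icon: "🛠️", open: false },
  { prefix: "[workflow", contains: "failed]", label: "Workflow failed", icon: "❌", open: true },
  { prefix: "[workflow", label: "Workflow event", icon: "✅", open: true },
  { prefix: "Earlier conversation history", label: "Compaction summary", icon: "📝", open: false },
  { prefix: "당신은 plan mode", label: "Plan mode activated", icon: "🧭", open: false },
  { prefix: "The user APPROVED", label: "Approved plan", icon: "✅", open: false },
  { prefix: "The user requested skill", label: "Skill activated", icon: "🪄", open: false },
];

function classifyByRules(content) {
  for (const rule of SYSTEM_EVENT_RULES) {
    if (!content.startsWith(rule.prefix)) continue;
    if (rule.contains && !content.includes(rule.contains)) continue;
    return { label: rule.label, icon: rule.icon, open: rule.open };
  }
  return null; // unknown system payload: caller falls back to plain markdown
}

console.log(classifyByRules("[sub-agent reviewer result] looks good"));
```

A table keeps each new event type to one line, at the cost of making rule order significant.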
function appendMessageBubble(role, content, ts, opts) {
showConversationEmpty(false);
const list = $conv("#messages");
const bubble = document.createElement("div");
bubble.className = `msg-bubble role-${role}`;
const meta = document.createElement("div");
meta.className = "msg-meta";
const roleSpan = document.createElement("span");
roleSpan.className = "msg-role";
roleSpan.textContent = role;
const tsSpan = document.createElement("span");
tsSpan.className = "msg-ts";
tsSpan.textContent = (ts || "").slice(11, 19);
meta.appendChild(roleSpan);
if (ts) meta.appendChild(tsSpan);
const body = document.createElement("div");
body.className = "msg-body";
if (role === "system") {
// Collapsible event card if we recognise the format; otherwise plain.
const cls = _classifySystemMessage(content);
if (cls !== null) {
bubble.classList.add("role-system-event");
const det = document.createElement("details");
det.className = "md-system-event";
if (cls.open) det.open = true;
const sum = document.createElement("summary");
const icon = document.createElement("span");
icon.className = "event-icon";
icon.textContent = cls.icon;
const label = document.createElement("span");
label.className = "event-label";
label.textContent = cls.label;
sum.appendChild(icon);
sum.appendChild(label);
det.appendChild(sum);
const inner = document.createElement("div");
inner.className = "event-body";
_mdRenderInto(inner, content);
det.appendChild(inner);
body.appendChild(det);
} else {
_mdRenderInto(body, content);
}
} else if (role === "assistant" || (opts && opts.renderMarkdown)) {
_mdRenderInto(body, content);
} else {
body.textContent = content;
}
bubble.appendChild(meta);
bubble.appendChild(body);
list.appendChild(bubble);
list.scrollTop = list.scrollHeight;
return bubble;
}
function appendPendingPlaceholder() {
const list = $conv("#messages");
const placeholder = document.createElement("div");
placeholder.id = "pending-placeholder";
placeholder.className = "msg-bubble role-assistant pending";
const meta = document.createElement("div");
meta.className = "msg-meta";
const roleSpan = document.createElement("span");
roleSpan.className = "msg-role";
roleSpan.textContent = "assistant";
meta.appendChild(roleSpan);
const body = document.createElement("div");
body.className = "msg-body";
body.textContent = "…";
placeholder.appendChild(meta);
placeholder.appendChild(body);
list.appendChild(placeholder);
list.scrollTop = list.scrollHeight;
// v0.4 B3: keep a buffer for streamed tokens so we can re-render markdown
// once the full text arrives.
CONV_STATE.streamBuffer = "";
}
function removePendingPlaceholder() {
const p = $conv("#pending-placeholder");
if (p) p.remove();
CONV_STATE.streamBuffer = "";
}
// v0.4 B3: append a streamed token to the pending placeholder's body.
function appendStreamDelta(text) {
const placeholder = $conv("#pending-placeholder");
if (!placeholder) return;
CONV_STATE.streamBuffer = (CONV_STATE.streamBuffer || "") + text;
const body = placeholder.querySelector(".msg-body");
if (body) {
// Streaming view: keep plain text for speed; the full markdown render only
// happens when the final `message` event arrives. Assigning the whole
// buffer also replaces the initial "…" indicator on the first chunk.
body.textContent = CONV_STATE.streamBuffer;
}
const list = $conv("#messages");
if (list) list.scrollTop = list.scrollHeight;
}
function updateSessionStatePill(state) {
const pill = $conv("#session-state-pill");
if (!pill) return;
if (!state) {
pill.textContent = "";
pill.className = "conv-session-state";
return;
}
pill.textContent = state;
pill.className = `conv-session-state state-${state}`;
}
function updateSessionModelPill(model) {
const pill = $conv("#session-model-pill");
if (!pill) return;
if (!model) {
pill.textContent = "";
pill.className = "conv-model-pill";
return;
}
// Trim the "openrouter:" prefix for display; keep full id in tooltip.
const display = model.replace(/^openrouter:/, "");
pill.textContent = display;
pill.title = `model: ${model}`;
pill.className = "conv-model-pill";
}
async function loadSessionList() {
try {
const list = await jsonFetch("/sessions?limit=50");
const picker = $conv("#session-picker");
picker.replaceChildren();
const placeholderOpt = document.createElement("option");
placeholderOpt.value = "";
placeholderOpt.textContent = "(세션 선택…)";
picker.appendChild(placeholderOpt);
for (const s of list) {
const opt = document.createElement("option");
opt.value = s.id;
const titleStr = s.title || "(제목 없음)";
opt.textContent = `${s.id.slice(0, 8)}… · ${titleStr}`;
picker.appendChild(opt);
}
} catch (e) {
setError(`세션 목록 로드 실패: ${e.message}`);
}
}
async function loadAndAttachSession(sessionId) {
if (CONV_STATE.eventSource) {
CONV_STATE.eventSource.close();
CONV_STATE.eventSource = null;
}
CONV_STATE.sessionId = sessionId;
CONV_STATE.lastSeq = 0;
CONV_STATE.awaitingReply = false;
clearMessages();
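let detail;
try {
detail = await jsonFetch(`/sessions/${encodeURIComponent(sessionId)}`);
} catch (e) {
setError(`세션 로드 실패: ${e.message}`);
setSendDisabled(true);
return;
}
// Another session may have been selected while this fetch was in flight;
// drop the stale response so it cannot overwrite the newer session's thread.
if (CONV_STATE.sessionId !== sessionId) return;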
updateSessionStatePill(detail.session.state);
updateSessionModelPill(detail.session.model);
const messages = detail.messages || [];
for (const m of messages) {
// v0.4 B2: render system messages too — most map to recognised event
// cards (collapsible). Unknown system payloads fall through to plain
// markdown rendering.
appendMessageBubble(m.role, m.content, m.ts);
if (m.seq > CONV_STATE.lastSeq) CONV_STATE.lastSeq = m.seq;
}
if (messages.length === 0) {
showConversationEmpty(true, "이 세션에 메시지가 아직 없습니다. 첫 메시지를 보내보세요.");
}
const ended = detail.session.state === "ended";
setSendDisabled(ended);
if (!ended) attachEventSource(sessionId);
}
function attachEventSource(sessionId) {
if (CONV_STATE.eventSource) {
CONV_STATE.eventSource.close();
}
const url = `${API}/sessions/${encodeURIComponent(sessionId)}/stream?last_seq=${CONV_STATE.lastSeq}`;
const src = new EventSource(url);
CONV_STATE.eventSource = src;
src.addEventListener("message", (ev) => {
try {
const data = JSON.parse(ev.data);
if (data.seq <= CONV_STATE.lastSeq) return;
if (data.role === "assistant" && CONV_STATE.awaitingReply) {
removePendingPlaceholder();
CONV_STATE.awaitingReply = false;
setAbortVisible(false);
}
// v0.4 B2: render every system message — most are recognised events
// (compaction / sub-agent / workflow / plan / skill) and rendered as
// collapsible cards by appendMessageBubble.
appendMessageBubble(data.role, data.content, data.ts);
CONV_STATE.lastSeq = data.seq;
} catch (_) { /* ignore parse errors */ }
});
// v0.4 B3: token streaming. Server pushes one chunk per LLM token; we
// append to the pending placeholder. When the final "message" SSE arrives
// it replaces the streaming text with the markdown-rendered version.
src.addEventListener("chunk", (ev) => {
try {
const data = JSON.parse(ev.data);
if (data.type === "delta" && typeof data.text === "string") {
appendStreamDelta(data.text);
} else if (data.type === "cancelled" || data.type === "error") {
// Drop the placeholder. The error text is either already shown via setError
// or arrives with the following 'message' event.
removePendingPlaceholder();
CONV_STATE.awaitingReply = false;
setAbortVisible(false);
}
// type === "done" is benign — the matching 'message' SSE arrives next.
} catch (_) { /* ignore parse errors */ }
});
src.addEventListener("done", () => {
src.close();
if (CONV_STATE.eventSource === src) CONV_STATE.eventSource = null;
updateSessionStatePill("ended");
setSendDisabled(true);
});
src.onerror = () => {
// Sessions are long-lived — let the browser reconnect on EventSource's
// default backoff. We don't surface this as a hard error unless it
// persists.
};
}
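Review note: the branches of the "chunk" listener above can be isolated as a pure function, which makes them checkable without a browser or a live SSE stream. A minimal sketch, assuming the payload shape used in the listener (`type`, `text`); the name `chunkAction` and the returned `kind` strings are hypothetical.

```javascript
// Hypothetical pure helper (not in app.js) mirroring the "chunk" listener.
function chunkAction(rawData) {
  let data;
  try {
    data = JSON.parse(rawData);
  } catch (_) {
    return { kind: "ignore" }; // malformed JSON is dropped, as in the listener
  }
  if (data === null || typeof data !== "object") return { kind: "ignore" };
  if (data.type === "delta" && typeof data.text === "string") {
    return { kind: "append", text: data.text }; // caller: appendStreamDelta(text)
  }
  if (data.type === "cancelled" || data.type === "error") {
    return { kind: "drop-placeholder" }; // caller: removePendingPlaceholder()
  }
  return { kind: "ignore" }; // includes type "done"
}

console.log(chunkAction('{"type":"delta","text":"Hel"}'));
```

The listener would then switch on `kind` and perform the DOM side effects, keeping parsing and rendering separate.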
async function sendMessage(text) {
if (!CONV_STATE.sessionId) {
setError("세션을 먼저 선택하거나 새로 만드세요.");
return;
}
if (!text.trim()) return;
setSendDisabled(true);
setAbortVisible(true);
CONV_STATE.awaitingReply = true;
appendPendingPlaceholder();
try {
await jsonFetch(`/sessions/${CONV_STATE.sessionId}/messages`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ content: text }),
});
$conv("#message-input").value = "";
setError("");
} catch (e) {
removePendingPlaceholder();
CONV_STATE.awaitingReply = false;
setAbortVisible(false);
setError(`전송 실패: ${e.message}`);
} finally {
setSendDisabled(false);
$conv("#message-input").focus();
}
}
async function abortInflight() {
if (!CONV_STATE.sessionId) return;
try {
await jsonFetch(`/sessions/${CONV_STATE.sessionId}/abort`, { method: "POST" });
removePendingPlaceholder();
CONV_STATE.awaitingReply = false;
setAbortVisible(false);
setError("");
} catch (e) {
setError(`중단 실패: ${e.message}`);
}
}
async function createNewSession() {
let personas;
try {
personas = await jsonFetch("/personas");
} catch (e) {
setError(`persona 목록 로드 실패: ${e.message}`);
return;
}
const defaultPersona = personas.find((p) => p.name === "default-interactive") || personas[0];
if (!defaultPersona) {
setError("등록된 persona 가 없습니다. CLI 에서 `mydeepagent` 한 번 실행한 후 재시도하세요.");
return;
}
try {
const ack = await jsonFetch("/sessions", {
method: "POST",
headers: { "Content-Type": "application/json" },
// CreateSessionRequest requires repo_path min_length=1. We default to
// "." (cwd of the serve process) — the backend resolves it to absolute.
body: JSON.stringify({ persona_name: defaultPersona.name, repo_path: "." }),
});
await loadSessionList();
$conv("#session-picker").value = ack.session_id;
await loadAndAttachSession(ack.session_id);
} catch (e) {
setError(`세션 생성 실패: ${e.message}`);
}
}
async function bootstrapConversationPage() {
await loadSessionList();
$conv("#new-session-btn").addEventListener("click", createNewSession);
$conv("#abort-btn").addEventListener("click", abortInflight);
$conv("#session-picker").addEventListener("change", (ev) => {
const sid = ev.target.value;
if (sid) loadAndAttachSession(sid);
});
$conv("#message-form").addEventListener("submit", (ev) => {
ev.preventDefault();
const input = $conv("#message-input");
sendMessage(input.value);
});
// v0.4 B5: track IME composition state. Korean/Japanese/Chinese IMEs use
// Enter to commit the current candidate; we must NOT treat that as send.
// In some browsers the Enter keydown that commits the candidate is delivered
// after compositionend, so that keydown has to be swallowed as well.
const input = $conv("#message-input");
input._composing = false;
input.addEventListener("compositionstart", () => { input._composing = true; });
input.addEventListener("compositionend", () => {
// The keydown event that ends composition is still pending — defer the
// flag flip one tick so the upcoming keydown still sees _composing=true.
setTimeout(() => { input._composing = false; }, 0);
});
input.addEventListener("keydown", (ev) => {
if (ev.key !== "Enter") return;
// Shift+Enter → newline (let the textarea handle it natively).
if (ev.shiftKey) return;
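// IME composition (Korean/Japanese/Chinese candidate commit) → never send.
// ev.isComposing and keyCode 229 cover browsers that flag the event itself,
// in addition to the flag tracked via compositionstart/compositionend.
if (input._composing || ev.isComposing || ev.keyCode === 229) return;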
// Plain Enter (and Cmd/Ctrl+Enter for backwards compat) → send.
ev.preventDefault();
sendMessage(ev.target.value);
});
// v0.3 PR #8: deep link `?session=<id>` auto-loads the named session.
const params = new URLSearchParams(window.location.search);
const deepSid = params.get("session");
if (deepSid) {
const picker = $conv("#session-picker");
if (picker) picker.value = deepSid;
loadAndAttachSession(deepSid);
}
}
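Review note: the Enter-key rules wired up in `bootstrapConversationPage` (Shift+Enter for newline, IME commit ignored, everything else sends) can be stated as a small decision function. This is a sketch under assumptions: `enterKeyAction` is a hypothetical name, and the `isComposing` / `keyCode === 229` checks are an extra safeguard on top of the tracked composition flag.

```javascript
// Hypothetical pure decision function (not in app.js).
// `ev` needs only the fields read below; `composing` is the tracked IME flag.
function enterKeyAction(ev, composing) {
  if (ev.key !== "Enter") return "ignore";
  if (ev.shiftKey) return "newline"; // let the textarea insert the line break
  // Browsers flag IME keystrokes on the event via isComposing or keyCode 229.
  if (composing || ev.isComposing || ev.keyCode === 229) return "ignore";
  return "send"; // plain Enter, and Cmd/Ctrl+Enter
}

console.log(enterKeyAction({ key: "Enter", shiftKey: false }, false));
```

In the real handler, "send" corresponds to `ev.preventDefault()` followed by `sendMessage(...)`, while the other two outcomes return without touching the event.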
// =============== sessions list (index.html as of v0.3 PR #8) ===============
async function renderSessionsList() {
setError("");
let sessions;
try {
sessions = await jsonFetch("/sessions?limit=50");
} catch (e) {
setError(`세션 목록을 불러오지 못했습니다: ${e.message}`);
return;
}
const tbody = $("#sessions tbody");
if (!tbody) return;
tbody.replaceChildren();
if (sessions.length === 0) {
tbody.appendChild(
emptyCell(5, "아직 대화한 세션이 없습니다.", "/conversation.html", "새 대화 시작 →")
);
return;
}
for (const s of sessions) {
const tr = document.createElement("tr");
const idTd = document.createElement("td");
const idLink = document.createElement("a");
idLink.href = `/conversation.html?session=${encodeURIComponent(s.id)}`;
idLink.className = "mono";
idLink.textContent = s.id.slice(0, 8) + "…";
idTd.appendChild(idLink);
const stateTd = document.createElement("td");
stateTd.appendChild(badge(s.state));
const titleTd = document.createElement("td");
titleTd.textContent = s.title || "(no title yet)";
const personaTd = document.createElement("td");
personaTd.className = "mono";
// SessionSummary exposes `persona_id` (UUID) — show first 8 chars + tooltip.
if (s.persona_id) {
personaTd.textContent = s.persona_id.slice(0, 8) + "…";
personaTd.title = s.persona_id;
} else {
personaTd.textContent = "—";
}
const lastTd = document.createElement("td");
lastTd.className = "mono";
lastTd.textContent = (s.last_message_at || s.started_at || "").slice(0, 19).replace("T", " ");
tr.append(idTd, stateTd, titleTd, personaTd, lastTd);
tbody.appendChild(tr);
}
}
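Review note: the id and timestamp slicing in `renderSessionsList` (and in `loadSessionList`) could be shared through two small helpers. A minimal sketch, assuming timestamps are ISO-8601 strings as the `slice(0, 19)` in the code implies; `shortId` and `formatTimestamp` are hypothetical names.

```javascript
// Hypothetical helpers (not in app.js) for the table cells above.
function shortId(id, length = 8) {
  // First `length` characters plus an ellipsis; full id belongs in a tooltip.
  return id.slice(0, length) + "…";
}

function formatTimestamp(iso) {
  // "2026-05-18T02:02:19+09:00" becomes "2026-05-18 02:02:19"; the offset is dropped.
  return (iso || "").slice(0, 19).replace("T", " ");
}

console.log(shortId("0123456789abcdef"), formatTimestamp("2026-05-18T02:02:19+09:00"));
```

Note that this drops the UTC offset rather than converting to local time, which matches what the page currently does.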
// =============== new-workflow.html (v0.4 generator) ===============
const _CAPABILITIES = [
"spec_write", "code_review", "evidence_check", "log_analysis", "decision",
"command_execute", "security_audit", "code_edit", "plan", "verify",
];
const _BACKENDS = ["openrouter", "anthropic", "ollama_local"];
const _RISKS = ["low", "medium", "high"];
const WF_STATE = {
roles: /** @type {Array<{id:string,capabilities:string[],backends:string[],fallbacks:string[]}>} */ ([]),
phases: /** @type {Array<{key:string,title:string,risk:string,role:string,instructions:string,artifactPath:string,artifactSchema:string,gates:string,timeout:string,budget:string}>} */ ([]),
};
function _wfFreshRole() {
return { id: "", capabilities: [], backends: [], fallbacks: [] };
}
function _wfFreshPhase() {
return {
key: "", title: "", risk: "medium", role: "",
instructions: "", artifactPath: "", artifactSchema: "",
gates: "", timeout: "", budget: "",
};
}
function _wfChip(label, checked, onChange) {
const lbl = document.createElement("label");
lbl.className = "wf-chip";
const cb = document.createElement("input");
cb.type = "checkbox";
cb.checked = checked;
cb.addEventListener("change", () => onChange(cb.checked));
const span = document.createElement("span");
span.textContent = label;
lbl.appendChild(cb);
lbl.appendChild(span);
return lbl;
}
function _wfTextInput(value, placeholder, onChange, type = "text") {
const i = document.createElement("input");
i.type = type;
i.value = value;
i.placeholder = placeholder;
i.addEventListener("input", () => onChange(i.value));
return i;
}
function _wfTextArea(value, placeholder, onChange, rows = 3) {
const t = document.createElement("textarea");
t.value = value;
t.placeholder = placeholder;
t.rows = rows;
t.addEventListener("input", () => onChange(t.value));
return t;
}
function _wfSelect(value, options, onChange) {
const s = document.createElement("select");
for (const o of options) {
const opt = document.createElement("option");
opt.value = o;
opt.textContent = o;
if (o === value) opt.selected = true;
s.appendChild(opt);
}
s.addEventListener("change", () => onChange(s.value));
return s;
}
function renderRolesList() {
const container = $("#roles-list");
if (!container) return;
container.replaceChildren();
WF_STATE.roles.forEach((role, idx) => {
const card = document.createElement("div");
card.className = "wf-row-card";
const header = document.createElement("div");
header.className = "wf-row-header";
const title = document.createElement("strong");
title.textContent = `Role #${idx + 1}`;
const del = document.createElement("button");
del.type = "button";
del.className = "button-link";
del.textContent = "삭제";
// Phases reference role ids, so their role dropdowns must be rebuilt too.
del.addEventListener("click", () => { WF_STATE.roles.splice(idx, 1); renderRolesList(); renderPhasesList(); renderPreview(); });
header.append(title, del);
card.appendChild(header);
const idRow = document.createElement("div");
idRow.className = "form-row";
const idLbl = document.createElement("label");
idLbl.innerHTML = "id <span class='hint'>— phase 가 참조할 키. <code>writer</code> 같은 소문자/숫자/언더스코어</span>";
// Re-render phases on every id edit so their role dropdowns never go stale.
idRow.append(idLbl, _wfTextInput(role.id, "writer", (v) => { role.id = v; renderPhasesList(); renderPreview(); }));
card.appendChild(idRow);
const capRow = document.createElement("div");
capRow.className = "form-row";
const capLbl = document.createElement("label");
capLbl.innerHTML = "required_capabilities <span class='hint'>— persona 가 가져야 할 능력 (최소 1)</span>";
const chips = document.createElement("div");
chips.className = "chips";
for (const c of _CAPABILITIES) {
chips.appendChild(_wfChip(c, role.capabilities.includes(c), (on) => {
if (on && !role.capabilities.includes(c)) role.capabilities.push(c);
else if (!on) role.capabilities = role.capabilities.filter((x) => x !== c);
renderPreview();
}));
}
capRow.append(capLbl, chips);
card.appendChild(capRow);
container.appendChild(card);
});
if (WF_STATE.roles.length === 0) {
const empty = document.createElement("div");
empty.className = "hint";
empty.textContent = "Role 이 1개 이상 필요합니다.";
container.appendChild(empty);
}
}
function renderPhasesList() {
const container = $("#phases-list");
if (!container) return;
container.replaceChildren();
const roleIds = WF_STATE.roles.map((r) => r.id).filter(Boolean);
WF_STATE.phases.forEach((phase, idx) => {
const card = document.createElement("div");
card.className = "wf-row-card";
const header = document.createElement("div");
header.className = "wf-row-header";
const title = document.createElement("strong");
title.textContent = `Phase #${idx + 1}`;
const del = document.createElement("button");
del.type = "button";
del.className = "button-link";
del.textContent = "삭제";
del.addEventListener("click", () => { WF_STATE.phases.splice(idx, 1); renderPhasesList(); renderPreview(); });
header.append(title, del);
card.appendChild(header);
const grid = document.createElement("div");
grid.className = "form-grid";
for (const [label, key, ph] of [
["key — 영문 소문자/숫자/언더스코어", "key", "spec"],
["title — 표시용 한 줄", "title", "명세 작성"],
]) {
const r = document.createElement("div");
r.className = "form-row";
const l = document.createElement("label");
l.textContent = label;
r.append(l, _wfTextInput(phase[key], ph, (v) => { phase[key] = v; renderPreview(); }));
grid.appendChild(r);
}
card.appendChild(grid);
const grid2 = document.createElement("div");
grid2.className = "form-grid";
const riskRow = document.createElement("div");
riskRow.className = "form-row";
const riskLbl = document.createElement("label");
riskLbl.innerHTML = "risk <span class='hint'>— 단계 위험 등급</span>";
riskRow.append(riskLbl, _wfSelect(phase.risk, _RISKS, (v) => { phase.risk = v; renderPreview(); }));
grid2.appendChild(riskRow);
const roleRow = document.createElement("div");
roleRow.className = "form-row";
const roleLbl = document.createElement("label");
roleLbl.innerHTML = "role <span class='hint'>— 위에서 정의한 role id 중 하나</span>";
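const opts = roleIds.length > 0 ? roleIds : ["(role 을 먼저 정의)"];
// Keep state in sync with what the <select> shows: a fresh phase has role ""
// and a single-option dropdown never fires "change", so adopt the first role.
if (roleIds.length > 0 && !roleIds.includes(phase.role)) phase.role = roleIds[0];
roleRow.append(roleLbl, _wfSelect(phase.role, opts, (v) => { phase.role = v; renderPreview(); }));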
grid2.appendChild(roleRow);
card.appendChild(grid2);
const insRow = document.createElement("div");
insRow.className = "form-row";
const insLbl = document.createElement("label");
insLbl.innerHTML = "instructions <span class='hint'>— 최소 10자. 이 phase 가 무엇을 해야 하는지</span>";
insRow.append(insLbl, _wfTextArea(phase.instructions,
"예: requirements.md 를 읽고 spec.md 를 작성하세요. 한국어 권장.",
(v) => { phase.instructions = v; renderPreview(); }, 4));
card.appendChild(insRow);
const grid3 = document.createElement("div");
grid3.className = "form-grid";
for (const [label, key, ph] of [
["expected_artifact.path (선택)", "artifactPath", "artifacts/spec.md"],
["expected_artifact.schema (선택)", "artifactSchema", "spec-v1"],
]) {
const r = document.createElement("div");
r.className = "form-row";
const l = document.createElement("label");
l.textContent = label;
r.append(l, _wfTextInput(phase[key], ph, (v) => { phase[key] = v; renderPreview(); }));
grid3.appendChild(r);
}
card.appendChild(grid3);
container.appendChild(card);
});
if (WF_STATE.phases.length === 0) {
const empty = document.createElement("div");
empty.className = "hint";
empty.textContent = "Phase 가 1개 이상 필요합니다.";
container.appendChild(empty);
}
}
function _wfBuildRequest() {
const name = $("#wf-name").value.trim();
const version = parseInt($("#wf-version").value, 10);
const description = $("#wf-description").value.trim();
const roles = WF_STATE.roles.filter((r) => r.id).map((r) => ({
id: r.id,
required_capabilities: r.capabilities,
preferred_backends: r.backends,
fallback_personas: r.fallbacks,
}));
const phases = WF_STATE.phases.filter((p) => p.key).map((p) => {
const out = {
key: p.key,
title: p.title || p.key,
risk: p.risk,
role: p.role,
instructions: p.instructions || "(no instructions)",
gates: [],
};
if (p.artifactPath || p.artifactSchema) {
out.expected_artifact = {
path: p.artifactPath || "artifacts/output.md",
schema: p.artifactSchema || "text",
};
}
return out;
});
const req = { name, version: Number.isNaN(version) ? 1 : version, roles, phases, default_gates: [] };
if (description) req.description = description;
return req;
}
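Review note: `_wfBuildRequest` forwards whatever the form holds, so the constraints shown in the form hints are only checked by the server. Below is a sketch of a client-side pre-check built from those hints (name: lowercase/digits/hyphen, ids and keys: lowercase/digits/underscore, at least one capability, instructions of at least 10 characters). The function `validateWorkflowRequest` is hypothetical, and whether the backend enforces exactly these rules is an assumption.

```javascript
// Hypothetical pre-submit check (not in app.js); rules taken from the form hints.
function validateWorkflowRequest(req) {
  const errors = [];
  if (!/^[a-z0-9-]+$/.test(req.name)) errors.push("name: lowercase letters, digits, hyphens only");
  if (!Number.isInteger(req.version) || req.version < 1) errors.push("version: integer >= 1");
  if (req.roles.length === 0) errors.push("roles: at least one role is required");
  const roleIds = new Set(req.roles.map((r) => r.id));
  for (const r of req.roles) {
    if (!/^[a-z0-9_]+$/.test(r.id)) errors.push(`role "${r.id}": lowercase letters, digits, underscores only`);
    if (r.required_capabilities.length === 0) errors.push(`role "${r.id}": at least one capability`);
  }
  if (req.phases.length === 0) errors.push("phases: at least one phase is required");
  for (const p of req.phases) {
    if (!/^[a-z0-9_]+$/.test(p.key)) errors.push(`phase "${p.key}": lowercase letters, digits, underscores only`);
    if (!roleIds.has(p.role)) errors.push(`phase "${p.key}": unknown role "${p.role}"`);
    if (p.instructions.trim().length < 10) errors.push(`phase "${p.key}": instructions need at least 10 characters`);
  }
  return errors; // empty array means the request passed every check
}

const sampleRequest = {
  name: "spec-and-review",
  version: 1,
  roles: [{ id: "writer", required_capabilities: ["spec_write"], preferred_backends: [], fallback_personas: [] }],
  phases: [{ key: "spec", title: "Spec", risk: "medium", role: "writer", instructions: "Read requirements.md and write spec.md.", gates: [] }],
  default_gates: [],
};
console.log(validateWorkflowRequest(sampleRequest));
```

`submitWorkflow` could call this before the POST and pass the joined messages to `setError`, so obvious mistakes never need a round trip.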
function renderPreview() {
const pre = $("#wf-preview");
if (!pre) return;
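// Note: this renders the JSON request body sent to POST /workflows, not the
// YAML file itself, even though new-workflow.html labels the box as YAML.
pre.textContent = JSON.stringify(_wfBuildRequest(), null, 2);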
}
async function submitWorkflow(ev) {
ev.preventDefault();
setError("");
$("#success").style.display = "none";
const req = _wfBuildRequest();
try {
const ack = await jsonFetch("/workflows", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(req),
});
const okBox = $("#success");
okBox.textContent = `✅ 저장 완료 → ${ack.path}. 워크플로우 실행 페이지에서 바로 보입니다.`;
okBox.style.display = "block";
} catch (e) {
setError(`저장 실패: ${e.message}`);
}
}
function bootstrapWorkflowGenerator() {
WF_STATE.roles = [_wfFreshRole()];
WF_STATE.phases = [_wfFreshPhase()];
renderRolesList();
renderPhasesList();
renderPreview();
$("#add-role").addEventListener("click", () => { WF_STATE.roles.push(_wfFreshRole()); renderRolesList(); renderPreview(); });
$("#add-phase").addEventListener("click", () => { WF_STATE.phases.push(_wfFreshPhase()); renderPhasesList(); renderPreview(); });
$("#wf-name").addEventListener("input", renderPreview);
$("#wf-version").addEventListener("input", renderPreview);
$("#wf-description").addEventListener("input", renderPreview);
$("#wf-form").addEventListener("submit", submitWorkflow);
}
// =============== bootstrap ===============
document.addEventListener("DOMContentLoaded", () => {
const page = document.body.dataset.page;
if (page === "index") {
renderSessionsList();
} else if (page === "runs") {
renderRunsList();
renderBudgetSummary();
} else if (page === "new") {
@@ -430,5 +1345,9 @@ document.addEventListener("DOMContentLoaded", () => {
renderRunDetail();
$("#abort-btn").addEventListener("click", abortRun);
$("#resume-btn").addEventListener("click", resumeRun);
} else if (page === "conversation") {
bootstrapConversationPage();
} else if (page === "new-workflow") {
bootstrapWorkflowGenerator();
}
});

@@ -0,0 +1,53 @@
<!doctype html>
<html lang="ko">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<title>my-deepagent · 대화</title>
<link rel="stylesheet" href="/static/style.css" />
</head>
<body data-page="conversation">
<header>
<h1><a href="/">my-deepagent</a></h1>
<nav>
<a href="/" class="nav-primary">세션 목록</a>
<a href="/conversation.html" class="active nav-primary">대화</a>
<a href="/runs.html" class="nav-secondary">Runs</a>
<a href="/new.html" class="nav-secondary">워크플로우 실행</a>
</nav>
</header>
<main class="conversation-main">
<div id="error" class="error-banner" style="display:none"></div>
<!-- Top bar: session picker + new conversation button -->
<div class="conv-topbar">
<label for="session-picker" class="conv-label">세션</label>
<select id="session-picker" class="conv-picker">
<option value="">(세션 선택…)</option>
</select>
<button id="new-session-btn" type="button" class="conv-action-btn">새 대화</button>
<span class="conv-model-pill" id="session-model-pill" title="이 세션의 활성 모델"></span>
<span class="conv-session-state" id="session-state-pill"></span>
</div>
<!-- Message thread -->
<div id="messages" class="messages-thread">
<div class="conv-empty" id="conv-empty">대화를 시작하려면 위에서 세션을 선택하거나 "새 대화"를 누르세요.</div>
</div>
<!-- Input bar -->
<form id="message-form" class="conv-input-bar">
<textarea
id="message-input"
rows="2"
placeholder="메시지를 입력하세요… (Enter 전송, Shift+Enter 줄바꿈)"
autocomplete="off"
disabled
></textarea>
<button id="send-btn" type="submit" disabled>전송</button>
<button id="abort-btn" type="button" disabled style="display:none">⏹ 중단</button>
</form>
</main>
<script src="/static/app.js"></script>
</body>
</html>

@@ -3,43 +3,51 @@
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<title>my-deepagent · runs</title>
<title>my-deepagent · 대화</title>
<link rel="stylesheet" href="/static/style.css" />
</head>
<body data-page="index">
<header>
<h1><a href="/">my-deepagent</a></h1>
<nav>
<a href="/" class="active">Runs</a>
<a href="/new.html">Run</a>
<a href="/" class="active nav-primary">대화</a>
<a href="/runs.html" class="nav-secondary">Runs</a>
<a href="/new.html" class="nav-secondary">워크플로우 실행</a>
<a href="/new-workflow.html" class="nav-secondary">+ 템플릿 만들기</a>
</nav>
</header>
<main>
<div id="error" class="error-banner" style="display:none"></div>
<div class="page-title">
<h2>최근 Runs</h2>
<span class="page-subtitle"> 50개</span>
<h2>최근 대화 세션</h2>
<span class="page-subtitle"> 50개 · 빈 화면이면 아래 "새 대화"를 누르세요</span>
</div>
<div class="info-box">
<strong>👋 my-deepagent</strong> — OpenRouter 가성비 모델로 돌아가는 Claude Code 스타일 멀티턴 에이전트.
대부분의 경우 아래 <strong>"새 대화 시작"</strong>만 누르면 됩니다.
<a href="/new.html">여러 단계 자동화</a>가 필요하면 워크플로우, <a href="/new-workflow.html">템플릿 직접 만들기</a>도 가능.
</div>
<div class="action-bar" style="margin-bottom: 12px;">
<a class="button primary" href="/conversation.html">▶︎ 새 대화 시작</a>
</div>
<div class="card">
<table id="runs">
<table id="sessions">
<thead>
<tr>
<th style="width: 22%">Run</th>
<th style="width: 13%">State</th>
<th>Repo</th>
<th style="width: 12%">Branch</th>
<th style="width: 16%">Created</th>
<th style="width: 16%">Ended</th>
<th style="width: 16%">Session</th>
<th style="width: 12%">State</th>
<th>Title / preview</th>
<th style="width: 12%">Persona</th>
<th style="width: 18%">Last activity</th>
</tr>
</thead>
<tbody></tbody>
</table>
</div>
<h2 class="section-title">예산 (현재)</h2>
<div id="budget-summary" class="budget-grid"></div>
</main>
<script src="/static/app.js"></script>
</body>

@@ -0,0 +1,99 @@
<!doctype html>
<html lang="ko">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<title>my-deepagent · 워크플로우 템플릿 만들기</title>
<link rel="stylesheet" href="/static/style.css" />
</head>
<body data-page="new-workflow">
<header>
<h1><a href="/">my-deepagent</a></h1>
<nav>
<a href="/" class="nav-primary">대화</a>
<a href="/runs.html" class="nav-secondary">Runs</a>
<a href="/new.html" class="nav-secondary">워크플로우 실행</a>
<a href="/new-workflow.html" class="active nav-secondary">+ 템플릿 만들기</a>
</nav>
</header>
<main>
<div id="error" class="error-banner" style="display:none"></div>
<div id="success" class="info-box" style="display:none"></div>
<div class="page-title">
<h2>워크플로우 템플릿 만들기</h2>
<span class="page-subtitle">phase 시퀀스 + role 정의 → YAML 저장</span>
</div>
<div class="info-box">
<strong>📘 워크플로우 = phase 시퀀스</strong><br />
예: <code>"명세 작성" → "리뷰" → "검증"</code> 처럼 단계별로 어떤 role(역할)이 어떤
산출물을 만들지 정의하는 파일입니다. 저장 후엔 <a href="/new.html">워크플로우 실행</a>
페이지의 드롭다운에 자동으로 등장합니다 (서버 재시작 불필요).
</div>
<form id="wf-form" autocomplete="off">
<!-- 기본 메타 -->
<div class="card" style="padding: 20px;">
<h3 class="section-title" style="margin-top:0">기본 정보</h3>
<div class="form-grid">
<div class="form-row">
<label for="wf-name">
name
<span class="hint">— 영문 소문자/숫자/하이픈만. 예: <code>spec-and-review</code></span>
</label>
<input id="wf-name" type="text" required placeholder="my-workflow" />
</div>
<div class="form-row">
<label for="wf-version">
version
<span class="hint">— 정수, 1부터</span>
</label>
<input id="wf-version" type="number" required value="1" min="1" />
</div>
</div>
<div class="form-row">
<label for="wf-description">
description
<span class="hint">— 한 줄 설명 (선택)</span>
</label>
<input id="wf-description" type="text" placeholder="이 워크플로우가 무엇을 하는지" />
</div>
</div>
<!-- Roles -->
<div class="card" style="padding: 20px; margin-top: 16px;">
<h3 class="section-title" style="margin-top:0">
Roles <span class="hint" style="font-weight:400">— phase 가 참조할 역할 정의</span>
</h3>
<div id="roles-list"></div>
<button type="button" id="add-role" class="button">+ Role 추가</button>
</div>
<!-- Phases -->
<div class="card" style="padding: 20px; margin-top: 16px;">
<h3 class="section-title" style="margin-top:0">
Phases <span class="hint" style="font-weight:400">— 실제 실행되는 단계 순서</span>
</h3>
<div id="phases-list"></div>
<button type="button" id="add-phase" class="button">+ Phase 추가</button>
</div>
<!-- Preview -->
<details class="card" style="padding: 16px; margin-top: 16px;">
<summary style="cursor:pointer; font-weight:600;">
YAML 미리보기 <span class="hint" style="font-weight:400">— 저장될 파일 내용</span>
</summary>
<pre id="wf-preview" class="mono" style="margin-top:12px; white-space:pre-wrap; font-size:12.5px;"></pre>
</details>
<div class="action-bar">
<button type="submit" class="primary">💾 저장 + 등록</button>
<a class="button" href="/">취소</a>
</div>
</form>
</main>
<script src="/static/app.js"></script>
</body>
</html>

@@ -3,55 +3,85 @@
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<title>my-deepagent · 새 Run</title>
<title>my-deepagent · 워크플로우 실행</title>
<link rel="stylesheet" href="/static/style.css" />
</head>
<body data-page="new">
<header>
<h1><a href="/">my-deepagent</a></h1>
<nav>
<a href="/">Runs</a>
<a href="/new.html" class="active">Run</a>
<a href="/" class="nav-primary">대화</a>
<a href="/runs.html" class="nav-secondary">Runs</a>
<a href="/new.html" class="active nav-secondary">워크플로우 실행</a>
<a href="/new-workflow.html" class="nav-secondary">+ 템플릿 만들기</a>
</nav>
</header>
<main>
<div id="error" class="error-banner" style="display:none"></div>
<div class="page-title">
<h2>새 Run 시작</h2>
<span class="page-subtitle">워크플로우 + repo + 요구사항</span>
<h2>워크플로우 실행 <span class="hint" style="font-size: 12px; vertical-align: middle;">(고급 기능)</span></h2>
<span class="page-subtitle">사전 정의된 phase 시퀀스로 자동화된 작업 실행</span>
</div>
<div class="info-box">
<strong>💡 자유 대화는 여기가 아닙니다.</strong>
그냥 챗봇처럼 쓰고 싶다면 <a href="/">메인 페이지의 "새 대화 시작"</a>을 눌러주세요.
이 페이지는 <strong>여러 단계 (예: 명세 → 리뷰 → 검증)</strong> 가 정해진 순서로 자동 실행되는 워크플로우를 시작할 때 씁니다.
<br /><br />
<strong>새 템플릿을 직접 만들고 싶다면</strong> 우상단 <a href="/new-workflow.html">+ 템플릿 만들기</a>로 가세요.
</div>
<form id="start-form" autocomplete="off">
<div class="card" style="padding: 20px;">
<div class="form-row">
<label for="template">워크플로우 템플릿</label>
<label for="template">
워크플로우 템플릿
<span class="hint">— 무슨 단계를 어떤 순서로 돌릴지 정의한 YAML. 모르면 첫 번째 선택.</span>
</label>
<select id="template" required></select>
</div>
<div class="form-grid">
<div class="form-row">
<label for="repo-path">repo 절대경로</label>
<label for="repo-path">
repo 절대경로
<span class="hint">— 작업할 git 저장소 위치 (예: /Users/me/projects/my-thing)</span>
</label>
<input id="repo-path" type="text" placeholder="/Users/me/projects/my-thing" required />
</div>
<div class="form-row">
<label for="base-branch">base branch</label>
<label for="base-branch">
base branch
<span class="hint">— 작업의 시작점 (보통 main)</span>
</label>
<input id="base-branch" type="text" value="main" />
</div>
</div>
<div class="form-row">
<label for="requirements">requirements <span class="hint">— 자유 텍스트, 마크다운 OK</span></label>
<textarea id="requirements" rows="6" placeholder="이 workflow가 다룰 요구사항을 적어주세요."></textarea>
<label for="requirements">
requirements
<span class="hint">— 이 워크플로우가 다룰 요구사항. 자유 텍스트, 마크다운 OK</span>
</label>
<textarea id="requirements" rows="6" placeholder="예: wordcount CLI를 만들어줘. python으로, pytest 테스트 포함."></textarea>
</div>
</div>
<h2 class="section-title">Persona 오버라이드 <span class="hint" style="text-transform: none; letter-spacing: 0; font-weight: 400;">(선택, 비우면 자동 선택)</span></h2>
<div id="override-fields" class="card"></div>
<details class="card" style="margin-top: 16px; padding: 16px;">
<summary style="cursor: pointer; font-weight: 600;">
Persona 오버라이드 <span class="hint" style="font-weight: 400;">— 비우면 자동 선택 (고급)</span>
</summary>
<p class="hint" style="margin-top: 12px; font-weight: 400;">
각 단계(role)에 어떤 persona(AI 모델 + 시스템 프롬프트)를 쓸지 직접 고르고 싶을 때만 채우세요.
비워두면 capability 매칭으로 자동 선택됩니다.
</p>
<div id="override-fields"></div>
</details>
<div class="action-bar">
<button type="submit" class="primary">▶︎ 시작</button>
<button type="submit" class="primary">▶︎ 워크플로우 실행</button>
<a class="button" href="/">취소</a>
</div>
</form>


@@ -10,8 +10,10 @@
<header>
<h1><a href="/">my-deepagent</a></h1>
<nav>
<a href="/">Runs</a>
<a href="/new.html">Run</a>
<a href="/" class="nav-primary">대화</a>
<a href="/runs.html" class="nav-secondary">Runs</a>
<a href="/new.html" class="nav-secondary">워크플로우 실행</a>
<a href="/new-workflow.html" class="nav-secondary">+ 템플릿 만들기</a>
</nav>
</header>
<main>


@@ -0,0 +1,48 @@
<!doctype html>
<html lang="ko">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<title>my-deepagent · workflow runs (archive)</title>
<link rel="stylesheet" href="/static/style.css" />
</head>
<body data-page="runs">
<header>
<h1><a href="/">my-deepagent</a></h1>
<nav>
<a href="/" class="nav-primary">대화</a>
<a href="/runs.html" class="active nav-secondary">Runs</a>
<a href="/new.html" class="nav-secondary">워크플로우 실행</a>
<a href="/new-workflow.html" class="nav-secondary">+ 템플릿 만들기</a>
</nav>
</header>
<main>
<div id="error" class="error-banner" style="display:none"></div>
<div class="page-title">
<h2>Workflow Runs · archive</h2>
<span class="page-subtitle">차별화 워크플로우 엔진 결과 (최신 50개)</span>
</div>
<div class="card">
<table id="runs">
<thead>
<tr>
<th style="width: 22%">Run</th>
<th style="width: 13%">State</th>
<th>Repo</th>
<th style="width: 12%">Branch</th>
<th style="width: 16%">Created</th>
<th style="width: 16%">Ended</th>
</tr>
</thead>
<tbody></tbody>
</table>
</div>
<h2 class="section-title">예산 (현재)</h2>
<div id="budget-summary" class="budget-grid"></div>
</main>
<script src="/static/app.js"></script>
</body>
</html>


@@ -778,3 +778,446 @@ select {
.event-line { grid-template-columns: 1fr; gap: 2px; }
.chips { grid-template-columns: 1fr; gap: 6px; }
}
/* =================================================================
v0.3 PR #8 — Conversation page
================================================================= */
.conversation-main {
display: flex;
flex-direction: column;
min-height: calc(100vh - 80px);
padding-bottom: 0;
}
.conv-topbar {
display: flex;
align-items: center;
gap: 12px;
padding: 12px 16px;
background: var(--bg-card);
border: 1px solid var(--border);
border-radius: 8px;
margin-bottom: 12px;
flex-wrap: wrap;
}
.conv-label {
font-size: 13px;
color: var(--text-muted);
font-weight: 600;
}
.conv-picker {
flex: 1;
min-width: 240px;
padding: 6px 10px;
font-family: var(--font-mono);
font-size: 13px;
border: 1px solid var(--border);
border-radius: 6px;
background: var(--bg);
}
.conv-action-btn {
padding: 6px 14px;
font-size: 13px;
background: var(--accent);
color: white;
border: none;
border-radius: 6px;
cursor: pointer;
}
.conv-action-btn:hover { filter: brightness(1.08); }
.conv-session-state {
font-size: 11px;
padding: 2px 8px;
border-radius: 999px;
text-transform: lowercase;
letter-spacing: 0.04em;
}
.conv-session-state.state-active {
background: rgba(34,197,94,0.12);
color: rgb(22,163,74);
}
.conv-session-state.state-ended {
background: rgba(100,116,139,0.12);
color: rgb(71,85,105);
}
.messages-thread {
flex: 1;
overflow-y: auto;
padding: 16px;
border: 1px solid var(--border);
border-radius: 8px;
background: var(--bg);
margin-bottom: 12px;
display: flex;
flex-direction: column;
gap: 12px;
}
.conv-empty {
color: var(--text-muted);
text-align: center;
padding: 40px 16px;
font-size: 13px;
}
.msg-bubble {
max-width: 80%;
padding: 10px 14px;
border-radius: 12px;
font-size: 14px;
line-height: 1.5;
white-space: pre-wrap;
word-break: break-word;
}
.msg-bubble.role-user {
align-self: flex-end;
background: var(--accent);
color: white;
}
.msg-bubble.role-assistant {
align-self: flex-start;
background: var(--bg-card);
border: 1px solid var(--border);
}
.msg-bubble.role-system {
align-self: center;
max-width: 90%;
font-style: italic;
font-size: 12.5px;
background: rgba(245,158,11,0.08);
border: 1px dashed rgba(245,158,11,0.4);
color: rgb(120,53,15);
}
.msg-bubble.pending {
opacity: 0.6;
font-size: 20px;
padding: 6px 14px;
}
.msg-meta {
display: flex;
align-items: center;
gap: 8px;
font-size: 11px;
opacity: 0.6;
margin-bottom: 4px;
}
.msg-role {
font-weight: 700;
text-transform: uppercase;
letter-spacing: 0.05em;
}
.conv-input-bar {
display: flex;
gap: 8px;
padding: 12px;
background: var(--bg-card);
border: 1px solid var(--border);
border-radius: 8px;
}
.conv-input-bar textarea {
flex: 1;
font-family: var(--font-body);
font-size: 14px;
padding: 8px 10px;
border: 1px solid var(--border);
border-radius: 6px;
resize: vertical;
min-height: 44px;
}
.conv-input-bar textarea:disabled {
background: var(--bg);
opacity: 0.5;
}
.conv-input-bar button {
padding: 0 18px;
font-size: 13px;
background: var(--accent);
color: white;
border: none;
border-radius: 6px;
cursor: pointer;
}
.conv-input-bar button:disabled {
opacity: 0.4;
cursor: not-allowed;
}
/* =================================================================
v0.4 — nav tiers + info-box + empty-state polish
================================================================= */
nav .nav-primary {
font-weight: 600;
}
nav .nav-secondary {
font-size: 12.5px;
opacity: 0.65;
}
nav .nav-secondary:hover {
opacity: 1;
}
nav a.active.nav-primary,
nav a.active.nav-secondary {
opacity: 1;
}
.info-box {
background: rgba(245, 158, 11, 0.08);
border: 1px solid rgba(245, 158, 11, 0.3);
border-left: 4px solid rgb(245, 158, 11);
padding: 14px 18px;
border-radius: 8px;
margin-bottom: 20px;
font-size: 14px;
line-height: 1.65;
color: rgb(95, 50, 5);
}
.info-box strong {
color: rgb(75, 35, 0);
}
.info-box a {
color: rgb(180, 70, 30);
text-decoration: underline;
text-underline-offset: 2px;
}
/* details/summary polish */
details summary {
padding: 4px 0;
}
details[open] summary {
margin-bottom: 12px;
}
/* index empty state — prominent CTA */
.empty-cta {
text-align: center;
padding: 64px 20px;
}
.empty-cta-title {
font-size: 18px;
font-weight: 600;
margin-bottom: 8px;
color: var(--text);
}
.empty-cta-subtitle {
color: var(--text-muted);
font-size: 14px;
margin-bottom: 24px;
}
.empty-cta .button {
font-size: 15px;
padding: 12px 24px;
}
/* =================================================================
v0.4 — workflow generator UI
================================================================= */
.wf-row-card {
background: var(--bg);
border: 1px solid var(--border);
border-radius: 8px;
padding: 14px 16px;
margin-bottom: 12px;
}
.wf-row-header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 12px;
padding-bottom: 8px;
border-bottom: 1px dashed var(--border);
}
.button-link {
background: none;
border: none;
color: rgb(180, 70, 30);
cursor: pointer;
font-size: 12px;
text-decoration: underline;
padding: 2px 6px;
}
.wf-chip {
display: inline-flex;
align-items: center;
gap: 4px;
background: rgba(180, 70, 30, 0.06);
border: 1px solid rgba(180, 70, 30, 0.2);
border-radius: 999px;
padding: 3px 10px;
font-size: 12.5px;
cursor: pointer;
margin: 2px 4px 2px 0;
}
.wf-chip input {
margin: 0;
}
.wf-chip:has(input:checked) {
background: rgba(180, 70, 30, 0.18);
border-color: rgba(180, 70, 30, 0.5);
font-weight: 600;
}
/* =================================================================
v0.4 — Markdown + system event cards in conversation
================================================================= */
.msg-body .md-p {
margin: 0 0 8px 0;
line-height: 1.6;
}
.msg-body .md-p:last-child { margin-bottom: 0; }
.msg-body .md-h {
margin: 12px 0 6px 0;
font-weight: 700;
line-height: 1.3;
}
.msg-body .md-ul,
.msg-body .md-ol {
margin: 4px 0 8px 0;
padding-left: 22px;
line-height: 1.6;
}
.msg-body .md-ul li,
.msg-body .md-ol li {
margin: 2px 0;
}
.msg-body .md-code {
background: rgba(0, 0, 0, 0.04);
border: 1px solid var(--border);
border-radius: 6px;
padding: 10px 12px;
margin: 8px 0;
overflow-x: auto;
font-family: var(--font-mono);
font-size: 12.5px;
line-height: 1.45;
}
.msg-body .md-code code {
background: transparent;
padding: 0;
}
.msg-body code {
background: rgba(0, 0, 0, 0.06);
border-radius: 4px;
padding: 1px 5px;
font-family: var(--font-mono);
font-size: 0.9em;
}
.msg-bubble.role-user .msg-body .md-code,
.msg-bubble.role-user .msg-body code {
background: rgba(255, 255, 255, 0.18);
border-color: rgba(255, 255, 255, 0.3);
color: white;
}
.msg-body a {
color: rgb(180, 70, 30);
text-decoration: underline;
text-underline-offset: 2px;
}
.msg-bubble.role-user .msg-body a {
color: white;
}
.msg-body strong { font-weight: 700; }
.msg-body em { font-style: italic; }
/* System event card */
.msg-bubble.role-system-event {
align-self: stretch;
max-width: 100%;
background: rgba(245, 158, 11, 0.06);
border: 1px solid rgba(245, 158, 11, 0.25);
border-style: dashed;
font-style: normal;
color: var(--text);
}
.md-system-event summary {
cursor: pointer;
font-size: 12.5px;
display: flex;
align-items: center;
gap: 6px;
list-style: none;
}
.md-system-event summary::-webkit-details-marker { display: none; }
.md-system-event summary .event-icon {
font-size: 14px;
}
.md-system-event summary .event-label {
font-weight: 600;
letter-spacing: 0.02em;
color: rgb(120, 53, 15);
}
.md-system-event[open] summary {
margin-bottom: 8px;
border-bottom: 1px dashed rgba(245, 158, 11, 0.3);
padding-bottom: 6px;
}
.md-system-event .event-body {
font-size: 12.5px;
line-height: 1.55;
color: var(--text-muted);
}
.conv-model-pill {
font-family: var(--font-mono);
font-size: 11.5px;
padding: 2px 8px;
border-radius: 999px;
background: rgba(0, 0, 0, 0.06);
color: var(--text-muted);
letter-spacing: 0.01em;
}


@@ -33,12 +33,25 @@ async def app_client(tmp_path: Path) -> AsyncIterator[AsyncClient]:
@pytest.mark.asyncio
async def test_root_serves_index_html(app_client: AsyncClient) -> None:
"""`/` now renders the conversation-centric index (v0.3 PR #8 rewrite)."""
r = await app_client.get("/")
assert r.status_code == 200
assert r.headers["content-type"].startswith("text/html")
body = r.text
assert "<title>my-deepagent · runs</title>" in body
# Title became "대화"; data-page kept as "index" for back-compat.
assert 'data-page="index"' in body
assert "대화" in body
# Must NOT advertise itself as the Runs page anymore.
assert "my-deepagent · runs" not in body
@pytest.mark.asyncio
async def test_runs_html_served(app_client: AsyncClient) -> None:
"""`/runs.html` is the new home of the workflow runs archive."""
r = await app_client.get("/runs.html")
assert r.status_code == 200
assert 'data-page="runs"' in r.text
assert "Workflow Runs" in r.text
@pytest.mark.asyncio


@@ -256,6 +256,42 @@ async def test_get_remaining_unknown_scope_returns_none(db: Database) -> None:
assert remaining is None
# ---------------------------------------------------------------------------
# session: scope (v0.3 PR #6) — sub-agent rollup to root session
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_session_scope_accumulates_cost(db: Database) -> None:
    import uuid as _uuid

    tracker = _make_tracker(db, run_cap=2.0)
    session_id = _uuid.uuid4()
    await tracker.record(
        run_id=None, persona_name=None, actual_cost_usd=0.30, session_id=session_id
    )
    await tracker.record(
        run_id=None, persona_name=None, actual_cost_usd=0.20, session_id=session_id
    )
    spent = await tracker.get_spent(f"session:{session_id}")
    assert spent == pytest.approx(0.50)
    remaining = await tracker.get_remaining(f"session:{session_id}")
    assert remaining == pytest.approx(1.50)


@pytest.mark.asyncio
async def test_session_scope_omitted_when_no_session_id(db: Database) -> None:
    """Calls without ``session_id`` must NOT create a session: ledger row."""
    import uuid as _uuid

    tracker = _make_tracker(db)
    # Drive a record without session_id.
    await tracker.record(run_id=None, persona_name=None, actual_cost_usd=0.10)
    # Querying any session scope should yield 0 spent.
    sid = _uuid.uuid4()
    assert (await tracker.get_spent(f"session:{sid}")) == pytest.approx(0.0)
# ---------------------------------------------------------------------------
# helpers
# ---------------------------------------------------------------------------
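
The session-scope behaviour exercised by the two tests in this hunk can be pictured with a small in-memory ledger. This is only a sketch under stated assumptions: `ScopedLedger` and its methods are hypothetical stand-ins, not the project's tracker, and it assumes the session scope shares a single cap, as the `run_cap=2.0` test suggests.

```python
import uuid
from collections import defaultdict
from typing import Optional


class ScopedLedger:
    """Hypothetical in-memory stand-in for a per-scope cost tracker."""

    def __init__(self, cap_usd: float) -> None:
        self._cap_usd = cap_usd
        self._spent = defaultdict(float)

    def record(self, cost_usd: float, session_id: Optional[uuid.UUID] = None) -> None:
        # Only calls that carry a session_id touch a "session:<id>" scope.
        if session_id is not None:
            self._spent[f"session:{session_id}"] += cost_usd

    def get_spent(self, scope: str) -> float:
        # .get() avoids creating an entry for scopes that were never recorded.
        return self._spent.get(scope, 0.0)

    def get_remaining(self, scope: str) -> float:
        return self._cap_usd - self.get_spent(scope)
```

Two records of 0.30 and 0.20 under the same session id leave 0.50 spent and 1.50 remaining against a 2.0 cap, while a record without a session id leaves every session scope at zero.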


@@ -0,0 +1,242 @@
"""v0.3 PR #8 — Conversation Web GUI tests.

Covers:
1. GET /conversation.html serves the static file (200).
2. POST /api/sessions/{id}/messages still returns 200 + queues a background
   task (the agent_runner is stubbed so we never hit OpenRouter).
3. The background task persists an assistant MessageRow that the SSE stream
   then surfaces.
4. The background task is kept alive until it finishes: a strong asyncio.Task
   reference is held on app.state so the task cannot be garbage-collected
   mid-flight (the hazard that ruff's RUF006 rule flags).
"""
from __future__ import annotations
import asyncio
from collections.abc import AsyncIterator
from pathlib import Path
from typing import Any
import pytest
from fastapi import FastAPI
from httpx import ASGITransport, AsyncClient
from sqlalchemy import select
from my_deepagent.api.app import create_app
from my_deepagent.config import load_config
from my_deepagent.persistence.db import Database
from my_deepagent.persistence.models import InteractiveSessionRow, MessageRow
@pytest.fixture
async def app_client(
    tmp_path: Path,
) -> AsyncIterator[tuple[AsyncClient, Database, FastAPI]]:
    db_url = f"sqlite+aiosqlite:///{tmp_path / 'conv.sqlite3'}"
    cfg = load_config(
        workspace_root=tmp_path,
        data_dir=tmp_path / "data",
        database_url=db_url,
    )
    db = Database(db_url)
    await db.init_schema()
    await db.dispose()
    app = create_app(cfg)
    transport = ASGITransport(app=app)
    async with app.router.lifespan_context(app):
        # Tests get their own Database instance for direct row inspection.
        external_db = Database(db_url)
        async with AsyncClient(transport=transport, base_url="http://test", timeout=10.0) as client:
            yield (client, external_db, app)
        await external_db.dispose()
# ---------------------------------------------------------------------------
# Static file serving
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_conversation_page_served(
    app_client: tuple[AsyncClient, Database, FastAPI],
) -> None:
    client, _db, _app = app_client
    r = await client.get("/conversation.html")
    assert r.status_code == 200
    assert 'data-page="conversation"' in r.text
    assert "message-input" in r.text
# ---------------------------------------------------------------------------
# POST /messages still 200 + background task fires
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_post_message_returns_ack_and_persists_user_row(
    app_client: tuple[AsyncClient, Database, FastAPI], monkeypatch: pytest.MonkeyPatch
) -> None:
    client, db, _app = app_client
    invocations: list[tuple[str, str]] = []

    async def fake_invoke(
        _db: Any,
        _config: Any,
        _personas: Any,
        session_id: Any,
        user_message: str,
        *,
        saver: Any = None,
        chunk_queue: Any = None,
    ) -> None:
        invocations.append((str(session_id), user_message))

    monkeypatch.setattr("my_deepagent.api.routes.sessions.invoke_session_agent", fake_invoke)

    # Create a session.
    r = await client.post(
        "/api/sessions",
        json={"persona_name": "default-interactive", "repo_path": str(Path.cwd())},
    )
    assert r.status_code == 200
    sid = r.json()["session_id"]

    # POST a message.
    r2 = await client.post(f"/api/sessions/{sid}/messages", json={"content": "hello agent"})
    assert r2.status_code == 200
    assert r2.json()["state"] == "active"

    # User row persisted synchronously.
    async with db.session() as s:
        rows = (
            (
                await s.execute(
                    select(MessageRow).where(MessageRow.session_id == sid).order_by(MessageRow.seq)
                )
            )
            .scalars()
            .all()
        )
    assert len(rows) == 1
    assert rows[0].role == "user"
    assert rows[0].content == "hello agent"

    # Give the event loop one cycle so the background task can fire.
    await asyncio.sleep(0.05)
    assert invocations == [(sid, "hello agent")]
@pytest.mark.asyncio
async def test_post_message_holds_task_ref_on_app_state(
    app_client: tuple[AsyncClient, Database, FastAPI], monkeypatch: pytest.MonkeyPatch
) -> None:
    """Background task must be held on app.state.pending_invocations so the
    garbage collector cannot drop it before completion (the dangling-task
    hazard that ruff's RUF006 rule flags)."""
    client, _db, app = app_client
    started = asyncio.Event()
    can_finish = asyncio.Event()

    async def slow_invoke(*_a: Any, **_k: Any) -> None:
        started.set()
        await can_finish.wait()

    monkeypatch.setattr("my_deepagent.api.routes.sessions.invoke_session_agent", slow_invoke)

    r = await client.post(
        "/api/sessions",
        json={"persona_name": "default-interactive", "repo_path": str(Path.cwd())},
    )
    sid = r.json()["session_id"]
    await client.post(f"/api/sessions/{sid}/messages", json={"content": "x"})

    # Wait for the task to start.
    await asyncio.wait_for(started.wait(), timeout=2.0)

    # The pending_invocations set on the app should hold a reference.
    pending = app.state.pending_invocations
    assert len(pending) == 1

    # Release the task and let the discard callback fire.
    can_finish.set()
    await asyncio.sleep(0.05)
    assert len(app.state.pending_invocations) == 0
# ---------------------------------------------------------------------------
# End-to-end: assistant message materializes for SSE
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_background_invocation_persists_assistant_row(
    app_client: tuple[AsyncClient, Database, FastAPI], monkeypatch: pytest.MonkeyPatch
) -> None:
    """When the runner finishes, an assistant MessageRow should be visible."""
    client, db, _app = app_client

    async def fake_invoke(
        passed_db: Any,
        _config: Any,
        _personas: Any,
        session_id: Any,
        _user_message: str,
        *,
        saver: Any = None,
        chunk_queue: Any = None,
    ) -> None:
        # Simulate what the real runner does: write an assistant MessageRow.
        from datetime import UTC, datetime

        from sqlalchemy import desc

        async with passed_db.session() as s:
            last = (
                await s.execute(
                    select(MessageRow.seq)
                    .where(MessageRow.session_id == str(session_id))
                    .order_by(desc(MessageRow.seq))
                    .limit(1)
                )
            ).scalar_one_or_none() or 0
            s.add(
                MessageRow(
                    session_id=str(session_id),
                    seq=last + 1,
                    role="assistant",
                    content="(stubbed assistant reply)",
                    tool_calls=None,
                    token_count=5,
                    is_summary=False,
                    archived=False,
                    ts=datetime.now(UTC).isoformat(timespec="seconds"),
                )
            )
            await s.commit()

    monkeypatch.setattr("my_deepagent.api.routes.sessions.invoke_session_agent", fake_invoke)

    r = await client.post(
        "/api/sessions",
        json={"persona_name": "default-interactive", "repo_path": str(Path.cwd())},
    )
    sid = r.json()["session_id"]
    await client.post(f"/api/sessions/{sid}/messages", json={"content": "ping"})

    # Let the background task complete.
    await asyncio.sleep(0.1)

    # Verify the conversation now has both user + assistant rows.
    async with db.session() as s:
        rows = (
            (
                await s.execute(
                    select(MessageRow).where(MessageRow.session_id == sid).order_by(MessageRow.seq)
                )
            )
            .scalars()
            .all()
        )
        sess_row = await s.get(InteractiveSessionRow, sid)
    assert [r.role for r in rows] == ["user", "assistant"]
    assert rows[1].content == "(stubbed assistant reply)"
    assert sess_row is not None
    assert sess_row.title is not None  # set from first user message
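
The task-reference test in this file relies on a common asyncio pattern: keep a strong reference to each background task in a set and drop it from a done callback. A minimal sketch of that pattern follows. The local `pending` set is a hypothetical stand-in for `app.state.pending_invocations`, and the route's real wiring is not shown in this diff, so treat the details as assumptions.

```python
import asyncio


async def main() -> tuple:
    # Hypothetical stand-in for app.state.pending_invocations.
    pending = set()
    can_finish = asyncio.Event()

    async def slow_job() -> None:
        await can_finish.wait()

    task = asyncio.create_task(slow_job())
    pending.add(task)  # strong reference keeps the task alive
    task.add_done_callback(pending.discard)  # release the reference on completion

    await asyncio.sleep(0)  # give the task a chance to start
    while_running = len(pending)

    can_finish.set()
    await task
    await asyncio.sleep(0)  # let any remaining done callbacks run
    after_done = len(pending)
    return while_running, after_done
```

The event loop itself only keeps weak references to tasks, which is why a fire-and-forget `create_task` call without a stored reference can disappear before it finishes.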


@@ -0,0 +1,192 @@
"""v0.3 PR #7 — MYDEEPAGENT.md instruction-file hierarchy tests.
Covers:
1. Global file is bootstrapped with template on first call (idempotent).
2. Project file is NEVER auto-created — present iff user wrote it.
3. `resolve_instruction_paths` orders global → project.
4. Resolution is empty if global hasn't been bootstrapped yet.
5. `build_agent` passes the combined list through to `deepagents.create_deep_agent(memory=...)`.
"""
from __future__ import annotations
from pathlib import Path
from typing import Any
import pytest
from my_deepagent.config import load_config
from my_deepagent.instructions import (
INSTRUCTION_FILENAME,
ensure_global_instructions_initialized,
global_instructions_path,
project_instructions_path,
resolve_instruction_paths,
)
# ---------------------------------------------------------------------------
# Bootstrap (global only)
# ---------------------------------------------------------------------------
def test_ensure_global_instructions_creates_template(tmp_path: Path) -> None:
    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
    p = ensure_global_instructions_initialized(cfg)
    assert p.is_file()
    assert p.name == INSTRUCTION_FILENAME
    body = p.read_text(encoding="utf-8")
    assert "MYDEEPAGENT.md (global)" in body
    assert "한국어" in body  # template is Korean by default


def test_ensure_global_instructions_idempotent(tmp_path: Path) -> None:
    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
    p = ensure_global_instructions_initialized(cfg)
    p.write_text("custom content", encoding="utf-8")
    # Second call must not overwrite user-edited content.
    p2 = ensure_global_instructions_initialized(cfg)
    assert p2 == p
    assert p.read_text(encoding="utf-8") == "custom content"
# ---------------------------------------------------------------------------
# Project file behaviour
# ---------------------------------------------------------------------------
def test_project_instructions_never_auto_created(tmp_path: Path) -> None:
    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
    repo = tmp_path / "repo"
    repo.mkdir()
    # Bootstrap global: must not touch project file.
    ensure_global_instructions_initialized(cfg)
    assert not project_instructions_path(repo).exists()
# ---------------------------------------------------------------------------
# resolve_instruction_paths
# ---------------------------------------------------------------------------
def test_resolve_paths_includes_only_existing_files(tmp_path: Path) -> None:
    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
    repo = tmp_path / "repo"
    repo.mkdir()
    # No files exist → empty.
    assert resolve_instruction_paths(cfg, repo) == []
    # Only global.
    g = ensure_global_instructions_initialized(cfg)
    paths = resolve_instruction_paths(cfg, repo)
    assert paths == [str(g.resolve())]
    # Add project: order becomes global, project.
    proj_file = project_instructions_path(repo)
    proj_file.write_text("# project-specific", encoding="utf-8")
    paths = resolve_instruction_paths(cfg, repo)
    assert paths == [str(g.resolve()), str(proj_file.resolve())]


def test_global_instructions_path_under_data_dir(tmp_path: Path) -> None:
    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
    p = global_instructions_path(cfg)
    assert p.parent == cfg.data_dir
    assert p.name == INSTRUCTION_FILENAME


def test_governance_bootstrap_creates_full_skeleton(tmp_path: Path) -> None:
    """`bootstrap_user_dirs` materialises the user-wide layout (PR #7)."""
    from my_deepagent.governance import bootstrap_user_dirs
    from my_deepagent.memory import INDEX_FILENAME as MEMORY_INDEX_FILENAME

    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
    bootstrap_user_dirs(cfg)
    # Global MYDEEPAGENT.md created with template.
    assert global_instructions_path(cfg).is_file()
    # Global memory dir + MEMORY.md created.
    global_mem = Path(cfg.data_dir) / "global" / "memory"
    assert global_mem.is_dir()
    assert (global_mem / MEMORY_INDEX_FILENAME).is_file()
    # User skills dir created.
    assert (Path(cfg.data_dir) / "skills").is_dir()
    # Projects parent dir created.
    assert (Path(cfg.data_dir) / "projects").is_dir()


def test_governance_bootstrap_is_idempotent(tmp_path: Path) -> None:
    from my_deepagent.governance import bootstrap_user_dirs

    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
    bootstrap_user_dirs(cfg)
    gpath = global_instructions_path(cfg)
    gpath.write_text("custom edited content", encoding="utf-8")
    # Second call must not overwrite user edits.
    bootstrap_user_dirs(cfg)
    assert gpath.read_text(encoding="utf-8") == "custom edited content"
# ---------------------------------------------------------------------------
# Integration: instruction paths reach deepagents memory= kwarg
# ---------------------------------------------------------------------------
def test_build_agent_receives_combined_instruction_and_memory_paths(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    """`build_agent(memory_paths_override=[instructions..., memory...])` passes
    the union through to `create_deep_agent(memory=...)`. Mirrors what
    InteractiveSession does at REPL bootstrap.
    """
    from my_deepagent import session as session_mod
    from my_deepagent.persona import Persona

    captured: dict[str, Any] = {}

    def fake_create_deep_agent(**kwargs: Any) -> Any:
        captured.update(kwargs)
        return object()

    monkeypatch.setattr(session_mod, "create_deep_agent", fake_create_deep_agent)

    cfg = load_config(
        workspace_root=tmp_path,
        data_dir=tmp_path / "data",
        openrouter_api_key="test-key",
    )
    repo = tmp_path / "repo"
    repo.mkdir()
    g = ensure_global_instructions_initialized(cfg)
    proj_file = project_instructions_path(repo)
    proj_file.write_text("# project rule", encoding="utf-8")
    # Simulate a project memory entry.
    mem_entry = tmp_path / "MEM.md"
    mem_entry.write_text("# memory entry", encoding="utf-8")

    persona = Persona(
        name="test-persona",
        version=1,
        backend="openrouter",
        model="openrouter:deepseek/deepseek-chat",
        provider_origin="CN/DeepSeek",
        capabilities=("code_edit",),
        max_risk_level="high",
        system_prompt="System prompt for test persona (must be ≥10 chars)",
        deepagents_backend="state",
    )
    instruction_paths = resolve_instruction_paths(cfg, repo)
    combined = [*instruction_paths, str(mem_entry.resolve())]
    _agent = session_mod.build_agent(
        persona,
        cfg,
        root_dir=repo,
        memory_paths_override=combined,
    )

    assert "memory" in captured
    # Global must come before project, project before mem entry (exact list match).
    expected = [str(g.resolve()), str(proj_file.resolve()), str(mem_entry.resolve())]
    assert captured["memory"] == expected
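
The ordering rule these tests pin down (global first, then project, and only files that exist) can be sketched in a few lines. This is a sketch under assumptions, not the project's `resolve_instruction_paths`: the function name `resolve_paths_sketch` is hypothetical, and the filename value and the project file living at the repo root are both guesses.

```python
import tempfile
from pathlib import Path

# Assumed value; the tests only reference the INSTRUCTION_FILENAME constant.
INSTRUCTION_FILENAME = "MYDEEPAGENT.md"


def resolve_paths_sketch(data_dir: Path, repo: Path) -> list:
    """Return existing instruction files, global first, then project."""
    candidates = [data_dir / INSTRUCTION_FILENAME, repo / INSTRUCTION_FILENAME]
    return [str(p.resolve()) for p in candidates if p.is_file()]


def demo() -> list:
    """Record how many paths resolve as files are added one by one."""
    counts = []
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp) / "data"
        repo = Path(tmp) / "repo"
        data_dir.mkdir()
        repo.mkdir()
        counts.append(len(resolve_paths_sketch(data_dir, repo)))  # nothing yet
        (data_dir / INSTRUCTION_FILENAME).write_text("global", encoding="utf-8")
        counts.append(len(resolve_paths_sketch(data_dir, repo)))  # global only
        (repo / INSTRUCTION_FILENAME).write_text("project", encoding="utf-8")
        paths = resolve_paths_sketch(data_dir, repo)
        counts.append(len(paths))  # global + project
        # The global file must be listed before the project file.
        assert paths[0].startswith(str(data_dir.resolve()))
    return counts
```

Filtering on `is_file()` is what makes the result empty before bootstrap, and the fixed candidate order is what keeps project rules after the global ones.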


@@ -71,23 +71,25 @@ def test_ensure_memory_initialized_is_idempotent(memory_dir: Path) -> None:
def test_add_memory_entry_writes_file_and_updates_index(memory_dir: Path) -> None:
path = add_memory_entry(memory_dir, "프로젝트 핵심: 위크닥 CLI MVP")
assert path.is_file()
body = path.read_text(encoding="utf-8")
result = add_memory_entry(memory_dir, "프로젝트 핵심: 위크닥 CLI MVP")
assert result.path.is_file()
body = result.path.read_text(encoding="utf-8")
assert "프로젝트 핵심" in body
assert body.startswith("---\nslug: ")
assert body.startswith("---\nname: ")
assert "type:" in body
assert result.scrubbed is False
index = (memory_dir / INDEX_FILENAME).read_text(encoding="utf-8")
assert path.name in index
assert result.path.name in index
assert "프로젝트 핵심" in index
def test_add_memory_entry_handles_slug_collision(memory_dir: Path) -> None:
p1 = add_memory_entry(memory_dir, "Same first line")
p2 = add_memory_entry(memory_dir, "Same first line\nsecond entry body")
p3 = add_memory_entry(memory_dir, "Same first line\nthird entry body")
r1 = add_memory_entry(memory_dir, "Same first line")
r2 = add_memory_entry(memory_dir, "Same first line\nsecond entry body")
r3 = add_memory_entry(memory_dir, "Same first line\nthird entry body")
p1, p2, p3 = r1.path, r2.path, r3.path
assert len({p1.name, p2.name, p3.name}) == 3  # all three names distinct, including p1 vs p3
# Auto-slugging should land on <slug>-2.md and <slug>-3.md.
stems = sorted([p1.stem, p2.stem, p3.stem])
assert stems[0] == "same-first-line"
assert stems[1] == "same-first-line-2"
@@ -100,8 +102,34 @@ def test_add_memory_entry_rejects_empty_content(memory_dir: Path) -> None:
def test_add_memory_entry_explicit_name_override(memory_dir: Path) -> None:
p = add_memory_entry(memory_dir, "Random body text", name="My Custom Slug!!")
assert p.stem == "my-custom-slug"
r = add_memory_entry(memory_dir, "Random body text", name="My Custom Slug!!")
assert r.path.stem == "my-custom-slug"
def test_add_memory_entry_scrubs_openrouter_key(memory_dir: Path) -> None:
r = add_memory_entry(
memory_dir,
"save this for me: sk-or-v1-abcdefghijklmnop1234567890",
)
body = r.path.read_text(encoding="utf-8")
assert "sk-or-v1-abcdefghijklmnop" not in body
assert "<redacted:openrouter-key>" in body
assert r.scrubbed is True
def test_add_memory_entry_infers_user_type(memory_dir: Path) -> None:
r = add_memory_entry(memory_dir, "I prefer fish shell over bash")
assert r.memory_type == "user"
def test_add_memory_entry_infers_feedback_type(memory_dir: Path) -> None:
r = add_memory_entry(memory_dir, "don't mock the database in integration tests")
assert r.memory_type == "feedback"
def test_add_memory_entry_explicit_type_overrides_heuristic(memory_dir: Path) -> None:
r = add_memory_entry(memory_dir, "I prefer fish shell", memory_type="reference")
assert r.memory_type == "reference"
# ---------------------------------------------------------------------------
@@ -110,17 +138,17 @@ def test_add_memory_entry_explicit_name_override(memory_dir: Path) -> None:
def test_remove_memory_entry_by_slug(memory_dir: Path) -> None:
p = add_memory_entry(memory_dir, "to be forgotten")
assert remove_memory_entry(memory_dir, p.stem) is True
assert not p.exists()
r = add_memory_entry(memory_dir, "to be forgotten")
assert remove_memory_entry(memory_dir, r.path.stem) is True
assert not r.path.exists()
index_body = (memory_dir / INDEX_FILENAME).read_text(encoding="utf-8")
assert p.name not in index_body
assert r.path.name not in index_body
def test_remove_memory_entry_by_filename(memory_dir: Path) -> None:
p = add_memory_entry(memory_dir, "to be forgotten by full filename")
assert remove_memory_entry(memory_dir, p.name) is True
assert not p.exists()
r = add_memory_entry(memory_dir, "to be forgotten by full filename")
assert remove_memory_entry(memory_dir, r.path.name) is True
assert not r.path.exists()
def test_remove_memory_entry_missing_returns_false(memory_dir: Path) -> None:
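
The slug-collision and key-scrubbing behaviour asserted in this hunk can be illustrated with a small standalone sketch. None of this is the project's `add_memory_entry`: the helper names, the regex for OpenRouter keys, and the ASCII-only slug rule are all assumptions chosen to match the test expectations shown here.

```python
import re

# Assumed pattern; the project's real scrubber may match keys differently.
_OPENROUTER_KEY = re.compile(r"sk-or-v1-[A-Za-z0-9]+")


def scrub(text: str) -> tuple:
    """Replace anything that looks like an OpenRouter key; report whether it fired."""
    cleaned, hits = _OPENROUTER_KEY.subn("<redacted:openrouter-key>", text)
    return cleaned, hits > 0


def slugify(first_line: str) -> str:
    """Lowercase, collapse non-alphanumeric runs to '-', trim stray dashes."""
    return re.sub(r"[^a-z0-9]+", "-", first_line.lower()).strip("-")


def unique_slug(base: str, taken: set) -> str:
    """Return base, or base-2, base-3, ... when earlier names are taken."""
    if base not in taken:
        return base
    n = 2
    while f"{base}-{n}" in taken:
        n += 1
    return f"{base}-{n}"
```

This sketch does not address non-ASCII first lines such as the Korean example in the tests; a real implementation would need its own rule for those.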


@@ -0,0 +1,181 @@
"""v0.3 PR #5 — Plan mode tests.
Covers:
1. PlanModeMiddleware passes tool calls through when inactive.
2. PlanModeMiddleware blocks write_file / edit_file / execute / task when active.
3. read_file / glob / grep / write_todos are allowed regardless.
4. Toggling the closure flag changes behavior without rebuilding the middleware.
5. The synthetic ToolMessage carries status="error" and a clear hint.
"""
from __future__ import annotations
from dataclasses import dataclass
from typing import Any
import pytest
from langchain_core.messages import ToolMessage
from my_deepagent.middleware.plan_mode import (
BLOCKED_TOOLS_IN_PLAN_MODE,
PlanModeMiddleware,
)
@dataclass
class _FakeToolRequest:
    """Minimal stand-in for langchain ToolCallRequest in unit tests."""

    tool_call: dict[str, Any]


async def _passthrough_handler(_: _FakeToolRequest) -> ToolMessage:
    """Stub handler: returns a benign 'tool executed' message."""
    return ToolMessage(content="EXECUTED", tool_call_id="t1", name="stub")
# ---------------------------------------------------------------------------
# Inactive plan-mode → all tools pass through
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_plan_mode_inactive_passes_through() -> None:
    mw = PlanModeMiddleware(is_active=lambda: False)
    for name in ["write_file", "edit_file", "execute", "task", "read_file", "glob"]:
        req = _FakeToolRequest(tool_call={"name": name, "id": "t1", "args": {}})
        result = await mw.awrap_tool_call(req, _passthrough_handler)
        assert isinstance(result, ToolMessage)
        assert result.content == "EXECUTED"
        assert result.status != "error"
# ---------------------------------------------------------------------------
# Active plan-mode → write tools blocked with status=error
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_plan_mode_active_blocks_write_file() -> None:
mw = PlanModeMiddleware(is_active=lambda: True)
req = _FakeToolRequest(
tool_call={"name": "write_file", "id": "abc123", "args": {"file_path": "/tmp/x"}}
)
result = await mw.awrap_tool_call(req, _passthrough_handler)
assert isinstance(result, ToolMessage)
assert result.status == "error"
assert result.tool_call_id == "abc123"
assert "Plan-mode" in result.content
assert "write_file" in result.content
@pytest.mark.asyncio
async def test_plan_mode_active_blocks_execute() -> None:
mw = PlanModeMiddleware(is_active=lambda: True)
req = _FakeToolRequest(tool_call={"name": "execute", "id": "exec1", "args": {"command": "ls"}})
result = await mw.awrap_tool_call(req, _passthrough_handler)
assert isinstance(result, ToolMessage)
assert result.status == "error"
assert "execute" in result.content
@pytest.mark.asyncio
async def test_plan_mode_active_blocks_task_subagent_spawn() -> None:
mw = PlanModeMiddleware(is_active=lambda: True)
req = _FakeToolRequest(tool_call={"name": "task", "id": "task1", "args": {"description": "x"}})
result = await mw.awrap_tool_call(req, _passthrough_handler)
assert isinstance(result, ToolMessage)
assert result.status == "error"
assert "task" in result.content
# ---------------------------------------------------------------------------
# Active plan-mode → read-only tools still pass through
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_plan_mode_active_allows_read_only_tools() -> None:
mw = PlanModeMiddleware(is_active=lambda: True)
for name in ["read_file", "glob", "grep", "ls"]:
req = _FakeToolRequest(tool_call={"name": name, "id": "t1", "args": {}})
result = await mw.awrap_tool_call(req, _passthrough_handler)
assert result.content == "EXECUTED", f"{name} should not be blocked"
assert result.status != "error"
@pytest.mark.asyncio
async def test_plan_mode_blocks_write_todos() -> None:
"""`write_todos` is part of the plan markdown — must be blocked."""
mw = PlanModeMiddleware(is_active=lambda: True)
req = _FakeToolRequest(tool_call={"name": "write_todos", "id": "wt1", "args": {"todos": []}})
result = await mw.awrap_tool_call(req, _passthrough_handler)
assert isinstance(result, ToolMessage)
assert result.status == "error"
assert "write_todos" in result.content
# ---------------------------------------------------------------------------
# Closure-toggle behavior — flip without rebuild
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_plan_mode_closure_toggle_changes_behavior() -> None:
state = {"on": False}
mw = PlanModeMiddleware(is_active=lambda: state["on"])
req = _FakeToolRequest(tool_call={"name": "write_file", "id": "w", "args": {}})
# Off → passes.
r1 = await mw.awrap_tool_call(req, _passthrough_handler)
assert r1.status != "error"
# Flip on → blocks.
state["on"] = True
r2 = await mw.awrap_tool_call(req, _passthrough_handler)
assert r2.status == "error"
# Flip back off → passes again.
state["on"] = False
r3 = await mw.awrap_tool_call(req, _passthrough_handler)
assert r3.status != "error"
# ---------------------------------------------------------------------------
# Sync path mirrors async path
# ---------------------------------------------------------------------------
def test_plan_mode_sync_wrap_tool_call() -> None:
mw = PlanModeMiddleware(is_active=lambda: True)
def sync_handler(_: _FakeToolRequest) -> ToolMessage:
return ToolMessage(content="EXECUTED", tool_call_id="t1", name="stub")
req = _FakeToolRequest(tool_call={"name": "write_file", "id": "s1", "args": {}})
result = mw.wrap_tool_call(req, sync_handler)
assert isinstance(result, ToolMessage)
assert result.status == "error"
# ---------------------------------------------------------------------------
# Blocklist constant sanity
# ---------------------------------------------------------------------------
def test_blocklist_includes_all_known_write_tools() -> None:
assert "write_file" in BLOCKED_TOOLS_IN_PLAN_MODE
assert "edit_file" in BLOCKED_TOOLS_IN_PLAN_MODE
assert "execute" in BLOCKED_TOOLS_IN_PLAN_MODE
assert "bash" in BLOCKED_TOOLS_IN_PLAN_MODE
assert "task" in BLOCKED_TOOLS_IN_PLAN_MODE
def test_blocklist_excludes_read_only_tools() -> None:
for name in ("read_file", "glob", "grep", "ls"):
assert name not in BLOCKED_TOOLS_IN_PLAN_MODE
def test_blocklist_includes_write_todos() -> None:
assert "write_todos" in BLOCKED_TOOLS_IN_PLAN_MODE
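The closure-flag gating these tests exercise can be sketched without langchain. This is a minimal illustration, not the actual `PlanModeMiddleware`: `PlanGate`, `ToolResult`, and the `BLOCKED` set are hypothetical stand-ins, and the error text is assumed rather than copied from the real module.

```python
from collections.abc import Callable
from dataclasses import dataclass
from typing import Any

# Hypothetical blocklist mirroring the tool names asserted in the tests above.
BLOCKED = frozenset({"write_file", "edit_file", "execute", "bash", "task", "write_todos"})


@dataclass
class ToolResult:
    """Stand-in for a tool message; this is not langchain's ToolMessage."""

    content: str
    tool_call_id: str
    status: str = "success"


class PlanGate:
    """Blocks write-capable tools while a caller-supplied flag reads True."""

    def __init__(self, is_active: Callable[[], bool]) -> None:
        # The flag is read on every call, so flipping it needs no rebuild.
        self._is_active = is_active

    def wrap_tool_call(
        self,
        tool_call: dict[str, Any],
        handler: Callable[[dict[str, Any]], ToolResult],
    ) -> ToolResult:
        name = tool_call["name"]
        if self._is_active() and name in BLOCKED:
            # Synthetic error result: the real handler is never invoked.
            return ToolResult(
                content=f"Plan-mode is active: '{name}' is blocked until the plan is approved.",
                tool_call_id=tool_call["id"],
                status="error",
            )
        return handler(tool_call)
```

Passing a callable instead of a boolean is what lets a session toggle plan mode at runtime while reusing the same gate object.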


@@ -201,6 +201,66 @@ def test_resolve_skill_sources_returns_user_dir(tmp_path: Path) -> None:
assert sources[0] == str(user_skills_dir(cfg).resolve())
def test_resolve_skill_sources_with_project_key_returns_both(tmp_path: Path) -> None:
from my_deepagent.skills import project_skills_dir
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
sources = resolve_skill_sources(cfg, project_key="proj1234abcdef00")
assert sources == [
str(user_skills_dir(cfg).resolve()),
str(project_skills_dir(cfg, "proj1234abcdef00").resolve()),
]
def test_list_all_skills_project_overrides_global(tmp_path: Path) -> None:
from my_deepagent.skills import list_all_skills, project_skills_dir
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
pk = "abc123def456ffff"
global_dir = user_skills_dir(cfg)
proj_dir = project_skills_dir(cfg, pk)
global_dir.mkdir(parents=True)
proj_dir.mkdir(parents=True)
_make_skill(global_dir, "shared", description="global-version")
_make_skill(proj_dir, "shared", description="project-version")
_make_skill(global_dir, "global-only", description="g")
_make_skill(proj_dir, "project-only", description="p")
skills = list_all_skills(cfg, pk)
by_name = {s.name: s for s in skills}
assert set(by_name.keys()) == {"shared", "global-only", "project-only"}
# Project overrides global on the shared name.
assert by_name["shared"].scope == "project"
assert by_name["shared"].description == "project-version"
assert by_name["global-only"].scope == "global"
assert by_name["project-only"].scope == "project"
def test_find_skill_prefers_project_over_global(tmp_path: Path) -> None:
from my_deepagent.skills import find_skill, project_skills_dir
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
pk = "f0f0f0f0f0f0f0f0"
global_dir = user_skills_dir(cfg)
proj_dir = project_skills_dir(cfg, pk)
global_dir.mkdir(parents=True)
proj_dir.mkdir(parents=True)
_make_skill(global_dir, "dup", description="g")
_make_skill(proj_dir, "dup", description="p")
skill = find_skill(cfg, pk, "dup")
assert skill is not None
assert skill.scope == "project"
assert skill.description == "p"
def test_find_skill_missing_returns_none(tmp_path: Path) -> None:
from my_deepagent.skills import find_skill
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
assert find_skill(cfg, "any-project-key", "nonexistent") is None
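The project-over-global precedence asserted above amounts to a keyed merge in which the project scope is applied last. A minimal sketch, assuming skills are already loaded into records; `SkillInfo`, `merge_skills`, and `find` are hypothetical names, not the `my_deepagent.skills` API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SkillInfo:
    """Hypothetical record; the real skill type may carry more fields."""

    name: str
    description: str
    scope: str  # "global" or "project"


def merge_skills(global_skills: list[SkillInfo], project_skills: list[SkillInfo]) -> list[SkillInfo]:
    """Merge by name; a project skill replaces a global skill of the same name."""
    merged = {s.name: s for s in global_skills}
    merged.update({s.name: s for s in project_skills})  # project wins on collisions
    return sorted(merged.values(), key=lambda s: s.name)


def find(skills: list[SkillInfo], name: str) -> Optional[SkillInfo]:
    """Return the merged skill with this name, or None when it does not exist."""
    return next((s for s in skills if s.name == name), None)
```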
# ---------------------------------------------------------------------------
# Integration: build_agent threads skills sources to deepagents
# ---------------------------------------------------------------------------


@@ -0,0 +1,306 @@
"""v0.3 PR #6 — Sub-agent session linkage tests.
Covers:
1. `spawn_subagent_session` creates a child row with correct parent_session_id,
depth = parent.depth + 1, inherited project_key.
2. Depth limit `MAX_SUBAGENT_DEPTH` rejects further spawns.
3. Spawn against ended/missing parent raises human_required errors.
4. `list_subagents` returns direct children in start-order, excludes grandchildren.
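5. Persona upsert reuses the existing row: same persona hash → same persona_id.
6. `resolve_root_session_id` walks up to the root session; an unknown id is returned unchanged.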
"""
from __future__ import annotations
import uuid
from collections.abc import AsyncIterator
from datetime import UTC, datetime
from pathlib import Path
import pytest
from my_deepagent.config import load_config
from my_deepagent.errors import MyDeepAgentError
from my_deepagent.persistence.db import Database
from my_deepagent.persistence.models import (
AgentPersonaRow,
InteractiveSessionRow,
)
from my_deepagent.persona import Persona
from my_deepagent.subagents import (
MAX_SUBAGENT_DEPTH,
list_subagents,
resolve_root_session_id,
spawn_subagent_session,
)
def _now() -> str:
return datetime.now(UTC).isoformat(timespec="seconds")
def _make_persona(name: str = "spec-writer") -> Persona:
return Persona(
name=name,
version=1,
backend="openrouter",
model="openrouter:deepseek/deepseek-chat",
provider_origin="CN/DeepSeek",
capabilities=("spec_write",),
max_risk_level="medium",
system_prompt="System prompt — at least ten chars",
deepagents_backend="state",
)
@pytest.fixture
async def db_with_root(tmp_path: Path) -> AsyncIterator[tuple[Database, str]]:
"""Database + one root InteractiveSessionRow with depth=0 + project_key='proj1234abcdef00'."""
db_url = f"sqlite+aiosqlite:///{tmp_path / 'subagent.sqlite3'}"
db = Database(db_url)
await db.init_schema()
persona_id = str(uuid.uuid4())
root_id = str(uuid.uuid4())
async with db.session() as s:
s.add(
AgentPersonaRow(
id=persona_id,
name="default-interactive",
version=1,
hash="parent-hash",
definition={"name": "default-interactive", "version": 1},
created_at=_now(),
)
)
s.add(
InteractiveSessionRow(
id=root_id,
persona_id=persona_id,
persona_hash="parent-hash",
started_at=_now(),
last_message_at=_now(),
state="active",
total_input_tokens=0,
total_output_tokens=0,
model="openrouter:deepseek/deepseek-chat",
project_key="proj1234abcdef00",
title="root",
plan_mode=False,
parent_session_id=None,
depth=0,
)
)
await s.commit()
try:
yield (db, root_id)
finally:
await db.dispose()
# ---------------------------------------------------------------------------
# Happy path
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_spawn_creates_child_with_inherited_project_key(
db_with_root: tuple[Database, str],
) -> None:
db, root_id = db_with_root
persona = _make_persona()
child_id = await spawn_subagent_session(
db,
parent_session_id=uuid.UUID(root_id),
persona=persona,
initial_title="planner-1",
)
async with db.session() as s:
child = await s.get(InteractiveSessionRow, str(child_id))
assert child is not None
assert child.parent_session_id == root_id
assert child.depth == 1
assert child.project_key == "proj1234abcdef00" # inherited
assert child.title == "planner-1"
assert child.state == "active"
assert child.plan_mode is False
assert child.persona_hash == persona.compute_hash()
@pytest.mark.asyncio
async def test_spawn_two_children_depth_one_each(
db_with_root: tuple[Database, str],
) -> None:
db, root_id = db_with_root
persona = _make_persona()
child_a = await spawn_subagent_session(
db, parent_session_id=uuid.UUID(root_id), persona=persona
)
child_b = await spawn_subagent_session(
db, parent_session_id=uuid.UUID(root_id), persona=persona
)
async with db.session() as s:
a = await s.get(InteractiveSessionRow, str(child_a))
b = await s.get(InteractiveSessionRow, str(child_b))
assert a is not None and b is not None
assert a.depth == b.depth == 1
assert a.parent_session_id == b.parent_session_id == root_id
# ---------------------------------------------------------------------------
# Depth limit
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_spawn_rejects_beyond_max_depth(db_with_root: tuple[Database, str]) -> None:
db, root_id = db_with_root
persona = _make_persona()
current = uuid.UUID(root_id)
# Chain spawns down to MAX_SUBAGENT_DEPTH (root depth=0; spawn produces 1, 2, 3).
for expected_depth in range(1, MAX_SUBAGENT_DEPTH + 1):
new_child = await spawn_subagent_session(db, parent_session_id=current, persona=persona)
async with db.session() as s:
row = await s.get(InteractiveSessionRow, str(new_child))
assert row is not None
assert row.depth == expected_depth
current = new_child
# Now `current` has depth=MAX_SUBAGENT_DEPTH (3) → spawn must reject.
with pytest.raises(MyDeepAgentError) as exc_info:
await spawn_subagent_session(db, parent_session_id=current, persona=persona)
assert exc_info.value.code == "subagent_depth_exceeded"
# ---------------------------------------------------------------------------
# Invalid parent
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_spawn_missing_parent_raises(db_with_root: tuple[Database, str]) -> None:
db, _root_id = db_with_root
persona = _make_persona()
bogus = uuid.uuid4()
with pytest.raises(MyDeepAgentError) as exc_info:
await spawn_subagent_session(db, parent_session_id=bogus, persona=persona)
assert exc_info.value.code == "parent_session_missing"
@pytest.mark.asyncio
async def test_spawn_ended_parent_raises(db_with_root: tuple[Database, str]) -> None:
db, root_id = db_with_root
async with db.session() as s:
row = await s.get(InteractiveSessionRow, root_id)
assert row is not None
row.state = "ended"
await s.commit()
persona = _make_persona()
with pytest.raises(MyDeepAgentError) as exc_info:
await spawn_subagent_session(db, parent_session_id=uuid.UUID(root_id), persona=persona)
assert exc_info.value.code == "parent_session_ended"
# ---------------------------------------------------------------------------
# list_subagents
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_list_subagents_returns_direct_children_only(
db_with_root: tuple[Database, str],
) -> None:
db, root_id = db_with_root
persona = _make_persona()
# root → child_a → grandchild
child_a = await spawn_subagent_session(
db, parent_session_id=uuid.UUID(root_id), persona=persona
)
child_b = await spawn_subagent_session(
db, parent_session_id=uuid.UUID(root_id), persona=persona
)
grandchild = await spawn_subagent_session(db, parent_session_id=child_a, persona=persona)
direct = await list_subagents(db, uuid.UUID(root_id))
ids = [r.id for r in direct]
assert str(child_a) in ids
assert str(child_b) in ids
assert str(grandchild) not in ids # depth-2 not in direct children
assert len(direct) == 2
@pytest.mark.asyncio
async def test_list_subagents_no_children_returns_empty(
db_with_root: tuple[Database, str],
) -> None:
db, root_id = db_with_root
direct = await list_subagents(db, uuid.UUID(root_id))
assert direct == []
# ---------------------------------------------------------------------------
# Root resolution and persona upsert
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_resolve_root_session_id_walks_to_root(
db_with_root: tuple[Database, str],
) -> None:
db, root_id = db_with_root
persona = _make_persona()
child = await spawn_subagent_session(db, parent_session_id=uuid.UUID(root_id), persona=persona)
grand = await spawn_subagent_session(db, parent_session_id=child, persona=persona)
great = await spawn_subagent_session(db, parent_session_id=grand, persona=persona)
assert (await resolve_root_session_id(db, uuid.UUID(root_id))) == uuid.UUID(root_id)
assert (await resolve_root_session_id(db, child)) == uuid.UUID(root_id)
assert (await resolve_root_session_id(db, grand)) == uuid.UUID(root_id)
assert (await resolve_root_session_id(db, great)) == uuid.UUID(root_id)
@pytest.mark.asyncio
async def test_resolve_root_session_id_missing_returns_input(
db_with_root: tuple[Database, str],
) -> None:
db, _root_id = db_with_root
bogus = uuid.uuid4()
assert (await resolve_root_session_id(db, bogus)) == bogus
@pytest.mark.asyncio
async def test_spawn_reuses_persona_row_for_same_hash(
db_with_root: tuple[Database, str],
) -> None:
db, root_id = db_with_root
persona = _make_persona("shared-persona")
child_a = await spawn_subagent_session(
db, parent_session_id=uuid.UUID(root_id), persona=persona
)
child_b = await spawn_subagent_session(
db, parent_session_id=uuid.UUID(root_id), persona=persona
)
async with db.session() as s:
a = await s.get(InteractiveSessionRow, str(child_a))
b = await s.get(InteractiveSessionRow, str(child_b))
assert a is not None and b is not None
assert a.persona_id == b.persona_id
assert a.persona_hash == b.persona_hash
# No duplicate AgentPersonaRow.
async with db.session() as s:
cfg = load_config(workspace_root=Path.cwd(), data_dir=Path.cwd() / "data") # noqa: F841
from sqlalchemy import select
rows = (
(
await s.execute(
select(AgentPersonaRow).where(AgentPersonaRow.hash == persona.compute_hash())
)
)
.scalars()
.all()
)
assert len(rows) == 1
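The depth limit and root-walk behaviour covered by this file can be sketched with an in-memory session table. This is an illustration under stated assumptions: `Session`, `SpawnError`, `spawn`, and `resolve_root` are hypothetical, `MAX_DEPTH = 3` follows the comment in the depth test, and the order of the parent checks is a guess rather than the real `spawn_subagent_session` logic.

```python
import uuid
from dataclasses import dataclass
from typing import Optional

MAX_DEPTH = 3  # assumed value, taken from the comment in the depth-limit test


class SpawnError(Exception):
    """Carries a machine-readable code, like the errors asserted in the tests."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


@dataclass
class Session:
    id: uuid.UUID
    parent_id: Optional[uuid.UUID]
    depth: int
    state: str = "active"


def spawn(sessions: dict[uuid.UUID, Session], parent_id: uuid.UUID) -> uuid.UUID:
    """Create a child one level below the parent, or raise with a reason code."""
    parent = sessions.get(parent_id)
    if parent is None:
        raise SpawnError("parent_session_missing")
    if parent.state == "ended":
        raise SpawnError("parent_session_ended")
    if parent.depth >= MAX_DEPTH:
        raise SpawnError("subagent_depth_exceeded")
    child = Session(id=uuid.uuid4(), parent_id=parent_id, depth=parent.depth + 1)
    sessions[child.id] = child
    return child.id


def resolve_root(sessions: dict[uuid.UUID, Session], session_id: uuid.UUID) -> uuid.UUID:
    """Follow parent links upward; an unknown id is returned unchanged."""
    current = session_id
    while True:
        row = sessions.get(current)
        if row is None or row.parent_id is None:
            return current
        current = row.parent_id
```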


@@ -0,0 +1,204 @@
"""v0.3 PR #9 — User-scope persona/workflow directory tests.
Covers:
1. `ensure_user_dirs_initialized` creates both directories (idempotent).
2. `load_combined_personas` returns seed + user, deduplicated by (name, version).
3. User entries override seed entries with the same key (last-wins).
4. Malformed user persona files are logged + skipped (don't kill the REPL).
5. `load_combined_workflows` mirrors the persona behaviour for workflow YAMLs.
6. Empty user dirs → seed-only.
"""
from __future__ import annotations
from pathlib import Path
from textwrap import dedent
from my_deepagent.config import load_config
from my_deepagent.user_dirs import (
ensure_user_dirs_initialized,
load_combined_personas,
load_combined_workflows,
user_personas_dir,
user_workflows_dir,
)
def _write_persona_yaml(
target: Path,
*,
name: str,
version: int = 1,
model: str = "openrouter:deepseek/deepseek-chat",
backend: str = "openrouter",
capabilities: list[str] | None = None,
) -> None:
target.parent.mkdir(parents=True, exist_ok=True)
caps = capabilities or ["code_edit"]
cap_lines = "\n".join(f" - {c}" for c in caps)
target.write_text(
dedent(
f"""\
name: {name}
version: {version}
backend: {backend}
model: "{model}"
provider_origin: "CN/DeepSeek"
capabilities:
{cap_lines}
max_risk_level: medium
system_prompt: |
Test persona system prompt (must be ≥10 chars).
allowed_tools:
- read_file
- write_file
deepagents_backend: state
"""
),
encoding="utf-8",
)
def _write_workflow_yaml(target: Path, *, name: str, version: int = 1) -> None:
target.parent.mkdir(parents=True, exist_ok=True)
target.write_text(
dedent(
f"""\
name: {name}
version: {version}
description: "test workflow {name}"
roles:
- id: writer
required_capabilities: [code_edit]
phases:
- key: p1
title: "first phase"
risk: medium
role: writer
gates: []
expected_artifact:
path: artifacts/foo.md
schema: text
instructions: "do something useful in this phase"
"""
),
encoding="utf-8",
)
# ---------------------------------------------------------------------------
# Bootstrap
# ---------------------------------------------------------------------------
def test_ensure_user_dirs_creates_both(tmp_path: Path) -> None:
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
ensure_user_dirs_initialized(cfg)
assert user_personas_dir(cfg).is_dir()
assert user_workflows_dir(cfg).is_dir()
def test_ensure_user_dirs_is_idempotent(tmp_path: Path) -> None:
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
ensure_user_dirs_initialized(cfg)
# Drop a file to make sure repeat doesn't wipe it.
_write_persona_yaml(user_personas_dir(cfg) / "p.yaml", name="custom-test")
ensure_user_dirs_initialized(cfg)
assert (user_personas_dir(cfg) / "p.yaml").is_file()
# ---------------------------------------------------------------------------
# load_combined_personas
# ---------------------------------------------------------------------------
def test_load_combined_personas_returns_seed_only_when_no_user(tmp_path: Path) -> None:
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
seed = tmp_path / "seed"
_write_persona_yaml(seed / "a.yaml", name="alpha")
_write_persona_yaml(seed / "b.yaml", name="bravo")
personas = load_combined_personas(cfg, seed)
names = sorted(p.name for p in personas)
assert names == ["alpha", "bravo"]
def test_load_combined_personas_adds_user(tmp_path: Path) -> None:
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
seed = tmp_path / "seed"
_write_persona_yaml(seed / "a.yaml", name="alpha")
_write_persona_yaml(user_personas_dir(cfg) / "user.yaml", name="my-custom")
personas = load_combined_personas(cfg, seed)
names = sorted(p.name for p in personas)
assert names == ["alpha", "my-custom"]
def test_load_combined_personas_user_overrides_seed(tmp_path: Path) -> None:
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
seed = tmp_path / "seed"
_write_persona_yaml(seed / "alpha.yaml", name="alpha", model="seed-model")
_write_persona_yaml(user_personas_dir(cfg) / "alpha.yaml", name="alpha", model="user-model")
personas = load_combined_personas(cfg, seed)
assert len(personas) == 1
assert personas[0].name == "alpha"
assert personas[0].model == "user-model" # user wins
def test_load_combined_personas_skips_malformed_user_file(tmp_path: Path) -> None:
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
seed = tmp_path / "seed"
_write_persona_yaml(seed / "a.yaml", name="alpha")
bad = user_personas_dir(cfg) / "broken.yaml"
bad.parent.mkdir(parents=True, exist_ok=True)
bad.write_text("not: a valid: persona:::", encoding="utf-8")
# Should not raise — broken file is logged + skipped.
personas = load_combined_personas(cfg, seed)
# Seed alpha is still present.
assert any(p.name == "alpha" for p in personas)
# ---------------------------------------------------------------------------
# load_combined_workflows
# ---------------------------------------------------------------------------
def test_load_combined_workflows_seed_only(tmp_path: Path) -> None:
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
seed = tmp_path / "wf-seed"
_write_workflow_yaml(seed / "a.yaml", name="wfa")
workflows = load_combined_workflows(cfg, seed)
names = sorted(t.name for (_p, t) in workflows)
assert names == ["wfa"]
def test_load_combined_workflows_user_overrides_seed(tmp_path: Path) -> None:
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
seed = tmp_path / "wf-seed"
_write_workflow_yaml(seed / "wfa.yaml", name="wfa", version=1)
_write_workflow_yaml(user_workflows_dir(cfg) / "wfa.yaml", name="wfa", version=1)
workflows = load_combined_workflows(cfg, seed)
# Dedupe by (name, version) — only the user version remains.
assert len(workflows) == 1
path, tpl = workflows[0]
assert tpl.name == "wfa"
assert path.parent == user_workflows_dir(cfg)
def test_load_combined_workflows_user_adds_distinct(tmp_path: Path) -> None:
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
seed = tmp_path / "wf-seed"
_write_workflow_yaml(seed / "a.yaml", name="wfa")
_write_workflow_yaml(user_workflows_dir(cfg) / "user.yaml", name="userwf")
workflows = load_combined_workflows(cfg, seed)
names = sorted(t.name for (_p, t) in workflows)
assert names == ["userwf", "wfa"]
def test_load_combined_workflows_skips_malformed(tmp_path: Path) -> None:
cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
seed = tmp_path / "wf-seed"
_write_workflow_yaml(seed / "a.yaml", name="wfa")
bad = user_workflows_dir(cfg) / "broken.yaml"
bad.parent.mkdir(parents=True, exist_ok=True)
bad.write_text("not: a workflow:::", encoding="utf-8")
workflows = load_combined_workflows(cfg, seed)
assert any(t.name == "wfa" for (_p, t) in workflows)
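The seed-plus-user merge these tests describe (dedupe by `(name, version)`, user wins, malformed entries logged and skipped) can be sketched over already-parsed mappings. The `combine` helper below is hypothetical and operates on plain dicts; the real loaders read YAML files and validate full persona or workflow models, and they may treat malformed seed files differently.

```python
import logging
from typing import Any

log = logging.getLogger("user_dirs_sketch")


def combine(seed: list[dict[str, Any]], user: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge entries keyed by (name, version); later sources override earlier ones."""
    merged: dict[tuple[str, int], dict[str, Any]] = {}
    # Seed is processed first so that a user entry with the same key replaces it.
    for source, entries in (("seed", seed), ("user", user)):
        for entry in entries:
            try:
                key = (str(entry["name"]), int(entry["version"]))
            except (KeyError, TypeError, ValueError):
                # A broken entry is reported and skipped instead of aborting the load.
                log.warning("skipping malformed %s entry: %r", source, entry)
                continue
            merged[key] = entry
    return list(merged.values())
```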


@@ -0,0 +1,202 @@
"""v0.4 — Workflow generator UI + hot-reload tests.
Covers:
1. POST /api/workflows persists a YAML under <data_dir>/workflows/
2. POST rejects malformed body with 422
3. POST rejects duplicate (name, version) with 409
4. GET /api/workflows hot-reloads when a new file appears
5. GET /api/workflows hot-reloads when an existing file is edited
6. /new-workflow.html serves with the page marker
"""
from __future__ import annotations
from collections.abc import AsyncIterator
from pathlib import Path
import pytest
import yaml
from fastapi import FastAPI
from httpx import ASGITransport, AsyncClient
from my_deepagent.api.app import create_app
from my_deepagent.config import load_config
from my_deepagent.persistence.db import Database
@pytest.fixture
async def app_client(tmp_path: Path) -> AsyncIterator[tuple[AsyncClient, FastAPI, Path]]:
db_url = f"sqlite+aiosqlite:///{tmp_path / 'gen.sqlite3'}"
cfg = load_config(
workspace_root=tmp_path,
data_dir=tmp_path / "data",
database_url=db_url,
)
db = Database(db_url)
await db.init_schema()
await db.dispose()
app = create_app(cfg)
transport = ASGITransport(app=app)
async with app.router.lifespan_context(app):
async with AsyncClient(transport=transport, base_url="http://test", timeout=10.0) as client:
yield (client, app, cfg.data_dir)
def _valid_body(name: str = "my-flow", version: int = 1) -> dict[str, object]:
return {
"name": name,
"version": version,
"description": "test workflow generator",
"roles": [{"id": "writer", "required_capabilities": ["code_edit"]}],
"phases": [
{
"key": "p1",
"title": "first phase",
"risk": "medium",
"role": "writer",
"instructions": "do something useful in this phase",
}
],
}
# ---------------------------------------------------------------------------
# Static page
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_new_workflow_page_served(
app_client: tuple[AsyncClient, FastAPI, Path],
) -> None:
client, _app, _dir = app_client
r = await client.get("/new-workflow.html")
assert r.status_code == 200
assert 'data-page="new-workflow"' in r.text
assert "워크플로우 템플릿 만들기" in r.text
# ---------------------------------------------------------------------------
# POST /api/workflows happy path
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_post_workflow_creates_yaml_under_data_dir(
app_client: tuple[AsyncClient, FastAPI, Path],
) -> None:
client, _app, data_dir = app_client
r = await client.post("/api/workflows", json=_valid_body())
assert r.status_code == 201, r.text
body = r.json()
target = Path(body["path"])
assert target.is_file()
assert target.parent == data_dir / "workflows"
assert target.name == "my-flow@1.yaml"
parsed = yaml.safe_load(target.read_text(encoding="utf-8"))
assert parsed["name"] == "my-flow"
assert parsed["version"] == 1
assert parsed["phases"][0]["key"] == "p1"
# ---------------------------------------------------------------------------
# Validation rejection
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_post_workflow_rejects_missing_roles(
app_client: tuple[AsyncClient, FastAPI, Path],
) -> None:
client, _app, _dir = app_client
bad = _valid_body()
bad["roles"] = [] # min_length=1 violation
r = await client.post("/api/workflows", json=bad)
assert r.status_code == 422
@pytest.mark.asyncio
async def test_post_workflow_rejects_phase_referencing_unknown_role(
app_client: tuple[AsyncClient, FastAPI, Path],
) -> None:
client, _app, _dir = app_client
bad = _valid_body()
bad["phases"][0]["role"] = "ghost-role" # type: ignore[index]
r = await client.post("/api/workflows", json=bad)
assert r.status_code == 422
assert "ghost-role" in r.text
# ---------------------------------------------------------------------------
# Duplicate refusal
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_post_workflow_rejects_duplicate_name_version(
app_client: tuple[AsyncClient, FastAPI, Path],
) -> None:
client, _app, _dir = app_client
body = _valid_body("dup-flow", 1)
r1 = await client.post("/api/workflows", json=body)
assert r1.status_code == 201
r2 = await client.post("/api/workflows", json=body)
assert r2.status_code == 409
assert "already exists" in r2.text
# ---------------------------------------------------------------------------
# Hot-reload — new file appears in GET
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_get_workflows_hot_reloads_after_post(
app_client: tuple[AsyncClient, FastAPI, Path],
) -> None:
client, _app, _dir = app_client
before = await client.get("/api/workflows")
before_names = {w["name"] for w in before.json()}
assert "fresh-flow" not in before_names
r = await client.post("/api/workflows", json=_valid_body("fresh-flow", 1))
assert r.status_code == 201
after = await client.get("/api/workflows")
after_names = {w["name"] for w in after.json()}
assert "fresh-flow" in after_names
@pytest.mark.asyncio
async def test_get_workflows_hot_reloads_after_external_file_drop(
app_client: tuple[AsyncClient, FastAPI, Path],
) -> None:
"""Even when the file is dropped directly into the dir (not via POST),
the next GET picks it up via the mtime fingerprint."""
from textwrap import dedent
client, _app, data_dir = app_client
wf_dir = data_dir / "workflows"
wf_dir.mkdir(parents=True, exist_ok=True)
(wf_dir / "external@1.yaml").write_text(
dedent(
"""\
name: external
version: 1
description: dropped by hand
roles:
- id: writer
required_capabilities: [code_edit]
phases:
- key: p1
title: only phase
risk: low
role: writer
instructions: just write something to disk
"""
),
encoding="utf-8",
)
r = await client.get("/api/workflows")
names = {w["name"] for w in r.json()}
assert "external" in names
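The "mtime fingerprint" mentioned in the docstring above could look roughly like the following. This is a sketch of the general technique only: `dir_fingerprint` and `ReloadingCatalog` are hypothetical names, and the real API may fingerprint different fields or cache parsed templates instead of file stems.

```python
from pathlib import Path
from typing import Optional


def dir_fingerprint(directory: Path) -> tuple:
    """Summarise a directory as (name, mtime_ns, size) for each YAML file."""
    if not directory.is_dir():
        return ()
    entries = []
    for path in sorted(directory.glob("*.yaml")):
        stat = path.stat()
        entries.append((path.name, stat.st_mtime_ns, stat.st_size))
    return tuple(entries)


class ReloadingCatalog:
    """Re-scans the directory only when its fingerprint has changed."""

    def __init__(self, directory: Path) -> None:
        self._dir = directory
        self._fingerprint: Optional[tuple] = None  # None forces the first load
        self._names: list[str] = []
        self.reloads = 0

    def names(self) -> list[str]:
        current = dir_fingerprint(self._dir)
        if current != self._fingerprint:
            # A file was added, removed, or edited since the last call.
            self._names = sorted(p.stem for p in self._dir.glob("*.yaml"))
            self._fingerprint = current
            self.reloads += 1
        return self._names
```

Comparing a cheap fingerprint on each request avoids a file watcher while still picking up files dropped in by hand.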