feat(conversation): cheap-default DeepSeek + Enter-send + model pill

- default-interactive@1 model: claude-haiku-4-5 → deepseek/deepseek-chat
  (input/output $0.28/$1.12 per 1M tokens; roughly 75% cheaper than haiku).
  The fallback is swapped to haiku.
- conversation textarea keydown:
  - Enter → send (ignored during IME composition)
  - Shift+Enter → newline
  - Cmd/Ctrl+Enter → send (backward compatible)
  - Placeholder hint updated.
- Add a model pill (#session-model-pill) to the conversation top bar: it shows
  the current session's active model as a monospace badge, which resolves the
  recurring "which model is this?" confusion.
- style.css: add .conv-model-pill (gray pill).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
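The "roughly 75%" figure can be sanity-checked with a little arithmetic. The sketch below assumes Haiku costs $1.00/$5.00 per 1M tokens (input/output), which is what the `ModelPrice` entries in `scripts/live_verify.py` would imply if their values are per 1K tokens; that unit is an assumption, not something the commit states. The helper name `savings` is hypothetical.

```python
# Prices in USD per 1M tokens.
# DEEPSEEK comes from the commit message; HAIKU is an assumption derived from
# the per-1K-looking values (0.001, 0.005) in scripts/live_verify.py.
HAIKU = {"input": 1.00, "output": 5.00}
DEEPSEEK = {"input": 0.28, "output": 1.12}


def savings(old: float, new: float) -> float:
    """Fractional price reduction when moving from `old` to `new`."""
    return 1.0 - new / old


input_savings = savings(HAIKU["input"], DEEPSEEK["input"])
output_savings = savings(HAIKU["output"], DEEPSEEK["output"])

print(f"input: {input_savings:.1%}, output: {output_savings:.1%}")
```

Under these assumptions the input side saves about 72% and the output side about 78%, so "~75%" reads as a blended estimate whose exact value depends on the input/output token mix of a session.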
feat(my-deepagent): v0.4 chat UX boost + A/B live verification
2026-05-18 02:02:19 +09:00 · 2026-05-18 01:08:40 +09:00 · 2026-05-18 00:38:46 +09:00 · 2026-05-18 00:24:24 +09:00 · 2026-05-18 00:03:08 +09:00 · 2026-05-17 21:11:19 +09:00
40 changed files with 6412 additions and 208 deletions
--- a/my-deepagent/CHANGELOG.md
+++ b/my-deepagent/CHANGELOG.md
@@ -2,6 +2,385 @@

 ## [Unreleased]

+### Added
+- **v0.4 chat UX boost + A/B live verification**: raises the chat experience
+  to Claude-Code parity and verifies 7 core flows against real OpenRouter.
+
+  **A — Live verification (`scripts/live_verify.py`, 7 PASS)**:
+  - A1 1-turn chat (CLI-eq) → Anthropic Haiku 4.5 replies in Korean
+  - A2 sessions resume → re-using the same session_id restores LangGraph thread state
+  - A3 `/skill <name>` system inject → SKILL.md ("Korean haiku, 3 lines") actually
+    controls LLM behavior (the output is exactly a 3-line Korean poem)
+  - A4 `/plan → /approve` → the LLM produces plan markdown only, no blocked-tool attempts
+  - A5 `/agents spawn` → real sub-agent ainvoke + result pushed to the parent stream
+  - A6 auto-compaction → 14 messages → 4 archived + a 77-token summary
+  - A7 `/workflow` wiring → role↔persona matching pre-validated
+  - Total cost: about \$0.02.
+
+  **B1 — Markdown rendering** in conversation.html:
+  - A pure-JS mini markdown parser in `app.js`: code fences, ATX headers, ul/ol,
+    and inline `code`/**bold**/*italic*/[link](url) are supported.
+  - XSS policy preserved: only `createElement + textContent` is used; `innerHTML`
+    is forbidden.  Link hrefs are restricted to the `http(s):` schemes.
+  - `style.css` gains `.md-p`, `.md-h`, `.md-ul`, `.md-ol`, `.md-code` and related
+    styles.  Code and links stay readable inside the user bubble (brown background).
+
+  **B2 — System event card** (collapsible):
+  - `_classifySystemMessage` inspects the prefix of the system content and
+    classifies it as "Sub-agent result / Workflow started / Compaction summary /
+    Plan mode activated / Approved plan / Skill activated" etc. → rendered as a
+    `<details>` card, so the chat thread is not flooded with event messages.
+
+  **B3 — Token streaming via AsyncCallbackHandler**:
+  - `ChatOpenAI(streaming=True)` in `session.py:resolve_model_instance`.
+  - `api/agent_runner._StreamingChunkPusher` (AsyncCallbackHandler) pushes
+    `{"type":"delta","text":...}` onto an `asyncio.Queue` on every
+    `on_llm_new_token`.
+  - `api/routes/sessions._session_event_stream` drains the queue and emits SSE
+    `event: chunk`.  Poll interval 100ms.  Ordering guarantee: chunks are
+    drained first, then message rows are yielded (so tokens visibly flow before
+    the placeholder is replaced by the message).
+  - The frontend `appendStreamDelta` in `app.js` accumulates chunks into the
+    placeholder; the final `message` SSE swaps it for a markdown-rendered bubble.
+  - Live verify: 5 chunk events + 1 final message; the OpenRouter Haiku reply
+    "안녕하세요, / 무 / 엇을 도와드 / 릴까요?" was confirmed to be pushed token by token.
+
+  **B4 — Cancel mid-turn** (`POST /api/sessions/{id}/abort`):
+  - `app.state.pending_per_session: dict[session_id, Task]` index +
+    `_remove_from_session_map` done-callback.
+  - A new user message automatically cancels the previous in-flight task (Claude Code parity).
+  - Frontend "**■ 중단**" (Stop) button at the bottom right: visible while waiting, hidden on completion/cancel.
+
+  **B5 — IME composition-safe Enter**:
+  - A `compositionstart` / `compositionend` flag keeps Enter from sending while a
+    Korean/Japanese/Chinese IME uses it to commit a candidate.  Only plain
+    Enter is ignored; Cmd/Ctrl+Enter takes precedence and still sends.
+
+  **DB hot-fix** (found during the v0.4 chat UX round):
+  - `pool_pre_ping=True` in `Database.__init__`: fixes the Postgres asyncpg pool
+    handing out stale connections after an idle period or network blip (this
+    caused 500s under the SSE 0.5s poll load).
+
+  **New tests** (invoke signature kept in sync + existing integration tests preserved):
+  - The `fake_invoke` stub in `tests/integration/test_conversation_gui.py` is
+    updated to also accept the `chunk_queue` kwarg.
+  - Full regression: 709 passed (no new failures).
+
+### Added
+- **v0.4: Workflow generator UI + hot-reload + UX polish**.  Users can create a
+  new workflow template in the browser and run it immediately, without writing
+  YAML by hand.  The nav, copy, and empty states of the main page / new.html /
+  runs.html / new-workflow.html are tidied up at the same time.
+  - **A: `/new.html` UX patch** (HTML/CSS only):
+    - Title "새 Run 시작" (Start new Run) → "워크플로우 실행 (고급 기능)" (Run workflow, advanced).
+    - Top `info-box`: a "free-form chat does not live here → main page" notice plus
+      a "+ 템플릿 만들기" (create template) link.
+    - Every field gets a one-line hint (e.g. "repo absolute path: the git repository to work in").
+    - The persona override is collapsed in a `<details>` so first-time users are
+      not overwhelmed.
+  - **B: nav reordering** (`/`, `/runs.html`, `/new.html`, `/run.html`,
+    `/conversation.html`):
+    - "대화" (Chat) is `nav-primary` (large font + dark color).
+    - "Runs" / "워크플로우 실행" / "+ 템플릿 만들기" are `nav-secondary`
+      (small font + 65% opacity, 100% on hover).
+  - **C: main page notice** + CSS:
+    - Adds an `info-box` to the main `/` ("👋 my-deepagent: a Claude Code style
+      multi-turn agent running on cost-effective OpenRouter models").
+    - `style.css` gains new styles: `.info-box`, `.nav-primary`/`.nav-secondary`,
+      `.wf-row-card`, `.wf-chip`, and others.
+  - **D: Workflow hot-reload**:
+    - `get_workflows` in `api/deps.py` computes `_workflow_dir_signature(config)`
+      (a tuple of the seed + user directory mtimes) on every request and re-calls
+      `load_combined_workflows` when it differs from the cached signature.
+      No file watcher / inotify: stat alone suffices (the directories are small).
+    - The lifespan's `_load_seed_workflows` is replaced by `_load_workflows_combined`
+      so the user dir is also loaded automatically on first boot.
+  - **E: Workflow generator UI**:
+    - **API**: new `POST /api/workflows`.  Body = `CreateWorkflowRequest`
+      pydantic (name / version / description / roles / phases / default_gates /
+      max_total_budget_usd).  Strict validation via `WorkflowTemplate.model_validate`
+      → 422 on failure (flattened to loc:msg format).  On success the YAML is
+      saved to `<data_dir>/workflows/<name>@<version>.yaml` (`yaml.safe_dump
+      allow_unicode=True, sort_keys=False`).  A duplicate (name, version) is 409.
+    - **HTML**: `static/new-workflow.html` (new).  Basic info → Roles →
+      Phases → YAML preview → save button.
+    - **JS**: adds `bootstrapWorkflowGenerator` and `WF_STATE` to `app.js`.
+      Per-role capabilities toggle on click as chips; the phase role select is
+      generated dynamically from the current role list.  Live YAML preview.  XSS
+      policy preserved (all user input goes through textContent).
+  - **New tests** (`tests/integration/test_workflow_generator.py`, 7 cases):
+    - `/new-workflow.html` 200 + markup
+    - POST happy path → yaml file persisted + path / name / version verified
+    - POST roles=[] → 422
+    - POST with nonexistent phase.role → 422 + message includes the role id
+    - POST duplicate (name, version) → 409
+    - GET hot-reload: after POST the new entry appears in the GET response
+    - GET hot-reload: a YAML file dropped in externally is also detected via mtime
+
+### Changed
+- **v0.3 plan-conformance fixes**: after the first implementation was compared
+  against plan-v0.3, the 18 omissions / spec violations found were fixed.  All 3
+  self-review rounds (omissions·incomplete / errors·edge cases / over-optimization) PASS.
+
+  - **PR #5 (plan mode), 3 items**:
+    - Add `write_todos` to `PlanModeMiddleware.BLOCKED_TOOLS_IN_PLAN_MODE`
+      (per the plan spec: todos are part of the plan markdown, blocked until /approve).
+    - On `/plan` entry, inject a system message (`_PLAN_MODE_SYSTEM_PROMPT`:
+      "당신은 plan mode입니다 …"). The new `InteractiveSession.enter_plan_mode()`
+      handles the flag toggle + pending system message queue + thread bump in one step.
+    - On `/approve`, extract the last assistant message (= plan markdown) and
+      inject it into the next turn as an "approved plan" system message. New
+      `InteractiveSession.approve_plan() / reject_plan()`.
+    - New infrastructure: `InteractiveSession._pending_system_messages: list[str]` +
+      `queue_system_message()` / `consume_pending_system_messages()`;
+      `_invoke_and_stream` prepends them at the start of every turn (MessageRow persistence included).
+
+  - **PR #2 (compaction), 1 item**:
+    - Add the `CompactionResult.summary_text` field.  The summary from automatic/manual
+      compaction is injected into the first ainvoke of the next thread via
+      `sess.queue_system_message()`, satisfying the plan spec ("the new thread
+      holds system + 1 summary + the latest K messages").
+
+  - **PR #3 (auto-memory), 6 items**:
+    - `global_memory_dir(config)` + `<data_dir>/global/memory/` bootstrap.
+      `list_memory_paths` passes both global + project to deepagents memory=.
+    - Frontmatter `name / description / type` formally introduced (`type` ∈ user /
+      feedback / project / reference).  `MemoryType` Literal + `MemoryEntry`
+      dataclass + `read_entry` / `read_index_entries`.
+    - `_infer_memory_type`: keyword-based deterministic classifier (no LLM).
+      Can be overridden with `/remember --type=feedback <text>`.
+    - `_scrub_secrets`: regex-matches OpenRouter/Anthropic/OpenAI/AWS/Bearer
+      tokens and replaces them with `<redacted:...>`.  `WriteResult.scrubbed` flag +
+      REPL warning.
+    - `/memory show <name>` slash subcommand: prints the body + (type, scope).
+    - `/remember [--global]` / `/forget [--global]`: explicit toggle between the two scopes.
+
+  - **PR #4 (skills), 3 items**:
+    - `project_skills_dir(config, project_key)`: `<data_dir>/projects/<key>/
+      skills/`.  `resolve_skill_sources(config, project_key)` returns both the
+      global + project paths → project overrides global (deepagents
+      "later-wins").
+    - `list_all_skills(config, project_key)` + `find_skill(config,
+      project_key, name)`: merges the two scopes and shows a scope label.
+    - `/skill <name>` behavior corrected: the body used to be printed to the REPL
+      only and is now injected (queued) as a system message, satisfying the plan
+      spec "body inject (for this whole turn)".  A separate `/skills show <name>`
+      subcommand is added (no inject, inspection only).
+
+  - **PR #6 (sub-agent), 4 items**:
+    - `budget.py`: add the `session:<uuid>` scope.  `_scopes_for` takes
+      `session_id` and accumulates into the ledger.  `CostMiddleware` passes
+      `interactive_session_id` automatically.  Satisfies the plan spec "a
+      sub-agent counts against the root session's limit".
+    - `subagents.resolve_root_session_id`: walks up the `parent_session_id` chain
+      (cycle guard).  The sub-agent CostMiddleware charges the ROOT session.
+    - `subagents.run_subagent_to_completion`: real ainvoke + pushes the result
+      summary to the parent session as a `[sub-agent <id> result]` system message +
+      automatically marks the sub-session `ended`.
+    - `/agents` slash subcommand structure (list / spawn / show): the former plain
+      list + separate /spawn are restructured per the plan spec.  On spawn, a
+      `[sub-agent <id> spawned]` system message is auto-inserted into the parent session.
+
+  - **PR #7 (governance), 1 item**:
+    - `governance.bootstrap_user_dirs(config)`: idempotently bootstraps the global
+      MYDEEPAGENT.md + `global/memory/MEMORY.md` + `skills/` + `projects/` in a
+      single call.  Called from `InteractiveSession.__init__` (previously only
+      `ensure_global_instructions_initialized` was called, which left the global
+      memory unprepared).
+
+  - **PR #8 (Web GUI), 1 item**:
+    - `static/index.html` is reworked from the workflow runs page into the
+      **session list page**.  `static/runs.html` is new (archive home of the old
+      runs list).  Nav links: 대화 (Chat) → / / Runs (archive) → /runs.html / 새 Workflow Run →
+      /new.html.  Adds a `renderSessionsList` handler + `data-page="runs"` routing
+      to `app.js`.  The conversation page supports `?session=<id>` query deep links.
+
+  - **PR #9 (workflow), 2 items**:
+    - `/workflow <name> [--repo=<path>] [--base=<branch>]` now actually launches a
+      background `WorkflowEngine.run` (`_run_workflow_background`).  Progress
+      (started / ended / failed + final_report_path) accumulates in the main session
+      as `MessageRow(role="system")` → automatically pushed to the GUI over SSE.
+    - `/binding show <workflow-name[@version]>` accepts an argument: previews the
+      role → eligible persona matching for a specific workflow (no execution).
+
+  - **New/updated tests** (+17 cases, 685 → **702 passed**):
+    - test_plan_mode: write_todos blocking + stronger blocklist sanity
+    - test_memory: scrub redaction + frontmatter `type` inference + explicit
+      override, 3 cases
+    - test_skills: project-scope override + find_skill resolution +
+      resolve_skill_sources(project_key), 4 cases
+    - test_subagents: resolve_root_session_id chain walk + missing fallback, 2
+      cases
+    - test_budget: session: scope accumulation + empty ledger when session_id
+      is not passed, 2 cases
+    - test_instructions: governance bootstrap full-skeleton + idempotency, 2
+      cases
+    - test_api_static: runs.html added + index.html reworked, 2 cases
+
+### Added
+- **v0.3 PR #9: Workflow made optional + user directory wiring**.  The workflow
+  engine is demoted from the main path to an "option" (active only when the user
+  explicitly calls `/workflow <name>`).  In return, users can register their own
+  personas and workflows by dropping YAML files into `<data_dir>/personas/` and `<data_dir>/workflows/`.
+  - `user_dirs.py` (new):
+    - `user_personas_dir(config)`, `user_workflows_dir(config)`: path helpers.
+    - `ensure_user_dirs_initialized(config)`: `mkdir -p`, idempotent.
+    - `load_combined_personas(config, seed_dir)`: merges seed (strict) + user
+      (best-effort per-file skip on malformed).  Dedupe key
+      `(name, version)`, user-overrides-seed.  A single broken user YAML cannot
+      kill the REPL.
+    - `load_combined_workflows(config, seed_dir)`: same for workflows.
+  - `cli/interactive.py`:
+    - `InteractiveSession(..., workflows=...)` signature extended: the session
+      remembers the loaded workflow list.
+    - `_interactive_loop_async` calls `ensure_user_dirs_initialized` and uses
+      `load_combined_personas` / `load_combined_workflows`.
+    - 4 new slash commands:
+      - `/personas`: lists all loaded personas (marks the active one)
+      - `/workflows`: lists all loaded workflow templates (phase/role counts, filename)
+      - `/workflow <name[@version]>`: tells the user to proceed with the `mydeepagent run`
+        command (the real background invoke is a separate PR; for now guidance only)
+      - `/binding show`: shows the required_capabilities per role of each workflow
+  - `tests/integration/test_user_dirs.py` (new, 10 cases):
+    - bootstrap idempotency
+    - seed-only / seed+user / user-overrides-seed / malformed-user-skip (persona)
+    - the same 4 for workflows
+    - empty user directory handling
+
+### Added
+- **v0.3 PR #8 — Conversation-centric Web GUI (`/conversation.html`)**.
+  The workflow run page is demoted to an archive; the first screen users see is a
+  chat-style conversation thread.  Same usability as the Claude Code Web GUI.
+  - `static/conversation.html` (new): session picker + message thread +
+    input box.  data-page="conversation".
+  - `static/app.js`:
+    - Adds the new page handler `bootstrapConversationPage`.
+    - `loadSessionList()` → GET /api/sessions, fills the picker.
+    - `loadAndAttachSession(sid)` → GET /api/sessions/{id}, renders the message
+      thread, starts the SSE subscription.
+    - `attachEventSource` → handles the existing SSE message/done events.  Sending a
+      new user message shows a `pending` bubble, replaced when the assistant message arrives.
+    - `createNewSession` → POST /api/sessions with the default-interactive persona.
+    - Same XSS policy: all user content uses `textContent` only.
+  - `static/style.css`: adds chat UI styles such as `.messages-thread`,
+    `.msg-bubble`, `.conv-topbar`, `.conv-input-bar`.
+  - `api/app.py`:
+    - The lifespan opens the LangGraph saver once via `from_conn_string` and
+      keeps it on `app.state.saver` (Postgres only; None for SQLite tests).
+      Background invokes reuse it.  `__aexit__` is called on shutdown.
+  - `api/agent_runner.py` (new):
+    - `invoke_session_agent(db, config, personas, session_id, user_message, saver=...)`:
+      load session row → resolve persona → bootstrap directories (memory / skills /
+      MYDEEPAGENT.md) → build the middleware stack (plan-mode + cost + audit) →
+      `build_agent` → `ainvoke` → persist assistant MessageRow → auto compaction.
+    - Every failure is logged and returned (never raised): the HTTP response is
+      already 200 and SSE shows the progress state.
+  - `api/routes/sessions.py`:
+    - `POST /api/sessions/{id}/messages` persists the user row, then launches the
+      background invoke with `asyncio.create_task(invoke_session_agent(...))`.
+      The task ref is kept in the `app.state.pending_invocations` set (RUF006 + avoids
+      GC drop) and `discard`ed on completion.
+  - `tests/integration/test_conversation_gui.py` (new, 4 cases):
+    - GET /conversation.html → 200 + required markup
+    - POST /messages → 200 + user row persisted + background invoke called
+    - the background task ref is held in `app.state.pending_invocations` and
+      automatically discarded after completion
+    - the stub runner persists an assistant row → user + assistant sequence verified
+
+### Added
+- **v0.3 PR #7: MYDEEPAGENT.md instruction-file hierarchy**.  Equivalent to Claude
+  Code's global/project CLAUDE.md layering.  At session start the following two
+  files are loaded automatically and injected into the system prompt together:
+  - **Global** : `<config.data_dir>/MYDEEPAGENT.md`, template auto-created at boot
+  - **Project** : `<repo>/MYDEEPAGENT.md`, loaded only if it exists.  Never
+    auto-created inside the user's repo (avoids invasive behavior).
+  They are injected *before* Memory / MEMORY.md / individual entries, so under the
+  deepagents `MemoryMiddleware` "later overrides earlier" rule, more specific
+  context can override the general instructions.
+  - `instructions.py` (new):
+    - `global_instructions_path(config)`, `project_instructions_path(repo_root)`
+    - `ensure_global_instructions_initialized(config)`: creates the global template
+      once, idempotent.  Seeds a Korean-default collaboration and code style guide.
+    - `resolve_instruction_paths(config, repo_root)`: returns only existing files as
+      absolute paths, in global→project order.
+  - `cli/interactive.py`:
+    - `InteractiveSession.__init__` calls
+      `ensure_global_instructions_initialized`.
+    - `build_agent_if_needed` builds memory_paths_override in the order
+      `[*instruction_paths, *memory_paths]`.
+  - `tests/integration/test_instructions.py` (new, 6 cases):
+    - global bootstrap + idempotency (manual edits preserved)
+    - the project file is never auto-created
+    - `resolve_instruction_paths` return order verified with 0/1/2 files present
+    - the global path lives under `data_dir`
+    - **integration**: `build_agent` passes the combined [instructions, memory] list
+      unchanged to `create_deep_agent(memory=...)`
+
+### Added
+- **v0.3 PR #6 — Sub-agent session linkage (`/agents` / `/spawn <persona>`)**.
+  Separate from Claude Code's sub-agent (task tool): **persisted** session forking
+  unique to my-deepagent.  Unlike the deepagents `task` tool, which spawns
+  langchain-internally inside the parent session's thread context, a child created
+  here has its own `InteractiveSessionRow` and can be resumed separately with
+  `mydeepagent --session <id>` or browsed in the Web GUI tree.  It inherits the parent's
+  `project_key`, so memory and skills directories are shared.  depth limit = `MAX_SUBAGENT_DEPTH = 3`.
+  - `subagents.py` (new):
+    - `spawn_subagent_session(db, parent_session_id, persona, initial_title)`,
+      a single transactional unit:
+      (1) check the parent exists and `state == "active"`
+      (2) `depth = parent.depth + 1`; when exceeded, `MyDeepAgentError(human_required,
+          "subagent_depth_exceeded")`
+      (3) `AgentPersonaRow` upsert (reused when `compute_hash` matches)
+      (4) inherit the parent's `project_key` as-is + set `parent_session_id`, `depth`
+      → returns the new `child_id`.
+    - `list_subagents(db, parent_session_id)`: returns direct children only (ordered
+      by `started_at`).  Grandchildren are excluded (the caller walks the tree).
+  - `cli/interactive.py`:
+    - `_register_subagent_slash`: registers `/agents` (direct child list) and
+      `/spawn <persona>` (creates a child + prints a resume hint).
+  - `tests/integration/test_subagents.py` (new, 8 cases):
+    - Happy path: child row created + `parent_session_id`/`depth=1`/`project_key`
+      inheritance verified
+    - 2 children under the same parent → both depth=1
+    - Depth chain: 3 spawns → the 4th is rejected (`subagent_depth_exceeded`)
+    - nonexistent parent → `parent_session_missing`
+    - parent state="ended" → `parent_session_ended`
+    - `list_subagents`: direct only, no grandchild
+    - empty parent → empty list
+    - same persona hash → same `persona_id` reused
+
+### Added
+- **v0.3 PR #5: Plan mode (`/plan` / `/approve` / `/reject`)**.  Equivalent to
+  Claude Code's plan mode.  On `/plan` entry the `write_file` / `edit_file` /
+  `execute` / `bash` / `task` (sub-agent) tools are blocked and only `read_file` /
+  `glob` / `grep` / `ls` / `write_todos` are allowed.  When the LLM calls a blocked
+  tool it receives `ToolMessage(status="error")` and is steered to refine the plan
+  only.  `/approve` allows writes; `/reject` resets the thread and allows writes.
+  - `middleware/plan_mode.py` (new):
+    - `PlanModeMiddleware(is_active: Callable[[], bool])`: in `awrap_tool_call` /
+      `wrap_tool_call`, when plan_mode is active and the tool is blocked, returns a
+      synthetic `ToolMessage(status="error", content=...)`.  Does not raise
+      (so the LLM can switch to another tool without looping forever).
+    - `BLOCKED_TOOLS_IN_PLAN_MODE` constant: write_file / edit_file / bash /
+      execute / run_command / shell / task.  Safe tools such as read_file and
+      write_todos are whitelisted.
+  - `cli/interactive.py`:
+    - `InteractiveSession._plan_mode: bool`.  `set_plan_mode(enabled)` async →
+      flag toggle + thread_suffix bump + `InteractiveSessionRow.plan_mode` persisted
+      (the column was already added in PR #1).  Restored from row.plan_mode on resume.
+    - `build_agent_if_needed` inserts `PlanModeMiddleware(is_active=lambda: ...)`
+      first in the middleware list; the closure reads self._plan_mode, so no
+      agent rebuild is needed after a slash toggle.
+    - `_register_plan_mode_slash`: registers `/plan`, `/approve`, `/reject`.
+  - `tests/integration/test_plan_mode.py` (new, 9 cases):
+    - inactive → all tools pass through
+    - active → write_file / execute / task blocked (status=error, tool_call_id
+      preserved, message contains the tool name + "Plan-mode")
+    - active → read_file / glob / grep / ls / write_todos allowed
+    - behavior changes via closure toggle (no rebuild)
+    - the sync wrap_tool_call behaves the same
+    - BLOCKED_TOOLS_IN_PLAN_MODE constant sanity
+
 ### Added
 - **v0.3 PR #4 — Agent Skills (LLM-routing, no embeddings)**.  Anthropic Agent
   Skills spec followed as-is (progressive-disclosure pattern).  deepagents
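
The stat-based workflow hot-reload described in the CHANGELOG hunk above can be sketched roughly as follows. This is an illustration only, not the project's `api/deps.py`: `dir_signature`, `ReloadingCache`, and `load_names` are hypothetical names, and the sketch uses per-file mtimes rather than the directory mtimes the changelog mentions (a directory's mtime typically changes when entries are added or removed, but not when a file is edited in place).

```python
from pathlib import Path
from typing import Callable, Optional

# Hypothetical type alias: one (path, mtime_ns) pair per YAML file.
Signature = tuple[tuple[str, int], ...]


def dir_signature(*dirs: Path) -> Signature:
    """Return (path, mtime_ns) for every *.yaml file in the given directories."""
    entries: list[tuple[str, int]] = []
    for d in dirs:
        if not d.is_dir():
            continue  # a missing user directory simply contributes nothing
        for p in sorted(d.glob("*.yaml")):
            entries.append((str(p), p.stat().st_mtime_ns))
    return tuple(entries)


class ReloadingCache:
    """Re-run `loader` only when the directory signature changes."""

    def __init__(
        self, dirs: list[Path], loader: Callable[[list[Path]], list[str]]
    ) -> None:
        self._dirs = dirs
        self._loader = loader
        self._sig: Optional[Signature] = None  # None forces the first load
        self._value: list[str] = []

    def get(self) -> list[str]:
        sig = dir_signature(*self._dirs)
        if sig != self._sig:
            self._value = self._loader(self._dirs)
            self._sig = sig
        return self._value


def load_names(dirs: list[Path]) -> list[str]:
    """Stand-in loader: returns file stems instead of parsing YAML."""
    return sorted(p.stem for d in dirs if d.is_dir() for p in d.glob("*.yaml"))
```

Calling `ReloadingCache([seed_dir, user_dir], load_names).get()` on every request costs one `stat` per file, which is consistent with the changelog's point that no file watcher is needed while the directories stay small.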
--- a/my-deepagent/docs/schemas/personas/default-interactive@1.yaml
+++ b/my-deepagent/docs/schemas/personas/default-interactive@1.yaml
@@ -2,8 +2,8 @@ name: default-interactive
 version: 1
 description: "interactive 모드 만능 어시스턴트. 탐색·수정·실행 모두 지원."
 backend: openrouter
-model: "openrouter:anthropic/claude-haiku-4-5"
-provider_origin: "US/Anthropic"
+model: "openrouter:deepseek/deepseek-chat"
+provider_origin: "CN/DeepSeek"
 capabilities:
  - spec_write
  - code_edit
@@ -42,7 +42,7 @@ allowed_tools:
  - write_todos
  - task
 deepagents_backend: local_shell
-fallback_model: "openrouter:deepseek/deepseek-chat"
+fallback_model: "openrouter:anthropic/claude-haiku-4-5"
 max_cost_per_call_usd: 0.05
 model_params:
  max_tokens: 2048
--- a/my-deepagent/scripts/live_verify.py
+++ b/my-deepagent/scripts/live_verify.py
@@ -0,0 +1,458 @@
+"""v0.4 live verification — runs 7 Claude-Code-equivalent flows against real
+OpenRouter.  Run with::
+
+    uv run python scripts/live_verify.py
+
+Each scenario prints PASS / FAIL with a short summary.  Total cost should be
+under $0.10 (we use Anthropic Haiku 4.5 via OpenRouter, single-turn responses).
+
+Scenarios:
+1. CLI-equivalent 1-turn chat (InteractiveSession + ainvoke direct)
+2. Sessions resume (same session_id, thread state restored)
+3. /skill <name> queues SKILL.md body as system message → LLM acknowledges
+4. /plan → LLM produces plan markdown only (no writes) → /approve queues
+5. /agents spawn → sub-agent runs to completion → result pushed to parent
+6. Auto-compaction trigger (manually invoke when row.total_*_tokens > 70%)
+7. /workflow background (kick off real WorkflowEngine.run via background task)
+
+Failures don't crash subsequent scenarios — we accumulate results and exit 0
+only if all PASS.
+"""
+
+from __future__ import annotations
+
+import asyncio
+import os
+import sys
+import uuid
+from datetime import UTC, datetime
+from pathlib import Path
+from typing import Any
+
+# Ensure repo paths import-correctly when run via `uv run python …`
+sys.path.insert(0, str(Path(__file__).resolve().parents[1] / "src"))
+
+from sqlalchemy import select
+
+from my_deepagent.cli.interactive import (
+    InteractiveSession,
+    _invoke_and_stream,
+)
+from my_deepagent.compaction import compact_session
+from my_deepagent.config import load_config
+from my_deepagent.governance import bootstrap_user_dirs, record_consent
+from my_deepagent.hash import sha256
+from my_deepagent.persistence.checkpointer import get_checkpointer_ctx
+from my_deepagent.persistence.db import Database
+from my_deepagent.persistence.models import InteractiveSessionRow, MessageRow
+from my_deepagent.subagents import run_subagent_to_completion, spawn_subagent_session
+from my_deepagent.user_dirs import (
+    ensure_user_dirs_initialized,
+    load_combined_personas,
+    load_combined_workflows,
+)
+
+_SEED = Path(__file__).resolve().parents[1] / "docs" / "schemas"
+_RESULTS: list[tuple[str, bool, str]] = []
+
+
+def _now() -> str:
+    return datetime.now(UTC).isoformat(timespec="seconds")
+
+
+def _record(name: str, ok: bool, note: str) -> None:
+    _RESULTS.append((name, ok, note))
+    marker = "✅ PASS" if ok else "❌ FAIL"
+    print(f"  {marker} — {name}: {note}", flush=True)
+
+
+def _pricing() -> Any:
+    from my_deepagent.monitoring.pricing import ModelPrice, PricingCache
+
+    pc = PricingCache()
+    pc.set(
+        [
+            ModelPrice("anthropic/claude-haiku-4-5", 0.001, 0.005, 200_000),
+            ModelPrice("deepseek/deepseek-chat", 0.00028, 0.00112, 64_000),
+        ]
+    )
+    return pc
+
+
+async def _mk_session(
+    db: Database, config: Any, personas: Any, saver: Any, session_id: uuid.UUID
+) -> InteractiveSession:
+    """Persist a fresh InteractiveSessionRow + return the in-mem InteractiveSession."""
+    from uuid import uuid4
+
+    from my_deepagent.persistence.models import AgentPersonaRow
+
+    persona = next((p for p in personas if p.name == "default-interactive"), personas[0])
+    project_key = sha256(str(Path.cwd().resolve()))[:16]
+
+    async with db.session() as s:
+        ph = persona.compute_hash()
+        existing_pr = (
+            await s.execute(select(AgentPersonaRow).where(AgentPersonaRow.hash == ph))
+        ).scalar_one_or_none()
+        if existing_pr is None:
+            existing_pr = AgentPersonaRow(
+                id=str(uuid4()),
+                name=persona.name,
+                version=persona.version,
+                hash=ph,
+                definition=persona.model_dump(by_alias=True),
+                created_at=_now(),
+            )
+            s.add(existing_pr)
+            await s.flush()
+        existing_row = await s.get(InteractiveSessionRow, str(session_id))
+        if existing_row is None:
+            s.add(
+                InteractiveSessionRow(
+                    id=str(session_id),
+                    persona_id=existing_pr.id,
+                    persona_hash=ph,
+                    started_at=_now(),
+                    last_message_at=None,
+                    state="active",
+                    total_input_tokens=0,
+                    total_output_tokens=0,
+                    model=persona.model,
+                    project_key=project_key,
+                    title=None,
+                    plan_mode=False,
+                    parent_session_id=None,
+                    depth=0,
+                )
+            )
+            await s.commit()
+
+    return InteractiveSession(
+        config,
+        personas,
+        db,
+        _pricing(),
+        Path.cwd(),
+        session_id,
+        saver,
+        project_key,
+        workflows=load_combined_workflows(config, _SEED / "workflows"),
+    )
+
+
+async def scenario_1_basic_chat(db: Database, config: Any, personas: Any, saver: Any) -> uuid.UUID:
+    """1-turn message + assistant response persisted + token counters bumped."""
+    print("\n[A1] CLI-equivalent 1-turn chat")
+    sid = uuid.uuid4()
+    sess = await _mk_session(db, config, personas, saver, sid)
+    agent = sess.build_agent_if_needed()
+    await _invoke_and_stream(agent, "한국어로 한 줄로만 인사해 (10단어 이내)", sess)
+    async with db.session() as s:
+        msgs = (
+            (
+                await s.execute(
+                    select(MessageRow)
+                    .where(MessageRow.session_id == str(sid))
+                    .order_by(MessageRow.seq)
+                )
+            )
+            .scalars()
+            .all()
+        )
+        row = await s.get(InteractiveSessionRow, str(sid))
+    ok = (
+        len(msgs) == 2
+        and msgs[0].role == "user"
+        and msgs[1].role == "assistant"
+        and bool(msgs[1].content.strip())
+        and row is not None
+        and row.total_output_tokens > 0
+    )
+    summary = f"messages={len(msgs)} out_tokens={row.total_output_tokens if row else 0}"
+    _record("A1 basic chat", ok, summary)
+    return sid
+
+
+async def scenario_2_resume(
+    db: Database, config: Any, personas: Any, saver: Any, sid: uuid.UUID
+) -> None:
+    """Same session_id → second InteractiveSession picks up persisted state."""
+    print("\n[A2] Sessions resume")
+    sess2 = await _mk_session(db, config, personas, saver, sid)
+    agent = sess2.build_agent_if_needed()
+    await _invoke_and_stream(agent, "내가 방금 너한테 한 첫 메시지가 뭐였지? 한 줄로만.", sess2)
+    async with db.session() as s:
+        msgs = (
+            (
+                await s.execute(
+                    select(MessageRow)
+                    .where(MessageRow.session_id == str(sid))
+                    .where(MessageRow.archived.is_(False))
+                    .order_by(MessageRow.seq)
+                )
+            )
+            .scalars()
+            .all()
+        )
+    last_assistant = msgs[-1].content if msgs else ""
+    ok = bool(last_assistant) and (
+        "인사" in last_assistant or "한국" in last_assistant or "안녕" in last_assistant
+    )
+    _record("A2 resume", ok, f"messages={len(msgs)} last_hint='{last_assistant[:60]}'")
+
+
+async def scenario_3_skill(db: Database, config: Any, personas: Any, saver: Any) -> None:
+    """Drop a SKILL.md, /skill queues body, next turn LLM acknowledges it."""
+    print("\n[A3] /skill <name> system-inject")
+    from my_deepagent.skills import ensure_skills_initialized, find_skill, user_skills_dir
+
+    sd = user_skills_dir(config)
+    ensure_skills_initialized(sd)
+    skill_dir = sd / "korean-haiku"
+    skill_dir.mkdir(parents=True, exist_ok=True)
+    (skill_dir / "SKILL.md").write_text(
+        """---
+name: korean-haiku
+description: Respond as a korean haiku poet — always 3 short lines, only Korean.
+---
+
+You are now a Korean haiku poet.  Every response MUST be exactly 3 lines, all
+in Korean, total under 30 chars.  No prose, no explanation.
+""",
+        encoding="utf-8",
+    )
+    sid = uuid.uuid4()
+    sess = await _mk_session(db, config, personas, saver, sid)
+    skill = find_skill(config, sess.project_key, "korean-haiku")
+    assert skill is not None, "skill not loaded"
+    body = skill.path.read_text(encoding="utf-8")
+    sess.queue_system_message(
+        f"The user requested skill `{skill.name}`. Apply this SKILL.md for this turn:\n\n{body}"
+    )
+    agent = sess.build_agent_if_needed()
+    await _invoke_and_stream(agent, "봄을 주제로 시 한 편 써줘.", sess)
+    async with db.session() as s:
+        msgs = (
+            (
+                await s.execute(
+                    select(MessageRow)
+                    .where(MessageRow.session_id == str(sid))
+                    .where(MessageRow.role == "assistant")
+                    .order_by(MessageRow.seq.desc())
+                )
+            )
+            .scalars()
+            .all()
+        )
+    assistant = msgs[0].content if msgs else ""
+    line_count = len([line for line in assistant.split("\n") if line.strip()])
+    ok = 2 <= line_count <= 6  # 3 ± slack
+    _record("A3 skill inject", ok, f"lines={line_count} body[:60]='{assistant[:60]}'")
+
+
+async def scenario_4_plan_mode(db: Database, config: Any, personas: Any, saver: Any) -> None:
+    """/plan blocks write tools → LLM produces plan markdown.  /approve queues
+    the plan as system message for next turn."""
+    print("\n[A4] /plan → plan markdown → /approve")
+    sid = uuid.uuid4()
+    sess = await _mk_session(db, config, personas, saver, sid)
+    await sess.enter_plan_mode()
+    agent = sess.build_agent_if_needed()
+    await _invoke_and_stream(
+        agent,
+        "Python으로 wordcount CLI를 만들 plan 을 마크다운으로 짧게 (10줄 이내) 답해.",
+        sess,
+    )
+    # Verify last assistant is plan markdown shape
+    async with db.session() as s:
+        msgs = (
+            (
+                await s.execute(
+                    select(MessageRow)
+                    .where(MessageRow.session_id == str(sid))
+                    .where(MessageRow.role == "assistant")
+                    .order_by(MessageRow.seq.desc())
+                )
+            )
+            .scalars()
+            .all()
+        )
+    plan_text = msgs[0].content if msgs else ""
+    has_markdown_hint = any(
+        token in plan_text for token in ("##", "###", "- ", "1.", "Phase", "단계")
+    )
+    ok_plan = bool(plan_text) and has_markdown_hint
+
+    await sess.approve_plan()
+    queued = sess.consume_pending_system_messages()
+    ok_approve = any("APPROVED" in q and plan_text[:20] in q for q in queued)
+    # Re-queue so future scenarios see clean state
+    for q in queued:
+        sess.queue_system_message(q)
+    sess.consume_pending_system_messages()  # discard now
+    _record(
+        "A4 plan mode",
+        ok_plan and ok_approve,
+        f"markdown={ok_plan} approve_queued={ok_approve} plan[:50]='{plan_text[:50]}'",
+    )
+
+
+async def scenario_5_subagent(db: Database, config: Any, personas: Any, saver: Any) -> None:
+    """spawn_subagent_session + run_subagent_to_completion → result on parent."""
+    print("\n[A5] /agents spawn live")
+    parent_sid = uuid.uuid4()
+    sess = await _mk_session(db, config, personas, saver, parent_sid)
+    persona = sess.persona
+    child_id = await spawn_subagent_session(
+        db,
+        parent_session_id=parent_sid,
+        persona=persona,
+        initial_title="haiku helper",
+    )
+    summary = await run_subagent_to_completion(
+        db, config, parent_sid, child_id, persona, "한국어로 짧게 인사해.", saver=None
+    )
+    async with db.session() as s:
+        parent_msgs = (
+            (
+                await s.execute(
+                    select(MessageRow)
+                    .where(MessageRow.session_id == str(parent_sid))
+                    .order_by(MessageRow.seq)
+                )
+            )
+            .scalars()
+            .all()
+        )
+        child_row = await s.get(InteractiveSessionRow, str(child_id))
+    pushed = any(f"sub-agent {str(child_id)[:8]} result" in m.content for m in parent_msgs)
+    ok = bool(summary) and pushed and child_row is not None and child_row.state == "ended"
+    state = child_row.state if child_row else "NONE"
+    _record(
+        "A5 sub-agent",
+        ok,
+        f"summary[:40]='{summary[:40]}' parent_push={pushed} child_ended={state}",
+    )
+
+
+async def scenario_6_compaction(db: Database, config: Any, personas: Any, saver: Any) -> None:
+    """Manually invoke compact_session on a session padded with enough messages."""
+    print("\n[A6] Auto-compaction trigger")
+    sid = uuid.uuid4()
+    await _mk_session(db, config, personas, saver, sid)
+    # Pad 14 active messages so compactor archives 4 + summary at seq=1.
+    async with db.session() as s:
+        for i in range(14):
+            s.add(
+                MessageRow(
+                    session_id=str(sid),
+                    seq=i + 1,
+                    role="user" if i % 2 == 0 else "assistant",
+                    content=f"padding message #{i} — talking about wordcount CLI design",
+                    tool_calls=None,
+                    token_count=10,
+                    is_summary=False,
+                    archived=False,
+                    ts=_now(),
+                )
+            )
+        await s.commit()
+    result = await compact_session(db, config, str(sid))
+    ok = (
+        result.compacted
+        and result.archived == 4
+        and bool(result.summary_text)
+        and result.summary_tokens > 0
+    )
+    _record(
+        "A6 compaction",
+        ok,
+        f"archived={result.archived} summary_tokens={result.summary_tokens} "
+        f"summary[:50]='{result.summary_text[:50]}'",
+    )
+
+
+async def scenario_7_workflow_background(
+    db: Database, config: Any, personas: Any, saver: Any
+) -> None:
+    """We do NOT trigger a full WorkflowEngine.run (~$0.05) here — that's
+    covered by `tests/integration/test_e2e_workflow.py`.  Instead we verify the
+    /workflow background dispatch path is wired correctly by checking template
+    resolution + binding preview."""
+    print("\n[A7] /workflow background dispatch wiring")
+    from my_deepagent.binding import is_persona_eligible_for_role
+
+    sess = await _mk_session(db, config, personas, saver, uuid.uuid4())
+    workflows = sess.workflows
+    if not workflows:
+        _record("A7 workflow wiring", False, "no workflows loaded")
+        return
+    _path, tpl = workflows[0]
+    # Verify every role has at least one eligible persona — same logic as
+    # `_print_binding_for_template`.
+    role_resolutions = {}
+    for role in tpl.roles:
+        eligible = [p for p in sess.personas if is_persona_eligible_for_role(p, role, tpl)[0]]
+        role_resolutions[role.id] = len(eligible)
+    ok = all(n > 0 for n in role_resolutions.values())
+    _record(
+        "A7 workflow wiring",
+        ok,
+        f"template={tpl.name}@{tpl.version} role_eligibles={role_resolutions}",
+    )
+
+
+async def main() -> int:
+    config = load_config()
+    if not os.environ.get("OPENROUTER_API_KEY") and "openrouter" not in str(
+        config.openrouter_base_url
+    ):
+        # API key may come from keyring; resolve_openrouter_api_key handles it
+        pass
+    # Record consent for this run; this writes to the real data_dir, which we accept.
+    record_consent(config.data_dir)
+    bootstrap_user_dirs(config)
+    ensure_user_dirs_initialized(config)
+
+    db = Database(config.database_url)
+    await db.init_schema()
+
+    personas = load_combined_personas(config, _SEED / "personas")
+
+    print(f"[live_verify] config.data_dir={config.data_dir}")
+    print(f"[live_verify] db={config.database_url}")
+    print(f"[live_verify] personas loaded: {len(personas)}")
+    print("[live_verify] running 7 scenarios against real OpenRouter (~$0.05 total)")
+
+    saver_ctx = get_checkpointer_ctx(config.database_url)
+    try:
+        if config.database_url.startswith("postgresql"):
+            saver = await saver_ctx.__aenter__()
+        else:
+            saver = None
+        try:
+            chat_sid = await scenario_1_basic_chat(db, config, personas, saver)
+            await scenario_2_resume(db, config, personas, saver, chat_sid)
+            await scenario_3_skill(db, config, personas, saver)
+            await scenario_4_plan_mode(db, config, personas, saver)
+            await scenario_5_subagent(db, config, personas, saver)
+            await scenario_6_compaction(db, config, personas, saver)
+            await scenario_7_workflow_background(db, config, personas, saver)
+        finally:
+            if saver is not None:
+                await saver_ctx.__aexit__(None, None, None)
+    finally:
+        await db.dispose()
+
+    print("\n[summary]")
+    passed = sum(1 for _, ok, _ in _RESULTS if ok)
+    print(f"  {passed}/{len(_RESULTS)} PASS")
+    for name, ok, note in _RESULTS:
+        marker = "✅" if ok else "❌"
+        print(f"  {marker} {name}: {note}")
+    return 0 if passed == len(_RESULTS) else 1
+
+
+if __name__ == "__main__":
+    sys.exit(asyncio.run(main()))
--- a/my-deepagent/src/my_deepagent/api/agent_runner.py
+++ b/my-deepagent/src/my_deepagent/api/agent_runner.py
@@ -0,0 +1,313 @@
+"""Background agent invocation for the Web GUI (v0.3 PR #8 + v0.4 B3 streaming).
+
+The Web GUI POSTs user messages to ``/api/sessions/{id}/messages`` and expects
+an assistant response to appear via the SSE stream shortly after.  The route
+handler persists the user message and kicks off this runner as a fire-and-
+forget asyncio task — same fundamentals as :mod:`cli.interactive` but without
+the prompt-toolkit REPL loop.
+
+v0.4 B3 adds token streaming: a ``chunk_queue`` (per-session ``asyncio.Queue``)
+can be passed in.  We attach a ``BaseAsyncCallbackHandler`` to the ainvoke
+config so every new token the LLM emits lands on the queue as
+``{"type": "delta", "text": "..."}``.  The SSE stream loop drains the queue
+and pushes each chunk as an ``event: chunk`` SSE.
+
+This runner is **single-uvicorn-worker** by design (see ``api/app.py``'s
+docstring): the saver is held on ``app.state.saver`` and shared across all
+background invocations.  Multi-worker support would require Postgres
+``LISTEN/NOTIFY`` fanout — deferred per plan.
+"""
+
+from __future__ import annotations
+
+import asyncio
+import logging
+from typing import Any
+from uuid import UUID, uuid4
+
+from langchain_core.callbacks import AsyncCallbackHandler
+from sqlalchemy import desc, select
+
+from ..audit import make_audit_recorder
+from ..budget import make_budget_tracker_from_config
+from ..compaction import compact_session, should_compact
+from ..config import Config
+from ..hash import sha256
+from ..instructions import (
+    ensure_global_instructions_initialized,
+    resolve_instruction_paths,
+)
+from ..memory import (
+    ensure_memory_initialized,
+    list_memory_paths,
+    project_memory_dir,
+)
+from ..middleware.audit import AuditToolMiddleware
+from ..middleware.cost import CostMiddleware
+from ..middleware.plan_mode import PlanModeMiddleware
+from ..monitoring.pricing import ModelPrice, PricingCache
+from ..monitoring.token_budget import count_tokens
+from ..persistence.db import Database
+from ..persistence.models import InteractiveSessionRow, MessageRow
+from ..persona import Persona
+from ..session import build_agent
+from ..skills import ensure_skills_initialized, resolve_skill_sources, user_skills_dir
+
+_LOG = logging.getLogger(__name__)
+
+
+def _static_pricing_seed() -> PricingCache:
+    """Minimal seed identical to the REPL's _static_pricing_seed (USD per 1K tokens)."""
+    cache = PricingCache()
+    cache.set(
+        [
+            ModelPrice("anthropic/claude-sonnet-4-6", 0.003, 0.015, 200_000),
+            ModelPrice("anthropic/claude-haiku-4-5", 0.001, 0.005, 200_000),
+            ModelPrice("anthropic/claude-opus-4-1", 0.015, 0.075, 200_000),
+            ModelPrice("deepseek/deepseek-chat", 0.00028, 0.00112, 64_000),
+        ]
+    )
+    return cache
+
+
+def _flatten_assistant_content(message: Any) -> str:
+    """Convert a langchain assistant message's content into a plain string.
+
+    LangChain may return a list of content blocks (text + tool_use); we
+    concatenate the text-bearing pieces.  Falls back to ``str(content)`` if
+    the shape is unexpected.
+    """
+    content = getattr(message, "content", "") or ""
+    if isinstance(content, list):
+        parts: list[str] = []
+        for block in content:
+            if isinstance(block, dict):
+                parts.append(block.get("text", "") or "")
+            else:
+                parts.append(str(block))
+        return "\n".join(p for p in parts if p)
+    return str(content)
+
+
+async def _bootstrap_session_dirs(config: Config, project_key: str) -> None:
+    """Ensure memory + skills + global instruction dirs exist for the session.
+
+    Mirrors :class:`cli.interactive.InteractiveSession.__init__`.  Idempotent
+    so repeated background invocations are cheap.
+    """
+    ensure_memory_initialized(project_memory_dir(config, project_key))
+    ensure_skills_initialized(user_skills_dir(config))
+    ensure_global_instructions_initialized(config)
+
+
+class _StreamingChunkPusher(AsyncCallbackHandler):
+    """Push every `on_llm_new_token` onto a session-bound asyncio.Queue.
+
+    The SSE stream consumes the queue and pushes each chunk as an SSE
+    ``event: chunk`` so the browser can render typing-style streaming.
+    """
+
+    def __init__(self, queue: asyncio.Queue[dict[str, Any]]) -> None:
+        self._queue = queue
+
+    async def on_llm_new_token(self, token: str, **_kwargs: Any) -> None:
+        if not token:
+            return
+        try:
+            await self._queue.put({"type": "delta", "text": token})
+        except Exception:
+            # Never let a streaming push failure abort the LLM call.
+            _LOG.debug("chunk-queue put failed (queue likely closed)", exc_info=True)
+
+
+def _build_session_agent(
+    db: Database,
+    config: Config,
+    persona: Persona,
+    session_id: UUID,
+    row: InteractiveSessionRow,
+    *,
+    saver: Any | None,
+) -> Any:
+    """Assemble the deepagents CompiledStateGraph for one session invocation.
+
+    Extracted from :func:`invoke_session_agent` to keep that function under
+    the C901 complexity threshold.  Pure construction — no side effects on the
+    DB beyond what `build_agent` itself does.
+    """
+    pricing = _static_pricing_seed()
+    budget = make_budget_tracker_from_config(db, config)
+    cost_mw = CostMiddleware(
+        pricing=pricing,
+        model_name=row.model or persona.model,
+        interactive_session_id=session_id,
+        persona_name=persona.name,
+        budget_tracker=budget,
+    )
+    audit_mw = AuditToolMiddleware(
+        interactive_session_id=session_id,
+        file_recorder=make_audit_recorder(config.state_dir),
+    )
+    is_plan = bool(row.plan_mode)
+    plan_mw = PlanModeMiddleware(is_active=lambda: is_plan)
+
+    project_key = row.project_key or sha256(str(config.workspace_root.resolve()))[:16]
+    memory_dir = project_memory_dir(config, project_key)
+    instruction_paths = resolve_instruction_paths(config, config.workspace_root)
+    memory_paths = list_memory_paths(memory_dir)
+    skill_sources = resolve_skill_sources(config)
+
+    return build_agent(
+        persona,
+        config,
+        root_dir=config.workspace_root,
+        middleware=[plan_mw, cost_mw, audit_mw],
+        checkpointer=saver,
+        memory_paths_override=[*instruction_paths, *memory_paths],
+        skills_sources_override=skill_sources,
+    )
+
+
+async def invoke_session_agent(
+    db: Database,
+    config: Config,
+    personas: list[Persona],
+    session_id: UUID,
+    user_message: str,
+    *,
+    saver: Any | None = None,
+    chunk_queue: asyncio.Queue[dict[str, Any]] | None = None,
+) -> None:
+    """Run one ainvoke + persist the assistant reply for the given session.
+
+    The user message is assumed to be ALREADY persisted by the HTTP handler
+    (POST /api/sessions/{id}/messages).  This runner only adds the assistant
+    response and runs the post-turn auto-compaction check.
+
+    Failures are logged but never raised (only ``CancelledError`` propagates):
+    the route handler returned 200 as soon as the user message was persisted,
+    and the SSE stream is how the client observes success or a stalled turn.
+    """
+    async with db.session() as s:
+        row = await s.get(InteractiveSessionRow, str(session_id))
+    if row is None:
+        _LOG.warning("invoke_session_agent: session %s not found", session_id)
+        return
+
+    persona = _resolve_persona(personas, row.persona_hash)
+    if persona is None:
+        _LOG.warning(
+            "invoke_session_agent: persona hash %s not in loaded personas", row.persona_hash
+        )
+        return
+
+    project_key = row.project_key or sha256(str(config.workspace_root.resolve()))[:16]
+    await _bootstrap_session_dirs(config, project_key)
+
+    agent = _build_session_agent(db, config, persona, session_id, row, saver=saver)
+
+    thread_id = f"{session_id}:0"
+    result = await _run_ainvoke(agent, user_message, thread_id, chunk_queue, session_id)
+    if result is None:
+        return
+    messages = result.get("messages", []) if isinstance(result, dict) else []
+    if not messages:
+        return
+    assistant_text = _flatten_assistant_content(messages[-1])
+    if not assistant_text:
+        return
+
+    await _persist_assistant_message(db, session_id, assistant_text, row.model or persona.model)
+
+    # Post-turn auto-compaction (mirrors REPL behaviour).
+    async with db.session() as s:
+        refreshed = await s.get(InteractiveSessionRow, str(session_id))
+    if refreshed is not None and should_compact(refreshed):
+        await compact_session(db, config, str(session_id))
+
+
+async def _run_ainvoke(
+    agent: Any,
+    user_message: str,
+    thread_id: str,
+    chunk_queue: asyncio.Queue[dict[str, Any]] | None,
+    session_id: UUID,
+) -> dict[str, Any] | None:
+    """Wrapper around agent.ainvoke that emits chunk_queue lifecycle events.
+
+    Returns the raw result dict on success, ``None`` on any failure (logged).
+    Re-raises ``CancelledError`` so the asyncio task is correctly marked
+    cancelled and the route's done-callback can clean up.
+    """
+    invoke_config: dict[str, Any] = {"configurable": {"thread_id": thread_id}}
+    if chunk_queue is not None:
+        invoke_config["callbacks"] = [_StreamingChunkPusher(chunk_queue)]
+    try:
+        return await agent.ainvoke(  # type: ignore[no-any-return]
+            {"messages": [{"role": "user", "content": user_message}]},
+            config=invoke_config,
+        )
+    except asyncio.CancelledError:
+        _LOG.info("agent.ainvoke cancelled for session %s", session_id)
+        if chunk_queue is not None:
+            await chunk_queue.put({"type": "cancelled"})
+        raise
+    except Exception:
+        _LOG.exception("agent.ainvoke failed for session %s", session_id)
+        if chunk_queue is not None:
+            await chunk_queue.put({"type": "error"})
+        return None
+    finally:
+        if chunk_queue is not None:
+            await chunk_queue.put({"type": "done"})
+
+
+def _resolve_persona(personas: list[Persona], persona_hash: str) -> Persona | None:
+    for p in personas:
+        if p.compute_hash() == persona_hash:
+            return p
+    return None
+
+
+async def _persist_assistant_message(
+    db: Database,
+    session_id: UUID,
+    content: str,
+    model: str,
+) -> None:
+    token_count = count_tokens(content, model)
+    from datetime import UTC, datetime
+
+    now = datetime.now(UTC).isoformat(timespec="seconds")
+    async with db.session() as s:
+        last_seq = (
+            await s.execute(
+                select(MessageRow.seq)
+                .where(MessageRow.session_id == str(session_id))
+                .order_by(desc(MessageRow.seq))
+                .limit(1)
+            )
+        ).scalar_one_or_none() or 0
+        s.add(
+            MessageRow(
+                session_id=str(session_id),
+                seq=last_seq + 1,
+                role="assistant",
+                content=content,
+                tool_calls=None,
+                token_count=token_count,
+                is_summary=False,
+                archived=False,
+                ts=now,
+            )
+        )
+        row = await s.get(InteractiveSessionRow, str(session_id))
+        if row is not None:
+            row.last_message_at = now
+            row.total_output_tokens += token_count
+        await s.commit()
+
+
+# Re-exported for tests that want to construct a fresh persona+session row
+# without going through the HTTP layer.
+__all__ = ["invoke_session_agent", "uuid4"]
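Reviewer sketch (not part of the patch): the token-streaming path in `agent_runner.py` amounts to a producer that pushes `delta` events onto an `asyncio.Queue` and always finishes with a terminal event, plus a consumer that drains the queue without blocking on each poll. The minimal, stdlib-only illustration below assumes that shape. `produce`, `drain`, and `stream` are hypothetical names; `produce` stands in for the LangChain callback, and the 200-chunk cap mirrors the drain limit used by the SSE loop.

```python
from __future__ import annotations

import asyncio
from typing import Any

# Event types that end a turn, matching the ones the runner emits.
TERMINAL = {"done", "cancelled", "error"}


async def produce(queue: asyncio.Queue[dict[str, Any]], tokens: list[str]) -> None:
    """Hypothetical stand-in for the LLM callback: push deltas, then a terminal event."""
    try:
        for tok in tokens:
            await queue.put({"type": "delta", "text": tok})
            await asyncio.sleep(0)  # yield control, like the gap between real tokens
    finally:
        # Emitted on success and on failure, so the consumer can always stop.
        await queue.put({"type": "done"})


def drain(
    queue: asyncio.Queue[dict[str, Any]], limit: int = 200
) -> tuple[list[dict[str, Any]], bool]:
    """Non-blocking drain of at most `limit` chunks; returns (chunks, finished)."""
    out: list[dict[str, Any]] = []
    while not queue.empty() and len(out) < limit:
        chunk = queue.get_nowait()
        out.append(chunk)
        if chunk.get("type") in TERMINAL:
            return out, True
    return out, False


async def stream(tokens: list[str], poll_s: float = 0.01) -> str:
    """Poll-and-drain loop, analogous to one SSE connection following one turn."""
    queue: asyncio.Queue[dict[str, Any]] = asyncio.Queue()
    task = asyncio.create_task(produce(queue, tokens))
    text: list[str] = []
    finished = False
    while not finished:
        chunks, finished = drain(queue)
        text.extend(c["text"] for c in chunks if c["type"] == "delta")
        if not finished:
            await asyncio.sleep(poll_s)
    await task
    return "".join(text)
```

Capping each drain keeps one busy session from starving the rest of the poll loop; whatever is left over is picked up on the next tick.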
--- a/my-deepagent/src/my_deepagent/api/app.py
+++ b/my-deepagent/src/my_deepagent/api/app.py
@@ -17,9 +17,11 @@ from fastapi.staticfiles import StaticFiles
 from starlette.responses import FileResponse

 from ..config import Config, load_config
+from ..persistence.checkpointer import get_checkpointer_ctx
 from ..persistence.db import Database
 from ..persona import load_personas_from_dir
-from ..workflow import WorkflowTemplate, load_workflow_yaml
+from ..user_dirs import load_combined_workflows
+from ..workflow import WorkflowTemplate
 from .routes import budget as budget_routes
 from .routes import personas as personas_routes
 from .routes import runs as runs_routes
@@ -32,29 +34,23 @@ _STATIC_ROOT = Path(__file__).resolve().parents[3] / "static"
 _LOG = logging.getLogger(__name__)


-def _load_seed_workflows() -> list[tuple[Path, WorkflowTemplate]]:
-    """Return (path, WorkflowTemplate) for every YAML in docs/schemas/workflows/.
-
-    Malformed YAMLs are logged and skipped — the API should still come up
-    cleanly even if one seed is broken.
-    """
-    wf_dir = _DOCS_SCHEMAS / "workflows"
-    if not wf_dir.is_dir():
-        return []
-    out: list[tuple[Path, WorkflowTemplate]] = []
-    for p in sorted(wf_dir.glob("*.yaml")):
-        try:
-            tpl = load_workflow_yaml(p)
-        except Exception as e:
-            _LOG.warning("skipping malformed workflow yaml %s: %s", p, e)
-            continue
-        out.append((p, tpl))
-    return out
+def _load_workflows_combined(config: Config) -> list[tuple[Path, WorkflowTemplate]]:
+    """Seed + user workflows.  Malformed YAMLs are logged + skipped — the
+    API still comes up cleanly even if one file is broken.  Per-request
+    hot-reload (`deps.get_workflows`) reuses the same loader."""
+    return load_combined_workflows(config, _DOCS_SCHEMAS / "workflows")


@asynccontextmanager
 async def _lifespan(app: FastAPI) -> AsyncIterator[None]:
-    """Initialize the shared Database, personas, workflows on startup; dispose on shutdown."""
+    """Initialize the shared Database, personas, workflows, LangGraph saver on
+    startup; dispose on shutdown.
+
+    The saver is opened once per app lifetime and reused by background agent
+    invocations from POST /api/sessions/{id}/messages (v0.3 PR #8).  Opening
+    per-request would be too expensive (each open establishes a Postgres
+    connection + verifies the checkpoint schema).
+    """
    config: Config = app.state.config or load_config()
    db = Database(config.database_url)
    # init_schema is a no-op against an already-migrated DB; cheap to call.
@@ -62,10 +58,26 @@ async def _lifespan(app: FastAPI) -> AsyncIterator[None]:
    app.state.config = config
    app.state.db = db
    app.state.personas = load_personas_from_dir(_DOCS_SCHEMAS / "personas")
-    app.state.workflows = _load_seed_workflows()
+    app.state.workflows = _load_workflows_combined(config)
+    # Hot-reload signature — `deps.get_workflows` re-checks per request.
+    app.state.workflows_sig = None
+    saver_ctx = get_checkpointer_ctx(config.database_url)
    try:
+        # AsyncPostgresSaver.from_conn_string only works for postgres; for sqlite
+        # tests we silently fall back to None and let background ainvoke run
+        # without checkpointing (acceptable: tests stub agents anyway).
+        if config.database_url.startswith("postgresql"):
+            saver = await saver_ctx.__aenter__()
+            app.state.saver = saver
+        else:
+            app.state.saver = None
        yield
    finally:
+        if getattr(app.state, "saver", None) is not None:
+            try:
+                await saver_ctx.__aexit__(None, None, None)
+            except Exception:
+                _LOG.exception("saver context exit failed during shutdown")
        await db.dispose()


--- a/my-deepagent/src/my_deepagent/api/deps.py
+++ b/my-deepagent/src/my_deepagent/api/deps.py
@@ -3,6 +3,12 @@
 Pulls singletons stashed in `app.state` by the lifespan handler. Database is
 created ONCE per uvicorn process; per-request creation would defeat
 connection pooling.
+
+Workflows are different — they live in YAML files that the user can edit /
+create at runtime via the workflow generator UI.  `get_workflows` does a
+cheap mtime check on every call and reloads when any file in the seed or
+user workflow directory has changed.  No file watcher / inotify needed —
+the directories are tiny (≤ dozens of files) and stat is cheap.
 """

 from __future__ import annotations
@@ -14,6 +20,7 @@ from fastapi import Request

 from ..config import Config
 from ..persistence.db import Database
+from ..user_dirs import load_combined_workflows, user_workflows_dir

 if TYPE_CHECKING:
    from ..persona import Persona
@@ -36,9 +43,41 @@ def get_personas(request: Request) -> list[Persona]:
    return request.app.state.personas  # type: ignore[no-any-return]


+def _workflow_dir_signature(config: Config) -> tuple[tuple[str, float], ...]:
+    """Cheap mtime-tuple fingerprint of all YAMLs in seed + user dirs.
+
+    One stat call per file; the fingerprint changes when any file is
+    created / modified / deleted.  Used as the cache key for
+    :func:`get_workflows` so the API picks up new templates without a
+    process restart.
+    """
+    sig: list[tuple[str, float]] = []
+    for d in (_DOCS_SCHEMAS / "workflows", user_workflows_dir(config)):
+        if not d.is_dir():
+            continue
+        for p in sorted(d.glob("*.yaml")):
+            try:
+                sig.append((str(p), p.stat().st_mtime))
+            except OSError:
+                continue
+    return tuple(sig)
+
+
 def get_workflows(request: Request) -> list[tuple[Path, WorkflowTemplate]]:
-    """Return a list of (yaml_path, WorkflowTemplate) for all seed workflows."""
-    return request.app.state.workflows  # type: ignore[no-any-return]
+    """Return (path, template) list with mtime-based hot-reload.
+
+    On every request, computes the mtime fingerprint of the workflow dirs.
+    If it differs from the cached signature, calls
+    :func:`load_combined_workflows` again to pick up new / edited files.
+    """
+    app = request.app
+    config: Config = app.state.config
+    current_sig = _workflow_dir_signature(config)
+    cached_sig: tuple[tuple[str, float], ...] | None = getattr(app.state, "workflows_sig", None)
+    if cached_sig != current_sig:
+        app.state.workflows = load_combined_workflows(config, _DOCS_SCHEMAS / "workflows")
+        app.state.workflows_sig = current_sig
+    return app.state.workflows  # type: ignore[no-any-return]


 def seed_root() -> Path:
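Reviewer sketch (not part of the patch): the hot-reload in `deps.get_workflows` is a cache keyed on a directory fingerprint. A stdlib-only illustration of that idea follows. `dir_signature`, `ReloadingCache`, and `demo` are hypothetical names, the "load" step just lists file names instead of parsing YAML, and the sketch uses `st_mtime_ns` where the patch uses `st_mtime`.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path


def dir_signature(dirs: list[Path]) -> tuple[tuple[str, int], ...]:
    """Fingerprint every *.yaml in `dirs` as (path, mtime_ns) pairs."""
    sig: list[tuple[str, int]] = []
    for d in dirs:
        if not d.is_dir():
            continue
        for p in sorted(d.glob("*.yaml")):
            try:
                sig.append((str(p), p.stat().st_mtime_ns))
            except OSError:
                continue  # file vanished between glob and stat
    return tuple(sig)


class ReloadingCache:
    """Reload only when the fingerprint differs from the cached one."""

    def __init__(self, dirs: list[Path]) -> None:
        self._dirs = dirs
        self._sig: tuple[tuple[str, int], ...] | None = None
        self.items: list[str] = []
        self.loads = 0  # number of (re)loads, exposed for the demo

    def get(self) -> list[str]:
        current = dir_signature(self._dirs)
        if current != self._sig:
            # Placeholder for the real loader: record file names only.
            self.items = [Path(name).name for name, _ in current]
            self._sig = current
            self.loads += 1
        return self.items


def demo() -> list[int]:
    """Return the load count after: first call, new file, no change, touched file."""
    counts: list[int] = []
    with tempfile.TemporaryDirectory() as tmp:
        d = Path(tmp)
        cache = ReloadingCache([d])
        cache.get()
        counts.append(cache.loads)
        f = d / "a.yaml"
        f.write_text("name: a\n")
        cache.get()
        counts.append(cache.loads)
        cache.get()
        counts.append(cache.loads)
        os.utime(f, ns=(1_000_000_000, 1_000_000_000))  # force a different mtime
        cache.get()
        counts.append(cache.loads)
    return counts
```

Because the fingerprint includes the path, a created or deleted file changes the tuple just as an edit does, so no file watcher is needed for directories this small.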
--- a/my-deepagent/src/my_deepagent/api/models.py
+++ b/my-deepagent/src/my_deepagent/api/models.py
@@ -128,6 +128,60 @@ class WorkflowSummary(_Strict):
    phases: list[WorkflowPhaseSummary]


+# v0.4 — workflow generator UI (POST /api/workflows)
+
+
+class WorkflowRoleSpec(_Strict):
+    """Input shape for one role inside a CreateWorkflowRequest."""
+
+    id: str = Field(min_length=1, pattern=r"^[a-z][a-z0-9_]*$")
+    required_capabilities: list[str] = Field(min_length=1)
+    preferred_backends: list[str] = Field(default_factory=list)
+    fallback_personas: list[str] = Field(default_factory=list)
+
+
+class WorkflowArtifactSpec(_Strict):
+    """Input shape for one phase's expected_artifact (optional)."""
+
+    path: str = Field(min_length=1)
+    # YAML key is `schema`; pydantic attribute aliased to avoid BaseModel.schema clash
+    schema_id: str = Field(min_length=1, alias="schema")
+
+
+class WorkflowPhaseSpec(_Strict):
+    """Input shape for one phase inside a CreateWorkflowRequest."""
+
+    key: str = Field(min_length=1, pattern=r"^[a-z][a-z0-9_]*$")
+    title: str = Field(min_length=1)
+    risk: str = Field(min_length=1)  # low|medium|high — validated by WorkflowTemplate
+    role: str = Field(min_length=1)
+    instructions: str = Field(min_length=10)
+    expected_artifact: WorkflowArtifactSpec | None = None
+    gates: list[str] = Field(default_factory=list)
+    timeout_seconds: int | None = Field(default=None, ge=1)
+    max_budget_usd: float | None = Field(default=None, ge=0)
+
+
+class CreateWorkflowRequest(_Strict):
+    """Body for POST /api/workflows — saves a new template YAML on disk."""
+
+    name: str = Field(min_length=1)
+    version: int = Field(ge=1)
+    description: str | None = None
+    roles: list[WorkflowRoleSpec] = Field(min_length=1)
+    phases: list[WorkflowPhaseSpec] = Field(min_length=1)
+    default_gates: list[str] = Field(default_factory=list)
+    max_total_budget_usd: float | None = Field(default=None, ge=0)
+
+
+class CreateWorkflowResponse(_Strict):
+    """Returned by POST /api/workflows."""
+
+    path: str  # absolute path of the saved YAML
+    name: str
+    version: int
+
+
 # ---------------------------------------------------------------------------
 # /api/budget
 # ---------------------------------------------------------------------------
--- a/my-deepagent/src/my_deepagent/api/routes/sessions.py
+++ b/my-deepagent/src/my_deepagent/api/routes/sessions.py
@@ -30,6 +30,7 @@ from ...persistence.models import (
    MessageRow,
 )
 from ...persona import Persona
+from ..agent_runner import invoke_session_agent
 from ..deps import get_config, get_db, get_personas
 from ..models import (
    CreateSessionRequest,
@@ -41,7 +42,9 @@ from ..models import (
 )

 _LOG = logging.getLogger(__name__)
-_POLL_INTERVAL_S: float = 0.5
+# v0.4 B3: 100ms poll keeps token-streaming UX snappy.  At idle the loop just
+# does two cheap selects — well within asyncpg + SSE budgets.
+_POLL_INTERVAL_S: float = 0.1
 _TERMINAL_STATES: frozenset[str] = frozenset({"ended"})

 router = APIRouter()
@@ -218,8 +221,18 @@ async def create_session(
 async def post_message(
    session_id: str,
    body: PostMessageRequest,
+    request: Request,
    db: DbDep,
+    config: ConfigDep,
+    personas: PersonasDep,
 ) -> SessionAck:
+    """Persist a user message + fire the agent invocation in the background.
+
+    v0.3 PR #8: returns immediately after the user message is durably
+    persisted.  The background task fetches the saver from ``app.state`` (set
+    up by the lifespan) and emits the assistant reply via the same SSE stream
+    that the client is already subscribed to.
+    """
    async with db.session() as s:
        row = await s.get(InteractiveSessionRow, session_id)
        if row is None:
@@ -257,19 +270,124 @@ async def post_message(
            row.title = body.content[:50]
        await s.commit()

+    # Fire-and-forget background invocation.  We do NOT await it — the route
+    # returns 200 immediately and the SSE stream picks up the assistant reply.
+    # Hold a reference on app.state so the task is not GC'd mid-flight (cf. RUF006).
+    saver = getattr(request.app.state, "saver", None)
+    from uuid import UUID
+
+    # v0.4 B3: per-session token chunk queue.  agent_runner pushes deltas
+    # via AsyncCallbackHandler; the SSE stream below drains the queue.
+    chunk_queues: dict[str, asyncio.Queue[Any]] = getattr(
+        request.app.state, "token_chunk_queues", {}
+    )
+    queue: asyncio.Queue[Any] = asyncio.Queue()
+    chunk_queues[session_id] = queue
+    request.app.state.token_chunk_queues = chunk_queues
+
+    task = asyncio.create_task(
+        invoke_session_agent(
+            db,
+            config,
+            personas,
+            UUID(session_id),
+            body.content,
+            saver=saver,
+            chunk_queue=queue,
+        )
+    )
+    pending: set[asyncio.Task[Any]] = getattr(request.app.state, "pending_invocations", set())
+    pending.add(task)
+    request.app.state.pending_invocations = pending
+    task.add_done_callback(pending.discard)
+    # v0.4 B4: index the task by session_id so a subsequent POST /abort can
+    # cancel mid-flight.  We deliberately overwrite an earlier task if one is
+    # still in flight — the new user message implicitly cancels the previous
+    # turn (Claude Code parity).
+    per_session: dict[str, asyncio.Task[Any]] = getattr(
+        request.app.state, "pending_per_session", {}
+    )
+    prev = per_session.get(session_id)
+    if prev is not None and not prev.done():
+        prev.cancel()
+    per_session[session_id] = task
+    request.app.state.pending_per_session = per_session
+
+    def _remove_from_session_map(_t: asyncio.Task[Any], sid: str = session_id) -> None:
+        if per_session.get(sid) is _t:  # a newer turn may already own this slot
+            per_session.pop(sid, None)
+    task.add_done_callback(_remove_from_session_map)
+
    return SessionAck(session_id=session_id, state="active", message="queued")


+# ---------------------------------------------------------------------------
+# POST /api/sessions/{id}/abort — cancel an in-flight turn (v0.4 B4)
+# ---------------------------------------------------------------------------
+
+
+@router.post("/{session_id}/abort", response_model=SessionAck)
+async def abort_turn(session_id: str, request: Request, db: DbDep) -> SessionAck:
+    """Cancel the in-flight ainvoke for this session, if any.
+
+    Idempotent — returns ok even when no task is in flight.  The cancelled
+    task's ``finally`` clauses still run, so the LangGraph checkpoint stays
+    consistent.  The next POST /messages reuses the same thread.
+    """
+    async with db.session() as s:
+        row = await s.get(InteractiveSessionRow, session_id)
+        if row is None:
+            raise HTTPException(status_code=404, detail=f"session {session_id} not found")
+
+    per_session: dict[str, asyncio.Task[Any]] = getattr(
+        request.app.state, "pending_per_session", {}
+    )
+    task = per_session.get(session_id)
+    if task is not None and not task.done():
+        task.cancel()
+        return SessionAck(session_id=session_id, state="active", message="aborted")
+    return SessionAck(session_id=session_id, state="active", message="no-in-flight-task")
+
+
 # ---------------------------------------------------------------------------
 # GET /api/sessions/{id}/stream — SSE
 # ---------------------------------------------------------------------------


-async def _session_event_stream(db: Database, session_id: str, last_seq: int = 0) -> Any:
-    """Yield ServerSentEvent per new MessageRow. Closes when session ends."""
+async def _session_event_stream(
+    db: Database, session_id: str, last_seq: int = 0, app: Any = None
+) -> Any:
+    """Yield ServerSentEvent per new MessageRow + token chunk. Closes on terminal.
+
+    Three event types emitted:
+    - ``message`` (existing): one row per new MessageRow.
+    - ``chunk`` (v0.4 B3): token delta from the in-flight ainvoke.  Drained
+      from ``app.state.token_chunk_queues[session_id]`` if present.
+    - ``done`` (existing): session terminal or deleted.
+    """
    seen = last_seq

    while True:
+        # v0.4 B3: drain queued token chunks FIRST so streaming visibly
+        # precedes the final `message` SSE.  Without this the placeholder
+        # is replaced by the persisted MessageRow before any chunk reaches
+        # the browser — Claude-Code-style typing would never appear.
+        if app is not None:
+            queues: dict[str, asyncio.Queue[Any]] = getattr(app.state, "token_chunk_queues", {})
+            queue = queues.get(session_id)
+            if queue is not None:
+                drained = 0
+                while not queue.empty() and drained < 200:
+                    chunk = queue.get_nowait()
+                    yield ServerSentEvent(
+                        data=json.dumps(chunk, ensure_ascii=False),
+                        event="chunk",
+                    )
+                    drained += 1
+                    if chunk.get("type") in ("done", "cancelled", "error"):
+                        queues.pop(session_id, None)
+                        break
+
        async with db.session() as s:
            message_rows = (
                (
@@ -333,7 +451,7 @@ async def stream_session(
        row = await s.get(InteractiveSessionRow, session_id)
    if row is None:
        raise HTTPException(status_code=404, detail=f"session {session_id} not found")
-    return EventSourceResponse(_session_event_stream(db, session_id, last_seq))
+    return EventSourceResponse(_session_event_stream(db, session_id, last_seq, app=request.app))


 # ---------------------------------------------------------------------------
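Review note: the cancel-previous-turn bookkeeping in `post_message` can be sketched in isolation as below. This is a minimal sketch, not the project's code: `pending`, `register_turn`, and `demo` are hypothetical names, and the module-level dict stands in for `app.state.pending_per_session`. The identity check in the cleanup callback is an assumption about the desired behaviour (a late-finishing cancelled task should leave its successor tracked).

```python
from __future__ import annotations

import asyncio
from typing import Any

# Hypothetical stand-in for `app.state.pending_per_session`.
pending: dict[str, asyncio.Task[Any]] = {}


def register_turn(session_id: str, task: asyncio.Task[Any]) -> None:
    """Cancel any in-flight turn for the session, then track the new one."""
    prev = pending.get(session_id)
    if prev is not None and not prev.done():
        prev.cancel()
    pending[session_id] = task

    def _cleanup(finished: asyncio.Task[Any]) -> None:
        # Identity check: only the task that is still tracked removes the entry.
        if pending.get(session_id) is finished:
            pending.pop(session_id, None)

    task.add_done_callback(_cleanup)


async def demo() -> tuple[bool, bool, bool]:
    first = asyncio.create_task(asyncio.sleep(60))
    register_turn("s1", first)
    second = asyncio.create_task(asyncio.sleep(60))
    register_turn("s1", second)
    await asyncio.sleep(0)  # let the cancellation of `first` take effect
    await asyncio.sleep(0)  # let its done-callback run
    still_tracked = pending.get("s1") is second
    second.cancel()  # what an abort endpoint would do
    await asyncio.gather(first, second, return_exceptions=True)
    await asyncio.sleep(0)
    return first.cancelled(), still_tracked, "s1" not in pending


RESULT = asyncio.run(demo())
```

`RESULT` holds three flags: the first turn was cancelled, the second turn stayed tracked after the first one finished, and the map was emptied once the second turn ended.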
--- a/my-deepagent/src/my_deepagent/api/routes/workflows.py
+++ b/my-deepagent/src/my_deepagent/api/routes/workflows.py
@@ -1,19 +1,35 @@
-"""GET /api/workflows — list seed workflow templates."""
+"""GET /api/workflows — list seed + user templates (hot-reloaded).
+
+v0.4: POST /api/workflows persists a new template YAML under
+``<config.data_dir>/workflows/<name>@<version>.yaml`` so the workflow
+generator UI can create templates without leaving the browser.
+"""

 from __future__ import annotations

 from pathlib import Path
 from typing import Annotated

-from fastapi import APIRouter, Depends
+import yaml
+from fastapi import APIRouter, Depends, HTTPException, status
+from pydantic import ValidationError

+from ...config import Config
+from ...user_dirs import user_workflows_dir
 from ...workflow import WorkflowTemplate
-from ..deps import get_workflows, seed_root
-from ..models import WorkflowPhaseSummary, WorkflowRoleSummary, WorkflowSummary
+from ..deps import get_config, get_workflows, seed_root
+from ..models import (
+    CreateWorkflowRequest,
+    CreateWorkflowResponse,
+    WorkflowPhaseSummary,
+    WorkflowRoleSummary,
+    WorkflowSummary,
+)

 router = APIRouter()

 WorkflowsDep = Annotated[list[tuple[Path, WorkflowTemplate]], Depends(get_workflows)]
+ConfigDep = Annotated[Config, Depends(get_config)]


@router.get("", response_model=list[WorkflowSummary])
@@ -50,3 +66,57 @@ async def list_workflows(workflows: WorkflowsDep) -> list[WorkflowSummary]:
            )
        )
    return out
+
+
+@router.post(
+    "",
+    response_model=CreateWorkflowResponse,
+    status_code=status.HTTP_201_CREATED,
+)
+async def create_workflow(body: CreateWorkflowRequest, config: ConfigDep) -> CreateWorkflowResponse:
+    """Persist a new WorkflowTemplate YAML under the user workflows dir.
+
+    Pipeline:
+    1. Convert request → dict (frontmatter aliases preserved: ``schema``
+       not ``schema_id`` so the file is round-tripped by
+       :func:`load_workflow_yaml`).
+    2. Hand it to :class:`WorkflowTemplate.model_validate` — same strict
+       schema the YAML loader uses.  ValidationError → 422.
+    3. Write the YAML to ``<user_workflows>/<name>@<version>.yaml``.
+       Refuse with 409 if a user template with the same key already
+       exists (bump the version instead).
+    4. The hot-reload signature on the next GET picks the file up
+       automatically — no restart needed.
+    """
+    raw = body.model_dump(by_alias=True)
+    try:
+        tpl = WorkflowTemplate.model_validate(raw)
+    except ValidationError as e:
+        # `e.errors()` may put `ValueError` objects inside `ctx` (Pydantic
+        # convention when a validator raises) — those don't JSON-serialise.
+        # Flatten to a list[str] so the 422 body is always safe to dump.
+        msgs = [
+            f"{'.'.join(str(p) for p in err.get('loc', ()))}: {err.get('msg', '')}"
+            for err in e.errors()
+        ]
+        raise HTTPException(status_code=422, detail=msgs) from e
+
+    target_dir = user_workflows_dir(config)
+    target_dir.mkdir(parents=True, exist_ok=True)
+    target = target_dir / f"{tpl.name}@{tpl.version}.yaml"
+    if target.exists():
+        raise HTTPException(
+            status_code=409,
+            detail=(
+                f"workflow {tpl.name}@{tpl.version} already exists at "
+                f"{target}.  Bump the version or delete the file first."
+            ),
+        )
+
+    serialised = yaml.safe_dump(
+        tpl.model_dump(by_alias=True, mode="json"),
+        allow_unicode=True,
+        sort_keys=False,
+    )
+    target.write_text(serialised, encoding="utf-8")
+    return CreateWorkflowResponse(path=str(target), name=tpl.name, version=tpl.version)
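Review note: the "refuse to overwrite" step can also be enforced by the filesystem itself, which closes the window between an `exists()` check and the write. The sketch below assumes nothing about the project beyond the target path shape; `write_new_file` is a hypothetical helper, and the demo path under a temporary directory is made up for illustration.

```python
import tempfile
from pathlib import Path


def write_new_file(target: Path, text: str) -> bool:
    """Create `target` with `text`; return False if it already exists."""
    target.parent.mkdir(parents=True, exist_ok=True)
    try:
        # Mode "x" raises FileExistsError instead of truncating an existing file.
        with target.open("x", encoding="utf-8") as f:
            f.write(text)
    except FileExistsError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "workflows" / "demo@1.yaml"
    FIRST = write_new_file(path, "name: demo\n")
    SECOND = write_new_file(path, "name: other\n")
    CONTENT = path.read_text(encoding="utf-8")
```

A route handler could map the `False` return to the same 409 response the existing `target.exists()` branch produces.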
--- a/my-deepagent/src/my_deepagent/budget.py
+++ b/my-deepagent/src/my_deepagent/budget.py
@@ -95,9 +95,10 @@ class BudgetTracker:
        run_id: UUID | None,
        persona_name: str | None,
        estimated_cost_usd: float,
+        session_id: UUID | None = None,
    ) -> BudgetCheck:
        """Check if a call of estimated_cost can proceed. May raise BudgetExhaustedError."""
-        scopes = self._scopes_for(run_id, persona_name)
+        scopes = self._scopes_for(run_id, persona_name, session_id)
        async with self._db.session() as s:
            for scope in scopes:
                cap = self._cap_for_scope(scope)
@@ -120,11 +121,12 @@ class BudgetTracker:
        run_id: UUID | None,
        persona_name: str | None,
        actual_cost_usd: float,
+        session_id: UUID | None = None,
    ) -> None:
        """Persist the actual cost into all relevant scopes."""
        if actual_cost_usd == 0:
            return
-        scopes = self._scopes_for(run_id, persona_name)
+        scopes = self._scopes_for(run_id, persona_name, session_id)
        async with self._db.session() as s:
            for scope in scopes:
                await self._upsert_spend(s, scope, actual_cost_usd, self._cap_for_scope(scope))
@@ -145,11 +147,22 @@ class BudgetTracker:

    # ----- internals ----------------------------------------------------------

-    def _scopes_for(self, run_id: UUID | None, persona_name: str | None) -> list[str]:
+    def _scopes_for(
+        self,
+        run_id: UUID | None,
+        persona_name: str | None,
+        session_id: UUID | None = None,
+    ) -> list[str]:
        today = _today_utc()
        out = [f"day:{today}"]
        if run_id is not None:
            out.append(f"run:{run_id}")
+        if session_id is not None:
+            # v0.3 PR #6: sub-agent invocations charge their cost against this
+            # scope so the root interactive session can roll up everything that
+            # ran under it.  Cap is the same as run cap (single user, single
+            # session ≈ single run for budget purposes).
+            out.append(f"session:{session_id}")
        if persona_name:
            out.append(f"persona:{persona_name}:day:{today}")
        return out
@@ -159,6 +172,8 @@ class BudgetTracker:
            return self._daily_cap
        if scope.startswith("run:"):
            return self._run_cap
+        if scope.startswith("session:"):
+            return self._run_cap  # reuse run-cap for interactive sessions
        if scope.startswith("persona:") and ":day:" in scope:
            return self._daily_cap  # per-persona daily uses day cap unless overridden
        return None
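Review note: the scope and cap resolution after this change can be read as two pure functions. The sketch below mirrors the logic shown in the hunks, with hypothetical free-function names (`scopes_for`, `cap_for_scope`) and explicit `today`, `daily_cap`, and `run_cap` parameters in place of instance state. The `day:` branch of the cap lookup is an assumption, since that line sits outside the visible hunk.

```python
from __future__ import annotations

from datetime import date
from uuid import UUID


def scopes_for(
    today: date,
    run_id: UUID | None,
    persona_name: str | None,
    session_id: UUID | None = None,
) -> list[str]:
    """List every budget scope a single LLM call is charged against."""
    out = [f"day:{today}"]
    if run_id is not None:
        out.append(f"run:{run_id}")
    if session_id is not None:
        out.append(f"session:{session_id}")
    if persona_name:
        out.append(f"persona:{persona_name}:day:{today}")
    return out


def cap_for_scope(scope: str, daily_cap: float, run_cap: float) -> float | None:
    """Pick the cap for a scope; session scopes reuse the run cap."""
    if scope.startswith("day:"):  # assumed: this branch is not shown in the diff
        return daily_cap
    if scope.startswith(("run:", "session:")):
        return run_cap
    if scope.startswith("persona:") and ":day:" in scope:
        return daily_cap
    return None


SCOPES = scopes_for(date(2026, 5, 18), None, "reviewer", UUID(int=1))
CAPS = [cap_for_scope(s, daily_cap=10.0, run_cap=2.0) for s in SCOPES]
```

With no `run_id`, an interactive turn is charged to the day, session, and persona-day scopes, and only the session scope is limited by the run cap.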
--- a/my-deepagent/src/my_deepagent/cli/interactive.py
+++ b/my-deepagent/src/my_deepagent/cli/interactive.py
--- a/my-deepagent/src/my_deepagent/compaction.py
+++ b/my-deepagent/src/my_deepagent/compaction.py
@@ -64,11 +64,13 @@ class CompactionResult:
        compacted: bool,
        archived: int = 0,
        summary_tokens: int = 0,
+        summary_text: str = "",
        reason: str = "",
    ) -> None:
        self.compacted = compacted
        self.archived = archived
        self.summary_tokens = summary_tokens
+        self.summary_text = summary_text
        self.reason = reason

    def __repr__(self) -> str:
@@ -280,6 +282,7 @@ async def compact_session(
            compacted=True,
            archived=len(archive_ids),
            summary_tokens=summary_tokens,
+            summary_text=summary_text,
            reason="ok",
        )

--- a/my-deepagent/src/my_deepagent/governance.py
+++ b/my-deepagent/src/my_deepagent/governance.py
@@ -1,13 +1,29 @@
-"""Governance consent for sending user code to external LLM providers."""
+"""Governance consent + first-run filesystem bootstrap.
+
+v0.3 PR #7 extends this module to provision the user-wide skeleton on first
+run: ``<data_dir>/MYDEEPAGENT.md``, ``<data_dir>/global/memory/MEMORY.md``,
+``<data_dir>/skills/``, ``<data_dir>/projects/``.  All steps are idempotent
+so repeated calls do nothing destructive.
+
+The bootstrap is invoked at REPL/API startup so users always see the dirs
+even before they touch a `/remember` or `/skill` slash.
+"""

 from __future__ import annotations

 import json
+import logging
 import os
 from datetime import UTC, datetime
 from pathlib import Path

+from .config import Config
 from .errors import MyDeepAgentError
+from .instructions import ensure_global_instructions_initialized
+from .memory import ensure_memory_initialized, global_memory_dir
+from .skills import ensure_skills_initialized, user_skills_dir
+
+_LOG = logging.getLogger(__name__)


 def consent_path(data_dir: Path) -> Path:
@@ -39,3 +55,25 @@ def require_consent(data_dir: Path) -> None:
            message="governance consent not recorded",
            recovery_hint="run `mydeepagent init` and accept the data-governance prompt",
        )
+
+
+def bootstrap_user_dirs(config: Config) -> None:
+    """Provision the full user-wide skeleton.  Idempotent.
+
+    Creates (if missing):
+    - ``<data_dir>/MYDEEPAGENT.md``               (global instructions w/ template)
+    - ``<data_dir>/global/memory/MEMORY.md``      (empty index for cross-project memory)
+    - ``<data_dir>/skills/``                       (user-wide skills root)
+    - ``<data_dir>/projects/``                     (parent of per-project subtrees)
+
+    Per-project subdirs (``projects/<project_key>/memory|skills``) are still
+    created lazily by :class:`InteractiveSession` since they depend on the
+    repo path; the parent ``projects/`` is materialised here so users see the
+    expected layout even before opening their first session.
+    """
+    data_dir = Path(config.data_dir)
+    data_dir.mkdir(parents=True, exist_ok=True)
+    ensure_global_instructions_initialized(config)
+    ensure_memory_initialized(global_memory_dir(config))
+    ensure_skills_initialized(user_skills_dir(config))
+    (data_dir / "projects").mkdir(parents=True, exist_ok=True)
--- a/my-deepagent/src/my_deepagent/instructions.py
+++ b/my-deepagent/src/my_deepagent/instructions.py
@@ -0,0 +1,98 @@
+"""MYDEEPAGENT.md instruction-file hierarchy (v0.3 PR #7).
+
+Two scopes (mirrors Claude Code's CLAUDE.md global/project layering):
+
+- **Global** :  ``<config.data_dir>/MYDEEPAGENT.md``
+  User-wide preferences that apply to every project.  Bootstrapped with a
+  template on first session if missing.
+
+- **Project** : ``<repo>/MYDEEPAGENT.md``
+  Repo-specific overrides.  Picked up at session start when present; we do NOT
+  auto-create it (creating a file inside the user's repo would be invasive).
+
+Both files are passed to ``deepagents.MemoryMiddleware`` via the ``memory=``
+kwarg of ``create_deep_agent`` — same mechanism as auto-memory.  Order in the
+list:
+
+    [global MYDEEPAGENT.md, project MYDEEPAGENT.md, MEMORY.md, ...entry .md]
+
+Later entries (project instructions, then auto-memory) are injected after
+earlier ones, so they can override them, matching the standard CLAUDE.md
+precedence.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+from .config import Config
+
+#: Filename for both global and project instruction files.
+INSTRUCTION_FILENAME = "MYDEEPAGENT.md"
+
+#: Initial body written to the global file when it does not exist.
+_GLOBAL_TEMPLATE = """# MYDEEPAGENT.md (global)
+
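+This file defines user preferences that apply to every project.
+It is included in the system prompt at session start and affects every conversation.
+
+For project-specific settings, create a `MYDEEPAGENT.md` file with the same name
+at the repo root; it is loaded automatically as well (project overrides global).
+
+## Collaboration style
+- Converse in Korean.  Keep code in English.
+- Write a numbered plan before starting work.
+- Keep changes minimal: only what was requested.
+
+## Code style
+- Read existing patterns before creating a new file.
+- Add short comments only when the "why" is not obvious.
+- Do not leave TODO/FIXME/pass/NotImplementedError in the final output.
+
+## Review carefully
+- Before declaring done: every item implemented / static analysis passes / output read through at least once.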
+"""
+
+
+def global_instructions_path(config: Config) -> Path:
+    """Return the absolute path of the global MYDEEPAGENT.md file."""
+    return Path(config.data_dir) / INSTRUCTION_FILENAME
+
+
+def project_instructions_path(repo_root: Path) -> Path:
+    """Return the absolute path of the project MYDEEPAGENT.md file (may not exist)."""
+    return Path(repo_root) / INSTRUCTION_FILENAME
+
+
+def ensure_global_instructions_initialized(config: Config) -> Path:
+    """Create the global instructions file with a template if missing.
+
+    Idempotent — repeated calls are no-ops once initialised.  Returns the
+    absolute path.  Bootstrap during REPL startup so users see the file the
+    first time they look in ``<data_dir>``.
+    """
+    p = global_instructions_path(config)
+    p.parent.mkdir(parents=True, exist_ok=True)
+    if not p.exists():
+        p.write_text(_GLOBAL_TEMPLATE, encoding="utf-8")
+    return p
+
+
+def resolve_instruction_paths(config: Config, repo_root: Path) -> list[str]:
+    """Return absolute paths to existing MYDEEPAGENT.md files, global-first.
+
+    - Global is bootstrapped (always exists after a session has started)
+    - Project is included only if the file actually exists in the repo —
+      we never write into the user's repo automatically.
+
+    The returned list is suitable for ``memory_paths_override`` passed to
+    :func:`session.build_agent` (the ``deepagents.MemoryMiddleware`` then
+    concatenates them in order — later files override earlier).
+    """
+    paths: list[str] = []
+    g = global_instructions_path(config)
+    if g.is_file():
+        paths.append(str(g.resolve()))
+    p = project_instructions_path(repo_root)
+    if p.is_file():
+        paths.append(str(p.resolve()))
+    return paths
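Review note: the global-first ordering of `resolve_instruction_paths` can be checked without a `Config` object. The sketch below is a simplified stand-in: `resolve_paths` is a hypothetical function that takes the data directory and repo root directly, and the temporary directories are made up for the demo.

```python
import tempfile
from pathlib import Path

INSTRUCTION_FILENAME = "MYDEEPAGENT.md"


def resolve_paths(data_dir: Path, repo_root: Path) -> list[str]:
    """Return existing instruction files, global first, project second."""
    candidates = [data_dir / INSTRUCTION_FILENAME, repo_root / INSTRUCTION_FILENAME]
    return [str(p.resolve()) for p in candidates if p.is_file()]


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    repo = Path(tmp) / "repo"
    data_dir.mkdir()
    repo.mkdir()
    (data_dir / INSTRUCTION_FILENAME).write_text("# global\n", encoding="utf-8")
    ONLY_GLOBAL = resolve_paths(data_dir, repo)
    # The project file is optional; it is picked up only once it exists.
    (repo / INSTRUCTION_FILENAME).write_text("# project\n", encoding="utf-8")
    BOTH = resolve_paths(data_dir, repo)
    PARENTS = [Path(p).parent.name for p in BOTH]
```

The project file never gets created by the resolver, which matches the module's stated rule of not writing into the user's repo.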
--- a/my-deepagent/src/my_deepagent/memory.py
+++ b/my-deepagent/src/my_deepagent/memory.py
@@ -1,10 +1,13 @@
-"""Auto-memory (v0.3 PR #3) — project-scoped persistent context.
+"""Auto-memory (v0.3 PR #3) — project-scoped + global persistent context.

 Layout::

-    <config.data_dir>/projects/<project_key>/memory/
+    <config.data_dir>/projects/<project_key>/memory/   # project-scoped
        MEMORY.md           # index — one line per entry: "- [Title](file.md) — hook"
-        <slug>.md           # individual memory entries (with optional frontmatter)
+        <slug>.md           # individual memory entries with frontmatter
+    <config.data_dir>/global/memory/                    # global (every project)
+        MEMORY.md
+        <slug>.md

 The deepagents `MemoryMiddleware` reads every path we pass via the `memory=`
 kwarg of `create_deep_agent` and injects them (concatenated) into the system
@@ -13,19 +16,42 @@ every turn, so updates take effect on the next user message — no agent
 rebuild required.

 `/remember <text>` appends a new entry file and updates the index.  `/forget
-<slug>` deletes the entry file and prunes the index.  Both are project-scoped
-(via `project_key`) so different repos keep separate memory.
+<slug>` deletes the entry file and prunes the index.  Both default to the
+project scope; pass ``scope="global"`` to write into the global directory.
+
+Frontmatter follows the Claude Code auto-memory convention:
+
+    ---
+    name: <slug>
+    description: <one-line hook>
+    type: user | feedback | project | reference
+    ---
+    <body>
+
+Type inference uses simple keyword heuristics (deterministic — no LLM call)
+so `/remember` works offline.  Callers can override with ``--type=feedback``
+on the slash if the heuristic picks the wrong bucket.
+
+API keys / OpenRouter / Anthropic tokens are scrubbed at write time via
+:func:`_scrub_secrets` — the user gets a single warning + a placeholder.
 """

 from __future__ import annotations

 import re
+from dataclasses import dataclass
 from datetime import UTC, datetime
 from pathlib import Path
+from typing import Literal
+
+import yaml

 from .config import Config

-#: Filename of the index file inside each project memory dir.
+MemoryType = Literal["user", "feedback", "project", "reference"]
+_MEMORY_TYPES: tuple[MemoryType, ...] = ("user", "feedback", "project", "reference")
+
+#: Filename of the index file inside each memory dir (project or global).
 INDEX_FILENAME = "MEMORY.md"

 #: Slug character set — kept conservative for filesystem portability.
@@ -34,13 +60,69 @@ _SLUG_RE = re.compile(r"[^a-z0-9_-]+")
 #: Initial index body when bootstrapping a fresh memory directory.
 _INITIAL_INDEX = """# Auto-memory

-This file is an index of stored memories for this project.  Each entry below
-points to a sibling `*.md` file.  Entries are auto-managed by `/remember` and
-`/forget` slash commands — edit by hand if you need finer control.
+This file is an index of stored memories.  Each entry below points to a
+sibling `*.md` file.  Entries are auto-managed by `/remember` and `/forget`
+slash commands — edit by hand if you need finer control.

 ## Entries
 """

+#: Regexes used by `_scrub_secrets`.  Each redacts a recognisable secret
+#: shape: OpenRouter / Anthropic / OpenAI keys + bearer tokens + AWS keys.
+_SECRET_PATTERNS: tuple[tuple[re.Pattern[str], str], ...] = (
+    (re.compile(r"sk-or-[A-Za-z0-9_-]{16,}"), "<redacted:openrouter-key>"),
+    (re.compile(r"sk-ant-[A-Za-z0-9_-]{16,}"), "<redacted:anthropic-key>"),
+    (re.compile(r"sk-[A-Za-z0-9_-]{20,}"), "<redacted:openai-key>"),
+    (re.compile(r"Bearer\s+[A-Za-z0-9._-]{16,}"), "<redacted:bearer-token>"),
+    (re.compile(r"AKIA[0-9A-Z]{16}"), "<redacted:aws-access-key>"),
+)
+
+
+@dataclass(frozen=True)
+class MemoryEntry:
+    """One stored memory.  Parsed from a `<slug>.md` frontmatter + body."""
+
+    name: str
+    description: str
+    memory_type: MemoryType
+    content: str
+    file_path: Path
+
+
+def _scrub_secrets(text: str) -> tuple[str, bool]:
+    """Return ``(scrubbed_text, was_modified)``.
+
+    Iterates `_SECRET_PATTERNS` and replaces every match with a labelled
+    placeholder.  Conservative: any pattern hit redacts the whole match.
+    """
+    out = text
+    modified = False
+    for pat, placeholder in _SECRET_PATTERNS:
+        new = pat.sub(placeholder, out)
+        if new != out:
+            modified = True
+            out = new
+    return out, modified
+
+
+def _infer_memory_type(content: str, explicit: MemoryType | None = None) -> MemoryType:
+    """Deterministic keyword-based classifier (no LLM call).
+
+    Falls back to ``project`` when nothing matches.  Designed to be cheap +
+    predictable — `/remember "fish shell"` always lands in ``user``,
+    `/remember "don't mock the database"` in ``feedback``, etc.
+    """
+    if explicit is not None:
+        return explicit
+    text = content.lower()
+    if any(k in text for k in ("don't ", "dont ", "stop ", "never ", "no longer ", "instead of")):
+        return "feedback"
+    if any(k in text for k in ("i ", "i'm ", "i am ", "my ", "prefer", "fish shell", "user is")):
+        return "user"
+    if any(k in text for k in ("see http", "linear ", "github.com", "channel ", "dashboard")):
+        return "reference"
+    return "project"
+

 def project_memory_dir(config: Config, project_key: str) -> Path:
    """Return the absolute directory path for this project's memory."""
@@ -49,6 +131,11 @@ def project_memory_dir(config: Config, project_key: str) -> Path:
    return Path(config.data_dir) / "projects" / project_key / "memory"


+def global_memory_dir(config: Config) -> Path:
+    """Return the absolute directory path for the user's global memory."""
+    return Path(config.data_dir) / "global" / "memory"
+
+
 def ensure_memory_initialized(memory_dir: Path) -> Path:
    """Create the memory directory + initial MEMORY.md if missing.

@@ -95,27 +182,43 @@ def _now_iso() -> str:
    return datetime.now(UTC).isoformat(timespec="seconds")


+@dataclass(frozen=True)
+class WriteResult:
+    """Outcome of `add_memory_entry`.  Carries the file path + whether
+    secret-scrubbing kicked in (so the slash handler can warn the user)."""
+
+    path: Path
+    memory_type: MemoryType
+    scrubbed: bool
+
+
 def add_memory_entry(
    memory_dir: Path,
    content: str,
    *,
    name: str | None = None,
-) -> Path:
+    description: str | None = None,
+    memory_type: MemoryType | None = None,
+) -> WriteResult:
    """Write a new memory file + append pointer to the index.

-    - ``name`` (optional): explicit slug.  If omitted, derived from the first
-      line of ``content`` via :func:`_slugify`.
-    - File names collide → ``-2``, ``-3``, … suffix is appended until unique.
+    - ``name`` (optional): explicit slug.  Default = slugified first line.
+    - ``description`` (optional): one-line hook for the index pointer.
+      Default = first line of content (no leading ``#``).
+    - ``memory_type`` (optional): override the heuristic classifier.

-    Returns the absolute path to the newly written file.  Raises
-    ``ValueError`` for empty content.
+    Secret-shaped substrings (OpenRouter/Anthropic/OpenAI keys, AWS access
+    keys, bearer tokens) are redacted via :func:`_scrub_secrets` before
+    write — the ``WriteResult.scrubbed`` flag tells the caller to warn the
+    user.  Empty/whitespace content raises ``ValueError``.
    """
    if not content or not content.strip():
        raise ValueError("memory content must be non-empty")

    ensure_memory_initialized(memory_dir)
+    safe_content, scrubbed = _scrub_secrets(content.strip())

-    first_line = content.strip().splitlines()[0]
+    first_line = safe_content.splitlines()[0]
    slug_base = _slugify(name or first_line)
    candidate = memory_dir / f"{slug_base}.md"
    n = 2
@@ -123,20 +226,82 @@ def add_memory_entry(
        candidate = memory_dir / f"{slug_base}-{n}.md"
        n += 1

-    # File body: short frontmatter + content.  The frontmatter is informational
-    # for human readers; the deepagents middleware does not parse it.
-    body = f"---\nslug: {candidate.stem}\ncreated: {_now_iso()}\n---\n\n{content.strip()}\n"
+    inferred_type = _infer_memory_type(safe_content, memory_type)
+    hook = (description or first_line.strip().lstrip("# ").strip())[:120] or candidate.stem
+
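+    # Serialise the frontmatter with YAML so a hook containing ": ", "#" or
+    # quotes still round-trips through `read_entry` (which uses safe_load).
+    frontmatter = yaml.safe_dump(
+        {
+            "name": candidate.stem,
+            "description": hook,
+            "type": inferred_type,
+            "created": _now_iso(),
+        },
+        allow_unicode=True,
+        sort_keys=False,
+    )
+    body = f"---\n{frontmatter}---\n\n{safe_content}\n"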
    candidate.write_text(body, encoding="utf-8")

-    # Append a one-line pointer to the index — first line of content is the
-    # title, truncated to keep the index scannable.
-    title = first_line.strip().lstrip("# ").strip()[:80] or candidate.stem
-    pointer = f"- [{title}]({candidate.name}) — {_now_iso()}\n"
+    pointer = f"- [{hook}]({candidate.name}) — type:{inferred_type}\n"
    index_path = memory_dir / INDEX_FILENAME
    with index_path.open("a", encoding="utf-8") as f:
        f.write(pointer)

-    return candidate
+    return WriteResult(path=candidate, memory_type=inferred_type, scrubbed=scrubbed)
+
+
+def read_entry(file_path: Path) -> MemoryEntry | None:
+    """Parse a single ``<slug>.md`` file into a :class:`MemoryEntry`.
+
+    Returns None for files with malformed/missing frontmatter — the caller
+    can decide whether to surface the issue.  Falls back to ``project`` when
+    `type` is missing or unrecognised.
+    """
+    if not file_path.is_file():
+        return None
+    try:
+        raw = file_path.read_text(encoding="utf-8")
+    except OSError:
+        return None
+    if not raw.startswith("---"):
+        return None
+    parts = raw.split("---", 2)
+    if len(parts) < 3:
+        return None
+    try:
+        meta = yaml.safe_load(parts[1]) or {}
+    except yaml.YAMLError:
+        return None
+    if not isinstance(meta, dict):
+        return None
+    name = str(meta.get("name", file_path.stem)).strip()
+    description = str(meta.get("description", "")).strip() or "(no description)"
+    raw_type = str(meta.get("type", "project")).strip().lower()
+    mt: MemoryType = "project"
+    for known in _MEMORY_TYPES:
+        if raw_type == known:
+            mt = known
+            break
+    return MemoryEntry(
+        name=name,
+        description=description,
+        memory_type=mt,
+        content=parts[2].lstrip("\n"),
+        file_path=file_path,
+    )
+
+
+def read_index_entries(memory_dir: Path) -> list[MemoryEntry]:
+    """Return parsed :class:`MemoryEntry` for every `*.md` in the dir except
+    ``MEMORY.md`` itself.  Sorted by filename.  Malformed files are skipped."""
+    if not memory_dir.is_dir():
+        return []
+    out: list[MemoryEntry] = []
+    for p in sorted(memory_dir.glob("*.md")):
+        if p.name == INDEX_FILENAME:
+            continue
+        entry = read_entry(p)
+        if entry is not None:
+            out.append(entry)
+    return out


 def remove_memory_entry(memory_dir: Path, slug_or_filename: str) -> bool:
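Review note: the ordering of `_SECRET_PATTERNS` matters, because the generic `sk-` pattern would also match OpenRouter and Anthropic keys. The sketch below copies three of the patterns from the diff into a standalone `scrub` function (a hypothetical name) to show that the provider-specific label wins when its pattern runs first. The key values are made-up placeholders, not real credentials.

```python
import re

# Subset of the patterns from the diff; specific prefixes come before the generic one.
PATTERNS: tuple[tuple[re.Pattern[str], str], ...] = (
    (re.compile(r"sk-or-[A-Za-z0-9_-]{16,}"), "<redacted:openrouter-key>"),
    (re.compile(r"sk-[A-Za-z0-9_-]{20,}"), "<redacted:openai-key>"),
    (re.compile(r"Bearer\s+[A-Za-z0-9._-]{16,}"), "<redacted:bearer-token>"),
)


def scrub(text: str) -> tuple[str, bool]:
    """Return (scrubbed_text, was_modified), replacing each match with its label."""
    out = text
    modified = False
    for pattern, placeholder in PATTERNS:
        new = pattern.sub(placeholder, out)
        if new != out:
            modified = True
            out = new
    return out, modified


FAKE_KEY = "sk-or-" + "a" * 24  # placeholder shaped like an OpenRouter key
SCRUBBED, CHANGED = scrub(f"use {FAKE_KEY} for staging")
CLEAN, CLEAN_CHANGED = scrub("prefer fish shell")
```

If the generic pattern were listed first, the same input would be labelled `<redacted:openai-key>`, so the order in the tuple is part of the behaviour.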
--- a/my-deepagent/src/my_deepagent/middleware/cost.py
+++ b/my-deepagent/src/my_deepagent/middleware/cost.py
@@ -56,6 +56,7 @@ class CostMiddleware(AgentMiddleware):
                run_id=self.run_id,
                persona_name=self.persona_name,
                estimated_cost_usd=estimated,
+                session_id=self.interactive_session_id,
            )
        started = time.perf_counter()
        try:
@@ -104,6 +105,7 @@ class CostMiddleware(AgentMiddleware):
                run_id=self.run_id,
                persona_name=self.persona_name,
                actual_cost_usd=actual,
+                session_id=self.interactive_session_id,
            )
        return response

--- a/my-deepagent/src/my_deepagent/middleware/plan_mode.py
+++ b/my-deepagent/src/my_deepagent/middleware/plan_mode.py
@@ -0,0 +1,114 @@
+"""PlanModeMiddleware (v0.3 PR #5) — block write tools when plan-mode is active.
+
+Claude Code's plan mode lets the user say "design this, don't write code": the
+agent can read, search, and draft a plan in its reply, but cannot mutate the
+filesystem, run shell commands, or commit todos until the user `/approve`s.
+
+Implementation strategy:
+- A callable ``is_active()`` is passed in at construction time.  The REPL flips
+  a flag on/off via slash commands; the middleware re-reads on every tool call.
+  This avoids rebuilding the agent on every `/plan` / `/approve` toggle.
+- When plan-mode is on and the LLM calls a blocked tool, we return a synthetic
+  ``ToolMessage(status="error", ...)`` so the LLM sees feedback and can adjust
+  ("ok, I'll keep planning instead").  We do NOT raise — that would crash the
+  turn and the user would lose the partial response.
+
+Blocked tools (matches Claude Code's ExitPlanMode-required tool set):
+    - ``write_file``, ``edit_file`` — fs mutation
+    - ``bash`` / ``execute`` / ``run_command`` / ``shell`` — shell exec
+    - ``task`` — sub-agent spawn (a sub-agent could bypass plan mode)
+    - ``write_todos`` — todos are PART of the plan markdown.  Plan-mode
+      forbids commits to the agent's TODO list; the user reviews the plan
+      first, then /approve unlocks both writes and the TODO list.
+"""
+
+from __future__ import annotations
+
+from collections.abc import Callable
+from typing import Any
+
+from langchain.agents.middleware import AgentMiddleware
+from langchain_core.messages import ToolMessage
+
+#: Tool names that mutate the filesystem.
+_FS_WRITE_TOOLS: frozenset[str] = frozenset({"write_file", "edit_file"})
+
+#: Tool names that execute shell commands.
+_SHELL_TOOLS: frozenset[str] = frozenset({"bash", "execute", "run_command", "shell"})
+
+#: Tool names that spawn sub-agents (which would bypass plan mode in the parent).
+_SUBAGENT_TOOLS: frozenset[str] = frozenset({"task"})
+
+#: Plan-mode forbids committing to a TODO list — todos are part of the
+#: plan markdown that the user reviews before /approve.
+_PLANNING_TOOLS: frozenset[str] = frozenset({"write_todos"})
+
+#: Full blocklist applied while plan mode is on.
+BLOCKED_TOOLS_IN_PLAN_MODE: frozenset[str] = (
+    _FS_WRITE_TOOLS | _SHELL_TOOLS | _SUBAGENT_TOOLS | _PLANNING_TOOLS
+)
+
+
+def _block_message(tool_name: str) -> str:
+    return (
+        f"Plan-mode is active: `{tool_name}` is blocked. "
+        "Keep planning with read_file / glob / grep, "
+        "or ask the user to `/approve` to leave plan mode."
+    )
+
+
+class PlanModeMiddleware(AgentMiddleware):
+    """Block mutating tool calls while plan-mode is active.
+
+    Construction takes an ``is_active`` callable that returns the current plan
+    mode state.  The REPL toggles this state via slash commands without
+    rebuilding the agent — the middleware reads it fresh per tool call.
+
+    Read-only tools (``read_file``, ``glob``, ``grep``, ``ls``) are always
+    allowed in plan mode; ``write_todos`` is blocked along with the mutating
+    tools (see ``BLOCKED_TOOLS_IN_PLAN_MODE``).
+    """
+
+    def __init__(self, *, is_active: Callable[[], bool]) -> None:
+        self._is_active = is_active
+
+    async def awrap_tool_call(self, request: Any, handler: Any) -> Any:
+        if not self._is_active():
+            return await handler(request)
+        name = _tool_name(request)
+        if name in BLOCKED_TOOLS_IN_PLAN_MODE:
+            return ToolMessage(
+                content=_block_message(name),
+                tool_call_id=_tool_call_id(request),
+                name=name,
+                status="error",
+            )
+        return await handler(request)
+
+    def wrap_tool_call(self, request: Any, handler: Any) -> Any:
+        # Sync path mirrors the async one for parity (e.g. when the agent is
+        # invoked synchronously in unit tests).  Real REPL/Web paths are async.
+        if not self._is_active():
+            return handler(request)
+        name = _tool_name(request)
+        if name in BLOCKED_TOOLS_IN_PLAN_MODE:
+            return ToolMessage(
+                content=_block_message(name),
+                tool_call_id=_tool_call_id(request),
+                name=name,
+                status="error",
+            )
+        return handler(request)
+
+
+def _tool_name(request: Any) -> str:
+    tool_call = getattr(request, "tool_call", None)
+    if isinstance(tool_call, dict):
+        return str(tool_call.get("name") or "")
+    return str(getattr(request, "name", "") or "")
+
+
+def _tool_call_id(request: Any) -> str:
+    tool_call = getattr(request, "tool_call", None)
+    if isinstance(tool_call, dict):
+        return str(tool_call.get("id") or "")
+    return str(getattr(request, "id", "") or "")
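Review note: the gating pattern in `PlanModeMiddleware` can be tried in isolation. The sketch below is a minimal stand-in, not the middleware itself: `PlanGate`, `ToolResult`, and the tool names in `BLOCKED` are hypothetical and chosen only for illustration. It shows why passing an `is_active` callable (rather than a boolean) lets the REPL toggle plan mode without rebuilding the agent.

```python
from collections.abc import Callable
from dataclasses import dataclass

# Hypothetical tool names, assumed only for this illustration.
BLOCKED = frozenset({"write_file", "edit_file", "execute"})


@dataclass
class ToolResult:
    name: str
    content: str
    status: str = "success"


class PlanGate:
    """Refuse blocked tools while the is_active() flag is true."""

    def __init__(self, *, is_active: Callable[[], bool]) -> None:
        self._is_active = is_active

    def wrap(self, name: str, handler: Callable[[], str]) -> ToolResult:
        # The flag is read on every call, so toggling it needs no rebuild.
        if self._is_active() and name in BLOCKED:
            return ToolResult(name, f"`{name}` is blocked in plan mode", "error")
        return ToolResult(name, handler())


state = {"plan_mode": True}
gate = PlanGate(is_active=lambda: state["plan_mode"])

blocked = gate.wrap("write_file", lambda: "wrote 3 bytes")
allowed = gate.wrap("read_file", lambda: "file body")
state["plan_mode"] = False  # e.g. after the user runs /approve
after_approve = gate.wrap("write_file", lambda: "wrote 3 bytes")
```

Returning an error result instead of raising keeps the refusal visible to the model as an ordinary tool outcome, which is the same choice the diff makes with `ToolMessage(status="error")`.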
--- a/my-deepagent/src/my_deepagent/persistence/db.py
+++ b/my-deepagent/src/my_deepagent/persistence/db.py
@@ -75,10 +75,17 @@ class Database:
    """

    def __init__(self, database_url: str) -> None:
+        # v0.3 hotfix: Postgres asyncpg pool occasionally hands out stale
+        # connections whose underlying socket was closed by the server (idle
+        # timeout, container restart, network blip, …).  `pool_pre_ping`
+        # adds a fast ping before each checkout and invalidates dead
+        # connections so the next acquire dials a fresh one — fixes the
+        # "InterfaceError: connection is closed" 500 seen under SSE load.
        self._engine: AsyncEngine = create_async_engine(
            database_url,
            poolclass=None,
            echo=False,
+            pool_pre_ping=True,
        )
        _attach_dialect_pragmas(self._engine)
        self._session_factory: async_sessionmaker[AsyncSession] = async_sessionmaker(
--- a/my-deepagent/src/my_deepagent/session.py
+++ b/my-deepagent/src/my_deepagent/session.py
@@ -153,6 +153,9 @@ def resolve_model_instance(
            max_tokens=params.get("max_tokens", 4096),
            temperature=params.get("temperature", 0.2),
            top_p=params.get("top_p", 1.0),
+            # v0.4 B3: enable token streaming so AsyncCallbackHandler.on_llm_new_token
+            # receives chunks during ainvoke.  Final response is unchanged.
+            streaming=True,
        )
    return model_spec

--- a/my-deepagent/src/my_deepagent/skills.py
+++ b/my-deepagent/src/my_deepagent/skills.py
@@ -1,12 +1,11 @@
 """Agent Skills (v0.3 PR #4) — LLM-routed progressive disclosure.

-Layout::
+Layout (two scopes, mirrors Claude Code's `~/.claude/skills/` + repo overlay):

-    <config.data_dir>/skills/<skill-name>/SKILL.md
-                                          [optional supporting files]
+    <config.data_dir>/skills/<name>/SKILL.md                # global / user
+    <config.data_dir>/projects/<project_key>/skills/<name>/SKILL.md   # project

-We mount this single directory as a source for ``deepagents.SkillsMiddleware``
-which:
+We mount both directories as sources for ``deepagents.SkillsMiddleware`` which:

 1. Parses every ``SKILL.md`` YAML frontmatter (``name``, ``description``, …)
 2. Injects an index of ``(name, description)`` pairs into the system prompt
@@ -14,15 +13,18 @@ which:
   ``read_file`` — no embeddings, no per-token vector lookup, no custom
   routing logic.  Anthropic's Agent Skills specification verbatim.

-The skill name in the YAML frontmatter must match the parent directory name
-(``deepagents`` enforces this) — e.g. a skill directory ``web-research/``
-needs ``name: web-research`` inside its ``SKILL.md``.
+Per ``deepagents.SkillsMiddleware`` semantics, later sources override earlier
+ones at the same skill name — so project-scope wins over global-scope, which
+matches the Claude Code precedence.

-PR #4 keeps the surface area small: we mount one user-scope source and expose
-``/skills`` (list) and ``/skill <name>`` (show full body for inspection)
-slashes.  Project-scope skills (``<repo>/.mydeepagent/skills/``) are NOT wired
-in this PR — call sites can later layer them by passing additional sources
-through ``build_agent(skills_sources_override=...)``.
+The skill name in the YAML frontmatter must match the parent directory name.
+
+PR #4 slashes:
+- ``/skills``: list installed skills (project + global, with scope label)
+- ``/skills show <name>``: REPL output only (inspection)
+- ``/skill <name>``: inject the SKILL.md body as a one-shot system message
+  on the next ainvoke (the LLM treats it as an explicit "use this skill"
+  directive for this turn).
 """

 from __future__ import annotations
@@ -46,22 +48,34 @@ _MAX_SKILL_READ_BYTES = 10 * 1024 * 1024
 class SkillInfo:
    """Lightweight summary of one installed skill — used by `/skills` slash.

-    Fields are derived from the YAML frontmatter inside ``SKILL.md``:
    - ``name``: directory name (also enforced inside frontmatter by deepagents)
    - ``description``: 1-line summary, truncated if very long
-    - ``path``: absolute path of the ``SKILL.md`` for `/skill <name>` body display
+    - ``path``: absolute path of the ``SKILL.md`` for body display
+    - ``scope``: ``"project"`` (repo-local) or ``"global"`` (user-wide)
    """

    name: str
    description: str
    path: Path
+    scope: str = "global"


 def user_skills_dir(config: Config) -> Path:
-    """Return the user-scope skills directory (``<data_dir>/skills``)."""
+    """Return the global / user-wide skills directory (``<data_dir>/skills``)."""
    return Path(config.data_dir) / "skills"


+def project_skills_dir(config: Config, project_key: str) -> Path:
+    """Return the project-scope skills directory.
+
+    Stored under ``<data_dir>/projects/<project_key>/skills/`` to keep all
+    project-scoped artefacts (memory, skills) under a single parent path.
+    """
+    if not project_key:
+        raise ValueError("project_key must be non-empty")
+    return Path(config.data_dir) / "projects" / project_key / "skills"
+
+
 def ensure_skills_initialized(skills_dir: Path) -> None:
    """Create the skills directory if missing.

@@ -116,7 +130,7 @@ def _parse_skill_md(path: Path) -> SkillInfo | None:
    return SkillInfo(name=name, description=description, path=path)


-def list_installed_skills(skills_dir: Path) -> list[SkillInfo]:
+def list_installed_skills(skills_dir: Path, *, scope: str = "global") -> list[SkillInfo]:
    """Scan the directory for ``<name>/SKILL.md`` entries and return summaries.

    - Sorted by name for deterministic UX
@@ -136,10 +150,33 @@ def list_installed_skills(skills_dir: Path) -> list[SkillInfo]:
            continue
        info = _parse_skill_md(skill_md)
        if info is not None:
-            found.append(info)
+            found.append(
+                SkillInfo(name=info.name, description=info.description, path=info.path, scope=scope)
+            )
    return found


+def list_all_skills(config: Config, project_key: str) -> list[SkillInfo]:
+    """Merged project + global skill list.  Project wins on name collision."""
+    global_skills = list_installed_skills(user_skills_dir(config), scope="global")
+    project_skills = list_installed_skills(project_skills_dir(config, project_key), scope="project")
+    project_names = {s.name for s in project_skills}
+    merged = [s for s in global_skills if s.name not in project_names]
+    merged.extend(project_skills)
+    merged.sort(key=lambda s: s.name)
+    return merged
+
+
+def find_skill(config: Config, project_key: str, name: str) -> SkillInfo | None:
+    """Resolve a skill by name, preferring project-scope over global."""
+    if not name:
+        return None
+    for skill in list_all_skills(config, project_key):
+        if skill.name == name:
+            return skill
+    return None
+
+
 def read_skill_body(skills_dir: Path, name: str) -> str | None:
    """Return the full SKILL.md content for the named skill, or None if missing.

@@ -160,11 +197,16 @@ def read_skill_body(skills_dir: Path, name: str) -> str | None:
        return None


-def resolve_skill_sources(config: Config) -> list[str]:
+def resolve_skill_sources(config: Config, project_key: str | None = None) -> list[str]:
    """Build the list of skill-directory sources to pass to deepagents.

-    Currently a single-entry list (user-scope).  Designed to be extended with
-    project-scope and team-scope sources in later PRs without changing the
-    caller interface.
+    Order: global first, then project.  ``deepagents.SkillsMiddleware`` lets
+    later sources win, so a project skill overrides a global one of the same name.
+    Returns absolute paths.  Project source is omitted when no
+    ``project_key`` is supplied (e.g. workflow-engine call sites that don't
+    have a project context).
    """
-    return [str(user_skills_dir(config).resolve())]
+    sources = [str(user_skills_dir(config).resolve())]
+    if project_key:
+        sources.append(str(project_skills_dir(config, project_key).resolve()))
+    return sources
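Review note: the "global first, then project, later wins" ordering described in `resolve_skill_sources` can be checked with a small stand-in. This is a sketch under stated assumptions, not the `deepagents.SkillsMiddleware` implementation: `index_skills` is a hypothetical helper, and the directory names and skill names are made up for the example.

```python
import tempfile
from pathlib import Path


def index_skills(sources: list[str]) -> dict[str, Path]:
    """Map skill name -> SKILL.md path; later sources override earlier ones."""
    index: dict[str, Path] = {}
    for source in sources:
        root = Path(source)
        if not root.is_dir():
            continue  # a missing source is simply skipped
        for skill_md in sorted(root.glob("*/SKILL.md")):
            index[skill_md.parent.name] = skill_md
    return index


with tempfile.TemporaryDirectory() as tmp:
    global_dir = Path(tmp) / "skills"
    project_dir = Path(tmp) / "projects" / "demo" / "skills"
    for base, names in ((global_dir, ["haiku", "review"]), (project_dir, ["haiku"])):
        for name in names:
            (base / name).mkdir(parents=True)
            (base / name / "SKILL.md").write_text(f"---\nname: {name}\n---\n")

    # Same order as the diff: global source first, project source second.
    resolved = index_skills([str(global_dir), str(project_dir)])
    haiku_scope = "project" if project_dir in resolved["haiku"].parents else "global"
    review_scope = "project" if project_dir in resolved["review"].parents else "global"
```

Here the `haiku` skill exists in both scopes and resolves to the project copy, while `review` exists only globally and is left untouched.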
--- a/my-deepagent/src/my_deepagent/subagents.py
+++ b/my-deepagent/src/my_deepagent/subagents.py
@@ -0,0 +1,389 @@
+"""Sub-agent session linkage + runner (v0.3 PR #6).
+
+PR #1 already added `parent_session_id` + `depth` columns to
+`InteractiveSessionRow`.  This module provides:
+
+- :func:`spawn_subagent_session` — atomically creates a child row inheriting
+  ``project_key`` from the parent, sets ``parent_session_id`` + ``depth =
+  parent.depth + 1``, rejects when depth would exceed
+  :data:`MAX_SUBAGENT_DEPTH`.
+- :func:`list_subagents` — direct children for ``/agents`` listings.
+- :func:`resolve_root_session_id` — walk the parent chain to find the root.
+- :func:`run_subagent_to_completion` — actually invoke a sub-agent's
+  ``ainvoke`` with isolation + LangGraph thread + summary push to parent.
+
+Cost rollup: each sub-agent's CostMiddleware is wired with the ROOT session
+id so all LLM calls under that session tree charge a single ``session:<uuid>``
+scope, matching the plan ("sub-agent usage counts toward the root session's limit").
+"""
+
+from __future__ import annotations
+
+import logging
+from collections.abc import Sequence
+from datetime import UTC, datetime
+from typing import Any
+from uuid import UUID, uuid4
+
+from sqlalchemy import desc, select
+
+from .audit import make_audit_recorder
+from .budget import make_budget_tracker_from_config
+from .compaction import compact_session
+from .config import Config
+from .errors import MyDeepAgentError
+from .memory import (
+    ensure_memory_initialized,
+    global_memory_dir,
+    list_memory_paths,
+    project_memory_dir,
+)
+from .middleware.audit import AuditToolMiddleware
+from .middleware.cost import CostMiddleware
+from .middleware.plan_mode import PlanModeMiddleware
+from .monitoring.pricing import ModelPrice, PricingCache
+from .monitoring.token_budget import count_tokens
+from .persistence.db import Database
+from .persistence.models import AgentPersonaRow, InteractiveSessionRow, MessageRow
+from .persona import Persona
+from .session import build_agent
+from .skills import (
+    ensure_skills_initialized,
+    project_skills_dir,
+    resolve_skill_sources,
+    user_skills_dir,
+)
+
+_LOG = logging.getLogger(__name__)
+
+#: Maximum sub-agent nesting depth.  The root session is depth 0, so chains as
+#: deep as Main → A → B → C are accepted and deeper spawns are refused.  A
+#: shallow cap (cf. Claude Code's `task` tool) keeps budgets and audits legible.
+MAX_SUBAGENT_DEPTH: int = 3
+
+
+def _now_iso() -> str:
+    return datetime.now(UTC).isoformat(timespec="seconds")
+
+
+async def spawn_subagent_session(
+    db: Database,
+    *,
+    parent_session_id: UUID,
+    persona: Persona,
+    initial_title: str | None = None,
+) -> UUID:
+    """Create a child :class:`InteractiveSessionRow` linked to ``parent_session_id``.
+
+    The child inherits ``project_key`` from the parent — same memory dir,
+    same skill dir.  ``depth`` is incremented by 1; if that would exceed
+    :data:`MAX_SUBAGENT_DEPTH` we raise ``MyDeepAgentError.human_required(...)``
+    so the caller (REPL slash / API endpoint) can surface a clean message.
+
+    The persona may be different from the parent's (callers often want a
+    specialised role for the child), so ``persona`` is required.  We upsert
+    its ``AgentPersonaRow`` for the FK exactly like
+    :func:`cli.interactive._load_or_create_session_row` does for root rows.
+
+    Returns the new child ``session_id``.
+    """
+    async with db.session() as s:
+        parent = await s.get(InteractiveSessionRow, str(parent_session_id))
+        if parent is None:
+            raise MyDeepAgentError.human_required(
+                "parent_session_missing",
+                message=f"cannot spawn sub-agent: parent session {parent_session_id} not found",
+                recovery_hint="confirm the parent session id; sub-agents require a live parent",
+            )
+        if parent.state == "ended":
+            raise MyDeepAgentError.human_required(
+                "parent_session_ended",
+                message=f"cannot spawn sub-agent: parent {parent.id} is ended",
+                recovery_hint="resume the parent session first or pick a different parent",
+            )
+        new_depth = (parent.depth or 0) + 1
+        if new_depth > MAX_SUBAGENT_DEPTH:
+            raise MyDeepAgentError.human_required(
+                "subagent_depth_exceeded",
+                message=(
+                    f"sub-agent depth limit reached: parent depth={parent.depth}, "
+                    f"max={MAX_SUBAGENT_DEPTH}"
+                ),
+                recovery_hint=(
+                    f"flatten the agent stack (max {MAX_SUBAGENT_DEPTH} levels) "
+                    "or close intermediate sub-agents first"
+                ),
+            )
+
+        # Upsert AgentPersonaRow for the persona we're spawning with.
+        ph = persona.compute_hash()
+        persona_row = (
+            await s.execute(select(AgentPersonaRow).where(AgentPersonaRow.hash == ph))
+        ).scalar_one_or_none()
+        if persona_row is None:
+            persona_row = AgentPersonaRow(
+                id=str(uuid4()),
+                name=persona.name,
+                version=persona.version,
+                hash=ph,
+                definition=persona.model_dump(by_alias=True),
+                created_at=_now_iso(),
+            )
+            s.add(persona_row)
+            await s.flush()
+
+        child_id = uuid4()
+        child = InteractiveSessionRow(
+            id=str(child_id),
+            persona_id=persona_row.id,
+            persona_hash=ph,
+            started_at=_now_iso(),
+            last_message_at=None,
+            state="active",
+            total_input_tokens=0,
+            total_output_tokens=0,
+            model=persona.model,
+            project_key=parent.project_key,  # inherit so memory is shared
+            title=initial_title,
+            plan_mode=False,
+            parent_session_id=parent.id,
+            depth=new_depth,
+        )
+        s.add(child)
+        await s.commit()
+        return child_id
+
+
+async def list_subagents(db: Database, parent_session_id: UUID) -> list[InteractiveSessionRow]:
+    """Return all direct children of ``parent_session_id``, oldest first.
+
+    Used by the ``/agents`` slash and the Web GUI session tree.  Does NOT
+    recurse — callers that want the full tree must walk it themselves.
+    """
+    async with db.session() as s:
+        rows: Sequence[InteractiveSessionRow] = (
+            (
+                await s.execute(
+                    select(InteractiveSessionRow)
+                    .where(InteractiveSessionRow.parent_session_id == str(parent_session_id))
+                    .order_by(InteractiveSessionRow.started_at)
+                )
+            )
+            .scalars()
+            .all()
+        )
+        return list(rows)
+
+
+async def resolve_root_session_id(db: Database, session_id: UUID) -> UUID:
+    """Walk ``parent_session_id`` until we reach a session with ``parent=None``.
+
+    Guarded against cycles, which could only occur if the depth column lied:
+    the walk is capped at ``MAX_SUBAGENT_DEPTH + 2`` iterations.  Returns the
+    input id when the session has no parent or a row on the chain is missing.
+    """
+    current = str(session_id)
+    for _ in range(MAX_SUBAGENT_DEPTH + 2):
+        async with db.session() as s:
+            row = await s.get(InteractiveSessionRow, current)
+        if row is None:
+            return session_id
+        if row.parent_session_id is None:
+            return UUID(row.id)
+        current = row.parent_session_id
+    # Cycle detected — return the latest hop as a graceful fallback.
+    return UUID(current)
+
+
+_SUBAGENT_SUMMARY_INSTRUCTION = (
+    "당신은 sub-agent 입니다.  사용자가 요청한 과제를 마치고 한 번의 응답 안에 "
+    "(1) 도달한 결론, (2) 변경한 파일/생성한 산출물, (3) 부모 세션에 전달할 핵심 "
+    "요약 (≤ 400 단어) 을 정리하세요.  추가 turn 은 일어나지 않습니다."
+)
+
+
+def _static_pricing_seed() -> PricingCache:
+    cache = PricingCache()
+    cache.set(
+        [
+            ModelPrice("anthropic/claude-sonnet-4-6", 0.003, 0.015, 200_000),
+            ModelPrice("anthropic/claude-haiku-4-5", 0.001, 0.005, 200_000),
+            ModelPrice("anthropic/claude-opus-4-1", 0.015, 0.075, 200_000),
+            ModelPrice("deepseek/deepseek-chat", 0.00028, 0.00112, 64_000),
+        ]
+    )
+    return cache
+
+
+def _flatten_assistant_content(msg: Any) -> str:
+    content = getattr(msg, "content", "") or ""
+    if isinstance(content, list):
+        return "\n".join(
+            (b.get("text", str(b)) if isinstance(b, dict) else str(b)) for b in content
+        )
+    return str(content)
+
+
+async def _persist_message(
+    db: Database, session_id: UUID, role: str, content: str, *, model: str
+) -> None:
+    """Insert one MessageRow + bump last_message_at + token totals.
+
+    Mirrors the REPL's ``_append_message`` but lives in subagents.py so the
+    runner stays self-contained.
+    """
+    token_count = count_tokens(content, model)
+    now = _now_iso()
+    async with db.session() as s:
+        last_seq = (
+            await s.execute(
+                select(MessageRow.seq)
+                .where(MessageRow.session_id == str(session_id))
+                .order_by(desc(MessageRow.seq))
+                .limit(1)
+            )
+        ).scalar_one_or_none() or 0
+        s.add(
+            MessageRow(
+                session_id=str(session_id),
+                seq=last_seq + 1,
+                role=role,
+                content=content,
+                tool_calls=None,
+                token_count=token_count,
+                is_summary=False,
+                archived=False,
+                ts=now,
+            )
+        )
+        row = await s.get(InteractiveSessionRow, str(session_id))
+        if row is not None:
+            row.last_message_at = now
+            if role == "user":
+                row.total_input_tokens += token_count
+            elif role == "assistant":
+                row.total_output_tokens += token_count
+        await s.commit()
+
+
+async def run_subagent_to_completion(
+    db: Database,
+    config: Config,
+    parent_session_id: UUID,
+    sub_session_id: UUID,
+    persona: Persona,
+    prompt: str,
+    *,
+    saver: Any | None = None,
+) -> str:
+    """Invoke the sub-agent ONCE with the supplied prompt and return its summary.
+
+    - Loads the sub-session row to read ``project_key`` (inherited from parent)
+    - Resolves the root session id and wires CostMiddleware to charge that
+      single ``session:<root_uuid>`` scope so the whole agent tree shares one
+      budget envelope (per plan: "sub-agent usage counts toward the root session's limit").
+    - Persists user prompt + assistant summary to the sub-session.
+    - Pushes a "[sub-agent <id> result] …" system message to the parent so
+      the user sees the outcome in the main thread.
+    - Marks the sub-session ``ended`` on completion.
+
+    Failures are logged + propagated as a synthetic assistant message in the
+    sub-session, and an error system message in the parent.
+    """
+    async with db.session() as s:
+        sub_row = await s.get(InteractiveSessionRow, str(sub_session_id))
+    if sub_row is None:
+        raise MyDeepAgentError.fatal(
+            "subagent_session_missing",
+            message=f"sub-agent session {sub_session_id} not found",
+            recovery_hint="call spawn_subagent_session before run_subagent_to_completion",
+        )
+
+    project_key = sub_row.project_key or ""
+    root_session_id = await resolve_root_session_id(db, parent_session_id)
+
+    # Bootstrap shared memory + skills dirs (idempotent).
+    if project_key:
+        ensure_memory_initialized(project_memory_dir(config, project_key))
+        ensure_skills_initialized(project_skills_dir(config, project_key))
+    ensure_memory_initialized(global_memory_dir(config))
+    ensure_skills_initialized(user_skills_dir(config))
+
+    pricing = _static_pricing_seed()
+    budget = make_budget_tracker_from_config(db, config)
+    cost_mw = CostMiddleware(
+        pricing=pricing,
+        model_name=persona.model,
+        interactive_session_id=root_session_id,
+        persona_name=persona.name,
+        budget_tracker=budget,
+    )
+    audit_mw = AuditToolMiddleware(
+        interactive_session_id=sub_session_id,
+        file_recorder=make_audit_recorder(config.state_dir),
+    )
+    plan_mw = PlanModeMiddleware(is_active=lambda: False)
+
+    memory_paths = list_memory_paths(global_memory_dir(config))
+    if project_key:
+        memory_paths += list_memory_paths(project_memory_dir(config, project_key))
+    skill_sources = resolve_skill_sources(config, project_key or None)
+
+    agent = build_agent(
+        persona,
+        config,
+        root_dir=config.workspace_root,
+        middleware=[plan_mw, cost_mw, audit_mw],
+        checkpointer=saver,
+        memory_paths_override=memory_paths,
+        skills_sources_override=skill_sources,
+    )
+
+    full_prompt = f"{prompt.strip()}\n\n---\n\n{_SUBAGENT_SUMMARY_INSTRUCTION}"
+    await _persist_message(db, sub_session_id, "user", full_prompt, model=persona.model)
+
+    thread_id = f"{sub_session_id}:0"
+    try:
+        result = await agent.ainvoke(
+            {"messages": [{"role": "user", "content": full_prompt}]},
+            config={"configurable": {"thread_id": thread_id}},
+        )
+    except Exception as e:
+        _LOG.exception("sub-agent ainvoke failed for session %s", sub_session_id)
+        error_msg = f"sub-agent failed: {type(e).__name__}: {e}"
+        await _persist_message(db, sub_session_id, "assistant", error_msg, model=persona.model)
+        await _persist_message(
+            db,
+            parent_session_id,
+            "system",
+            f"[sub-agent {str(sub_session_id)[:8]} error] {error_msg}",
+            model=persona.model,
+        )
+        await _mark_session_ended(db, sub_session_id)
+        return error_msg
+
+    messages = result.get("messages", []) if isinstance(result, dict) else []
+    summary = _flatten_assistant_content(messages[-1]) if messages else "(empty response)"
+    await _persist_message(db, sub_session_id, "assistant", summary, model=persona.model)
+    await _persist_message(
+        db,
+        parent_session_id,
+        "system",
+        f"[sub-agent {str(sub_session_id)[:8]} result]\n{summary}",
+        model=persona.model,
+    )
+
+    # Compact the sub-session if it grew too big (rare for single-turn but
+    # the helper is idempotent + cheap to call).
+    await compact_session(db, config, str(sub_session_id))
+    await _mark_session_ended(db, sub_session_id)
+    return summary
+
+
+async def _mark_session_ended(db: Database, session_id: UUID) -> None:
+    async with db.session() as s:
+        row = await s.get(InteractiveSessionRow, str(session_id))
+        if row is not None and row.state != "ended":
+            row.state = "ended"
+            row.ended_at = _now_iso()
+            await s.commit()
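Review note: the depth guard in `spawn_subagent_session` and the bounded walk in `resolve_root_session_id` can be exercised without a database. The sketch below is an in-memory approximation: `Row`, `spawn`, and `resolve_root` are hypothetical stand-ins, and a plain `ValueError` replaces `MyDeepAgentError.human_required`.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_DEPTH = 3  # mirrors MAX_SUBAGENT_DEPTH in the diff


@dataclass
class Row:
    id: str
    parent_id: str | None = None
    depth: int = 0


def spawn(rows: dict[str, Row], parent_id: str, child_id: str) -> Row:
    """Create a child one level below its parent, refusing past MAX_DEPTH."""
    parent = rows[parent_id]
    new_depth = parent.depth + 1
    if new_depth > MAX_DEPTH:
        raise ValueError(f"depth limit reached: parent depth={parent.depth}")
    rows[child_id] = Row(child_id, parent_id, new_depth)
    return rows[child_id]


def resolve_root(rows: dict[str, Row], session_id: str) -> str:
    """Follow parent links to the root; the loop bound stops any cycle."""
    current = session_id
    for _ in range(MAX_DEPTH + 2):
        row = rows.get(current)
        if row is None:
            return session_id
        if row.parent_id is None:
            return row.id
        current = row.parent_id
    return current


rows = {"main": Row("main")}
for parent, child in (("main", "a"), ("a", "b"), ("b", "c")):
    spawn(rows, parent, child)

try:
    spawn(rows, "c", "d")
    refused = False
except ValueError:
    refused = True
```

With a root at depth 0, three nested spawns succeed and the fourth is refused, and every node in the chain resolves to the same root, which is what lets the cost middleware charge one shared budget scope.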
--- a/my-deepagent/src/my_deepagent/user_dirs.py
+++ b/my-deepagent/src/my_deepagent/user_dirs.py
@@ -0,0 +1,115 @@
+"""User-scope persona / workflow directories (v0.3 PR #9).
+
+Existing personas live at ``docs/schemas/personas/`` (seeded with the
+my-deepagent install).  Users can drop additional YAML files into
+``<config.data_dir>/personas/`` and ``<config.data_dir>/workflows/`` to
+register their own — these are layered ON TOP of the seed (user version
+wins on `(name, version)` collision).
+
+This module exposes:
+
+- :func:`user_personas_dir` / :func:`user_workflows_dir` — path helpers.
+- :func:`ensure_user_dirs_initialized` — `mkdir -p` for both, idempotent.
+- :func:`load_combined_personas` — seed + user, deduplicated by (name, version)
+  with user-overrides-seed semantics.
+- :func:`load_combined_workflows` — seed + user, deduplicated by (name, version).
+"""
+
+from __future__ import annotations
+
+import logging
+from pathlib import Path
+
+from .config import Config
+from .persona import Persona, load_personas_from_dir
+from .workflow import WorkflowTemplate, load_workflow_yaml
+
+_LOG = logging.getLogger(__name__)
+
+
+def user_personas_dir(config: Config) -> Path:
+    return Path(config.data_dir) / "personas"
+
+
+def user_workflows_dir(config: Config) -> Path:
+    return Path(config.data_dir) / "workflows"
+
+
+def ensure_user_dirs_initialized(config: Config) -> None:
+    """`mkdir -p` for both user directories. Idempotent."""
+    user_personas_dir(config).mkdir(parents=True, exist_ok=True)
+    user_workflows_dir(config).mkdir(parents=True, exist_ok=True)
+
+
+def load_combined_personas(config: Config, seed_dir: Path) -> list[Persona]:
+    """Combine seeded + user personas with user-overrides-seed precedence.
+
+    Returns non-overridden seed personas first, then every user persona
+    (overrides included), which suits CLI listings.  Dedupe is keyed on
+    ``(name, version)``.  The seed dir uses strict loading (we want to know
+    if a shipped YAML is broken).  The user dir uses best-effort per-file
+    loading so a single broken file cannot break the REPL.
+    """
+    seed = load_personas_from_dir(seed_dir)
+    user_dir = user_personas_dir(config)
+    user = _safe_load_personas(user_dir) if user_dir.is_dir() else []
+    return _merge_with_user_override(seed, user)
+
+
+def _safe_load_personas(directory: Path) -> list[Persona]:
+    """Best-effort load — skip individual malformed files."""
+    from .persona import load_persona_yaml
+
+    out: list[Persona] = []
+    for p in sorted(directory.glob("*.yaml")):
+        try:
+            out.append(load_persona_yaml(p))
+        except Exception as e:
+            _LOG.warning("skipping invalid persona file %s: %s", p, e)
+    return out
+
+
+def _merge_with_user_override(seed: list[Persona], user: list[Persona]) -> list[Persona]:
+    """Last-wins on `(name, version)`.  Preserves seed order for entries not
+    overridden, then appends all user entries (overrides too) in their own order."""
+    user_keys = {(p.name, p.version) for p in user}
+    merged: list[Persona] = [p for p in seed if (p.name, p.version) not in user_keys]
+    merged.extend(user)
+    return merged
+
+
+def load_combined_workflows(config: Config, seed_dir: Path) -> list[tuple[Path, WorkflowTemplate]]:
+    """Combine seeded + user workflows with user-overrides-seed precedence.
+
+    Returns `[(path, WorkflowTemplate), ...]`.  Malformed YAMLs (seed or user)
+    are logged and skipped, so broken files cannot break the REPL.  Order is
+    non-overridden seed entries first, then every user entry.
+    """
+    seed = _safe_load_workflows(seed_dir)
+    user_dir = user_workflows_dir(config)
+    user = _safe_load_workflows(user_dir) if user_dir.is_dir() else []
+    return _merge_workflows_with_user_override(seed, user)
+
+
+def _safe_load_workflows(directory: Path) -> list[tuple[Path, WorkflowTemplate]]:
+    if not directory.is_dir():
+        return []
+    out: list[tuple[Path, WorkflowTemplate]] = []
+    for p in sorted(directory.glob("*.yaml")):
+        try:
+            out.append((p, load_workflow_yaml(p)))
+        except Exception as e:
+            _LOG.warning("skipping invalid workflow file %s: %s", p, e)
+    return out
+
+
+def _merge_workflows_with_user_override(
+    seed: list[tuple[Path, WorkflowTemplate]],
+    user: list[tuple[Path, WorkflowTemplate]],
+) -> list[tuple[Path, WorkflowTemplate]]:
+    user_keys = {(t.name, t.version) for (_p, t) in user}
+    merged: list[tuple[Path, WorkflowTemplate]] = [
+        (p, t) for (p, t) in seed if (t.name, t.version) not in user_keys
+    ]
+    merged.extend(user)
+    return merged
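Review note: the resulting order of the user-overrides-seed merge is easy to misread, so here is a small worked example. `Entry` is a hypothetical stand-in for `Persona` or `WorkflowTemplate` (only `name` and `version` matter to the merge), and the sample names are invented.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    version: int
    origin: str  # "seed" or "user", kept only to make the result readable


def merge_with_user_override(seed: list[Entry], user: list[Entry]) -> list[Entry]:
    """Drop seed entries shadowed by a user entry, then append all user entries."""
    user_keys = {(e.name, e.version) for e in user}
    merged = [e for e in seed if (e.name, e.version) not in user_keys]
    merged.extend(user)
    return merged


seed = [Entry("planner", 1, "seed"), Entry("coder", 1, "seed")]
user = [Entry("planner", 1, "user"), Entry("critic", 1, "user")]

merged = merge_with_user_override(seed, user)
order = [(e.name, e.origin) for e in merged]
```

The overriding `planner` entry moves to the user section of the list rather than keeping the seed entry's position, so listings built from this result are not sorted by name.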
--- a/my-deepagent/static/app.js
+++ b/my-deepagent/static/app.js
@@ -417,11 +417,926 @@ async function resumeRun() {
  }
 }

+// =============== conversation page (v0.3 PR #8) ===============
+
+const CONV_STATE = {
+  sessionId: null,
+  eventSource: null,
+  lastSeq: 0,
+  awaitingReply: false,
+  streamBuffer: "",  // v0.4 B3: accumulated token deltas while streaming
+};
+
+function $conv(sel) { return document.querySelector(sel); }
+
+function setSendDisabled(disabled) {
+  $conv("#message-input").disabled = disabled;
+  $conv("#send-btn").disabled = disabled;
+}
+
+// v0.4 B4: show the abort button only while a reply is in flight.
+// Callers pass the same flag they hand to setSendDisabled once a reply is awaited.
+function setAbortVisible(visible) {
+  const btn = $conv("#abort-btn");
+  if (!btn) return;
+  btn.style.display = visible ? "inline-block" : "none";
+  btn.disabled = !visible;
+}
+
+function clearMessages() {
+  const list = $conv("#messages");
+  list.replaceChildren();
+}
+
+function showConversationEmpty(show, text) {
+  let el = $conv("#conv-empty");
+  if (!el && show) {
+    el = document.createElement("div");
+    el.id = "conv-empty";
+    el.className = "conv-empty";
+    $conv("#messages").appendChild(el);
+  }
+  if (el) {
+    if (show) {
+      el.textContent = text || "대화를 시작하세요.";
+      el.style.display = "";
+    } else {
+      el.remove();
+    }
+  }
+}
+
+// v0.4 B1: minimal markdown renderer for assistant messages.
+// SECURITY: we ONLY emit DOM nodes built via createElement + textContent.
+// No innerHTML, no insertAdjacentHTML.  This is a tiny subset of Markdown
+// chosen for chat readability — anything we don't understand is rendered as
+// literal text (textContent fallback in the default case).
+function _mdRenderInto(target, raw) {
+  // Code-fence-aware splitter — we walk the input line-by-line and group
+  // lines into blocks (paragraph, code-fence, h#, list).
+  const lines = raw.split("\n");
+  let i = 0;
+  while (i < lines.length) {
+    const line = lines[i];
+
+    // Fenced code block: ```lang
+    const fence = line.match(/^```\s*([\w.-]*)\s*$/);
+    if (fence) {
+      const lang = fence[1];
+      const codeLines = [];
+      i++;
+      while (i < lines.length && !/^```\s*$/.test(lines[i])) {
+        codeLines.push(lines[i]);
+        i++;
+      }
+      if (i < lines.length) i++; // consume closing ```
+      const pre = document.createElement("pre");
+      pre.className = "md-code";
+      const code = document.createElement("code");
+      if (lang) code.className = `language-${lang}`;
+      code.textContent = codeLines.join("\n");
+      pre.appendChild(code);
+      target.appendChild(pre);
+      continue;
+    }
+
+    // ATX header: # through ###### (demoted two levels, clamped at h6)
+    const hdr = line.match(/^(#{1,6})\s+(.*)$/);
+    if (hdr) {
+      const level = hdr[1].length;
+      const h = document.createElement(`h${Math.min(level + 2, 6)}`);
+      h.className = "md-h";
+      _mdInline(h, hdr[2]);
+      target.appendChild(h);
+      i++;
+      continue;
+    }
+
+    // Unordered list block — consecutive "- " or "* "
+    if (/^[-*]\s+/.test(line)) {
+      const ul = document.createElement("ul");
+      ul.className = "md-ul";
+      while (i < lines.length && /^[-*]\s+/.test(lines[i])) {
+        const li = document.createElement("li");
+        _mdInline(li, lines[i].replace(/^[-*]\s+/, ""));
+        ul.appendChild(li);
+        i++;
+      }
+      target.appendChild(ul);
+      continue;
+    }
+
+    // Ordered list: "1. ", "2. ", …
+    if (/^\d+\.\s+/.test(line)) {
+      const ol = document.createElement("ol");
+      ol.className = "md-ol";
+      while (i < lines.length && /^\d+\.\s+/.test(lines[i])) {
+        const li = document.createElement("li");
+        _mdInline(li, lines[i].replace(/^\d+\.\s+/, ""));
+        ol.appendChild(li);
+        i++;
+      }
+      target.appendChild(ol);
+      continue;
+    }
+
+    // Blank line — paragraph separator; skip.
+    if (line.trim() === "") {
+      i++;
+      continue;
+    }
+
+    // Paragraph: greedily consume until blank or block-start.
+    const paraLines = [line];
+    i++;
+    while (
+      i < lines.length
+      && lines[i].trim() !== ""
+      && !/^```/.test(lines[i])
+      && !/^#{1,6}\s+/.test(lines[i])
+      && !/^[-*]\s+/.test(lines[i])
+      && !/^\d+\.\s+/.test(lines[i])
+    ) {
+      paraLines.push(lines[i]);
+      i++;
+    }
+    const p = document.createElement("p");
+    p.className = "md-p";
+    _mdInline(p, paraLines.join("\n"));
+    target.appendChild(p);
+  }
+}
+
+// Inline parser: handles `code`, **bold**, *italic*, [link](url).
+// Emits DOM nodes; never innerHTML.
+function _mdInline(target, text) {
+  // Walk the string, matching the earliest-occurring inline pattern.
+  let remaining = text;
+  while (remaining.length > 0) {
+    const matches = [
+      { re: /`([^`]+)`/, tag: "code" },
+      { re: /\*\*([^*\n]+)\*\*/, tag: "strong" },
+      { re: /(?<!\*)\*([^*\n]+)\*(?!\*)/, tag: "em" },
+      { re: /\[([^\]]+)\]\(([^)\s]+)\)/, tag: "a" },
+    ];
+    let best = null;
+    for (const m of matches) {
+      const hit = remaining.match(m.re);
+      if (hit && (best === null || hit.index < best.hit.index)) {
+        best = { hit, tag: m.tag };
+      }
+    }
+    if (best === null) {
+      target.appendChild(document.createTextNode(remaining));
+      return;
+    }
+    if (best.hit.index > 0) {
+      target.appendChild(document.createTextNode(remaining.slice(0, best.hit.index)));
+    }
+    const el = document.createElement(best.tag);
+    if (best.tag === "a") {
+      // Link: allow only http/https hrefs so javascript: and other schemes
+      // cannot execute.  A rejected URL leaves an inert <a> with no href.
+      const href = best.hit[2];
+      if (/^https?:\/\//.test(href)) el.href = href;
+      el.rel = "noopener noreferrer";
+      el.target = "_blank";
+      el.textContent = best.hit[1];
+    } else {
+      el.textContent = best.hit[1];
+    }
+    target.appendChild(el);
+    remaining = remaining.slice(best.hit.index + best.hit[0].length);
+  }
+}
+
+// v0.4 B2: classify system messages into collapsible "event cards" so the
+// chat thread doesn't drown in [sub-agent ... spawned] / [workflow ... started]
+// notices.  Returns { label, icon, open } for recognised messages (open =
+// whether the card starts expanded), or null when nothing matches.
+function _classifySystemMessage(content) {
+  if (content.startsWith("[sub-agent")) {
+    if (content.includes("result]")) return { label: "Sub-agent result", icon: "🤖", open: true };
+    if (content.includes("error]")) return { label: "Sub-agent error", icon: "⚠️", open: true };
+    return { label: "Sub-agent spawned", icon: "🚀", open: false };
+  }
+  if (content.startsWith("[workflow")) {
+    if (content.includes("started]")) return { label: "Workflow started", icon: "🛠️", open: false };
+    if (content.includes("failed]")) return { label: "Workflow failed", icon: "❌", open: true };
+    return { label: "Workflow event", icon: "✅", open: true };
+  }
+  if (content.startsWith("Earlier conversation history")) {
+    return { label: "Compaction summary", icon: "📝", open: false };
+  }
+  if (content.startsWith("당신은 plan mode")) {
+    return { label: "Plan mode activated", icon: "🧭", open: false };
+  }
+  if (content.startsWith("The user APPROVED")) {
+    return { label: "Approved plan", icon: "✅", open: false };
+  }
+  if (content.startsWith("The user requested skill")) {
+    return { label: "Skill activated", icon: "🪄", open: false };
+  }
+  return null;
+}
+
+function appendMessageBubble(role, content, ts, opts) {
+  showConversationEmpty(false);
+  const list = $conv("#messages");
+  const bubble = document.createElement("div");
+  bubble.className = `msg-bubble role-${role}`;
+  const meta = document.createElement("div");
+  meta.className = "msg-meta";
+  const roleSpan = document.createElement("span");
+  roleSpan.className = "msg-role";
+  roleSpan.textContent = role;
+  const tsSpan = document.createElement("span");
+  tsSpan.className = "msg-ts";
+  tsSpan.textContent = (ts || "").slice(11, 19);
+  meta.appendChild(roleSpan);
+  if (ts) meta.appendChild(tsSpan);
+
+  const body = document.createElement("div");
+  body.className = "msg-body";
+
+  if (role === "system") {
+    // Collapsible event card if we recognise the format; otherwise plain.
+    const cls = _classifySystemMessage(content);
+    if (cls !== null) {
+      bubble.classList.add("role-system-event");
+      const det = document.createElement("details");
+      det.className = "md-system-event";
+      if (cls.open) det.open = true;
+      const sum = document.createElement("summary");
+      const icon = document.createElement("span");
+      icon.className = "event-icon";
+      icon.textContent = cls.icon;
+      const label = document.createElement("span");
+      label.className = "event-label";
+      label.textContent = cls.label;
+      sum.appendChild(icon);
+      sum.appendChild(label);
+      det.appendChild(sum);
+      const inner = document.createElement("div");
+      inner.className = "event-body";
+      _mdRenderInto(inner, content);
+      det.appendChild(inner);
+      body.appendChild(det);
+    } else {
+      _mdRenderInto(body, content);
+    }
+  } else if (role === "assistant" || (opts && opts.renderMarkdown)) {
+    _mdRenderInto(body, content);
+  } else {
+    body.textContent = content;
+  }
+
+  bubble.appendChild(meta);
+  bubble.appendChild(body);
+  list.appendChild(bubble);
+  list.scrollTop = list.scrollHeight;
+  return bubble;
+}
+
+function appendPendingPlaceholder() {
+  const list = $conv("#messages");
+  const placeholder = document.createElement("div");
+  placeholder.id = "pending-placeholder";
+  placeholder.className = "msg-bubble role-assistant pending";
+  const meta = document.createElement("div");
+  meta.className = "msg-meta";
+  const roleSpan = document.createElement("span");
+  roleSpan.className = "msg-role";
+  roleSpan.textContent = "assistant";
+  meta.appendChild(roleSpan);
+  const body = document.createElement("div");
+  body.className = "msg-body";
+  body.textContent = "…";
+  placeholder.appendChild(meta);
+  placeholder.appendChild(body);
+  list.appendChild(placeholder);
+  list.scrollTop = list.scrollHeight;
+  // v0.4 B3: keep a buffer for streamed tokens so we can re-render markdown
+  // once the full text arrives.
+  CONV_STATE.streamBuffer = "";
+}
+
+function removePendingPlaceholder() {
+  const p = $conv("#pending-placeholder");
+  if (p) p.remove();
+  CONV_STATE.streamBuffer = "";
+}
+
+// v0.4 B3: append a streamed token to the pending placeholder's body.
+function appendStreamDelta(text) {
+  const placeholder = $conv("#pending-placeholder");
+  if (!placeholder) return;
+  // Assigning textContent below also replaces the initial "…" indicator on
+  // the first chunk, so no separate clearing step is needed.
+  CONV_STATE.streamBuffer = (CONV_STATE.streamBuffer || "") + text;
+  const body = placeholder.querySelector(".msg-body");
+  if (body) {
+    // Streaming view: keep plain text for speed; the full markdown render
+    // happens only when the final `message` event arrives.
+    body.textContent = CONV_STATE.streamBuffer;
+  }
+  const list = $conv("#messages");
+  if (list) list.scrollTop = list.scrollHeight;
+}
+
+function updateSessionStatePill(state) {
+  const pill = $conv("#session-state-pill");
+  if (!pill) return;
+  if (!state) {
+    pill.textContent = "";
+    pill.className = "conv-session-state";
+    return;
+  }
+  pill.textContent = state;
+  pill.className = `conv-session-state state-${state}`;
+}
+
+function updateSessionModelPill(model) {
+  const pill = $conv("#session-model-pill");
+  if (!pill) return;
+  if (!model) {
+    pill.textContent = "";
+    pill.title = ""; // drop the previous session's "model: …" tooltip
+    pill.className = "conv-model-pill";
+    return;
+  }
+  // Trim the "openrouter:" prefix for display; keep full id in tooltip.
+  const display = model.replace(/^openrouter:/, "");
+  pill.textContent = display;
+  pill.title = `model: ${model}`;
+  pill.className = "conv-model-pill";
+}
+
+async function loadSessionList() {
+  try {
+    const list = await jsonFetch("/sessions?limit=50");
+    const picker = $conv("#session-picker");
+    picker.replaceChildren();
+    const placeholderOpt = document.createElement("option");
+    placeholderOpt.value = "";
+    placeholderOpt.textContent = "(세션 선택…)";
+    picker.appendChild(placeholderOpt);
+    for (const s of list) {
+      const opt = document.createElement("option");
+      opt.value = s.id;
+      const titleStr = s.title || "(제목 없음)";
+      opt.textContent = `${s.id.slice(0, 8)}… · ${titleStr}`;
+      picker.appendChild(opt);
+    }
+  } catch (e) {
+    setError(`세션 목록 로드 실패: ${e.message}`);
+  }
+}
+
+async function loadAndAttachSession(sessionId) {
+  if (CONV_STATE.eventSource) {
+    CONV_STATE.eventSource.close();
+    CONV_STATE.eventSource = null;
+  }
+  CONV_STATE.sessionId = sessionId;
+  CONV_STATE.lastSeq = 0;
+  CONV_STATE.awaitingReply = false;
+  clearMessages();
+
+  let detail;
+  try {
+    detail = await jsonFetch(`/sessions/${sessionId}`);
+  } catch (e) {
+    setError(`세션 로드 실패: ${e.message}`);
+    setSendDisabled(true);
+    return;
+  }
+  updateSessionStatePill(detail.session.state);
+  updateSessionModelPill(detail.session.model);
+
+  const messages = detail.messages || [];
+  for (const m of messages) {
+    // v0.4 B2: render system messages too — most map to recognised event
+    // cards (collapsible).  Unknown system payloads fall through to plain
+    // markdown rendering.
+    appendMessageBubble(m.role, m.content, m.ts);
+    if (m.seq > CONV_STATE.lastSeq) CONV_STATE.lastSeq = m.seq;
+  }
+  if (messages.length === 0) {
+    showConversationEmpty(true, "이 세션에 메시지가 아직 없습니다. 첫 메시지를 보내보세요.");
+  }
+
+  const ended = detail.session.state === "ended";
+  setSendDisabled(ended);
+  if (!ended) attachEventSource(sessionId);
+}
+
+function attachEventSource(sessionId) {
+  if (CONV_STATE.eventSource) {
+    CONV_STATE.eventSource.close();
+  }
+  const url = `${API}/sessions/${sessionId}/stream?last_seq=${CONV_STATE.lastSeq}`;
+  const src = new EventSource(url);
+  CONV_STATE.eventSource = src;
+
+  src.addEventListener("message", (ev) => {
+    try {
+      const data = JSON.parse(ev.data);
+      if (data.seq <= CONV_STATE.lastSeq) return;
+      if (data.role === "assistant" && CONV_STATE.awaitingReply) {
+        removePendingPlaceholder();
+        CONV_STATE.awaitingReply = false;
+        setAbortVisible(false);
+      }
+      // v0.4 B2: render every system message — most are recognised events
+      // (compaction / sub-agent / workflow / plan / skill) and rendered as
+      // collapsible cards by appendMessageBubble.
+      appendMessageBubble(data.role, data.content, data.ts);
+      CONV_STATE.lastSeq = data.seq;
+    } catch (_) { /* ignore parse errors */ }
+  });
+
+  // v0.4 B3: token streaming.  Server pushes one chunk per LLM token; we
+  // append to the pending placeholder.  When the final "message" SSE arrives,
+  // its handler removes the placeholder and appends the markdown-rendered bubble.
+  src.addEventListener("chunk", (ev) => {
+    try {
+      const data = JSON.parse(ev.data);
+      if (data.type === "delta" && typeof data.text === "string") {
+        appendStreamDelta(data.text);
+      } else if (data.type === "cancelled" || data.type === "error") {
+        // Drop the placeholder.  Any error text is surfaced by setError
+        // (already called) or by the 'message' event that follows.
+        removePendingPlaceholder();
+        CONV_STATE.awaitingReply = false;
+        setAbortVisible(false);
+      }
+      // type === "done" is benign — the matching 'message' SSE arrives next.
+    } catch (_) { /* ignore parse errors */ }
+  });
+
+  src.addEventListener("done", () => {
+    src.close();
+    if (CONV_STATE.eventSource === src) CONV_STATE.eventSource = null;
+    updateSessionStatePill("ended");
+    setSendDisabled(true);
+  });
+
+  src.onerror = () => {
+    // Sessions are long-lived — let the browser reconnect on EventSource's
+    // default backoff.  We don't surface this as a hard error unless it
+    // persists.
+  };
+}
+
+async function sendMessage(text) {
+  if (!CONV_STATE.sessionId) {
+    setError("세션을 먼저 선택하거나 새로 만드세요.");
+    return;
+  }
+  if (!text.trim()) return;
+  setSendDisabled(true);
+  setAbortVisible(true);
+  CONV_STATE.awaitingReply = true;
+  appendPendingPlaceholder();
+  try {
+    await jsonFetch(`/sessions/${CONV_STATE.sessionId}/messages`, {
+      method: "POST",
+      headers: { "Content-Type": "application/json" },
+      body: JSON.stringify({ content: text }),
+    });
+    $conv("#message-input").value = "";
+    setError("");
+  } catch (e) {
+    removePendingPlaceholder();
+    CONV_STATE.awaitingReply = false;
+    setAbortVisible(false);
+    setError(`전송 실패: ${e.message}`);
+  } finally {
+    setSendDisabled(false);
+    $conv("#message-input").focus();
+  }
+}
+
+async function abortInflight() {
+  if (!CONV_STATE.sessionId) return;
+  try {
+    await jsonFetch(`/sessions/${CONV_STATE.sessionId}/abort`, { method: "POST" });
+    removePendingPlaceholder();
+    CONV_STATE.awaitingReply = false;
+    setAbortVisible(false);
+    setError("");
+  } catch (e) {
+    setError(`중단 실패: ${e.message}`);
+  }
+}
+
+async function createNewSession() {
+  let personas;
+  try {
+    personas = await jsonFetch("/personas");
+  } catch (e) {
+    setError(`persona 목록 로드 실패: ${e.message}`);
+    return;
+  }
+  const defaultPersona = personas.find((p) => p.name === "default-interactive") || personas[0];
+  if (!defaultPersona) {
+    setError("등록된 persona 가 없습니다. CLI 에서 `mydeepagent` 한 번 실행한 후 재시도하세요.");
+    return;
+  }
+  try {
+    const ack = await jsonFetch("/sessions", {
+      method: "POST",
+      headers: { "Content-Type": "application/json" },
+      // CreateSessionRequest requires repo_path min_length=1.  We default to
+      // "." (cwd of the serve process) — the backend resolves it to absolute.
+      body: JSON.stringify({ persona_name: defaultPersona.name, repo_path: "." }),
+    });
+    await loadSessionList();
+    $conv("#session-picker").value = ack.session_id;
+    await loadAndAttachSession(ack.session_id);
+  } catch (e) {
+    setError(`세션 생성 실패: ${e.message}`);
+  }
+}
+
+async function bootstrapConversationPage() {
+  await loadSessionList();
+  $conv("#new-session-btn").addEventListener("click", createNewSession);
+  $conv("#abort-btn").addEventListener("click", abortInflight);
+  $conv("#session-picker").addEventListener("change", (ev) => {
+    const sid = ev.target.value;
+    if (sid) loadAndAttachSession(sid);
+  });
+  $conv("#message-form").addEventListener("submit", (ev) => {
+    ev.preventDefault();
+    const input = $conv("#message-input");
+    sendMessage(input.value);
+  });
+  // v0.4 B5: track IME composition state.  Korean/Japanese/Chinese IMEs use
+  // Enter to commit the current candidate; we must NOT treat that as send.
+  // Browsers differ on whether that Enter keydown arrives before or after
+  // compositionend, so the flag is cleared one tick late (see below).
+  const input = $conv("#message-input");
+  input._composing = false;
+  input.addEventListener("compositionstart", () => { input._composing = true; });
+  input.addEventListener("compositionend", () => {
+    // The keydown event that ends composition is still pending — defer the
+    // flag flip one tick so the upcoming keydown still sees _composing=true.
+    setTimeout(() => { input._composing = false; }, 0);
+  });
+  input.addEventListener("keydown", (ev) => {
+    if (ev.key !== "Enter") return;
+    // Shift+Enter → newline (let the textarea handle it natively).
+    if (ev.shiftKey) return;
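+    // IME composition (Korean/Japanese/Chinese candidate commit) → never send.
+    // ev.isComposing and keyCode 229 are the standard signals; the manual flag
+    // covers browsers that fire compositionend before this keydown.
+    if (input._composing || ev.isComposing || ev.keyCode === 229) return;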
+    // Plain Enter (and Cmd/Ctrl+Enter for backwards compat) → send.
+    ev.preventDefault();
+    sendMessage(ev.target.value);
+  });
+  // v0.3 PR #8: deep link `?session=<id>` auto-loads the named session.
+  const params = new URLSearchParams(window.location.search);
+  const deepSid = params.get("session");
+  if (deepSid) {
+    const picker = $conv("#session-picker");
+    if (picker) picker.value = deepSid;
+    loadAndAttachSession(deepSid);
+  }
+}
+
+// =============== sessions list (index.html as of v0.3 PR #8) ===============
+
+async function renderSessionsList() {
+  setError("");
+  let sessions;
+  try {
+    sessions = await jsonFetch("/sessions?limit=50");
+  } catch (e) {
+    setError(`세션 목록을 불러오지 못했습니다: ${e.message}`);
+    return;
+  }
+  const tbody = $("#sessions tbody");
+  if (!tbody) return;
+  tbody.replaceChildren();
+  if (sessions.length === 0) {
+    tbody.appendChild(
+      emptyCell(5, "아직 대화한 세션이 없습니다.", "/conversation.html", "새 대화 시작 →")
+    );
+    return;
+  }
+  for (const s of sessions) {
+    const tr = document.createElement("tr");
+    const idTd = document.createElement("td");
+    const idLink = document.createElement("a");
+    idLink.href = `/conversation.html?session=${encodeURIComponent(s.id)}`;
+    idLink.className = "mono";
+    idLink.textContent = s.id.slice(0, 8) + "…";
+    idTd.appendChild(idLink);
+    const stateTd = document.createElement("td");
+    stateTd.appendChild(badge(s.state));
+    const titleTd = document.createElement("td");
+    titleTd.textContent = s.title || "(no title yet)";
+    const personaTd = document.createElement("td");
+    personaTd.className = "mono";
+    // SessionSummary exposes `persona_id` (UUID) — show first 8 chars + tooltip.
+    if (s.persona_id) {
+      personaTd.textContent = s.persona_id.slice(0, 8) + "…";
+      personaTd.title = s.persona_id;
+    } else {
+      personaTd.textContent = "—";
+    }
+    const lastTd = document.createElement("td");
+    lastTd.className = "mono";
+    lastTd.textContent = (s.last_message_at || s.started_at || "").slice(0, 19).replace("T", " ");
+    tr.append(idTd, stateTd, titleTd, personaTd, lastTd);
+    tbody.appendChild(tr);
+  }
+}
+
+// =============== new-workflow.html (v0.4 generator) ===============
+
+const _CAPABILITIES = [
+  "spec_write", "code_review", "evidence_check", "log_analysis", "decision",
+  "command_execute", "security_audit", "code_edit", "plan", "verify",
+];
+const _BACKENDS = ["openrouter", "anthropic", "ollama_local"];
+const _RISKS = ["low", "medium", "high"];
+
+const WF_STATE = {
+  roles: /** @type {Array<{id:string,capabilities:string[],backends:string[],fallbacks:string[]}>} */ ([]),
+  phases: /** @type {Array<{key:string,title:string,risk:string,role:string,instructions:string,artifactPath:string,artifactSchema:string,gates:string,timeout:string,budget:string}>} */ ([]),
+};
+
+function _wfFreshRole() {
+  return { id: "", capabilities: [], backends: [], fallbacks: [] };
+}
+function _wfFreshPhase() {
+  return {
+    key: "", title: "", risk: "medium", role: "",
+    instructions: "", artifactPath: "", artifactSchema: "",
+    gates: "", timeout: "", budget: "",
+  };
+}
+
+function _wfChip(label, checked, onChange) {
+  const lbl = document.createElement("label");
+  lbl.className = "wf-chip";
+  const cb = document.createElement("input");
+  cb.type = "checkbox";
+  cb.checked = checked;
+  cb.addEventListener("change", () => onChange(cb.checked));
+  const span = document.createElement("span");
+  span.textContent = label;
+  lbl.appendChild(cb);
+  lbl.appendChild(span);
+  return lbl;
+}
+
+function _wfTextInput(value, placeholder, onChange, type = "text") {
+  const i = document.createElement("input");
+  i.type = type;
+  i.value = value;
+  i.placeholder = placeholder;
+  i.addEventListener("input", () => onChange(i.value));
+  return i;
+}
+
+function _wfTextArea(value, placeholder, onChange, rows = 3) {
+  const t = document.createElement("textarea");
+  t.value = value;
+  t.placeholder = placeholder;
+  t.rows = rows;
+  t.addEventListener("input", () => onChange(t.value));
+  return t;
+}
+
+function _wfSelect(value, options, onChange) {
+  const s = document.createElement("select");
+  for (const o of options) {
+    const opt = document.createElement("option");
+    opt.value = o;
+    opt.textContent = o;
+    if (o === value) opt.selected = true;
+    s.appendChild(opt);
+  }
+  s.addEventListener("change", () => onChange(s.value));
+  return s;
+}
+
+function renderRolesList() {
+  const container = $("#roles-list");
+  if (!container) return;
+  container.replaceChildren();
+  WF_STATE.roles.forEach((role, idx) => {
+    const card = document.createElement("div");
+    card.className = "wf-row-card";
+    const header = document.createElement("div");
+    header.className = "wf-row-header";
+    const title = document.createElement("strong");
+    title.textContent = `Role #${idx + 1}`;
+    const del = document.createElement("button");
+    del.type = "button";
+    del.className = "button-link";
+    del.textContent = "삭제";
+    // Phases reference role ids, so refresh their role pickers as well.
+    del.addEventListener("click", () => { WF_STATE.roles.splice(idx, 1); renderRolesList(); renderPhasesList(); renderPreview(); });
+    header.append(title, del);
+    card.appendChild(header);
+
+    const idRow = document.createElement("div");
+    idRow.className = "form-row";
+    const idLbl = document.createElement("label");
+    idLbl.innerHTML = "id <span class='hint'>— phase 가 참조할 키. <code>writer</code> 같은 소문자/숫자/언더스코어</span>";
+    // Re-render phases on every id edit so their role pickers list current ids.
+    idRow.append(idLbl, _wfTextInput(role.id, "writer", (v) => { role.id = v; renderPhasesList(); renderPreview(); }));
+    card.appendChild(idRow);
+
+    const capRow = document.createElement("div");
+    capRow.className = "form-row";
+    const capLbl = document.createElement("label");
+    capLbl.innerHTML = "required_capabilities <span class='hint'>— persona 가 가져야 할 능력 (최소 1)</span>";
+    const chips = document.createElement("div");
+    chips.className = "chips";
+    for (const c of _CAPABILITIES) {
+      chips.appendChild(_wfChip(c, role.capabilities.includes(c), (on) => {
+        if (on && !role.capabilities.includes(c)) role.capabilities.push(c);
+        else if (!on) role.capabilities = role.capabilities.filter((x) => x !== c);
+        renderPreview();
+      }));
+    }
+    capRow.append(capLbl, chips);
+    card.appendChild(capRow);
+
+    container.appendChild(card);
+  });
+  if (WF_STATE.roles.length === 0) {
+    const empty = document.createElement("div");
+    empty.className = "hint";
+    empty.textContent = "Role 이 1개 이상 필요합니다.";
+    container.appendChild(empty);
+  }
+}
+
+function renderPhasesList() {
+  const container = $("#phases-list");
+  if (!container) return;
+  container.replaceChildren();
+  const roleIds = WF_STATE.roles.map((r) => r.id).filter(Boolean);
+  WF_STATE.phases.forEach((phase, idx) => {
+    const card = document.createElement("div");
+    card.className = "wf-row-card";
+    const header = document.createElement("div");
+    header.className = "wf-row-header";
+    const title = document.createElement("strong");
+    title.textContent = `Phase #${idx + 1}`;
+    const del = document.createElement("button");
+    del.type = "button";
+    del.className = "button-link";
+    del.textContent = "삭제";
+    del.addEventListener("click", () => { WF_STATE.phases.splice(idx, 1); renderPhasesList(); renderPreview(); });
+    header.append(title, del);
+    card.appendChild(header);
+
+    const grid = document.createElement("div");
+    grid.className = "form-grid";
+    for (const [label, key, ph] of [
+      ["key — 영문 소문자/숫자/언더스코어", "key", "spec"],
+      ["title — 표시용 한 줄", "title", "명세 작성"],
+    ]) {
+      const r = document.createElement("div");
+      r.className = "form-row";
+      const l = document.createElement("label");
+      l.textContent = label;
+      r.append(l, _wfTextInput(phase[key], ph, (v) => { phase[key] = v; renderPreview(); }));
+      grid.appendChild(r);
+    }
+    card.appendChild(grid);
+
+    const grid2 = document.createElement("div");
+    grid2.className = "form-grid";
+    const riskRow = document.createElement("div");
+    riskRow.className = "form-row";
+    const riskLbl = document.createElement("label");
+    riskLbl.innerHTML = "risk <span class='hint'>— 단계 위험 등급</span>";
+    riskRow.append(riskLbl, _wfSelect(phase.risk, _RISKS, (v) => { phase.risk = v; renderPreview(); }));
+    grid2.appendChild(riskRow);
+    const roleRow = document.createElement("div");
+    roleRow.className = "form-row";
+    const roleLbl = document.createElement("label");
+    roleLbl.innerHTML = "role <span class='hint'>— 위에서 정의한 role id 중 하나</span>";
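+    const opts = roleIds.length > 0 ? roleIds : ["(role 을 먼저 정의)"];
+    // A <select> displays its first option when the value matches none, and
+    // no change event fires for it, so sync the state with what is shown.
+    if (roleIds.length > 0 && !roleIds.includes(phase.role)) phase.role = roleIds[0];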
+    roleRow.append(roleLbl, _wfSelect(phase.role, opts, (v) => { phase.role = v; renderPreview(); }));
+    grid2.appendChild(roleRow);
+    card.appendChild(grid2);
+
+    const insRow = document.createElement("div");
+    insRow.className = "form-row";
+    const insLbl = document.createElement("label");
+    insLbl.innerHTML = "instructions <span class='hint'>— 최소 10자. 이 phase 가 무엇을 해야 하는지</span>";
+    insRow.append(insLbl, _wfTextArea(phase.instructions,
+      "예: requirements.md 를 읽고 spec.md 를 작성하세요. 한국어 권장.",
+      (v) => { phase.instructions = v; renderPreview(); }, 4));
+    card.appendChild(insRow);
+
+    const grid3 = document.createElement("div");
+    grid3.className = "form-grid";
+    for (const [label, key, ph] of [
+      ["expected_artifact.path (선택)", "artifactPath", "artifacts/spec.md"],
+      ["expected_artifact.schema (선택)", "artifactSchema", "spec-v1"],
+    ]) {
+      const r = document.createElement("div");
+      r.className = "form-row";
+      const l = document.createElement("label");
+      l.textContent = label;
+      r.append(l, _wfTextInput(phase[key], ph, (v) => { phase[key] = v; renderPreview(); }));
+      grid3.appendChild(r);
+    }
+    card.appendChild(grid3);
+    container.appendChild(card);
+  });
+  if (WF_STATE.phases.length === 0) {
+    const empty = document.createElement("div");
+    empty.className = "hint";
+    empty.textContent = "Phase 가 1개 이상 필요합니다.";
+    container.appendChild(empty);
+  }
+}
+
+function _wfBuildRequest() {
+  const name = $("#wf-name").value.trim();
+  const version = parseInt($("#wf-version").value, 10);
+  const description = $("#wf-description").value.trim();
+  const roles = WF_STATE.roles.filter((r) => r.id).map((r) => ({
+    id: r.id,
+    required_capabilities: r.capabilities,
+    preferred_backends: r.backends,
+    fallback_personas: r.fallbacks,
+  }));
+  const phases = WF_STATE.phases.filter((p) => p.key).map((p) => {
+    const out = {
+      key: p.key,
+      title: p.title || p.key,
+      risk: p.risk,
+      role: p.role,
+      instructions: p.instructions || "(no instructions)",
+      gates: [],
+    };
+    if (p.artifactPath || p.artifactSchema) {
+      out.expected_artifact = {
+        path: p.artifactPath || "artifacts/output.md",
+        schema: p.artifactSchema || "text",
+      };
+    }
+    return out;
+  });
+  const req = { name, version: Number.isNaN(version) ? 1 : version, roles, phases, default_gates: [] };
+  if (description) req.description = description;
+  return req;
+}
+
+function renderPreview() {
+  const pre = $("#wf-preview");
+  if (!pre) return;
+  pre.textContent = JSON.stringify(_wfBuildRequest(), null, 2);
+}
+
+async function submitWorkflow(ev) {
+  ev.preventDefault();
+  setError("");
+  $("#success").style.display = "none";
+  const req = _wfBuildRequest();
+  try {
+    const ack = await jsonFetch("/workflows", {
+      method: "POST",
+      headers: { "Content-Type": "application/json" },
+      body: JSON.stringify(req),
+    });
+    const okBox = $("#success");
+    okBox.textContent = `✅ 저장 완료 → ${ack.path}. 워크플로우 실행 페이지에서 바로 보입니다.`;
+    okBox.style.display = "block";
+  } catch (e) {
+    setError(`저장 실패: ${e.message}`);
+  }
+}
+
+function bootstrapWorkflowGenerator() {
+  WF_STATE.roles = [_wfFreshRole()];
+  WF_STATE.phases = [_wfFreshPhase()];
+  renderRolesList();
+  renderPhasesList();
+  renderPreview();
+  $("#add-role").addEventListener("click", () => { WF_STATE.roles.push(_wfFreshRole()); renderRolesList(); renderPreview(); });
+  $("#add-phase").addEventListener("click", () => { WF_STATE.phases.push(_wfFreshPhase()); renderPhasesList(); renderPreview(); });
+  $("#wf-name").addEventListener("input", renderPreview);
+  $("#wf-version").addEventListener("input", renderPreview);
+  $("#wf-description").addEventListener("input", renderPreview);
+  $("#wf-form").addEventListener("submit", submitWorkflow);
+}
+
 // =============== bootstrap ===============

 document.addEventListener("DOMContentLoaded", () => {
  const page = document.body.dataset.page;
  if (page === "index") {
+    renderSessionsList();
+  } else if (page === "runs") {
    renderRunsList();
    renderBudgetSummary();
  } else if (page === "new") {
@@ -430,5 +1345,9 @@ document.addEventListener("DOMContentLoaded", () => {
    renderRunDetail();
    $("#abort-btn").addEventListener("click", abortRun);
    $("#resume-btn").addEventListener("click", resumeRun);
+  } else if (page === "conversation") {
+    bootstrapConversationPage();
+  } else if (page === "new-workflow") {
+    bootstrapWorkflowGenerator();
  }
 });
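The inline parser `_mdInline` in the `app.js` hunk above repeatedly picks whichever pattern matches earliest in the remaining text and keeps only `http(s)` hrefs. The same loop can be sketched without the DOM, which makes it easy to unit-test outside a browser. This is a minimal sketch, not code from this commit: `tokenizeInline`, `INLINE_PATTERNS`, and the token shape are hypothetical names, and the regexes are copied from `_mdInline`.

```javascript
// Hypothetical helper (not part of app.js): mirrors the earliest-match loop
// of _mdInline but returns plain tokens instead of appending DOM nodes.
const INLINE_PATTERNS = [
  { re: /`([^`]+)`/, tag: "code" },
  { re: /\*\*([^*\n]+)\*\*/, tag: "strong" },
  { re: /(?<!\*)\*([^*\n]+)\*(?!\*)/, tag: "em" },
  { re: /\[([^\]]+)\]\(([^)\s]+)\)/, tag: "a" },
];

function tokenizeInline(text) {
  const tokens = [];
  let remaining = text;
  while (remaining.length > 0) {
    // Find the pattern whose match starts earliest in the remaining text.
    let best = null;
    for (const p of INLINE_PATTERNS) {
      const hit = remaining.match(p.re);
      if (hit && (best === null || hit.index < best.hit.index)) {
        best = { hit, tag: p.tag };
      }
    }
    if (best === null) {
      // Nothing recognised: the rest is literal text.
      tokens.push({ tag: "text", text: remaining });
      break;
    }
    if (best.hit.index > 0) {
      tokens.push({ tag: "text", text: remaining.slice(0, best.hit.index) });
    }
    const token = { tag: best.tag, text: best.hit[1] };
    if (best.tag === "a") {
      // Same policy as _mdInline: only http/https URLs become an href.
      const href = best.hit[2];
      token.href = /^https?:\/\//.test(href) ? href : null;
    }
    tokens.push(token);
    remaining = remaining.slice(best.hit.index + best.hit[0].length);
  }
  return tokens;
}
```

A link such as `[x](javascript:void)` still yields an `a` token, but with `href: null`, matching the inert anchor the DOM version produces.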
--- a/my-deepagent/static/conversation.html
+++ b/my-deepagent/static/conversation.html
@@ -0,0 +1,53 @@
+<!doctype html>
+<html lang="ko">
+<head>
+  <meta charset="utf-8" />
+  <meta name="viewport" content="width=device-width,initial-scale=1" />
+  <title>my-deepagent · 대화</title>
+  <link rel="stylesheet" href="/static/style.css" />
+</head>
+<body data-page="conversation">
+  <header>
+    <h1><a href="/">my-deepagent</a></h1>
+    <nav>
+      <a href="/" class="nav-primary">세션 목록</a>
+      <a href="/conversation.html" class="active nav-primary">대화</a>
+      <a href="/runs.html" class="nav-secondary">Runs</a>
+      <a href="/new.html" class="nav-secondary">워크플로우 실행</a>
+    </nav>
+  </header>
+  <main class="conversation-main">
+    <div id="error" class="error-banner" style="display:none"></div>
+
+    <!-- Top bar: session picker + new conversation button -->
+    <div class="conv-topbar">
+      <label for="session-picker" class="conv-label">세션</label>
+      <select id="session-picker" class="conv-picker">
+        <option value="">(세션 선택…)</option>
+      </select>
+      <button id="new-session-btn" type="button" class="conv-action-btn">새 대화</button>
+      <span class="conv-model-pill" id="session-model-pill" title="이 세션의 활성 모델"></span>
+      <span class="conv-session-state" id="session-state-pill"></span>
+    </div>
+
+    <!-- Message thread -->
+    <div id="messages" class="messages-thread">
+      <div class="conv-empty" id="conv-empty">대화를 시작하려면 위에서 세션을 선택하거나 "새 대화"를 누르세요.</div>
+    </div>
+
+    <!-- Input bar -->
+    <form id="message-form" class="conv-input-bar">
+      <textarea
+        id="message-input"
+        rows="2"
+        placeholder="메시지를 입력하세요…  (Enter 전송, Shift+Enter 줄바꿈)"
+        autocomplete="off"
+        disabled
+      ></textarea>
+      <button id="send-btn" type="submit" disabled>전송</button>
+      <button id="abort-btn" type="button" disabled style="display:none">⏹ 중단</button>
+    </form>
+  </main>
+  <script src="/static/app.js"></script>
+</body>
+</html>
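The textarea placeholder above documents the key bindings (Enter sends, Shift+Enter inserts a newline), and the `keydown` handler in `app.js` additionally ignores Enter while an IME is composing. That decision can be sketched as a pure function. This is a sketch under assumptions: `enterAction` and its return values are hypothetical names, and the `isComposing` / `keyCode === 229` checks are the standard `KeyboardEvent` signals, used here alongside a manual composing flag.

```javascript
// Hypothetical pure helper (not part of app.js): decides what a keydown
// should do, given the event fields and a manually tracked composing flag.
function enterAction(ev, composing) {
  if (ev.key !== "Enter") return "ignore";
  // Shift+Enter: let the textarea insert a newline natively.
  if (ev.shiftKey) return "newline";
  // An IME is still handling this key (candidate commit): never send.
  if (composing || ev.isComposing || ev.keyCode === 229) return "ignore";
  // Plain Enter, and Cmd/Ctrl+Enter for backwards compatibility.
  return "send";
}
```

Keeping the decision separate from the listener means the send rule can be checked with plain objects instead of synthetic DOM events.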
--- a/my-deepagent/static/index.html
+++ b/my-deepagent/static/index.html
@@ -3,43 +3,51 @@
 <head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width,initial-scale=1" />
-  <title>my-deepagent · runs</title>
+  <title>my-deepagent · 대화</title>
  <link rel="stylesheet" href="/static/style.css" />
 </head>
 <body data-page="index">
  <header>
    <h1><a href="/">my-deepagent</a></h1>
    <nav>
-      <a href="/" class="active">Runs</a>
-      <a href="/new.html">새 Run</a>
+      <a href="/" class="active nav-primary">대화</a>
+      <a href="/runs.html" class="nav-secondary">Runs</a>
+      <a href="/new.html" class="nav-secondary">워크플로우 실행</a>
+      <a href="/new-workflow.html" class="nav-secondary">+ 템플릿 만들기</a>
    </nav>
  </header>
  <main>
    <div id="error" class="error-banner" style="display:none"></div>

    <div class="page-title">
-      <h2>최근 Runs</h2>
-      <span class="page-subtitle">최신 50개</span>
+      <h2>최근 대화 세션</h2>
+      <span class="page-subtitle">최근 50개 · 빈 화면이면 아래 "새 대화"를 누르세요</span>
+    </div>
+
+    <div class="info-box">
+      <strong>👋 my-deepagent</strong> — OpenRouter 가성비 모델로 돌아가는 Claude Code 스타일 멀티턴 에이전트.
+      대부분의 경우 아래 <strong>"새 대화 시작"</strong>만 누르면 됩니다.
+      <a href="/new.html">여러 단계 자동화</a>가 필요하면 워크플로우, <a href="/new-workflow.html">템플릿 직접 만들기</a>도 가능.
+    </div>
+
+    <div class="action-bar" style="margin-bottom: 12px;">
+      <a class="button primary" href="/conversation.html">▶︎ 새 대화 시작</a>
    </div>

    <div class="card">
-      <table id="runs">
+      <table id="sessions">
        <thead>
          <tr>
-            <th style="width: 22%">Run</th>
-            <th style="width: 13%">State</th>
-            <th>Repo</th>
-            <th style="width: 12%">Branch</th>
-            <th style="width: 16%">Created</th>
-            <th style="width: 16%">Ended</th>
+            <th style="width: 16%">Session</th>
+            <th style="width: 12%">State</th>
+            <th>Title / preview</th>
+            <th style="width: 12%">Persona</th>
+            <th style="width: 18%">Last activity</th>
          </tr>
        </thead>
        <tbody></tbody>
      </table>
    </div>
-
-    <h2 class="section-title">예산 (현재)</h2>
-    <div id="budget-summary" class="budget-grid"></div>
  </main>
  <script src="/static/app.js"></script>
 </body>
--- a/my-deepagent/static/new-workflow.html
+++ b/my-deepagent/static/new-workflow.html
@@ -0,0 +1,99 @@
+<!doctype html>
+<html lang="ko">
+<head>
+  <meta charset="utf-8" />
+  <meta name="viewport" content="width=device-width,initial-scale=1" />
+  <title>my-deepagent · 워크플로우 템플릿 만들기</title>
+  <link rel="stylesheet" href="/static/style.css" />
+</head>
+<body data-page="new-workflow">
+  <header>
+    <h1><a href="/">my-deepagent</a></h1>
+    <nav>
+      <a href="/" class="nav-primary">대화</a>
+      <a href="/runs.html" class="nav-secondary">Runs</a>
+      <a href="/new.html" class="nav-secondary">워크플로우 실행</a>
+      <a href="/new-workflow.html" class="active nav-secondary">+ 템플릿 만들기</a>
+    </nav>
+  </header>
+  <main>
+    <div id="error" class="error-banner" style="display:none"></div>
+    <div id="success" class="info-box" style="display:none"></div>
+
+    <div class="page-title">
+      <h2>워크플로우 템플릿 만들기</h2>
+      <span class="page-subtitle">phase 시퀀스 + role 정의 → YAML 저장</span>
+    </div>
+
+    <div class="info-box">
+      <strong>📘 워크플로우 = phase 시퀀스</strong><br />
+      예: <code>"명세 작성" → "리뷰" → "검증"</code> 처럼 단계별로 어떤 role(역할)이 어떤
+      산출물을 만들지 정의하는 파일입니다. 저장 후엔 <a href="/new.html">워크플로우 실행</a>
+      페이지의 드롭다운에 자동으로 등장합니다 (서버 재시작 불필요).
+    </div>
+
+    <form id="wf-form" autocomplete="off">
+
+      <!-- Basic metadata -->
+      <div class="card" style="padding: 20px;">
+        <h3 class="section-title" style="margin-top:0">기본 정보</h3>
+        <div class="form-grid">
+          <div class="form-row">
+            <label for="wf-name">
+              name
+              <span class="hint">— 영문 소문자/숫자/하이픈만. 예: <code>spec-and-review</code></span>
+            </label>
+            <input id="wf-name" type="text" required placeholder="my-workflow" />
+          </div>
+          <div class="form-row">
+            <label for="wf-version">
+              version
+              <span class="hint">— 정수, 1부터</span>
+            </label>
+            <input id="wf-version" type="number" required value="1" min="1" />
+          </div>
+        </div>
+        <div class="form-row">
+          <label for="wf-description">
+            description
+            <span class="hint">— 한 줄 설명 (선택)</span>
+          </label>
+          <input id="wf-description" type="text" placeholder="이 워크플로우가 무엇을 하는지" />
+        </div>
+      </div>
+
+      <!-- Roles -->
+      <div class="card" style="padding: 20px; margin-top: 16px;">
+        <h3 class="section-title" style="margin-top:0">
+          Roles <span class="hint" style="font-weight:400">— phase 가 참조할 역할 정의</span>
+        </h3>
+        <div id="roles-list"></div>
+        <button type="button" id="add-role" class="button">+ Role 추가</button>
+      </div>
+
+      <!-- Phases -->
+      <div class="card" style="padding: 20px; margin-top: 16px;">
+        <h3 class="section-title" style="margin-top:0">
+          Phases <span class="hint" style="font-weight:400">— 실제 실행되는 단계 순서</span>
+        </h3>
+        <div id="phases-list"></div>
+        <button type="button" id="add-phase" class="button">+ Phase 추가</button>
+      </div>
+
+      <!-- Preview -->
+      <details class="card" style="padding: 16px; margin-top: 16px;">
+        <summary style="cursor:pointer; font-weight:600;">
+          YAML 미리보기 <span class="hint" style="font-weight:400">— 저장될 파일 내용</span>
+        </summary>
+        <pre id="wf-preview" class="mono" style="margin-top:12px; white-space:pre-wrap; font-size:12.5px;"></pre>
+      </details>
+
+      <div class="action-bar">
+        <button type="submit" class="primary">💾 저장 + 등록</button>
+        <a class="button" href="/">취소</a>
+      </div>
+    </form>
+  </main>
+  <script src="/static/app.js"></script>
+</body>
+</html>
--- a/my-deepagent/static/new.html
+++ b/my-deepagent/static/new.html
@@ -3,55 +3,85 @@
 <head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width,initial-scale=1" />
-  <title>my-deepagent · 새 Run</title>
+  <title>my-deepagent · 워크플로우 실행</title>
  <link rel="stylesheet" href="/static/style.css" />
 </head>
 <body data-page="new">
  <header>
    <h1><a href="/">my-deepagent</a></h1>
    <nav>
-      <a href="/">Runs</a>
-      <a href="/new.html" class="active">새 Run</a>
+      <a href="/" class="nav-primary">대화</a>
+      <a href="/runs.html" class="nav-secondary">Runs</a>
+      <a href="/new.html" class="active nav-secondary">워크플로우 실행</a>
+      <a href="/new-workflow.html" class="nav-secondary">+ 템플릿 만들기</a>
    </nav>
  </header>
  <main>
    <div id="error" class="error-banner" style="display:none"></div>

    <div class="page-title">
-      <h2>새 Run 시작</h2>
-      <span class="page-subtitle">워크플로우 + repo + 요구사항</span>
+      <h2>워크플로우 실행 <span class="hint" style="font-size: 12px; vertical-align: middle;">(고급 기능)</span></h2>
+      <span class="page-subtitle">사전 정의된 phase 시퀀스로 자동화된 작업 실행</span>
+    </div>
+
+    <div class="info-box">
+      <strong>💡 자유 대화는 여기가 아닙니다.</strong>
+      그냥 챗봇처럼 쓰고 싶다면 <a href="/">메인 페이지의 "새 대화 시작"</a>을 눌러주세요.
+      이 페이지는 <strong>여러 단계 (예: 명세 → 리뷰 → 검증)</strong> 가 정해진 순서로 자동 실행되는 워크플로우를 시작할 때 씁니다.
+      <br /><br />
+      <strong>새 템플릿을 직접 만들고 싶다면</strong> 우상단 <a href="/new-workflow.html">+ 템플릿 만들기</a>로 가세요.
    </div>

    <form id="start-form" autocomplete="off">
      <div class="card" style="padding: 20px;">
        <div class="form-row">
-          <label for="template">워크플로우 템플릿</label>
+          <label for="template">
+            워크플로우 템플릿
+            <span class="hint">— 무슨 단계를 어떤 순서로 돌릴지 정의한 YAML. 모르면 첫 번째 선택.</span>
+          </label>
          <select id="template" required></select>
        </div>

        <div class="form-grid">
          <div class="form-row">
-            <label for="repo-path">repo 절대경로</label>
+            <label for="repo-path">
+              repo 절대경로
+              <span class="hint">— 작업할 git 저장소 위치 (예: /Users/me/projects/my-thing)</span>
+            </label>
            <input id="repo-path" type="text" placeholder="/Users/me/projects/my-thing" required />
          </div>

          <div class="form-row">
-            <label for="base-branch">base branch</label>
+            <label for="base-branch">
+              base branch
+              <span class="hint">— 작업의 시작점 (보통 main)</span>
+            </label>
            <input id="base-branch" type="text" value="main" />
          </div>
        </div>

        <div class="form-row">
-          <label for="requirements">requirements <span class="hint">— 자유 텍스트, 마크다운 OK</span></label>
-          <textarea id="requirements" rows="6" placeholder="이 workflow가 다룰 요구사항을 적어주세요."></textarea>
+          <label for="requirements">
+            requirements
+            <span class="hint">— 이 워크플로우가 다룰 요구사항. 자유 텍스트, 마크다운 OK</span>
+          </label>
+          <textarea id="requirements" rows="6" placeholder="예: wordcount CLI를 만들어줘. python으로, pytest 테스트 포함."></textarea>
        </div>
      </div>

-      <h2 class="section-title">Persona 오버라이드 <span class="hint" style="text-transform: none; letter-spacing: 0; font-weight: 400;">(선택, 비우면 자동 선택)</span></h2>
-      <div id="override-fields" class="card"></div>
+      <details class="card" style="margin-top: 16px; padding: 16px;">
+        <summary style="cursor: pointer; font-weight: 600;">
+          Persona 오버라이드 <span class="hint" style="font-weight: 400;">— 비우면 자동 선택 (고급)</span>
+        </summary>
+        <p class="hint" style="margin-top: 12px; font-weight: 400;">
+          각 단계(role)에 어떤 persona(AI 모델 + 시스템 프롬프트)를 쓸지 직접 고르고 싶을 때만 채우세요.
+          비워두면 capability 매칭으로 자동 선택됩니다.
+        </p>
+        <div id="override-fields"></div>
+      </details>

      <div class="action-bar">
-        <button type="submit" class="primary">▶︎ 시작</button>
+        <button type="submit" class="primary">▶︎ 워크플로우 실행</button>
        <a class="button" href="/">취소</a>
      </div>
    </form>
--- a/my-deepagent/static/run.html
+++ b/my-deepagent/static/run.html
@@ -10,8 +10,10 @@
  <header>
    <h1><a href="/">my-deepagent</a></h1>
    <nav>
-      <a href="/">Runs</a>
-      <a href="/new.html">새 Run</a>
+      <a href="/" class="nav-primary">대화</a>
+      <a href="/runs.html" class="nav-secondary">Runs</a>
+      <a href="/new.html" class="nav-secondary">워크플로우 실행</a>
+      <a href="/new-workflow.html" class="nav-secondary">+ 템플릿 만들기</a>
    </nav>
  </header>
  <main>
--- a/my-deepagent/static/runs.html
+++ b/my-deepagent/static/runs.html
@@ -0,0 +1,48 @@
+<!doctype html>
+<html lang="ko">
+<head>
+  <meta charset="utf-8" />
+  <meta name="viewport" content="width=device-width,initial-scale=1" />
+  <title>my-deepagent · workflow runs (archive)</title>
+  <link rel="stylesheet" href="/static/style.css" />
+</head>
+<body data-page="runs">
+  <header>
+    <h1><a href="/">my-deepagent</a></h1>
+    <nav>
+      <a href="/" class="nav-primary">대화</a>
+      <a href="/runs.html" class="active nav-secondary">Runs</a>
+      <a href="/new.html" class="nav-secondary">워크플로우 실행</a>
+      <a href="/new-workflow.html" class="nav-secondary">+ 템플릿 만들기</a>
+    </nav>
+  </header>
+  <main>
+    <div id="error" class="error-banner" style="display:none"></div>
+
+    <div class="page-title">
+      <h2>Workflow Runs · archive</h2>
+      <span class="page-subtitle">차별화 워크플로우 엔진 결과 (최신 50개)</span>
+    </div>
+
+    <div class="card">
+      <table id="runs">
+        <thead>
+          <tr>
+            <th style="width: 22%">Run</th>
+            <th style="width: 13%">State</th>
+            <th>Repo</th>
+            <th style="width: 12%">Branch</th>
+            <th style="width: 16%">Created</th>
+            <th style="width: 16%">Ended</th>
+          </tr>
+        </thead>
+        <tbody></tbody>
+      </table>
+    </div>
+
+    <h2 class="section-title">예산 (현재)</h2>
+    <div id="budget-summary" class="budget-grid"></div>
+  </main>
+  <script src="/static/app.js"></script>
+</body>
+</html>
--- a/my-deepagent/static/style.css
+++ b/my-deepagent/static/style.css
@@ -778,3 +778,446 @@ select {
  .event-line { grid-template-columns: 1fr; gap: 2px; }
  .chips { grid-template-columns: 1fr; gap: 6px; }
 }
+
+/* =================================================================
+   v0.3 PR #8 — Conversation page
+   ================================================================= */
+
+.conversation-main {
+  display: flex;
+  flex-direction: column;
+  min-height: calc(100vh - 80px);
+  padding-bottom: 0;
+}
+
+.conv-topbar {
+  display: flex;
+  align-items: center;
+  gap: 12px;
+  padding: 12px 16px;
+  background: var(--bg-card);
+  border: 1px solid var(--border);
+  border-radius: 8px;
+  margin-bottom: 12px;
+  flex-wrap: wrap;
+}
+
+.conv-label {
+  font-size: 13px;
+  color: var(--text-muted);
+  font-weight: 600;
+}
+
+.conv-picker {
+  flex: 1;
+  min-width: 240px;
+  padding: 6px 10px;
+  font-family: var(--font-mono);
+  font-size: 13px;
+  border: 1px solid var(--border);
+  border-radius: 6px;
+  background: var(--bg);
+}
+
+.conv-action-btn {
+  padding: 6px 14px;
+  font-size: 13px;
+  background: var(--accent);
+  color: white;
+  border: none;
+  border-radius: 6px;
+  cursor: pointer;
+}
+
+.conv-action-btn:hover { filter: brightness(1.08); }
+
+.conv-session-state {
+  font-size: 11px;
+  padding: 2px 8px;
+  border-radius: 999px;
+  text-transform: lowercase;
+  letter-spacing: 0.04em;
+}
+
+.conv-session-state.state-active {
+  background: rgba(34,197,94,0.12);
+  color: rgb(22,163,74);
+}
+
+.conv-session-state.state-ended {
+  background: rgba(100,116,139,0.12);
+  color: rgb(71,85,105);
+}
+
+.messages-thread {
+  flex: 1;
+  overflow-y: auto;
+  padding: 16px;
+  border: 1px solid var(--border);
+  border-radius: 8px;
+  background: var(--bg);
+  margin-bottom: 12px;
+  display: flex;
+  flex-direction: column;
+  gap: 12px;
+}
+
+.conv-empty {
+  color: var(--text-muted);
+  text-align: center;
+  padding: 40px 16px;
+  font-size: 13px;
+}
+
+.msg-bubble {
+  max-width: 80%;
+  padding: 10px 14px;
+  border-radius: 12px;
+  font-size: 14px;
+  line-height: 1.5;
+  white-space: pre-wrap;
+  word-break: break-word;
+}
+
+.msg-bubble.role-user {
+  align-self: flex-end;
+  background: var(--accent);
+  color: white;
+}
+
+.msg-bubble.role-assistant {
+  align-self: flex-start;
+  background: var(--bg-card);
+  border: 1px solid var(--border);
+}
+
+.msg-bubble.role-system {
+  align-self: center;
+  max-width: 90%;
+  font-style: italic;
+  font-size: 12.5px;
+  background: rgba(245,158,11,0.08);
+  border: 1px dashed rgba(245,158,11,0.4);
+  color: rgb(120,53,15);
+}
+
+.msg-bubble.pending {
+  opacity: 0.6;
+  font-size: 20px;
+  padding: 6px 14px;
+}
+
+.msg-meta {
+  display: flex;
+  align-items: center;
+  gap: 8px;
+  font-size: 11px;
+  opacity: 0.6;
+  margin-bottom: 4px;
+}
+
+.msg-role {
+  font-weight: 700;
+  text-transform: uppercase;
+  letter-spacing: 0.05em;
+}
+
+.conv-input-bar {
+  display: flex;
+  gap: 8px;
+  padding: 12px;
+  background: var(--bg-card);
+  border: 1px solid var(--border);
+  border-radius: 8px;
+}
+
+.conv-input-bar textarea {
+  flex: 1;
+  font-family: var(--font-body);
+  font-size: 14px;
+  padding: 8px 10px;
+  border: 1px solid var(--border);
+  border-radius: 6px;
+  resize: vertical;
+  min-height: 44px;
+}
+
+.conv-input-bar textarea:disabled {
+  background: var(--bg);
+  opacity: 0.5;
+}
+
+.conv-input-bar button {
+  padding: 0 18px;
+  font-size: 13px;
+  background: var(--accent);
+  color: white;
+  border: none;
+  border-radius: 6px;
+  cursor: pointer;
+}
+
+.conv-input-bar button:disabled {
+  opacity: 0.4;
+  cursor: not-allowed;
+}
+
+
+/* =================================================================
+   v0.4 — nav tiers + info-box + empty-state polish
+   ================================================================= */
+
+nav .nav-primary {
+  font-weight: 600;
+}
+
+nav .nav-secondary {
+  font-size: 12.5px;
+  opacity: 0.65;
+}
+
+nav .nav-secondary:hover {
+  opacity: 1;
+}
+
+nav a.active.nav-primary,
+nav a.active.nav-secondary {
+  opacity: 1;
+}
+
+.info-box {
+  background: rgba(245, 158, 11, 0.08);
+  border: 1px solid rgba(245, 158, 11, 0.3);
+  border-left: 4px solid rgb(245, 158, 11);
+  padding: 14px 18px;
+  border-radius: 8px;
+  margin-bottom: 20px;
+  font-size: 14px;
+  line-height: 1.65;
+  color: rgb(95, 50, 5);
+}
+
+.info-box strong {
+  color: rgb(75, 35, 0);
+}
+
+.info-box a {
+  color: rgb(180, 70, 30);
+  text-decoration: underline;
+  text-underline-offset: 2px;
+}
+
+/* details/summary polish */
+details summary {
+  padding: 4px 0;
+}
+
+details[open] summary {
+  margin-bottom: 12px;
+}
+
+/* index empty state — prominent CTA */
+.empty-cta {
+  text-align: center;
+  padding: 64px 20px;
+}
+
+.empty-cta-title {
+  font-size: 18px;
+  font-weight: 600;
+  margin-bottom: 8px;
+  color: var(--text);
+}
+
+.empty-cta-subtitle {
+  color: var(--text-muted);
+  font-size: 14px;
+  margin-bottom: 24px;
+}
+
+.empty-cta .button {
+  font-size: 15px;
+  padding: 12px 24px;
+}
+
+/* =================================================================
+   v0.4 — workflow generator UI
+   ================================================================= */
+
+.wf-row-card {
+  background: var(--bg);
+  border: 1px solid var(--border);
+  border-radius: 8px;
+  padding: 14px 16px;
+  margin-bottom: 12px;
+}
+
+.wf-row-header {
+  display: flex;
+  justify-content: space-between;
+  align-items: center;
+  margin-bottom: 12px;
+  padding-bottom: 8px;
+  border-bottom: 1px dashed var(--border);
+}
+
+.button-link {
+  background: none;
+  border: none;
+  color: rgb(180, 70, 30);
+  cursor: pointer;
+  font-size: 12px;
+  text-decoration: underline;
+  padding: 2px 6px;
+}
+
+.wf-chip {
+  display: inline-flex;
+  align-items: center;
+  gap: 4px;
+  background: rgba(180, 70, 30, 0.06);
+  border: 1px solid rgba(180, 70, 30, 0.2);
+  border-radius: 999px;
+  padding: 3px 10px;
+  font-size: 12.5px;
+  cursor: pointer;
+  margin: 2px 4px 2px 0;
+}
+
+.wf-chip input {
+  margin: 0;
+}
+
+.wf-chip:has(input:checked) {
+  background: rgba(180, 70, 30, 0.18);
+  border-color: rgba(180, 70, 30, 0.5);
+  font-weight: 600;
+}
+
+/* =================================================================
+   v0.4 — Markdown + system event cards in conversation
+   ================================================================= */
+
+.msg-body .md-p {
+  margin: 0 0 8px 0;
+  line-height: 1.6;
+}
+
+.msg-body .md-p:last-child { margin-bottom: 0; }
+
+.msg-body .md-h {
+  margin: 12px 0 6px 0;
+  font-weight: 700;
+  line-height: 1.3;
+}
+
+.msg-body .md-ul,
+.msg-body .md-ol {
+  margin: 4px 0 8px 0;
+  padding-left: 22px;
+  line-height: 1.6;
+}
+
+.msg-body .md-ul li,
+.msg-body .md-ol li {
+  margin: 2px 0;
+}
+
+.msg-body .md-code {
+  background: rgba(0, 0, 0, 0.04);
+  border: 1px solid var(--border);
+  border-radius: 6px;
+  padding: 10px 12px;
+  margin: 8px 0;
+  overflow-x: auto;
+  font-family: var(--font-mono);
+  font-size: 12.5px;
+  line-height: 1.45;
+}
+
+.msg-body .md-code code {
+  background: transparent;
+  padding: 0;
+}
+
+.msg-body code {
+  background: rgba(0, 0, 0, 0.06);
+  border-radius: 4px;
+  padding: 1px 5px;
+  font-family: var(--font-mono);
+  font-size: 0.9em;
+}
+
+.msg-bubble.role-user .msg-body .md-code,
+.msg-bubble.role-user .msg-body code {
+  background: rgba(255, 255, 255, 0.18);
+  border-color: rgba(255, 255, 255, 0.3);
+  color: white;
+}
+
+.msg-body a {
+  color: rgb(180, 70, 30);
+  text-decoration: underline;
+  text-underline-offset: 2px;
+}
+
+.msg-bubble.role-user .msg-body a {
+  color: white;
+}
+
+.msg-body strong { font-weight: 700; }
+.msg-body em { font-style: italic; }
+
+/* System event card */
+.msg-bubble.role-system-event {
+  align-self: stretch;
+  max-width: 100%;
+  background: rgba(245, 158, 11, 0.06);
+  border: 1px solid rgba(245, 158, 11, 0.25);
+  border-style: dashed;
+  font-style: normal;
+  color: var(--text);
+}
+
+.md-system-event summary {
+  cursor: pointer;
+  font-size: 12.5px;
+  display: flex;
+  align-items: center;
+  gap: 6px;
+  list-style: none;
+}
+
+.md-system-event summary::-webkit-details-marker { display: none; }
+
+.md-system-event summary .event-icon {
+  font-size: 14px;
+}
+
+.md-system-event summary .event-label {
+  font-weight: 600;
+  letter-spacing: 0.02em;
+  color: rgb(120, 53, 15);
+}
+
+.md-system-event[open] summary {
+  margin-bottom: 8px;
+  border-bottom: 1px dashed rgba(245, 158, 11, 0.3);
+  padding-bottom: 6px;
+}
+
+.md-system-event .event-body {
+  font-size: 12.5px;
+  line-height: 1.55;
+  color: var(--text-muted);
+}
+
+.conv-model-pill {
+  font-family: var(--font-mono);
+  font-size: 11.5px;
+  padding: 2px 8px;
+  border-radius: 999px;
+  background: rgba(0, 0, 0, 0.06);
+  color: var(--text-muted);
+  letter-spacing: 0.01em;
+}
--- a/my-deepagent/tests/integration/test_api_static.py
+++ b/my-deepagent/tests/integration/test_api_static.py
@@ -33,12 +33,25 @@ async def app_client(tmp_path: Path) -> AsyncIterator[AsyncClient]:

@pytest.mark.asyncio
 async def test_root_serves_index_html(app_client: AsyncClient) -> None:
+    """`/` now renders the conversation-centric index (v0.3 PR #8 rewrite)."""
    r = await app_client.get("/")
    assert r.status_code == 200
    assert r.headers["content-type"].startswith("text/html")
    body = r.text
-    assert "<title>my-deepagent · runs</title>" in body
+    # Title became "대화"; data-page kept as "index" for back-compat.
    assert 'data-page="index"' in body
+    assert "대화" in body
+    # Must NOT advertise itself as the Runs page anymore.
+    assert "my-deepagent · runs" not in body
+
+
+@pytest.mark.asyncio
+async def test_runs_html_served(app_client: AsyncClient) -> None:
+    """`/runs.html` is the new home of the workflow runs archive."""
+    r = await app_client.get("/runs.html")
+    assert r.status_code == 200
+    assert 'data-page="runs"' in r.text
+    assert "Workflow Runs" in r.text


@pytest.mark.asyncio
--- a/my-deepagent/tests/integration/test_budget.py
+++ b/my-deepagent/tests/integration/test_budget.py
@@ -256,6 +256,42 @@ async def test_get_remaining_unknown_scope_returns_none(db: Database) -> None:
    assert remaining is None


+# ---------------------------------------------------------------------------
+# session: scope (v0.3 PR #6) — sub-agent rollup to root session
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_session_scope_accumulates_cost(db: Database) -> None:
+    import uuid as _uuid
+
+    tracker = _make_tracker(db, run_cap=2.0)
+    session_id = _uuid.uuid4()
+    await tracker.record(
+        run_id=None, persona_name=None, actual_cost_usd=0.30, session_id=session_id
+    )
+    await tracker.record(
+        run_id=None, persona_name=None, actual_cost_usd=0.20, session_id=session_id
+    )
+    spent = await tracker.get_spent(f"session:{session_id}")
+    assert spent == pytest.approx(0.50)
+    remaining = await tracker.get_remaining(f"session:{session_id}")
+    assert remaining == pytest.approx(1.50)
+
+
+@pytest.mark.asyncio
+async def test_session_scope_omitted_when_no_session_id(db: Database) -> None:
+    """Calls without ``session_id`` must NOT create a session: ledger row."""
+    import uuid as _uuid
+
+    tracker = _make_tracker(db)
+    # Drive a record without session_id.
+    await tracker.record(run_id=None, persona_name=None, actual_cost_usd=0.10)
+    # Querying any session scope should yield 0 spent.
+    sid = _uuid.uuid4()
+    assert (await tracker.get_spent(f"session:{sid}")) == pytest.approx(0.0)
+
+
 # ---------------------------------------------------------------------------
 # helpers
 # ---------------------------------------------------------------------------
--- a/my-deepagent/tests/integration/test_conversation_gui.py
+++ b/my-deepagent/tests/integration/test_conversation_gui.py
@@ -0,0 +1,242 @@
+"""v0.3 PR #8 — Conversation Web GUI tests.
+
+Covers:
+1. GET /conversation.html serves the static file (200).
+2. POST /api/sessions/{id}/messages still returns 200 + queues a background
+   task (the agent_runner is stubbed so we never hit OpenRouter).
+3. The background task persists an assistant MessageRow that the SSE stream
+   then surfaces.
+4. The background task stays alive until it completes (asyncio.Task ref held
+   on app.state so it is not garbage-collected mid-flight; see Ruff RUF006).
+"""
+
+from __future__ import annotations
+
+import asyncio
+from collections.abc import AsyncIterator
+from pathlib import Path
+from typing import Any
+
+import pytest
+from fastapi import FastAPI
+from httpx import ASGITransport, AsyncClient
+from sqlalchemy import select
+
+from my_deepagent.api.app import create_app
+from my_deepagent.config import load_config
+from my_deepagent.persistence.db import Database
+from my_deepagent.persistence.models import InteractiveSessionRow, MessageRow
+
+
+@pytest.fixture
+async def app_client(
+    tmp_path: Path,
+) -> AsyncIterator[tuple[AsyncClient, Database, FastAPI]]:
+    db_url = f"sqlite+aiosqlite:///{tmp_path / 'conv.sqlite3'}"
+    cfg = load_config(
+        workspace_root=tmp_path,
+        data_dir=tmp_path / "data",
+        database_url=db_url,
+    )
+    db = Database(db_url)
+    await db.init_schema()
+    await db.dispose()
+    app = create_app(cfg)
+    transport = ASGITransport(app=app)
+    async with app.router.lifespan_context(app):
+        # Tests get their own Database instance for direct row inspection.
+        external_db = Database(db_url)
+        async with AsyncClient(transport=transport, base_url="http://test", timeout=10.0) as client:
+            yield (client, external_db, app)
+        await external_db.dispose()
+
+
+# ---------------------------------------------------------------------------
+# Static file serving
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_conversation_page_served(
+    app_client: tuple[AsyncClient, Database, FastAPI],
+) -> None:
+    client, _db, _app = app_client
+    r = await client.get("/conversation.html")
+    assert r.status_code == 200
+    assert 'data-page="conversation"' in r.text
+    assert "message-input" in r.text
+
+
+# ---------------------------------------------------------------------------
+# POST /messages still 200 + background task fires
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_post_message_returns_ack_and_persists_user_row(
+    app_client: tuple[AsyncClient, Database, FastAPI], monkeypatch: pytest.MonkeyPatch
+) -> None:
+    client, db, _app = app_client
+
+    invocations: list[tuple[str, str]] = []
+
+    async def fake_invoke(
+        _db: Any,
+        _config: Any,
+        _personas: Any,
+        session_id: Any,
+        user_message: str,
+        *,
+        saver: Any = None,
+        chunk_queue: Any = None,
+    ) -> None:
+        invocations.append((str(session_id), user_message))
+
+    monkeypatch.setattr("my_deepagent.api.routes.sessions.invoke_session_agent", fake_invoke)
+
+    # Create a session.
+    r = await client.post(
+        "/api/sessions",
+        json={"persona_name": "default-interactive", "repo_path": str(Path.cwd())},
+    )
+    assert r.status_code == 200
+    sid = r.json()["session_id"]
+
+    # POST a message.
+    r2 = await client.post(f"/api/sessions/{sid}/messages", json={"content": "hello agent"})
+    assert r2.status_code == 200
+    assert r2.json()["state"] == "active"
+
+    # User row persisted synchronously.
+    async with db.session() as s:
+        rows = (
+            (
+                await s.execute(
+                    select(MessageRow).where(MessageRow.session_id == sid).order_by(MessageRow.seq)
+                )
+            )
+            .scalars()
+            .all()
+        )
+    assert len(rows) == 1
+    assert rows[0].role == "user"
+    assert rows[0].content == "hello agent"
+
+    # Yield to the event loop briefly so the background task can run.
+    await asyncio.sleep(0.05)
+    assert invocations == [(sid, "hello agent")]
+
+
+@pytest.mark.asyncio
+async def test_post_message_holds_task_ref_on_app_state(
+    app_client: tuple[AsyncClient, Database, FastAPI], monkeypatch: pytest.MonkeyPatch
+) -> None:
+    """Background task must be held on app.state.pending_invocations so the
+    GC doesn't drop it before completion (the pattern Ruff RUF006 enforces)."""
+    client, _db, app = app_client
+
+    started = asyncio.Event()
+    can_finish = asyncio.Event()
+
+    async def slow_invoke(*_a: Any, **_k: Any) -> None:
+        started.set()
+        await can_finish.wait()
+
+    monkeypatch.setattr("my_deepagent.api.routes.sessions.invoke_session_agent", slow_invoke)
+
+    r = await client.post(
+        "/api/sessions",
+        json={"persona_name": "default-interactive", "repo_path": str(Path.cwd())},
+    )
+    sid = r.json()["session_id"]
+    await client.post(f"/api/sessions/{sid}/messages", json={"content": "x"})
+
+    # Wait for the task to start.
+    await asyncio.wait_for(started.wait(), timeout=2.0)
+    # The pending_invocations set on the app should hold a reference.
+    pending = app.state.pending_invocations
+    assert len(pending) == 1
+    # Release the task and let the discard callback fire.
+    can_finish.set()
+    await asyncio.sleep(0.05)
+    assert len(app.state.pending_invocations) == 0
+
+
+# ---------------------------------------------------------------------------
+# End-to-end: assistant message materializes for SSE
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_background_invocation_persists_assistant_row(
+    app_client: tuple[AsyncClient, Database, FastAPI], monkeypatch: pytest.MonkeyPatch
+) -> None:
+    """When the runner finishes, an assistant MessageRow should be visible."""
+    client, db, _app = app_client
+
+    async def fake_invoke(
+        passed_db: Any,
+        _config: Any,
+        _personas: Any,
+        session_id: Any,
+        _user_message: str,
+        *,
+        saver: Any = None,
+        chunk_queue: Any = None,
+    ) -> None:
+        # Simulate what the real runner does: write an assistant MessageRow.
+        from datetime import UTC, datetime
+
+        from sqlalchemy import desc
+
+        async with passed_db.session() as s:
+            last = (
+                await s.execute(
+                    select(MessageRow.seq)
+                    .where(MessageRow.session_id == str(session_id))
+                    .order_by(desc(MessageRow.seq))
+                    .limit(1)
+                )
+            ).scalar_one_or_none() or 0
+            s.add(
+                MessageRow(
+                    session_id=str(session_id),
+                    seq=last + 1,
+                    role="assistant",
+                    content="(stubbed assistant reply)",
+                    tool_calls=None,
+                    token_count=5,
+                    is_summary=False,
+                    archived=False,
+                    ts=datetime.now(UTC).isoformat(timespec="seconds"),
+                )
+            )
+            await s.commit()
+
+    monkeypatch.setattr("my_deepagent.api.routes.sessions.invoke_session_agent", fake_invoke)
+
+    r = await client.post(
+        "/api/sessions",
+        json={"persona_name": "default-interactive", "repo_path": str(Path.cwd())},
+    )
+    sid = r.json()["session_id"]
+    await client.post(f"/api/sessions/{sid}/messages", json={"content": "ping"})
+    # Let the background task complete.
+    await asyncio.sleep(0.1)
+
+    # Verify the conversation now has both user + assistant rows.
+    async with db.session() as s:
+        rows = (
+            (
+                await s.execute(
+                    select(MessageRow).where(MessageRow.session_id == sid).order_by(MessageRow.seq)
+                )
+            )
+            .scalars()
+            .all()
+        )
+        sess_row = await s.get(InteractiveSessionRow, sid)
+    assert [r.role for r in rows] == ["user", "assistant"]
+    assert rows[1].content == "(stubbed assistant reply)"
+    assert sess_row is not None
+    assert sess_row.title is not None  # set from first user message
--- a/my-deepagent/tests/integration/test_instructions.py
+++ b/my-deepagent/tests/integration/test_instructions.py
@@ -0,0 +1,192 @@
+"""v0.3 PR #7 — MYDEEPAGENT.md instruction-file hierarchy tests.
+
+Covers:
+1. Global file is bootstrapped with template on first call (idempotent).
+2. Project file is NEVER auto-created — present iff user wrote it.
+3. `resolve_instruction_paths` orders global → project.
+4. Resolution is empty if global hasn't been bootstrapped yet.
+5. `build_agent` passes the combined list through to `deepagents.create_deep_agent(memory=...)`.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+from typing import Any
+
+import pytest
+
+from my_deepagent.config import load_config
+from my_deepagent.instructions import (
+    INSTRUCTION_FILENAME,
+    ensure_global_instructions_initialized,
+    global_instructions_path,
+    project_instructions_path,
+    resolve_instruction_paths,
+)
+
+# ---------------------------------------------------------------------------
+# Bootstrap (global only)
+# ---------------------------------------------------------------------------
+
+
+def test_ensure_global_instructions_creates_template(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    p = ensure_global_instructions_initialized(cfg)
+    assert p.is_file()
+    assert p.name == INSTRUCTION_FILENAME
+    body = p.read_text(encoding="utf-8")
+    assert "MYDEEPAGENT.md (global)" in body
+    assert "한국어" in body  # template is Korean by default
+
+
+def test_ensure_global_instructions_idempotent(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    p = ensure_global_instructions_initialized(cfg)
+    p.write_text("custom content", encoding="utf-8")
+    # Second call must not overwrite user-edited content.
+    p2 = ensure_global_instructions_initialized(cfg)
+    assert p2 == p
+    assert p.read_text(encoding="utf-8") == "custom content"
+
+
+# ---------------------------------------------------------------------------
+# Project file behaviour
+# ---------------------------------------------------------------------------
+
+
+def test_project_instructions_never_auto_created(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    repo = tmp_path / "repo"
+    repo.mkdir()
+    # Bootstrap global — must not touch project file.
+    ensure_global_instructions_initialized(cfg)
+    assert not project_instructions_path(repo).exists()
+
+
+# ---------------------------------------------------------------------------
+# resolve_instruction_paths
+# ---------------------------------------------------------------------------
+
+
+def test_resolve_paths_includes_only_existing_files(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    repo = tmp_path / "repo"
+    repo.mkdir()
+
+    # No files exist → empty.
+    assert resolve_instruction_paths(cfg, repo) == []
+
+    # Only global.
+    g = ensure_global_instructions_initialized(cfg)
+    paths = resolve_instruction_paths(cfg, repo)
+    assert paths == [str(g.resolve())]
+
+    # Add project — order becomes global, project.
+    proj_file = project_instructions_path(repo)
+    proj_file.write_text("# project-specific", encoding="utf-8")
+    paths = resolve_instruction_paths(cfg, repo)
+    assert paths == [str(g.resolve()), str(proj_file.resolve())]
+
+
+def test_global_instructions_path_under_data_dir(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    p = global_instructions_path(cfg)
+    assert p.parent == cfg.data_dir
+    assert p.name == INSTRUCTION_FILENAME
+
+
+def test_governance_bootstrap_creates_full_skeleton(tmp_path: Path) -> None:
+    """`bootstrap_user_dirs` materialises the user-wide layout (PR #7)."""
+    from my_deepagent.governance import bootstrap_user_dirs
+    from my_deepagent.memory import INDEX_FILENAME as MEMORY_INDEX_FILENAME
+
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    bootstrap_user_dirs(cfg)
+
+    # Global MYDEEPAGENT.md created with template.
+    assert global_instructions_path(cfg).is_file()
+    # Global memory dir + MEMORY.md created.
+    global_mem = Path(cfg.data_dir) / "global" / "memory"
+    assert global_mem.is_dir()
+    assert (global_mem / MEMORY_INDEX_FILENAME).is_file()
+    # User skills dir created.
+    assert (Path(cfg.data_dir) / "skills").is_dir()
+    # Projects parent dir created.
+    assert (Path(cfg.data_dir) / "projects").is_dir()
+
+
+def test_governance_bootstrap_is_idempotent(tmp_path: Path) -> None:
+    from my_deepagent.governance import bootstrap_user_dirs
+
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    bootstrap_user_dirs(cfg)
+    gpath = global_instructions_path(cfg)
+    gpath.write_text("custom edited content", encoding="utf-8")
+    # Second call must not overwrite user edits.
+    bootstrap_user_dirs(cfg)
+    assert gpath.read_text(encoding="utf-8") == "custom edited content"
+
+
+# ---------------------------------------------------------------------------
+# Integration: instruction paths reach deepagents memory= kwarg
+# ---------------------------------------------------------------------------
+
+
+def test_build_agent_receives_combined_instruction_and_memory_paths(
+    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
+) -> None:
+    """`build_agent(memory_paths_override=[instructions..., memory...])` passes
+    the union through to `create_deep_agent(memory=...)`.  Mirrors what
+    InteractiveSession does at REPL bootstrap.
+    """
+    from my_deepagent import session as session_mod
+    from my_deepagent.persona import Persona
+
+    captured: dict[str, Any] = {}
+
+    def fake_create_deep_agent(**kwargs: Any) -> Any:
+        captured.update(kwargs)
+        return object()
+
+    monkeypatch.setattr(session_mod, "create_deep_agent", fake_create_deep_agent)
+
+    cfg = load_config(
+        workspace_root=tmp_path,
+        data_dir=tmp_path / "data",
+        openrouter_api_key="test-key",
+    )
+    repo = tmp_path / "repo"
+    repo.mkdir()
+
+    g = ensure_global_instructions_initialized(cfg)
+    proj_file = project_instructions_path(repo)
+    proj_file.write_text("# project rule", encoding="utf-8")
+
+    # Simulate a project memory entry.
+    mem_entry = tmp_path / "MEM.md"
+    mem_entry.write_text("# memory entry", encoding="utf-8")
+
+    persona = Persona(
+        name="test-persona",
+        version=1,
+        backend="openrouter",
+        model="openrouter:deepseek/deepseek-chat",
+        provider_origin="CN/DeepSeek",
+        capabilities=("code_edit",),
+        max_risk_level="high",
+        system_prompt="System prompt for test persona (must be ≥10 chars)",
+        deepagents_backend="state",
+    )
+    instruction_paths = resolve_instruction_paths(cfg, repo)
+    combined = [*instruction_paths, str(mem_entry.resolve())]
+
+    _agent = session_mod.build_agent(
+        persona,
+        cfg,
+        root_dir=repo,
+        memory_paths_override=combined,
+    )
+    assert "memory" in captured
+    # Global must come before project, project before mem entry — exact list match.
+    expected = [str(g.resolve()), str(proj_file.resolve()), str(mem_entry.resolve())]
+    assert captured["memory"] == expected
--- a/my-deepagent/tests/integration/test_memory.py
+++ b/my-deepagent/tests/integration/test_memory.py
@@ -71,23 +71,25 @@ def test_ensure_memory_initialized_is_idempotent(memory_dir: Path) -> None:


 def test_add_memory_entry_writes_file_and_updates_index(memory_dir: Path) -> None:
-    path = add_memory_entry(memory_dir, "프로젝트 핵심: 위크닥 CLI MVP")
-    assert path.is_file()
-    body = path.read_text(encoding="utf-8")
+    result = add_memory_entry(memory_dir, "프로젝트 핵심: 위크닥 CLI MVP")
+    assert result.path.is_file()
+    body = result.path.read_text(encoding="utf-8")
    assert "프로젝트 핵심" in body
-    assert body.startswith("---\nslug: ")
+    assert body.startswith("---\nname: ")
+    assert "type:" in body
+    assert result.scrubbed is False

    index = (memory_dir / INDEX_FILENAME).read_text(encoding="utf-8")
-    assert path.name in index
+    assert result.path.name in index
    assert "프로젝트 핵심" in index


 def test_add_memory_entry_handles_slug_collision(memory_dir: Path) -> None:
-    p1 = add_memory_entry(memory_dir, "Same first line")
-    p2 = add_memory_entry(memory_dir, "Same first line\nsecond entry body")
-    p3 = add_memory_entry(memory_dir, "Same first line\nthird entry body")
+    r1 = add_memory_entry(memory_dir, "Same first line")
+    r2 = add_memory_entry(memory_dir, "Same first line\nsecond entry body")
+    r3 = add_memory_entry(memory_dir, "Same first line\nthird entry body")
+    p1, p2, p3 = r1.path, r2.path, r3.path
    assert p1.name != p2.name != p3.name
-    # Auto-slugging should land on <slug>-2.md and <slug>-3.md.
    stems = sorted([p1.stem, p2.stem, p3.stem])
    assert stems[0] == "same-first-line"
    assert stems[1] == "same-first-line-2"
@@ -100,8 +102,34 @@ def test_add_memory_entry_rejects_empty_content(memory_dir: Path) -> None:


 def test_add_memory_entry_explicit_name_override(memory_dir: Path) -> None:
-    p = add_memory_entry(memory_dir, "Random body text", name="My Custom Slug!!")
-    assert p.stem == "my-custom-slug"
+    r = add_memory_entry(memory_dir, "Random body text", name="My Custom Slug!!")
+    assert r.path.stem == "my-custom-slug"
+
+
+def test_add_memory_entry_scrubs_openrouter_key(memory_dir: Path) -> None:
+    r = add_memory_entry(
+        memory_dir,
+        "save this for me: sk-or-v1-abcdefghijklmnop1234567890",
+    )
+    body = r.path.read_text(encoding="utf-8")
+    assert "sk-or-v1-abcdefghijklmnop" not in body
+    assert "<redacted:openrouter-key>" in body
+    assert r.scrubbed is True
+
+
+def test_add_memory_entry_infers_user_type(memory_dir: Path) -> None:
+    r = add_memory_entry(memory_dir, "I prefer fish shell over bash")
+    assert r.memory_type == "user"
+
+
+def test_add_memory_entry_infers_feedback_type(memory_dir: Path) -> None:
+    r = add_memory_entry(memory_dir, "don't mock the database in integration tests")
+    assert r.memory_type == "feedback"
+
+
+def test_add_memory_entry_explicit_type_overrides_heuristic(memory_dir: Path) -> None:
+    r = add_memory_entry(memory_dir, "I prefer fish shell", memory_type="reference")
+    assert r.memory_type == "reference"


 # ---------------------------------------------------------------------------
@@ -110,17 +138,17 @@ def test_add_memory_entry_explicit_name_override(memory_dir: Path) -> None:


 def test_remove_memory_entry_by_slug(memory_dir: Path) -> None:
-    p = add_memory_entry(memory_dir, "to be forgotten")
-    assert remove_memory_entry(memory_dir, p.stem) is True
-    assert not p.exists()
+    r = add_memory_entry(memory_dir, "to be forgotten")
+    assert remove_memory_entry(memory_dir, r.path.stem) is True
+    assert not r.path.exists()
    index_body = (memory_dir / INDEX_FILENAME).read_text(encoding="utf-8")
-    assert p.name not in index_body
+    assert r.path.name not in index_body


 def test_remove_memory_entry_by_filename(memory_dir: Path) -> None:
-    p = add_memory_entry(memory_dir, "to be forgotten by full filename")
-    assert remove_memory_entry(memory_dir, p.name) is True
-    assert not p.exists()
+    r = add_memory_entry(memory_dir, "to be forgotten by full filename")
+    assert remove_memory_entry(memory_dir, r.path.name) is True
+    assert not r.path.exists()


 def test_remove_memory_entry_missing_returns_false(memory_dir: Path) -> None:
--- a/my-deepagent/tests/integration/test_plan_mode.py
+++ b/my-deepagent/tests/integration/test_plan_mode.py
@@ -0,0 +1,181 @@
+"""v0.3 PR #5 — Plan mode tests.
+
+Covers:
+1. PlanModeMiddleware passes tool calls through when inactive.
+2. PlanModeMiddleware blocks write_file / edit_file / execute / task when active.
+3. read_file / glob / grep / ls pass through even when active; write_todos is blocked.
+4. Toggling the closure flag changes behavior without rebuilding the middleware.
+5. The synthetic ToolMessage carries status="error" and a clear hint.
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Any
+
+import pytest
+from langchain_core.messages import ToolMessage
+
+from my_deepagent.middleware.plan_mode import (
+    BLOCKED_TOOLS_IN_PLAN_MODE,
+    PlanModeMiddleware,
+)
+
+
+@dataclass
+class _FakeToolRequest:
+    """Minimal stand-in for langchain ToolCallRequest in unit tests."""
+
+    tool_call: dict[str, Any]
+
+
+async def _passthrough_handler(_: _FakeToolRequest) -> ToolMessage:
+    """Stub handler — returns a benign 'tool executed' message."""
+    return ToolMessage(content="EXECUTED", tool_call_id="t1", name="stub")
+
+
+# ---------------------------------------------------------------------------
+# Inactive plan-mode → all tools pass through
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_plan_mode_inactive_passes_through() -> None:
+    mw = PlanModeMiddleware(is_active=lambda: False)
+    for name in ["write_file", "edit_file", "execute", "task", "read_file", "glob"]:
+        req = _FakeToolRequest(tool_call={"name": name, "id": "t1", "args": {}})
+        result = await mw.awrap_tool_call(req, _passthrough_handler)
+        assert isinstance(result, ToolMessage)
+        assert result.content == "EXECUTED"
+        assert result.status != "error"
+
+
+# ---------------------------------------------------------------------------
+# Active plan-mode → write tools blocked with status=error
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_plan_mode_active_blocks_write_file() -> None:
+    mw = PlanModeMiddleware(is_active=lambda: True)
+    req = _FakeToolRequest(
+        tool_call={"name": "write_file", "id": "abc123", "args": {"file_path": "/tmp/x"}}
+    )
+    result = await mw.awrap_tool_call(req, _passthrough_handler)
+    assert isinstance(result, ToolMessage)
+    assert result.status == "error"
+    assert result.tool_call_id == "abc123"
+    assert "Plan-mode" in result.content
+    assert "write_file" in result.content
+
+
+@pytest.mark.asyncio
+async def test_plan_mode_active_blocks_execute() -> None:
+    mw = PlanModeMiddleware(is_active=lambda: True)
+    req = _FakeToolRequest(tool_call={"name": "execute", "id": "exec1", "args": {"command": "ls"}})
+    result = await mw.awrap_tool_call(req, _passthrough_handler)
+    assert isinstance(result, ToolMessage)
+    assert result.status == "error"
+    assert "execute" in result.content
+
+
+@pytest.mark.asyncio
+async def test_plan_mode_active_blocks_task_subagent_spawn() -> None:
+    mw = PlanModeMiddleware(is_active=lambda: True)
+    req = _FakeToolRequest(tool_call={"name": "task", "id": "task1", "args": {"description": "x"}})
+    result = await mw.awrap_tool_call(req, _passthrough_handler)
+    assert isinstance(result, ToolMessage)
+    assert result.status == "error"
+    assert "task" in result.content
+
+
+# ---------------------------------------------------------------------------
+# Active plan-mode → read-only tools still pass through
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_plan_mode_active_allows_read_only_tools() -> None:
+    mw = PlanModeMiddleware(is_active=lambda: True)
+    for name in ["read_file", "glob", "grep", "ls"]:
+        req = _FakeToolRequest(tool_call={"name": name, "id": "t1", "args": {}})
+        result = await mw.awrap_tool_call(req, _passthrough_handler)
+        assert result.content == "EXECUTED", f"{name} should not be blocked"
+        assert result.status != "error"
+
+
+@pytest.mark.asyncio
+async def test_plan_mode_blocks_write_todos() -> None:
+    """`write_todos` is part of the plan markdown — must be blocked."""
+    mw = PlanModeMiddleware(is_active=lambda: True)
+    req = _FakeToolRequest(tool_call={"name": "write_todos", "id": "wt1", "args": {"todos": []}})
+    result = await mw.awrap_tool_call(req, _passthrough_handler)
+    assert isinstance(result, ToolMessage)
+    assert result.status == "error"
+    assert "write_todos" in result.content
+
+
+# ---------------------------------------------------------------------------
+# Closure-toggle behavior — flip without rebuild
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_plan_mode_closure_toggle_changes_behavior() -> None:
+    state = {"on": False}
+    mw = PlanModeMiddleware(is_active=lambda: state["on"])
+
+    req = _FakeToolRequest(tool_call={"name": "write_file", "id": "w", "args": {}})
+
+    # Off → passes.
+    r1 = await mw.awrap_tool_call(req, _passthrough_handler)
+    assert r1.status != "error"
+
+    # Flip on → blocks.
+    state["on"] = True
+    r2 = await mw.awrap_tool_call(req, _passthrough_handler)
+    assert r2.status == "error"
+
+    # Flip back off → passes again.
+    state["on"] = False
+    r3 = await mw.awrap_tool_call(req, _passthrough_handler)
+    assert r3.status != "error"
+
+
+# ---------------------------------------------------------------------------
+# Sync path mirrors async path
+# ---------------------------------------------------------------------------
+
+
+def test_plan_mode_sync_wrap_tool_call() -> None:
+    mw = PlanModeMiddleware(is_active=lambda: True)
+
+    def sync_handler(_: _FakeToolRequest) -> ToolMessage:
+        return ToolMessage(content="EXECUTED", tool_call_id="t1", name="stub")
+
+    req = _FakeToolRequest(tool_call={"name": "write_file", "id": "s1", "args": {}})
+    result = mw.wrap_tool_call(req, sync_handler)
+    assert isinstance(result, ToolMessage)
+    assert result.status == "error"
+
+
+# ---------------------------------------------------------------------------
+# Blocklist constant sanity
+# ---------------------------------------------------------------------------
+
+
+def test_blocklist_includes_all_known_write_tools() -> None:
+    assert "write_file" in BLOCKED_TOOLS_IN_PLAN_MODE
+    assert "edit_file" in BLOCKED_TOOLS_IN_PLAN_MODE
+    assert "execute" in BLOCKED_TOOLS_IN_PLAN_MODE
+    assert "bash" in BLOCKED_TOOLS_IN_PLAN_MODE
+    assert "task" in BLOCKED_TOOLS_IN_PLAN_MODE
+
+
+def test_blocklist_excludes_read_only_tools() -> None:
+    for name in ("read_file", "glob", "grep", "ls"):
+        assert name not in BLOCKED_TOOLS_IN_PLAN_MODE
+
+
+def test_blocklist_includes_write_todos() -> None:
+    assert "write_todos" in BLOCKED_TOOLS_IN_PLAN_MODE
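The plan-mode tests above exercise a gate that reads a closure flag on every tool call, so toggling the flag changes behaviour without rebuilding the middleware. A minimal sketch of that pattern follows. It is not the project's `PlanModeMiddleware`: the names `PlanGate`, `ToolResult`, `BLOCKED` and `run_tool` are hypothetical stand-ins, and the blocklist is assumed from the names the tests assert on.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Assumed blocklist, copied from the names the tests check; the real
# BLOCKED_TOOLS_IN_PLAN_MODE constant may contain more entries.
BLOCKED = frozenset(
    {"write_file", "edit_file", "execute", "bash", "task", "write_todos"}
)


@dataclass
class ToolResult:
    """Illustrative stand-in for a tool message with a status field."""

    content: str
    tool_call_id: str
    status: str = "success"


class PlanGate:
    """Rejects mutating tools while is_active() returns True."""

    def __init__(self, is_active: Callable[[], bool]) -> None:
        # Keep the callable, not its value, so the flag is re-read per call.
        self._is_active = is_active

    def wrap_tool_call(
        self,
        tool_call: dict[str, Any],
        handler: Callable[[dict[str, Any]], ToolResult],
    ) -> ToolResult:
        name = tool_call["name"]
        if self._is_active() and name in BLOCKED:
            # Synthetic error result: the real handler is never invoked.
            return ToolResult(
                content=f"Plan-mode: '{name}' is blocked until the plan is approved.",
                tool_call_id=tool_call["id"],
                status="error",
            )
        return handler(tool_call)


def run_tool(tool_call: dict[str, Any]) -> ToolResult:
    """Hypothetical handler that pretends the tool ran."""
    return ToolResult(content="EXECUTED", tool_call_id=tool_call["id"])


state = {"on": False}
gate = PlanGate(is_active=lambda: state["on"])
write_call = {"name": "write_file", "id": "w1", "args": {}}

off_result = gate.wrap_tool_call(write_call, run_tool)  # flag off: passes through
state["on"] = True
on_result = gate.wrap_tool_call(write_call, run_tool)  # flag on: blocked
read_result = gate.wrap_tool_call(
    {"name": "read_file", "id": "r1", "args": {}}, run_tool
)  # read-only tool: still passes
```

Passing a callable rather than a boolean is what lets a session flip plan mode at runtime; a gate built with a plain `bool` would freeze the value at construction time.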
--- a/my-deepagent/tests/integration/test_skills.py
+++ b/my-deepagent/tests/integration/test_skills.py
@@ -201,6 +201,66 @@ def test_resolve_skill_sources_returns_user_dir(tmp_path: Path) -> None:
    assert sources[0] == str(user_skills_dir(cfg).resolve())


+def test_resolve_skill_sources_with_project_key_returns_both(tmp_path: Path) -> None:
+    from my_deepagent.skills import project_skills_dir
+
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    sources = resolve_skill_sources(cfg, project_key="proj1234abcdef00")
+    assert sources == [
+        str(user_skills_dir(cfg).resolve()),
+        str(project_skills_dir(cfg, "proj1234abcdef00").resolve()),
+    ]
+
+
+def test_list_all_skills_project_overrides_global(tmp_path: Path) -> None:
+    from my_deepagent.skills import list_all_skills, project_skills_dir
+
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    pk = "abc123def456ffff"
+    global_dir = user_skills_dir(cfg)
+    proj_dir = project_skills_dir(cfg, pk)
+    global_dir.mkdir(parents=True)
+    proj_dir.mkdir(parents=True)
+    _make_skill(global_dir, "shared", description="global-version")
+    _make_skill(proj_dir, "shared", description="project-version")
+    _make_skill(global_dir, "global-only", description="g")
+    _make_skill(proj_dir, "project-only", description="p")
+
+    skills = list_all_skills(cfg, pk)
+    by_name = {s.name: s for s in skills}
+    assert set(by_name.keys()) == {"shared", "global-only", "project-only"}
+    # Project overrides global on the shared name.
+    assert by_name["shared"].scope == "project"
+    assert by_name["shared"].description == "project-version"
+    assert by_name["global-only"].scope == "global"
+    assert by_name["project-only"].scope == "project"
+
+
+def test_find_skill_prefers_project_over_global(tmp_path: Path) -> None:
+    from my_deepagent.skills import find_skill, project_skills_dir
+
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    pk = "f0f0f0f0f0f0f0f0"
+    global_dir = user_skills_dir(cfg)
+    proj_dir = project_skills_dir(cfg, pk)
+    global_dir.mkdir(parents=True)
+    proj_dir.mkdir(parents=True)
+    _make_skill(global_dir, "dup", description="g")
+    _make_skill(proj_dir, "dup", description="p")
+
+    skill = find_skill(cfg, pk, "dup")
+    assert skill is not None
+    assert skill.scope == "project"
+    assert skill.description == "p"
+
+
+def test_find_skill_missing_returns_none(tmp_path: Path) -> None:
+    from my_deepagent.skills import find_skill
+
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    assert find_skill(cfg, "any-project-key", "nonexistent") is None
+
+
 # ---------------------------------------------------------------------------
 # Integration: build_agent threads skills sources to deepagents
 # ---------------------------------------------------------------------------
--- a/my-deepagent/tests/integration/test_subagents.py
+++ b/my-deepagent/tests/integration/test_subagents.py
@@ -0,0 +1,306 @@
+"""v0.3 PR #6 — Sub-agent session linkage tests.
+
+Covers:
+1. `spawn_subagent_session` creates a child row with correct parent_session_id,
+   depth = parent.depth + 1, inherited project_key.
+2. Depth limit `MAX_SUBAGENT_DEPTH` rejects further spawns.
+3. Spawn against ended/missing parent raises human_required errors.
+4. `list_subagents` returns direct children in start-order, excludes grandchildren.
+5. Persona upsert behaves correctly — same persona hash → same persona_id.
+"""
+
+from __future__ import annotations
+
+import uuid
+from collections.abc import AsyncIterator
+from datetime import UTC, datetime
+from pathlib import Path
+
+import pytest
+
+from my_deepagent.config import load_config
+from my_deepagent.errors import MyDeepAgentError
+from my_deepagent.persistence.db import Database
+from my_deepagent.persistence.models import (
+    AgentPersonaRow,
+    InteractiveSessionRow,
+)
+from my_deepagent.persona import Persona
+from my_deepagent.subagents import (
+    MAX_SUBAGENT_DEPTH,
+    list_subagents,
+    resolve_root_session_id,
+    spawn_subagent_session,
+)
+
+
+def _now() -> str:
+    return datetime.now(UTC).isoformat(timespec="seconds")
+
+
+def _make_persona(name: str = "spec-writer") -> Persona:
+    return Persona(
+        name=name,
+        version=1,
+        backend="openrouter",
+        model="openrouter:deepseek/deepseek-chat",
+        provider_origin="CN/DeepSeek",
+        capabilities=("spec_write",),
+        max_risk_level="medium",
+        system_prompt="System prompt — at least ten chars",
+        deepagents_backend="state",
+    )
+
+
+@pytest.fixture
+async def db_with_root(tmp_path: Path) -> AsyncIterator[tuple[Database, str]]:
+    """Database + one root InteractiveSessionRow with depth=0 + project_key='proj1234abcdef00'."""
+    db_url = f"sqlite+aiosqlite:///{tmp_path / 'subagent.sqlite3'}"
+    db = Database(db_url)
+    await db.init_schema()
+
+    persona_id = str(uuid.uuid4())
+    root_id = str(uuid.uuid4())
+
+    async with db.session() as s:
+        s.add(
+            AgentPersonaRow(
+                id=persona_id,
+                name="default-interactive",
+                version=1,
+                hash="parent-hash",
+                definition={"name": "default-interactive", "version": 1},
+                created_at=_now(),
+            )
+        )
+        s.add(
+            InteractiveSessionRow(
+                id=root_id,
+                persona_id=persona_id,
+                persona_hash="parent-hash",
+                started_at=_now(),
+                last_message_at=_now(),
+                state="active",
+                total_input_tokens=0,
+                total_output_tokens=0,
+                model="openrouter:deepseek/deepseek-chat",
+                project_key="proj1234abcdef00",
+                title="root",
+                plan_mode=False,
+                parent_session_id=None,
+                depth=0,
+            )
+        )
+        await s.commit()
+    try:
+        yield (db, root_id)
+    finally:
+        await db.dispose()
+
+
+# ---------------------------------------------------------------------------
+# Happy path
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_spawn_creates_child_with_inherited_project_key(
+    db_with_root: tuple[Database, str],
+) -> None:
+    db, root_id = db_with_root
+    persona = _make_persona()
+    child_id = await spawn_subagent_session(
+        db,
+        parent_session_id=uuid.UUID(root_id),
+        persona=persona,
+        initial_title="planner-1",
+    )
+    async with db.session() as s:
+        child = await s.get(InteractiveSessionRow, str(child_id))
+    assert child is not None
+    assert child.parent_session_id == root_id
+    assert child.depth == 1
+    assert child.project_key == "proj1234abcdef00"  # inherited
+    assert child.title == "planner-1"
+    assert child.state == "active"
+    assert child.plan_mode is False
+    assert child.persona_hash == persona.compute_hash()
+
+
+@pytest.mark.asyncio
+async def test_spawn_two_children_depth_one_each(
+    db_with_root: tuple[Database, str],
+) -> None:
+    db, root_id = db_with_root
+    persona = _make_persona()
+    child_a = await spawn_subagent_session(
+        db, parent_session_id=uuid.UUID(root_id), persona=persona
+    )
+    child_b = await spawn_subagent_session(
+        db, parent_session_id=uuid.UUID(root_id), persona=persona
+    )
+    async with db.session() as s:
+        a = await s.get(InteractiveSessionRow, str(child_a))
+        b = await s.get(InteractiveSessionRow, str(child_b))
+    assert a is not None and b is not None
+    assert a.depth == b.depth == 1
+    assert a.parent_session_id == b.parent_session_id == root_id
+
+
+# ---------------------------------------------------------------------------
+# Depth limit
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_spawn_rejects_beyond_max_depth(db_with_root: tuple[Database, str]) -> None:
+    db, root_id = db_with_root
+    persona = _make_persona()
+    current = uuid.UUID(root_id)
+    # Chain spawns down to MAX_SUBAGENT_DEPTH (root depth=0; spawn produces 1, 2, 3).
+    for expected_depth in range(1, MAX_SUBAGENT_DEPTH + 1):
+        new_child = await spawn_subagent_session(db, parent_session_id=current, persona=persona)
+        async with db.session() as s:
+            row = await s.get(InteractiveSessionRow, str(new_child))
+        assert row is not None
+        assert row.depth == expected_depth
+        current = new_child
+
+    # Now `current` has depth=MAX_SUBAGENT_DEPTH (3) → spawn must reject.
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        await spawn_subagent_session(db, parent_session_id=current, persona=persona)
+    assert exc_info.value.code == "subagent_depth_exceeded"
+
+
+# ---------------------------------------------------------------------------
+# Invalid parent
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_spawn_missing_parent_raises(db_with_root: tuple[Database, str]) -> None:
+    db, _root_id = db_with_root
+    persona = _make_persona()
+    bogus = uuid.uuid4()
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        await spawn_subagent_session(db, parent_session_id=bogus, persona=persona)
+    assert exc_info.value.code == "parent_session_missing"
+
+
+@pytest.mark.asyncio
+async def test_spawn_ended_parent_raises(db_with_root: tuple[Database, str]) -> None:
+    db, root_id = db_with_root
+    async with db.session() as s:
+        row = await s.get(InteractiveSessionRow, root_id)
+        assert row is not None
+        row.state = "ended"
+        await s.commit()
+    persona = _make_persona()
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        await spawn_subagent_session(db, parent_session_id=uuid.UUID(root_id), persona=persona)
+    assert exc_info.value.code == "parent_session_ended"
+
+
+# ---------------------------------------------------------------------------
+# list_subagents
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_list_subagents_returns_direct_children_only(
+    db_with_root: tuple[Database, str],
+) -> None:
+    db, root_id = db_with_root
+    persona = _make_persona()
+
+    # root → child_a → grandchild
+    child_a = await spawn_subagent_session(
+        db, parent_session_id=uuid.UUID(root_id), persona=persona
+    )
+    child_b = await spawn_subagent_session(
+        db, parent_session_id=uuid.UUID(root_id), persona=persona
+    )
+    grandchild = await spawn_subagent_session(db, parent_session_id=child_a, persona=persona)
+
+    direct = await list_subagents(db, uuid.UUID(root_id))
+    ids = [r.id for r in direct]
+    assert str(child_a) in ids
+    assert str(child_b) in ids
+    assert str(grandchild) not in ids  # depth-2 not in direct children
+    assert len(direct) == 2
+
+
+@pytest.mark.asyncio
+async def test_list_subagents_no_children_returns_empty(
+    db_with_root: tuple[Database, str],
+) -> None:
+    db, root_id = db_with_root
+    direct = await list_subagents(db, uuid.UUID(root_id))
+    assert direct == []
+
+
+# ---------------------------------------------------------------------------
+# resolve_root_session_id and persona upsert
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_resolve_root_session_id_walks_to_root(
+    db_with_root: tuple[Database, str],
+) -> None:
+    db, root_id = db_with_root
+    persona = _make_persona()
+    child = await spawn_subagent_session(db, parent_session_id=uuid.UUID(root_id), persona=persona)
+    grand = await spawn_subagent_session(db, parent_session_id=child, persona=persona)
+    great = await spawn_subagent_session(db, parent_session_id=grand, persona=persona)
+
+    assert (await resolve_root_session_id(db, uuid.UUID(root_id))) == uuid.UUID(root_id)
+    assert (await resolve_root_session_id(db, child)) == uuid.UUID(root_id)
+    assert (await resolve_root_session_id(db, grand)) == uuid.UUID(root_id)
+    assert (await resolve_root_session_id(db, great)) == uuid.UUID(root_id)
+
+
+@pytest.mark.asyncio
+async def test_resolve_root_session_id_missing_returns_input(
+    db_with_root: tuple[Database, str],
+) -> None:
+    db, _root_id = db_with_root
+    bogus = uuid.uuid4()
+    assert (await resolve_root_session_id(db, bogus)) == bogus
+
+
+@pytest.mark.asyncio
+async def test_spawn_reuses_persona_row_for_same_hash(
+    db_with_root: tuple[Database, str],
+) -> None:
+    db, root_id = db_with_root
+    persona = _make_persona("shared-persona")
+
+    child_a = await spawn_subagent_session(
+        db, parent_session_id=uuid.UUID(root_id), persona=persona
+    )
+    child_b = await spawn_subagent_session(
+        db, parent_session_id=uuid.UUID(root_id), persona=persona
+    )
+
+    async with db.session() as s:
+        a = await s.get(InteractiveSessionRow, str(child_a))
+        b = await s.get(InteractiveSessionRow, str(child_b))
+    assert a is not None and b is not None
+    assert a.persona_id == b.persona_id
+    assert a.persona_hash == b.persona_hash
+    # No duplicate AgentPersonaRow.
+    async with db.session() as s:
+        cfg = load_config(workspace_root=Path.cwd(), data_dir=Path.cwd() / "data")  # noqa: F841
+        from sqlalchemy import select
+
+        rows = (
+            (
+                await s.execute(
+                    select(AgentPersonaRow).where(AgentPersonaRow.hash == persona.compute_hash())
+                )
+            )
+            .scalars()
+            .all()
+        )
+        assert len(rows) == 1
--- a/my-deepagent/tests/integration/test_user_dirs.py
+++ b/my-deepagent/tests/integration/test_user_dirs.py
@@ -0,0 +1,204 @@
+"""v0.3 PR #9 — User-scope persona/workflow directory tests.
+
+Covers:
+1. `ensure_user_dirs_initialized` creates both directories (idempotent).
+2. `load_combined_personas` returns seed + user, deduplicated by (name, version).
+3. User entries override seed entries with the same key (last-wins).
+4. Malformed user persona files are logged + skipped (don't kill the REPL).
+5. `load_combined_workflows` mirrors the persona behaviour for workflow YAMLs.
+6. Empty user dirs → seed-only.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+from textwrap import dedent
+
+from my_deepagent.config import load_config
+from my_deepagent.user_dirs import (
+    ensure_user_dirs_initialized,
+    load_combined_personas,
+    load_combined_workflows,
+    user_personas_dir,
+    user_workflows_dir,
+)
+
+
+def _write_persona_yaml(
+    target: Path,
+    *,
+    name: str,
+    version: int = 1,
+    model: str = "openrouter:deepseek/deepseek-chat",
+    backend: str = "openrouter",
+    capabilities: list[str] | None = None,
+) -> None:
+    target.parent.mkdir(parents=True, exist_ok=True)
+    caps = capabilities or ["code_edit"]
+    cap_lines = ("\n" + " " * 12).join(f"  - {c}" for c in caps)  # align with template indent
+    target.write_text(
+        dedent(
+            f"""\
+            name: {name}
+            version: {version}
+            backend: {backend}
+            model: "{model}"
+            provider_origin: "CN/DeepSeek"
+            capabilities:
+            {cap_lines}
+            max_risk_level: medium
+            system_prompt: |
+              Test persona system prompt (must be ≥10 chars).
+            allowed_tools:
+              - read_file
+              - write_file
+            deepagents_backend: state
+            """
+        ),
+        encoding="utf-8",
+    )
+
+
+def _write_workflow_yaml(target: Path, *, name: str, version: int = 1) -> None:
+    target.parent.mkdir(parents=True, exist_ok=True)
+    target.write_text(
+        dedent(
+            f"""\
+            name: {name}
+            version: {version}
+            description: "test workflow {name}"
+            roles:
+              - id: writer
+                required_capabilities: [code_edit]
+            phases:
+              - key: p1
+                title: "first phase"
+                risk: medium
+                role: writer
+                gates: []
+                expected_artifact:
+                  path: artifacts/foo.md
+                  schema: text
+                instructions: "do something useful in this phase"
+            """
+        ),
+        encoding="utf-8",
+    )
+
+
+# ---------------------------------------------------------------------------
+# Bootstrap
+# ---------------------------------------------------------------------------
+
+
+def test_ensure_user_dirs_creates_both(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    ensure_user_dirs_initialized(cfg)
+    assert user_personas_dir(cfg).is_dir()
+    assert user_workflows_dir(cfg).is_dir()
+
+
+def test_ensure_user_dirs_is_idempotent(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    ensure_user_dirs_initialized(cfg)
+    # Drop a file to make sure repeat doesn't wipe it.
+    _write_persona_yaml(user_personas_dir(cfg) / "p.yaml", name="custom-test")
+    ensure_user_dirs_initialized(cfg)
+    assert (user_personas_dir(cfg) / "p.yaml").is_file()
+
+
+# ---------------------------------------------------------------------------
+# load_combined_personas
+# ---------------------------------------------------------------------------
+
+
+def test_load_combined_personas_returns_seed_only_when_no_user(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    seed = tmp_path / "seed"
+    _write_persona_yaml(seed / "a.yaml", name="alpha")
+    _write_persona_yaml(seed / "b.yaml", name="bravo")
+    personas = load_combined_personas(cfg, seed)
+    names = sorted(p.name for p in personas)
+    assert names == ["alpha", "bravo"]
+
+
+def test_load_combined_personas_adds_user(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    seed = tmp_path / "seed"
+    _write_persona_yaml(seed / "a.yaml", name="alpha")
+    _write_persona_yaml(user_personas_dir(cfg) / "user.yaml", name="my-custom")
+    personas = load_combined_personas(cfg, seed)
+    names = sorted(p.name for p in personas)
+    assert names == ["alpha", "my-custom"]
+
+
+def test_load_combined_personas_user_overrides_seed(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    seed = tmp_path / "seed"
+    _write_persona_yaml(seed / "alpha.yaml", name="alpha", model="seed-model")
+    _write_persona_yaml(user_personas_dir(cfg) / "alpha.yaml", name="alpha", model="user-model")
+    personas = load_combined_personas(cfg, seed)
+    assert len(personas) == 1
+    assert personas[0].name == "alpha"
+    assert personas[0].model == "user-model"  # user wins
+
+
+def test_load_combined_personas_skips_malformed_user_file(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    seed = tmp_path / "seed"
+    _write_persona_yaml(seed / "a.yaml", name="alpha")
+    bad = user_personas_dir(cfg) / "broken.yaml"
+    bad.parent.mkdir(parents=True, exist_ok=True)
+    bad.write_text("not: a valid: persona:::", encoding="utf-8")
+    # Should not raise — broken file is logged + skipped.
+    personas = load_combined_personas(cfg, seed)
+    # Seed alpha is still present.
+    assert any(p.name == "alpha" for p in personas)
+
+
+# ---------------------------------------------------------------------------
+# load_combined_workflows
+# ---------------------------------------------------------------------------
+
+
+def test_load_combined_workflows_seed_only(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    seed = tmp_path / "wf-seed"
+    _write_workflow_yaml(seed / "a.yaml", name="wfa")
+    workflows = load_combined_workflows(cfg, seed)
+    names = sorted(t.name for (_p, t) in workflows)
+    assert names == ["wfa"]
+
+
+def test_load_combined_workflows_user_overrides_seed(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    seed = tmp_path / "wf-seed"
+    _write_workflow_yaml(seed / "wfa.yaml", name="wfa", version=1)
+    _write_workflow_yaml(user_workflows_dir(cfg) / "wfa.yaml", name="wfa", version=1)
+    workflows = load_combined_workflows(cfg, seed)
+    # Dedupe by (name, version): only the user copy remains.
+    assert len(workflows) == 1
+    path, tpl = workflows[0]
+    assert tpl.name == "wfa"
+    assert path.parent == user_workflows_dir(cfg)
+
+
+def test_load_combined_workflows_user_adds_distinct(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    seed = tmp_path / "wf-seed"
+    _write_workflow_yaml(seed / "a.yaml", name="wfa")
+    _write_workflow_yaml(user_workflows_dir(cfg) / "user.yaml", name="userwf")
+    workflows = load_combined_workflows(cfg, seed)
+    names = sorted(t.name for (_p, t) in workflows)
+    assert names == ["userwf", "wfa"]
+
+
+def test_load_combined_workflows_skips_malformed(tmp_path: Path) -> None:
+    cfg = load_config(workspace_root=tmp_path, data_dir=tmp_path / "data")
+    seed = tmp_path / "wf-seed"
+    _write_workflow_yaml(seed / "a.yaml", name="wfa")
+    bad = user_workflows_dir(cfg) / "broken.yaml"
+    bad.parent.mkdir(parents=True, exist_ok=True)
+    bad.write_text("not: a workflow:::", encoding="utf-8")
+    workflows = load_combined_workflows(cfg, seed)
+    assert any(t.name == "wfa" for (_p, t) in workflows)
--- a/my-deepagent/tests/integration/test_workflow_generator.py
+++ b/my-deepagent/tests/integration/test_workflow_generator.py
@@ -0,0 +1,202 @@
+"""v0.4 — Workflow generator UI + hot-reload tests.
+
+Covers:
+1. POST /api/workflows persists a YAML under <data_dir>/workflows/
+2. POST rejects malformed body with 422
+3. POST rejects duplicate (name, version) with 409
+4. GET /api/workflows hot-reloads after a POST creates a new file
+5. GET /api/workflows hot-reloads when a file is dropped into the dir externally
+6. /new-workflow.html serves with the page marker
+"""
+
+from __future__ import annotations
+
+from collections.abc import AsyncIterator
+from pathlib import Path
+
+import pytest
+import yaml
+from fastapi import FastAPI
+from httpx import ASGITransport, AsyncClient
+
+from my_deepagent.api.app import create_app
+from my_deepagent.config import load_config
+from my_deepagent.persistence.db import Database
+
+
+@pytest.fixture
+async def app_client(tmp_path: Path) -> AsyncIterator[tuple[AsyncClient, FastAPI, Path]]:
+    db_url = f"sqlite+aiosqlite:///{tmp_path / 'gen.sqlite3'}"
+    cfg = load_config(
+        workspace_root=tmp_path,
+        data_dir=tmp_path / "data",
+        database_url=db_url,
+    )
+    db = Database(db_url)
+    await db.init_schema()
+    await db.dispose()
+    app = create_app(cfg)
+    transport = ASGITransport(app=app)
+    async with app.router.lifespan_context(app):
+        async with AsyncClient(transport=transport, base_url="http://test", timeout=10.0) as client:
+            yield (client, app, cfg.data_dir)
+
+
+def _valid_body(name: str = "my-flow", version: int = 1) -> dict[str, object]:
+    return {
+        "name": name,
+        "version": version,
+        "description": "test workflow generator",
+        "roles": [{"id": "writer", "required_capabilities": ["code_edit"]}],
+        "phases": [
+            {
+                "key": "p1",
+                "title": "first phase",
+                "risk": "medium",
+                "role": "writer",
+                "instructions": "do something useful in this phase",
+            }
+        ],
+    }
+
+
+# ---------------------------------------------------------------------------
+# Static page
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_new_workflow_page_served(
+    app_client: tuple[AsyncClient, FastAPI, Path],
+) -> None:
+    client, _app, _dir = app_client
+    r = await client.get("/new-workflow.html")
+    assert r.status_code == 200
+    assert 'data-page="new-workflow"' in r.text
+    assert "워크플로우 템플릿 만들기" in r.text
+
+
+# ---------------------------------------------------------------------------
+# POST /api/workflows happy path
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_post_workflow_creates_yaml_under_data_dir(
+    app_client: tuple[AsyncClient, FastAPI, Path],
+) -> None:
+    client, _app, data_dir = app_client
+    r = await client.post("/api/workflows", json=_valid_body())
+    assert r.status_code == 201, r.text
+    body = r.json()
+    target = Path(body["path"])
+    assert target.is_file()
+    assert target.parent == data_dir / "workflows"
+    assert target.name == "my-flow@1.yaml"
+    parsed = yaml.safe_load(target.read_text(encoding="utf-8"))
+    assert parsed["name"] == "my-flow"
+    assert parsed["version"] == 1
+    assert parsed["phases"][0]["key"] == "p1"
+
+
+# ---------------------------------------------------------------------------
+# Validation rejection
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_post_workflow_rejects_missing_roles(
+    app_client: tuple[AsyncClient, FastAPI, Path],
+) -> None:
+    client, _app, _dir = app_client
+    bad = _valid_body()
+    bad["roles"] = []  # min_length=1 violation
+    r = await client.post("/api/workflows", json=bad)
+    assert r.status_code == 422
+
+
+@pytest.mark.asyncio
+async def test_post_workflow_rejects_phase_referencing_unknown_role(
+    app_client: tuple[AsyncClient, FastAPI, Path],
+) -> None:
+    client, _app, _dir = app_client
+    bad = _valid_body()
+    bad["phases"][0]["role"] = "ghost-role"  # type: ignore[index]
+    r = await client.post("/api/workflows", json=bad)
+    assert r.status_code == 422
+    assert "ghost-role" in r.text
+
+
+# ---------------------------------------------------------------------------
+# Duplicate refusal
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_post_workflow_rejects_duplicate_name_version(
+    app_client: tuple[AsyncClient, FastAPI, Path],
+) -> None:
+    client, _app, _dir = app_client
+    body = _valid_body("dup-flow", 1)
+    r1 = await client.post("/api/workflows", json=body)
+    assert r1.status_code == 201
+    r2 = await client.post("/api/workflows", json=body)
+    assert r2.status_code == 409
+    assert "already exists" in r2.text
+
+
+# ---------------------------------------------------------------------------
+# Hot-reload — new file appears in GET
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_get_workflows_hot_reloads_after_post(
+    app_client: tuple[AsyncClient, FastAPI, Path],
+) -> None:
+    client, _app, _dir = app_client
+    before = await client.get("/api/workflows")
+    before_names = {w["name"] for w in before.json()}
+    assert "fresh-flow" not in before_names
+
+    r = await client.post("/api/workflows", json=_valid_body("fresh-flow", 1))
+    assert r.status_code == 201
+
+    after = await client.get("/api/workflows")
+    after_names = {w["name"] for w in after.json()}
+    assert "fresh-flow" in after_names
+
+
+@pytest.mark.asyncio
+async def test_get_workflows_hot_reloads_after_external_file_drop(
+    app_client: tuple[AsyncClient, FastAPI, Path],
+) -> None:
+    """Even when the file is dropped directly into the dir (not via POST),
+    the next GET picks it up via the mtime fingerprint."""
+    from textwrap import dedent
+
+    client, _app, data_dir = app_client
+    wf_dir = data_dir / "workflows"
+    wf_dir.mkdir(parents=True, exist_ok=True)
+    (wf_dir / "external@1.yaml").write_text(
+        dedent(
+            """\
+            name: external
+            version: 1
+            description: dropped by hand
+            roles:
+              - id: writer
+                required_capabilities: [code_edit]
+            phases:
+              - key: p1
+                title: only phase
+                risk: low
+                role: writer
+                instructions: just write something to disk
+            """
+        ),
+        encoding="utf-8",
+    )
+    r = await client.get("/api/workflows")
+    names = {w["name"] for w in r.json()}
+    assert "external" in names