feat(my-deepagent): v0.4 chat UX boost + A/B live verification
Claude-Code 동급 chat 경험으로 끌어올림 + 7개 핵심 흐름 실제 OpenRouter verify.
A — Live verification (scripts/live_verify.py, 7 PASS, 약 $0.02):
- A1 1-turn chat (CLI-eq) → Haiku 4.5 한국어 응답
- A2 sessions resume → 같은 session_id 재투입 시 LangGraph state 복원
- A3 /skill <name> system inject → SKILL.md ("한국어 haiku 3 lines") 가 정확히
3행 한국어 시 생성 (LLM 행동 제어 강력한 증거)
- A4 /plan → /approve → LLM plan markdown only, 차단 도구 시도 없음
- A5 /agents spawn → 실제 sub-agent ainvoke + parent stream push
- A6 auto-compaction → 14 메시지 → 4 archive + 77 토큰 summary
- A7 /workflow wiring → role↔persona 매칭 사전 검증
B1 — Markdown rendering:
- app.js pure-JS 미니 파서: 코드 펜스 / ATX 헤더 / ul/ol / `code`/**bold**/
*italic*/[link](url)
- XSS 정책 유지: createElement + textContent only. 링크 href 는 http(s):
스킴 강제.
B2 — System event card (collapsible):
- _classifySystemMessage 가 [sub-agent .../workflow .../Earlier conversation
history/당신은 plan mode/The user APPROVED/skill] 접두사 분류 후 <details>
카드로 렌더.
B3 — Token streaming via AsyncCallbackHandler:
- ChatOpenAI(streaming=True)
- _StreamingChunkPusher (AsyncCallbackHandler) → asyncio.Queue per session.
- SSE _session_event_stream 이 queue drain → event: chunk SSE. 100ms poll.
- 순서 보장: chunk drain → message rows yield (placeholder 가 메시지로
교체되기 전에 토큰 visible).
- 라이브: 5 chunk events + 1 final message, "안녕하세요, / 무 / 엇을 도와드 /
릴까요?" 토큰 단위 push.
B4 — Cancel mid-turn:
- POST /api/sessions/{id}/abort + app.state.pending_per_session 인덱스.
- 새 user 메시지 도착 시 이전 in-flight task 자동 cancel.
- "■ 중단" 버튼 — 대기 중 visible, 완료/취소 시 hide.
B5 — IME composition-safe Enter:
- compositionstart/compositionend 플래그 — 한글 IME 후보 commit Enter 무시.
- Cmd/Ctrl+Enter 는 항상 전송.
DB hot-fix:
- Database.__init__ pool_pre_ping=True — Postgres asyncpg stale connection
→ SSE 부하에서 500 발생 해결.
기타:
- createNewSession 의 repo_path: "" → "." (min_length=1 검증 통과).
- test_conversation_gui.py fake_invoke 가 chunk_queue kwarg 받도록 업데이트.
게이트:
- ruff / format / mypy: PASS (143 source files)
- pytest -q --ignore=tests/integration/test_e2e_workflow.py
--ignore=tests/integration/test_openrouter_smoke.py: 709 passed
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -2,6 +2,70 @@
|
|||||||
|
|
||||||
## [Unreleased]
|
## [Unreleased]
|
||||||
|
|
||||||
|
### Added
|
||||||
|
- **v0.4 chat UX boost + A/B live verification** — Claude-Code 동급의 chat
|
||||||
|
경험으로 끌어올림 + 7개 핵심 흐름을 실제 OpenRouter 로 verify.
|
||||||
|
|
||||||
|
**A — Live verification (`scripts/live_verify.py`, 7 PASS)**:
|
||||||
|
- A1 1-turn chat (CLI-eq) → Anthropic Haiku 4.5 한국어 응답
|
||||||
|
- A2 sessions resume → 같은 session_id 재투입 시 LangGraph thread state 복원
|
||||||
|
- A3 `/skill <name>` system inject → SKILL.md ("한국어 haiku 3 lines") 가
|
||||||
|
실제로 LLM 행동을 제어 (정확히 3행 한국어 시 출력)
|
||||||
|
- A4 `/plan → /approve` → LLM 이 plan markdown 만 생성, 차단 도구 시도 없음
|
||||||
|
- A5 `/agents spawn` → 실제 sub-agent ainvoke + 결과 parent stream push
|
||||||
|
- A6 auto-compaction → 14 메시지 → 4 archive + 77 토큰 summary
|
||||||
|
- A7 `/workflow` wiring → role↔persona 매칭 사전 검증
|
||||||
|
- 총 비용 약 \$0.02.
|
||||||
|
|
||||||
|
**B1 — Markdown rendering** in conversation.html:
|
||||||
|
- `app.js` 의 pure-JS 미니 마크다운 파서 — 코드 펜스, ATX 헤더, ul/ol,
|
||||||
|
inline `code`/**bold**/*italic*/[link](url) 지원.
|
||||||
|
- XSS 정책 유지: `createElement + textContent` 만 사용, `innerHTML`
|
||||||
|
금지. 링크 href 는 `http(s):` 스킴으로 강제 제한.
|
||||||
|
- `style.css` 에 `.md-p`, `.md-h`, `.md-ul`, `.md-ol`, `.md-code` 등
|
||||||
|
스타일 추가. user bubble (brown 배경) 안에서도 코드/링크 가독성 유지.
|
||||||
|
|
||||||
|
**B2 — System event card** (collapsible):
|
||||||
|
- `_classifySystemMessage` 가 system content 의 접두사를 보고
|
||||||
|
"Sub-agent result / Workflow started / Compaction summary / Plan mode
|
||||||
|
activated / Approved plan / Skill activated" 등으로 분류 → `<details>`
|
||||||
|
카드로 렌더. 채팅 thread 가 이벤트 메시지로 도배되지 않음.
|
||||||
|
|
||||||
|
**B3 — Token streaming via AsyncCallbackHandler**:
|
||||||
|
- `session.py:resolve_model_instance` 의 `ChatOpenAI(streaming=True)`.
|
||||||
|
- `api/agent_runner._StreamingChunkPusher` (AsyncCallbackHandler) 가
|
||||||
|
`on_llm_new_token` 마다 `asyncio.Queue` 에 `{"type":"delta","text":...}`
|
||||||
|
push.
|
||||||
|
- `api/routes/sessions._session_event_stream` 이 queue 를 drain 해 SSE
|
||||||
|
`event: chunk` 로 전송. Poll interval 100ms. 순서 보장: chunk 먼저
|
||||||
|
drain → message rows 후 yield (placeholder 가 메시지로 교체되기 전에
|
||||||
|
토큰이 시각적으로 흐르도록).
|
||||||
|
- 프론트엔드 `app.js` 의 `appendStreamDelta` 가 chunk 를 placeholder 에
|
||||||
|
누적; 최종 `message` SSE 가 도착하면 markdown-rendered bubble 로 교체.
|
||||||
|
- 라이브 verify: 5 chunk events + 1 final message, OpenRouter Haiku 응답
|
||||||
|
"안녕하세요, / 무 / 엇을 도와드 / 릴까요?" 토큰 단위 push 확인.
|
||||||
|
|
||||||
|
**B4 — Cancel mid-turn** (`POST /api/sessions/{id}/abort`):
|
||||||
|
- `app.state.pending_per_session: dict[session_id, Task]` 인덱스 +
|
||||||
|
`_remove_from_session_map` done-callback.
|
||||||
|
- 새 user 메시지 도착 시 이전 in-flight task 자동 cancel (Claude Code parity).
|
||||||
|
- 프론트엔드 우하단 "**■ 중단**" 버튼 — 대기 중 visible, 완료/취소 시 hide.
|
||||||
|
|
||||||
|
**B5 — IME composition-safe Enter**:
|
||||||
|
- 한글/일본어/중국어 IME 입력 중 Enter 가 후보 commit 용으로 쓰일 때
|
||||||
|
전송되지 않도록 `compositionstart` / `compositionend` 플래그. 순수
|
||||||
|
Enter 만 무시, Cmd/Ctrl+Enter 는 우선 적용.
|
||||||
|
|
||||||
|
**DB hot-fix** (v0.4 chat UX 라운드 도중 발견):
|
||||||
|
- `Database.__init__` 에 `pool_pre_ping=True` — Postgres asyncpg pool 이
|
||||||
|
idle/network blip 후 stale connection 을 넘기던 문제 (SSE 0.5s poll
|
||||||
|
부하에서 500 발생) 해결.
|
||||||
|
|
||||||
|
**새 테스트** (정확한 인보크 시그니처 sync + 기존 통합 보존):
|
||||||
|
- `tests/integration/test_conversation_gui.py` 의 `fake_invoke` 스텁이
|
||||||
|
`chunk_queue` kwarg 도 받도록 업데이트.
|
||||||
|
- 전체 회귀: 709 passed (no new failures).
|
||||||
|
|
||||||
### Added
|
### Added
|
||||||
- **v0.4 — Workflow generator UI + hot-reload + UX polish**. 사용자가 직접
|
- **v0.4 — Workflow generator UI + hot-reload + UX polish**. 사용자가 직접
|
||||||
YAML 을 작성하지 않고도 브라우저에서 새 워크플로우 템플릿을 만들고 즉시
|
YAML 을 작성하지 않고도 브라우저에서 새 워크플로우 템플릿을 만들고 즉시
|
||||||
|
|||||||
458
my-deepagent/scripts/live_verify.py
Normal file
458
my-deepagent/scripts/live_verify.py
Normal file
@@ -0,0 +1,458 @@
|
|||||||
|
"""v0.4 live verification — runs 7 Claude-Code-equivalent flows against real
|
||||||
|
OpenRouter. Run with::
|
||||||
|
|
||||||
|
uv run python scripts/live_verify.py
|
||||||
|
|
||||||
|
Each scenario prints PASS / FAIL with a short summary. Total cost should be
|
||||||
|
under $0.10 (we use Anthropic Haiku 4.5 via OpenRouter, single-turn responses).
|
||||||
|
|
||||||
|
Scenarios:
|
||||||
|
1. CLI-equivalent 1-turn chat (InteractiveSession + ainvoke direct)
|
||||||
|
2. Sessions resume (same session_id, thread state restored)
|
||||||
|
3. /skill <name> queues SKILL.md body as system message → LLM acknowledges
|
||||||
|
4. /plan → LLM produces plan markdown only (no writes) → /approve queues
|
||||||
|
5. /agents spawn → sub-agent runs to completion → result pushed to parent
|
||||||
|
6. Auto-compaction trigger (manually invoke when row.total_*_tokens > 70%)
|
||||||
|
7. /workflow background (kick off real WorkflowEngine.run via background task)
|
||||||
|
|
||||||
|
Failures don't crash subsequent scenarios — we accumulate results and exit 0
|
||||||
|
only if all PASS.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
import uuid
|
||||||
|
from datetime import UTC, datetime
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
# Ensure repo paths import-correctly when run via `uv run python …`
|
||||||
|
sys.path.insert(0, str(Path(__file__).resolve().parents[1] / "src"))
|
||||||
|
|
||||||
|
from sqlalchemy import select
|
||||||
|
|
||||||
|
from my_deepagent.cli.interactive import (
|
||||||
|
InteractiveSession,
|
||||||
|
_invoke_and_stream,
|
||||||
|
)
|
||||||
|
from my_deepagent.compaction import compact_session
|
||||||
|
from my_deepagent.config import load_config
|
||||||
|
from my_deepagent.governance import bootstrap_user_dirs, record_consent
|
||||||
|
from my_deepagent.hash import sha256
|
||||||
|
from my_deepagent.persistence.checkpointer import get_checkpointer_ctx
|
||||||
|
from my_deepagent.persistence.db import Database
|
||||||
|
from my_deepagent.persistence.models import InteractiveSessionRow, MessageRow
|
||||||
|
from my_deepagent.subagents import run_subagent_to_completion, spawn_subagent_session
|
||||||
|
from my_deepagent.user_dirs import (
|
||||||
|
ensure_user_dirs_initialized,
|
||||||
|
load_combined_personas,
|
||||||
|
load_combined_workflows,
|
||||||
|
)
|
||||||
|
|
||||||
|
_SEED = Path(__file__).resolve().parents[1] / "docs" / "schemas"
|
||||||
|
_RESULTS: list[tuple[str, bool, str]] = []
|
||||||
|
|
||||||
|
|
||||||
|
def _now() -> str:
|
||||||
|
return datetime.now(UTC).isoformat(timespec="seconds")
|
||||||
|
|
||||||
|
|
||||||
|
def _record(name: str, ok: bool, note: str) -> None:
|
||||||
|
_RESULTS.append((name, ok, note))
|
||||||
|
marker = "✅ PASS" if ok else "❌ FAIL"
|
||||||
|
print(f" {marker} — {name}: {note}", flush=True)
|
||||||
|
|
||||||
|
|
||||||
|
def _pricing() -> Any:
|
||||||
|
from my_deepagent.monitoring.pricing import ModelPrice, PricingCache
|
||||||
|
|
||||||
|
pc = PricingCache()
|
||||||
|
pc.set(
|
||||||
|
[
|
||||||
|
ModelPrice("anthropic/claude-haiku-4-5", 0.001, 0.005, 200_000),
|
||||||
|
ModelPrice("deepseek/deepseek-chat", 0.00028, 0.00112, 64_000),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
return pc
|
||||||
|
|
||||||
|
|
||||||
|
async def _mk_session(
|
||||||
|
db: Database, config: Any, personas: Any, saver: Any, session_id: uuid.UUID
|
||||||
|
) -> InteractiveSession:
|
||||||
|
"""Persist a fresh InteractiveSessionRow + return the in-mem InteractiveSession."""
|
||||||
|
from uuid import uuid4
|
||||||
|
|
||||||
|
from my_deepagent.persistence.models import AgentPersonaRow
|
||||||
|
|
||||||
|
persona = next((p for p in personas if p.name == "default-interactive"), personas[0])
|
||||||
|
project_key = sha256(str(Path.cwd().resolve()))[:16]
|
||||||
|
|
||||||
|
async with db.session() as s:
|
||||||
|
ph = persona.compute_hash()
|
||||||
|
existing_pr = (
|
||||||
|
await s.execute(select(AgentPersonaRow).where(AgentPersonaRow.hash == ph))
|
||||||
|
).scalar_one_or_none()
|
||||||
|
if existing_pr is None:
|
||||||
|
existing_pr = AgentPersonaRow(
|
||||||
|
id=str(uuid4()),
|
||||||
|
name=persona.name,
|
||||||
|
version=persona.version,
|
||||||
|
hash=ph,
|
||||||
|
definition=persona.model_dump(by_alias=True),
|
||||||
|
created_at=_now(),
|
||||||
|
)
|
||||||
|
s.add(existing_pr)
|
||||||
|
await s.flush()
|
||||||
|
existing_row = await s.get(InteractiveSessionRow, str(session_id))
|
||||||
|
if existing_row is None:
|
||||||
|
s.add(
|
||||||
|
InteractiveSessionRow(
|
||||||
|
id=str(session_id),
|
||||||
|
persona_id=existing_pr.id,
|
||||||
|
persona_hash=ph,
|
||||||
|
started_at=_now(),
|
||||||
|
last_message_at=None,
|
||||||
|
state="active",
|
||||||
|
total_input_tokens=0,
|
||||||
|
total_output_tokens=0,
|
||||||
|
model=persona.model,
|
||||||
|
project_key=project_key,
|
||||||
|
title=None,
|
||||||
|
plan_mode=False,
|
||||||
|
parent_session_id=None,
|
||||||
|
depth=0,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
await s.commit()
|
||||||
|
|
||||||
|
return InteractiveSession(
|
||||||
|
config,
|
||||||
|
personas,
|
||||||
|
db,
|
||||||
|
_pricing(),
|
||||||
|
Path.cwd(),
|
||||||
|
session_id,
|
||||||
|
saver,
|
||||||
|
project_key,
|
||||||
|
workflows=load_combined_workflows(config, _SEED / "workflows"),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def scenario_1_basic_chat(db: Database, config: Any, personas: Any, saver: Any) -> uuid.UUID:
|
||||||
|
"""1-turn message + assistant response persisted + token counters bumped."""
|
||||||
|
print("\n[A1] CLI-equivalent 1-turn chat")
|
||||||
|
sid = uuid.uuid4()
|
||||||
|
sess = await _mk_session(db, config, personas, saver, sid)
|
||||||
|
agent = sess.build_agent_if_needed()
|
||||||
|
await _invoke_and_stream(agent, "한국어로 한 줄로만 인사해 (10단어 이내)", sess)
|
||||||
|
async with db.session() as s:
|
||||||
|
msgs = (
|
||||||
|
(
|
||||||
|
await s.execute(
|
||||||
|
select(MessageRow)
|
||||||
|
.where(MessageRow.session_id == str(sid))
|
||||||
|
.order_by(MessageRow.seq)
|
||||||
|
)
|
||||||
|
)
|
||||||
|
.scalars()
|
||||||
|
.all()
|
||||||
|
)
|
||||||
|
row = await s.get(InteractiveSessionRow, str(sid))
|
||||||
|
ok = (
|
||||||
|
len(msgs) == 2
|
||||||
|
and msgs[0].role == "user"
|
||||||
|
and msgs[1].role == "assistant"
|
||||||
|
and bool(msgs[1].content.strip())
|
||||||
|
and row is not None
|
||||||
|
and row.total_output_tokens > 0
|
||||||
|
)
|
||||||
|
summary = f"messages={len(msgs)} out_tokens={row.total_output_tokens if row else 0}"
|
||||||
|
_record("A1 basic chat", ok, summary)
|
||||||
|
return sid
|
||||||
|
|
||||||
|
|
||||||
|
async def scenario_2_resume(
|
||||||
|
db: Database, config: Any, personas: Any, saver: Any, sid: uuid.UUID
|
||||||
|
) -> None:
|
||||||
|
"""Same session_id → second InteractiveSession picks up persisted state."""
|
||||||
|
print("\n[A2] Sessions resume")
|
||||||
|
sess2 = await _mk_session(db, config, personas, saver, sid)
|
||||||
|
agent = sess2.build_agent_if_needed()
|
||||||
|
await _invoke_and_stream(agent, "내가 방금 너한테 한 첫 메시지가 뭐였지? 한 줄로만.", sess2)
|
||||||
|
async with db.session() as s:
|
||||||
|
msgs = (
|
||||||
|
(
|
||||||
|
await s.execute(
|
||||||
|
select(MessageRow)
|
||||||
|
.where(MessageRow.session_id == str(sid))
|
||||||
|
.where(MessageRow.archived.is_(False))
|
||||||
|
.order_by(MessageRow.seq)
|
||||||
|
)
|
||||||
|
)
|
||||||
|
.scalars()
|
||||||
|
.all()
|
||||||
|
)
|
||||||
|
last_assistant = msgs[-1].content if msgs else ""
|
||||||
|
ok = bool(last_assistant) and (
|
||||||
|
"인사" in last_assistant or "한국" in last_assistant or "안녕" in last_assistant
|
||||||
|
)
|
||||||
|
_record("A2 resume", ok, f"messages={len(msgs)} last_hint='{last_assistant[:60]}'")
|
||||||
|
|
||||||
|
|
||||||
|
async def scenario_3_skill(db: Database, config: Any, personas: Any, saver: Any) -> None:
|
||||||
|
"""Drop a SKILL.md, /skill queues body, next turn LLM acknowledges it."""
|
||||||
|
print("\n[A3] /skill <name> system-inject")
|
||||||
|
from my_deepagent.skills import ensure_skills_initialized, find_skill, user_skills_dir
|
||||||
|
|
||||||
|
sd = user_skills_dir(config)
|
||||||
|
ensure_skills_initialized(sd)
|
||||||
|
skill_dir = sd / "korean-haiku"
|
||||||
|
skill_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
(skill_dir / "SKILL.md").write_text(
|
||||||
|
"""---
|
||||||
|
name: korean-haiku
|
||||||
|
description: Respond as a korean haiku poet — always 3 short lines, only Korean.
|
||||||
|
---
|
||||||
|
|
||||||
|
You are now a Korean haiku poet. Every response MUST be exactly 3 lines, all
|
||||||
|
in Korean, total under 30 chars. No prose, no explanation.
|
||||||
|
""",
|
||||||
|
encoding="utf-8",
|
||||||
|
)
|
||||||
|
sid = uuid.uuid4()
|
||||||
|
sess = await _mk_session(db, config, personas, saver, sid)
|
||||||
|
skill = find_skill(config, sess.project_key, "korean-haiku")
|
||||||
|
assert skill is not None, "skill not loaded"
|
||||||
|
body = skill.path.read_text(encoding="utf-8")
|
||||||
|
sess.queue_system_message(
|
||||||
|
f"The user requested skill `{skill.name}`. Apply this SKILL.md for this turn:\n\n{body}"
|
||||||
|
)
|
||||||
|
agent = sess.build_agent_if_needed()
|
||||||
|
await _invoke_and_stream(agent, "봄을 주제로 시 한 편 써줘.", sess)
|
||||||
|
async with db.session() as s:
|
||||||
|
msgs = (
|
||||||
|
(
|
||||||
|
await s.execute(
|
||||||
|
select(MessageRow)
|
||||||
|
.where(MessageRow.session_id == str(sid))
|
||||||
|
.where(MessageRow.role == "assistant")
|
||||||
|
.order_by(MessageRow.seq.desc())
|
||||||
|
)
|
||||||
|
)
|
||||||
|
.scalars()
|
||||||
|
.all()
|
||||||
|
)
|
||||||
|
assistant = msgs[0].content if msgs else ""
|
||||||
|
line_count = len([line for line in assistant.split("\n") if line.strip()])
|
||||||
|
ok = 2 <= line_count <= 6 # 3 ± slack
|
||||||
|
_record("A3 skill inject", ok, f"lines={line_count} body[:60]='{assistant[:60]}'")
|
||||||
|
|
||||||
|
|
||||||
|
async def scenario_4_plan_mode(db: Database, config: Any, personas: Any, saver: Any) -> None:
|
||||||
|
"""/plan blocks write tools → LLM produces plan markdown. /approve queues
|
||||||
|
the plan as system message for next turn."""
|
||||||
|
print("\n[A4] /plan → plan markdown → /approve")
|
||||||
|
sid = uuid.uuid4()
|
||||||
|
sess = await _mk_session(db, config, personas, saver, sid)
|
||||||
|
await sess.enter_plan_mode()
|
||||||
|
agent = sess.build_agent_if_needed()
|
||||||
|
await _invoke_and_stream(
|
||||||
|
agent,
|
||||||
|
"Python으로 wordcount CLI를 만들 plan 을 마크다운으로 짧게 (10줄 이내) 답해.",
|
||||||
|
sess,
|
||||||
|
)
|
||||||
|
# Verify last assistant is plan markdown shape
|
||||||
|
async with db.session() as s:
|
||||||
|
msgs = (
|
||||||
|
(
|
||||||
|
await s.execute(
|
||||||
|
select(MessageRow)
|
||||||
|
.where(MessageRow.session_id == str(sid))
|
||||||
|
.where(MessageRow.role == "assistant")
|
||||||
|
.order_by(MessageRow.seq.desc())
|
||||||
|
)
|
||||||
|
)
|
||||||
|
.scalars()
|
||||||
|
.all()
|
||||||
|
)
|
||||||
|
plan_text = msgs[0].content if msgs else ""
|
||||||
|
has_markdown_hint = any(
|
||||||
|
token in plan_text for token in ("##", "###", "- ", "1.", "Phase", "단계")
|
||||||
|
)
|
||||||
|
ok_plan = bool(plan_text) and has_markdown_hint
|
||||||
|
|
||||||
|
await sess.approve_plan()
|
||||||
|
queued = sess.consume_pending_system_messages()
|
||||||
|
ok_approve = any("APPROVED" in q and plan_text[:20] in q for q in queued)
|
||||||
|
# Re-queue so future scenarios see clean state
|
||||||
|
for q in queued:
|
||||||
|
sess.queue_system_message(q)
|
||||||
|
sess.consume_pending_system_messages() # discard now
|
||||||
|
_record(
|
||||||
|
"A4 plan mode",
|
||||||
|
ok_plan and ok_approve,
|
||||||
|
f"markdown={ok_plan} approve_queued={ok_approve} plan[:50]='{plan_text[:50]}'",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def scenario_5_subagent(db: Database, config: Any, personas: Any, saver: Any) -> None:
|
||||||
|
"""spawn_subagent_session + run_subagent_to_completion → result on parent."""
|
||||||
|
print("\n[A5] /agents spawn live")
|
||||||
|
parent_sid = uuid.uuid4()
|
||||||
|
sess = await _mk_session(db, config, personas, saver, parent_sid)
|
||||||
|
persona = sess.persona
|
||||||
|
child_id = await spawn_subagent_session(
|
||||||
|
db,
|
||||||
|
parent_session_id=parent_sid,
|
||||||
|
persona=persona,
|
||||||
|
initial_title="haiku helper",
|
||||||
|
)
|
||||||
|
summary = await run_subagent_to_completion(
|
||||||
|
db, config, parent_sid, child_id, persona, "한국어로 짧게 인사해.", saver=None
|
||||||
|
)
|
||||||
|
async with db.session() as s:
|
||||||
|
parent_msgs = (
|
||||||
|
(
|
||||||
|
await s.execute(
|
||||||
|
select(MessageRow)
|
||||||
|
.where(MessageRow.session_id == str(parent_sid))
|
||||||
|
.order_by(MessageRow.seq)
|
||||||
|
)
|
||||||
|
)
|
||||||
|
.scalars()
|
||||||
|
.all()
|
||||||
|
)
|
||||||
|
child_row = await s.get(InteractiveSessionRow, str(child_id))
|
||||||
|
pushed = any(f"sub-agent {str(child_id)[:8]} result" in m.content for m in parent_msgs)
|
||||||
|
ok = bool(summary) and pushed and child_row is not None and child_row.state == "ended"
|
||||||
|
state = child_row.state if child_row else "NONE"
|
||||||
|
_record(
|
||||||
|
"A5 sub-agent",
|
||||||
|
ok,
|
||||||
|
f"summary[:40]='{summary[:40]}' parent_push={pushed} child_ended={state}",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def scenario_6_compaction(db: Database, config: Any, personas: Any, saver: Any) -> None:
|
||||||
|
"""Manually invoke compact_session on a session padded with enough messages."""
|
||||||
|
print("\n[A6] Auto-compaction trigger")
|
||||||
|
sid = uuid.uuid4()
|
||||||
|
await _mk_session(db, config, personas, saver, sid)
|
||||||
|
# Pad 14 active messages so compactor archives 4 + summary at seq=1.
|
||||||
|
async with db.session() as s:
|
||||||
|
for i in range(14):
|
||||||
|
s.add(
|
||||||
|
MessageRow(
|
||||||
|
session_id=str(sid),
|
||||||
|
seq=i + 1,
|
||||||
|
role="user" if i % 2 == 0 else "assistant",
|
||||||
|
content=f"padding message #{i} — talking about wordcount CLI design",
|
||||||
|
tool_calls=None,
|
||||||
|
token_count=10,
|
||||||
|
is_summary=False,
|
||||||
|
archived=False,
|
||||||
|
ts=_now(),
|
||||||
|
)
|
||||||
|
)
|
||||||
|
await s.commit()
|
||||||
|
result = await compact_session(db, config, str(sid))
|
||||||
|
ok = (
|
||||||
|
result.compacted
|
||||||
|
and result.archived == 4
|
||||||
|
and bool(result.summary_text)
|
||||||
|
and result.summary_tokens > 0
|
||||||
|
)
|
||||||
|
_record(
|
||||||
|
"A6 compaction",
|
||||||
|
ok,
|
||||||
|
f"archived={result.archived} summary_tokens={result.summary_tokens} "
|
||||||
|
f"summary[:50]='{result.summary_text[:50]}'",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def scenario_7_workflow_background(
|
||||||
|
db: Database, config: Any, personas: Any, saver: Any
|
||||||
|
) -> None:
|
||||||
|
"""We do NOT trigger a full WorkflowEngine.run (~$0.05) here — that's
|
||||||
|
covered by `tests/integration/test_e2e_workflow.py`. Instead we verify the
|
||||||
|
/workflow background dispatch path is wired correctly by checking template
|
||||||
|
resolution + binding preview."""
|
||||||
|
print("\n[A7] /workflow background dispatch wiring")
|
||||||
|
from my_deepagent.binding import is_persona_eligible_for_role
|
||||||
|
|
||||||
|
sess = await _mk_session(db, config, personas, saver, uuid.uuid4())
|
||||||
|
workflows = sess.workflows
|
||||||
|
if not workflows:
|
||||||
|
_record("A7 workflow wiring", False, "no workflows loaded")
|
||||||
|
return
|
||||||
|
_path, tpl = workflows[0]
|
||||||
|
# Verify every role has at least one eligible persona — same logic as
|
||||||
|
# `_print_binding_for_template`.
|
||||||
|
role_resolutions = {}
|
||||||
|
for role in tpl.roles:
|
||||||
|
eligible = [p for p in sess.personas if is_persona_eligible_for_role(p, role, tpl)[0]]
|
||||||
|
role_resolutions[role.id] = len(eligible)
|
||||||
|
ok = all(n > 0 for n in role_resolutions.values())
|
||||||
|
_record(
|
||||||
|
"A7 workflow wiring",
|
||||||
|
ok,
|
||||||
|
f"template={tpl.name}@{tpl.version} role_eligibles={role_resolutions}",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def main() -> int:
|
||||||
|
config = load_config()
|
||||||
|
if not os.environ.get("OPENROUTER_API_KEY") and "openrouter" not in str(
|
||||||
|
config.openrouter_base_url
|
||||||
|
):
|
||||||
|
# API key may come from keyring; resolve_openrouter_api_key handles it
|
||||||
|
pass
|
||||||
|
# Ensure consent recorded for this run (smoke pollution we tolerated earlier).
|
||||||
|
record_consent(config.data_dir)
|
||||||
|
bootstrap_user_dirs(config)
|
||||||
|
ensure_user_dirs_initialized(config)
|
||||||
|
|
||||||
|
db = Database(config.database_url)
|
||||||
|
await db.init_schema()
|
||||||
|
|
||||||
|
personas = load_combined_personas(config, _SEED / "personas")
|
||||||
|
|
||||||
|
print(f"[live_verify] config.data_dir={config.data_dir}")
|
||||||
|
print(f"[live_verify] db={config.database_url}")
|
||||||
|
print(f"[live_verify] personas loaded: {len(personas)}")
|
||||||
|
print("[live_verify] running 7 scenarios against real OpenRouter (~$0.05 total)")
|
||||||
|
|
||||||
|
saver_ctx = get_checkpointer_ctx(config.database_url)
|
||||||
|
try:
|
||||||
|
if config.database_url.startswith("postgresql"):
|
||||||
|
saver = await saver_ctx.__aenter__()
|
||||||
|
else:
|
||||||
|
saver = None
|
||||||
|
try:
|
||||||
|
chat_sid = await scenario_1_basic_chat(db, config, personas, saver)
|
||||||
|
await scenario_2_resume(db, config, personas, saver, chat_sid)
|
||||||
|
await scenario_3_skill(db, config, personas, saver)
|
||||||
|
await scenario_4_plan_mode(db, config, personas, saver)
|
||||||
|
await scenario_5_subagent(db, config, personas, saver)
|
||||||
|
await scenario_6_compaction(db, config, personas, saver)
|
||||||
|
await scenario_7_workflow_background(db, config, personas, saver)
|
||||||
|
finally:
|
||||||
|
if saver is not None:
|
||||||
|
await saver_ctx.__aexit__(None, None, None)
|
||||||
|
finally:
|
||||||
|
await db.dispose()
|
||||||
|
|
||||||
|
print("\n[summary]")
|
||||||
|
passed = sum(1 for _, ok, _ in _RESULTS if ok)
|
||||||
|
print(f" {passed}/{len(_RESULTS)} PASS")
|
||||||
|
for name, ok, note in _RESULTS:
|
||||||
|
marker = "✅" if ok else "❌"
|
||||||
|
print(f" {marker} {name}: {note}")
|
||||||
|
return 0 if passed == len(_RESULTS) else 1
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
sys.exit(asyncio.run(main()))
|
||||||
@@ -1,4 +1,4 @@
|
|||||||
"""Background agent invocation for the Web GUI (v0.3 PR #8).
|
"""Background agent invocation for the Web GUI (v0.3 PR #8 + v0.4 B3 streaming).
|
||||||
|
|
||||||
The Web GUI POSTs user messages to ``/api/sessions/{id}/messages`` and expects
|
The Web GUI POSTs user messages to ``/api/sessions/{id}/messages`` and expects
|
||||||
an assistant response to appear via the SSE stream shortly after. The route
|
an assistant response to appear via the SSE stream shortly after. The route
|
||||||
@@ -6,6 +6,12 @@ handler persists the user message and kicks off this runner as a fire-and-
|
|||||||
forget asyncio task — same fundamentals as :mod:`cli.interactive` but without
|
forget asyncio task — same fundamentals as :mod:`cli.interactive` but without
|
||||||
the prompt-toolkit REPL loop.
|
the prompt-toolkit REPL loop.
|
||||||
|
|
||||||
|
v0.4 B3 adds token streaming: a ``chunk_queue`` (per-session ``asyncio.Queue``)
|
||||||
|
can be passed in. We attach a ``BaseAsyncCallbackHandler`` to the ainvoke
|
||||||
|
config so every new token the LLM emits lands on the queue as
|
||||||
|
``{"type": "delta", "text": "..."}``. The SSE stream loop drains the queue
|
||||||
|
and pushes each chunk as an ``event: chunk`` SSE.
|
||||||
|
|
||||||
This runner is **single-uvicorn-worker** by design (see ``api/app.py``'s
|
This runner is **single-uvicorn-worker** by design (see ``api/app.py``'s
|
||||||
docstring): the saver is held on ``app.state.saver`` and shared across all
|
docstring): the saver is held on ``app.state.saver`` and shared across all
|
||||||
background invocations. Multi-worker support would require Postgres
|
background invocations. Multi-worker support would require Postgres
|
||||||
@@ -14,10 +20,12 @@ background invocations. Multi-worker support would require Postgres
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
import logging
|
import logging
|
||||||
from typing import Any
|
from typing import Any
|
||||||
from uuid import UUID, uuid4
|
from uuid import UUID, uuid4
|
||||||
|
|
||||||
|
from langchain_core.callbacks import AsyncCallbackHandler
|
||||||
from sqlalchemy import desc, select
|
from sqlalchemy import desc, select
|
||||||
|
|
||||||
from ..audit import make_audit_recorder
|
from ..audit import make_audit_recorder
|
||||||
@@ -92,6 +100,74 @@ async def _bootstrap_session_dirs(config: Config, project_key: str) -> None:
|
|||||||
ensure_global_instructions_initialized(config)
|
ensure_global_instructions_initialized(config)
|
||||||
|
|
||||||
|
|
||||||
|
class _StreamingChunkPusher(AsyncCallbackHandler):
|
||||||
|
"""Push every `on_llm_new_token` onto a session-bound asyncio.Queue.
|
||||||
|
|
||||||
|
The SSE stream consumes the queue and pushes each chunk as an SSE
|
||||||
|
``event: chunk`` so the browser can render typing-style streaming.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, queue: asyncio.Queue[dict[str, Any]]) -> None:
|
||||||
|
self._queue = queue
|
||||||
|
|
||||||
|
async def on_llm_new_token(self, token: str, **_kwargs: Any) -> None:
|
||||||
|
if not token:
|
||||||
|
return
|
||||||
|
try:
|
||||||
|
await self._queue.put({"type": "delta", "text": token})
|
||||||
|
except Exception:
|
||||||
|
# Never let a streaming push failure abort the LLM call.
|
||||||
|
_LOG.debug("chunk-queue put failed (queue likely closed)", exc_info=True)
|
||||||
|
|
||||||
|
|
||||||
|
def _build_session_agent(
|
||||||
|
db: Database,
|
||||||
|
config: Config,
|
||||||
|
persona: Persona,
|
||||||
|
session_id: UUID,
|
||||||
|
row: InteractiveSessionRow,
|
||||||
|
*,
|
||||||
|
saver: Any | None,
|
||||||
|
) -> Any:
|
||||||
|
"""Assemble the deepagents CompiledStateGraph for one session invocation.
|
||||||
|
|
||||||
|
Extracted from :func:`invoke_session_agent` to keep that function under
|
||||||
|
the C901 complexity threshold. Pure construction — no side effects on the
|
||||||
|
DB beyond what `build_agent` itself does.
|
||||||
|
"""
|
||||||
|
pricing = _static_pricing_seed()
|
||||||
|
budget = make_budget_tracker_from_config(db, config)
|
||||||
|
cost_mw = CostMiddleware(
|
||||||
|
pricing=pricing,
|
||||||
|
model_name=row.model or persona.model,
|
||||||
|
interactive_session_id=session_id,
|
||||||
|
persona_name=persona.name,
|
||||||
|
budget_tracker=budget,
|
||||||
|
)
|
||||||
|
audit_mw = AuditToolMiddleware(
|
||||||
|
interactive_session_id=session_id,
|
||||||
|
file_recorder=make_audit_recorder(config.state_dir),
|
||||||
|
)
|
||||||
|
is_plan = bool(row.plan_mode)
|
||||||
|
plan_mw = PlanModeMiddleware(is_active=lambda: is_plan)
|
||||||
|
|
||||||
|
project_key = row.project_key or sha256(str(config.workspace_root.resolve()))[:16]
|
||||||
|
memory_dir = project_memory_dir(config, project_key)
|
||||||
|
instruction_paths = resolve_instruction_paths(config, config.workspace_root)
|
||||||
|
memory_paths = list_memory_paths(memory_dir)
|
||||||
|
skill_sources = resolve_skill_sources(config)
|
||||||
|
|
||||||
|
return build_agent(
|
||||||
|
persona,
|
||||||
|
config,
|
||||||
|
root_dir=config.workspace_root,
|
||||||
|
middleware=[plan_mw, cost_mw, audit_mw],
|
||||||
|
checkpointer=saver,
|
||||||
|
memory_paths_override=[*instruction_paths, *memory_paths],
|
||||||
|
skills_sources_override=skill_sources,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
async def invoke_session_agent(
|
async def invoke_session_agent(
|
||||||
db: Database,
|
db: Database,
|
||||||
config: Config,
|
config: Config,
|
||||||
@@ -100,6 +176,7 @@ async def invoke_session_agent(
|
|||||||
user_message: str,
|
user_message: str,
|
||||||
*,
|
*,
|
||||||
saver: Any | None = None,
|
saver: Any | None = None,
|
||||||
|
chunk_queue: asyncio.Queue[dict[str, Any]] | None = None,
|
||||||
) -> None:
|
) -> None:
|
||||||
"""Run one ainvoke + persist the assistant reply for the given session.
|
"""Run one ainvoke + persist the assistant reply for the given session.
|
||||||
|
|
||||||
@@ -127,50 +204,12 @@ async def invoke_session_agent(
|
|||||||
project_key = row.project_key or sha256(str(config.workspace_root.resolve()))[:16]
|
project_key = row.project_key or sha256(str(config.workspace_root.resolve()))[:16]
|
||||||
await _bootstrap_session_dirs(config, project_key)
|
await _bootstrap_session_dirs(config, project_key)
|
||||||
|
|
||||||
# Build agent. No model override at the API layer — `/model` slash is REPL-only.
|
agent = _build_session_agent(db, config, persona, session_id, row, saver=saver)
|
||||||
pricing = _static_pricing_seed()
|
|
||||||
budget = make_budget_tracker_from_config(db, config)
|
|
||||||
cost_mw = CostMiddleware(
|
|
||||||
pricing=pricing,
|
|
||||||
model_name=row.model or persona.model,
|
|
||||||
interactive_session_id=session_id,
|
|
||||||
persona_name=persona.name,
|
|
||||||
budget_tracker=budget,
|
|
||||||
)
|
|
||||||
audit_mw = AuditToolMiddleware(
|
|
||||||
interactive_session_id=session_id,
|
|
||||||
file_recorder=make_audit_recorder(config.state_dir),
|
|
||||||
)
|
|
||||||
# Plan-mode is read from the session row — API users can toggle via a
|
|
||||||
# future API endpoint; REPL toggles via /plan slash.
|
|
||||||
is_plan = bool(row.plan_mode)
|
|
||||||
plan_mw = PlanModeMiddleware(is_active=lambda: is_plan)
|
|
||||||
|
|
||||||
memory_dir = project_memory_dir(config, project_key)
|
|
||||||
instruction_paths = resolve_instruction_paths(config, config.workspace_root)
|
|
||||||
memory_paths = list_memory_paths(memory_dir)
|
|
||||||
skill_sources = resolve_skill_sources(config)
|
|
||||||
|
|
||||||
agent = build_agent(
|
|
||||||
persona,
|
|
||||||
config,
|
|
||||||
root_dir=config.workspace_root,
|
|
||||||
middleware=[plan_mw, cost_mw, audit_mw],
|
|
||||||
checkpointer=saver,
|
|
||||||
memory_paths_override=[*instruction_paths, *memory_paths],
|
|
||||||
skills_sources_override=skill_sources,
|
|
||||||
)
|
|
||||||
|
|
||||||
thread_id = f"{session_id}:0"
|
thread_id = f"{session_id}:0"
|
||||||
try:
|
result = await _run_ainvoke(agent, user_message, thread_id, chunk_queue, session_id)
|
||||||
result = await agent.ainvoke(
|
if result is None:
|
||||||
{"messages": [{"role": "user", "content": user_message}]},
|
|
||||||
config={"configurable": {"thread_id": thread_id}},
|
|
||||||
)
|
|
||||||
except Exception:
|
|
||||||
_LOG.exception("agent.ainvoke failed for session %s", session_id)
|
|
||||||
return
|
return
|
||||||
|
|
||||||
messages = result.get("messages", []) if isinstance(result, dict) else []
|
messages = result.get("messages", []) if isinstance(result, dict) else []
|
||||||
if not messages:
|
if not messages:
|
||||||
return
|
return
|
||||||
@@ -187,6 +226,42 @@ async def invoke_session_agent(
|
|||||||
await compact_session(db, config, str(session_id))
|
await compact_session(db, config, str(session_id))
|
||||||
|
|
||||||
|
|
||||||
|
async def _run_ainvoke(
|
||||||
|
agent: Any,
|
||||||
|
user_message: str,
|
||||||
|
thread_id: str,
|
||||||
|
chunk_queue: asyncio.Queue[dict[str, Any]] | None,
|
||||||
|
session_id: UUID,
|
||||||
|
) -> dict[str, Any] | None:
|
||||||
|
"""Wrapper around agent.ainvoke that emits chunk_queue lifecycle events.
|
||||||
|
|
||||||
|
Returns the raw result dict on success, ``None`` on any failure (logged).
|
||||||
|
Re-raises ``CancelledError`` so the asyncio task is correctly marked
|
||||||
|
cancelled and the route's done-callback can clean up.
|
||||||
|
"""
|
||||||
|
invoke_config: dict[str, Any] = {"configurable": {"thread_id": thread_id}}
|
||||||
|
if chunk_queue is not None:
|
||||||
|
invoke_config["callbacks"] = [_StreamingChunkPusher(chunk_queue)]
|
||||||
|
try:
|
||||||
|
return await agent.ainvoke( # type: ignore[no-any-return]
|
||||||
|
{"messages": [{"role": "user", "content": user_message}]},
|
||||||
|
config=invoke_config,
|
||||||
|
)
|
||||||
|
except asyncio.CancelledError:
|
||||||
|
_LOG.info("agent.ainvoke cancelled for session %s", session_id)
|
||||||
|
if chunk_queue is not None:
|
||||||
|
await chunk_queue.put({"type": "cancelled"})
|
||||||
|
raise
|
||||||
|
except Exception:
|
||||||
|
_LOG.exception("agent.ainvoke failed for session %s", session_id)
|
||||||
|
if chunk_queue is not None:
|
||||||
|
await chunk_queue.put({"type": "error"})
|
||||||
|
return None
|
||||||
|
finally:
|
||||||
|
if chunk_queue is not None:
|
||||||
|
await chunk_queue.put({"type": "done"})
|
||||||
|
|
||||||
|
|
||||||
def _resolve_persona(personas: list[Persona], persona_hash: str) -> Persona | None:
|
def _resolve_persona(personas: list[Persona], persona_hash: str) -> Persona | None:
|
||||||
for p in personas:
|
for p in personas:
|
||||||
if p.compute_hash() == persona_hash:
|
if p.compute_hash() == persona_hash:
|
||||||
|
|||||||
@@ -42,7 +42,9 @@ from ..models import (
|
|||||||
)
|
)
|
||||||
|
|
||||||
_LOG = logging.getLogger(__name__)
|
_LOG = logging.getLogger(__name__)
|
||||||
_POLL_INTERVAL_S: float = 0.5
|
# v0.4 B3: 100ms poll keeps token-streaming UX snappy. At idle the loop just
|
||||||
|
# does two cheap selects — well within asyncpg + SSE budgets.
|
||||||
|
_POLL_INTERVAL_S: float = 0.1
|
||||||
_TERMINAL_STATES: frozenset[str] = frozenset({"ended"})
|
_TERMINAL_STATES: frozenset[str] = frozenset({"ended"})
|
||||||
|
|
||||||
router = APIRouter()
|
router = APIRouter()
|
||||||
@@ -274,6 +276,15 @@ async def post_message(
|
|||||||
saver = getattr(request.app.state, "saver", None)
|
saver = getattr(request.app.state, "saver", None)
|
||||||
from uuid import UUID
|
from uuid import UUID
|
||||||
|
|
||||||
|
# v0.4 B3: per-session token chunk queue. agent_runner pushes deltas
|
||||||
|
# via AsyncCallbackHandler; the SSE stream below drains the queue.
|
||||||
|
chunk_queues: dict[str, asyncio.Queue[Any]] = getattr(
|
||||||
|
request.app.state, "token_chunk_queues", {}
|
||||||
|
)
|
||||||
|
queue: asyncio.Queue[Any] = asyncio.Queue()
|
||||||
|
chunk_queues[session_id] = queue
|
||||||
|
request.app.state.token_chunk_queues = chunk_queues
|
||||||
|
|
||||||
task = asyncio.create_task(
|
task = asyncio.create_task(
|
||||||
invoke_session_agent(
|
invoke_session_agent(
|
||||||
db,
|
db,
|
||||||
@@ -282,26 +293,101 @@ async def post_message(
|
|||||||
UUID(session_id),
|
UUID(session_id),
|
||||||
body.content,
|
body.content,
|
||||||
saver=saver,
|
saver=saver,
|
||||||
|
chunk_queue=queue,
|
||||||
)
|
)
|
||||||
)
|
)
|
||||||
pending: set[asyncio.Task[Any]] = getattr(request.app.state, "pending_invocations", set())
|
pending: set[asyncio.Task[Any]] = getattr(request.app.state, "pending_invocations", set())
|
||||||
pending.add(task)
|
pending.add(task)
|
||||||
request.app.state.pending_invocations = pending
|
request.app.state.pending_invocations = pending
|
||||||
task.add_done_callback(pending.discard)
|
task.add_done_callback(pending.discard)
|
||||||
|
# v0.4 B4: index the task by session_id so a subsequent POST /abort can
|
||||||
|
# cancel mid-flight. We deliberately overwrite an earlier task if one is
|
||||||
|
# still in flight — the new user message implicitly cancels the previous
|
||||||
|
# turn (Claude Code parity).
|
||||||
|
per_session: dict[str, asyncio.Task[Any]] = getattr(
|
||||||
|
request.app.state, "pending_per_session", {}
|
||||||
|
)
|
||||||
|
prev = per_session.get(session_id)
|
||||||
|
if prev is not None and not prev.done():
|
||||||
|
prev.cancel()
|
||||||
|
per_session[session_id] = task
|
||||||
|
request.app.state.pending_per_session = per_session
|
||||||
|
|
||||||
|
def _remove_from_session_map(_t: asyncio.Task[Any], sid: str = session_id) -> None:
|
||||||
|
per_session.pop(sid, None)
|
||||||
|
|
||||||
|
task.add_done_callback(_remove_from_session_map)
|
||||||
|
|
||||||
return SessionAck(session_id=session_id, state="active", message="queued")
|
return SessionAck(session_id=session_id, state="active", message="queued")
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# POST /api/sessions/{id}/abort — cancel an in-flight turn (v0.4 B4)
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@router.post("/{session_id}/abort", response_model=SessionAck)
|
||||||
|
async def abort_turn(session_id: str, request: Request, db: DbDep) -> SessionAck:
|
||||||
|
"""Cancel the in-flight ainvoke for this session, if any.
|
||||||
|
|
||||||
|
Idempotent — returns ok even when no task is in flight. The cancelled
|
||||||
|
task's ``finally`` clauses still run, so the LangGraph checkpoint stays
|
||||||
|
consistent. The next POST /messages reuses the same thread.
|
||||||
|
"""
|
||||||
|
async with db.session() as s:
|
||||||
|
row = await s.get(InteractiveSessionRow, session_id)
|
||||||
|
if row is None:
|
||||||
|
raise HTTPException(status_code=404, detail=f"session {session_id} not found")
|
||||||
|
|
||||||
|
per_session: dict[str, asyncio.Task[Any]] = getattr(
|
||||||
|
request.app.state, "pending_per_session", {}
|
||||||
|
)
|
||||||
|
task = per_session.get(session_id)
|
||||||
|
if task is not None and not task.done():
|
||||||
|
task.cancel()
|
||||||
|
return SessionAck(session_id=session_id, state="active", message="aborted")
|
||||||
|
return SessionAck(session_id=session_id, state="active", message="no-in-flight-task")
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# GET /api/sessions/{id}/stream — SSE
|
# GET /api/sessions/{id}/stream — SSE
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
async def _session_event_stream(db: Database, session_id: str, last_seq: int = 0) -> Any:
|
async def _session_event_stream(
|
||||||
"""Yield ServerSentEvent per new MessageRow. Closes when session ends."""
|
db: Database, session_id: str, last_seq: int = 0, app: Any = None
|
||||||
|
) -> Any:
|
||||||
|
"""Yield ServerSentEvent per new MessageRow + token chunk. Closes on terminal.
|
||||||
|
|
||||||
|
Three event types emitted:
|
||||||
|
- ``message`` (existing): one row per new MessageRow.
|
||||||
|
- ``chunk`` (v0.4 B3): token delta from the in-flight ainvoke. Drained
|
||||||
|
from ``app.state.token_chunk_queues[session_id]`` if present.
|
||||||
|
- ``done`` (existing): session terminal or deleted.
|
||||||
|
"""
|
||||||
seen = last_seq
|
seen = last_seq
|
||||||
|
|
||||||
while True:
|
while True:
|
||||||
|
# v0.4 B3: drain queued token chunks FIRST so streaming visibly
|
||||||
|
# precedes the final `message` SSE. Without this the placeholder
|
||||||
|
# is replaced by the persisted MessageRow before any chunk reaches
|
||||||
|
# the browser — Claude-Code-style typing would never appear.
|
||||||
|
if app is not None:
|
||||||
|
queues: dict[str, asyncio.Queue[Any]] = getattr(app.state, "token_chunk_queues", {})
|
||||||
|
queue = queues.get(session_id)
|
||||||
|
if queue is not None:
|
||||||
|
drained = 0
|
||||||
|
while not queue.empty() and drained < 200:
|
||||||
|
chunk = queue.get_nowait()
|
||||||
|
yield ServerSentEvent(
|
||||||
|
data=json.dumps(chunk, ensure_ascii=False),
|
||||||
|
event="chunk",
|
||||||
|
)
|
||||||
|
drained += 1
|
||||||
|
if chunk.get("type") in ("done", "cancelled", "error"):
|
||||||
|
queues.pop(session_id, None)
|
||||||
|
break
|
||||||
|
|
||||||
async with db.session() as s:
|
async with db.session() as s:
|
||||||
message_rows = (
|
message_rows = (
|
||||||
(
|
(
|
||||||
@@ -365,7 +451,7 @@ async def stream_session(
|
|||||||
row = await s.get(InteractiveSessionRow, session_id)
|
row = await s.get(InteractiveSessionRow, session_id)
|
||||||
if row is None:
|
if row is None:
|
||||||
raise HTTPException(status_code=404, detail=f"session {session_id} not found")
|
raise HTTPException(status_code=404, detail=f"session {session_id} not found")
|
||||||
return EventSourceResponse(_session_event_stream(db, session_id, last_seq))
|
return EventSourceResponse(_session_event_stream(db, session_id, last_seq, app=request.app))
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|||||||
@@ -153,6 +153,9 @@ def resolve_model_instance(
|
|||||||
max_tokens=params.get("max_tokens", 4096),
|
max_tokens=params.get("max_tokens", 4096),
|
||||||
temperature=params.get("temperature", 0.2),
|
temperature=params.get("temperature", 0.2),
|
||||||
top_p=params.get("top_p", 1.0),
|
top_p=params.get("top_p", 1.0),
|
||||||
|
# v0.4 B3: enable token streaming so AsyncCallbackHandler.on_llm_new_token
|
||||||
|
# receives chunks during ainvoke. Final response is unchanged.
|
||||||
|
streaming=True,
|
||||||
)
|
)
|
||||||
return model_spec
|
return model_spec
|
||||||
|
|
||||||
|
|||||||
@@ -424,6 +424,7 @@ const CONV_STATE = {
|
|||||||
eventSource: null,
|
eventSource: null,
|
||||||
lastSeq: 0,
|
lastSeq: 0,
|
||||||
awaitingReply: false,
|
awaitingReply: false,
|
||||||
|
streamBuffer: "", // v0.4 B3: accumulated token deltas while streaming
|
||||||
};
|
};
|
||||||
|
|
||||||
function $conv(sel) { return document.querySelector(sel); }
|
function $conv(sel) { return document.querySelector(sel); }
|
||||||
@@ -433,6 +434,15 @@ function setSendDisabled(disabled) {
|
|||||||
$conv("#send-btn").disabled = disabled;
|
$conv("#send-btn").disabled = disabled;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// v0.4 B4: toggle the abort button visibility based on in-flight state.
|
||||||
|
// `disabled` is what setSendDisabled sees AFTER awaiting reply has started.
|
||||||
|
function setAbortVisible(visible) {
|
||||||
|
const btn = $conv("#abort-btn");
|
||||||
|
if (!btn) return;
|
||||||
|
btn.style.display = visible ? "inline-block" : "none";
|
||||||
|
btn.disabled = !visible;
|
||||||
|
}
|
||||||
|
|
||||||
function clearMessages() {
|
function clearMessages() {
|
||||||
const list = $conv("#messages");
|
const list = $conv("#messages");
|
||||||
list.replaceChildren();
|
list.replaceChildren();
|
||||||
@@ -456,7 +466,179 @@ function showConversationEmpty(show, text) {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
function appendMessageBubble(role, content, ts) {
|
// v0.4 B1: minimal markdown renderer for assistant messages.
|
||||||
|
// SECURITY: we ONLY emit DOM nodes built via createElement + textContent.
|
||||||
|
// No innerHTML, no insertAdjacentHTML. This is a tiny subset of Markdown
|
||||||
|
// chosen for chat readability — anything we don't understand is rendered as
|
||||||
|
// literal text (textContent fallback in the default case).
|
||||||
|
function _mdRenderInto(target, raw) {
|
||||||
|
// Code-fence-aware splitter — we walk the input line-by-line and group
|
||||||
|
// lines into blocks (paragraph, code-fence, h#, list).
|
||||||
|
const lines = raw.split("\n");
|
||||||
|
let i = 0;
|
||||||
|
while (i < lines.length) {
|
||||||
|
const line = lines[i];
|
||||||
|
|
||||||
|
// Fenced code block: ```lang
|
||||||
|
const fence = line.match(/^```\s*([\w.-]*)\s*$/);
|
||||||
|
if (fence) {
|
||||||
|
const lang = fence[1];
|
||||||
|
const codeLines = [];
|
||||||
|
i++;
|
||||||
|
while (i < lines.length && !/^```\s*$/.test(lines[i])) {
|
||||||
|
codeLines.push(lines[i]);
|
||||||
|
i++;
|
||||||
|
}
|
||||||
|
if (i < lines.length) i++; // consume closing ```
|
||||||
|
const pre = document.createElement("pre");
|
||||||
|
pre.className = "md-code";
|
||||||
|
const code = document.createElement("code");
|
||||||
|
if (lang) code.className = `language-${lang}`;
|
||||||
|
code.textContent = codeLines.join("\n");
|
||||||
|
pre.appendChild(code);
|
||||||
|
target.appendChild(pre);
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ATX header: # / ## / ### (up to 6)
|
||||||
|
const hdr = line.match(/^(#{1,6})\s+(.*)$/);
|
||||||
|
if (hdr) {
|
||||||
|
const level = hdr[1].length;
|
||||||
|
const h = document.createElement(`h${level + 2 > 6 ? 6 : level + 2}`);
|
||||||
|
h.className = "md-h";
|
||||||
|
_mdInline(h, hdr[2]);
|
||||||
|
target.appendChild(h);
|
||||||
|
i++;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Unordered list block — consecutive "- " or "* "
|
||||||
|
if (/^[-*]\s+/.test(line)) {
|
||||||
|
const ul = document.createElement("ul");
|
||||||
|
ul.className = "md-ul";
|
||||||
|
while (i < lines.length && /^[-*]\s+/.test(lines[i])) {
|
||||||
|
const li = document.createElement("li");
|
||||||
|
_mdInline(li, lines[i].replace(/^[-*]\s+/, ""));
|
||||||
|
ul.appendChild(li);
|
||||||
|
i++;
|
||||||
|
}
|
||||||
|
target.appendChild(ul);
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Ordered list: "1. ", "2. ", …
|
||||||
|
if (/^\d+\.\s+/.test(line)) {
|
||||||
|
const ol = document.createElement("ol");
|
||||||
|
ol.className = "md-ol";
|
||||||
|
while (i < lines.length && /^\d+\.\s+/.test(lines[i])) {
|
||||||
|
const li = document.createElement("li");
|
||||||
|
_mdInline(li, lines[i].replace(/^\d+\.\s+/, ""));
|
||||||
|
ol.appendChild(li);
|
||||||
|
i++;
|
||||||
|
}
|
||||||
|
target.appendChild(ol);
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Blank line — paragraph separator; skip.
|
||||||
|
if (line.trim() === "") {
|
||||||
|
i++;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Paragraph: greedily consume until blank or block-start.
|
||||||
|
const paraLines = [line];
|
||||||
|
i++;
|
||||||
|
while (
|
||||||
|
i < lines.length
|
||||||
|
&& lines[i].trim() !== ""
|
||||||
|
&& !/^```/.test(lines[i])
|
||||||
|
&& !/^#{1,6}\s+/.test(lines[i])
|
||||||
|
&& !/^[-*]\s+/.test(lines[i])
|
||||||
|
&& !/^\d+\.\s+/.test(lines[i])
|
||||||
|
) {
|
||||||
|
paraLines.push(lines[i]);
|
||||||
|
i++;
|
||||||
|
}
|
||||||
|
const p = document.createElement("p");
|
||||||
|
p.className = "md-p";
|
||||||
|
_mdInline(p, paraLines.join("\n"));
|
||||||
|
target.appendChild(p);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Inline parser: handles `code`, **bold**, *italic*, [link](url).
|
||||||
|
// Emits DOM nodes; never innerHTML.
|
||||||
|
function _mdInline(target, text) {
|
||||||
|
// Walk the string, matching the earliest-occurring inline pattern.
|
||||||
|
let remaining = text;
|
||||||
|
while (remaining.length > 0) {
|
||||||
|
const matches = [
|
||||||
|
{ re: /`([^`]+)`/, tag: "code" },
|
||||||
|
{ re: /\*\*([^*\n]+)\*\*/, tag: "strong" },
|
||||||
|
{ re: /(?<!\*)\*([^*\n]+)\*(?!\*)/, tag: "em" },
|
||||||
|
{ re: /\[([^\]]+)\]\(([^)\s]+)\)/, tag: "a" },
|
||||||
|
];
|
||||||
|
let best = null;
|
||||||
|
for (const m of matches) {
|
||||||
|
const hit = remaining.match(m.re);
|
||||||
|
if (hit && (best === null || hit.index < best.hit.index)) {
|
||||||
|
best = { hit, tag: m.tag };
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if (best === null) {
|
||||||
|
target.appendChild(document.createTextNode(remaining));
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
if (best.hit.index > 0) {
|
||||||
|
target.appendChild(document.createTextNode(remaining.slice(0, best.hit.index)));
|
||||||
|
}
|
||||||
|
const el = document.createElement(best.tag);
|
||||||
|
if (best.tag === "a") {
|
||||||
|
// Link: cap protocol to http/https to avoid javascript: scheme escapes.
|
||||||
|
const href = best.hit[2];
|
||||||
|
if (/^https?:\/\//.test(href)) el.href = href;
|
||||||
|
el.rel = "noopener noreferrer";
|
||||||
|
el.target = "_blank";
|
||||||
|
el.textContent = best.hit[1];
|
||||||
|
} else {
|
||||||
|
el.textContent = best.hit[1];
|
||||||
|
}
|
||||||
|
target.appendChild(el);
|
||||||
|
remaining = remaining.slice(best.hit.index + best.hit[0].length);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// v0.4 B2: classify system messages into collapsible "event cards" so the
|
||||||
|
// chat thread doesn't drown in [sub-agent ... spawned] / [workflow ... started]
|
||||||
|
// notices. Returns a label + an emoji-style icon + whether to default to open.
|
||||||
|
function _classifySystemMessage(content) {
|
||||||
|
if (content.startsWith("[sub-agent")) {
|
||||||
|
if (content.includes("result]")) return { label: "Sub-agent result", icon: "🤖", open: true };
|
||||||
|
if (content.includes("error]")) return { label: "Sub-agent error", icon: "⚠️", open: true };
|
||||||
|
return { label: "Sub-agent spawned", icon: "🚀", open: false };
|
||||||
|
}
|
||||||
|
if (content.startsWith("[workflow")) {
|
||||||
|
if (content.includes("started]")) return { label: "Workflow started", icon: "🛠️", open: false };
|
||||||
|
if (content.includes("failed]")) return { label: "Workflow failed", icon: "❌", open: true };
|
||||||
|
return { label: "Workflow event", icon: "✅", open: true };
|
||||||
|
}
|
||||||
|
if (content.startsWith("Earlier conversation history")) {
|
||||||
|
return { label: "Compaction summary", icon: "📝", open: false };
|
||||||
|
}
|
||||||
|
if (content.startsWith("당신은 plan mode")) {
|
||||||
|
return { label: "Plan mode activated", icon: "🧭", open: false };
|
||||||
|
}
|
||||||
|
if (content.startsWith("The user APPROVED")) {
|
||||||
|
return { label: "Approved plan", icon: "✅", open: false };
|
||||||
|
}
|
||||||
|
if (content.startsWith("The user requested skill")) {
|
||||||
|
return { label: "Skill activated", icon: "🪄", open: false };
|
||||||
|
}
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
function appendMessageBubble(role, content, ts, opts) {
|
||||||
showConversationEmpty(false);
|
showConversationEmpty(false);
|
||||||
const list = $conv("#messages");
|
const list = $conv("#messages");
|
||||||
const bubble = document.createElement("div");
|
const bubble = document.createElement("div");
|
||||||
@@ -471,13 +653,47 @@ function appendMessageBubble(role, content, ts) {
|
|||||||
tsSpan.textContent = (ts || "").slice(11, 19);
|
tsSpan.textContent = (ts || "").slice(11, 19);
|
||||||
meta.appendChild(roleSpan);
|
meta.appendChild(roleSpan);
|
||||||
if (ts) meta.appendChild(tsSpan);
|
if (ts) meta.appendChild(tsSpan);
|
||||||
|
|
||||||
const body = document.createElement("div");
|
const body = document.createElement("div");
|
||||||
body.className = "msg-body";
|
body.className = "msg-body";
|
||||||
|
|
||||||
|
if (role === "system") {
|
||||||
|
// Collapsible event card if we recognise the format; otherwise plain.
|
||||||
|
const cls = _classifySystemMessage(content);
|
||||||
|
if (cls !== null) {
|
||||||
|
bubble.classList.add("role-system-event");
|
||||||
|
const det = document.createElement("details");
|
||||||
|
det.className = "md-system-event";
|
||||||
|
if (cls.open) det.open = true;
|
||||||
|
const sum = document.createElement("summary");
|
||||||
|
const icon = document.createElement("span");
|
||||||
|
icon.className = "event-icon";
|
||||||
|
icon.textContent = cls.icon;
|
||||||
|
const label = document.createElement("span");
|
||||||
|
label.className = "event-label";
|
||||||
|
label.textContent = cls.label;
|
||||||
|
sum.appendChild(icon);
|
||||||
|
sum.appendChild(label);
|
||||||
|
det.appendChild(sum);
|
||||||
|
const inner = document.createElement("div");
|
||||||
|
inner.className = "event-body";
|
||||||
|
_mdRenderInto(inner, content);
|
||||||
|
det.appendChild(inner);
|
||||||
|
body.appendChild(det);
|
||||||
|
} else {
|
||||||
|
_mdRenderInto(body, content);
|
||||||
|
}
|
||||||
|
} else if (role === "assistant" || (opts && opts.renderMarkdown)) {
|
||||||
|
_mdRenderInto(body, content);
|
||||||
|
} else {
|
||||||
body.textContent = content;
|
body.textContent = content;
|
||||||
|
}
|
||||||
|
|
||||||
bubble.appendChild(meta);
|
bubble.appendChild(meta);
|
||||||
bubble.appendChild(body);
|
bubble.appendChild(body);
|
||||||
list.appendChild(bubble);
|
list.appendChild(bubble);
|
||||||
list.scrollTop = list.scrollHeight;
|
list.scrollTop = list.scrollHeight;
|
||||||
|
return bubble;
|
||||||
}
|
}
|
||||||
|
|
||||||
function appendPendingPlaceholder() {
|
function appendPendingPlaceholder() {
|
||||||
@@ -485,14 +701,48 @@ function appendPendingPlaceholder() {
|
|||||||
const placeholder = document.createElement("div");
|
const placeholder = document.createElement("div");
|
||||||
placeholder.id = "pending-placeholder";
|
placeholder.id = "pending-placeholder";
|
||||||
placeholder.className = "msg-bubble role-assistant pending";
|
placeholder.className = "msg-bubble role-assistant pending";
|
||||||
placeholder.textContent = "…";
|
const meta = document.createElement("div");
|
||||||
|
meta.className = "msg-meta";
|
||||||
|
const roleSpan = document.createElement("span");
|
||||||
|
roleSpan.className = "msg-role";
|
||||||
|
roleSpan.textContent = "assistant";
|
||||||
|
meta.appendChild(roleSpan);
|
||||||
|
const body = document.createElement("div");
|
||||||
|
body.className = "msg-body";
|
||||||
|
body.textContent = "…";
|
||||||
|
placeholder.appendChild(meta);
|
||||||
|
placeholder.appendChild(body);
|
||||||
list.appendChild(placeholder);
|
list.appendChild(placeholder);
|
||||||
list.scrollTop = list.scrollHeight;
|
list.scrollTop = list.scrollHeight;
|
||||||
|
// v0.4 B3: keep a buffer for streamed tokens so we can re-render markdown
|
||||||
|
// once the full text arrives.
|
||||||
|
CONV_STATE.streamBuffer = "";
|
||||||
}
|
}
|
||||||
|
|
||||||
function removePendingPlaceholder() {
|
function removePendingPlaceholder() {
|
||||||
const p = $conv("#pending-placeholder");
|
const p = $conv("#pending-placeholder");
|
||||||
if (p) p.remove();
|
if (p) p.remove();
|
||||||
|
CONV_STATE.streamBuffer = "";
|
||||||
|
}
|
||||||
|
|
||||||
|
// v0.4 B3: append a streamed token to the pending placeholder's body.
|
||||||
|
function appendStreamDelta(text) {
|
||||||
|
const placeholder = $conv("#pending-placeholder");
|
||||||
|
if (!placeholder) return;
|
||||||
|
if (!CONV_STATE.streamBuffer || CONV_STATE.streamBuffer === "") {
|
||||||
|
// First chunk — replace the "…" indicator.
|
||||||
|
const body = placeholder.querySelector(".msg-body");
|
||||||
|
if (body) body.textContent = "";
|
||||||
|
}
|
||||||
|
CONV_STATE.streamBuffer = (CONV_STATE.streamBuffer || "") + text;
|
||||||
|
const body = placeholder.querySelector(".msg-body");
|
||||||
|
if (body) {
|
||||||
|
// Streaming view: keep plain text for speed, full markdown render only
|
||||||
|
// happens when the final `message` event arrives.
|
||||||
|
body.textContent = CONV_STATE.streamBuffer;
|
||||||
|
}
|
||||||
|
const list = $conv("#messages");
|
||||||
|
if (list) list.scrollTop = list.scrollHeight;
|
||||||
}
|
}
|
||||||
|
|
||||||
function updateSessionStatePill(state) {
|
function updateSessionStatePill(state) {
|
||||||
@@ -550,7 +800,9 @@ async function loadAndAttachSession(sessionId) {
|
|||||||
|
|
||||||
const messages = detail.messages || [];
|
const messages = detail.messages || [];
|
||||||
for (const m of messages) {
|
for (const m of messages) {
|
||||||
if (m.role === "system" && !m.is_summary) continue;
|
// v0.4 B2: render system messages too — most map to recognised event
|
||||||
|
// cards (collapsible). Unknown system payloads fall through to plain
|
||||||
|
// markdown rendering.
|
||||||
appendMessageBubble(m.role, m.content, m.ts);
|
appendMessageBubble(m.role, m.content, m.ts);
|
||||||
if (m.seq > CONV_STATE.lastSeq) CONV_STATE.lastSeq = m.seq;
|
if (m.seq > CONV_STATE.lastSeq) CONV_STATE.lastSeq = m.seq;
|
||||||
}
|
}
|
||||||
@@ -578,17 +830,34 @@ function attachEventSource(sessionId) {
|
|||||||
if (data.role === "assistant" && CONV_STATE.awaitingReply) {
|
if (data.role === "assistant" && CONV_STATE.awaitingReply) {
|
||||||
removePendingPlaceholder();
|
removePendingPlaceholder();
|
||||||
CONV_STATE.awaitingReply = false;
|
CONV_STATE.awaitingReply = false;
|
||||||
|
setAbortVisible(false);
|
||||||
}
|
}
|
||||||
// Skip system messages except summaries.
|
// v0.4 B2: render every system message — most are recognised events
|
||||||
if (data.role === "system" && !data.is_summary) {
|
// (compaction / sub-agent / workflow / plan / skill) and rendered as
|
||||||
CONV_STATE.lastSeq = data.seq;
|
// collapsible cards by appendMessageBubble.
|
||||||
return;
|
|
||||||
}
|
|
||||||
appendMessageBubble(data.role, data.content, data.ts);
|
appendMessageBubble(data.role, data.content, data.ts);
|
||||||
CONV_STATE.lastSeq = data.seq;
|
CONV_STATE.lastSeq = data.seq;
|
||||||
} catch (_) { /* ignore parse errors */ }
|
} catch (_) { /* ignore parse errors */ }
|
||||||
});
|
});
|
||||||
|
|
||||||
|
// v0.4 B3: token streaming. Server pushes one chunk per LLM token; we
|
||||||
|
// append to the pending placeholder. When the final "message" SSE arrives
|
||||||
|
// it replaces the streaming text with the markdown-rendered version.
|
||||||
|
src.addEventListener("chunk", (ev) => {
|
||||||
|
try {
|
||||||
|
const data = JSON.parse(ev.data);
|
||||||
|
if (data.type === "delta" && typeof data.text === "string") {
|
||||||
|
appendStreamDelta(data.text);
|
||||||
|
} else if (data.type === "cancelled" || data.type === "error") {
|
||||||
|
// Drop the placeholder; setError already handled or will be by 'message'.
|
||||||
|
removePendingPlaceholder();
|
||||||
|
CONV_STATE.awaitingReply = false;
|
||||||
|
setAbortVisible(false);
|
||||||
|
}
|
||||||
|
// type === "done" is benign — the matching 'message' SSE arrives next.
|
||||||
|
} catch (_) { /* ignore parse errors */ }
|
||||||
|
});
|
||||||
|
|
||||||
src.addEventListener("done", () => {
|
src.addEventListener("done", () => {
|
||||||
src.close();
|
src.close();
|
||||||
if (CONV_STATE.eventSource === src) CONV_STATE.eventSource = null;
|
if (CONV_STATE.eventSource === src) CONV_STATE.eventSource = null;
|
||||||
@@ -610,6 +879,7 @@ async function sendMessage(text) {
|
|||||||
}
|
}
|
||||||
if (!text.trim()) return;
|
if (!text.trim()) return;
|
||||||
setSendDisabled(true);
|
setSendDisabled(true);
|
||||||
|
setAbortVisible(true);
|
||||||
CONV_STATE.awaitingReply = true;
|
CONV_STATE.awaitingReply = true;
|
||||||
appendPendingPlaceholder();
|
appendPendingPlaceholder();
|
||||||
try {
|
try {
|
||||||
@@ -623,6 +893,7 @@ async function sendMessage(text) {
|
|||||||
} catch (e) {
|
} catch (e) {
|
||||||
removePendingPlaceholder();
|
removePendingPlaceholder();
|
||||||
CONV_STATE.awaitingReply = false;
|
CONV_STATE.awaitingReply = false;
|
||||||
|
setAbortVisible(false);
|
||||||
setError(`전송 실패: ${e.message}`);
|
setError(`전송 실패: ${e.message}`);
|
||||||
} finally {
|
} finally {
|
||||||
setSendDisabled(false);
|
setSendDisabled(false);
|
||||||
@@ -630,6 +901,19 @@ async function sendMessage(text) {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
async function abortInflight() {
|
||||||
|
if (!CONV_STATE.sessionId) return;
|
||||||
|
try {
|
||||||
|
await jsonFetch(`/sessions/${CONV_STATE.sessionId}/abort`, { method: "POST" });
|
||||||
|
removePendingPlaceholder();
|
||||||
|
CONV_STATE.awaitingReply = false;
|
||||||
|
setAbortVisible(false);
|
||||||
|
setError("");
|
||||||
|
} catch (e) {
|
||||||
|
setError(`중단 실패: ${e.message}`);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
async function createNewSession() {
|
async function createNewSession() {
|
||||||
let personas;
|
let personas;
|
||||||
try {
|
try {
|
||||||
@@ -647,7 +931,9 @@ async function createNewSession() {
|
|||||||
const ack = await jsonFetch("/sessions", {
|
const ack = await jsonFetch("/sessions", {
|
||||||
method: "POST",
|
method: "POST",
|
||||||
headers: { "Content-Type": "application/json" },
|
headers: { "Content-Type": "application/json" },
|
||||||
body: JSON.stringify({ persona_name: defaultPersona.name, repo_path: "" }),
|
// CreateSessionRequest requires repo_path min_length=1. We default to
|
||||||
|
// "." (cwd of the serve process) — the backend resolves it to absolute.
|
||||||
|
body: JSON.stringify({ persona_name: defaultPersona.name, repo_path: "." }),
|
||||||
});
|
});
|
||||||
await loadSessionList();
|
await loadSessionList();
|
||||||
$conv("#session-picker").value = ack.session_id;
|
$conv("#session-picker").value = ack.session_id;
|
||||||
@@ -660,6 +946,7 @@ async function createNewSession() {
|
|||||||
async function bootstrapConversationPage() {
|
async function bootstrapConversationPage() {
|
||||||
await loadSessionList();
|
await loadSessionList();
|
||||||
$conv("#new-session-btn").addEventListener("click", createNewSession);
|
$conv("#new-session-btn").addEventListener("click", createNewSession);
|
||||||
|
$conv("#abort-btn").addEventListener("click", abortInflight);
|
||||||
$conv("#session-picker").addEventListener("change", (ev) => {
|
$conv("#session-picker").addEventListener("change", (ev) => {
|
||||||
const sid = ev.target.value;
|
const sid = ev.target.value;
|
||||||
if (sid) loadAndAttachSession(sid);
|
if (sid) loadAndAttachSession(sid);
|
||||||
@@ -669,10 +956,28 @@ async function bootstrapConversationPage() {
|
|||||||
const input = $conv("#message-input");
|
const input = $conv("#message-input");
|
||||||
sendMessage(input.value);
|
sendMessage(input.value);
|
||||||
});
|
});
|
||||||
$conv("#message-input").addEventListener("keydown", (ev) => {
|
// v0.4 B5: track IME composition state — Korean/Japanese/Chinese IME emits
|
||||||
|
// Enter to commit the current candidate; we must NOT treat that as send.
|
||||||
|
// compositionend ALSO fires a synthetic Enter that we need to swallow.
|
||||||
|
const input = $conv("#message-input");
|
||||||
|
input._composing = false;
|
||||||
|
input.addEventListener("compositionstart", () => { input._composing = true; });
|
||||||
|
input.addEventListener("compositionend", () => {
|
||||||
|
// The keydown event that ends composition is still pending — defer the
|
||||||
|
// flag flip one tick so the upcoming keydown still sees _composing=true.
|
||||||
|
setTimeout(() => { input._composing = false; }, 0);
|
||||||
|
});
|
||||||
|
input.addEventListener("keydown", (ev) => {
|
||||||
|
// Honor Cmd/Ctrl+Enter as explicit "send" override even during composition.
|
||||||
if ((ev.metaKey || ev.ctrlKey) && ev.key === "Enter") {
|
if ((ev.metaKey || ev.ctrlKey) && ev.key === "Enter") {
|
||||||
ev.preventDefault();
|
ev.preventDefault();
|
||||||
sendMessage(ev.target.value);
|
sendMessage(ev.target.value);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
// Plain Enter during composition (e.g. Korean IME committing 한글) must
|
||||||
|
// pass through to the textarea — do nothing.
|
||||||
|
if (ev.key === "Enter" && input._composing) {
|
||||||
|
return;
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
// v0.3 PR #8: deep link `?session=<id>` auto-loads the named session.
|
// v0.3 PR #8: deep link `?session=<id>` auto-loads the named session.
|
||||||
|
|||||||
@@ -44,6 +44,7 @@
|
|||||||
disabled
|
disabled
|
||||||
></textarea>
|
></textarea>
|
||||||
<button id="send-btn" type="submit" disabled>전송</button>
|
<button id="send-btn" type="submit" disabled>전송</button>
|
||||||
|
<button id="abort-btn" type="button" disabled style="display:none">⏹ 중단</button>
|
||||||
</form>
|
</form>
|
||||||
</main>
|
</main>
|
||||||
<script src="/static/app.js"></script>
|
<script src="/static/app.js"></script>
|
||||||
|
|||||||
@@ -1093,3 +1093,121 @@ details[open] summary {
|
|||||||
border-color: rgba(180, 70, 30, 0.5);
|
border-color: rgba(180, 70, 30, 0.5);
|
||||||
font-weight: 600;
|
font-weight: 600;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* =================================================================
|
||||||
|
v0.4 — Markdown + system event cards in conversation
|
||||||
|
================================================================= */
|
||||||
|
|
||||||
|
.msg-body .md-p {
|
||||||
|
margin: 0 0 8px 0;
|
||||||
|
line-height: 1.6;
|
||||||
|
}
|
||||||
|
|
||||||
|
.msg-body .md-p:last-child { margin-bottom: 0; }
|
||||||
|
|
||||||
|
.msg-body .md-h {
|
||||||
|
margin: 12px 0 6px 0;
|
||||||
|
font-weight: 700;
|
||||||
|
line-height: 1.3;
|
||||||
|
}
|
||||||
|
|
||||||
|
.msg-body .md-ul,
|
||||||
|
.msg-body .md-ol {
|
||||||
|
margin: 4px 0 8px 0;
|
||||||
|
padding-left: 22px;
|
||||||
|
line-height: 1.6;
|
||||||
|
}
|
||||||
|
|
||||||
|
.msg-body .md-ul li,
|
||||||
|
.msg-body .md-ol li {
|
||||||
|
margin: 2px 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.msg-body .md-code {
|
||||||
|
background: rgba(0, 0, 0, 0.04);
|
||||||
|
border: 1px solid var(--border);
|
||||||
|
border-radius: 6px;
|
||||||
|
padding: 10px 12px;
|
||||||
|
margin: 8px 0;
|
||||||
|
overflow-x: auto;
|
||||||
|
font-family: var(--font-mono);
|
||||||
|
font-size: 12.5px;
|
||||||
|
line-height: 1.45;
|
||||||
|
}
|
||||||
|
|
||||||
|
.msg-body .md-code code {
|
||||||
|
background: transparent;
|
||||||
|
padding: 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.msg-body code {
|
||||||
|
background: rgba(0, 0, 0, 0.06);
|
||||||
|
border-radius: 4px;
|
||||||
|
padding: 1px 5px;
|
||||||
|
font-family: var(--font-mono);
|
||||||
|
font-size: 0.9em;
|
||||||
|
}
|
||||||
|
|
||||||
|
.msg-bubble.role-user .msg-body .md-code,
|
||||||
|
.msg-bubble.role-user .msg-body code {
|
||||||
|
background: rgba(255, 255, 255, 0.18);
|
||||||
|
border-color: rgba(255, 255, 255, 0.3);
|
||||||
|
color: white;
|
||||||
|
}
|
||||||
|
|
||||||
|
.msg-body a {
|
||||||
|
color: rgb(180, 70, 30);
|
||||||
|
text-decoration: underline;
|
||||||
|
text-underline-offset: 2px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.msg-bubble.role-user .msg-body a {
|
||||||
|
color: white;
|
||||||
|
}
|
||||||
|
|
||||||
|
.msg-body strong { font-weight: 700; }
|
||||||
|
.msg-body em { font-style: italic; }
|
||||||
|
|
||||||
|
/* System event card */
|
||||||
|
.msg-bubble.role-system-event {
|
||||||
|
align-self: stretch;
|
||||||
|
max-width: 100%;
|
||||||
|
background: rgba(245, 158, 11, 0.06);
|
||||||
|
border: 1px solid rgba(245, 158, 11, 0.25);
|
||||||
|
border-style: dashed;
|
||||||
|
font-style: normal;
|
||||||
|
color: var(--text);
|
||||||
|
}
|
||||||
|
|
||||||
|
.md-system-event summary {
|
||||||
|
cursor: pointer;
|
||||||
|
font-size: 12.5px;
|
||||||
|
display: flex;
|
||||||
|
align-items: center;
|
||||||
|
gap: 6px;
|
||||||
|
list-style: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.md-system-event summary::-webkit-details-marker { display: none; }
|
||||||
|
|
||||||
|
.md-system-event summary .event-icon {
|
||||||
|
font-size: 14px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.md-system-event summary .event-label {
|
||||||
|
font-weight: 600;
|
||||||
|
letter-spacing: 0.02em;
|
||||||
|
color: rgb(120, 53, 15);
|
||||||
|
}
|
||||||
|
|
||||||
|
.md-system-event[open] summary {
|
||||||
|
margin-bottom: 8px;
|
||||||
|
border-bottom: 1px dashed rgba(245, 158, 11, 0.3);
|
||||||
|
padding-bottom: 6px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.md-system-event .event-body {
|
||||||
|
font-size: 12.5px;
|
||||||
|
line-height: 1.55;
|
||||||
|
color: var(--text-muted);
|
||||||
|
}
|
||||||
|
|||||||
@@ -88,6 +88,7 @@ async def test_post_message_returns_ack_and_persists_user_row(
|
|||||||
user_message: str,
|
user_message: str,
|
||||||
*,
|
*,
|
||||||
saver: Any = None,
|
saver: Any = None,
|
||||||
|
chunk_queue: Any = None,
|
||||||
) -> None:
|
) -> None:
|
||||||
invocations.append((str(session_id), user_message))
|
invocations.append((str(session_id), user_message))
|
||||||
|
|
||||||
@@ -181,6 +182,7 @@ async def test_background_invocation_persists_assistant_row(
|
|||||||
_user_message: str,
|
_user_message: str,
|
||||||
*,
|
*,
|
||||||
saver: Any = None,
|
saver: Any = None,
|
||||||
|
chunk_queue: Any = None,
|
||||||
) -> None:
|
) -> None:
|
||||||
# Simulate what the real runner does: write an assistant MessageRow.
|
# Simulate what the real runner does: write an assistant MessageRow.
|
||||||
from datetime import UTC, datetime
|
from datetime import UTC, datetime
|
||||||
|
|||||||
Reference in New Issue
Block a user