2 Commits

Author SHA1 Message Date
chungyeong
0630142c34 feat(my-deepagent): v0.2 PR #3 — FastAPI + SSE + minimal Web GUI (mydeepagent serve)
Closes the "no GUI" gap from the user's first-session requirements
(REPL + workflow + GUI). v0.2 PR #1's Postgres migration made a second
concurrent writer safe; v0.2 PR #2a/#2b wired durable resume; this commit
ships the HTTP + browser surface that uses them.

No auth, no multi-tenant, single uvicorn worker — per DR-3 boundaries.
v0.3+ will add auth, multi-worker fanout, and a LISTEN/NOTIFY upgrade
for SSE.

Backend
- `src/my_deepagent/api/`:
  - `app.py` create_app() factory. lifespan stores db/config/personas/
    workflows on app.state. CORS allow_origin_regex http://localhost(:port)?.
    /static mount + /, /{page}.html for the HTML frontend.
  - `models.py` — pydantic v2 DTOs (extra="forbid") for every route. Auto
    OpenAPI/Swagger via FastAPI's response_model.
  - `deps.py` — get_db / get_config / get_personas / get_workflows.
  - `runner.py` — start_new_run / start_resume. Pre-allocates run_id via
    new `WorkflowEngine.run(pre_allocated_run_id=...)` so the route returns
    the id immediately while the engine runs in asyncio.create_task.
  - `sse.py` — 0.5 s poll over run_events.seq. Emits ServerSentEvent rows;
    sends `event: done` and closes the stream normally (HTTP 200) once
    the run reaches a terminal state.
  - `routes/{runs,personas,workflows,budget}.py`:
      GET  /api/runs              (list, ?limit + ?state)
      GET  /api/runs/{id}         (detail + phases + artifacts + events)
      POST /api/runs              (start; mock-able via runner.start_new_run)
      POST /api/runs/{id}/resume
      POST /api/runs/{id}/abort
      GET  /api/runs/{id}/events  (SSE; Last-Event-ID header + ?last_event_id)
      GET  /api/personas
      GET  /api/workflows
      GET  /api/budget

CLI
- `cli/serve.py` mydeepagent serve [--host 127.0.0.1] [--port 8000].
  Loud stderr warning if --host is not loopback (no auth = footgun).
  uvicorn.run(factory=True, workers=1).
- `cli/main.py` serve command registered.
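
The non-loopback warning could look something like the sketch below. The
helper names `is_loopback_host` and `warn_if_exposed` and the message
wording are assumptions for illustration; only the behaviour (warn on
stderr when --host is not loopback) comes from this commit.

```python
import ipaddress
import sys
from typing import TextIO


def is_loopback_host(host: str) -> bool:
    """Return True for 'localhost' or any loopback IP literal."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name): treat as potentially exposed.
        return False


def warn_if_exposed(host: str, stream: TextIO = sys.stderr) -> bool:
    """Print a loud warning when binding beyond loopback; return True if warned."""
    if is_loopback_host(host):
        return False
    print(
        f"WARNING: binding to {host!r} exposes an unauthenticated API "
        "beyond this machine.",
        file=stream,
    )
    return True
```

Note that 0.0.0.0 is the unspecified address, not loopback, so it triggers
the warning in this sketch.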

Static frontend (vanilla HTML/JS/CSS, no build system)
- index.html — runs list + budget summary
- new.html — start-run form (workflow select, repo path, requirements,
  per-role persona override)
- run.html — run detail + live SSE event log + Resume/Abort buttons
- app.js — fetch + EventSource. XSS policy HARDCODED at file top:
  textContent only, innerHTML/insertAdjacentHTML/outerHTML forbidden.
- style.css — dark theme, single file.

Engine
- WorkflowEngine.run(... pre_allocated_run_id: UUID|None = None). None →
  uuid4() (existing behavior). Set → use that UUID. Backward compatible.

Tests
- tests/integration/test_api_read.py (5): list empty, get 404, personas
  seed count (12), workflows seed (>=3), budget empty.
- tests/integration/test_api_write.py (5): missing template 400, extra
  field 422, resume 404, abort 404, mock-runner happy path.
- tests/integration/test_api_sse.py (1): seed terminal run + 3 events,
  drain stream, assert types present + stream closes within 3 s.
- tests/integration/test_api_static.py (5): index/new/run HTML 200,
  app.js content-type + XSS-policy substring assertion, style.css
  content-type.
- All fixtures use httpx ASGITransport + app.router.lifespan_context
  (httpx does NOT auto-trigger FastAPI lifespan) + sqlite tmp_path.

Gates
- ruff check + ruff format --check + mypy --strict: PASS (120 source files)
- pytest non-E2E: 603 PASS (12.15 s) — +16 from new API tests
- pytest E2E real OpenRouter on Postgres: PASS 60.44 s (baseline 71–122 s
  range; well within DR-3 acceptance threshold ≤+20%)

Manual browser verification deferred to a follow-up (docker compose up,
mydeepagent serve, open http://localhost:8000).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 22:25:15 +09:00
chungyeong
501292a5cd feat(my-deepagent): v0.2 PR #2b — mydeepagent runs resume <id> real implementation
Closes the v0.1.0 KNOWN LIMIT where resume was an exit-2 stub. Builds on
v0.2 PR #2a's LangGraph wiring + the existing DB phase-state machine +
sweep_orphan_runs — no Temporal (per DR-3).

Highlights
- `WorkflowEngine.resume(run_id)` (new async method):
  - Loads RunRow, rejects terminal states with
    MyDeepAgentError("run_already_terminal").
  - Reloads worktree_root from `RunRow.worktree_root`, template via
    `_reload_template` (WorkflowTemplateRow JOIN + model_validate), and
    bindings via `_reload_bindings` (run_bindings ⨝ agent_personas).
  - **Does NOT call `bind_personas` again** — locks in the original
    binding so consent / persona-pool changes since the original run
    don't silently shift role assignment.
- `_execute_run` (extracted shared phase loop): `run()` and `resume()`
  both dispatch through it. Skips already-completed phases (emits
  `phase.skipped` event) and re-executes the rest.
- 4 new private helpers on WorkflowEngine: `_get_run_or_raise`,
  `_reload_template`, `_reload_bindings`, `_get_completed_phase_keys`.
- `RunEventType.RUN_RESUMED` and `PHASE_SKIPPED` are now actually
  emitted (the enum members existed already).
- `cli/runs.py _runs_resume_async`: stub → real impl. Validates the run
  exists + non-terminal, loads seed personas + artifact schemas from
  `docs/schemas/`, constructs WorkflowEngine with an
  "abort-on-new-approval" callback (resume should not silently re-prompt
  the user — original gates already passed; a new gate means the
  workflow has changed). Calls engine.resume(UUID(id)), prints final
  state + report. Catches MyDeepAgentError and exits 1 with red error.
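
The skip-and-continue behaviour of resume can be illustrated with the
simplified sketch below. It is not the engine code: the free function
`resume`, its parameters, the terminal state names, and the
`phase.completed` event name are assumptions, and `MyDeepAgentError` is
redefined here only so the example is self-contained.

```python
import asyncio
from collections.abc import Awaitable, Callable


class MyDeepAgentError(Exception):
    """Local stand-in for the project's error type, carrying a code."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


# Assumed terminal state names.
TERMINAL_STATES = frozenset({"completed", "failed", "aborted"})


async def resume(
    state: str,
    phase_keys: list[str],
    completed: set[str],
    run_phase: Callable[[str], Awaitable[None]],
) -> list[tuple[str, str]]:
    """Re-run only the phases that have not completed; return emitted events."""
    if state in TERMINAL_STATES:
        raise MyDeepAgentError("run_already_terminal")
    events: list[tuple[str, str]] = [("run.resumed", "")]
    for key in phase_keys:
        if key in completed:
            # Already done in the original run: record the skip, do no work.
            events.append(("phase.skipped", key))
            continue
        await run_phase(key)
        events.append(("phase.completed", key))
    return events
```

Because bindings are reloaded rather than recomputed, a loop like this only
decides which phases to run, never who runs them.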

Tests
- `tests/integration/test_resume.py` (new, 5 scenarios):
  1. 2-phase mock workflow: phase 1 succeeds, phase 2 fails first time,
     row flipped back to executing → resume → phase 2 completes.
     Asserts `phase.skipped` event for phase 1, `run.resumed` event,
     and exactly 1 mock invocation for phase 2 on resume.
  2. Terminal run → `MyDeepAgentError(code="run_already_terminal")`.
  3. Unknown run id → `MyDeepAgentError(code="run_not_found")`.
  4. RunBindingRow rows missing → `MyDeepAgentError(code="run_metadata_missing")`.
  5. Corrupt `workflow_templates.definition` →
     `MyDeepAgentError(code="template_load_failed")`.
  Mock pattern matches existing test_engine.py: patch
  `my_deepagent.engine.build_agent` to return a fake agent that writes
  the expected artifact and drives the watcher middleware.

Gates
- ruff check + ruff format --check + mypy --strict: PASS (103 source files)
- pytest non-E2E: 587 PASS (12.69 s) — +5 from new resume tests
- pytest E2E real OpenRouter on Postgres: PASS 78.52 s (baseline 71–122 s;
  within DR-3 acceptance threshold ≤+20%)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 22:07:24 +09:00