feat(my-deepagent): v0.2 PR #2a — wire LangGraph AsyncPostgresSaver into engine

Foundation for `runs resume` (v0.2 PR #2b). v0.2 PR #1 added
langgraph-checkpoint-postgres as a dependency, but engine.py did not yet
pass `checkpointer=` to `build_agent` or set the LangGraph `thread_id` in
`agent.ainvoke` — meaning resume had no state to restore. This commit
actually wires the dependency.

Highlights
- `WorkflowEngine.__init__` accepts `checkpointer_url: str | None`
  (default = `config.database_url`).
- `_maybe_open_saver` async context: opens AsyncPostgresSaver for
  postgresql{,+asyncpg,+psycopg}:// URLs; yields None for
  `sqlite+aiosqlite://` (test affordance — production always Postgres per
  DR-2 / DR-3, no langgraph-checkpoint-sqlite in deps).
- `WorkflowEngine.run()` opens the saver **once per run** and shares it
  across all phases. Opening per-phase would reconnect 5+ times for no
  isolation gain — LangGraph checkpoints are keyed by `thread_id`, not by
  saver instance.
- `_invoke_agent_until_artifact` forwards `checkpointer=self._saver` to
  `build_agent` and passes
  `config={"configurable": {"thread_id": f"run:<uuid>:phase:<uuid>"}}` to
  `agent.ainvoke`. The thread_id format is already used by
  `LlmCallRow.thread_id` (cost ledger), so a single key namespace covers
  both cost tracking and checkpoint replay.
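
A minimal sketch of the saver dispatch and thread_id config described above, not the actual engine.py code. The names `to_conninfo`, `maybe_open_saver`, and `thread_config` are hypothetical, and stripping the driver suffix before calling `AsyncPostgresSaver.from_conn_string` is an assumption about how SQLAlchemy-style URLs would be handled.

```python
from __future__ import annotations

import contextlib
import uuid
from collections.abc import AsyncIterator
from typing import Any

# Schemes named in the commit message as Postgres-backed.
_POSTGRES_SCHEMES = ("postgresql", "postgresql+asyncpg", "postgresql+psycopg")


def to_conninfo(url: str) -> str | None:
    """Return a plain postgresql:// URL, or None when no saver should open."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError(f"not a URL: {url!r}")
    if scheme in _POSTGRES_SCHEMES:
        # Assumption: the driver suffix is dropped for the saver's connection.
        return f"postgresql://{rest}"
    if scheme == "sqlite+aiosqlite":
        return None  # test affordance: run without a checkpointer
    raise ValueError(f"unsupported checkpointer scheme: {scheme!r}")


@contextlib.asynccontextmanager
async def maybe_open_saver(url: str) -> AsyncIterator[Any]:
    """Yield an open saver for Postgres URLs, or None for SQLite."""
    conninfo = to_conninfo(url)
    if conninfo is None:
        yield None
        return
    # Imported lazily so the SQLite path never needs the Postgres package.
    from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

    async with AsyncPostgresSaver.from_conn_string(conninfo) as saver:
        await saver.setup()  # creates the checkpoint tables if missing
        yield saver


def thread_config(run_id: uuid.UUID, phase_id: uuid.UUID) -> dict[str, Any]:
    """Build the config passed to agent.ainvoke for one phase of one run."""
    return {"configurable": {"thread_id": f"run:{run_id}:phase:{phase_id}"}}
```

Under this sketch, a run would enter `maybe_open_saver` once and reuse the yielded saver for every phase, varying only the `thread_id` per phase.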

Tests
- `tests/integration/test_engine_checkpointer_wiring.py` (new, 2 tests):
  1. Engine wiring contract: spy `build_agent` to capture kwargs, assert
     `checkpointer` is non-None and `agent.ainvoke` receives the expected
     `config.configurable.thread_id` in run:<uuid>:phase:<uuid> format.
  2. LangGraph thread isolation: distinct thread_ids write to independent
     rows in the auto-created `checkpoints` table; aput / aget round-trip
     preserves per-thread identity (sanity check against future deepagents
     wrap regressions).
- `tests/integration/test_engine.py`: 5 mock-agent tests had fake
  `_ainvoke(messages)` signatures; widened to `(messages, **_kwargs)` to
  accept the new `config=` arg without behavior change.
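
The widened fake signature and the thread_id assertion can be sketched as follows. `FakeAgent`, `THREAD_ID_RE`, and `drive` are hypothetical stand-ins for illustration, not the fixtures used in the test files.

```python
import asyncio
import re
import uuid
from typing import Any

# Hypothetical pattern for the run:<uuid>:phase:<uuid> format.
_UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
THREAD_ID_RE = re.compile(rf"^run:{_UUID}:phase:{_UUID}$")


class FakeAgent:
    """Mock agent whose ainvoke tolerates extra keyword arguments."""

    def __init__(self) -> None:
        self.calls: list[dict[str, Any]] = []

    async def ainvoke(self, messages: dict[str, Any], **kwargs: Any) -> dict[str, Any]:
        # **kwargs absorbs the new config= argument; a fake declared as
        # ainvoke(messages) would raise TypeError once config= is passed.
        self.calls.append(kwargs)
        return messages


async def drive(agent: FakeAgent) -> None:
    """Call the agent the way the engine is described as calling it."""
    thread_id = f"run:{uuid.uuid4()}:phase:{uuid.uuid4()}"
    await agent.ainvoke(
        {"messages": []},
        config={"configurable": {"thread_id": thread_id}},
    )


agent = FakeAgent()
asyncio.run(drive(agent))
captured = agent.calls[0]["config"]["configurable"]["thread_id"]
assert THREAD_ID_RE.match(captured)
```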

Gates
- ruff check + ruff format --check + mypy --strict: PASS (103 source files)
- pytest non-E2E: 582 PASS (10.55 s); was 576 before, net +6 after the new
  wiring tests and the engine.py reshape.
- pytest E2E real OpenRouter on Postgres: PASS 75.99 s (baseline 71–122 s;
  within DR-3 acceptance threshold ≤+20%).
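
A worked check of the E2E timing gate, as a sketch. Which end of the 71–122 s baseline range DR-3 compares against is not stated here, so this assumes the strictest reading (the lower bound); `within_threshold` is a hypothetical helper.

```python
def within_threshold(measured_s: float, baseline_s: float, max_regression: float = 0.20) -> bool:
    """True when measured_s is at most baseline_s * (1 + max_regression)."""
    return measured_s <= baseline_s * (1.0 + max_regression)


# Assumption: compare against the low end of the baseline range (71 s).
# The allowed ceiling is then 71 * 1.20 = 85.2 s, and 75.99 s is below it.
gate_ok = within_threshold(75.99, 71.0)
```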

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
chungyeong
2026-05-16 21:56:34 +09:00
parent 711d61d245
commit 50aacd3382
4 changed files with 316 additions and 43 deletions

@@ -3,6 +3,36 @@
## [Unreleased]
### Added
- **v0.2 PR #2a — LangGraph `AsyncPostgresSaver` engine wiring** (foundation
for `runs resume`). v0.2 PR #1 added the dependency; this commit actually
uses it.
- `src/my_deepagent/engine.py`:
- `WorkflowEngine.__init__` accepts `checkpointer_url: str | None` (defaults
to `config.database_url`).
- New `_maybe_open_saver` async context: opens `get_checkpointer_ctx` for
`postgresql{,+asyncpg,+psycopg}://` URLs, yields `None` for `sqlite+aiosqlite://`
(test affordance — production always Postgres per DR-2 / DR-3).
- `WorkflowEngine.run()` opens the saver **once per run** and shares it
across all phases via `self._saver` — opening per-phase would re-connect
5+ times for no isolation gain (checkpoints are keyed by `thread_id`, not
saver instance).
- `_invoke_agent_until_artifact` forwards `checkpointer=self._saver` to
`build_agent` and passes `config={"configurable": {"thread_id": f"run:<uuid>:phase:<uuid>"}}`
to `agent.ainvoke`. Same `thread_id` format already used by
`LlmCallRow.thread_id` (cost ledger), so one key namespace covers both.
- `tests/integration/test_engine_checkpointer_wiring.py` (new):
1. **Contract 1 — engine wiring**: `build_agent` receives a non-None saver;
`agent.ainvoke` receives `config.configurable.thread_id` in the
expected `run:<uuid>:phase:<uuid>` format.
2. **Contract 2 — LangGraph thread isolation**: two distinct `thread_id`s
write independent rows in the auto-created `checkpoints` table; aput /
aget round-trip preserves per-thread identity (sanity check against
future deepagents wrap regressions).
- `tests/integration/test_engine.py` — 5 mock-agent tests: fake `_ainvoke`
signature widened with `**_kwargs` to accept the new `config=` arg.
- E2E real OpenRouter regression PASS 75.99 s (baseline 71–122 s); within
DR-3 acceptance threshold (≤+20%).
- **v0.2 PR #1 — Postgres migration**: production backing store switched from
SQLite to PostgreSQL 16 ahead of M8-Py (FastAPI) per DR-2.
- `pyproject.toml`: `asyncpg>=0.30` + `psycopg[binary]>=3.2` +