Foundation for `runs resume` (v0.2 PR #2b).

v0.2 PR #1 added langgraph-checkpoint-postgres as a dependency, but engine.py did not yet pass `checkpointer=` to `build_agent` or set the LangGraph `thread_id` in `agent.ainvoke`, so resume had no state to restore. This commit wires the dependency in.

Highlights

- `WorkflowEngine.__init__` accepts `checkpointer_url: str | None` (default = `config.database_url`).
- `_maybe_open_saver` async context: opens AsyncPostgresSaver for postgresql{,+asyncpg,+psycopg}:// URLs and yields None for `sqlite+aiosqlite://`. The SQLite branch is a test affordance only: production is always Postgres per DR-2 / DR-3, and langgraph-checkpoint-sqlite is not in deps.
- `WorkflowEngine.run()` opens the saver **once per run** and shares it across all phases. Opening per-phase would reconnect 5+ times for no isolation gain, because LangGraph checkpoints are keyed by `thread_id`, not by saver instance.
- `_invoke_agent_until_artifact` forwards `checkpointer=self._saver` to `build_agent` and passes `config={"configurable": {"thread_id": "run:<uuid>:phase:<uuid>"}}` to `agent.ainvoke`. The same thread_id format is already used by `LlmCallRow.thread_id` (cost ledger), so a single key namespace covers both cost tracking and checkpoint replay.

Tests

- `tests/integration/test_engine_checkpointer_wiring.py` (new, 2 tests):
  1. Engine wiring contract: spy on `build_agent` to capture its kwargs, then assert that `checkpointer` is non-None and that `agent.ainvoke` receives the expected `config.configurable.thread_id` in `run:<uuid>:phase:<uuid>` format.
  2. LangGraph thread isolation: distinct thread_ids write to independent rows in the auto-created `checkpoints` table, and an aput / aget round-trip preserves per-thread identity (sanity check against future deepagents wrap regressions).
- `tests/integration/test_engine.py`: 5 mock-agent tests had fake `_ainvoke(messages)` signatures; these are widened to `(messages, **_kwargs)` so they accept the new `config=` arg without any behavior change.
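
For reviewers, a minimal sketch of the two mechanisms described above: dispatching on the URL scheme and building the shared thread_id key. It is not the code in engine.py. The names `make_thread_id`, `wants_postgres_saver`, `maybe_open_saver`, and `fake_open` are hypothetical, and the real saver is replaced by a stand-in opener so the sketch runs without a database.

```python
from __future__ import annotations

import asyncio
import uuid
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator, Callable
from urllib.parse import urlsplit

# URL schemes treated as Postgres, taken from the highlight bullet.
POSTGRES_SCHEMES = {"postgresql", "postgresql+asyncpg", "postgresql+psycopg"}


def make_thread_id(run_id: uuid.UUID, phase_id: uuid.UUID) -> str:
    """Build the run:<uuid>:phase:<uuid> key (hypothetical helper)."""
    return f"run:{run_id}:phase:{phase_id}"


def wants_postgres_saver(url: str) -> bool:
    """Return True when the URL scheme is one of the Postgres variants."""
    return urlsplit(url).scheme in POSTGRES_SCHEMES


@asynccontextmanager
async def maybe_open_saver(
    url: str, open_saver: Callable[[str], Any]
) -> AsyncIterator[Any | None]:
    """Yield a saver for Postgres URLs and None for anything else.

    `open_saver` is an assumed stand-in for whatever opens the real
    checkpointer; it must return an async context manager.
    """
    if not wants_postgres_saver(url):
        yield None  # e.g. sqlite+aiosqlite:// in tests
        return
    async with open_saver(url) as saver:
        yield saver


@asynccontextmanager
async def fake_open(url: str) -> AsyncIterator[str]:
    """Stand-in opener so the sketch needs no database."""
    yield f"saver-for:{url}"


async def demo() -> tuple[Any, Any]:
    """Open once per URL, mirroring the once-per-run lifetime."""
    async with maybe_open_saver("postgresql://localhost/app", fake_open) as pg:
        pg_result = pg
    async with maybe_open_saver("sqlite+aiosqlite:///test.db", fake_open) as lite:
        lite_result = lite
    return pg_result, lite_result
```

Under these assumptions, a caller would enter `maybe_open_saver` once at the start of a run and reuse the yielded object for every phase, varying only the thread_id passed in `config`.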

Gates

- ruff check + ruff format --check + mypy --strict: PASS (103 source files)
- pytest non-E2E: 582 PASS (10.55 s), up from 576 (+7 from new wiring tests, +/-1 from engine.py reshape, +/-...; settled at 582 net).
- pytest E2E, real OpenRouter on Postgres: PASS in 75.99 s (baseline 71–122 s; within the DR-3 acceptance threshold of ≤+20%).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>