feat(my-deepagent): v0.2 PR #1 — Postgres migration (ahead of M8-Py FastAPI)

Switches the production backing store from SQLite to PostgreSQL 16, per DR-2.
The trigger for the migration is two concurrent writers on the my-deepagent
ORM tables, which first happens with FastAPI (M8-Py). Making the cut now keeps
the surface area small while M8-Py is still in planning.

Production deps: `asyncpg`, `psycopg[binary]`, `langgraph-checkpoint-postgres`.
Test deps: `aiosqlite` (the bulk of unit + integration tests stay on sqlite
tmp_path for speed; the E2E suite and the new checkpointer tests exercise
the live Postgres path).

Highlights
- `persistence/db.py`: dialect-aware connect listener. SQLite still gets
  WAL + busy_timeout=5000 + foreign_keys=ON; Postgres gets `SET TIME ZONE 'UTC'`.
  Added `Database.dialect_name` + `drop_schema` (test-only).
- `persistence/checkpointer.py`: SqliteSaver → AsyncPostgresSaver. API is
  now async (`async with`) and takes a connection string. SQLAlchemy URL
  prefixes (`+asyncpg`, `+psycopg`) are auto-stripped to a plain libpq DSN
  (`_to_psycopg_dsn` helper, 4 unit tests).
- `persistence/upsert.py` (new): `insert_for(session)` — dialect-aware UPSERT
  helper. Picks `postgresql.insert` or `sqlite.insert` based on the bound
  engine. Replaces 5 hardcoded `sqlite_insert` call sites in `budget.py`,
  `recovery.py`, `cli/doctor.py`.
- `persistence/models.py`: `RunRow` partial unique index declares both
  `postgresql_where=` and `sqlite_where=` for cross-dialect correctness.
- `config.py`: default `database_url` now
  `postgresql+asyncpg://devflow:devflow@localhost:55432/mydeepagent`. v3
  `devflow` DB preserved untouched; v4 lives in a fresh `mydeepagent` DB.
- `cli/doctor.py` check 8: dialect-aware DB liveness probe. Postgres path
  runs `SELECT 1` (pg_isready equivalent); SQLite keeps `PRAGMA integrity_check`.
- `alembic/env.py`: env-aware URL resolution (`MYDEEPAGENT_DATABASE_URL` >
  `DATABASE_URL` > default). Async driver prefixes are mapped to the sync
  equivalents alembic needs.
- `alembic/versions/9f2a6c79667e_v0_2_baseline_schema_postgres.py` (new):
  fresh baseline autogenerated against live Postgres. Old SQLite migrations
  (`79945fdc2649`, `839f2233e346`) deleted — v0.2 starts a clean history.
- `tests/conftest.py` (new): `pg_db_url` async fixture creates a fresh DB
  per test against docker-compose `devflow-postgres` and drops it on
  teardown after terminating lingering backends.
- `tests/integration/test_checkpointer.py`: rewritten for AsyncPostgresSaver
  (4 pure DSN-converter unit tests + 3 async context-manager integration tests).
- `tests/integration/test_e2e_workflow.py`: switched to `pg_db_url`. Real
  OpenRouter E2E now exercises the production Postgres path end-to-end.
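
For reference, the `alembic/env.py` URL precedence described in Highlights can be sketched as below. This is a minimal illustration, not the code in this commit: `resolve_alembic_url` and the `_SYNC_DRIVERS` table are hypothetical names, and the async-to-sync pairs (`asyncpg` to `psycopg`, `aiosqlite` to the plain `sqlite` driver) are assumptions based on the drivers listed above.

```python
from __future__ import annotations

from collections.abc import Mapping

# Default taken from the `config.py` bullet in this commit message.
DEFAULT_URL = "postgresql+asyncpg://devflow:devflow@localhost:55432/mydeepagent"

# Assumed async -> sync driver pairs; the real table in alembic/env.py may differ.
_SYNC_DRIVERS = {
    "postgresql+asyncpg": "postgresql+psycopg",
    "sqlite+aiosqlite": "sqlite",
}


def resolve_alembic_url(env: Mapping[str, str]) -> str:
    """Pick the URL by precedence, then swap an async driver for its sync twin."""
    # MYDEEPAGENT_DATABASE_URL > DATABASE_URL > default
    url = env.get("MYDEEPAGENT_DATABASE_URL") or env.get("DATABASE_URL") or DEFAULT_URL
    scheme, sep, rest = url.partition("://")
    if not sep:
        return url  # no scheme separator; leave the value untouched
    return f"{_SYNC_DRIVERS.get(scheme, scheme)}://{rest}"
```

Taking the environment as a mapping, rather than reading `os.environ` inside the function, keeps the precedence rule testable without monkeypatching; a caller would pass `os.environ`.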

Recovery
- Previous SQLite database at the platformdirs data_dir is NOT auto-migrated;
  v0.1.0 was the only release that wrote to it. Set
  `MYDEEPAGENT_DATABASE_URL=sqlite+aiosqlite:///<path>` to read it.
- The v3 `devflow` Postgres DB is preserved untouched (separate database
  name); to inspect: `psql -h localhost -p 55432 -U devflow -d devflow`.

Gates
- ruff check + ruff format --check + mypy --strict: PASS (102 source files)
- pytest non-E2E: 576 PASS (5.46 s)
- pytest E2E real OpenRouter on Postgres: 1 PASS (122.93 s, ~$0.05/run)

Committed with --no-verify: lefthook is still TS-only (deleted in 0e61b2d
but still queryable in git history).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
chungyeong
2026-05-16 18:11:19 +09:00
parent 55be4f3aa0
commit e21a5241bf
17 changed files with 730 additions and 936 deletions

diff --git a/tests/conftest.py b/tests/conftest.py

@@ -0,0 +1,80 @@
+"""Test fixtures shared across unit + integration tests.
+
+v0.2 PR #1: tests run against the live Postgres container managed by
+docker-compose. Each test that needs DB isolation requests the
+``pg_db_url`` fixture, which creates a fresh database per test and drops it
+on teardown.
+
+Prerequisites:
+    docker compose up -d postgres  # devflow-postgres on 55432
+"""
+
+from __future__ import annotations
+
+import os
+import uuid
+from collections.abc import AsyncIterator
+from typing import Final
+
+import psycopg
+import pytest_asyncio
+
+# Maintenance connection: used only to CREATE DATABASE / DROP DATABASE.
+# `postgres` is the bootstrap DB present on every Postgres install.
+_MAINTENANCE_DSN: Final[str] = os.environ.get(
+    "MYDEEPAGENT_TEST_MAINTENANCE_DSN",
+    "postgresql://devflow:devflow@localhost:55432/postgres",
+)
+
+
+def _async_url(db_name: str) -> str:
+    """Return the SQLAlchemy + asyncpg URL for *db_name*."""
+    return f"postgresql+asyncpg://devflow:devflow@localhost:55432/{db_name}"
+
+
+def _create_test_database() -> str:
+    """Create a fresh test database with a random suffix and return its name."""
+    db_name = f"test_{uuid.uuid4().hex[:16]}"
+    # autocommit=True is required for CREATE DATABASE (cannot run in a tx block).
+    with psycopg.connect(_MAINTENANCE_DSN, autocommit=True) as conn:
+        with conn.cursor() as cur:
+            cur.execute(f'CREATE DATABASE "{db_name}"')
+    return db_name
+
+
+def _drop_test_database(db_name: str) -> None:
+    """Forcefully terminate connections and drop *db_name*. Idempotent."""
+    with psycopg.connect(_MAINTENANCE_DSN, autocommit=True) as conn:
+        with conn.cursor() as cur:
+            # Kick any lingering connections held by engine pools that didn't
+            # dispose cleanly. (DROP DATABASE ... WITH (FORCE) needs Postgres 13+.)
+            cur.execute(
+                """
+                SELECT pg_terminate_backend(pid)
+                FROM pg_stat_activity
+                WHERE datname = %s AND pid <> pg_backend_pid()
+                """,
+                (db_name,),
+            )
+            cur.execute(f'DROP DATABASE IF EXISTS "{db_name}"')
+
+
+@pytest_asyncio.fixture
+async def pg_db_url() -> AsyncIterator[str]:
+    """Yield an isolated Postgres database URL; drop the DB on teardown.
+
+    Usage::
+
+        async def test_something(pg_db_url: str) -> None:
+            db = Database(pg_db_url)
+            await db.init_schema()
+            ...
+
+    The returned URL uses the asyncpg driver. To get a sync URL for tools
+    like alembic, replace ``postgresql+asyncpg://`` with ``postgresql+psycopg://``.
+    """
+    db_name = _create_test_database()
+    try:
+        yield _async_url(db_name)
+    finally:
+        _drop_test_database(db_name)
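
An identifier such as a database name cannot be passed as a bound parameter, which is why the fixture formats it into the `CREATE DATABASE` / `DROP DATABASE` text. That is safe here because the name is built from `uuid4().hex`. A minimal stdlib sketch of making that assumption explicit follows; `new_test_db_name` and `quote_db_name` are hypothetical helpers, not part of this commit. (psycopg's `psycopg.sql.Identifier` is the library route for the same job.)

```python
import re
import uuid

# Lowercase letters, digits and underscores only; Postgres identifiers are
# limited to 63 bytes, hence the {0,62} after the first character.
_SAFE_NAME = re.compile(r"[a-z_][a-z0-9_]{0,62}")


def new_test_db_name() -> str:
    """Same naming scheme as the fixture: `test_` plus 16 hex characters."""
    return f"test_{uuid.uuid4().hex[:16]}"


def quote_db_name(name: str) -> str:
    """Return the double-quoted identifier, rejecting anything unexpected."""
    if not _SAFE_NAME.fullmatch(name):
        raise ValueError(f"unsafe database name: {name!r}")
    return f'"{name}"'
```

With a check like this, a name that ever came from somewhere other than `uuid4()` would fail loudly instead of being spliced into SQL.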

diff --git a/tests/integration/test_checkpointer.py b/tests/integration/test_checkpointer.py

@@ -1,78 +1,61 @@
-"""Integration tests for src/my_deepagent/persistence/checkpointer.py."""
+"""Integration tests for src/my_deepagent/persistence/checkpointer.py.
+
+v0.2 PR #1: rewritten for AsyncPostgresSaver (LangGraph Postgres checkpointer).
+The legacy SqliteSaver / Path-based API is removed.
+
+Requires the docker-compose `devflow-postgres` container; the ``pg_db_url``
+fixture from ``tests/conftest.py`` creates a fresh DB per test.
+"""
 
 from __future__ import annotations
 
-import sqlite3
-from pathlib import Path
-
 import pytest
 
-from my_deepagent.persistence.checkpointer import get_checkpointer_ctx
+from my_deepagent.persistence.checkpointer import _to_psycopg_dsn, get_checkpointer_ctx
+
+
+class TestToPsycopgDsn:
+    """Pure-function tests for the SQLAlchemy → libpq DSN converter."""
+
+    def test_strips_asyncpg_prefix(self) -> None:
+        url = "postgresql+asyncpg://u:p@h:1/d"
+        assert _to_psycopg_dsn(url) == "postgresql://u:p@h:1/d"
+
+    def test_strips_psycopg_prefix(self) -> None:
+        url = "postgresql+psycopg://u:p@h:1/d"
+        assert _to_psycopg_dsn(url) == "postgresql://u:p@h:1/d"
+
+    def test_bare_postgres_url_passes_through(self) -> None:
+        url = "postgresql://u:p@h:1/d"
+        assert _to_psycopg_dsn(url) == url
+
+    def test_non_postgres_url_passes_through(self) -> None:
+        url = "sqlite:///x"
+        assert _to_psycopg_dsn(url) == url
 
 
 @pytest.mark.integration
 class TestGetCheckpointerCtx:
-    """Tests for the get_checkpointer_ctx context manager."""
-
-    def test_ctx_yields_saver_and_cleans_up(self, tmp_path: Path) -> None:
-        """Entering the context yields a SqliteSaver; exiting releases the connection."""
-        db_path = tmp_path / "ck.db"
-        with get_checkpointer_ctx(db_path) as saver:
-            assert saver is not None
-            # The DB file must exist while inside the context.
-            assert db_path.exists()
-        # After context exit the file must still exist (not deleted).
-        assert db_path.exists()
-
-    def test_db_file_created_on_enter(self, tmp_path: Path) -> None:
-        """The sqlite file is created when the context is entered."""
-        db_path = tmp_path / "nested" / "dir" / "ck.db"
-        assert not db_path.exists()
-        with get_checkpointer_ctx(db_path):
-            assert db_path.exists()
-
-    def test_parent_dir_created_if_missing(self, tmp_path: Path) -> None:
-        """Parent directory is created automatically even if it does not exist."""
-        db_path = tmp_path / "a" / "b" / "c" / "ck.db"
-        assert not db_path.parent.exists()
-        with get_checkpointer_ctx(db_path):
-            assert db_path.parent.exists()
-
-    def test_connection_released_after_ctx_exit(self, tmp_path: Path) -> None:
-        """After exiting the context manager, another process/connection can open the DB."""
-        db_path = tmp_path / "ck.db"
-        with get_checkpointer_ctx(db_path):
-            pass  # enter and exit
-
-        # If the connection were leaked (not closed), WAL mode can still allow reads,
-        # but we verify by opening with a fresh sqlite3 connection; this must succeed.
-        with sqlite3.connect(str(db_path)) as conn:
-            cur = conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
-            # LangGraph creates its checkpoint tables; result must be a list (not error).
-            tables = [row[0] for row in cur.fetchall()]
-        assert isinstance(tables, list)
-
-    def test_meta_and_checkpoint_db_no_lock_conflict(self, tmp_path: Path) -> None:
-        """Using two separate DB files in the same directory causes no locking conflict."""
-        meta_db = tmp_path / "meta.db"
-        ck_db = tmp_path / "checkpoints.db"
-
-        # Simulate concurrent use: open both within the same scope.
-        with get_checkpointer_ctx(ck_db) as saver:
-            # Write something to the meta DB while the checkpointer holds its connection.
-            with sqlite3.connect(str(meta_db)) as conn:
-                conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
-                conn.execute("INSERT OR REPLACE INTO kv VALUES ('key', 'value')")
-                conn.commit()
+    """Tests for the async get_checkpointer_ctx context manager."""
 
+    @pytest.mark.asyncio
+    async def test_ctx_yields_saver(self, pg_db_url: str) -> None:
+        """Entering the async context yields a non-None saver."""
+        async with get_checkpointer_ctx(pg_db_url) as saver:
             assert saver is not None
 
-        # Both files must exist and be independently readable.
-        assert meta_db.exists()
-        assert ck_db.exists()
+    @pytest.mark.asyncio
+    async def test_setup_is_idempotent(self, pg_db_url: str) -> None:
+        """``saver.setup()`` is invoked on entry; entering twice must not error."""
+        async with get_checkpointer_ctx(pg_db_url) as first:
+            assert first is not None
+        # A second open against the same DB must not raise; setup() is idempotent.
+        async with get_checkpointer_ctx(pg_db_url) as second:
+            assert second is not None
 
-        with sqlite3.connect(str(meta_db)) as conn:
-            row = conn.execute("SELECT v FROM kv WHERE k='key'").fetchone()
-
-        assert row is not None
-        assert row[0] == "value"
+    @pytest.mark.asyncio
+    async def test_accepts_sqlalchemy_url(self, pg_db_url: str) -> None:
+        """SQLAlchemy-style ``postgresql+asyncpg://`` URLs are accepted."""
+        assert pg_db_url.startswith("postgresql+asyncpg://")
+        async with get_checkpointer_ctx(pg_db_url) as saver:
+            assert saver is not None
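
The four `TestToPsycopgDsn` cases pin down the converter's observable behaviour even though its source is not part of this excerpt. A minimal sketch that satisfies those cases is shown below; it is named `to_psycopg_dsn` (no leading underscore) to mark it as a stand-in, and the real `_to_psycopg_dsn` in `persistence/checkpointer.py` may be written differently.

```python
# SQLAlchemy-style prefixes that libpq/psycopg does not understand.
_SQLALCHEMY_PREFIXES = ("postgresql+asyncpg://", "postgresql+psycopg://")


def to_psycopg_dsn(url: str) -> str:
    """Strip a `+driver` suffix from a Postgres URL; pass anything else through."""
    for prefix in _SQLALCHEMY_PREFIXES:
        if url.startswith(prefix):
            return "postgresql://" + url[len(prefix):]
    return url
```

Matching on the full prefix, rather than splitting on `+`, leaves passwords or database names that happen to contain a `+` untouched.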

diff --git a/tests/integration/test_e2e_workflow.py b/tests/integration/test_e2e_workflow.py

@@ -98,7 +98,7 @@ def _make_pricing() -> PricingCache:
 
 @pytest.mark.asyncio
 @pytest.mark.timeout(600)  # 10 minute hard limit for slow LLM responses
-async def test_e2e_spec_and_review_workflow(tmp_path: Path) -> None:
+async def test_e2e_spec_and_review_workflow(tmp_path: Path, pg_db_url: str) -> None:
     """Real OpenRouter call: full spec-and-review@1 workflow end-to-end.
 
     Persona binding (all pinned via BindingOverride for determinism):
@@ -112,16 +112,20 @@ async def test_e2e_spec_and_review_workflow(tmp_path: Path) -> None:
 
     Cost estimate: ~$0.01-$0.05 for 3 phases with max_tokens=4096 each.
     """
-    # ---- Setup: config overrides pointing to tmp_path ----
+    # ---- Setup: config overrides pointing to tmp_path + isolated Postgres DB.
+    # `pg_db_url` is the v0.2-PR-1 conftest fixture that creates a fresh
+    # Postgres DB per test (against docker-compose `devflow-postgres`) and
+    # drops it on teardown. This is the only test in the suite that exercises
+    # the production Postgres path end-to-end; the bulk of unit + integration
+    # tests still use sqlite+aiosqlite tmp_path for speed.
     ws_root = tmp_path / "ws"
     ws_root.mkdir(parents=True, exist_ok=True)
-    db_path = tmp_path / "e2e.sqlite"
 
     config = load_config(
         workspace_root=ws_root,
         data_dir=tmp_path / "data",
         state_dir=tmp_path / "state",
-        database_url=f"sqlite+aiosqlite:///{db_path}",
+        database_url=pg_db_url,
         budget_on_hit="warn_continue",  # do not block during E2E test
         budget_run_usd=5.0,  # generous cap for E2E
         budget_daily_usd=10.0,