docs: patch plan.md to v4 r1 (Python rewrite spec) + .gitignore node_modules

plan.md v4 r1 patches (per plan-v4-draft.md §0/§1/§2/§3/§8.5/§18/§22/§23):

- §0 header: v3 r13 → v4 r1 + note explaining the language migration. v3 CC
  counter frozen at CC-39; v4 begins its own series (DR-1 below).
- §1 Stack Decisions: full rewrite for Python (uv / pydantic v2 /
  pydantic-settings / SQLAlchemy 2 async + aiosqlite / typer + prompt_toolkit
  / structlog / FastAPI + sse-starlette).
- §2 Directory Layout: collapse v3 multi-package monorepo → single
  `my-deepagent/` project. TS `apps/`, `packages/`, `tests/`, `scripts/` are
  gone after `0e61b2d`.
- §3 doctor: 13-check (Node/pnpm/Docker/Drizzle) → 8-check (python/uv/git/
  workspace_root/config+governance/openrouter_api_key/openrouter_ping+pricing
  upsert/disk+sqlite integrity).
- §8.5 OpenRouter Adapter: full rewrite. v3 marker-extraction HTTP adapter
  (CC-39) is superseded by the deepagents 0.6.1 multi-turn tool-using agent
  driven by `my_deepagent.session.build_agent`. Native write_file/read_file/
  bash via LocalShellBackend; SafetyShellMiddleware enforces destructive
  command + deny-path policy; ArtifactWatcherMiddleware observes artifact
  writes; CostMiddleware records usage. Known v0.1.0 limits documented:
  usage_metadata empty on OpenRouter-forwarded responses, Anthropic-via-
  OpenRouter tool_calls.args ValidationError requires DeepSeek workaround.
- §18 Errors: add `token_budget_exceeded` and `tool_quota_exceeded` under
  human_required.
- §22 Decision Log: add DR-1 "v3 → v4 major bump" with rationale, scope,
  recovery path (pre-python-rewrite tag at c9fed71).
- §23 Kickoff Order: v3 historical order preserved + v4 Python step matrix
  showing Step 0~12 + Step 15 DONE, Step 13/14 (tmux/TUI recovery) DEFERRED.

§4~§17 (DB schema, enums, hashing, template/persona/binding, session
runtime, prompt envelope, artifact schema registry, run events, fake
adapter, state machines, approval state, run engine + Temporal contract,
WriteSet/worktree, SSE contract) are language-neutral domain spec and remain
unchanged for the Python implementation.

.gitignore: re-add `node_modules/` (legacy Node tree kept ignored until
`rm -rf` cleanup outside git).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
572
docs/plan.md
@@ -1,12 +1,23 @@
# Devflow Implementation Plan v3 r13
# Devflow Implementation Plan v4 r1

## 0. Document Status

- **v4 r1: language migration TS → Python.** Major version bump; the TypeScript
  monorepo (apps/, packages/, tests/, scripts/, pnpm/tsconfig metadata) was
  deleted in `0e61b2d` after being re-implemented under `my-deepagent/`.
  v3 CC counters are preserved as historical context; v4 begins its own CC
  series (DR-1 below; CC-Py-1 onward as new change clarifications land).
- This document supersedes v2 and all earlier v3 drafts where conflicting.
- Single-user, single-machine assumption. No auth, no retention policy, no observability dashboards, no multi-tenancy.
- Target OS: macOS 13+ / Linux. No Windows.
- All paths are Unix-style. All times are stored UTC.
- Decisions in this document are locked unless explicitly marked `(provisional)`. Override requires updating this document, not only code.
- §1 Stack Decisions, §2 Directory Layout, §3 doctor checklist, §22 Decision Log
  have been rewritten for v4 r1. §4~§17 (DB schema, enums, hashing, template/
  persona/binding, session runtime, prompt envelope, artifact registry, run
  events, fake adapter, state machines, errors, write set, SSE contract) are
  language-neutral domain spec and remain valid for the Python implementation.
- v3 CC history (informational):
- r1 applied CC-1 through CC-5.
- r2 applied CC-6 through CC-10.
- r3 applied CC-11 through CC-15.
@@ -19,218 +30,188 @@
- r10 applies CC-29 through CC-31.
- r11 applies CC-32.
- r12 applies CC-33 through CC-35.
- r13 applies CC-39.
- r13 applied CC-39 (final v3 revision; superseded by v4 r1).

## 1. Stack Decisions

### 1.1 Workspace

- `pnpm 9` with workspaces. No Turbo.
- Node 22 LTS, pinned by `.nvmrc` and `package.json#engines`.
- TypeScript 5.6 with project references via `tsc -b`.
- `strict: true`.
- No `any` unless accompanied by an explicit annotation comment explaining why.
- **Python 3.12+**, managed by **uv** workspaces (`uv sync`, `uv add`, `uv run`).
- Pinned via `.python-version`. No Node, no pnpm, no tsc.
- `pyproject.toml` at repo root + per-package `pyproject.toml` under
  `packages/<name>/` (uv workspace members).
- Imports are absolute. No `from . import *`.

### 1.2 Tooling

- Build:
  - `tsup` for libraries, CJS + ESM dual output.
  - `vite` for `apps/web`.
  - `tsx` for `apps/cli`, `apps/api`, and `apps/worker` in dev.
  - `node` for prod-ish local runs.
- Test:
  - `vitest` with workspace config.
  - Coverage via `@vitest/coverage-v8`.
  - No coverage gate at M1.
  - M9 adds coverage gate: >=70% lines on `packages/core`, `packages/session`, `packages/run-engine`.
- Lint/format:
  - `biome`.
  - One root config.
- Pre-commit:
  - `lefthook`.
  - Runs `biome check --write` on staged files.
  - Runs `tsc -p tsconfig.typecheck.json --noEmit`.
  - Runs related Vitest tests on changed packages.
| Concern | Choice | Notes |
|---------|--------|-------|
| Lint / format | **ruff** | One root `ruff.toml`. `ruff check .` + `ruff format --check .`. |
| Type check | **mypy --strict** | `mypy.ini` enables strict mode; tests relax `disallow_untyped_defs`. |
| Test | **pytest** + **pytest-asyncio** + **pytest-httpx** + **respx** | `pytest -q`. |
| Pre-commit | **pre-commit** (`.pre-commit-config.yaml`) | Runs ruff + mypy + pytest --collect-only. |
| Schema validation | **pydantic v2** + **pydantic-settings** | Replaces zod. |
| YAML | **PyYAML** | Persona/template YAML loaders. |
| JSON Schema | **jsonschema** (2020-12) | Artifact registry. |
| HTTP client | **httpx** (async) | OpenRouter / pricing fetch. |
| Logging | **structlog** + **rich** | Replaces pino. `_scrub_processor` redacts secrets before stderr / JSON sinks. |
| CLI | **typer** + **prompt_toolkit** | Replaces commander; prompt_toolkit drives the interactive REPL. |
| OS dirs | **platformdirs** | XDG data / state / config dirs. |
| Secrets | **keyring** | macOS Keychain / Linux Secret Service / Windows Credential Store. |

### 1.3 Database

- Postgres 16 via Docker Compose.
- Drizzle ORM + `drizzle-kit generate`.
- Generated SQL migrations are committed.
- Migrations are never auto-applied at runtime except through the explicit migration runner invoked by `devflow up`.
- Migration runner:
  - `scripts/migrate.ts`.
  - Takes `DATABASE_URL`.
  - `devflow up` waits for Postgres health and then runs pending migrations.
- **SQLite 3 (WAL mode)** via **aiosqlite**, ORM: **SQLAlchemy 2.0 async**.
- Migrations: **Alembic** (baseline + per-feature revisions).
- WAL + `busy_timeout=5000` + `PRAGMA foreign_keys=ON` enforced at connect.
- Postgres (the v3 default) is parked: single-machine + single-user removes the
  multi-process concurrency justification, and aiosqlite + the
  `ux_active_run_repo_base` partial unique index covers the active-run
uniqueness invariant. Postgres can be reinstated for multi-tenant later.
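
As a rough illustration of the connect-time settings listed for SQLite, the sketch below applies WAL, `busy_timeout=5000`, and `foreign_keys=ON` to a new connection and reads them back. It uses the standard-library `sqlite3` module instead of the SQLAlchemy 2 async + aiosqlite stack the plan specifies, and `connect_with_pragmas` / `current_pragmas` are hypothetical helper names, not part of `my_deepagent`.

```python
import sqlite3


def connect_with_pragmas(path: str) -> sqlite3.Connection:
    """Open a SQLite database and apply the connect-time settings from §1.3."""
    conn = sqlite3.connect(path)
    # journal_mode=WAL is stored in the database file; it needs an on-disk
    # database (an in-memory database reports "memory" instead).
    conn.execute("PRAGMA journal_mode=WAL")
    # busy_timeout and foreign_keys are per-connection settings, so they
    # have to be re-applied every time a connection is opened.
    conn.execute("PRAGMA busy_timeout=5000")
    conn.execute("PRAGMA foreign_keys=ON")
    return conn


def current_pragmas(conn: sqlite3.Connection) -> dict[str, object]:
    """Read the three settings back so a caller can verify them."""
    names = ("journal_mode", "busy_timeout", "foreign_keys")
    return {name: conn.execute(f"PRAGMA {name}").fetchone()[0] for name in names}
```

With SQLAlchemy, the same statements would typically run from a connection-level event hook so that every pooled connection gets them; how `my_deepagent.persistence.db` wires this is not shown in the plan.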

### 1.4 Logging

- `pino`.
- `pino-pretty` in dev, JSON otherwise.
- Standard fields:
  - `time`
  - `level`
  - `module`
  - `runId?`
  - `phaseId?`
  - `role?`
  - `eventId?`
- Levels:
  - `trace`: transcript chunks only.
  - `debug`: internal state transitions.
  - `info`: run events.
  - `warn`: recoverable errors.
  - `error`: human-required or fatal errors.
- **structlog**, JSON sink to stderr by default, rich pretty sink when stdout is
  a TTY.
- Standard fields: `time`, `level`, `module`, `run_id?`, `phase_id?`, `role?`,
  `event_id?`, `interactive_session_id?`.
- `_scrub_processor` redacts OpenRouter / Anthropic / OpenAI / LangSmith /
  GitHub / GitLab API keys and generic `Bearer …` tokens before emission.
- Levels: same semantics as v3 (`trace`/`debug`/`info`/`warn`/`error`).
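
A minimal sketch of what a scrubbing processor of this kind could look like, assuming the usual structlog processor shape `(logger, method_name, event_dict) -> event_dict`. The two regular expressions are illustrative guesses at token formats, not the patterns used by the real `_scrub_processor`, and `scrub_text` / `scrub_processor` are hypothetical names.

```python
import re
from typing import Any, MutableMapping

# Assumption: illustrative patterns only. The real processor covers more
# providers (Anthropic, OpenAI, LangSmith, GitHub, GitLab) than shown here.
_SECRET_PATTERNS = (
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"),  # generic bearer tokens
    re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),       # "sk-..." style API keys
)
_REDACTED = "[REDACTED]"


def scrub_text(value: str) -> str:
    """Replace anything that looks like a secret with a fixed marker."""
    for pattern in _SECRET_PATTERNS:
        value = pattern.sub(_REDACTED, value)
    return value


def scrub_processor(
    logger: Any, method_name: str, event_dict: MutableMapping[str, Any]
) -> MutableMapping[str, Any]:
    """Processor-shaped callable: scrub every string value in the event."""
    for key, value in list(event_dict.items()):
        if isinstance(value, str):
            event_dict[key] = scrub_text(value)
    return event_dict
```

Placing such a processor ahead of the renderer in the processor chain means both the JSON sink and the pretty sink only ever see redacted values. Nested structures (lists, dicts) are not walked in this sketch.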

### 1.5 Config

- Single Zod schema in `packages/core/src/config.ts`.
- Source precedence, high to low:
  - `process.env`
  - `.env.local`
  - `.env`
  - schema defaults
- Config is loaded once at process start, validated, frozen, and exported as typed `Config`.
- Config validation failure is fatal.
- Required keys at M1:
  - `DATABASE_URL`
  - `WORKSPACE_ROOT`
  - `LOG_LEVEL`
- Single `pydantic-settings` BaseSettings in
  `my_deepagent.config.Config` with `MYDEEPAGENT_` env prefix and optional TOML
  source.
- Source precedence (high → low): explicit overrides → `os.environ` (with
  `MYDEEPAGENT_` prefix) → `.env` → `config.toml` → schema defaults.
- Config is loaded once at process start, validated, frozen, and re-exported as
  an immutable typed `Config`.
- Validation failure is fatal (exit code 2).
- Required keys at v0.1.0:
  - `MYDEEPAGENT_DATABASE_URL` (default `sqlite+aiosqlite:///<state_dir>/db.sqlite3`)
  - `MYDEEPAGENT_WORKSPACE_ROOT`
  - `MYDEEPAGENT_LOG_LEVEL`
  - `MYDEEPAGENT_OPENROUTER_API_KEY` when the OpenRouter backend is enabled
    (resolution order: config → env → OS keyring → error).
- Path canonicalization: `workspace_root` is resolved via `Path.resolve()` at
  config load. Any path entering the system is canonicalized before storage or
hashing.
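
The v4 source precedence (explicit overrides → environment → `.env` → `config.toml` → defaults) can be pictured with a small layering sketch. This is not how `pydantic-settings` implements it; `merge_sources` and `_strip_prefix` are hypothetical helpers, and the assumption that `.env` entries carry the same `MYDEEPAGENT_` prefix as process environment variables is mine.

```python
from collections import ChainMap
from typing import Mapping

ENV_PREFIX = "MYDEEPAGENT_"


def _strip_prefix(source: Mapping[str, str]) -> dict[str, str]:
    """Keep only prefixed keys and normalize them to lowercase field names."""
    return {
        key[len(ENV_PREFIX):].lower(): value
        for key, value in source.items()
        if key.startswith(ENV_PREFIX)
    }


def merge_sources(
    overrides: Mapping[str, str],
    environ: Mapping[str, str],
    dotenv: Mapping[str, str],
    toml: Mapping[str, str],
    defaults: Mapping[str, str],
) -> dict[str, str]:
    """Layer the five sources; the first mapping that has a key wins."""
    layered = ChainMap(
        dict(overrides),
        _strip_prefix(environ),
        _strip_prefix(dotenv),  # assumption: .env uses the same prefix
        dict(toml),
        dict(defaults),
    )
    return dict(layered)
```

The merged dictionary would then be validated once and frozen; validation itself is out of scope for this sketch.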

Additional required keys when `openrouter` backend is enabled:
Backend registration (deepagents-flavored):

- `OPENROUTER_API_KEY`

- M5 adds:
  - `TEMPORAL_ADDRESS`
- Path canonicalization:
  - `WORKSPACE_ROOT` is resolved through `fs.realpathSync` and stored as an absolute path at config load.
  - Any path entering the system must be canonicalized before storage or hashing.
  - `repo_path` and `worktree_root` rules are defined in section 4.

Backend registration:

```ts
const BackendConfig = z.object({
  id: Backend, // codex | claude | fake | openrouter
  enabled: z.boolean(),
  binaryPath: z.string().optional(), // resolved from PATH if absent; required for codex/claude when enabled
  apiBaseUrl: z.string().optional(), // openrouter only; default https://openrouter.ai/api/v1
  apiKeyEnv: z.string().optional(), // openrouter only; default OPENROUTER_API_KEY
});
```python
class BackendConfig(BaseModel, frozen=True):
    id: Backend  # openrouter | anthropic | openai | google | fake
    enabled: bool
    api_base_url: str | None = None  # openrouter default https://openrouter.ai/api/v1
    api_key_env: str | None = None  # default MYDEEPAGENT_OPENROUTER_API_KEY
```

- `fake` is always available.
- `codex` and `claude` are available only when:
  - `enabled=true`
  - binary resolves at process start.
- `openrouter` is available only when:
  - `enabled=true`
  - the env var named by `apiKeyEnv` (default `OPENROUTER_API_KEY`) is present and non-empty.
- `binaryPath` is ignored for `openrouter`.
- Resolution failure:
  - `doctor` warns.
  - binding fails fast at run start with `human_required:backend_unavailable`.
- Binding reads from `config.backends`, never directly from `PATH`.
- `openrouter` is available only when enabled and the resolved key is present.
- Doctor warns on misconfig; binding fails fast at run start with
  `human_required:backend_unavailable`.

### 1.6 HTTP
### 1.6 HTTP / SSE

- `fastify` 5.
- `@fastify/sensible`.
- SSE primary strategy:
  - Try `fastify-sse-v2`.
  - Fastify 5 compatibility is not assumed.
  - M1 includes a smoke test.
- SSE fallback:
  - Native `reply.raw`.
  - Headers:
    - `content-type: text/event-stream`
    - `cache-control: no-cache`
    - `connection: keep-alive`
  - Write `data: <json>\n\n`.
  - Manage heartbeats and reconnect manually.
- WebSocket is deferred unless SSE fails under transcript volume.
- **FastAPI** + **uvicorn** + **sse-starlette** for the M8-Py REST + SSE
  surface (v3 r13 §17 contract unchanged: same event types, same headers,
  same `data: <json>\n\n` framing).
- Body validation via the same pydantic v2 models used elsewhere.
- WebSocket remains deferred unless SSE fails under transcript volume.

## 2. Directory Layout

v4 r1 collapses the v3 multi-package monorepo into a single `my-deepagent/`
project. The TS `apps/`, `packages/`, `tests/`, `scripts/` trees were deleted
in `0e61b2d`; v3 §4~§17 module-by-module spec still applies but each module
now lives under `my_deepagent/<name>.py` instead of
`packages/<name>/src/<name>.ts`.

```text
devflow/
├── package.json
├── pnpm-workspace.yaml
├── tsconfig.base.json
├── biome.json
├── lefthook.yml
├── vitest.workspace.ts
├── docker-compose.yml
├── .nvmrc
├── .env.example
<repo-root>/
├── docs/
│   ├── plan.md
│   ├── adr/
│   ├── plan.md # this document
│   ├── plan-v4-draft.md # v4 r1 design memo (informational)
│   └── schemas/
│       ├── artifacts/
│       ├── personas/
│       └── templates/
├── scripts/
│   ├── migrate.ts
│   └── seed.ts
├── packages/
│   ├── core/
│   │   └── src/
│   │       ├── config.ts
│   │       ├── enums.ts
│   │       ├── hash.ts
│   │       ├── errors.ts
│   │       ├── template.ts
│   │       ├── persona.ts
│   │       ├── binding.ts
│   │       ├── prompt-envelope.ts
│   │       ├── artifact-schema.ts
│   │       ├── run-event.ts
│   │       └── index.ts
│   ├── db/
│   │   └── src/
│   │       ├── schema/
│   │       ├── migrations/
│   │       ├── repositories/
│   │       └── client.ts
│   ├── session/
│   │   └── src/
│   │       ├── adapter.ts
│   │       ├── fake.ts
│   │       ├── tmux.ts
│   │       ├── profiles/
│   │       │   ├── codex.ts
│   │       │   └── claude.ts
│   │       ├── recovery.ts
│   │       └── transcript.ts
│   ├── harness/
│   │   └── src/
│   │       ├── git.ts
│   │       ├── worktree.ts
│   │       ├── runner.ts
│   │       ├── review.ts
│   │       └── backtest.ts
│   ├── run-engine/
│   │   └── src/
│   │       ├── engine.ts
│   │       ├── phase-executor.ts
│   │       └── approval.ts
│   └── workflows/
│       └── src/
│           ├── workflow.ts
│           └── activities.ts
├── apps/
│   ├── api/
│   ├── web/
│   ├── cli/
│   └── worker/
└── tests/
    ├── e2e/
    └── fixtures/
│       ├── artifacts/ # JSON Schema 2020-12 (language-neutral)
│       ├── personas/ # YAML persona seed (language-neutral)
│       └── templates/ # YAML workflow templates
├── docker-compose.yml # Postgres + Temporal (still relevant for M5-Py)
├── .env.example
├── .gitignore
├── my-deepagent-seed/ # v0.1.0 bootstrap kit (historical, may be pruned)
└── my-deepagent/
    ├── pyproject.toml # uv workspace root
    ├── uv.lock
    ├── ruff.toml
    ├── mypy.ini
    ├── alembic.ini
    ├── .pre-commit-config.yaml
    ├── CHANGELOG.md
    ├── alembic/
    │   ├── env.py
    │   └── versions/ # baseline + per-feature migrations
    ├── docs/schemas/ # mirror of repo-root docs/schemas for loader convenience
    ├── src/my_deepagent/
    │   ├── config.py # pydantic-settings Config (replaces §1.5 zod schema)
    │   ├── enums.py # closed-set enums (§5)
    │   ├── errors.py # error taxonomy (§18)
    │   ├── hash.py # content-addressed hashing (§6)
    │   ├── persona.py # Persona + loader (§7.2)
    │   ├── workflow.py # WorkflowTemplate + loader (§7.1)
    │   ├── binding.py # autoSelect / override / consent store (§7.4)
    │   ├── artifact_schema.py # JSON Schema 2020-12 registry (§10)
    │   ├── run_event.py # event types + idempotency keys (§11, §13.1)
    │   ├── prompt_envelope.py # envelope builder (§9)
    │   ├── budget.py # BudgetTracker (v4-new)
    │   ├── secrets.py # config → env → keyring resolution chain
    │   ├── keys.py # OS keyring wrapper
    │   ├── audit.py # append-only JSONL audit log (v4-new)
    │   ├── logging.py # structlog + secret scrubber (§1.4)
    │   ├── governance.py # first-run consent (v4-new)
    │   ├── i18n/ # ko / en catalog
    │   ├── recovery.py # sweep_orphan_runs (§19)
    │   ├── session.py # deepagents adapter (§8.5, v4-new)
    │   ├── engine.py # WorkflowEngine: phase loop (§15)
    │   ├── persistence/
    │   │   ├── db.py # SQLAlchemy 2 async engine
    │   │   ├── models.py # ORM models (§4)
    │   │   └── checkpointer.py # LangGraph SqliteSaver context
    │   ├── middleware/
    │   │   ├── cost.py # CostMiddleware (v4-new)
    │   │   ├── budget.py # BudgetMiddleware (v4-new)
    │   │   ├── audit.py # AuditToolMiddleware
    │   │   ├── safety.py # SafetyShellMiddleware (deny-path / destructive command)
    │   │   └── artifact_watcher.py # ArtifactWatcherMiddleware
    │   ├── monitoring/
    │   │   ├── pricing.py # OpenRouter pricing cache
    │   │   └── cost_estimator.py # pre-run preview
    │   ├── cli/ # typer-driven CLI
    │   │   ├── main.py # entry (interactive REPL when no subcommand)
    │   │   ├── doctor.py # §3 doctor checks (Python/uv version)
    │   │   ├── init.py
    │   │   ├── keys_cmd.py
    │   │   ├── run.py
    │   │   ├── runs.py
    │   │   ├── stats.py
    │   │   └── interactive.py # prompt_toolkit REPL
    │   ├── tui/
    │   │   └── approval.py # tri-state approval prompt
    │   └── slash.py # REPL slash commands
    └── tests/
        ├── unit/ # pure-Python unit tests
        └── integration/ # async + persistence + real OpenRouter (gated)
```

## 3. `devflow doctor`
Future trees deferred:
- `apps/api/`, `apps/worker/` (M5-Py / M8-Py): FastAPI app and temporalio
  worker. v4 r1 keeps them out until M5 lands.
- `apps/web/`: Web GUI port is out of scope for v0.1.0 (separate milestone).

## 3. `mydeepagent doctor`

Exit codes:

@@ -245,34 +226,42 @@ Each check emits:
- `detail`
- `remediation`

Closed check list:
Closed check list (v4 r1, 8 checks; Node/pnpm/Docker/Drizzle dropped):

1. Node version satisfies `>=22.0.0 <23`.
2. pnpm version `>=9.0.0`.
3. `tmux` exists, version `>=3.3`.
4. `git` version `>=2.40`.
5. Docker daemon reachable.
6. Postgres container running, `pg_isready` ok, `DATABASE_URL` connects.
7. No pending Drizzle migrations.
8. `WORKSPACE_ROOT` exists and is writable.
9. `.env` resolves to valid `Config`.
10. `codex` in `PATH`, warn-only.
11. `claude` in `PATH`, warn-only.
12. Free disk on `WORKSPACE_ROOT` partition:
    - warn under 10GB.
    - fail under 2GB.
    - target green threshold: >=5GB.
13. OpenRouter API reachable: when `openrouter` backend is enabled, `GET ${apiBaseUrl}/models` with the bearer key.
    - pass on `200`.
    - fail on `401`.
    - warn on any other non-200 or network error.
1. **python**: `python --version` satisfies `>=3.12,<3.14`.
2. **uv**: `uv --version` resolves (any).
3. **git**: `git --version` `>=2.40`.
4. **workspace_root**: `MYDEEPAGENT_WORKSPACE_ROOT` exists, is a directory,
   and is writable.
5. **config+governance**: `Config` loads from env + `.env` + `config.toml`
   without ValidationError; first-run governance consent file exists (or is
   created interactively on first run only).
6. **openrouter_api_key**: resolution chain (config → env → OS keyring)
   yields a non-empty value. Warn-only when the OpenRouter backend is not
   enabled.
7. **openrouter_ping + pricing upsert**: `GET https://openrouter.ai/api/v1/models`
   with the bearer key.
    - `200` → pass; pricing rows are upserted into `model_pricing` for use by
      the `mydeepagent run` cost preview.
    - `401` → fail.
    - any other non-200 / network error → warn.
8. **disk+sqlite integrity**:
    - Free disk on the `workspace_root` partition: warn under 10 GB, fail under
      2 GB, green target ≥ 5 GB.
    - SQLite DB file (if present) opens and `PRAGMA integrity_check` returns
`ok`.

Output:

- Human table by default.
- Rich human table by default.
- `--json` for machine-readable output.
- `--quiet` prints only nonzero results.
- `--list-orphans` lists orphaned worktrees only; it never removes them.

Notes:
- `tmux` / `Docker` / `Postgres` / `pg_isready` / drizzle migration checks from
  v3 §3 are dropped in v4 r1: the v0.1.0 runtime is SQLite-only and tmux is
  out of scope for the deepagents-driven session model.
- `--list-orphans` and friends are owned by `mydeepagent runs list/show` (§19).

## 4. Database Schema

@@ -882,53 +871,91 @@ Exhaustion creates a human gate with `recoveryHint`.
- persist `last_capture_seq`.
- release advisory lock.

### 8.5 OpenRouter Adapter
### 8.5 OpenRouter Adapter: v4 r1 deepagents rewrite

HTTP-based `SessionAdapter` for the `openrouter` backend. No PTY, no tmux.
**Supersedes the v3 marker-extraction HTTP adapter (CC-39).** In v4 the
OpenRouter integration is a multi-turn, tool-using agent driven by LangChain
`deepagents` 0.6.1: no single-shot completions, no `<<<DEVFLOW_ARTIFACT_*>>>`
markers, no transcript replay reconstruction.

Method mapping:
Construction: `my_deepagent.session.build_agent(persona, run_id, …)`:

- `start`:
  - allocate in-memory session state `{ messages: [], lastResponseAt }`.
  - push the backend prelude (§9.4) as a `system` message.
- `sendPrompt`:
  - append the envelope `instructions` (full §9.1 envelope text) as a `user` message.
  - POST `${apiBaseUrl}/chat/completions` with `Authorization: Bearer ${apiKey}` and body `{ model: persona.modelConfig.model, messages, max_tokens?, temperature?, top_p? }`.
  - append the assistant response as an `assistant` message.
- `probe`:
  - alive iff session state is held in the SessionManager map.
  - `paneActive` is always `true`.
- `resume`:
  - in-memory messages are lost on process restart.
  - attempt restoration by replaying `tui_transcript_chunks` for the session into the messages array.
  - on irrecoverable failure, fall through to `rebootstrap`.
- `rebootstrap`:
  - clear messages and re-push the prelude.
- `capture`:
  - split assistant responses into line-sized `TranscriptChunk`s and persist via the standard chunk pipeline.
- `dispose`:
  - drop the in-memory entry.
```python
llm = ChatOpenAI(
    model=persona.model,  # e.g. "openrouter:deepseek/deepseek-chat"
    base_url=config.openrouter_api_base,  # https://openrouter.ai/api/v1
    api_key=resolve_openrouter_api_key(),
    timeout=persona.model_params.timeout,
)
agent = deepagents.create_deep_agent(
    model=llm,
    tools=[],  # base tools come from LocalShellBackend
    instructions=persona.system_prompt,
    subagents=[_subagent_to_dict(s) for s in persona.subagents],
    middleware=[
        SafetyShellMiddleware(...),  # destructive command + deny-path guard
        AuditToolMiddleware(...),  # append-only JSONL audit log
        ArtifactWatcherMiddleware(...),  # write_file/edit_file detection
        CostMiddleware(...),  # usage_metadata + budget ledger
    ],
    backend=LocalShellBackend(  # bash + read_file + write_file + edit_file + ls
        cwd=worktree_root,
        # `permissions` kwarg is intentionally omitted for local_shell backend
        # (deepagents 0.6.1 NotImplementedError workaround; enforcement moves
        # to SafetyShellMiddleware).
    ),
)
```

Method mapping (driven by `WorkflowEngine` rather than a v3-style adapter
interface):

- **Start**: `create_deep_agent` returns a `CompiledStateGraph` per phase.
  No persistent session object is shared across phases — each phase is a
  fresh agent invocation parameterized by persona + envelope.
- **Send prompt**: `await agent.ainvoke({"messages": [HumanMessage(envelope)]})`
  where `envelope` is built by `WorkflowEngine._build_envelope` (§9 with the
  artifact JSON Schema inlined so the model sees the exact required fields).
- **Tool use**: native `read_file` / `write_file` / `edit_file` / `ls` /
  `bash` calls are emitted by the model and dispatched through
  LocalShellBackend, recorded by AuditToolMiddleware, gated by
  SafetyShellMiddleware.
- **Probe / resume / rebootstrap / dispose**: not applicable: the agent is
  ephemeral per phase. Crash recovery operates at the run/phase level via
  `sweep_orphan_runs` (§19), not at a session-adapter level.

Artifact production:

- HTTP agents cannot write to the workspace filesystem. The backend prelude (§9.4) instructs the model to emit the artifact body inside a single fenced block at the tail of the response:
- The model writes the artifact directly to `expected_artifact_path` via the
  `write_file` tool. ArtifactWatcherMiddleware observes the tool call and
  notifies the engine.
- The envelope inlines the artifact's JSON Schema definition so the LLM has
  the exact required fields.
- Schema validation is performed by `ArtifactSchemaRegistry.validate` on the
  written file (§10). On failure, the engine retries once with a repair
prompt; second failure raises `human_required:artifact_invalid_after_repair`.
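
The validate-then-repair rule for v4 artifacts can be sketched as a small loop. Everything here is illustrative: `run_phase` stands in for one agent invocation that yields the written artifact, `validate` stands in for `ArtifactSchemaRegistry.validate` (whose real signature is not shown in this plan) and is assumed to return a list of error strings, and the repair prompt wording is made up.

```python
from typing import Any, Callable


class ArtifactInvalidAfterRepair(Exception):
    """Raised when the artifact is still invalid after the single repair attempt."""

    code = "human_required:artifact_invalid_after_repair"


def produce_valid_artifact(
    run_phase: Callable[[str], Any],
    validate: Callable[[Any], list[str]],
    envelope: str,
    max_repairs: int = 1,
) -> Any:
    """Run the phase, validate the artifact, and retry once with a repair prompt."""
    prompt = envelope
    errors: list[str] = []
    for _attempt in range(max_repairs + 1):
        artifact = run_phase(prompt)
        errors = validate(artifact)
        if not errors:
            return artifact
        # Hypothetical repair prompt: original envelope plus the validator output.
        prompt = (
            envelope
            + "\n\nThe previous artifact failed validation:\n"
            + "\n".join(errors)
        )
    raise ArtifactInvalidAfterRepair("; ".join(errors))
```

With `max_repairs=1` the phase runs at most twice, which matches the "retries once" wording; the second failure surfaces as the human-required error.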

```text
<<<DEVFLOW_ARTIFACT_BEGIN>>>
{ "...": "..." }
<<<DEVFLOW_ARTIFACT_END>>>
```

- The adapter extracts the JSON between the markers and writes it atomically (temp file + rename) to `expectedArtifactPath`.
- Missing markers, multiple blocks, or JSON parse failure are treated as `artifact.invalid` and follow the standard repair/timeout flow in §10.3.

Error mapping:
Error mapping (preserved from CC-39, applied per-call by the LangChain
exception path):

- HTTP `401` → `human_required:backend_auth_failed`.
- HTTP `429` → `recoverable:rate_limited` (exponential backoff: 1s, 2s, 4s, max 30s).
- HTTP `429` → `recoverable:rate_limited` (exponential backoff: 1 s, 2 s, 4 s,
  max 30 s, owned by langchain-openai retries).
- HTTP `5xx` → `recoverable:network_blip`.
- HTTP `400` with body code `model_not_found` → `human_required:model_unavailable`.
- Network error before any response → `recoverable:network_blip`.
- HTTP `400` with `model_not_found` → `human_required:model_unavailable`.
- BudgetTracker pre-call rejection → `human_required:token_budget_exceeded`.
- SafetyShellMiddleware blocked tool call → `human_required:tool_quota_exceeded`.

Known v0.1.0 limitations:

- `usage_metadata` is sometimes empty on responses forwarded by OpenRouter
  (deepagents wraps the underlying ChatOpenAI response so token counts may
  not surface). The recorder still fires and `LlmCallRow` is persisted, but
  `input_tokens` / `output_tokens` may read 0. v0.2 will probe additional
  response shapes (raw chunks / callbacks).
- Anthropic models via OpenRouter currently fail with a `tool_calls.args`
  JSON-string vs dict ValidationError inside `langchain-openai` 1.2.1.
  Workaround: pin DeepSeek personas via `BindingOverride`. Tracking for v0.2.

## 9. Prompt Envelope

@@ -1549,19 +1576,22 @@ Reconnect:

## 18. Errors

`packages/core/src/errors.ts`:
v4: `my_deepagent.errors.MyDeepAgentError` (replaces v3 `DevflowError` 1:1):

```ts
type ErrorClass = 'recoverable' | 'human_required' | 'fatal';
```python
class ErrorClass(StrEnum):
    RECOVERABLE = "recoverable"
    HUMAN_REQUIRED = "human_required"
    FATAL = "fatal"

class DevflowError extends Error {
  readonly class: ErrorClass;
  readonly code: string;
  readonly runId?: string;
  readonly phaseId?: string;
  readonly recoveryHint?: string;
  readonly cause?: unknown;
}

class MyDeepAgentError(Exception):
    error_class: ErrorClass
    code: str
    run_id: UUID | None
    phase_id: UUID | None
    recovery_hint: str | None
    cause: BaseException | None
```

Recoverable:
@@ -1587,6 +1617,12 @@ Human required:
- `review_dispute_unresolved`
- `backend_auth_failed`
- `model_unavailable`
- `token_budget_exceeded` *(v4 r1: BudgetTracker rejects a call whose
  estimated cost would breach the per-run, per-day, or per-persona-daily cap
  with `on_hit=block`.)*
- `tool_quota_exceeded` *(v4 r1: SafetyShellMiddleware blocked a tool call
  due to deny-path / destructive-command policy, or a per-phase tool-call
cap was hit.)*
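
To make the `token_budget_exceeded` condition concrete, the sketch below checks an estimated call cost against the three caps named in that entry. The data classes, field names, and USD units are hypothetical; the real `BudgetTracker` interface is not shown in this plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetCaps:
    per_run_usd: float
    per_day_usd: float
    per_persona_daily_usd: float


@dataclass(frozen=True)
class BudgetSpent:
    run_usd: float = 0.0
    day_usd: float = 0.0
    persona_day_usd: float = 0.0


def breached_caps(
    caps: BudgetCaps, spent: BudgetSpent, estimated_cost_usd: float
) -> list[str]:
    """Return the names of the caps the next call would push past."""
    checks = (
        ("per_run", spent.run_usd, caps.per_run_usd),
        ("per_day", spent.day_usd, caps.per_day_usd),
        ("per_persona_daily", spent.persona_day_usd, caps.per_persona_daily_usd),
    )
    return [name for name, used, cap in checks if used + estimated_cost_usd > cap]
```

Under `on_hit=block`, a non-empty result would be the point at which the call is rejected before it is sent.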

Fatal:

@@ -1857,7 +1893,13 @@ M5+:
| CC-36 | SSE reconnect wording used per-run `seq` for global stream even though `seq` is not globally monotonic | `/sse/runs/:runId` uses per-run `seq`; `/sse/global` uses global `run_events.id` and emits only scope=`both` summary events |
| CC-37 | Run SSE replay could emit historical derived events after the first page | run SSE drains historical rows up to a high-water `seq` with only `run.event_appended`, then switches to live derived events |
| CC-38 | Normal phase start changed run state to `planning` / `executing` without a summary event source | `phase.started` payload includes `runState`; SSE derives `run.state_changed` from that live event |
| CC-39 | No OpenRouter HTTP backend; users cannot pick cost-tuned per-persona models | add `openrouter` to Backend enum; HTTP `OpenRouterAdapter` in §8.5; persona `modelConfig.model` requirement; doctor check 13; new error codes `rate_limited`, `backend_auth_failed`, `model_unavailable` |
| CC-39 | No OpenRouter HTTP backend; users cannot pick cost-tuned per-persona models | add `openrouter` to Backend enum; HTTP `OpenRouterAdapter` in §8.5; persona `modelConfig.model` requirement; doctor check 13; new error codes `rate_limited`, `backend_auth_failed`, `model_unavailable` (final v3 entry; v4 reinterprets the OpenRouter integration as the deepagents-driven session adapter; the standalone HTTP `OpenRouterAdapter` from CC-39 is **superseded by DR-1**) |

### Decision Records (v4)

| ID | Decision | Rationale | Impact |
|----|----------|-----------|--------|
| DR-1 | **v3 → v4 major bump: delete TS monorepo, rewrite in Python on LangChain `deepagents`.** | (1) Claude/Anthropic direct API cost is prohibitive for a single-user toolchain. (2) OpenRouter cost-tuned models (DeepSeek, etc.) require a multi-turn, tool-using agent harness; `deepagents` is Python-only with no 1:1 TS port. (3) Switching languages is shorter than reimplementing the harness. | Step 0-purge (commit `0e61b2d`) deleted `apps/`, `packages/`, `tests/`, `scripts/`, pnpm/tsconfig metadata. The Python rewrite lives at `my-deepagent/` and reached Step 15 (real OpenRouter E2E PASS, ~$0.05/run) before the v3 codebase was removed. CC-39's separate `OpenRouterAdapter` is replaced by `my_deepagent.session.build_agent` (deepagents 0.6.1 with LocalShellBackend + SafetyShellMiddleware). v3 CC counters frozen; v4 begins its own series. Recovery: `git checkout pre-python-rewrite -- <path>`. |

### Future Open Questions

@@ -1868,6 +1910,8 @@ M5+:

## 23. Kickoff Order

v3 historical order (TS, completed up to M8 before the v4 pivot):

1. M1.1: repo + pnpm + tsconfig + biome + lefthook + vitest workspace.
2. M1.2: docker-compose + Postgres healthcheck + drizzle-kit + first migration.
3. M1.3: `apps/cli` skeleton + `devflow doctor`.
@@ -1880,3 +1924,27 @@ M5+:
10. M3.3: engine-shaped harness running a single fake phase end-to-end.
11. M4: assemble run engine; lock contract; full fake `development@1` minus reviewers.
12. M5 in parallel with M6 once M4 is green.

v4 r1 order (Python, status as of v0.1.0):

| Step | Scope | Status |
|------|-------|--------|
| Step 0 | Scaffold `my-deepagent/` (uv workspace, ruff, mypy, alembic, .pre-commit) | DONE (`17ba5d7`) |
| Step 1 | `devflow_core` → `my_deepagent.{config,enums,errors,hash,persona,prompt_envelope,run_event}` | DONE |
| Step 2 | `devflow_db` → `my_deepagent.persistence.{db,models,checkpointer}` + Alembic baseline | DONE |
| Step 3 | `mydeepagent doctor` (typer) | DONE |
| Step 4 | Persona / workflow seeding + binding (`my_deepagent.{persona,workflow,binding}`) | DONE |
| Step 5 | Artifact schema registry (`my_deepagent.artifact_schema`) | DONE |
| Step 6 | Distribution: init/login/logout/keys, governance consent, i18n (ko/en) | DONE |
| Step 7 | WorkflowEngine + ArtifactWatcherMiddleware (replaces v3 §15 in-process engine) | DONE |
| Step 8 | Budget guardrails (`my_deepagent.budget` + cost preview + CostMiddleware) | DONE |
| Step 9 | Crash recovery + concurrency (`my_deepagent.recovery` + `mydeepagent runs …`) | DONE |
| Step 10 | Interactive REPL (`mydeepagent` no-subcommand + slash commands) | DONE |
| Step 11 | Audit log + structlog secret scrubbing | DONE |
| Step 12 | Doctor 8-check + OpenRouter pricing fetch + `mydeepagent pricing` | DONE |
| Step 13 | Tmux adapter (M6-Py) | DEFERRED (not in v0.1.0) |
| Step 14 | TUI recovery (M7-Py) | DEFERRED (not in v0.1.0) |
| Step 15 | End-to-end real OpenRouter integration test | DONE (`733c9be`) |
| Step 0-purge | Delete v3 TS monorepo per DR-1 | DONE (`0e61b2d`) |
| M5-Py | Temporal worker (`apps/worker`) | NEXT |
| M8-Py | FastAPI + SSE (`apps/api`) | NEXT |