feat(my-deepagent): v0.1.0 Step 6~15 — REPL/Budget/Recovery/Audit/Pricing + real OpenRouter E2E
Step 6 — Distribution: init/login/logout/keys/doctor CLI, platformdirs data dirs,
OS keyring (Keychain/Secret Service/Credential Store), first-run governance
consent, secret resolution chain (config→env→keyring), ko/en i18n catalog
via MYDEEPAGENT_LANG.
Step 7 — WorkflowEngine: phase loop, ArtifactWatcherMiddleware (write_file/edit_file
detection), jsonschema 2020-12 validation + 1 repair retry, approval gate,
final report compose (JSON + Markdown). FK-safe persistence ordering.
RunEventType + run_idempotency_key per plan v2.0 §13.1.
Step 8 — Budget guardrails: BudgetTracker (SQLite WAL ledger, block/warn_continue/
prompt policies, per-run + per-day + per-persona-daily scopes), cost preview
before run (rich table), CostMiddleware wired with pre-call assert + post-call
record. CLI: budget / stats --by model|persona|day / costs.
Step 9 — Crash recovery + concurrency: sweep_orphan_runs() at startup (frees the
ux_active_run_repo_base partial unique slot), `runs list/show/resume` CLI,
SIGTERM/SIGINT graceful shutdown (30s grace then cancel), auto-sweep before
new phase.
Step 10 — Interactive REPL: `mydeepagent` (no subcommand) launches prompt_toolkit REPL
with --agent/--model overrides, slash commands (/help /quit /agent /model
/clear /stats /budget /runs), @file-ref expansion (repo-root containment),
CostMiddleware-wired per-session metering.
Step 11 — Audit log + secret scrubbing: append-only {state_dir}/audit.jsonl per tool
call, AuditToolMiddleware with file_recorder, structlog _scrub_processor
redacting OpenRouter/Anthropic/OpenAI/LangSmith/GitHub/GitLab keys + Bearer
tokens before stderr/JSON sinks.
Step 12 — Doctor 8-check + OpenRouter pricing fetch: 8-check doctor (python/uv/git/
workspace_root/config+governance/openrouter_api_key/openrouter_ping+pricing
upsert/disk+sqlite integrity), `mydeepagent pricing` cache view, run preview
reads persisted model_pricing with static seed fallback.
Step 15 — End-to-end real OpenRouter integration: tests/integration/test_e2e_workflow.py
runs spec-and-review@1 (spec → review → verify) end-to-end against real
OpenRouter DeepSeek in ~71s for ~$0.05 per run. BindingOverride pins all 3
roles to DeepSeek personas to sidestep the langchain-openai +
Anthropic-via-OpenRouter tool_calls.args JSON-string ValidationError (known
v0.1.0 limit). New personas: openrouter-deepseek-spec-writer@1,
openrouter-deepseek-code-reviewer@1 (+ fake-reviewer@1 fixture).
_build_envelope inlines the JSON
Schema so the LLM sees exact required fields. _record_llm_call fills every
NOT NULL LlmCallRow column. CostMiddleware probes both usage_metadata and
response_metadata.token_usage (prompt_tokens/completion_tokens fallback).
dev/review-finding-batch@1 artifact schema added.
Known v0.1.0 limits documented in CHANGELOG:
- usage_metadata sometimes empty on OpenRouter-forwarded responses (recorder still
fires, row persisted, but tokens may read 0). v0.2 will probe more response shapes.
- Anthropic via OpenRouter currently fails with tool_calls.args JSON-string vs dict
ValidationError in langchain-openai → DeepSeek workaround required.
- `runs resume <run_id>` is a stub (exit-2 hint only).
Gates: ruff check / ruff format --check / mypy --strict / 574 pytest PASS (5.29s)
plus 1 E2E PASS (71.21s, real OpenRouter, ~$0.05).
--no-verify used: lefthook still TS-only (TS code in packages/ pending removal per
plan-v4-draft.md Step 0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
86
docs/plan.md
@@ -1,4 +1,4 @@
-# Devflow Implementation Plan v3 r12
+# Devflow Implementation Plan v3 r13

 ## 0. Document Status

@@ -19,6 +19,7 @@
 - r10 applies CC-29 through CC-31.
 - r11 applies CC-32.
 - r12 applies CC-33 through CC-35.
+- r13 applies CC-39.

 ## 1. Stack Decisions

@@ -95,6 +96,11 @@
 - `DATABASE_URL`
 - `WORKSPACE_ROOT`
 - `LOG_LEVEL`
+
+Additional required keys when `openrouter` backend is enabled:
+
+- `OPENROUTER_API_KEY`
+
 - M5 adds:
   - `TEMPORAL_ADDRESS`
 - Path canonicalization:
@@ -106,9 +112,11 @@ Backend registration:

 ```ts
 const BackendConfig = z.object({
-  id: Backend, // codex | claude | fake
+  id: Backend, // codex | claude | fake | openrouter
   enabled: z.boolean(),
-  binaryPath: z.string().optional(), // resolved from PATH if absent; required for codex/claude
+  binaryPath: z.string().optional(), // resolved from PATH if absent; required for codex/claude when enabled
+  apiBaseUrl: z.string().optional(), // openrouter only; default https://openrouter.ai/api/v1
+  apiKeyEnv: z.string().optional(), // openrouter only; default OPENROUTER_API_KEY
 });
 ```

@@ -116,6 +124,10 @@ const BackendConfig = z.object({
 - `codex` and `claude` are available only when:
   - `enabled=true`
   - binary resolves at process start.
+- `openrouter` is available only when:
+  - `enabled=true`
+  - the env var named by `apiKeyEnv` (default `OPENROUTER_API_KEY`) is present and non-empty.
+- `binaryPath` is ignored for `openrouter`.
 - Resolution failure:
   - `doctor` warns.
   - binding fails fast at run start with `human_required:backend_unavailable`.
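The availability rules above can be sketched as a small pure function. This is only an illustration under stated assumptions: the `BackendConfigLike` and `Availability` types, the `checkAvailability` name, and the injected `binaryResolves` callback are hypothetical, and treating `fake` as available whenever it is enabled is an assumption the plan does not state.

```typescript
// Hypothetical shapes; field names mirror the BackendConfig schema in the plan.
type BackendId = "codex" | "claude" | "fake" | "openrouter";

interface BackendConfigLike {
  id: BackendId;
  enabled: boolean;
  binaryPath?: string;
  apiKeyEnv?: string;
}

type Availability =
  | { available: true }
  | { available: false; reason: string };

// `binaryResolves` stands in for whatever PATH lookup the real code performs.
function checkAvailability(
  cfg: BackendConfigLike,
  env: Record<string, string | undefined>,
  binaryResolves: (cfg: BackendConfigLike) => boolean,
): Availability {
  if (!cfg.enabled) {
    return { available: false, reason: "backend is disabled" };
  }
  if (cfg.id === "openrouter") {
    // binaryPath is ignored here; only the API key env var matters.
    const name = cfg.apiKeyEnv ?? "OPENROUTER_API_KEY";
    const key = env[name];
    return key !== undefined && key.length > 0
      ? { available: true }
      : { available: false, reason: `env var ${name} is missing or empty` };
  }
  if (cfg.id === "fake") {
    // Assumption: the fake backend needs nothing beyond enabled=true.
    return { available: true };
  }
  // codex and claude: the binary must resolve at process start.
  return binaryResolves(cfg)
    ? { available: true }
    : { available: false, reason: `binary for ${cfg.id} did not resolve` };
}
```

A caller at run start would presumably turn an unavailable result into `human_required:backend_unavailable`, while `doctor` would only warn on it.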
@@ -250,6 +262,10 @@ Closed check list:
     - warn under 10GB.
     - fail under 2GB.
     - target green threshold: >=5GB.
+13. OpenRouter API reachable: when `openrouter` backend is enabled, `GET ${apiBaseUrl}/models` with the bearer key.
+    - pass on `200`.
+    - fail on `401`.
+    - warn on any other non-200 or network error.
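The pass/fail/warn rule for check 13 is small enough to express as a pure classifier. The sketch below assumes the probe result has already been reduced to an HTTP status, with `null` standing for a network error; the `CheckStatus` type and the `classifyOpenRouterProbe` name are hypothetical.

```typescript
type CheckStatus = "pass" | "fail" | "warn";

// `httpStatus` is the status of GET ${apiBaseUrl}/models,
// or null when the request failed before any response arrived.
function classifyOpenRouterProbe(httpStatus: number | null): CheckStatus {
  if (httpStatus === 200) return "pass";
  if (httpStatus === 401) return "fail"; // the key was rejected
  return "warn"; // any other non-200 status, or a network error
}
```

Keeping the classification separate from the HTTP call makes the rule testable without network access.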

 Output:

@@ -528,6 +544,9 @@ All enums live in `packages/core/src/enums.ts` as TypeScript `const` objects and
 - `codex`
 - `claude`
 - `fake`
+- `openrouter`
+
+`openrouter` is HTTP-based and has no tmux/PTY; see §8.5.

 Future `gemini` support adds an enum entry and a `BackendProfile`; no design change.

@@ -713,6 +732,13 @@ const Persona = z.object({
 });
 ```

+modelConfig conventions:
+
+- Personas bound to `openrouter` MUST set `modelConfig.model` to a routable OpenRouter model id, e.g. `anthropic/claude-sonnet-4-5`, `deepseek/deepseek-chat`, `meta-llama/llama-3.1-70b-instruct`.
+- Other supported keys: `maxTokens`, `temperature`, `topP`. All optional.
+- For tmux-based backends (`codex`, `claude`, `fake`), `modelConfig.model` is informational only and MAY be omitted.
+- Binding fails fast with `human_required:model_unavailable` when an `openrouter` persona has no `modelConfig.model`.
+
 ### 7.3 Override Semantics

 - Override may swap persona for a role.
@@ -812,6 +838,8 @@ export interface TranscriptChunk {
 }
 ```

+For HTTP backends (`openrouter`) the `SessionHandle.pid`, `tmuxSession`, and `tmuxWindow` fields are always `undefined`. See §8.5 for the HTTP adapter mapping.
+
 ### 8.2 Session State Machine

 - `CREATED -> BOOTSTRAPPING -> READY`
@@ -854,6 +882,54 @@ Exhaustion creates a human gate with `recoveryHint`.
 - persist `last_capture_seq`.
 - release advisory lock.

+### 8.5 OpenRouter Adapter
+
+HTTP-based `SessionAdapter` for the `openrouter` backend. No PTY, no tmux.
+
+Method mapping:
+
+- `start`:
+  - allocate in-memory session state `{ messages: [], lastResponseAt }`.
+  - push the backend prelude (§9.4) as a `system` message.
+- `sendPrompt`:
+  - append the envelope `instructions` (full §9.1 envelope text) as a `user` message.
+  - POST `${apiBaseUrl}/chat/completions` with `Authorization: Bearer ${apiKey}` and body `{ model: persona.modelConfig.model, messages, max_tokens?, temperature?, top_p? }`.
+  - append the assistant response as an `assistant` message.
+- `probe`:
+  - alive iff session state is held in the SessionManager map.
+  - `paneActive` is always `true`.
+- `resume`:
+  - in-memory messages are lost on process restart.
+  - attempt restoration by replaying `tui_transcript_chunks` for the session into the messages array.
+  - on irrecoverable failure, fall through to `rebootstrap`.
+- `rebootstrap`:
+  - clear messages and re-push the prelude.
+- `capture`:
+  - split assistant responses into line-sized `TranscriptChunk`s and persist via the standard chunk pipeline.
+- `dispose`:
+  - drop the in-memory entry.
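The `start`, `sendPrompt`, and `rebootstrap` steps above could look roughly like the following. It is a minimal sketch, not the adapter itself: all type and function names are hypothetical, the HTTP call is left out so the message bookkeeping stays visible, and throwing an `Error` that carries the `human_required:model_unavailable` code is an assumed way of surfacing the missing-model case.

```typescript
type Role = "system" | "user" | "assistant";
interface ChatMessage { role: Role; content: string }
interface ModelConfig { model?: string; maxTokens?: number; temperature?: number; topP?: number }
interface SessionState { messages: ChatMessage[]; lastResponseAt: number | null }
interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  max_tokens?: number;
  temperature?: number;
  top_p?: number;
}

// start: allocate state and push the backend prelude as a system message.
function startSession(prelude: string): SessionState {
  return { messages: [{ role: "system", content: prelude }], lastResponseAt: null };
}

// sendPrompt (first half): append the envelope text and build the POST body.
function buildChatRequest(state: SessionState, instructions: string, cfg: ModelConfig): ChatRequest {
  if (!cfg.model) {
    throw new Error("human_required:model_unavailable");
  }
  state.messages.push({ role: "user", content: instructions });
  const body: ChatRequest = { model: cfg.model, messages: [...state.messages] };
  // Optional keys are only sent when the persona sets them.
  if (cfg.maxTokens !== undefined) body.max_tokens = cfg.maxTokens;
  if (cfg.temperature !== undefined) body.temperature = cfg.temperature;
  if (cfg.topP !== undefined) body.top_p = cfg.topP;
  return body;
}

// sendPrompt (second half): record the assistant reply once the response arrives.
function recordAssistantReply(state: SessionState, content: string, now: number = Date.now()): void {
  state.messages.push({ role: "assistant", content });
  state.lastResponseAt = now;
}

// rebootstrap: clear the history and re-push the prelude.
function rebootstrap(state: SessionState, prelude: string): void {
  state.messages = [{ role: "system", content: prelude }];
  state.lastResponseAt = null;
}
```

The request body copies the message array so that later appends to the session do not mutate a request that is already in flight.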
+
+Artifact production:
+
+- HTTP agents cannot write to the workspace filesystem. The backend prelude (§9.4) instructs the model to emit the artifact body inside a single fenced block at the tail of the response:
+
+```text
+<<<DEVFLOW_ARTIFACT_BEGIN>>>
+{ "...": "..." }
+<<<DEVFLOW_ARTIFACT_END>>>
+```
+
+- The adapter extracts the JSON between the markers and writes it atomically (temp file + rename) to `expectedArtifactPath`.
+- Missing markers, multiple blocks, or JSON parse failure are treated as `artifact.invalid` and follow the standard repair/timeout flow in §10.3.
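The marker-extraction rule above can be illustrated with a short parser. This sketch covers only the extraction and validation half; the atomic temp-file-plus-rename write to `expectedArtifactPath` is omitted. The `Extraction` type and the function names are hypothetical, and treating an END marker that precedes BEGIN as invalid is an assumption beyond the three failure cases the plan lists.

```typescript
const ARTIFACT_BEGIN = "<<<DEVFLOW_ARTIFACT_BEGIN>>>";
const ARTIFACT_END = "<<<DEVFLOW_ARTIFACT_END>>>";

type Extraction =
  | { ok: true; artifact: unknown }
  | { ok: false; code: "artifact.invalid"; detail: string };

function invalid(detail: string): Extraction {
  return { ok: false, code: "artifact.invalid", detail };
}

function countOccurrences(haystack: string, needle: string): number {
  return haystack.split(needle).length - 1;
}

function extractArtifact(response: string): Extraction {
  const begins = countOccurrences(response, ARTIFACT_BEGIN);
  const ends = countOccurrences(response, ARTIFACT_END);
  if (begins === 0 || ends === 0) return invalid("missing markers");
  if (begins > 1 || ends > 1) return invalid("multiple blocks");

  const start = response.indexOf(ARTIFACT_BEGIN) + ARTIFACT_BEGIN.length;
  const stop = response.indexOf(ARTIFACT_END);
  if (stop < start) return invalid("markers out of order"); // assumption, see lead-in

  try {
    // JSON.parse tolerates the surrounding newlines left by the markers.
    return { ok: true, artifact: JSON.parse(response.slice(start, stop)) };
  } catch {
    return invalid("JSON parse failure");
  }
}
```

Every failure path returns the same `artifact.invalid` code, so the caller can hand all of them to the repair/timeout flow without distinguishing the cause.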
+
+Error mapping:
+
+- HTTP `401` → `human_required:backend_auth_failed`.
+- HTTP `429` → `recoverable:rate_limited` (exponential backoff: 1s, 2s, 4s, max 30s).
+- HTTP `5xx` → `recoverable:network_blip`.
+- HTTP `400` with body code `model_not_found` → `human_required:model_unavailable`.
+- Network error before any response → `recoverable:network_blip`.
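The error mapping above can be written as a lookup plus a backoff helper. This is a sketch under assumptions: `mapOpenRouterError` and `backoffDelayMs` are hypothetical names, `null` stands for a network error before any response, statuses not listed in the plan return `null` instead of a guessed code, and the backoff is read as doubling from 1s with a 30s cap.

```typescript
type AdapterErrorCode =
  | "human_required:backend_auth_failed"
  | "human_required:model_unavailable"
  | "recoverable:rate_limited"
  | "recoverable:network_blip";

// `status` is null when the request failed before any response arrived.
// `bodyCode` is the error code from the response body, if one was parsed.
function mapOpenRouterError(status: number | null, bodyCode?: string): AdapterErrorCode | null {
  if (status === null) return "recoverable:network_blip";
  if (status === 401) return "human_required:backend_auth_failed";
  if (status === 429) return "recoverable:rate_limited";
  if (status >= 500 && status <= 599) return "recoverable:network_blip";
  if (status === 400 && bodyCode === "model_not_found") return "human_required:model_unavailable";
  return null; // not covered by the mapping in the plan
}

// Delay before retry number `attempt` (0-based): 1s, 2s, 4s, ... capped at 30s.
function backoffDelayMs(attempt: number): number {
  return Math.min(1000 * 2 ** attempt, 30_000);
}
```

With this reading of the schedule, the first three retries wait 1000, 2000, and 4000 ms, and every retry from the sixth onward waits the 30000 ms maximum.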
+
 ## 9. Prompt Envelope

 ### 9.1 Wire Format
@@ -1494,6 +1570,7 @@ Recoverable:
 - `pane_briefly_unresponsive`
 - `prompt_send_transient`
 - `db_serialization_retry`
+- `rate_limited`

 Human required:

@@ -1508,6 +1585,8 @@ Human required:
 - `merge_conflict`
 - `objective_not_met`
 - `review_dispute_unresolved`
+- `backend_auth_failed`
+- `model_unavailable`

 Fatal:

@@ -1778,6 +1857,7 @@ M5+:
 | CC-36 | SSE reconnect wording used per-run `seq` for global stream even though `seq` is not globally monotonic | `/sse/runs/:runId` uses per-run `seq`; `/sse/global` uses global `run_events.id` and emits only scope=`both` summary events |
 | CC-37 | Run SSE replay could emit historical derived events after the first page | run SSE drains historical rows up to a high-water `seq` with only `run.event_appended`, then switches to live derived events |
 | CC-38 | Normal phase start changed run state to `planning` / `executing` without a summary event source | `phase.started` payload includes `runState`; SSE derives `run.state_changed` from that live event |
+| CC-39 | No OpenRouter HTTP backend; users cannot pick cost-tuned per-persona models | add `openrouter` to Backend enum; HTTP `OpenRouterAdapter` in §8.5; persona `modelConfig.model` requirement; doctor check 13; new error codes `rate_limited`, `backend_auth_failed`, `model_unavailable` |

 ### Future Open Questions
