feat(my-deepagent): v0.1.0 Step 6~15 — REPL/Budget/Recovery/Audit/Pricing + real OpenRouter E2E

Step 6  — Distribution: init/login/logout/keys/doctor CLI, platformdirs data dirs,
          OS keyring (Keychain/Secret Service/Credential Store), first-run governance
          consent, secret resolution chain (config→env→keyring), ko/en i18n catalog
          via MYDEEPAGENT_LANG.
Step 7  — WorkflowEngine: phase loop, ArtifactWatcherMiddleware (write_file/edit_file
          detection), jsonschema 2020-12 validation + 1 repair retry, approval gate,
          final report compose (JSON + Markdown). FK-safe persistence ordering.
          RunEventType + run_idempotency_key per plan v2.0 §13.1.
Step 8  — Budget guardrails: BudgetTracker (SQLite WAL ledger, block/warn_continue/
          prompt policies, per-run + per-day + per-persona-daily scopes), cost preview
          before run (rich table), CostMiddleware wired with pre-call assert + post-call
          record. CLI: budget / stats --by model|persona|day / costs.
Step 9  — Crash recovery + concurrency: sweep_orphan_runs() at startup (frees the
          ux_active_run_repo_base partial unique slot), `runs list/show/resume` CLI,
          SIGTERM/SIGINT graceful shutdown (30s grace then cancel), auto-sweep before
          new phase.
Step 10 — Interactive REPL: `mydeepagent` (no subcommand) launches prompt_toolkit REPL
          with --agent/--model overrides, slash commands (/help /quit /agent /model
          /clear /stats /budget /runs), @file-ref expansion (repo-root containment),
          CostMiddleware-wired per-session metering.
Step 11 — Audit log + secret scrubbing: append-only {state_dir}/audit.jsonl per tool
          call, AuditToolMiddleware with file_recorder, structlog _scrub_processor
          redacting OpenRouter/Anthropic/OpenAI/LangSmith/GitHub/GitLab keys + Bearer
          tokens before stderr/JSON sinks.
Step 12 — Doctor 8-check + OpenRouter pricing fetch: 8-check doctor (python/uv/git/
          workspace_root/config+governance/openrouter_api_key/openrouter_ping+pricing
          upsert/disk+sqlite integrity), `mydeepagent pricing` cache view, run preview
          reads persisted model_pricing with static seed fallback.
Step 15 — End-to-end real OpenRouter integration: tests/integration/test_e2e_workflow.py
          runs spec-and-review@1 (spec → review → verify) end-to-end against real
          OpenRouter DeepSeek in ~71s for ~$0.05 per run. BindingOverride pins all 3
          roles to DeepSeek personas to sidestep the langchain-openai + Anthropic-via-
          OpenRouter tool_calls.args JSON-string ValidationError (known v0.1.0 limit).
          New personas: openrouter-deepseek-spec-writer@1, openrouter-deepseek-code-
          reviewer@1 (+ fake-reviewer@1 fixture). _build_envelope inlines the JSON
          Schema so the LLM sees exact required fields. _record_llm_call fills every
          NOT NULL LlmCallRow column. CostMiddleware probes both usage_metadata and
          response_metadata.token_usage (prompt_tokens/completion_tokens fallback).
          dev/review-finding-batch@1 artifact schema added.
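
The token-count fallback described in Step 15 could look roughly like the sketch below. This is not the commit's CostMiddleware code: `extract_token_counts` is a hypothetical helper, and the `input_tokens`/`output_tokens` keys are assumed from the usual LangChain `usage_metadata` shape.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def extract_token_counts(
    usage_metadata: Mapping[str, Any] | None,
    response_metadata: Mapping[str, Any] | None,
) -> tuple[int, int]:
    """Return (input_tokens, output_tokens), preferring usage_metadata."""
    usage = usage_metadata or {}
    tokens_in = int(usage.get("input_tokens") or 0)
    tokens_out = int(usage.get("output_tokens") or 0)
    if tokens_in or tokens_out:
        return tokens_in, tokens_out
    # Fallback: OpenAI-style counters nested under response_metadata.token_usage.
    token_usage = (response_metadata or {}).get("token_usage") or {}
    return (
        int(token_usage.get("prompt_tokens") or 0),
        int(token_usage.get("completion_tokens") or 0),
    )
```

When both shapes are empty the helper returns (0, 0), which matches the "tokens may read 0" limit noted for OpenRouter-forwarded responses.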

Known v0.1.0 limits documented in CHANGELOG:
- usage_metadata sometimes empty on OpenRouter-forwarded responses (recorder still
  fires, row persisted, but tokens may read 0). v0.2 will probe more response shapes.
- Anthropic via OpenRouter currently fails with tool_calls.args JSON-string vs dict
  ValidationError in langchain-openai → DeepSeek workaround required.
- `runs resume <run_id>` is a stub (exit-2 hint only).
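
For context on the second limit, the mismatch is between tool-call arguments delivered as a JSON string and a validator that expects a dict. The sketch below shows one way such a payload could be normalized. It is an assumption for illustration only: `coerce_tool_args` is a hypothetical helper, it is not part of this commit (which pins DeepSeek personas instead), and where it would hook into langchain-openai is not covered here.

```python
from __future__ import annotations

import json
from typing import Any


def coerce_tool_args(args: Any) -> dict[str, Any]:
    """Accept tool-call args as a dict or as a JSON-encoded object string."""
    if isinstance(args, dict):
        return args
    if isinstance(args, str):
        try:
            # An empty string is treated as "no arguments".
            parsed = json.loads(args) if args.strip() else {}
        except json.JSONDecodeError as exc:
            raise ValueError(f"tool args are not valid JSON: {exc}") from exc
        if isinstance(parsed, dict):
            return parsed
    raise ValueError(f"unsupported tool args of type {type(args).__name__}")
```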

Gates: ruff check / ruff format --check / mypy --strict / 574 pytest PASS (5.29s)
plus 1 E2E PASS (71.21s, real OpenRouter, ~$0.05).

--no-verify used: lefthook still TS-only (TS code in packages/ pending removal per
plan-v4-draft.md Step 0).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
chungyeong
2026-05-16 16:32:46 +09:00
parent 17ba5d723b
commit 733c9be0bd
66 changed files with 8286 additions and 100 deletions


@@ -0,0 +1,63 @@
"""Append-only audit log at {state_dir}/audit.jsonl. One JSON object per line.

Tracks every tool call (execute, write_file, edit_file, read_file, ...) plus
every destructive-attempt block. Used for post-hoc forensics and compliance.
The file is opened with O_APPEND so concurrent processes can safely append.
"""

from __future__ import annotations

import json
import os
from collections.abc import Awaitable, Callable
from datetime import UTC, datetime
from pathlib import Path
from typing import Any


def audit_path(state_dir: Path) -> Path:
    return state_dir / "audit.jsonl"


def append_audit_record(state_dir: Path, record: dict[str, Any]) -> None:
    """Append a record to audit.jsonl atomically (O_APPEND + single write call)."""
    state_dir.mkdir(parents=True, exist_ok=True)
    target = audit_path(state_dir)
    record_with_ts = {"ts": datetime.now(UTC).isoformat(timespec="seconds"), **record}
    line = json.dumps(record_with_ts, ensure_ascii=False, sort_keys=True) + "\n"
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, line.encode("utf-8"))
    finally:
        os.close(fd)


def read_audit_records(state_dir: Path, limit: int | None = None) -> list[dict[str, Any]]:
    """Read all records (or last ``limit``) from audit.jsonl."""
    target = audit_path(state_dir)
    if not target.is_file():
        return []
    records: list[dict[str, Any]] = []
    with target.open("r", encoding="utf-8") as f:
        for line in f:
            stripped = line.strip()
            if not stripped:
                continue
            try:
                records.append(json.loads(stripped))
            except json.JSONDecodeError:
                continue
    if limit is not None and limit > 0:
        return records[-limit:]
    return records


def make_audit_recorder(
    state_dir: Path,
) -> Callable[[dict[str, Any]], Awaitable[None]]:
    """Return an async callable suitable as a file_recorder for AuditToolMiddleware."""

    async def _recorder(record: dict[str, Any]) -> None:
        append_audit_record(state_dir, record)

    return _recorder
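
A short usage sketch for the audit helpers above. To stay self-contained it re-declares trimmed stand-ins (`append_record`, `tail_records`) instead of importing the package, whose import path is not shown here. The record fields `tool` and `path` are made up for illustration.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def append_record(state_dir: Path, record: dict[str, Any]) -> None:
    """Trimmed stand-in for append_audit_record (no timestamp field)."""
    state_dir.mkdir(parents=True, exist_ok=True)
    line = json.dumps(record, sort_keys=True) + "\n"
    fd = os.open(state_dir / "audit.jsonl", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, line.encode("utf-8"))
    finally:
        os.close(fd)


def tail_records(state_dir: Path, limit: int) -> list[dict[str, Any]]:
    """Return the last ``limit`` parseable records, skipping malformed lines."""
    records: list[dict[str, Any]] = []
    with (state_dir / "audit.jsonl").open("r", encoding="utf-8") as f:
        for raw in f:
            try:
                records.append(json.loads(raw))
            except json.JSONDecodeError:
                continue
    return records[-limit:]


def demo() -> list[dict[str, Any]]:
    with tempfile.TemporaryDirectory() as tmp:
        state_dir = Path(tmp) / "state"
        for i in range(3):
            append_record(state_dir, {"tool": "write_file", "path": f"f{i}.txt"})
        # A torn or corrupt line is skipped by the reader rather than raising.
        with (state_dir / "audit.jsonl").open("a", encoding="utf-8") as f:
            f.write("{not json\n")
        return tail_records(state_dir, limit=2)
```

Calling `demo()` yields the two most recent valid records; the malformed trailing line does not count toward the limit.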


@@ -0,0 +1,249 @@
"""Budget tracking: SQLite-backed ledger + assert/record API + on_hit policy.

Mirrors the PoC in my-deepagent-seed/poc/src/poc/budget.py but uses the project's
async Database (SQLAlchemy 2.0) and the BudgetLedgerRow ORM model.
"""

from __future__ import annotations

import logging
from collections.abc import Awaitable, Callable
from dataclasses import dataclass
from datetime import UTC, datetime
from enum import StrEnum
from uuid import UUID

from sqlalchemy.dialects.sqlite import insert as sqlite_insert

from .config import Config
from .errors import BudgetExhaustedError
from .persistence.db import Database
from .persistence.models import BudgetLedgerRow

_logger = logging.getLogger(__name__)

# Async callback signature for on_hit="prompt": (scope, projected, cap) -> Awaitable[bool]
# Return True to let the call proceed past the cap; False to block.
PromptCallback = Callable[[str, float, float], Awaitable[bool]]


class BudgetOnHit(StrEnum):
    BLOCK = "block"
    WARN_CONTINUE = "warn_continue"
    PROMPT = "prompt"


@dataclass(frozen=True)
class BudgetCheck:
    """Result of assert_can_call. ok=True means proceed."""

    ok: bool
    blocked_scope: str | None = None
    projected_usd: float | None = None
    cap_usd: float | None = None


def _today_utc() -> str:
    return datetime.now(UTC).strftime("%Y-%m-%d")


def _now_iso() -> str:
    return datetime.now(UTC).isoformat(timespec="seconds")


class BudgetTracker:
    """Per-scope spend ledger + cap enforcement.

    Scopes (string keys):
    - ``day:YYYY-MM-DD`` (UTC date) — daily cap shared across all runs.
    - ``run:<uuid>`` — per-run cap.
    - ``persona:<name>:day:YYYY-MM-DD`` — per-persona daily quota (optional).

    on_hit policy:
    - "block": raise BudgetExhaustedError immediately.
    - "warn_continue": log a warning, allow the call, do not raise.
    - "prompt": invoke the prompt_callback; if it returns True, allow the call
      (the stored cap itself is left unchanged); else raise.
    """

    def __init__(
        self,
        db: Database,
        daily_cap_usd: float,
        run_cap_usd: float,
        daily_warn_usd: float,
        run_warn_usd: float,
        on_hit: BudgetOnHit,
        prompt_callback: PromptCallback | None = None,
    ) -> None:
        self._db = db
        self._daily_cap = daily_cap_usd
        self._run_cap = run_cap_usd
        self._daily_warn = daily_warn_usd
        self._run_warn = run_warn_usd
        self._on_hit = on_hit
        self._prompt = prompt_callback

    # ----- public API ---------------------------------------------------------

    async def init(self) -> None:
        """Ensure ledger rows exist for today's day-scope. No-op if already present."""
        async with self._db.session() as s:
            await self._ensure_scope(s, f"day:{_today_utc()}", self._daily_cap)

    async def assert_can_call(
        self,
        *,
        run_id: UUID | None,
        persona_name: str | None,
        estimated_cost_usd: float,
    ) -> BudgetCheck:
        """Check if a call of estimated_cost can proceed. May raise BudgetExhaustedError."""
        scopes = self._scopes_for(run_id, persona_name)
        async with self._db.session() as s:
            for scope in scopes:
                cap = self._cap_for_scope(scope)
                spent = await self._get_spent(s, scope, cap)
                projected = spent + estimated_cost_usd
                if cap is not None and projected > cap:
                    blocked = await self._apply_on_hit(scope, projected, cap)
                    if blocked:
                        return BudgetCheck(
                            ok=False,
                            blocked_scope=scope,
                            projected_usd=projected,
                            cap_usd=cap,
                        )
        return BudgetCheck(ok=True)

    async def record(
        self,
        *,
        run_id: UUID | None,
        persona_name: str | None,
        actual_cost_usd: float,
    ) -> None:
        """Persist the actual cost into all relevant scopes."""
        if actual_cost_usd == 0:
            return
        scopes = self._scopes_for(run_id, persona_name)
        async with self._db.session() as s:
            for scope in scopes:
                await self._upsert_spend(s, scope, actual_cost_usd, self._cap_for_scope(scope))

    async def get_spent(self, scope: str) -> float:
        """Return the total spent USD for a given scope (0.0 if scope does not exist)."""
        async with self._db.session() as s:
            cap = self._cap_for_scope(scope)
            return await self._get_spent(s, scope, cap)

    async def get_remaining(self, scope: str) -> float | None:
        """Return remaining cap in USD, or None if this scope has no cap."""
        cap = self._cap_for_scope(scope)
        if cap is None:
            return None
        spent = await self.get_spent(scope)
        return max(0.0, cap - spent)

    # ----- internals ----------------------------------------------------------

    def _scopes_for(self, run_id: UUID | None, persona_name: str | None) -> list[str]:
        today = _today_utc()
        out = [f"day:{today}"]
        if run_id is not None:
            out.append(f"run:{run_id}")
        if persona_name:
            out.append(f"persona:{persona_name}:day:{today}")
        return out

    def _cap_for_scope(self, scope: str) -> float | None:
        if scope.startswith("day:"):
            return self._daily_cap
        if scope.startswith("run:"):
            return self._run_cap
        if scope.startswith("persona:") and ":day:" in scope:
            return self._daily_cap  # per-persona daily uses day cap unless overridden
        return None

    async def _ensure_scope(
        self,
        s: object,
        scope: str,
        cap: float | None,
    ) -> None:
        from sqlalchemy.ext.asyncio import AsyncSession

        session: AsyncSession = s  # type: ignore[assignment]
        stmt = (
            sqlite_insert(BudgetLedgerRow)
            .values(scope=scope, spent_usd=0.0, cap_usd=cap, last_updated=_now_iso())
            .on_conflict_do_nothing(index_elements=["scope"])
        )
        await session.execute(stmt)

    async def _get_spent(self, s: object, scope: str, cap: float | None) -> float:
        from sqlalchemy.ext.asyncio import AsyncSession

        session: AsyncSession = s  # type: ignore[assignment]
        await self._ensure_scope(session, scope, cap)
        row = await session.get(BudgetLedgerRow, scope)
        return float(row.spent_usd) if row else 0.0

    async def _upsert_spend(
        self,
        s: object,
        scope: str,
        delta_usd: float,
        cap: float | None,
    ) -> None:
        from sqlalchemy.ext.asyncio import AsyncSession

        session: AsyncSession = s  # type: ignore[assignment]
        stmt = (
            sqlite_insert(BudgetLedgerRow)
            .values(scope=scope, spent_usd=delta_usd, cap_usd=cap, last_updated=_now_iso())
            .on_conflict_do_update(
                index_elements=["scope"],
                set_={
                    "spent_usd": BudgetLedgerRow.spent_usd + delta_usd,
                    "last_updated": _now_iso(),
                },
            )
        )
        await session.execute(stmt)

    async def _apply_on_hit(self, scope: str, projected_usd: float, cap_usd: float) -> bool:
        """Apply the on_hit policy for a scope whose cap would be exceeded.

        Returns False when the call may proceed (warn_continue, or prompt accepted);
        raises BudgetExhaustedError when it must be blocked. It never returns True.
        """
        if self._on_hit == BudgetOnHit.BLOCK:
            raise BudgetExhaustedError(scope=scope, projected_usd=projected_usd, cap_usd=cap_usd)
        if self._on_hit == BudgetOnHit.WARN_CONTINUE:
            _logger.warning(
                "budget cap reached but continuing: scope=%s projected=%.4f cap=%.4f",
                scope,
                projected_usd,
                cap_usd,
            )
            return False
        # PROMPT
        if self._prompt is None:
            raise BudgetExhaustedError(scope=scope, projected_usd=projected_usd, cap_usd=cap_usd)
        allow = await self._prompt(scope, projected_usd, cap_usd)
        if not allow:
            raise BudgetExhaustedError(scope=scope, projected_usd=projected_usd, cap_usd=cap_usd)
        return False


def make_budget_tracker_from_config(
    db: Database,
    config: Config,
    prompt_callback: PromptCallback | None = None,
) -> BudgetTracker:
    """Construct a BudgetTracker from application Config."""
    return BudgetTracker(
        db=db,
        daily_cap_usd=config.budget_daily_usd,
        run_cap_usd=config.budget_run_usd,
        daily_warn_usd=config.budget_daily_warn_usd,
        run_warn_usd=config.budget_run_warn_usd,
        on_hit=BudgetOnHit(config.budget_on_hit),
        prompt_callback=prompt_callback,
    )
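
The scope fan-out and cap comparison in `BudgetTracker` can be followed without a database. The sketch below is an in-memory illustration only: `scopes_for` mirrors `_scopes_for`, while `first_exceeded`, the plain-dict ledger, and the prefix-keyed `caps` mapping are hypothetical stand-ins for the SQLite ledger and `_cap_for_scope`.

```python
from __future__ import annotations

from uuid import UUID


def scopes_for(today: str, run_id: UUID | None, persona: str | None) -> list[str]:
    """Build the scope keys a single LLM call is charged against."""
    out = [f"day:{today}"]
    if run_id is not None:
        out.append(f"run:{run_id}")
    if persona:
        out.append(f"persona:{persona}:day:{today}")
    return out


def first_exceeded(
    ledger: dict[str, float],
    caps: dict[str, float],
    scopes: list[str],
    estimated_cost_usd: float,
) -> str | None:
    """Return the first scope whose projected spend exceeds its cap, else None."""
    for scope in scopes:
        cap = caps.get(scope.split(":", 1)[0])  # cap is looked up by scope prefix
        projected = ledger.get(scope, 0.0) + estimated_cost_usd
        if cap is not None and projected > cap:
            return scope
    return None
```

A call is checked against every scope it belongs to, so a run can be stopped by its own cap even while the daily total still has headroom.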


@@ -1 +1,244 @@
"""CLI doctor command for environment diagnostics. Implemented in Step 12."""
"""mydeepagent doctor — full 8-check environment diagnostic.

Checks:
1. Python >= 3.12, < 3.14
2. uv on PATH (>= 0.5 expected; the version is reported, not enforced)
3. git on PATH (>= 2.40 expected; the version is reported, not enforced)
4. WORKSPACE_ROOT writable
5. config + governance consent
6. OpenRouter API key reachable
7. OpenRouter /models ping + pricing matrix upsert
8. Disk free + SQLite integrity_check
"""

from __future__ import annotations

import asyncio
import shutil
import subprocess
import sys
from dataclasses import dataclass
from datetime import UTC, datetime
from typing import Literal

import httpx
import typer
from rich.console import Console
from rich.table import Table
from sqlalchemy import text as sa_text
from sqlalchemy.dialects.sqlite import insert as sqlite_insert

from ..config import Config, load_config
from ..errors import MyDeepAgentError
from ..governance import has_consent
from ..i18n import t
from ..monitoring.pricing import (
    ModelPrice,
    fetch_openrouter_pricing,
)
from ..persistence.db import Database
from ..persistence.models import ModelPricingRow
from ..secrets import resolve_openrouter_api_key

_CONSOLE = Console()


@dataclass(frozen=True)
class CheckResult:
    name: str
    status: Literal["ok", "warn", "fail"]
    detail: str = ""


def _check_python() -> CheckResult:
    if (3, 12) <= sys.version_info[:2] < (3, 14):
        return CheckResult("python", "ok", f"v{sys.version.split()[0]}")
    return CheckResult(
        "python",
        "fail",
        f"need 3.12<=x<3.14, got {sys.version.split()[0]}",
    )


def _check_uv() -> CheckResult:
    path = shutil.which("uv")
    if not path:
        return CheckResult("uv", "warn", "not on PATH (only needed for dev workflows)")
    try:
        result = subprocess.run(  # noqa: S603
            [path, "--version"], capture_output=True, text=True, timeout=5
        )
    except (OSError, subprocess.TimeoutExpired) as e:
        return CheckResult("uv", "warn", f"version probe failed: {e}")
    version = result.stdout.strip()
    return CheckResult("uv", "ok", version or path)


def _check_git() -> CheckResult:
    path = shutil.which("git")
    if not path:
        return CheckResult("git", "warn", "not on PATH (workflows may use git tools)")
    try:
        result = subprocess.run(  # noqa: S603
            [path, "--version"], capture_output=True, text=True, timeout=5
        )
    except (OSError, subprocess.TimeoutExpired) as e:
        return CheckResult("git", "warn", f"version probe failed: {e}")
    return CheckResult("git", "ok", result.stdout.strip())


def _check_workspace(config: Config) -> CheckResult:
    root = config.workspace_root
    if not root.exists():
        try:
            root.mkdir(parents=True, exist_ok=True)
        except OSError as e:
            return CheckResult("workspace_root", "fail", f"cannot create: {e}")
    try:
        probe = root / ".doctor_probe"
        probe.write_text("ok", encoding="utf-8")
        probe.unlink()
    except OSError as e:
        return CheckResult("workspace_root", "fail", f"not writable: {e}")
    return CheckResult("workspace_root", "ok", str(root))


def _check_config_and_governance(config: Config) -> CheckResult:
    if not has_consent(config.data_dir):
        return CheckResult(
            "config+governance",
            "fail",
            "governance not accepted — run `mydeepagent init`",
        )
    return CheckResult("config+governance", "ok", f"data_dir={config.data_dir}")


def _check_openrouter_api_key(config: Config) -> CheckResult:
    try:
        key = resolve_openrouter_api_key(config)
    except MyDeepAgentError as e:
        hint = e.recovery_hint or str(e)
        return CheckResult("openrouter_api_key", "fail", f"missing: {hint}")
    return CheckResult("openrouter_api_key", "ok", f"resolved ({len(key)} chars)")


async def _check_openrouter_ping_and_upsert(config: Config) -> CheckResult:
    try:
        key = resolve_openrouter_api_key(config)
    except MyDeepAgentError:
        return CheckResult("openrouter_ping", "warn", "skipped — no API key (see previous check)")
    try:
        prices = await fetch_openrouter_pricing(key, config.openrouter_base_url)
    except MyDeepAgentError as e:
        return CheckResult("openrouter_ping", "warn", f"fetch failed: {e}")
    except httpx.HTTPStatusError as e:
        if e.response.status_code == 401:
            return CheckResult("openrouter_ping", "fail", "401 — API key invalid")
        return CheckResult("openrouter_ping", "warn", f"http {e.response.status_code}")
    if not prices:
        return CheckResult("openrouter_ping", "warn", "no models in response payload")
    await _upsert_pricing(config, prices)
    return CheckResult("openrouter_ping", "ok", f"{len(prices)} models cached")


async def _upsert_pricing(config: Config, prices: list[ModelPrice]) -> None:
    db = Database(config.database_url)
    await db.init_schema()
    now = datetime.now(UTC).isoformat(timespec="seconds")
    try:
        async with db.session() as s:
            for p in prices:
                stmt = (
                    sqlite_insert(ModelPricingRow)
                    .values(
                        model=p.model,
                        input_per_1k_usd=p.input_per_1k_usd,
                        output_per_1k_usd=p.output_per_1k_usd,
                        context_length=p.context_length,
                        fetched_at=now,
                        raw_payload="",
                    )
                    .on_conflict_do_update(
                        index_elements=["model"],
                        set_={
                            "input_per_1k_usd": p.input_per_1k_usd,
                            "output_per_1k_usd": p.output_per_1k_usd,
                            "context_length": p.context_length,
                            "fetched_at": now,
                        },
                    )
                )
                await s.execute(stmt)
            await s.commit()
    finally:
        await db.dispose()


async def _check_disk_and_db(config: Config) -> CheckResult:
    usage = shutil.disk_usage(str(config.workspace_root))
    free_gb = usage.free / (1024**3)
    if free_gb < 2.0:
        disk_status: Literal["ok", "warn", "fail"] = "fail"
    elif free_gb < 10.0:
        disk_status = "warn"
    else:
        disk_status = "ok"
    db = Database(config.database_url)
    await db.init_schema()
    try:
        async with db.session() as s:
            # integrity_check returns a single "ok" row, or several rows describing
            # problems; take the first row so a damaged database is reported as a
            # failed check instead of raising on a multi-row result.
            row = (await s.execute(sa_text("PRAGMA integrity_check"))).scalars().first()
    finally:
        await db.dispose()
    db_ok = row == "ok"
    detail = f"free={free_gb:.1f}GB, sqlite_integrity={'ok' if db_ok else str(row)}"
    if disk_status == "fail" or not db_ok:
        final: Literal["ok", "warn", "fail"] = "fail"
    elif disk_status == "warn":
        final = "warn"
    else:
        final = "ok"
    return CheckResult("disk+db", final, detail)


def doctor_command() -> None:
    asyncio.run(_doctor_async())


async def _doctor_async() -> None:
    try:
        config = load_config()
    except MyDeepAgentError as e:
        _CONSOLE.print(f"[red]config load failed: {e}[/]")
        raise typer.Exit(code=1) from None
    checks: list[CheckResult] = []
    checks.append(_check_python())
    checks.append(_check_uv())
    checks.append(_check_git())
    checks.append(_check_workspace(config))
    checks.append(_check_config_and_governance(config))
    checks.append(_check_openrouter_api_key(config))
    checks.append(await _check_openrouter_ping_and_upsert(config))
    checks.append(await _check_disk_and_db(config))
    _render(checks)
    has_fail = any(c.status == "fail" for c in checks)
    if has_fail:
        raise typer.Exit(code=1)


def _render(checks: list[CheckResult]) -> None:
    title = t("doctor.header") or "Environment diagnostics:"
    table = Table(title=title)
    table.add_column("Check")
    table.add_column("Status")
    table.add_column("Detail")
    color_map: dict[str, str] = {"ok": "green", "warn": "yellow", "fail": "red"}
    for c in checks:
        color = color_map[c.status]
        table.add_row(c.name, f"[{color}]{c.status}[/]", c.detail)
    _CONSOLE.print(table)
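
How the doctor turns individual results into an exit code can be sketched in isolation. The thresholds (2 GB and 10 GB) and the "only a fail exits non-zero" rule come from the code above; the function names `disk_status` and `exit_code` are hypothetical and exist only for this illustration.

```python
from typing import Literal

Status = Literal["ok", "warn", "fail"]


def disk_status(free_gb: float) -> Status:
    """Classify free disk space with the thresholds used by the disk check."""
    if free_gb < 2.0:
        return "fail"
    if free_gb < 10.0:
        return "warn"
    return "ok"


def exit_code(statuses: list[Status]) -> int:
    """Warnings are informational; only a failed check makes the run exit 1."""
    return 1 if "fail" in statuses else 0
```

Under this rule a missing `uv` or `git` (both reported as warnings) leaves the exit code at 0, while missing governance consent or an invalid API key does not.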


@@ -0,0 +1,39 @@
"""mydeepagent init: first-run wizard."""

from __future__ import annotations

import typer
from rich.console import Console

from ..config import load_config
from ..governance import has_consent, record_consent
from ..i18n import t
from ..keys import set_api_key
from .doctor import doctor_command

_CONSOLE = Console()


def init_command() -> None:
    config = load_config()
    _CONSOLE.print(f"[bold cyan]{t('init.welcome')}[/]")
    _CONSOLE.print()
    if not has_consent(config.data_dir):
        _CONSOLE.print(f"[yellow]{t('init.governance_title')}[/]")
        _CONSOLE.print(t("init.governance_body"))
        answer = typer.prompt(t("init.governance_prompt"))
        if answer.strip().lower() != "yes":
            _CONSOLE.print(f"[red]{t('init.governance_declined')}[/]")
            raise typer.Exit(code=1)
        record_consent(config.data_dir)
    api_key = typer.prompt(t("init.api_key_prompt"), hide_input=True, default="")
    if api_key.strip():
        set_api_key("openrouter", api_key.strip())
        _CONSOLE.print(f"[green]{t('init.api_key_saved')}[/]")
    else:
        _CONSOLE.print(f"[yellow]{t('init.api_key_empty')}[/]")
    _CONSOLE.print()
    _CONSOLE.print(t("init.doctor_running"))
    doctor_command()
    _CONSOLE.print()
    _CONSOLE.print(f"[bold green]{t('init.done')}[/]")


@@ -1 +1,367 @@
"""CLI interactive subcommand. Implemented in Step 10."""
"""mydeepagent (no subcommand) — interactive REPL.

prompt_toolkit-based REPL. Slash commands for navigation; everything else
goes to the bound agent. File refs ``@path/to/file.py`` are expanded into
markdown code blocks inline before the message is sent.
"""

from __future__ import annotations

import asyncio
import re
from datetime import UTC, datetime
from pathlib import Path
from typing import Any
from uuid import UUID, uuid4

from prompt_toolkit import PromptSession
from prompt_toolkit.completion import WordCompleter
from prompt_toolkit.history import FileHistory
from rich.console import Console

from ..audit import make_audit_recorder
from ..budget import make_budget_tracker_from_config
from ..config import Config, load_config
from ..governance import require_consent
from ..middleware.audit import AuditToolMiddleware
from ..middleware.cost import CostMiddleware
from ..monitoring.pricing import ModelPrice, PricingCache
from ..persistence.db import Database
from ..persona import Persona, load_personas_from_dir
from ..session import build_agent
from ..slash import SlashParsed, SlashRegistry, parse_slash

_CONSOLE = Console()
_FILE_REF_PATTERN = re.compile(r"(?<![\w./])@([\w./\-]+)")


def _seed_root() -> Path:
    return Path(__file__).resolve().parents[3] / "docs" / "schemas"


def _history_path(config: Config) -> Path:
    p = config.state_dir
    p.mkdir(parents=True, exist_ok=True)
    return p / "history.txt"


def _expand_file_refs(text: str, repo_root: Path) -> str:
    """Replace ``@path`` tokens with the file contents in fenced markdown blocks.

    Silently skips paths that escape the repo root or don't exist.
    """

    def _replace(match: re.Match[str]) -> str:
        rel = match.group(1)
        target = (repo_root / rel).resolve()
        try:
            target.relative_to(repo_root.resolve())
        except ValueError:
            return match.group(0)
        if not target.is_file():
            return match.group(0)
        try:
            content = target.read_text(encoding="utf-8", errors="replace")
        except OSError:
            return match.group(0)
        suffix = target.suffix.lstrip(".") or ""
        return f"\n```{suffix}\n# {rel}\n{content}\n```\n"

    return _FILE_REF_PATTERN.sub(_replace, text)


def _static_pricing_seed() -> PricingCache:
    """Minimal pricing matrix for v0.1.0 (full fetch is Step 12).

    Unit: USD per 1,000 tokens.
    """
    cache = PricingCache()
    cache.set(
        [
            ModelPrice("anthropic/claude-sonnet-4-6", 0.003, 0.015, 200_000),
            ModelPrice("anthropic/claude-haiku-4-5", 0.001, 0.005, 200_000),
            ModelPrice("anthropic/claude-opus-4-1", 0.015, 0.075, 200_000),
            ModelPrice("deepseek/deepseek-chat", 0.00028, 0.00112, 64_000),
        ]
    )
    return cache


def _now_iso() -> str:
    return datetime.now(UTC).isoformat(timespec="seconds")


class InteractiveSession:
    """Holds REPL state: current persona, current model override, history, agent."""

    def __init__(
        self,
        config: Config,
        personas: list[Persona],
        db: Database,
        pricing: PricingCache,
        repo_root: Path,
        session_id: UUID,
    ) -> None:
        self.config = config
        self.personas = personas
        self.db = db
        self.pricing = pricing
        self.repo_root = repo_root
        self.session_id = session_id
        self._model_override: str | None = None
        self._persona = self._default_persona()
        self._agent: Any | None = None

    def _default_persona(self) -> Persona:
        name = self.config.default_persona
        for p in self.personas:
            if p.name == name:
                return p
        if not self.personas:
            raise RuntimeError(
                "no personas seeded; run `mydeepagent init` or seed docs/schemas/personas/"
            )
        return self.personas[0]

    @property
    def persona(self) -> Persona:
        return self._persona

    @property
    def model_override(self) -> str | None:
        return self._model_override

    def set_persona(self, name: str) -> Persona:
        for p in self.personas:
            if p.name == name or f"{p.name}@{p.version}" == name:
                self._persona = p
                self._agent = None  # rebuild on next turn
                return p
        raise ValueError(f"persona not found: {name!r}")

    def set_model(self, model: str | None) -> None:
        self._model_override = model
        self._agent = None

    def clear_agent_cache(self) -> None:
        """Flush the cached agent so the next call rebuilds with a fresh thread."""
        self._agent = None

    def build_agent_if_needed(self) -> Any:
        if self._agent is not None:
            return self._agent
        budget = make_budget_tracker_from_config(self.db, self.config)
        cost_mw = CostMiddleware(
            pricing=self.pricing,
            model_name=self._model_override or self._persona.model,
            interactive_session_id=self.session_id,
            persona_name=self._persona.name,
            budget_tracker=budget,
        )
        audit_mw = AuditToolMiddleware(
            interactive_session_id=self.session_id,
            file_recorder=make_audit_recorder(self.config.state_dir),
        )
        self._agent = build_agent(
            self._persona,
            self.config,
            root_dir=self.repo_root,
            middleware=[cost_mw, audit_mw],
            model_override=self._model_override,
        )
        return self._agent


def _register_navigation_slash(reg: SlashRegistry, sess: InteractiveSession) -> None:
    """Register /quit, /exit, /help, /clear slash handlers."""

    async def _quit(_: SlashParsed) -> bool:
        return True

    async def _help(_: SlashParsed) -> bool:
        _CONSOLE.print("[bold]Slash commands:[/]")
        for name, desc in reg.all_help():
            _CONSOLE.print(f"  /{name:14s} {desc}")
        return False

    async def _clear(_: SlashParsed) -> bool:
        sess.clear_agent_cache()
        _CONSOLE.print("[dim]context cleared (new session thread)[/]")
        return False

    reg.register("quit", _quit, help="exit the REPL")
    reg.register("exit", _quit, help="alias for /quit")
    reg.register("help", _help, help="show slash commands")
    reg.register("clear", _clear, help="clear conversation context")


def _register_persona_slash(reg: SlashRegistry, sess: InteractiveSession) -> None:
    """Register /agent and /model slash handlers."""

    async def _agent_cmd(cmd: SlashParsed) -> bool:
        if not cmd.args:
            _CONSOLE.print(f"current: [cyan]{sess.persona.name}@{sess.persona.version}[/]")
            for p in sess.personas:
                _CONSOLE.print(f"  - {p.name}@{p.version} ({p.backend.value})")
            return False
        try:
            new = sess.set_persona(cmd.args[0])
            _CONSOLE.print(f"[green]switched persona → {new.name}@{new.version}[/]")
        except ValueError as e:
            _CONSOLE.print(f"[red]{e}[/]")
        return False

    async def _model_cmd(cmd: SlashParsed) -> bool:
        if not cmd.args:
            cur = sess.model_override or sess.persona.model
            _CONSOLE.print(f"current model: [cyan]{cur}[/]")
            return False
        if cmd.args[0] in ("-", "reset"):
            sess.set_model(None)
            _CONSOLE.print("[green]model override cleared[/]")
        else:
            sess.set_model(cmd.args[0])
            _CONSOLE.print(f"[green]model → {cmd.args[0]}[/]")
        return False

    reg.register("agent", _agent_cmd, help="list or switch persona: /agent [name]")
    reg.register("model", _model_cmd, help="override model: /model <id> | reset")


def _register_telemetry_slash(reg: SlashRegistry) -> None:
    """Register /stats, /budget, /runs slash handlers."""

    async def _stats(_: SlashParsed) -> bool:
        from .stats import stats_command

        stats_command(by="model", since_days=1)
        return False

    async def _budget(_: SlashParsed) -> bool:
        from .stats import budget_command

        budget_command()
        return False

    async def _runs(_: SlashParsed) -> bool:
        from .runs import runs_list_command

        runs_list_command(limit=10, state_filter=None)
        return False

    reg.register("stats", _stats, help="LLM-call stats (last 24h)")
    reg.register("budget", _budget, help="budget ledger")
    reg.register("runs", _runs, help="list recent workflow runs")


def _register_slash(reg: SlashRegistry, sess: InteractiveSession) -> None:
    _register_navigation_slash(reg, sess)
    _register_persona_slash(reg, sess)
    _register_telemetry_slash(reg)


def _completer(personas: list[Persona], slash_names: list[str]) -> WordCompleter:
    words = [f"/{n}" for n in slash_names]
    words += [p.name for p in personas]
    return WordCompleter(words, ignore_case=True, sentence=True)


async def _invoke_and_stream(agent: Any, user_text: str, session_id: UUID) -> None:
    """Invoke the agent and pretty-print the response.

    v0.1 keeps it simple — full ainvoke, then print the final message.
    Token-level streaming via astream is a Step 16 polish.
    """
    result = await agent.ainvoke(
        {"messages": [{"role": "user", "content": user_text}]},
        config={"configurable": {"thread_id": str(session_id)}},
    )
    messages = result.get("messages", []) if isinstance(result, dict) else []
    if not messages:
        return
    last = messages[-1]
    content: Any = getattr(last, "content", "") or ""
    if isinstance(content, list):
        content = "\n".join(
            (c.get("text", str(c)) if isinstance(c, dict) else str(c)) for c in content
        )
    _CONSOLE.print(str(content))


async def _repl_loop(
    sess: InteractiveSession,
    reg: SlashRegistry,
    prompt_session: PromptSession[str],
) -> int:
    """Inner REPL loop. Returns 0 on clean exit, non-zero on error."""
    while True:
        try:
            line = await prompt_session.prompt_async("» ")
        except (EOFError, KeyboardInterrupt):
            _CONSOLE.print()
            return 0
        line = (line or "").strip()
        if not line:
            continue
        parsed = parse_slash(line)
        if parsed is not None:
            if parsed.name == "":
                _CONSOLE.print("[dim]empty slash command; try /help[/]")
                continue
            done = await reg.dispatch(parsed)
            if done:
                return 0
            if parsed.name not in reg.names:
                _CONSOLE.print(f"[yellow]unknown command: /{parsed.name}[/]")
            continue
        # Forward to agent.
        expanded = _expand_file_refs(line, sess.repo_root)
        agent = sess.build_agent_if_needed()
        try:
            await _invoke_and_stream(agent, expanded, sess.session_id)
        except Exception as e:
            _CONSOLE.print(f"[red]agent error:[/] {type(e).__name__}: {e}")


async def _interactive_loop_async(persona_override: str | None, model_override: str | None) -> int:
    config = load_config()
    require_consent(config.data_dir)
    db = Database(config.database_url)
    await db.init_schema()
    personas = load_personas_from_dir(_seed_root() / "personas")
    if not personas:
        _CONSOLE.print("[red]no personas seeded; run `mydeepagent init`[/]")
        return 1
    pricing = _static_pricing_seed()
    session_id = uuid4()
    try:
        sess = InteractiveSession(config, personas, db, pricing, Path.cwd(), session_id)
        if persona_override:
            try:
                sess.set_persona(persona_override)
            except ValueError as e:
                _CONSOLE.print(f"[red]{e}[/]")
                return 1
        if model_override:
            sess.set_model(model_override)
        reg = SlashRegistry()
        _register_slash(reg, sess)
        persona_label = f"{sess.persona.name}@{sess.persona.version}"
        _CONSOLE.print(f"[bold cyan]my-deepagent[/] — persona [cyan]{persona_label}[/]")
        _CONSOLE.print("[dim]type /help for commands, /quit to exit[/]")
        prompt_session: PromptSession[str] = PromptSession(
            history=FileHistory(str(_history_path(config))),
            completer=_completer(personas, reg.names),
        )
        return await _repl_loop(sess, reg, prompt_session)
    finally:
        await db.dispose()


def interactive_command(persona: str | None = None, model: str | None = None) -> int:
    """Entry point for the interactive REPL. Returns an exit code."""
    return asyncio.run(_interactive_loop_async(persona, model))
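
The repo-root containment behind `@file` expansion is easier to see with a concrete run. The sketch below re-declares the same regular expression and a trimmed `expand_file_refs` so that it runs standalone; the temporary directory layout and the file names `notes.txt` and `secret.txt` are invented for the demonstration.

```python
import re
import tempfile
from pathlib import Path

FILE_REF = re.compile(r"(?<![\w./])@([\w./\-]+)")
FENCE = "`" * 3  # built at runtime so this sample does not embed a literal fence


def expand_file_refs(text: str, repo_root: Path) -> str:
    """Inline @path references that resolve to files inside repo_root."""
    root = repo_root.resolve()

    def _replace(match: re.Match[str]) -> str:
        rel = match.group(1)
        target = (root / rel).resolve()
        try:
            target.relative_to(root)  # raises ValueError if target escapes root
        except ValueError:
            return match.group(0)
        if not target.is_file():
            return match.group(0)
        content = target.read_text(encoding="utf-8", errors="replace")
        suffix = target.suffix.lstrip(".")
        return f"\n{FENCE}{suffix}\n# {rel}\n{content}\n{FENCE}\n"

    return FILE_REF.sub(_replace, text)


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp) / "repo"
        repo.mkdir()
        (repo / "notes.txt").write_text("hello", encoding="utf-8")
        (Path(tmp) / "secret.txt").write_text("top secret", encoding="utf-8")
        return expand_file_refs("see @notes.txt and @../secret.txt from a@b.com", repo)
```

In `demo()`, only `@notes.txt` is expanded. The `../secret.txt` reference resolves outside the repo root and is left as typed, and the email-style `a@b.com` is never treated as a reference because of the look-behind in the pattern.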


@@ -0,0 +1,40 @@
"""login / logout / keys list commands."""
from __future__ import annotations
import typer
from rich.console import Console
from ..i18n import t
from ..keys import delete_api_key, get_api_key, list_providers, mask, set_api_key
_CONSOLE = Console()
def login_command(provider: str) -> None:
value = typer.prompt(t("login.prompt", provider=provider), hide_input=True, default="")
if not value.strip():
_CONSOLE.print(f"[yellow]{t('login.empty')}[/]")
raise typer.Exit(code=1)
set_api_key(provider, value.strip())
_CONSOLE.print(f"[green]{t('login.saved', provider=provider)}[/]")
def logout_command(provider: str) -> None:
removed = delete_api_key(provider)
if removed:
_CONSOLE.print(f"[green]{t('logout.removed', provider=provider)}[/]")
else:
_CONSOLE.print(f"[yellow]{t('logout.not_found', provider=provider)}[/]")
def keys_list_command() -> None:
_CONSOLE.print(t("keys.header"))
found = False
for provider in list_providers():
value = get_api_key(provider)
if value:
_CONSOLE.print(t("keys.entry", provider=provider, masked=mask(value)))
found = True
if not found:
_CONSOLE.print(t("keys.none"))
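
Review note: `keys_list_command` prints `mask(value)` rather than the raw key, but `mask` lives in `..keys` and is not shown here. A minimal sketch of one way such a helper could behave, assuming it keeps a few characters at each end; the `visible` parameter and the exact output shape are assumptions.

```python
def mask(secret: str, visible: int = 4) -> str:
    """Hide the middle of a secret, keeping `visible` characters at each end."""
    if len(secret) <= visible * 2:
        # Too short to reveal anything safely: mask every character.
        return "*" * len(secret)
    hidden = "*" * (len(secret) - visible * 2)
    return f"{secret[:visible]}{hidden}{secret[-visible:]}"
```

Masking the whole value for short inputs avoids leaking most of a short token through the visible ends.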

@@ -1 +1,150 @@
"""Typer CLI entry point. Filled in Step 6."""
"""my-deepagent CLI entry point."""
from __future__ import annotations
from pathlib import Path
import typer
from .doctor import doctor_command
from .init import init_command
from .keys_cmd import keys_list_command, login_command, logout_command
app = typer.Typer(no_args_is_help=False, add_completion=True)
runs_app = typer.Typer(help="Inspect or resume past runs.")
@runs_app.command("list")
def runs_list(
limit: int = typer.Option(20, help="Number of runs to show"),
state: str | None = typer.Option(None, help="Filter by state"),
) -> None:
"""List recent runs."""
from .runs import runs_list_command
runs_list_command(limit, state)
@runs_app.command("show")
def runs_show(run_id: str = typer.Argument(...)) -> None:
"""Show details for a specific run."""
from .runs import runs_show_command
runs_show_command(run_id)
@runs_app.command("resume")
def runs_resume(run_id: str = typer.Argument(...)) -> None:
"""Resume a paused run (v0.1.0: not implemented — shows status only)."""
from .runs import runs_resume_command
runs_resume_command(run_id)
app.add_typer(runs_app, name="runs")
@app.command()
def init() -> None:
"""First-run setup: governance consent + API key + doctor."""
init_command()
@app.command()
def login(provider: str = typer.Argument("openrouter")) -> None:
"""Store an API key for the given provider in the OS keyring."""
login_command(provider)
@app.command()
def logout(provider: str = typer.Argument("openrouter")) -> None:
"""Remove a stored API key from the OS keyring."""
logout_command(provider)
@app.command(name="keys")
def keys_list() -> None:
"""List registered providers (masked)."""
keys_list_command()
@app.command()
def doctor() -> None:
"""Run environment diagnostics (8-check suite; also refreshes the cached OpenRouter pricing)."""
doctor_command()
@app.command(name="run")
def run(
workflow_path: Path = typer.Argument(..., help="Path to the workflow yaml"), # noqa: B008
repo: Path = typer.Option(Path.cwd(), help="Repo root"), # noqa: B008
base_branch: str = typer.Option("main", help="Base branch"),
no_preview: bool = typer.Option(False, "--no-preview", help="Skip cost preview"),
) -> None:
"""Execute a workflow template end-to-end."""
from .run import run_command
run_command(workflow_path, repo, base_branch, no_preview)
@app.command()
def stats(
by: str = typer.Option("model", help="model | persona | day"),
since_days: int = typer.Option(7, help="Window size in days"),
) -> None:
"""Aggregate LLM-call stats from the ledger."""
from .stats import stats_command
stats_command(by, since_days)
@app.command()
def budget() -> None:
"""Show the current budget ledger (per-scope spend / cap)."""
from .stats import budget_command
budget_command()
@app.command(name="costs")
def costs() -> None:
"""Alias for `stats --by day` over the last 30 days."""
from .stats import stats_command
stats_command(by="day", since_days=30)
@app.command(name="pricing")
def pricing() -> None:
"""Show cached OpenRouter pricing matrix (populated by `doctor`)."""
from .stats import pricing_command
pricing_command()
@app.callback(invoke_without_command=True)
def main(
ctx: typer.Context,
agent: str | None = typer.Option(None, "--agent", help="Start with a specific persona"),
model: str | None = typer.Option(None, "--model", help="Model override"),
) -> None:
from ..logging import configure_logging
try:
from ..config import load_config
cfg = load_config()
configure_logging(level=cfg.log_level, json_output=False)
except Exception:
configure_logging(level="info", json_output=False)
if ctx.invoked_subcommand is None:
from .interactive import interactive_command
code = interactive_command(agent, model)
raise typer.Exit(code=code)
if __name__ == "__main__":
app()

@@ -1 +1,194 @@
"""CLI run command implementation. Implemented in Step 6."""
"""mydeepagent run <workflow.yaml> — execute a workflow end-to-end."""
from __future__ import annotations
import asyncio
from pathlib import Path
import typer
from rich.console import Console
from rich.table import Table
from sqlalchemy import select
from ..artifact_schema import ArtifactSchemaRegistry
from ..binding import BackendAvailability, PersonaConsentStore, bind_personas
from ..budget import BudgetTracker, make_budget_tracker_from_config
from ..config import Config, load_config
from ..engine import WorkflowEngine
from ..enums import Backend
from ..governance import require_consent
from ..monitoring.cost_estimator import WorkflowCostEstimate, estimate_workflow
from ..monitoring.pricing import ModelPrice, PricingCache
from ..persistence.db import Database
from ..persistence.models import ModelPricingRow
from ..persona import load_personas_from_dir
from ..tui.approval import cli_approval_callback
from ..workflow import load_workflow_yaml
_CONSOLE = Console()
def run_command(
workflow_path: Path,
repo: Path,
base_branch: str,
no_preview: bool = False,
) -> None:
"""Synchronous CLI wrapper for the async engine."""
asyncio.run(_run_async(workflow_path, repo, base_branch, no_preview))
async def cli_budget_prompt(scope: str, projected: float, cap: float) -> bool:
"""Prompt the user to extend the budget cap when it is hit."""
_CONSOLE.print()
_CONSOLE.print(
f"[yellow]Budget cap reached[/]: scope={scope} projected=${projected:.4f} cap=${cap:.4f}"
)
return typer.confirm("Extend cap and proceed?", default=False)
def _static_pricing_seed_fallback() -> list[ModelPrice]:
"""Return seed model prices used when the model_pricing DB table is empty.
Unit: USD per 1,000 tokens. (OpenRouter publishes per-token; we store per-1K to keep
cost arithmetic in a more readable range. ``compute_cost(model, in, out)`` divides
by 1000.)
"""
return [
ModelPrice("anthropic/claude-sonnet-4-6", 0.003, 0.015, 200_000),
ModelPrice("anthropic/claude-haiku-4-5", 0.001, 0.005, 200_000),
ModelPrice("anthropic/claude-opus-4-1", 0.015, 0.075, 200_000),
ModelPrice("deepseek/deepseek-chat", 0.00028, 0.00112, 64_000),
]
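
Review note: the docstring above says prices are stored per 1,000 tokens and that `compute_cost(model, in, out)` divides by 1000. The arithmetic can be sketched as follows; `Price` and this `compute_cost` signature are stand-ins for illustration, not the project's `ModelPrice` or `PricingCache` API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Stand-in for the project's ModelPrice; field names are assumptions."""

    model: str
    input_per_1k_usd: float
    output_per_1k_usd: float


def compute_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one call, with prices quoted per 1,000 tokens."""
    return (
        input_tokens / 1000 * price.input_per_1k_usd
        + output_tokens / 1000 * price.output_per_1k_usd
    )


sonnet = Price("anthropic/claude-sonnet-4-6", 0.003, 0.015)
# 2K input tokens + 1K output tokens: 2 * 0.003 + 1 * 0.015
print(f"${compute_cost(sonnet, 2_000, 1_000):.4f}")  # prints $0.0210
```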
async def _load_pricing_from_db(config: Config, db: Database) -> PricingCache:
"""Load pricing from the persisted model_pricing table.
Falls back to the static seed when the table is empty (doctor not yet run).
"""
async with db.session() as s:
rows = list((await s.execute(select(ModelPricingRow))).scalars().all())
cache = PricingCache()
if rows:
cache.set(
[
ModelPrice(
model=r.model,
input_per_1k_usd=r.input_per_1k_usd,
output_per_1k_usd=r.output_per_1k_usd,
context_length=r.context_length,
)
for r in rows
]
)
return cache
cache.set(_static_pricing_seed_fallback())
return cache
def _print_preview(estimate: WorkflowCostEstimate, cfg: Config) -> None:
table = Table(title="Cost preview")
table.add_column("Phase")
table.add_column("Persona")
table.add_column("Model")
table.add_column("In/Out tokens", justify="right")
table.add_column("Est. cost", justify="right")
for p in estimate.phases:
cost_str = f"${p.estimated_cost_usd:.4f}"
table.add_row(
p.phase_key,
p.persona_name,
p.model,
f"{p.estimated_input_tokens}/{p.estimated_output_tokens}",
cost_str,
)
_CONSOLE.print(table)
_CONSOLE.print(f"Total estimated: [bold]${estimate.total_usd:.4f}[/]")
_CONSOLE.print(
f"Run cap: [bold]${cfg.budget_run_usd}[/] | Daily cap: [bold]${cfg.budget_daily_usd}[/]"
)
async def _run_async(
workflow_path: Path,
repo: Path,
base_branch: str,
no_preview: bool,
) -> None:
config = load_config()
require_consent(config.data_dir)
template = load_workflow_yaml(workflow_path)
# Locate seed schemas relative to the installed package root
seed_root = Path(__file__).resolve().parents[3] / "docs" / "schemas"
personas_dir = seed_root / "personas"
artifacts_root = seed_root / "artifacts"
personas = load_personas_from_dir(personas_dir)
registry = ArtifactSchemaRegistry(roots=[artifacts_root])
db = Database(config.database_url)
await db.init_schema()
# Crash recovery: mark non-terminal runs from a previous process as failed
# so the active-run uniqueness slot is freed before starting new work.
from ..recovery import sweep_orphan_runs
report = await sweep_orphan_runs(db)
if report.total:
_CONSOLE.print(
f"[yellow]recovery: marked {len(report.failed_runs)} orphan run(s) "
f"and {len(report.failed_phases)} phase(s) as failed[/]"
)
try:
consent_store = PersonaConsentStore(config.data_dir / "persona-consents.json")
bindings = bind_personas(
template,
personas,
BackendAvailability(available_backends=frozenset(Backend)),
consent_store,
)
# Pricing + cost preview — use DB-cached prices; fall back to static seed
pricing = await _load_pricing_from_db(config, db)
if not no_preview:
estimate = estimate_workflow(template, bindings, pricing)
_print_preview(estimate, config)
if not typer.confirm("Proceed?", default=True):
raise typer.Exit(code=0)
budget: BudgetTracker = make_budget_tracker_from_config(
db, config, prompt_callback=cli_budget_prompt
)
await budget.init()
engine = WorkflowEngine(
db=db,
config=config,
persona_pool=personas,
artifact_registry=registry,
consent_store=consent_store,
available_backends=BackendAvailability(available_backends=frozenset(Backend)),
approval_callback=cli_approval_callback,
budget_tracker=budget,
pricing=pricing,
)
engine.install_signal_handlers()
result = await engine.run(
template,
repo_path=repo,
base_branch=base_branch,
)
_CONSOLE.print(f"[bold]{result.state.value}[/] run_id={result.run_id}")
if result.final_report_path:
_CONSOLE.print(f"report: {result.final_report_path}")
if result.error:
_CONSOLE.print(f"[red]error[/]: {result.error}")
raise typer.Exit(code=1)
finally:
await db.dispose()
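
Review note: `_run_async` sweeps orphan runs so that "the active-run uniqueness slot is freed before starting new work". A minimal sqlite3 sketch of that interaction, assuming the slot is a partial unique index over non-terminal runs (the commit message names it `ux_active_run_repo_base`). The table layout and this `sweep_orphan_runs` are simplified, hypothetical stand-ins for the project's schema and recovery module.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory runs table with one active slot per (repo, base)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE runs (id TEXT PRIMARY KEY, repo TEXT, base TEXT, state TEXT)")
    # Partial unique index: only non-terminal rows take part in the constraint.
    conn.execute(
        "CREATE UNIQUE INDEX ux_active_run_repo_base ON runs (repo, base) "
        "WHERE state NOT IN ('completed', 'failed', 'aborted')"
    )
    return conn


def sweep_orphan_runs(conn: sqlite3.Connection) -> int:
    """Mark every non-terminal run as failed; return how many were swept."""
    cur = conn.execute(
        "UPDATE runs SET state = 'failed' "
        "WHERE state NOT IN ('completed', 'failed', 'aborted')"
    )
    conn.commit()
    return cur.rowcount
```

Without the sweep, a run left in `executing` by a crashed process would block every later run on the same repo and base branch with an integrity error.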

@@ -0,0 +1,204 @@
"""mydeepagent runs list / show / resume — read-only-ish run history queries."""
from __future__ import annotations
import asyncio
from pathlib import Path
from uuid import UUID
import typer
from rich.console import Console
from rich.table import Table
from sqlalchemy import desc, select
from ..config import load_config
from ..persistence.db import Database
from ..persistence.models import (
ArtifactRow,
RunEventRow,
RunPhaseRow,
RunRow,
)
_CONSOLE = Console()
def runs_list_command(limit: int = 20, state_filter: str | None = None) -> None:
asyncio.run(_runs_list_async(limit, state_filter))
def runs_show_command(run_id: str) -> None:
asyncio.run(_runs_show_async(run_id))
def runs_resume_command(run_id: str) -> None:
asyncio.run(_runs_resume_async(run_id))
async def _runs_list_async(limit: int, state_filter: str | None) -> None:
config = load_config()
db = Database(config.database_url)
await db.init_schema()
try:
async with db.session() as s:
stmt = select(RunRow).order_by(desc(RunRow.created_at)).limit(limit)
if state_filter:
stmt = stmt.where(RunRow.state == state_filter)
rows = (await s.execute(stmt)).scalars().all()
if not rows:
_CONSOLE.print("[dim](no runs)[/]")
return
table = Table(title=f"Recent runs (latest {len(rows)})")
table.add_column("Run ID")
table.add_column("State")
table.add_column("Repo")
table.add_column("Branch")
table.add_column("Created")
table.add_column("Ended")
for r in rows:
table.add_row(
str(r.id)[:8],
r.state,
Path(r.repo_path).name,
r.base_branch,
(r.created_at or "")[:19],
(r.ended_at or "")[:19],
)
_CONSOLE.print(table)
finally:
await db.dispose()
async def _runs_show_async(run_id: str) -> None:
full_id = await _resolve_run_id(run_id)
config = load_config()
db = Database(config.database_url)
await db.init_schema()
try:
async with db.session() as s:
run = await s.get(RunRow, full_id)
if run is None:
_CONSOLE.print(f"[red]run not found:[/] {run_id}")
raise typer.Exit(code=1)
phases = (
(
await s.execute(
select(RunPhaseRow)
.where(RunPhaseRow.run_id == full_id)
.order_by(RunPhaseRow.seq)
)
)
.scalars()
.all()
)
artifacts = (
(await s.execute(select(ArtifactRow).where(ArtifactRow.run_id == full_id)))
.scalars()
.all()
)
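# Fetch the 50 most recent events, then restore chronological order for display.
events = list(
reversed(
(
await s.execute(
select(RunEventRow)
.where(RunEventRow.run_id == full_id)
.order_by(desc(RunEventRow.seq))
.limit(50)
)
)
.scalars()
.all()
)
)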
_CONSOLE.print(f"[bold]Run {run.id}[/]")
_CONSOLE.print(f" state: [cyan]{run.state}[/]")
_CONSOLE.print(f" repo: {run.repo_path}@{run.base_branch}")
_CONSOLE.print(f" worktree: {run.worktree_root}")
_CONSOLE.print(f" created: {run.created_at}")
_CONSOLE.print(f" ended: {run.ended_at or ''}")
if run.final_report_path:
_CONSOLE.print(f" report: {run.final_report_path}")
_CONSOLE.print()
_CONSOLE.print("[bold]Phases[/]")
for ph in phases:
_CONSOLE.print(f" - {ph.phase_key:20s} state={ph.state:15s} attempts={ph.attempts}")
if artifacts:
_CONSOLE.print()
_CONSOLE.print("[bold]Artifacts[/]")
for a in artifacts:
_CONSOLE.print(f" - {a.path} (schema={a.schema_id}, valid={a.valid})")
_CONSOLE.print()
_CONSOLE.print(f"[bold]Events (last {len(events)})[/]")
for ev in events:
_CONSOLE.print(f" [{ev.seq:4d}] {ev.ts} {ev.type}")
finally:
await db.dispose()
async def _runs_resume_async(run_id: str) -> None:
"""v0.1.0: resume is not implemented.
Surfaces the run state and hints at next steps. Future v0.2 implementation:
rehydrate the workflow template by template_hash, replay phase loop from the
first non-completed phase using the existing checkpointer.
"""
full_id = await _resolve_run_id(run_id)
config = load_config()
db = Database(config.database_url)
await db.init_schema()
try:
async with db.session() as s:
run = await s.get(RunRow, full_id)
if run is None:
_CONSOLE.print(f"[red]run not found:[/] {run_id}")
raise typer.Exit(code=1)
if run.state in ("completed", "failed", "aborted"):
_CONSOLE.print(
f"[yellow]Run {run.id} is already terminal ({run.state}). "
"Start a fresh run with `mydeepagent run <workflow.yaml>`.[/]"
)
raise typer.Exit(code=1)
_CONSOLE.print(
"[yellow]Resume is not implemented in v0.1.0. The crash-recovery sweep at the start of "
"the next `mydeepagent run` will mark this run as failed; relaunch the workflow.[/]"
)
raise typer.Exit(code=2)
finally:
await db.dispose()
async def _resolve_run_id(prefix_or_full: str) -> str:
"""Accept either a full UUID or a 6+ char prefix and return the canonical full id."""
try:
return str(UUID(prefix_or_full))
except ValueError:
pass
if len(prefix_or_full) < 6:
_CONSOLE.print(
f"[red]run id too short (need full UUID or >=6-char prefix):[/] {prefix_or_full}"
)
raise typer.Exit(code=2)
config = load_config()
db = Database(config.database_url)
await db.init_schema()
try:
async with db.session() as s:
rows = (
(
await s.execute(
select(RunRow.id)
# autoescape keeps "%" and "_" in user input from acting as LIKE wildcards.
.where(RunRow.id.startswith(prefix_or_full, autoescape=True))
.limit(2)
)
)
.scalars()
.all()
)
if not rows:
_CONSOLE.print(f"[red]no run matches prefix:[/] {prefix_or_full}")
raise typer.Exit(code=1)
if len(rows) > 1:
_CONSOLE.print(f"[red]ambiguous prefix matches >1 run:[/] {prefix_or_full}")
raise typer.Exit(code=1)
return rows[0]
finally:
await db.dispose()
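
Review note: the decision order in `_resolve_run_id` (full UUID first, then minimum prefix length, then exactly one match) can be sketched without a database. This is an in-memory illustration only; `known_ids` replaces the `RunRow` query and the exception types are assumptions, since the CLI version prints and exits instead.

```python
from uuid import UUID


def resolve_run_id(prefix_or_full: str, known_ids: list[str], min_prefix: int = 6) -> str:
    """Return the canonical id for a full UUID or an unambiguous prefix."""
    try:
        return str(UUID(prefix_or_full))  # full UUIDs short-circuit the lookup
    except ValueError:
        pass
    if len(prefix_or_full) < min_prefix:
        raise ValueError(f"need a full UUID or a >={min_prefix}-char prefix")
    matches = [rid for rid in known_ids if rid.startswith(prefix_or_full)]
    if not matches:
        raise LookupError(f"no run matches prefix: {prefix_or_full}")
    if len(matches) > 1:
        raise LookupError(f"prefix matches more than one run: {prefix_or_full}")
    return matches[0]
```

Note that a full UUID is returned without checking that the run exists, which mirrors the original: the not-found case is handled later by the caller.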

@@ -1 +1,179 @@
"""CLI stats command for usage summary. Implemented in Step 12."""
"""mydeepagent stats / costs / budget / pricing — read-only ledger + history queries."""
from __future__ import annotations
import asyncio
from collections.abc import Sequence
from datetime import UTC, datetime, timedelta
from typing import Any
import typer
from rich.console import Console
from rich.table import Table
from sqlalchemy import func, select
from ..config import load_config
from ..persistence.db import Database
from ..persistence.models import BudgetLedgerRow, LlmCallRow, ModelPricingRow
_CONSOLE = Console()
def stats_command(by: str = "model", since_days: int = 7) -> None:
"""Synchronous CLI wrapper for the async stats query."""
asyncio.run(_stats_async(by, since_days))
async def _stats_async(by: str, since_days: int) -> None:
config = load_config()
db = Database(config.database_url)
await db.init_schema()
try:
since = (datetime.now(UTC) - timedelta(days=since_days)).isoformat(timespec="seconds")
async with db.session() as s:
if by == "model":
rows: Sequence[Any] = (
await s.execute(
select(
LlmCallRow.model,
func.count().label("calls"),
func.sum(LlmCallRow.input_tokens).label("input"),
func.sum(LlmCallRow.output_tokens).label("output"),
func.sum(LlmCallRow.cost_usd_total).label("cost"),
)
.where(LlmCallRow.ts >= since)
.group_by(LlmCallRow.model)
)
).all()
_render_stats_table(
"Stats by model",
rows,
["Model", "Calls", "Input", "Output", "Cost ($)"],
)
elif by == "persona":
rows = (
await s.execute(
select(
LlmCallRow.persona_name,
func.count().label("calls"),
func.sum(LlmCallRow.cost_usd_total).label("cost"),
)
.where(LlmCallRow.ts >= since)
.group_by(LlmCallRow.persona_name)
)
).all()
_render_stats_table(
"Stats by persona",
rows,
["Persona", "Calls", "Cost ($)"],
)
elif by == "day":
rows = (
await s.execute(
select(
func.substr(LlmCallRow.ts, 1, 10).label("day"),
func.count().label("calls"),
func.sum(LlmCallRow.cost_usd_total).label("cost"),
)
.where(LlmCallRow.ts >= since)
.group_by("day")
.order_by("day")
)
).all()
_render_stats_table(
"Stats by day",
rows,
["Day", "Calls", "Cost ($)"],
)
else:
typer.echo(f"unknown --by option: {by!r}", err=True)
raise typer.Exit(code=2)
finally:
await db.dispose()
def budget_command() -> None:
"""Synchronous CLI wrapper for the async budget ledger query."""
asyncio.run(_budget_async())
async def _budget_async() -> None:
config = load_config()
db = Database(config.database_url)
await db.init_schema()
try:
async with db.session() as s:
rows = list((await s.execute(select(BudgetLedgerRow))).scalars().all())
if not rows:
_CONSOLE.print("[dim](no budget activity yet)[/]")
return
table = Table(title="Budget ledger")
table.add_column("Scope")
table.add_column("Spent ($)", justify="right")
table.add_column("Cap ($)", justify="right")
table.add_column("Remaining ($)", justify="right")
table.add_column("Last update")
for row in rows:
remaining = (
"" if row.cap_usd is None else f"{max(0.0, row.cap_usd - row.spent_usd):.4f}"
)
cap = "" if row.cap_usd is None else f"{row.cap_usd:.4f}"
table.add_row(
row.scope,
f"{row.spent_usd:.4f}",
cap,
remaining,
row.last_updated,
)
_CONSOLE.print(table)
finally:
await db.dispose()
def pricing_command() -> None:
"""Show cached OpenRouter pricing matrix (populated by `doctor`)."""
asyncio.run(_pricing_async())
async def _pricing_async() -> None:
config = load_config()
db = Database(config.database_url)
await db.init_schema()
try:
async with db.session() as s:
rows = list(
(await s.execute(select(ModelPricingRow).order_by(ModelPricingRow.model)))
.scalars()
.all()
)
if not rows:
_CONSOLE.print("[dim](no pricing data — run `mydeepagent doctor` to fetch)[/]")
return
table = Table(title="OpenRouter pricing (per 1K tokens, USD)")
table.add_column("Model")
table.add_column("Input", justify="right")
table.add_column("Output", justify="right")
table.add_column("Context", justify="right")
table.add_column("Fetched")
for r in rows:
table.add_row(
r.model,
f"{r.input_per_1k_usd:.4f}",
f"{r.output_per_1k_usd:.4f}",
str(r.context_length),
(r.fetched_at or "")[:19],
)
_CONSOLE.print(table)
finally:
await db.dispose()
def _render_stats_table(title: str, rows: Sequence[Any], headers: list[str]) -> None:
if not rows:
_CONSOLE.print("[dim](no data for the selected period)[/]")
return
table = Table(title=title)
for h in headers:
table.add_column(h)
for row in rows:
table.add_row(*[str(v if v is not None else "") for v in row])
_CONSOLE.print(table)

@@ -1 +1,917 @@
"""LangGraph run engine orchestrator. Implemented in Step 7."""
"""WorkflowEngine: orchestrates run lifecycle, phase loop, artifact validation, approval gate."""
from __future__ import annotations
import asyncio
import json
import signal
from contextlib import suppress
from dataclasses import dataclass
from datetime import UTC, datetime
from pathlib import Path
from typing import Any
from uuid import UUID, uuid4
from sqlalchemy import select
from .artifact_schema import ArtifactSchemaRegistry
from .audit import make_audit_recorder
from .binding import (
BackendAvailability,
Binding,
BindingOverride,
PersonaConsentStore,
bind_personas,
)
from .budget import BudgetTracker
from .config import Config
from .enums import ApprovalDecisionAction, ApprovalState, RunPhaseState, RunState
from .errors import MyDeepAgentError
from .hash import sha256
from .middleware.artifact_watcher import ArtifactWatcherMiddleware
from .middleware.audit import AuditToolMiddleware
from .middleware.cost import CostMiddleware
from .monitoring.pricing import PricingCache
from .persistence.db import Database
from .persistence.models import (
AgentPersonaRow,
ApprovalDecisionRow,
ApprovalRequestRow,
ArtifactRow,
LlmCallRow,
RunBindingRow,
RunEventRow,
RunInputRow,
RunPhaseRow,
RunRow,
WorkflowTemplateRow,
)
from .persona import Persona
from .run_event import RunEventType, run_idempotency_key
from .session import build_agent
from .workflow import WorkflowPhase, WorkflowTemplate
# ApprovalCallback type: async (request_payload: dict, gates: list[str]) -> ApprovalDecisionAction
ApprovalCallback = Any # Callable[[dict, list[str]], Awaitable[ApprovalDecisionAction]]
_DEFAULT_PHASE_TIMEOUT_SECONDS = 300 # 5 minutes
@dataclass(frozen=True)
class RunResult:
run_id: UUID
state: RunState
final_report_path: Path | None
error: str | None = None
class _PhaseAbortedError(Exception):
def __init__(self, reason: str) -> None:
self.reason = reason
super().__init__(reason)
class WorkflowEngine:
"""In-process workflow engine for v0.1.0.
For each phase: build_agent -> invoke -> wait for write_file targeting
expected_artifact_path -> load + jsonschema validate -> repair 1x if invalid
-> approval gate -> next phase.
All events appended idempotently to run_events via the
(run_id, idempotency_key) UNIQUE constraint — concurrent/retry safe.
"""
def __init__(
self,
db: Database,
config: Config,
persona_pool: list[Persona],
artifact_registry: ArtifactSchemaRegistry,
consent_store: PersonaConsentStore,
available_backends: BackendAvailability,
approval_callback: ApprovalCallback,
budget_tracker: BudgetTracker | None = None,
pricing: PricingCache | None = None,
) -> None:
self._db = db
self._config = config
self._personas = persona_pool
self._artifacts = artifact_registry
self._consent = consent_store
self._backends = available_backends
self._approval = approval_callback
self._budget = budget_tracker
self._pricing = pricing or PricingCache()
self._shutdown_event: asyncio.Event = asyncio.Event()
self._inflight_tasks: set[asyncio.Task[Any]] = set()
def install_signal_handlers(self) -> None:
"""Attach SIGTERM/SIGINT handlers to the running event loop.
Idempotent: calling twice replaces the previous handlers. Should be invoked
from ``cli/run.py`` once the asyncio loop is up. On shutdown signal:
in-flight ainvoke() tasks get a 30s grace, then are cancelled.
"""
loop = asyncio.get_running_loop()
for sig in (signal.SIGTERM, signal.SIGINT):
with suppress(NotImplementedError, ValueError):
loop.add_signal_handler(sig, self._on_signal, sig)
def _on_signal(self, sig: signal.Signals) -> None:
self._shutdown_event.set()
loop = asyncio.get_running_loop()
loop.call_later(30.0, self._force_cancel_inflight)
def _force_cancel_inflight(self) -> None:
for task in list(self._inflight_tasks):
if not task.done():
task.cancel()
@property
def shutdown_requested(self) -> bool:
return self._shutdown_event.is_set()
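
Review note: the handlers above implement "flag now, cancel after a grace period". A minimal standalone sketch of that pattern follows, with the grace period made configurable so it can be exercised quickly. `GracefulShutdown` and `demo` are illustrative names, not part of the engine, and the sketch calls `on_signal()` directly instead of wiring real signal handlers.

```python
import asyncio


class GracefulShutdown:
    """Set a flag immediately, then cancel unfinished tasks after a grace period."""

    def __init__(self, grace_seconds: float = 30.0) -> None:
        self.requested = asyncio.Event()
        self.inflight: set[asyncio.Task] = set()
        self._grace = grace_seconds

    def on_signal(self) -> None:
        self.requested.set()
        # Schedule the hard cancel; tasks that finish in time are untouched.
        asyncio.get_running_loop().call_later(self._grace, self._cancel_inflight)

    def _cancel_inflight(self) -> None:
        for task in list(self.inflight):
            if not task.done():
                task.cancel()


async def demo() -> tuple[bool, bool]:
    shutdown = GracefulShutdown(grace_seconds=0.05)
    quick = asyncio.create_task(asyncio.sleep(0.01))  # finishes within the grace
    slow = asyncio.create_task(asyncio.sleep(10))  # outlives the grace
    shutdown.inflight.update({quick, slow})
    shutdown.on_signal()
    await asyncio.gather(quick, slow, return_exceptions=True)
    return quick.cancelled(), slow.cancelled()
```

Running `asyncio.run(demo())` shows the quick task completing normally while the slow one is cancelled once the grace period elapses.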
async def run(
self,
template: WorkflowTemplate,
*,
repo_path: Path,
base_branch: str = "main",
requirements_md: str = "",
override: BindingOverride | None = None,
) -> RunResult:
run_id = uuid4()
worktree_root = self._config.workspace_root / str(run_id)
worktree_root.mkdir(parents=True, exist_ok=True)
artifacts_dir = worktree_root / "artifacts"
artifacts_dir.mkdir(parents=True, exist_ok=True)
bindings = bind_personas(template, self._personas, self._backends, self._consent, override)
await self._persist_run_skeleton(
None,
run_id,
template,
bindings,
repo_path,
base_branch,
worktree_root,
requirements_md,
)
await self._append_event(run_id, None, RunEventType.RUN_CREATED, {})
await self._append_event(run_id, None, RunEventType.RUN_STARTED, {})
await self._set_run_state(run_id, RunState.EXECUTING)
try:
for phase_def in template.phases:
role_binding = bindings[phase_def.role]
await self._run_phase(run_id, worktree_root, template, phase_def, role_binding)
await self._set_run_state(run_id, RunState.COMPLETED)
await self._append_event(run_id, None, RunEventType.RUN_COMPLETED, {})
report_path = await self._compose_final_report(
run_id, worktree_root, RunState.COMPLETED
)
return RunResult(run_id=run_id, state=RunState.COMPLETED, final_report_path=report_path)
except _PhaseAbortedError as e:
await self._set_run_state(run_id, RunState.ABORTED)
await self._append_event(run_id, None, RunEventType.RUN_ABORTED, {"reason": e.reason})
report_path = await self._compose_final_report(
run_id, worktree_root, RunState.ABORTED, error=e.reason
)
return RunResult(
run_id=run_id,
state=RunState.ABORTED,
final_report_path=report_path,
error=e.reason,
)
except MyDeepAgentError as e:
await self._set_run_state(run_id, RunState.FAILED)
await self._append_event(
run_id, None, RunEventType.RUN_FAILED, {"code": e.code, "message": str(e)}
)
report_path = await self._compose_final_report(
run_id, worktree_root, RunState.FAILED, error=str(e)
)
return RunResult(
run_id=run_id,
state=RunState.FAILED,
final_report_path=report_path,
error=str(e),
)
# ------------------------------------------------------------------
# Phase execution
# ------------------------------------------------------------------
async def _run_phase(
self,
run_id: UUID,
worktree_root: Path,
template: WorkflowTemplate,
phase_def: WorkflowPhase,
binding: Binding,
) -> None:
if self.shutdown_requested:
await self._append_event(run_id, None, RunEventType.RUN_PAUSED, {"reason": "shutdown"})
await self._set_run_state(run_id, RunState.PAUSED)
raise _PhaseAbortedError(reason="shutdown signal received")
phase_id = await self._ensure_phase_row(run_id, phase_def)
await self._set_phase_state(phase_id, RunPhaseState.RUNNING)
await self._append_event(
run_id, phase_id, RunEventType.PHASE_STARTED, {"phase_key": phase_def.key}
)
# Phases without an expected artifact complete immediately
if phase_def.expected_artifact is None:
await self._set_phase_state(phase_id, RunPhaseState.COMPLETED)
await self._append_event(run_id, phase_id, RunEventType.PHASE_COMPLETED, {})
return
expected_path = (worktree_root / phase_def.expected_artifact.path).resolve()
expected_path.parent.mkdir(parents=True, exist_ok=True)
# Repair loop: max 2 attempts
for attempt in range(1, 3):
validated = await self._run_agent_and_validate(
run_id, phase_id, worktree_root, phase_def, binding, expected_path, attempt
)
if validated:
break
# validated=False means: invalid/timeout + still have budget for retry
# on attempt 2, _run_agent_and_validate raises instead of returning False
await self._run_approval_gate(run_id, phase_id, phase_def, expected_path)
await self._set_phase_state(phase_id, RunPhaseState.COMPLETED)
await self._append_event(run_id, phase_id, RunEventType.PHASE_COMPLETED, {})
async def _run_agent_and_validate(
self,
run_id: UUID,
phase_id: UUID,
worktree_root: Path,
phase_def: WorkflowPhase,
binding: Binding,
expected_path: Path,
attempt: int,
) -> bool:
"""Invoke agent for one attempt and validate artifact. Returns True on success.
Returns False when attempt < 2 and artifact is missing/invalid (caller retries).
Raises MyDeepAgentError on final failure (attempt >= 2).
"""
written = await self._invoke_agent_until_artifact(
run_id, phase_id, worktree_root, phase_def, binding, expected_path, attempt=attempt
)
if not written:
await self._append_event(run_id, phase_id, RunEventType.ARTIFACT_TIMEOUT, {})
if attempt >= 2:
await self._set_phase_state(phase_id, RunPhaseState.FAILED)
await self._append_event(
run_id,
phase_id,
RunEventType.PHASE_FAILED,
{"reason": "artifact_timeout_exhausted"},
)
raise MyDeepAgentError.human_required(
"artifact_timeout_exhausted",
message=(
f"phase '{phase_def.key}' did not produce expected artifact "
f"after {attempt} attempts"
),
)
return False
# Validate the written artifact
await self._set_phase_state(phase_id, RunPhaseState.VALIDATING)
assert phase_def.expected_artifact is not None
schema_id = phase_def.expected_artifact.schema_id
try:
data = json.loads(expected_path.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError) as exc:
await self._append_event(
run_id,
phase_id,
RunEventType.ARTIFACT_INVALID,
{"errors": [{"message": str(exc)}]},
)
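if attempt >= 2:
# Mirror the other terminal branches: mark the phase failed before raising.
await self._set_phase_state(phase_id, RunPhaseState.FAILED)
await self._append_event(
run_id,
phase_id,
RunEventType.PHASE_FAILED,
{"reason": "artifact_invalid_after_repair"},
)
raise MyDeepAgentError.human_required(
"artifact_invalid_after_repair",
message=str(exc),
cause=exc,
) from exc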
await self._append_event(run_id, phase_id, RunEventType.PROMPT_REPAIRED, {})
return False
result = self._artifacts.validate(schema_id, data)
if result.ok:
await self._persist_artifact(run_id, phase_id, expected_path, schema_id, valid=True)
await self._append_event(run_id, phase_id, RunEventType.ARTIFACT_VALIDATED, {})
return True
error_payload = [{"path": f.path, "message": f.message} for f in result.errors[:5]]
await self._persist_artifact(
run_id,
phase_id,
expected_path,
schema_id,
valid=False,
errors=list(result.errors),
)
await self._append_event(
run_id, phase_id, RunEventType.ARTIFACT_INVALID, {"errors": error_payload}
)
if attempt >= 2:
await self._set_phase_state(phase_id, RunPhaseState.FAILED)
await self._append_event(
run_id,
phase_id,
RunEventType.PHASE_FAILED,
{"reason": "artifact_invalid_after_repair"},
)
raise MyDeepAgentError.human_required(
"artifact_invalid_after_repair",
message=f"phase '{phase_def.key}' artifact failed validation after repair",
)
await self._append_event(run_id, phase_id, RunEventType.PROMPT_REPAIRED, {})
return False
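
Review note: stripped of persistence and events, the validate-then-repair flow is "try, validate, try once more, then give up". A minimal sketch, assuming a required-keys check stands in for the jsonschema validation and `produce(attempt)` stands in for the agent invocation; all names here are hypothetical.

```python
import json
from collections.abc import Callable

MAX_ATTEMPTS = 2  # first try plus one repair, as in the phase loop


class ArtifactError(RuntimeError):
    """Raised when the artifact is still invalid after the repair attempt."""


def check(raw: str, required: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the artifact is valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["artifact must be a JSON object"]
    return [f"missing required field: {key}" for key in required if key not in data]


def run_with_repair(produce: Callable[[int], str], required: list[str]) -> dict:
    """Call produce(attempt) until its output validates or attempts run out."""
    errors: list[str] = []
    for attempt in range(1, MAX_ATTEMPTS + 1):
        raw = produce(attempt)
        errors = check(raw, required)
        if not errors:
            return json.loads(raw)
    raise ArtifactError("; ".join(errors))
```

Treating a parse failure and a schema failure as the same kind of "invalid" result keeps the retry policy in one place, which is also how the engine handles both cases.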
async def _run_approval_gate(
self,
run_id: UUID,
phase_id: UUID,
phase_def: WorkflowPhase,
expected_path: Path,
) -> None:
"""Run the approval gate if gates are configured. Raises on reject/abort."""
if not phase_def.gates:
return
await self._set_phase_state(phase_id, RunPhaseState.AWAITING_APPROVAL)
decision = await self._request_approval(run_id, phase_id, phase_def, expected_path)
if decision == ApprovalDecisionAction.ABORT:
raise _PhaseAbortedError(reason=f"aborted at phase {phase_def.key}")
if decision != ApprovalDecisionAction.APPROVE:
await self._set_phase_state(phase_id, RunPhaseState.FAILED)
await self._append_event(
run_id, phase_id, RunEventType.PHASE_FAILED, {"reason": decision.value}
)
raise MyDeepAgentError.human_required(
"approval_rejected",
message=f"phase '{phase_def.key}' approval was {decision.value}",
)
async def _invoke_agent_until_artifact(
self,
run_id: UUID,
phase_id: UUID,
worktree_root: Path,
phase_def: WorkflowPhase,
binding: Binding,
expected_path: Path,
attempt: int,
) -> bool:
"""Build agent + invoke + return True if expected_path was written, False on timeout."""
written_paths: list[str] = []
async def _on_written(path: str, _content: str) -> None:
written_paths.append(path)
watcher = ArtifactWatcherMiddleware(expected_path, _on_written)
cost_mw = CostMiddleware(
pricing=self._pricing,
model_name=binding.persona.model,
run_id=run_id,
phase_id=phase_id,
persona_name=binding.persona.name,
budget_tracker=self._budget,
recorder=self._record_llm_call,
)
audit_mw = AuditToolMiddleware(
run_id=run_id,
phase_id=phase_id,
file_recorder=make_audit_recorder(self._config.state_dir),
)
agent = build_agent(
binding.persona,
self._config,
root_dir=worktree_root,
middleware=[watcher, cost_mw, audit_mw],
)
envelope = self._build_envelope(run_id, phase_id, phase_def, attempt, expected_path)
await self._append_event(
run_id, phase_id, RunEventType.ARTIFACT_EXPECTED, {"path": str(expected_path)}
)
event_type = RunEventType.PROMPT_REPAIRED if attempt > 1 else RunEventType.PROMPT_SENT
await self._append_event(run_id, phase_id, event_type, {"attempt": attempt})
timeout = float(phase_def.timeout_seconds or _DEFAULT_PHASE_TIMEOUT_SECONDS)
try:
invoke_task: asyncio.Task[Any] = asyncio.create_task(
agent.ainvoke({"messages": [{"role": "user", "content": envelope}]})
)
self._inflight_tasks.add(invoke_task)
try:
await asyncio.wait_for(asyncio.shield(invoke_task), timeout=timeout)
except TimeoutError:
pass
finally:
self._inflight_tasks.discard(invoke_task)
except asyncio.CancelledError:
pass
return expected_path.is_file()
def _build_envelope(
self,
run_id: UUID,
phase_id: UUID,
phase_def: WorkflowPhase,
attempt: int,
expected_path: Path,
) -> str:
artifact = phase_def.expected_artifact
assert artifact is not None
try:
schema_def = self._artifacts.load(artifact.schema_id)
schema_inline = json.dumps(schema_def, indent=2, ensure_ascii=False)
except (MyDeepAgentError, AttributeError):
# AttributeError covers test scaffolding that instantiates the engine
# via __new__ without wiring _artifacts; production paths always have it.
schema_inline = "(schema not available)"
repair_note = (
"\n\n[REPAIR ATTEMPT]\n"
"Your previous artifact did not validate against the JSON Schema below. "
"Re-read the schema carefully and emit a corrected JSON object that satisfies "
"every `required` field and respects all `enum`, `type`, `minLength`, and "
"`additionalProperties: false` constraints."
if attempt > 1
else ""
)
return (
f"MYDEEPAGENT_PROMPT_BEGIN {phase_id}\n"
f"Run: {run_id}\n"
f"Phase: {phase_def.key}\n"
f"Attempt: {attempt}\n"
f"Expected artifact path: {expected_path}\n"
f"Expected schema id: {artifact.schema_id}\n"
f"\n"
f"JSON Schema 2020-12 for this artifact (you MUST satisfy it exactly):\n"
f"```json\n{schema_inline}\n```\n"
f"\n"
f"Use the `write_file` tool to write a JSON object that matches the schema "
f"to the exact path `{expected_path}`. The file must parse as valid JSON.\n"
f"\n"
f"Instructions:\n"
f"{phase_def.instructions}"
f"{repair_note}\n"
f"MYDEEPAGENT_PROMPT_END {phase_id}"
)
# ------------------------------------------------------------------
# Approval gate
# ------------------------------------------------------------------
async def _request_approval(
self,
run_id: UUID,
phase_id: UUID,
phase_def: WorkflowPhase,
artifact_path: Path,
) -> ApprovalDecisionAction:
request_id = uuid4()
idem_key = f"{phase_def.key}:{artifact_path.name}"
payload: dict[str, Any] = {
"phase_key": phase_def.key,
"artifact_path": str(artifact_path),
"gates": list(phase_def.gates),
}
async with self._db.session() as s:
s.add(
ApprovalRequestRow(
id=str(request_id),
run_id=str(run_id),
phase_id=str(phase_id),
gate_key=phase_def.gates[0] if phase_def.gates else "default",
state=ApprovalState.PENDING.value,
idempotency_key=idem_key,
payload=payload,
created_at=_now_iso(),
)
)
await self._append_event(
run_id,
phase_id,
RunEventType.APPROVAL_REQUESTED,
{"request_id": str(request_id)},
)
decision: ApprovalDecisionAction = await self._approval(payload, list(phase_def.gates))
async with self._db.session() as s:
s.add(
ApprovalDecisionRow(
id=str(uuid4()),
approval_request_id=str(request_id),
action=decision.value,
decided_at=_now_iso(),
idempotency_key=f"{idem_key}:{decision.value}",
)
)
await self._append_event(
run_id, phase_id, RunEventType.APPROVAL_RESOLVED, {"action": decision.value}
)
return decision
# ------------------------------------------------------------------
# Final report
# ------------------------------------------------------------------
async def _compose_final_report(
self,
run_id: UUID,
worktree_root: Path,
status: RunState,
error: str | None = None,
) -> Path:
worktree_root.mkdir(parents=True, exist_ok=True)
async with self._db.session() as s:
run = await s.get(RunRow, str(run_id))
phase_rows = list(
(await s.execute(select(RunPhaseRow).where(RunPhaseRow.run_id == str(run_id))))
.scalars()
.all()
)
artifact_rows = list(
(await s.execute(select(ArtifactRow).where(ArtifactRow.run_id == str(run_id))))
.scalars()
.all()
)
event_rows = list(
(
await s.execute(
select(RunEventRow)
.where(RunEventRow.run_id == str(run_id))
.order_by(RunEventRow.seq.desc())
.limit(20)
)
)
.scalars()
.all()
)
report: dict[str, Any] = {
"runId": str(run_id),
"templateHash": run.template_hash if run else "",
"status": status.value,
"phases": [
{
"key": p.phase_key,
"state": p.state,
"started_at": p.started_at,
"ended_at": p.ended_at,
"attempts": p.attempts,
}
for p in phase_rows
],
"artifacts": [
{"path": a.path, "schema": a.schema_id, "hash": a.hash} for a in artifact_rows
],
"events": [{"seq": e.seq, "type": e.type, "ts": e.ts} for e in reversed(event_rows)],
"unresolved": [],
"endedAt": _now_iso(),
"error": error,
}
json_path = worktree_root / f"{run_id}.report.json"
md_path = worktree_root / f"{run_id}.report.md"
json_path.write_text(json.dumps(report, indent=2, ensure_ascii=False), encoding="utf-8")
md_path.write_text(_render_report_md(report), encoding="utf-8")
return json_path
# ------------------------------------------------------------------
# Persistence helpers
# ------------------------------------------------------------------
async def _record_llm_call(self, record: dict[str, Any]) -> None:
"""CostMiddleware recorder: persist one LlmCallRow per model call.
Fills every NOT NULL column of LlmCallRow. Per-input/output cost is computed
from the same PricingCache that the middleware already consulted, so the
ledger and the row stay consistent.
"""
in_tokens = int(record.get("input_tokens") or 0)
out_tokens = int(record.get("output_tokens") or 0)
model = str(record.get("model") or "")
# Reproduce per-direction cost from the cached price.
price = self._pricing.get(model) if self._pricing is not None else None
if price is not None:
cost_input = (in_tokens / 1000.0) * price.input_per_1k_usd
cost_output = (out_tokens / 1000.0) * price.output_per_1k_usd
else:
cost_input = 0.0
cost_output = 0.0
cost_total = float(record.get("cost_usd_total") or (cost_input + cost_output))
run_id_val = record.get("run_id")
phase_id_val = record.get("phase_id")
session_id_val = record.get("interactive_session_id")
thread_id = (
f"run:{run_id_val}:phase:{phase_id_val}"
if run_id_val is not None
else f"session:{session_id_val}"
)
persona_name = str(record.get("persona_name") or "")
async with self._db.session() as s:
s.add(
LlmCallRow(
run_id=(str(run_id_val) if run_id_val is not None else None),
phase_id=(str(phase_id_val) if phase_id_val is not None else None),
interactive_session_id=(
str(session_id_val) if session_id_val is not None else None
),
thread_id=thread_id,
persona_name=persona_name,
persona_version=1,
model=model,
role="main",
turn_index=0,
input_tokens=in_tokens,
output_tokens=out_tokens,
cached_tokens=0,
reasoning_tokens=0,
cost_usd_input=cost_input,
cost_usd_output=cost_output,
cost_usd_total=cost_total,
latency_ms=int(record.get("latency_ms") or 0),
status=str(record.get("status") or "ok"),
error_code=record.get("error_code"),
request_id=None,
ts=_now_iso(),
)
)
try:
await s.commit()
except Exception:
await s.rollback()
async def _persist_run_skeleton(
self,
_unused_session: Any, # kept for caller compatibility — we open own sessions
run_id: UUID,
template: WorkflowTemplate,
bindings: dict[str, Binding],
repo_path: Path,
base_branch: str,
worktree_root: Path,
requirements_md: str,
) -> None:
template_hash = template.compute_hash()
now = _now_iso()
# --- Phase 1: upsert FK targets (committed separately to satisfy FK ordering) ---
template_id = uuid4()
async with self._db.session() as s:
existing_tpl = (
await s.execute(
select(WorkflowTemplateRow).where(WorkflowTemplateRow.hash == template_hash)
)
).scalar_one_or_none()
if existing_tpl is None:
s.add(
WorkflowTemplateRow(
id=str(template_id),
name=template.name,
version=template.version,
hash=template_hash,
definition=template.model_dump(by_alias=True),
created_at=now,
)
)
else:
template_id = UUID(existing_tpl.id)
persona_ids: dict[str, UUID] = {}
for role_id, binding in bindings.items():
persona_hash = binding.persona.compute_hash()
async with self._db.session() as s:
existing_persona = (
await s.execute(
select(AgentPersonaRow).where(AgentPersonaRow.hash == persona_hash)
)
).scalar_one_or_none()
if existing_persona is None:
persona_id = uuid4()
s.add(
AgentPersonaRow(
id=str(persona_id),
name=binding.persona.name,
version=binding.persona.version,
hash=persona_hash,
definition=binding.persona.model_dump(),
created_at=now,
)
)
else:
persona_id = UUID(existing_persona.id)
persona_ids[role_id] = persona_id
# --- Phase 2: insert RunRow (FK: workflow_templates — already committed above) ---
async with self._db.session() as s:
s.add(
RunRow(
id=str(run_id),
template_id=str(template_id),
template_hash=template_hash,
state=RunState.CREATED.value,
repo_path=str(repo_path),
base_branch=base_branch,
worktree_root=str(worktree_root),
created_at=now,
updated_at=now,
)
)
# --- Phase 3: insert RunInputRow + RunBindingRow (FK: runs — now committed) ---
async with self._db.session() as s:
s.add(
RunInputRow(
id=str(uuid4()),
run_id=str(run_id),
requirements_md=requirements_md,
objective={},
extra={},
input_hash=sha256(
{"requirements": requirements_md, "template_hash": template_hash}
),
)
)
for role_id, binding in bindings.items():
persona_hash = binding.persona.compute_hash()
s.add(
RunBindingRow(
id=str(uuid4()),
run_id=str(run_id),
role_id=role_id,
persona_id=str(persona_ids[role_id]),
persona_hash=persona_hash,
backend=binding.persona.backend.value,
binding_hash=binding.binding_hash,
)
)
async def _ensure_phase_row(self, run_id: UUID, phase_def: WorkflowPhase) -> UUID:
async with self._db.session() as s:
existing = (
await s.execute(
select(RunPhaseRow).where(
RunPhaseRow.run_id == str(run_id),
RunPhaseRow.phase_key == phase_def.key,
)
)
).scalar_one_or_none()
if existing is not None:
return UUID(existing.id)
phase_id = uuid4()
existing_count = len(
(
await s.execute(select(RunPhaseRow).where(RunPhaseRow.run_id == str(run_id)))
).all()
)
s.add(
RunPhaseRow(
id=str(phase_id),
run_id=str(run_id),
phase_key=phase_def.key,
seq=existing_count,
state=RunPhaseState.PENDING.value,
attempts=0,
started_at=_now_iso(),
)
)
return phase_id
async def _set_phase_state(self, phase_id: UUID, state: RunPhaseState) -> None:
async with self._db.session() as s:
row = await s.get(RunPhaseRow, str(phase_id))
if row is not None:
row.state = state.value
if state in (
RunPhaseState.COMPLETED,
RunPhaseState.FAILED,
RunPhaseState.SKIPPED,
):
row.ended_at = _now_iso()
async def _set_run_state(self, run_id: UUID, state: RunState) -> None:
async with self._db.session() as s:
row = await s.get(RunRow, str(run_id))
if row is not None:
row.state = state.value
row.updated_at = _now_iso()
if state in (RunState.COMPLETED, RunState.FAILED, RunState.ABORTED):
row.ended_at = _now_iso()
async def _append_event(
self,
run_id: UUID,
phase_id: UUID | None,
event_type: RunEventType,
payload: dict[str, Any],
) -> None:
idem_extra = {
k: str(v)
for k, v in payload.items()
if k in ("phase_key", "attempt", "request_id", "action", "code")
}
idem = run_idempotency_key(event_type, run_id, **idem_extra)
async with self._db.session() as s:
existing_count = len(
(
await s.execute(select(RunEventRow).where(RunEventRow.run_id == str(run_id)))
).all()
)
s.add(
RunEventRow(
run_id=str(run_id),
phase_id=str(phase_id) if phase_id is not None else None,
seq=existing_count + 1,
type=event_type.value,
payload=payload,
idempotency_key=idem,
ts=_now_iso(),
)
)
try:
await s.flush()
except Exception:
await s.rollback()
async def _persist_artifact(
self,
run_id: UUID,
phase_id: UUID,
path: Path,
schema_id: str,
*,
valid: bool,
errors: list[Any] | None = None,
) -> None:
try:
content = path.read_bytes()
except OSError:
return
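# NOTE: this fingerprint covers only the byte length and the first 64 bytes,
# not the full content, so files that differ later on can share a hash.
artifact_hash = sha256({"bytes_len": len(content), "hex_prefix": content[:64].hex()})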
async with self._db.session() as s:
s.add(
ArtifactRow(
id=str(uuid4()),
run_id=str(run_id),
phase_id=str(phase_id),
path=str(path),
schema_id=schema_id,
hash=artifact_hash,
valid=valid,
validation_error=(
[{"path": f.path, "message": f.message} for f in errors] if errors else None
),
created_at=_now_iso(),
)
)
try:
await s.flush()
except Exception:
await s.rollback()
# ------------------------------------------------------------------
# Module-level helpers
# ------------------------------------------------------------------
def _now_iso() -> str:
return datetime.now(UTC).isoformat(timespec="seconds")
def _render_report_md(report: dict[str, Any]) -> str:
lines: list[str] = [
f"# Run {report['runId']}",
f"**Status**: {report['status']}",
f"**Template hash**: `{report['templateHash']}`",
f"**Ended at**: {report['endedAt']}",
"",
"## Phases",
]
for p in report["phases"]:
lines.append(f"- **{p['key']}** — state={p['state']}, attempts={p['attempts']}")
lines.append("\n## Artifacts")
for a in report["artifacts"]:
lines.append(f"- `{a['path']}` (schema={a['schema']}, hash={a['hash'][:16]}...)")
if report.get("error"):
lines += ["", "## Error", str(report["error"])]
return "\n".join(lines) + "\n"
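
The timeout handling in `_invoke_agent_until_artifact` combines `asyncio.wait_for` with `asyncio.shield`. The following is a minimal, standalone sketch of that pattern, not the engine's code: `slow_job` and `wait_with_shield` are hypothetical names standing in for `agent.ainvoke(...)` and the surrounding method. It illustrates that the timeout cancels only the outer wait, while the shielded task keeps running.

```python
import asyncio


async def slow_job(log: list[str]) -> str:
    """Hypothetical stand-in for agent.ainvoke(); outlives the caller's timeout."""
    await asyncio.sleep(0.2)
    log.append("finished")
    return "ok"


async def wait_with_shield(timeout: float) -> tuple[bool, bool, str]:
    log: list[str] = []
    task = asyncio.create_task(slow_job(log))
    timed_out = False
    try:
        # shield() makes the timeout cancel only the outer wait, not `task`.
        await asyncio.wait_for(asyncio.shield(task), timeout=timeout)
    except asyncio.TimeoutError:
        timed_out = True
    still_running = not task.done()
    # The inner task was never cancelled, so it can still be awaited.
    result = await task
    return timed_out, still_running, result


if __name__ == "__main__":
    print(asyncio.run(wait_with_shield(0.01)))
```

One consequence worth keeping in mind: after a timeout the shielded task is still alive, so whoever drops the last reference to it (here, the caller) decides whether it is awaited, cancelled, or left to finish in the background.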

@@ -0,0 +1,41 @@
"""Governance consent for sending user code to external LLM providers."""
from __future__ import annotations
import json
import os
from datetime import UTC, datetime
from pathlib import Path
from .errors import MyDeepAgentError
def consent_path(data_dir: Path) -> Path:
return data_dir / "governance-accepted.json"
def has_consent(data_dir: Path) -> bool:
return consent_path(data_dir).is_file()
def record_consent(data_dir: Path) -> None:
data_dir.mkdir(parents=True, exist_ok=True)
target = consent_path(data_dir)
payload = {"accepted_at": datetime.now(UTC).isoformat(timespec="seconds")}
tmp = target.with_suffix(target.suffix + ".tmp")
fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
try:
os.write(fd, json.dumps(payload, indent=2).encode("utf-8"))
os.fsync(fd)
finally:
os.close(fd)
os.replace(tmp, target)
def require_consent(data_dir: Path) -> None:
if not has_consent(data_dir):
raise MyDeepAgentError.human_required(
"governance_not_accepted",
message="governance consent not recorded",
recovery_hint="run `mydeepagent init` and accept the data-governance prompt",
)
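
`record_consent` writes the consent file through a temporary file, `fsync`, and `os.replace`, so a crash mid-write should not leave a truncated `governance-accepted.json`. The sketch below isolates that write-then-rename pattern under the same assumptions; `write_json_atomically` and `demo` are illustrative names, and the timestamp is a made-up value.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomically(target: Path, payload: dict[str, str]) -> None:
    """Write payload to a sibling .tmp file, flush it to disk, then rename."""
    tmp = target.with_suffix(target.suffix + ".tmp")
    # 0o600: readable and writable by the owner only (ignored on Windows).
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, json.dumps(payload, indent=2).encode("utf-8"))
        os.fsync(fd)
    finally:
        os.close(fd)
    # os.replace swaps the file in one step, overwriting any existing target.
    os.replace(tmp, target)


def demo() -> tuple[bool, dict[str, str], bool]:
    with tempfile.TemporaryDirectory() as d:
        target = Path(d) / "governance-accepted.json"
        existed_before = target.is_file()
        write_json_atomically(target, {"accepted_at": "2025-01-01T00:00:00+00:00"})
        loaded = json.loads(target.read_text(encoding="utf-8"))
        leftover_tmp = (Path(d) / "governance-accepted.json.tmp").is_file()
        return existed_before, loaded, leftover_tmp
```

Because `has_consent` only checks `is_file()`, the rename is the single moment at which consent becomes visible to other code paths.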

@@ -0,0 +1,45 @@
"""Lightweight i18n catalog loader. Two languages (ko, en). Default ko per CTO decision."""
from __future__ import annotations
import os
import tomllib
from functools import lru_cache
from pathlib import Path
from typing import Literal
Lang = Literal["ko", "en"]
_CATALOG_DIR = Path(__file__).parent
@lru_cache(maxsize=4)
def _load(lang: Lang) -> dict[str, dict[str, str]]:
path = _CATALOG_DIR / f"{lang}.toml"
if not path.is_file():
return {}
with path.open("rb") as f:
data = tomllib.load(f)
return {section: dict(entries) for section, entries in data.items()}
def resolve_lang(default: Lang = "ko") -> Lang:
env = os.environ.get("MYDEEPAGENT_LANG")
if env in ("ko", "en"):
return env # type: ignore[return-value]
return default
def t(key: str, lang: Lang | None = None, **fmt: object) -> str:
"""Translate a key like 'section.key'. Falls back to the key itself if missing."""
actual_lang = lang or resolve_lang()
section_name, _, leaf = key.partition(".")
catalog = _load(actual_lang)
section = catalog.get(section_name, {})
template = section.get(leaf, key)
if fmt:
try:
return template.format(**fmt)
except (KeyError, IndexError):
return template
return template
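
To make the fallback rules of `t()` concrete, here is a small self-contained sketch that mirrors its lookup logic against an in-memory catalog instead of the TOML files. `CATALOG` and `translate` are hypothetical names used only for this illustration.

```python
# In-memory stand-in for a loaded catalog (normally read from <lang>.toml).
CATALOG: dict[str, dict[str, str]] = {
    "login": {"saved": "{provider} key saved to OS keyring."},
}


def translate(key: str, **fmt: object) -> str:
    # "login.saved" splits into section "login" and leaf "saved".
    section_name, _, leaf = key.partition(".")
    # A missing section or leaf falls back to the key itself.
    template = CATALOG.get(section_name, {}).get(leaf, key)
    if fmt:
        try:
            return template.format(**fmt)
        except (KeyError, IndexError):
            # Wrong or missing placeholders: return the raw template.
            return template
    return template
```

With this catalog, a known key is formatted, an unknown key comes back unchanged, and a call with the wrong placeholder name returns the unformatted template rather than raising.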

@@ -0,0 +1,34 @@
[init]
welcome = "Welcome — my-deepagent first-time setup"
governance_title = "Consent to send code to external LLM providers"
governance_body = "This tool sends file contents read via read_file and similar tools to external LLM providers (Anthropic, DeepSeek, etc.) through OpenRouter. Each persona declares its provider_origin, and a separate confirmation is shown on first use."
governance_prompt = "Type 'yes' to agree (any other answer cancels): "
governance_declined = "Cannot proceed without consent. Exiting."
api_key_prompt = "OpenRouter API key (input is hidden)"
api_key_empty = "API key was empty — nothing saved."
api_key_saved = "Saved to OS keyring."
doctor_running = "Running environment diagnostics..."
done = "Setup complete. Start with `mydeepagent run <workflow.yaml>` or `mydeepagent`."
[login]
prompt = "Enter {provider} API key (hidden): "
saved = "{provider} key saved to OS keyring."
empty = "Empty input. Nothing saved."
[logout]
removed = "{provider} key removed from keyring."
not_found = "{provider} key not found in keyring (already deleted)."
[keys]
header = "Registered API keys:"
entry = " {provider:20s} {masked}"
none = " (none. Use `mydeepagent login <provider>` to register one.)"
[doctor]
header = "Environment diagnostics:"
ok = " ok {name}"
warn = " warn {name} ({detail})"
fail = " FAIL {name} ({detail})"
[errors]
no_governance = "Governance consent is missing. Run `mydeepagent init` first."

@@ -0,0 +1,34 @@
[init]
welcome = "환영합니다 — my-deepagent 첫 셋업"
governance_title = "외부 LLM provider로 코드 전송 동의"
governance_body = "이 도구는 read_file 등으로 읽은 파일 내용을 OpenRouter를 통해 외부 LLM provider(Anthropic, DeepSeek 등)로 전송합니다. 페르소나마다 provider_origin이 명시되며 첫 사용 시 별도 확인이 다시 한 번 표시됩니다."
governance_prompt = "동의하시면 'yes' 입력 (그 외 모든 답은 취소): "
governance_declined = "동의 없이는 사용할 수 없습니다. 종료합니다."
api_key_prompt = "OpenRouter API key (입력은 가려집니다)"
api_key_empty = "API key가 비어있어 저장하지 않았습니다."
api_key_saved = "OS keyring에 저장되었습니다."
doctor_running = "환경 진단 실행 중..."
done = "셋업 완료. `mydeepagent run <workflow.yaml>` 또는 `mydeepagent` 로 시작하세요."
[login]
prompt = "{provider} API key 입력 (가려짐): "
saved = "{provider} key가 OS keyring에 저장되었습니다."
empty = "빈 입력입니다. 저장하지 않았습니다."
[logout]
removed = "{provider} key가 keyring에서 삭제되었습니다."
not_found = "{provider} key가 keyring에 없습니다 (이미 삭제됨)."
[keys]
header = "등록된 API key:"
entry = " {provider:20s} {masked}"
none = " (없음. `mydeepagent login <provider>` 로 등록하세요.)"
[doctor]
header = "환경 진단:"
ok = " ok {name}"
warn = " warn {name} ({detail})"
fail = " FAIL {name} ({detail})"
[errors]
no_governance = "거버넌스 동의가 없습니다. `mydeepagent init` 를 먼저 실행하세요."

@@ -0,0 +1,48 @@
"""OS keyring wrapper for storing provider API keys. Service name: 'my-deepagent'."""
from __future__ import annotations
from typing import Final
import keyring as keyring
_SERVICE: Final[str] = "my-deepagent"
def _make_username(provider: str) -> str:
return f"{provider}_api_key"
def get_api_key(provider: str) -> str | None:
"""Return the stored key for ``provider``, or None if absent."""
return keyring.get_password(_SERVICE, _make_username(provider))
def set_api_key(provider: str, value: str) -> None:
"""Persist ``value`` in the OS keyring under provider's slot."""
keyring.set_password(_SERVICE, _make_username(provider), value)
def delete_api_key(provider: str) -> bool:
"""Remove the stored key. Returns True if a key existed and was removed."""
if keyring.get_password(_SERVICE, _make_username(provider)) is None:
return False
keyring.delete_password(_SERVICE, _make_username(provider))
return True
def list_providers() -> list[str]:
"""Return the providers we recognise (we don't enumerate keyring contents).
Callers iterate this list and call get_api_key for each to detect presence.
"""
return ["openrouter", "anthropic", "openai", "google", "langsmith"]
def mask(value: str | None) -> str:
"""Mask an API key for display: 'sk-or-v1...c2e7', '***' if 8 characters or fewer, or '(not set)' if None or empty."""
if not value:
return "(not set)"
if len(value) <= 8:
return "***"
return f"{value[:8]}...{value[-4:]}"
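
The masking rule can be checked in isolation. The snippet below repeats the logic of `mask` as a standalone function (the sample keys are made up) to show exactly which characters survive.

```python
from typing import Optional


def mask(value: Optional[str]) -> str:
    """Standalone copy of the masking rule, for illustration only."""
    if not value:
        return "(not set)"
    if len(value) <= 8:
        return "***"
    # Keep the first 8 and the last 4 characters; hide everything in between.
    return f"{value[:8]}...{value[-4:]}"


if __name__ == "__main__":
    print(mask("sk-or-v1-0123456789abcdefc2e7"))
    print(mask("short"))
    print(mask(None))
```

Note that for values of 9 to 11 characters the kept prefix and suffix overlap, so every character is shown. Real provider keys are much longer, but a stricter rule might be worth considering if short secrets are ever stored.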

@@ -0,0 +1,88 @@
"""structlog configuration with built-in secret scrubbing.
Scrubs known API key patterns and bearer tokens from all log output (both rich
pretty-printed and JSON). Call ``configure_logging()`` once at process
start (called from CLI entry points).
"""
from __future__ import annotations
import logging
import re
import sys
from typing import Any
import structlog
# Secret patterns. Order matters: more specific first.
_SECRET_PATTERNS: tuple[re.Pattern[str], ...] = tuple(
re.compile(p)
for p in (
r"sk-or-[A-Za-z0-9_-]{20,}", # OpenRouter
r"sk-ant-[A-Za-z0-9_-]{20,}", # Anthropic
r"sk-proj-[A-Za-z0-9_-]{20,}", # OpenAI project keys
r"sk-[A-Za-z0-9_-]{30,}", # OpenAI (general)
r"lsv2_pt_[A-Za-z0-9_-]{20,}", # LangSmith personal token
r"lsv2_[A-Za-z0-9_-]{30,}", # LangSmith (other)
r"Bearer\s+[A-Za-z0-9._-]{20,}", # generic bearer
r"ghp_[A-Za-z0-9]{30,}", # GitHub PAT
r"glpat-[A-Za-z0-9-]{20,}", # GitLab PAT
)
)
_REDACTED = "[REDACTED]"
def scrub(text: str) -> str:
"""Replace secrets in ``text`` with ``[REDACTED]``."""
for pat in _SECRET_PATTERNS:
text = pat.sub(_REDACTED, text)
return text
def scrub_value(value: Any) -> Any:
"""Recursively scrub strings inside dicts/lists/tuples/sets. Dict keys and non-strings pass through."""
if isinstance(value, str):
return scrub(value)
if isinstance(value, dict):
return {k: scrub_value(v) for k, v in value.items()}
if isinstance(value, list):
return [scrub_value(v) for v in value]
if isinstance(value, tuple):
return tuple(scrub_value(v) for v in value)
if isinstance(value, set):
return {scrub_value(v) for v in value}
return value
def _scrub_processor(_logger: Any, _method: str, event_dict: dict[str, Any]) -> dict[str, Any]:
"""structlog processor: scrub every value in the event dict."""
return {k: scrub_value(v) for k, v in event_dict.items()}
def configure_logging(level: str = "info", json_output: bool = False) -> None:
"""Configure structlog with secret-scrubbing on top of the chosen renderer."""
log_level = getattr(logging, level.upper(), logging.INFO)
logging.basicConfig(level=log_level, format="%(message)s", stream=sys.stderr)
processors: list[Any] = [
structlog.contextvars.merge_contextvars,
structlog.processors.add_log_level,
structlog.processors.TimeStamper(fmt="iso", utc=True),
_scrub_processor,
]
if json_output:
processors.append(structlog.processors.JSONRenderer())
else:
processors.append(structlog.dev.ConsoleRenderer(colors=True))
structlog.configure(
processors=processors,
wrapper_class=structlog.make_filtering_bound_logger(log_level),
logger_factory=structlog.PrintLoggerFactory(file=sys.stderr),
cache_logger_on_first_use=True,
)
def get_logger(name: str | None = None) -> Any:
return structlog.get_logger(name) if name else structlog.get_logger()
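
A quick way to see the scrubber's effect without configuring structlog is to run the recursive walk on a sample event. This sketch uses only two of the patterns above and fabricated key-shaped strings; it is an illustration of the approach, not the module itself.

```python
import re
from typing import Any

# Subset of the patterns above, enough to demonstrate the behaviour.
_PATTERNS = tuple(
    re.compile(p)
    for p in (
        r"sk-or-[A-Za-z0-9_-]{20,}",
        r"Bearer\s+[A-Za-z0-9._-]{20,}",
    )
)
_REDACTED = "[REDACTED]"


def scrub(text: str) -> str:
    for pat in _PATTERNS:
        text = pat.sub(_REDACTED, text)
    return text


def scrub_value(value: Any) -> Any:
    """Walk nested containers and scrub every string found in them."""
    if isinstance(value, str):
        return scrub(value)
    if isinstance(value, dict):
        return {k: scrub_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub_value(v) for v in value]
    return value


# Fabricated values shaped like real secrets.
FAKE_KEY = "sk-or-" + "a" * 24
EVENT = {"event": f"calling with key={FAKE_KEY}", "headers": ["Bearer " + "b" * 24], "n": 3}
CLEAN = scrub_value(EVENT)
```

Strings shorter than the minimum length in a pattern (for example `sk-or-abc`) are left alone, which keeps ordinary log text from being redacted by accident.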

@@ -0,0 +1,115 @@
"""ArtifactWatcherMiddleware: detect write_file / edit_file calls targeting expected artifact."""
from __future__ import annotations
import asyncio
from collections.abc import Awaitable, Callable
from pathlib import Path
from typing import Any
from langchain.agents.middleware import AgentMiddleware, ToolCallRequest
from langchain_core.messages import ToolMessage
# Async callback fired when write_file/edit_file targets the expected path.
# Args: (absolute_path_str, content_str)
ArtifactWriteCallback = Callable[[str, str], Awaitable[None]]
# Tool names that count as "write the artifact"
_WRITE_TOOL_NAMES: frozenset[str] = frozenset({"write_file", "edit_file"})
# Candidate argument key names for the file path, in priority order
_PATH_ARG_KEYS: tuple[str, ...] = ("file_path", "path", "file")
# Candidate argument key names for the file content
_CONTENT_ARG_KEYS: tuple[str, ...] = ("content", "text", "new_string")
class ArtifactWatcherMiddleware(AgentMiddleware[Any, None, Any]):
"""Intercepts write_file / edit_file tool calls and fires a callback when the
targeted path matches *expected_path* (after resolution to an absolute path).
The middleware never suppresses or modifies the tool call; it always forwards
to ``handler``. The callback runs *after* ``handler`` returns, so it is skipped when
the tool call raises; any exception raised inside the callback is caught and
silently discarded so it cannot break the agent loop.
"""
def __init__(
self,
expected_path: Path,
on_artifact_written: ArtifactWriteCallback,
) -> None:
super().__init__()
self._expected = expected_path.resolve()
self._callback = on_artifact_written
self._notified = asyncio.Event()
self._content: str | None = None
# ------------------------------------------------------------------
# Public helpers
# ------------------------------------------------------------------
@property
def notified(self) -> asyncio.Event:
"""Set once the expected artifact has been written."""
return self._notified
@property
def content(self) -> str | None:
"""Content string passed to the write/edit tool, or None if not yet written."""
return self._content
# ------------------------------------------------------------------
# AgentMiddleware interface
# ------------------------------------------------------------------
async def awrap_tool_call(
self,
request: ToolCallRequest,
handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Any]],
) -> ToolMessage | Any:
result = await handler(request)
tool_call = request.tool_call # ToolCall TypedDict: {"name": str, "args": dict, "id": ...}
name: str = tool_call["name"]
if name in _WRITE_TOOL_NAMES:
args: dict[str, Any] = dict(tool_call["args"] or {})
path_str = self._extract_path(args)
if path_str:
resolved = self._resolve_path(path_str)
if resolved == self._expected:
content = self._extract_content(args)
self._content = content
self._notified.set()
try:
await self._callback(str(resolved), content)
except Exception: # noqa: S110
pass # callback must not break agent loop
return result
# ------------------------------------------------------------------
# Private helpers
# ------------------------------------------------------------------
def _resolve_path(self, path_str: str) -> Path:
"""Resolve a possibly-relative path to absolute using expected_path's parent as base."""
p = Path(path_str)
if p.is_absolute():
return p.resolve()
# Relative paths are anchored to the expected artifact's directory
return (self._expected.parent / p).resolve()
@staticmethod
def _extract_path(args: dict[str, Any]) -> str:
for key in _PATH_ARG_KEYS:
val = args.get(key)
if isinstance(val, str) and val:
return val
return ""
@staticmethod
def _extract_content(args: dict[str, Any]) -> str:
for key in _CONTENT_ARG_KEYS:
val = args.get(key)
if isinstance(val, str):
return val
return ""
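
The path comparison is the part of this middleware most likely to surprise, because relative tool arguments are anchored to the artifact's directory rather than the process working directory. A standalone sketch of that matching rule follows; `resolve_tool_path`, `targets_expected`, and `demo` are hypothetical helper names, and the temporary directory stands in for a worktree.

```python
import tempfile
from pathlib import Path


def resolve_tool_path(path_str: str, expected: Path) -> Path:
    """Resolve a tool argument the same way the watcher does."""
    p = Path(path_str)
    if p.is_absolute():
        return p.resolve()
    # Relative paths are anchored to the expected artifact's directory.
    return (expected.parent / p).resolve()


def targets_expected(path_str: str, expected: Path) -> bool:
    expected = expected.resolve()
    return resolve_tool_path(path_str, expected) == expected


def demo() -> list[bool]:
    with tempfile.TemporaryDirectory() as d:
        artifacts = Path(d) / "artifacts"
        artifacts.mkdir()
        expected = (artifacts / "plan.json").resolve()
        return [
            targets_expected("plan.json", expected),               # bare file name
            targets_expected("../artifacts/plan.json", expected),  # with ".." segments
            targets_expected(str(expected), expected),             # absolute path
            targets_expected("notes.json", expected),              # different file
        ]
```

If the agent's file tools resolve relative paths against a different base (for example the worktree root), a relative write could match here while landing elsewhere on disk, or the reverse; that is an assumption to verify against the tool implementation.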

@@ -1,66 +1,70 @@
-"""AuditToolMiddleware: capture every tool call for audit log + DB.
-Records: name, args, result/error, duration.
-"""
+"""AuditToolMiddleware: capture every tool call to audit.jsonl + tool_calls DB row."""
from __future__ import annotations
import time
from collections.abc import Awaitable, Callable
from typing import Any
from uuid import UUID
from langchain.agents.middleware import AgentMiddleware
AuditRecorder = Callable[[dict[str, Any]], Awaitable[None]]
class AuditToolMiddleware(AgentMiddleware):
-"""Record every tool invocation for the audit log and DB sink (Step 8)."""
+"""Record every tool invocation for the audit log and DB sink.
+Accepts two optional recorders:
+- ``file_recorder``: JSONL file at {state_dir}/audit.jsonl (append-only)
+- ``db_recorder``: tool_calls DB row (optional, wired in Step 12+)
+For backward compatibility, ``recorder`` is accepted as an alias for
+``file_recorder`` (used by pre-Step-11 unit tests).
+"""
def __init__(
self,
run_id: UUID | None = None,
phase_id: UUID | None = None,
interactive_session_id: UUID | None = None,
-recorder: Any | None = None,
+file_recorder: AuditRecorder | None = None,
+db_recorder: AuditRecorder | None = None,
+# backward-compat alias — maps to file_recorder
+recorder: AuditRecorder | None = None,
) -> None:
super().__init__()
self.run_id = run_id
self.phase_id = phase_id
self.interactive_session_id = interactive_session_id
-self.recorder = recorder
+# ``recorder`` is a pre-Step-11 alias for file_recorder
+self.file_recorder: AuditRecorder | None = (
+file_recorder if file_recorder is not None else recorder
+)
+self.db_recorder = db_recorder
async def awrap_tool_call(self, request: Any, handler: Any) -> Any:
started = time.perf_counter()
# ToolCallRequest exposes tool_call dict with 'name' and 'args'
tool_call = getattr(request, "tool_call", {}) or {}
name: str = tool_call.get("name", "unknown") if isinstance(tool_call, dict) else "unknown"
args: dict[str, Any] = (
tool_call.get("args", {}) if isinstance(tool_call, dict) else {}
) or {}
+error: str | None = None
+result: Any = None
try:
result = await handler(request)
+return result
except Exception as e:
-await self._record(name, args, None, type(e).__name__, started)
+error = type(e).__name__
raise
-await self._record(name, args, result, None, started)
-return result
-async def _record(
-self,
-name: str,
-args: dict[str, Any],
-result: Any,
-error: str | None,
-started: float,
-) -> None:
-if self.recorder is None:
-return
-serializable_result: str | int | float | bool | dict[str, Any] | list[Any] | None
-if isinstance(result, (str, int, float, bool, dict, list)) or result is None:
-serializable_result = result
-else:
-serializable_result = str(result)
-await self.recorder(
-{
+finally:
+serializable_result: str | int | float | bool | dict[str, Any] | list[Any] | None
+if isinstance(result, (str, int, float, bool, dict, list)) or result is None:
+serializable_result = result
+else:
+serializable_result = str(result)
+record: dict[str, Any] = {
"tool_name": name,
"args": args,
"result": serializable_result,
@@ -70,4 +74,13 @@ class AuditToolMiddleware(AgentMiddleware):
"phase_id": self.phase_id,
"interactive_session_id": self.interactive_session_id,
}
-)
+if self.file_recorder is not None:
+try:
+await self.file_recorder(record)
+except Exception: # noqa: S110 — never let audit failure break the tool
+pass
+if self.db_recorder is not None:
+try:
+await self.db_recorder(record)
+except Exception: # noqa: S110
+pass

@@ -1,4 +1,4 @@
-"""CostMiddleware: capture every LLM call's usage and accumulate cost into the SQLite ledger."""
+"""CostMiddleware: per-LLM-call cost tracking + optional budget enforcement."""
from __future__ import annotations
@@ -6,15 +6,17 @@ import time
from typing import Any
from uuid import UUID
-from langchain.agents.middleware import AgentMiddleware
+from langchain.agents.middleware import AgentMiddleware, ToolCallRequest
+from langchain_core.messages import ToolMessage
+from ..budget import BudgetTracker
from ..monitoring.pricing import PricingCache
class CostMiddleware(AgentMiddleware):
-"""Wrap every model call. Compute cost from usage_metadata and persist.
+"""Wrap every model call. Compute cost from usage_metadata and persist via recorder + budget.
-Step 8 wires the DB writer via the recorder callback.
+Step 8 wires the BudgetTracker via the budget_tracker parameter.
"""
def __init__(
@@ -23,18 +25,38 @@ class CostMiddleware(AgentMiddleware):
model_name: str,
run_id: UUID | None = None,
phase_id: UUID | None = None,
interactive_session_id: UUID | None = None,
persona_name: str | None = None,
-recorder: Any | None = None, # callable(record) -> Awaitable[None] for DB sink (Step 8)
+recorder: Any | None = None, # async callable(record) -> Awaitable[None] for DB sink
+budget_tracker: BudgetTracker | None = None,
) -> None:
super().__init__()
self.pricing = pricing
self.model_name = model_name
self.run_id = run_id
self.phase_id = phase_id
self.interactive_session_id = interactive_session_id
self.persona_name = persona_name
self.recorder = recorder
+self.budget = budget_tracker
+async def awrap_tool_call(
+self,
+request: ToolCallRequest,
+handler: Any,
+) -> ToolMessage | Any:
+"""Pass tool calls through without modification."""
+return await handler(request)
async def awrap_model_call(self, request: Any, handler: Any) -> Any:
+# Pre-call: ask budget tracker if estimated cost is allowed
+if self.budget is not None:
+estimated = self.pricing.compute_cost(self.model_name, 4000, 1500)
+await self.budget.assert_can_call(
+run_id=self.run_id,
+persona_name=self.persona_name,
+estimated_cost_usd=estimated,
+)
started = time.perf_counter()
try:
response = await handler(request)
@@ -47,9 +69,27 @@ class CostMiddleware(AgentMiddleware):
error_code=type(e).__name__,
)
raise
usage = getattr(response, "usage_metadata", None) or {}
in_tokens = int(usage.get("input_tokens", 0) or 0)
out_tokens = int(usage.get("output_tokens", 0) or 0)
# Token usage shows up in different places depending on the model integration.
# langchain-openai usually fills `usage_metadata`, but for streamed responses
# or some OpenAI-compatible endpoints (OpenRouter forwarding DeepSeek/etc.)
# the count lands in `response_metadata.token_usage` with OpenAI keys
# (`prompt_tokens` / `completion_tokens`).
usage_meta = getattr(response, "usage_metadata", None) or {}
response_meta = getattr(response, "response_metadata", None) or {}
token_usage = response_meta.get("token_usage") if isinstance(response_meta, dict) else None
token_usage = token_usage or {}
in_tokens = int(
usage_meta.get("input_tokens")
or token_usage.get("prompt_tokens")
or token_usage.get("input_tokens")
or 0
)
out_tokens = int(
usage_meta.get("output_tokens")
or token_usage.get("completion_tokens")
or token_usage.get("output_tokens")
or 0
)
await self._record(
input_tokens=in_tokens,
output_tokens=out_tokens,
@@ -57,6 +97,14 @@ class CostMiddleware(AgentMiddleware):
status="ok",
error_code=None,
)
# Post-call: record actual cost in budget ledger
if self.budget is not None and (in_tokens or out_tokens):
actual = self.pricing.compute_cost(self.model_name, in_tokens, out_tokens)
await self.budget.record(
run_id=self.run_id,
persona_name=self.persona_name,
actual_cost_usd=actual,
)
return response
async def _record(
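
The token-count fallback in `awrap_model_call` can be tried in isolation. The sketch below lifts that lookup order into a standalone helper; `extract_token_counts` and the `SimpleNamespace` stand-ins for model responses are hypothetical names used only for illustration, not part of this change.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any


def extract_token_counts(response: Any) -> tuple[int, int]:
    """Return (input_tokens, output_tokens), preferring usage_metadata."""
    usage_meta = getattr(response, "usage_metadata", None) or {}
    response_meta = getattr(response, "response_metadata", None) or {}
    token_usage = response_meta.get("token_usage") if isinstance(response_meta, dict) else None
    token_usage = token_usage or {}
    in_tokens = int(
        usage_meta.get("input_tokens")
        or token_usage.get("prompt_tokens")
        or token_usage.get("input_tokens")
        or 0
    )
    out_tokens = int(
        usage_meta.get("output_tokens")
        or token_usage.get("completion_tokens")
        or token_usage.get("output_tokens")
        or 0
    )
    return in_tokens, out_tokens


# Hypothetical responses: one with usage_metadata, one with OpenAI-style keys only.
native = SimpleNamespace(
    usage_metadata={"input_tokens": 120, "output_tokens": 30},
    response_metadata={},
)
openai_style = SimpleNamespace(
    usage_metadata=None,
    response_metadata={"token_usage": {"prompt_tokens": 80, "completion_tokens": 20}},
)
empty = SimpleNamespace()

print(extract_token_counts(native))
print(extract_token_counts(openai_style))
print(extract_token_counts(empty))
```

A response that reports nothing yields `(0, 0)`, which is why the post-call budget write is guarded by `in_tokens or out_tokens`.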


@@ -0,0 +1,70 @@
"""Estimate per-phase cost using pricing matrix + crude token heuristic.
For accurate billing, use the actual usage_metadata after the call (see CostMiddleware).
This module is for the *preview* shown before ``mydeepagent run`` starts.
"""
from __future__ import annotations
from dataclasses import dataclass
from typing import TYPE_CHECKING
from ..persona import Persona
from ..workflow import WorkflowPhase, WorkflowTemplate
from .pricing import PricingCache
if TYPE_CHECKING:
from ..binding import Binding
@dataclass(frozen=True)
class PhaseCostEstimate:
phase_key: str
persona_name: str
model: str
estimated_input_tokens: int
estimated_output_tokens: int
estimated_cost_usd: float
@dataclass(frozen=True)
class WorkflowCostEstimate:
phases: list[PhaseCostEstimate]
total_usd: float
_DEFAULT_INPUT_TOKENS = 4000 # generous: instructions + context + prior artifacts
_DEFAULT_OUTPUT_TOKENS = 1500 # bounded by max_tokens; we use persona max_tokens if set
def estimate_phase(
phase: WorkflowPhase,
persona: Persona,
pricing: PricingCache,
) -> PhaseCostEstimate:
"""Estimate the cost of a single phase based on persona model and default token counts."""
input_tokens = _DEFAULT_INPUT_TOKENS
output_tokens = int(persona.model_params.get("max_tokens", _DEFAULT_OUTPUT_TOKENS))
cost = pricing.compute_cost(persona.model, input_tokens, output_tokens)
return PhaseCostEstimate(
phase_key=phase.key,
persona_name=f"{persona.name}@{persona.version}",
model=persona.model,
estimated_input_tokens=input_tokens,
estimated_output_tokens=output_tokens,
estimated_cost_usd=cost,
)
def estimate_workflow(
template: WorkflowTemplate,
bindings: dict[str, Binding],
pricing: PricingCache,
) -> WorkflowCostEstimate:
"""Estimate the total cost of all phases in a workflow template."""
phases: list[PhaseCostEstimate] = []
for phase in template.phases:
binding = bindings[phase.role]
phases.append(estimate_phase(phase, binding.persona, pricing))
total = sum(p.estimated_cost_usd for p in phases)
return WorkflowCostEstimate(phases=phases, total_usd=total)
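
To make the preview arithmetic concrete, here is a minimal sketch of a per-phase estimate. The internals of `PricingCache.compute_cost` are not shown in this diff, so the per-million-token formula, the `ToyPrice` type, and the prices below are assumptions chosen purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPrice:
    """Hypothetical price entry, in USD per one million tokens."""

    input_per_million_usd: float
    output_per_million_usd: float


def compute_cost(price: ToyPrice, input_tokens: int, output_tokens: int) -> float:
    """Assumed formula: tokens times the per-million rate, summed over both directions."""
    total = (
        input_tokens * price.input_per_million_usd
        + output_tokens * price.output_per_million_usd
    )
    return total / 1_000_000


DEFAULT_INPUT_TOKENS = 4000
DEFAULT_OUTPUT_TOKENS = 1500

# Made-up prices; real values would come from the pricing cache.
price = ToyPrice(input_per_million_usd=3.0, output_per_million_usd=15.0)

# A persona without max_tokens falls back to the default output budget.
default_cost = compute_cost(price, DEFAULT_INPUT_TOKENS, DEFAULT_OUTPUT_TOKENS)

# A persona with max_tokens=8000 raises the estimated output side.
capped_cost = compute_cost(price, DEFAULT_INPUT_TOKENS, 8000)

print(f"default: ${default_cost:.4f}  with max_tokens=8000: ${capped_cost:.4f}")
```

Because the output side is taken from `max_tokens` when set, the preview behaves as an upper bound on output spend rather than a prediction of it.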


@@ -0,0 +1,159 @@
"""Crash recovery: sweep non-terminal runs at startup and mark them as failed.
This v0.1.0 implementation is conservative — runs that were mid-flight at the previous
process death are *not* resumed automatically. They are marked ``failed`` with a
synthesized ``run.failed`` event so the active-run uniqueness slot is freed and the
user can re-run if desired. Real Temporal-style resume is deferred to v0.2 or beyond.
"""
from __future__ import annotations
from dataclasses import dataclass
from datetime import UTC, datetime
from uuid import UUID
from sqlalchemy import func, select
from sqlalchemy.dialects.sqlite import insert as sqlite_insert
from sqlalchemy.ext.asyncio import AsyncSession
from .enums import RunPhaseState, RunState
from .persistence.db import Database
from .persistence.models import RunEventRow, RunPhaseRow, RunRow
from .run_event import RunEventType, run_idempotency_key
_NON_TERMINAL_RUN_STATES: frozenset[str] = frozenset(
{
RunState.CREATED.value,
RunState.BOUND.value,
RunState.PLANNING.value,
RunState.AWAITING_APPROVAL.value,
RunState.EXECUTING.value,
RunState.PAUSED.value,
}
)
_NON_TERMINAL_PHASE_STATES: frozenset[str] = frozenset(
{
RunPhaseState.PENDING.value,
RunPhaseState.RUNNING.value,
RunPhaseState.AWAITING_ARTIFACT.value,
RunPhaseState.VALIDATING.value,
RunPhaseState.AWAITING_APPROVAL.value,
}
)
_FAILED_REASON = "process_restart_unrecovered"
@dataclass(frozen=True)
class SweepReport:
"""Outcome of one recovery sweep."""
failed_runs: tuple[UUID, ...]
failed_phases: tuple[UUID, ...]
@property
def total(self) -> int:
return len(self.failed_runs) + len(self.failed_phases)
async def sweep_orphan_runs(db: Database) -> SweepReport:
"""Mark non-terminal runs/phases as ``failed`` and emit run.failed events.
Idempotent: rerunning when no orphans exist returns an empty SweepReport.
Uses the existing ``run_events.idempotency_key`` UNIQUE constraint so duplicate
sweeps in the same process don't insert duplicate events.
"""
failed_runs: list[UUID] = []
failed_phases: list[UUID] = []
now = _now_iso()
async with db.session() as s:
rows = (
(await s.execute(select(RunRow).where(RunRow.state.in_(_NON_TERMINAL_RUN_STATES))))
.scalars()
.all()
)
for run in rows:
run_uuid = UUID(run.id)
run.state = RunState.FAILED.value
run.ended_at = now
run.updated_at = now
run.final_report_path = None
failed_runs.append(run_uuid)
# Append a single synthesized run.failed event (idempotent).
await _append_event_idempotent(
s,
run_id=run.id,
event_type=RunEventType.RUN_FAILED,
payload={"reason": _FAILED_REASON},
extra_for_key={"reason": _FAILED_REASON},
)
# Cascade orphan phases.
phase_rows = (
(
await s.execute(
select(RunPhaseRow)
.where(RunPhaseRow.run_id == run.id)
.where(RunPhaseRow.state.in_(_NON_TERMINAL_PHASE_STATES))
)
)
.scalars()
.all()
)
for ph in phase_rows:
ph.state = RunPhaseState.FAILED.value
ph.ended_at = now
failed_phases.append(UUID(ph.id))
await s.commit()
return SweepReport(
failed_runs=tuple(failed_runs),
failed_phases=tuple(failed_phases),
)
async def _append_event_idempotent(
s: AsyncSession,
*,
run_id: str,
event_type: RunEventType,
payload: dict[str, object],
extra_for_key: dict[str, object] | None = None,
) -> None:
"""Append a run_events row using ON CONFLICT DO NOTHING on idempotency_key."""
extra = {k: str(v) for k, v in (extra_for_key or {}).items()}
key = run_idempotency_key(event_type, UUID(run_id), **extra)
# Compute next seq.
next_seq = (
await s.execute(
select(func.coalesce(func.max(RunEventRow.seq), 0) + 1).where(
RunEventRow.run_id == run_id
)
)
).scalar_one()
stmt = (
sqlite_insert(RunEventRow)
.values(
run_id=run_id,
phase_id=None,
seq=int(next_seq),
type=event_type.value,
payload=payload,
idempotency_key=key,
ts=_now_iso(),
)
.on_conflict_do_nothing(index_elements=["run_id", "idempotency_key"])
)
await s.execute(stmt)
def _now_iso() -> str:
return datetime.now(UTC).isoformat(timespec="seconds")
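
The idempotent append can be illustrated with the standard-library `sqlite3` module instead of SQLAlchemy. This is a sketch only: the table below is a cut-down, hypothetical version of `run_events`, and it assumes a UNIQUE constraint over `(run_id, idempotency_key)` to match the `index_elements` used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE run_events (
        run_id TEXT NOT NULL,
        seq INTEGER NOT NULL,
        type TEXT NOT NULL,
        idempotency_key TEXT NOT NULL,
        UNIQUE (run_id, idempotency_key)
    )
    """
)


def append_event(db: sqlite3.Connection, run_id: str, event_type: str, key: str) -> None:
    """Insert an event; a second insert with the same key is silently skipped."""
    next_seq = db.execute(
        "SELECT COALESCE(MAX(seq), 0) + 1 FROM run_events WHERE run_id = ?",
        (run_id,),
    ).fetchone()[0]
    db.execute(
        "INSERT INTO run_events (run_id, seq, type, idempotency_key) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT (run_id, idempotency_key) DO NOTHING",
        (run_id, next_seq, event_type, key),
    )


key = "run.failed:run-1:reason=process_restart_unrecovered"
append_event(conn, "run-1", "run.failed", key)
append_event(conn, "run-1", "run.failed", key)  # repeated sweep: no new row

row_count = conn.execute("SELECT COUNT(*) FROM run_events").fetchone()[0]
print(row_count)
```

The second call computes a fresh `seq` but never stores it, so a repeated sweep leaves the event log unchanged.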


@@ -1 +1,39 @@
"""Run event types for streaming progress. Implemented in Step 4."""
"""Run event types + idempotency key generation."""
from __future__ import annotations
from enum import StrEnum
from uuid import UUID
class RunEventType(StrEnum):
RUN_CREATED = "run.created"
RUN_STARTED = "run.started"
RUN_PAUSED = "run.paused"
RUN_RESUMED = "run.resumed"
RUN_COMPLETED = "run.completed"
RUN_FAILED = "run.failed"
RUN_ABORTED = "run.aborted"
PHASE_STARTED = "phase.started"
PHASE_COMPLETED = "phase.completed"
PHASE_FAILED = "phase.failed"
PHASE_SKIPPED = "phase.skipped"
PROMPT_SENT = "prompt.sent"
PROMPT_REPAIRED = "prompt.repaired"
ARTIFACT_EXPECTED = "artifact.expected"
ARTIFACT_VALIDATED = "artifact.validated"
ARTIFACT_INVALID = "artifact.invalid"
ARTIFACT_TIMEOUT = "artifact.timeout"
APPROVAL_REQUESTED = "approval.requested"
APPROVAL_RESOLVED = "approval.resolved"
def run_idempotency_key(event_type: RunEventType, run_id: UUID, **extra: object) -> str:
"""Deterministic idempotency key per plan v2.0 §13.1.
Key format: "<event_type>:<run_id>[:<k>=<v>...]" with extra keys sorted ascending.
"""
parts: list[str] = [event_type.value, str(run_id)]
for k in sorted(extra):
parts.append(f"{k}={extra[k]}")
return ":".join(parts)
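
As a quick illustration of the key format, the sketch below reproduces the function against a two-member stand-in enum. `EventType` and `make_key` are hypothetical names, and `str, Enum` is used in place of `StrEnum` only so the snippet also runs on Python versions before 3.11.

```python
from enum import Enum
from uuid import UUID


class EventType(str, Enum):
    """Stand-in for RunEventType with just the members used here."""

    RUN_FAILED = "run.failed"
    PHASE_FAILED = "phase.failed"


def make_key(event_type: EventType, run_id: UUID, **extra: object) -> str:
    """Build '<event_type>:<run_id>[:<k>=<v>...]' with extra keys sorted ascending."""
    parts = [event_type.value, str(run_id)]
    for k in sorted(extra):
        parts.append(f"{k}={extra[k]}")
    return ":".join(parts)


run_id = UUID("12345678-1234-5678-1234-567812345678")

sweep_key = make_key(EventType.RUN_FAILED, run_id, reason="process_restart_unrecovered")
ordered_key = make_key(EventType.PHASE_FAILED, run_id, b=2, a=1)

print(sweep_key)
print(ordered_key)
```

Sorting the extra keys means the same logical event always maps to the same string regardless of keyword order at the call site.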


@@ -0,0 +1,28 @@
"""Cross-cutting secret resolution. Tries config -> env -> keyring -> error."""
from __future__ import annotations
import os
from .config import Config
from .errors import MyDeepAgentError
from .keys import get_api_key
def resolve_openrouter_api_key(config: Config) -> str:
"""Resolve the OpenRouter API key with priority: config -> env -> keyring -> error."""
if config.openrouter_api_key:
return config.openrouter_api_key
env_key = os.environ.get("MYDEEPAGENT_OPENROUTER_API_KEY") or os.environ.get(
"OPENROUTER_API_KEY"
)
if env_key:
return env_key
kr_key = get_api_key("openrouter")
if kr_key:
return kr_key
raise MyDeepAgentError.human_required(
"backend_auth_failed",
message="OpenRouter API key is not configured",
recovery_hint="run `mydeepagent login openrouter` to register one in the OS keyring",
)
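
The priority chain can be exercised without a real config object or OS keyring. The sketch below is a simplified stand-in: `resolve_key`, its injected `env` mapping and `keyring_lookup` callable, and the use of `LookupError` in place of `MyDeepAgentError` are all assumptions made for illustration.

```python
from __future__ import annotations

from collections.abc import Callable, Mapping


def resolve_key(
    config_value: str | None,
    env: Mapping[str, str],
    keyring_lookup: Callable[[], str | None],
) -> str:
    """Return the first non-empty key from config, then env, then keyring."""
    if config_value:
        return config_value
    env_key = env.get("MYDEEPAGENT_OPENROUTER_API_KEY") or env.get("OPENROUTER_API_KEY")
    if env_key:
        return env_key
    kr_key = keyring_lookup()
    if kr_key:
        return kr_key
    raise LookupError("OpenRouter API key is not configured")


# Placeholder values only; none of these are real credentials.
from_config = resolve_key("cfg-key", {"OPENROUTER_API_KEY": "env-key"}, lambda: "kr-key")
from_env = resolve_key(None, {"OPENROUTER_API_KEY": "env-key"}, lambda: "kr-key")
from_keyring = resolve_key(None, {}, lambda: "kr-key")

print(from_config, from_env, from_keyring)
```

Passing the environment and keyring lookup as arguments keeps the chain testable; the real function reads `os.environ` and calls `get_api_key("openrouter")` directly.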


@@ -11,7 +11,6 @@ Connects:
from __future__ import annotations
import os
from pathlib import Path
from typing import Any, Literal
from uuid import UUID
@@ -28,6 +27,7 @@ from langchain_openai import ChatOpenAI
from .config import Config
from .errors import MyDeepAgentError
from .persona import FilesystemPermissionSpec, Persona, PersonaSubagent
from .secrets import resolve_openrouter_api_key as _resolve_openrouter_api_key_impl
DEFAULT_DENY_PATHS: tuple[str, ...] = (
"/.env*",
@@ -125,24 +125,13 @@ def _subagent_to_dict(sub: PersonaSubagent) -> SubAgent:
def _resolve_openrouter_api_key(config: Config) -> str:
"""Pull the OpenRouter API key from config -> env -> error.
"""Pull the OpenRouter API key from config -> env -> keyring -> error.
Priority: config.openrouter_api_key -> MYDEEPAGENT_OPENROUTER_API_KEY -> OPENROUTER_API_KEY.
Delegates to secrets.resolve_openrouter_api_key for full priority chain.
Priority: config.openrouter_api_key -> MYDEEPAGENT_OPENROUTER_API_KEY ->
OPENROUTER_API_KEY -> OS keyring -> error.
"""
if config.openrouter_api_key:
return config.openrouter_api_key
env_key = os.environ.get("MYDEEPAGENT_OPENROUTER_API_KEY") or os.environ.get(
"OPENROUTER_API_KEY"
)
if env_key:
return env_key
raise MyDeepAgentError.human_required(
"backend_auth_failed",
message="OpenRouter API key is not configured",
recovery_hint=(
"set MYDEEPAGENT_OPENROUTER_API_KEY in .env or run `mydeepagent login openrouter`"
),
)
return _resolve_openrouter_api_key_impl(config)
def resolve_model_instance(
@@ -258,7 +247,19 @@ def build_agent(
]
kwargs["permissions"] = permissions
if persona.allowed_tools:
# deepagents 0.6.x: passing `tools` as a string list to create_deep_agent() triggers
# SubAgentMiddleware._get_subagents() → langchain create_agent() → ToolNode, which
# iterates the LocalShellBackend tools. Some of those tools are raw async functions
# (not StructuredTool instances), causing:
# AttributeError: 'function' object has no attribute 'name'
# Workaround: skip `tools` kwarg for local_shell backend. deepagents exposes all
# backend-default tools (read_file, write_file, glob, grep, ls, execute, write_todos)
# to the LLM by default; SafetyShellMiddleware enforces path safety and blocks
# destructive-command execution regardless of which tools the LLM attempts to call.
# For non-local_shell backends (state, filesystem, composite), `tools` is passed
# through normally since those backends return proper StructuredTool objects.
use_tools_kwarg = persona.deepagents_backend != "local_shell"
if use_tools_kwarg and persona.allowed_tools:
kwargs["tools"] = list(persona.allowed_tools)
if subagents:
kwargs["subagents"] = subagents


@@ -1 +1,61 @@
"""Slash command registry and dispatcher. Implemented in Step 10."""
"""Parse and dispatch slash commands inside the interactive REPL.
Slash commands are recognized by a leading '/'; everything else is forwarded to the agent.
"""
from __future__ import annotations
from collections.abc import Awaitable, Callable
from dataclasses import dataclass
@dataclass(frozen=True)
class SlashParsed:
"""A parsed slash command. ``raw`` is the full text after the slash, stripped of surrounding whitespace."""
name: str
args: tuple[str, ...]
raw: str
def parse_slash(line: str) -> SlashParsed | None:
"""Return a SlashParsed if ``line`` starts with '/', else None."""
if not line.startswith("/"):
return None
body = line[1:].strip()
if not body:
return SlashParsed(name="", args=(), raw="")
parts = body.split()
return SlashParsed(name=parts[0].lower(), args=tuple(parts[1:]), raw=body)
SlashHandler = Callable[[SlashParsed], Awaitable[bool]]
"""A handler returns False to keep the REPL alive, True to exit it."""
class SlashRegistry:
"""Map slash command names to async handlers."""
def __init__(self) -> None:
self._handlers: dict[str, SlashHandler] = {}
self._help: dict[str, str] = {}
def register(self, name: str, handler: SlashHandler, *, help: str = "") -> None:
self._handlers[name.lower()] = handler
if help:
self._help[name.lower()] = help
async def dispatch(self, cmd: SlashParsed) -> bool:
if cmd.name in self._handlers:
return await self._handlers[cmd.name](cmd)
return False # unknown command: REPL stays alive; the caller cannot tell this apart from a handled command
@property
def names(self) -> list[str]:
return sorted(self._handlers)
def help_for(self, name: str) -> str:
return self._help.get(name.lower(), "")
def all_help(self) -> list[tuple[str, str]]:
return [(n, self._help.get(n, "")) for n in self.names]
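
A few inputs show how the parser treats arguments, case, and edge cases. The sketch repeats the parsing logic in a self-contained form; `Parsed` and `parse` are shortened, hypothetical names for `SlashParsed` and `parse_slash`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Parsed:
    name: str
    args: tuple[str, ...]
    raw: str


def parse(line: str) -> Parsed | None:
    """Return Parsed when the line starts with '/', otherwise None."""
    if not line.startswith("/"):
        return None
    body = line[1:].strip()
    if not body:
        return Parsed(name="", args=(), raw="")
    parts = body.split()
    return Parsed(name=parts[0].lower(), args=tuple(parts[1:]), raw=body)


with_args = parse("/Model  some-model  fast")  # name is lowercased, args are not
bare_slash = parse("/")                        # empty command name
plain_text = parse("hello agent")              # not a slash command
leading_space = parse(" /help")                # must start with '/' exactly

print(with_args)
print(bare_slash, plain_text, leading_space)
```

Note that a line with leading whitespace before the slash is not treated as a command, so a caller that wants that behaviour would need to strip the line first.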


@@ -1 +1,53 @@
"""TUI approval dialog for human-in-the-loop actions. Implemented in Step 7."""
"""TUI approval prompt: display phase result and ask for approve/reject/request_changes/abort."""
from __future__ import annotations
import typer
from rich.console import Console
from ..enums import ApprovalDecisionAction
_CONSOLE = Console()
_CHOICE_MAP: dict[str, ApprovalDecisionAction] = {
"approve": ApprovalDecisionAction.APPROVE,
"a": ApprovalDecisionAction.APPROVE,
"reject": ApprovalDecisionAction.REJECT,
"r": ApprovalDecisionAction.REJECT,
"request_changes": ApprovalDecisionAction.REQUEST_CHANGES,
"c": ApprovalDecisionAction.REQUEST_CHANGES,
"abort": ApprovalDecisionAction.ABORT,
"x": ApprovalDecisionAction.ABORT,
}
async def cli_approval_callback(
payload: dict[str, object],
gates: list[str],
) -> ApprovalDecisionAction:
"""Display the phase result and prompt the user for an approval decision.
Valid inputs (case-insensitive):
approve / a → APPROVE
reject / r → REJECT
request_changes / c → REQUEST_CHANGES
abort / x → ABORT
Pressing Enter with no input takes the prompt default (approve).
Any other unrecognised input maps to REJECT.
"""
_CONSOLE.print()
_CONSOLE.print(f"[bold cyan]Approval required[/] — gates: {', '.join(gates) or '(none)'}")
_CONSOLE.print(f" phase: {payload.get('phase_key')}")
_CONSOLE.print(f" artifact: {payload.get('artifact_path')}")
_CONSOLE.print()
raw = (
typer.prompt(
"Decision [approve / reject / request_changes / abort]",
default="approve",
)
.strip()
.lower()
)
return _CHOICE_MAP.get(raw, ApprovalDecisionAction.REJECT)
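
The mapping from typed input to a decision can be checked without an interactive prompt. In the sketch below, `Decision` is a hypothetical stand-in for `ApprovalDecisionAction` (its real values are not shown in this diff), and `resolve_decision` models the prompt default by substituting `"approve"` for empty input, which is an assumption about how the prompt behaves when given a default.

```python
from enum import Enum


class Decision(str, Enum):
    """Stand-in enum; member values are illustrative."""

    APPROVE = "approve"
    REJECT = "reject"
    REQUEST_CHANGES = "request_changes"
    ABORT = "abort"


CHOICES = {
    "approve": Decision.APPROVE,
    "a": Decision.APPROVE,
    "reject": Decision.REJECT,
    "r": Decision.REJECT,
    "request_changes": Decision.REQUEST_CHANGES,
    "c": Decision.REQUEST_CHANGES,
    "abort": Decision.ABORT,
    "x": Decision.ABORT,
}


def resolve_decision(raw: str, default: str = "approve") -> Decision:
    """Normalise input, fall back to the default when empty, REJECT when unknown."""
    text = raw.strip().lower() or default
    return CHOICES.get(text, Decision.REJECT)


for typed in ["", " A ", "c", "x", "maybe"]:
    print(repr(typed), resolve_decision(typed).name)
```

The asymmetry is worth keeping in mind: empty input approves, while a typo rejects.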