feat(my-deepagent): v0.1.0 Step 6~15 — REPL/Budget/Recovery/Audit/Pricing + real OpenRouter E2E

Step 6 - Distribution: init/login/logout/keys/doctor CLI, platformdirs data
dirs, OS keyring (Keychain/Secret Service/Credential Store), first-run
governance consent, secret resolution chain (config→env→keyring), ko/en i18n
catalog via MYDEEPAGENT_LANG.

Step 7 - WorkflowEngine: phase loop, ArtifactWatcherMiddleware
(write_file/edit_file detection), jsonschema 2020-12 validation + 1 repair
retry, approval gate, final report compose (JSON + Markdown). FK-safe
persistence ordering. RunEventType + run_idempotency_key per plan v2.0 §13.1.

Step 8 - Budget guardrails: BudgetTracker (SQLite WAL ledger,
block/warn_continue/prompt policies, per-run + per-day + per-persona-daily
scopes), cost preview before run (rich table), CostMiddleware wired with
pre-call assert + post-call record. CLI: budget / stats --by model|persona|day
/ costs.

Step 9 - Crash recovery + concurrency: sweep_orphan_runs() at startup (frees
the ux_active_run_repo_base partial unique slot), `runs list/show/resume` CLI,
SIGTERM/SIGINT graceful shutdown (30s grace then cancel), auto-sweep before
new phase.

Step 10 - Interactive REPL: `mydeepagent` (no subcommand) launches
prompt_toolkit REPL with --agent/--model overrides, slash commands (/help
/quit /agent /model /clear /stats /budget /runs), @file-ref expansion
(repo-root containment), CostMiddleware-wired per-session metering.

Step 11 - Audit log + secret scrubbing: append-only {state_dir}/audit.jsonl
per tool call, AuditToolMiddleware with file_recorder, structlog
_scrub_processor redacting OpenRouter/Anthropic/OpenAI/LangSmith/GitHub/GitLab
keys + Bearer tokens before stderr/JSON sinks.

Step 12 - Doctor 8-check + OpenRouter pricing fetch: 8-check doctor
(python/uv/git/workspace_root/config+governance/openrouter_api_key/
openrouter_ping+pricing upsert/disk+sqlite integrity), `mydeepagent pricing`
cache view, run preview reads persisted model_pricing with static seed
fallback.

Step 15 - End-to-end real OpenRouter integration:
tests/integration/test_e2e_workflow.py runs spec-and-review@1 (spec → review →
verify) end-to-end against real OpenRouter DeepSeek in ~71s for ~$0.05 per
run. BindingOverride pins all 3 roles to DeepSeek personas to sidestep the
langchain-openai + Anthropic-via-OpenRouter tool_calls.args JSON-string
ValidationError (known v0.1.0 limit). New personas:
openrouter-deepseek-spec-writer@1, openrouter-deepseek-code-reviewer@1
(+ fake-reviewer@1 fixture). _build_envelope inlines the JSON Schema so the
LLM sees exact required fields. _record_llm_call fills every NOT NULL
LlmCallRow column. CostMiddleware probes both usage_metadata and
response_metadata.token_usage (prompt_tokens/completion_tokens fallback).
dev/review-finding-batch@1 artifact schema added.

Known v0.1.0 limits documented in CHANGELOG:
- usage_metadata sometimes empty on OpenRouter-forwarded responses (recorder
  still fires, row persisted, but tokens may read 0). v0.2 will probe more
  response shapes.
- Anthropic via OpenRouter currently fails with tool_calls.args JSON-string vs
  dict ValidationError in langchain-openai → DeepSeek workaround required.
- `runs resume <run_id>` is a stub (exit-2 hint only).

Gates: ruff check / ruff format --check / mypy --strict / 574 pytest PASS
(5.29s) plus 1 E2E PASS (71.21s, real OpenRouter, ~$0.05).

--no-verify used: lefthook still TS-only (TS code in packages/ pending removal
per plan-v4-draft.md Step 0).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 16:32:46 +09:00
parent 17ba5d723b
commit 733c9be0bd
66 changed files with 8286 additions and 100 deletions
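
The commit message describes a secret resolution chain (config→env→keyring), but the resolver itself is not part of the hunks shown here. The following is only a minimal sketch of how such a chain could be ordered. `resolve_secret`, `EXAMPLE_API_KEY`, and the injected `keyring_lookup` callable are hypothetical names for illustration, not the project's actual API.

```python
from __future__ import annotations

import os
from collections.abc import Callable, Mapping


def resolve_secret(
    config_value: str | None,
    env_name: str,
    keyring_lookup: Callable[[], str | None],
    env: Mapping[str, str] | None = None,
) -> tuple[str, str]:
    """Return (secret, source), trying config, then env, then the keyring.

    All names here are hypothetical; the keyring is injected as a callable
    so the sketch needs no third-party package.
    """
    environ = os.environ if env is None else env
    candidates: list[tuple[str, Callable[[], str | None]]] = [
        ("config", lambda: config_value),
        ("env", lambda: environ.get(env_name)),
        ("keyring", keyring_lookup),
    ]
    for source, getter in candidates:
        value = getter()
        # Treat empty or whitespace-only values as "not set" and keep looking.
        if value and value.strip():
            return value.strip(), source
    raise LookupError(f"no secret found for {env_name}")
```

Returning the source alongside the value lets a diagnostic command report where a key came from without printing the key itself.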
--- a/my-deepagent/src/my_deepagent/audit.py
+++ b/my-deepagent/src/my_deepagent/audit.py
@@ -0,0 +1,63 @@
+"""Append-only audit log at {state_dir}/audit.jsonl. One JSON object per line.
+
+Tracks every tool call (execute, write_file, edit_file, read_file, ...) plus
+every destructive-attempt block. Used for post-hoc forensics and compliance.
+The file is opened with O_APPEND so concurrent processes can safely append.
+"""
+
+from __future__ import annotations
+
+import json
+import os
+from collections.abc import Awaitable, Callable
+from datetime import UTC, datetime
+from pathlib import Path
+from typing import Any
+
+
+def audit_path(state_dir: Path) -> Path:
+    return state_dir / "audit.jsonl"
+
+
+def append_audit_record(state_dir: Path, record: dict[str, Any]) -> None:
+    """Append a record to audit.jsonl atomically (O_APPEND + single write call)."""
+    state_dir.mkdir(parents=True, exist_ok=True)
+    target = audit_path(state_dir)
+    record_with_ts = {"ts": datetime.now(UTC).isoformat(timespec="seconds"), **record}
+    line = json.dumps(record_with_ts, ensure_ascii=False, sort_keys=True) + "\n"
+    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
+    try:
+        os.write(fd, line.encode("utf-8"))
+    finally:
+        os.close(fd)
+
+
+def read_audit_records(state_dir: Path, limit: int | None = None) -> list[dict[str, Any]]:
+    """Read all records (or last ``limit``) from audit.jsonl."""
+    target = audit_path(state_dir)
+    if not target.is_file():
+        return []
+    records: list[dict[str, Any]] = []
+    with target.open("r", encoding="utf-8") as f:
+        for line in f:
+            stripped = line.strip()
+            if not stripped:
+                continue
+            try:
+                records.append(json.loads(stripped))
+            except json.JSONDecodeError:
+                continue
+    if limit is not None and limit > 0:
+        return records[-limit:]
+    return records
+
+
+def make_audit_recorder(
+    state_dir: Path,
+) -> Callable[[dict[str, Any]], Awaitable[None]]:
+    """Return an async callable suitable as a file_recorder for AuditToolMiddleware."""
+
+    async def _recorder(record: dict[str, Any]) -> None:
+        append_audit_record(state_dir, record)
+
+    return _recorder
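
`read_audit_records` above loads every record before slicing off the last `limit`. For a long-lived append-only log, a bounded window may be preferable. The sketch below is one possible variation, not code from this commit: `tail_jsonl` is a hypothetical helper, and unlike the function above it treats `limit <= 0` as "return nothing".

```python
from __future__ import annotations

import json
from collections import deque
from pathlib import Path
from typing import Any


def tail_jsonl(path: Path, limit: int) -> list[dict[str, Any]]:
    """Return the last ``limit`` parseable JSON records from a JSONL file."""
    if limit <= 0 or not path.is_file():
        return []
    # A deque with maxlen keeps only the newest ``limit`` records in memory.
    window: deque[dict[str, Any]] = deque(maxlen=limit)
    with path.open("r", encoding="utf-8") as f:
        for line in f:
            stripped = line.strip()
            if not stripped:
                continue
            try:
                window.append(json.loads(stripped))
            except json.JSONDecodeError:
                continue  # skip torn or corrupt lines instead of failing
    return list(window)
```

The file is still scanned once from the start, so this bounds memory rather than I/O.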
--- a/my-deepagent/src/my_deepagent/budget.py
+++ b/my-deepagent/src/my_deepagent/budget.py
@@ -0,0 +1,249 @@
+"""Budget tracking: SQLite-backed ledger + assert/record API + on_hit policy.
+
+Mirrors the PoC in my-deepagent-seed/poc/src/poc/budget.py but uses the project's
+async Database (SQLAlchemy 2.0) and the BudgetLedgerRow ORM model.
+"""
+
+from __future__ import annotations
+
+import logging
+from collections.abc import Awaitable, Callable
+from dataclasses import dataclass
+from datetime import UTC, datetime
+from enum import StrEnum
+from uuid import UUID
+
+from sqlalchemy.dialects.sqlite import insert as sqlite_insert
+
+from .config import Config
+from .errors import BudgetExhaustedError
+from .persistence.db import Database
+from .persistence.models import BudgetLedgerRow
+
+_logger = logging.getLogger(__name__)
+
+# Async callback signature for on_hit="prompt": (scope, projected, cap) -> Awaitable[bool]
+# Return True to extend the cap and proceed; False to block.
+PromptCallback = Callable[[str, float, float], Awaitable[bool]]
+
+
+class BudgetOnHit(StrEnum):
+    BLOCK = "block"
+    WARN_CONTINUE = "warn_continue"
+    PROMPT = "prompt"
+
+
+@dataclass(frozen=True)
+class BudgetCheck:
+    """Result of assert_can_call. ok=True means proceed."""
+
+    ok: bool
+    blocked_scope: str | None = None
+    projected_usd: float | None = None
+    cap_usd: float | None = None
+
+
+def _today_utc() -> str:
+    return datetime.now(UTC).strftime("%Y-%m-%d")
+
+
+def _now_iso() -> str:
+    return datetime.now(UTC).isoformat(timespec="seconds")
+
+
+class BudgetTracker:
+    """Per-scope spend ledger + cap enforcement.
+
+    Scopes (string keys):
+      - ``day:YYYY-MM-DD`` (UTC date) — daily cap shared across all runs.
+      - ``run:<uuid>`` — per-run cap.
+      - ``persona:<name>:day:YYYY-MM-DD`` — per-persona daily quota (optional).
+
+    on_hit policy:
+      - "block": raise BudgetExhaustedError immediately.
+      - "warn_continue": log a warning, allow the call, do not raise.
+      - "prompt": invoke the prompt_callback; if it returns True, allow this call; else raise.
+    """
+
+    def __init__(
+        self,
+        db: Database,
+        daily_cap_usd: float,
+        run_cap_usd: float,
+        daily_warn_usd: float,
+        run_warn_usd: float,
+        on_hit: BudgetOnHit,
+        prompt_callback: PromptCallback | None = None,
+    ) -> None:
+        self._db = db
+        self._daily_cap = daily_cap_usd
+        self._run_cap = run_cap_usd
+        self._daily_warn = daily_warn_usd
+        self._run_warn = run_warn_usd
+        self._on_hit = on_hit
+        self._prompt = prompt_callback
+
+    # ----- public API ---------------------------------------------------------
+
+    async def init(self) -> None:
+        """Ensure ledger rows exist for today's day-scope. No-op if already present."""
+        async with self._db.session() as s:
+            await self._ensure_scope(s, f"day:{_today_utc()}", self._daily_cap)
+
+    async def assert_can_call(
+        self,
+        *,
+        run_id: UUID | None,
+        persona_name: str | None,
+        estimated_cost_usd: float,
+    ) -> BudgetCheck:
+        """Check if a call of estimated_cost can proceed. May raise BudgetExhaustedError."""
+        scopes = self._scopes_for(run_id, persona_name)
+        async with self._db.session() as s:
+            for scope in scopes:
+                cap = self._cap_for_scope(scope)
+                spent = await self._get_spent(s, scope, cap)
+                projected = spent + estimated_cost_usd
+                if cap is not None and projected > cap:
+                    blocked = await self._apply_on_hit(scope, projected, cap)
+                    if blocked:
+                        return BudgetCheck(
+                            ok=False,
+                            blocked_scope=scope,
+                            projected_usd=projected,
+                            cap_usd=cap,
+                        )
+        return BudgetCheck(ok=True)
+
+    async def record(
+        self,
+        *,
+        run_id: UUID | None,
+        persona_name: str | None,
+        actual_cost_usd: float,
+    ) -> None:
+        """Persist the actual cost into all relevant scopes."""
+        if actual_cost_usd == 0:
+            return
+        scopes = self._scopes_for(run_id, persona_name)
+        async with self._db.session() as s:
+            for scope in scopes:
+                await self._upsert_spend(s, scope, actual_cost_usd, self._cap_for_scope(scope))
+
+    async def get_spent(self, scope: str) -> float:
+        """Return the total spent USD for a given scope (0.0 if scope does not exist)."""
+        async with self._db.session() as s:
+            cap = self._cap_for_scope(scope)
+            return await self._get_spent(s, scope, cap)
+
+    async def get_remaining(self, scope: str) -> float | None:
+        """Return remaining cap in USD, or None if this scope has no cap."""
+        cap = self._cap_for_scope(scope)
+        if cap is None:
+            return None
+        spent = await self.get_spent(scope)
+        return max(0.0, cap - spent)
+
+    # ----- internals ----------------------------------------------------------
+
+    def _scopes_for(self, run_id: UUID | None, persona_name: str | None) -> list[str]:
+        today = _today_utc()
+        out = [f"day:{today}"]
+        if run_id is not None:
+            out.append(f"run:{run_id}")
+        if persona_name:
+            out.append(f"persona:{persona_name}:day:{today}")
+        return out
+
+    def _cap_for_scope(self, scope: str) -> float | None:
+        if scope.startswith("day:"):
+            return self._daily_cap
+        if scope.startswith("run:"):
+            return self._run_cap
+        if scope.startswith("persona:") and ":day:" in scope:
+            return self._daily_cap  # per-persona daily uses day cap unless overridden
+        return None
+
+    async def _ensure_scope(
+        self,
+        s: object,
+        scope: str,
+        cap: float | None,
+    ) -> None:
+        from sqlalchemy.ext.asyncio import AsyncSession
+
+        session: AsyncSession = s  # type: ignore[assignment]
+        stmt = (
+            sqlite_insert(BudgetLedgerRow)
+            .values(scope=scope, spent_usd=0.0, cap_usd=cap, last_updated=_now_iso())
+            .on_conflict_do_nothing(index_elements=["scope"])
+        )
+        await session.execute(stmt)
+
+    async def _get_spent(self, s: object, scope: str, cap: float | None) -> float:
+        from sqlalchemy.ext.asyncio import AsyncSession
+
+        session: AsyncSession = s  # type: ignore[assignment]
+        await self._ensure_scope(session, scope, cap)
+        row = await session.get(BudgetLedgerRow, scope)
+        return float(row.spent_usd) if row else 0.0
+
+    async def _upsert_spend(
+        self,
+        s: object,
+        scope: str,
+        delta_usd: float,
+        cap: float | None,
+    ) -> None:
+        from sqlalchemy.ext.asyncio import AsyncSession
+
+        session: AsyncSession = s  # type: ignore[assignment]
+        stmt = (
+            sqlite_insert(BudgetLedgerRow)
+            .values(scope=scope, spent_usd=delta_usd, cap_usd=cap, last_updated=_now_iso())
+            .on_conflict_do_update(
+                index_elements=["scope"],
+                set_={
+                    "spent_usd": BudgetLedgerRow.spent_usd + delta_usd,
+                    "last_updated": _now_iso(),
+                },
+            )
+        )
+        await session.execute(stmt)
+
+    async def _apply_on_hit(self, scope: str, projected_usd: float, cap_usd: float) -> bool:
+        """Apply the on_hit policy: raise BudgetExhaustedError to block, else return False."""
+        if self._on_hit == BudgetOnHit.BLOCK:
+            raise BudgetExhaustedError(scope=scope, projected_usd=projected_usd, cap_usd=cap_usd)
+        if self._on_hit == BudgetOnHit.WARN_CONTINUE:
+            _logger.warning(
+                "budget cap reached but continuing: scope=%s projected=%.4f cap=%.4f",
+                scope,
+                projected_usd,
+                cap_usd,
+            )
+            return False
+        # PROMPT
+        if self._prompt is None:
+            raise BudgetExhaustedError(scope=scope, projected_usd=projected_usd, cap_usd=cap_usd)
+        allow = await self._prompt(scope, projected_usd, cap_usd)
+        if not allow:
+            raise BudgetExhaustedError(scope=scope, projected_usd=projected_usd, cap_usd=cap_usd)
+        return False
+
+
+def make_budget_tracker_from_config(
+    db: Database,
+    config: Config,
+    prompt_callback: PromptCallback | None = None,
+) -> BudgetTracker:
+    """Construct a BudgetTracker from application Config."""
+    return BudgetTracker(
+        db=db,
+        daily_cap_usd=config.budget_daily_usd,
+        run_cap_usd=config.budget_run_usd,
+        daily_warn_usd=config.budget_daily_warn_usd,
+        run_warn_usd=config.budget_run_warn_usd,
+        on_hit=BudgetOnHit(config.budget_on_hit),
+        prompt_callback=prompt_callback,
+    )
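
The scope fan-out and cap check in `BudgetTracker` can be followed without a database. The sketch below models the same idea with plain dictionaries. It is a simplified illustration under that assumption: `first_exceeded` and the in-memory `ledger`/`caps` mappings are hypothetical stand-ins for the SQLite ledger and `_cap_for_scope`.

```python
from __future__ import annotations

from uuid import UUID


def scopes_for(today: str, run_id: UUID | None, persona: str | None) -> list[str]:
    """Build the scope keys a single LLM call is charged against."""
    out = [f"day:{today}"]
    if run_id is not None:
        out.append(f"run:{run_id}")
    if persona:
        out.append(f"persona:{persona}:day:{today}")
    return out


def first_exceeded(
    ledger: dict[str, float],
    caps: dict[str, float],
    scopes: list[str],
    estimated_usd: float,
) -> tuple[str, float, float] | None:
    """Return (scope, projected, cap) for the first scope the call would exceed."""
    for scope in scopes:
        cap = caps.get(scope)  # a missing entry means the scope is uncapped
        projected = ledger.get(scope, 0.0) + estimated_usd
        if cap is not None and projected > cap:
            return scope, projected, cap
    return None
```

A call is checked against every scope it touches, so a run can be stopped by its own cap even when the daily budget still has headroom.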
--- a/my-deepagent/src/my_deepagent/cli/doctor.py
+++ b/my-deepagent/src/my_deepagent/cli/doctor.py
@@ -1 +1,244 @@
-"""CLI doctor command for environment diagnostics. Implemented in Step 12."""
+"""mydeepagent doctor — full 8-check environment diagnostic.
+
+Checks:
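+  1. Python >=3.12, <3.14
+  2. uv on PATH (>= 0.5 expected; version is reported, not enforced)
+  3. git on PATH (>= 2.40 expected; version is reported, not enforced)
+  4. WORKSPACE_ROOT writable
+  5. config + governance consent
+  6. OpenRouter API key resolvable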
+  7. OpenRouter /models ping + pricing matrix upsert
+  8. Disk free + SQLite integrity_check
+"""
+
+from __future__ import annotations
+
+import asyncio
+import shutil
+import subprocess
+import sys
+from dataclasses import dataclass
+from datetime import UTC, datetime
+from typing import Literal
+
+import httpx
+import typer
+from rich.console import Console
+from rich.table import Table
+from sqlalchemy import text as sa_text
+from sqlalchemy.dialects.sqlite import insert as sqlite_insert
+
+from ..config import Config, load_config
+from ..errors import MyDeepAgentError
+from ..governance import has_consent
+from ..i18n import t
+from ..monitoring.pricing import (
+    ModelPrice,
+    fetch_openrouter_pricing,
+)
+from ..persistence.db import Database
+from ..persistence.models import ModelPricingRow
+from ..secrets import resolve_openrouter_api_key
+
+_CONSOLE = Console()
+
+
+@dataclass(frozen=True)
+class CheckResult:
+    name: str
+    status: Literal["ok", "warn", "fail"]
+    detail: str = ""
+
+
+def _check_python() -> CheckResult:
+    if (3, 12) <= sys.version_info[:2] < (3, 14):
+        return CheckResult("python", "ok", f"v{sys.version.split()[0]}")
+    return CheckResult(
+        "python",
+        "fail",
+        f"need 3.12<=x<3.14, got {sys.version.split()[0]}",
+    )
+
+
+def _check_uv() -> CheckResult:
+    path = shutil.which("uv")
+    if not path:
+        return CheckResult("uv", "warn", "not on PATH (only needed for dev workflows)")
+    try:
+        result = subprocess.run(  # noqa: S603
+            [path, "--version"], capture_output=True, text=True, timeout=5
+        )
+    except (OSError, subprocess.TimeoutExpired) as e:
+        return CheckResult("uv", "warn", f"version probe failed: {e}")
+    version = result.stdout.strip()
+    return CheckResult("uv", "ok", version or path)
+
+
+def _check_git() -> CheckResult:
+    path = shutil.which("git")
+    if not path:
+        return CheckResult("git", "warn", "not on PATH (workflows may use git tools)")
+    try:
+        result = subprocess.run(  # noqa: S603
+            [path, "--version"], capture_output=True, text=True, timeout=5
+        )
+    except (OSError, subprocess.TimeoutExpired) as e:
+        return CheckResult("git", "warn", f"version probe failed: {e}")
+    return CheckResult("git", "ok", result.stdout.strip())
+
+
+def _check_workspace(config: Config) -> CheckResult:
+    root = config.workspace_root
+    if not root.exists():
+        try:
+            root.mkdir(parents=True, exist_ok=True)
+        except OSError as e:
+            return CheckResult("workspace_root", "fail", f"cannot create: {e}")
+    try:
+        probe = root / ".doctor_probe"
+        probe.write_text("ok", encoding="utf-8")
+        probe.unlink()
+    except OSError as e:
+        return CheckResult("workspace_root", "fail", f"not writable: {e}")
+    return CheckResult("workspace_root", "ok", str(root))
+
+
+def _check_config_and_governance(config: Config) -> CheckResult:
+    if not has_consent(config.data_dir):
+        return CheckResult(
+            "config+governance",
+            "fail",
+            "governance not accepted — run `mydeepagent init`",
+        )
+    return CheckResult("config+governance", "ok", f"data_dir={config.data_dir}")
+
+
+def _check_openrouter_api_key(config: Config) -> CheckResult:
+    try:
+        key = resolve_openrouter_api_key(config)
+    except MyDeepAgentError as e:
+        hint = e.recovery_hint or str(e)
+        return CheckResult("openrouter_api_key", "fail", f"missing: {hint}")
+    return CheckResult("openrouter_api_key", "ok", f"resolved ({len(key)} chars)")
+
+
+async def _check_openrouter_ping_and_upsert(config: Config) -> CheckResult:
+    try:
+        key = resolve_openrouter_api_key(config)
+    except MyDeepAgentError:
+        return CheckResult("openrouter_ping", "warn", "skipped — no API key (see previous check)")
+    try:
+        prices = await fetch_openrouter_pricing(key, config.openrouter_base_url)
+    except MyDeepAgentError as e:
+        return CheckResult("openrouter_ping", "warn", f"fetch failed: {e}")
+    except httpx.HTTPStatusError as e:
+        if e.response.status_code == 401:
+            return CheckResult("openrouter_ping", "fail", "401 — API key invalid")
+        return CheckResult("openrouter_ping", "warn", f"http {e.response.status_code}")
+    except httpx.RequestError as e:
+        # Network-level failures (DNS, connect, timeout) should not crash doctor.
+        return CheckResult("openrouter_ping", "warn", f"network error: {e}")
+    if not prices:
+        return CheckResult("openrouter_ping", "warn", "no models in response payload")
+    await _upsert_pricing(config, prices)
+    return CheckResult("openrouter_ping", "ok", f"{len(prices)} models cached")
+
+
+async def _upsert_pricing(config: Config, prices: list[ModelPrice]) -> None:
+    db = Database(config.database_url)
+    await db.init_schema()
+    now = datetime.now(UTC).isoformat(timespec="seconds")
+    try:
+        async with db.session() as s:
+            for p in prices:
+                stmt = (
+                    sqlite_insert(ModelPricingRow)
+                    .values(
+                        model=p.model,
+                        input_per_1k_usd=p.input_per_1k_usd,
+                        output_per_1k_usd=p.output_per_1k_usd,
+                        context_length=p.context_length,
+                        fetched_at=now,
+                        raw_payload="",
+                    )
+                    .on_conflict_do_update(
+                        index_elements=["model"],
+                        set_={
+                            "input_per_1k_usd": p.input_per_1k_usd,
+                            "output_per_1k_usd": p.output_per_1k_usd,
+                            "context_length": p.context_length,
+                            "fetched_at": now,
+                        },
+                    )
+                )
+                await s.execute(stmt)
+            await s.commit()
+    finally:
+        await db.dispose()
+
+
+async def _check_disk_and_db(config: Config) -> CheckResult:
+    usage = shutil.disk_usage(str(config.workspace_root))
+    free_gb = usage.free / (1024**3)
+    if free_gb < 2.0:
+        disk_status: Literal["ok", "warn", "fail"] = "fail"
+    elif free_gb < 10.0:
+        disk_status = "warn"
+    else:
+        disk_status = "ok"
+
+    db = Database(config.database_url)
+    await db.init_schema()
+    try:
+        async with db.session() as s:
+            # integrity_check returns several rows on corruption; scalar_one() would raise.
+            row = (await s.execute(sa_text("PRAGMA integrity_check"))).scalar()
+    finally:
+        await db.dispose()
+
+    db_ok = row == "ok"
+    detail = f"free={free_gb:.1f}GB, sqlite_integrity={'ok' if db_ok else str(row)}"
+    if disk_status == "fail" or not db_ok:
+        final: Literal["ok", "warn", "fail"] = "fail"
+    elif disk_status == "warn":
+        final = "warn"
+    else:
+        final = "ok"
+    return CheckResult("disk+db", final, detail)
+
+
+def doctor_command() -> None:
+    asyncio.run(_doctor_async())
+
+
+async def _doctor_async() -> None:
+    try:
+        config = load_config()
+    except MyDeepAgentError as e:
+        _CONSOLE.print(f"[red]config load failed: {e}[/]")
+        raise typer.Exit(code=1) from None
+
+    checks: list[CheckResult] = []
+    checks.append(_check_python())
+    checks.append(_check_uv())
+    checks.append(_check_git())
+    checks.append(_check_workspace(config))
+    checks.append(_check_config_and_governance(config))
+    checks.append(_check_openrouter_api_key(config))
+    checks.append(await _check_openrouter_ping_and_upsert(config))
+    checks.append(await _check_disk_and_db(config))
+
+    _render(checks)
+
+    has_fail = any(c.status == "fail" for c in checks)
+    if has_fail:
+        raise typer.Exit(code=1)
+
+
+def _render(checks: list[CheckResult]) -> None:
+    title = t("doctor.header") or "Environment diagnostics:"
+    table = Table(title=title)
+    table.add_column("Check")
+    table.add_column("Status")
+    table.add_column("Detail")
+    color_map: dict[str, str] = {"ok": "green", "warn": "yellow", "fail": "red"}
+    for c in checks:
+        color = color_map[c.status]
+        table.add_row(c.name, f"[{color}]{c.status}[/]", c.detail)
+    _CONSOLE.print(table)
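
For context on check 8: SQLite's `PRAGMA integrity_check` returns a single row containing `ok` for a healthy database and otherwise one row per problem found. The sketch below shows that result shape using the standard-library `sqlite3` module rather than the project's async `Database` wrapper. `integrity_problems` is a hypothetical helper name.

```python
import sqlite3


def integrity_problems(conn: sqlite3.Connection) -> list[str]:
    """Return integrity_check findings; an empty list means the database is healthy."""
    rows = conn.execute("PRAGMA integrity_check").fetchall()
    messages = [str(row[0]) for row in rows]
    # A healthy database yields exactly one row whose value is "ok".
    return [] if messages == ["ok"] else messages
```

Collecting every row keeps the individual problem descriptions available for the diagnostic detail column.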
--- a/my-deepagent/src/my_deepagent/cli/init.py
+++ b/my-deepagent/src/my_deepagent/cli/init.py
@@ -0,0 +1,39 @@
+"""mydeepagent init: first-run wizard."""
+
+from __future__ import annotations
+
+import typer
+from rich.console import Console
+
+from ..config import load_config
+from ..governance import has_consent, record_consent
+from ..i18n import t
+from ..keys import set_api_key
+from .doctor import doctor_command
+
+_CONSOLE = Console()
+
+
+def init_command() -> None:
+    config = load_config()
+    _CONSOLE.print(f"[bold cyan]{t('init.welcome')}[/]")
+    _CONSOLE.print()
+    if not has_consent(config.data_dir):
+        _CONSOLE.print(f"[yellow]{t('init.governance_title')}[/]")
+        _CONSOLE.print(t("init.governance_body"))
+        answer = typer.prompt(t("init.governance_prompt"))
+        if answer.strip().lower() != "yes":
+            _CONSOLE.print(f"[red]{t('init.governance_declined')}[/]")
+            raise typer.Exit(code=1)
+        record_consent(config.data_dir)
+    api_key = typer.prompt(t("init.api_key_prompt"), hide_input=True, default="")
+    if api_key.strip():
+        set_api_key("openrouter", api_key.strip())
+        _CONSOLE.print(f"[green]{t('init.api_key_saved')}[/]")
+    else:
+        _CONSOLE.print(f"[yellow]{t('init.api_key_empty')}[/]")
+    _CONSOLE.print()
+    _CONSOLE.print(t("init.doctor_running"))
+    doctor_command()
+    _CONSOLE.print()
+    _CONSOLE.print(f"[bold green]{t('init.done')}[/]")
--- a/my-deepagent/src/my_deepagent/cli/interactive.py
+++ b/my-deepagent/src/my_deepagent/cli/interactive.py
@@ -1 +1,367 @@
-"""CLI interactive subcommand. Implemented in Step 10."""
+"""mydeepagent (no subcommand) — interactive REPL.
+
+prompt_toolkit-based REPL. Slash commands for navigation; everything else
+goes to the bound agent. File refs ``@path/to/file.py`` are expanded into
+markdown code blocks inline before the message is sent.
+"""
+
+from __future__ import annotations
+
+import asyncio
+import re
+from datetime import UTC, datetime
+from pathlib import Path
+from typing import Any
+from uuid import UUID, uuid4
+
+from prompt_toolkit import PromptSession
+from prompt_toolkit.completion import WordCompleter
+from prompt_toolkit.history import FileHistory
+from rich.console import Console
+
+from ..audit import make_audit_recorder
+from ..budget import make_budget_tracker_from_config
+from ..config import Config, load_config
+from ..governance import require_consent
+from ..middleware.audit import AuditToolMiddleware
+from ..middleware.cost import CostMiddleware
+from ..monitoring.pricing import ModelPrice, PricingCache
+from ..persistence.db import Database
+from ..persona import Persona, load_personas_from_dir
+from ..session import build_agent
+from ..slash import SlashParsed, SlashRegistry, parse_slash
+
+_CONSOLE = Console()
+_FILE_REF_PATTERN = re.compile(r"(?<![\w./])@([\w./\-]+)")
+
+
+def _seed_root() -> Path:
+    return Path(__file__).resolve().parents[3] / "docs" / "schemas"
+
+
+def _history_path(config: Config) -> Path:
+    p = config.state_dir
+    p.mkdir(parents=True, exist_ok=True)
+    return p / "history.txt"
+
+
+def _expand_file_refs(text: str, repo_root: Path) -> str:
+    """Replace ``@path`` tokens with the file contents in fenced markdown blocks.
+
+    Silently skips paths that escape the repo root or don't exist.
+    """
+
+    def _replace(match: re.Match[str]) -> str:
+        rel = match.group(1)
+        target = (repo_root / rel).resolve()
+        try:
+            target.relative_to(repo_root.resolve())
+        except ValueError:
+            return match.group(0)
+        if not target.is_file():
+            return match.group(0)
+        try:
+            content = target.read_text(encoding="utf-8", errors="replace")
+        except OSError:
+            return match.group(0)
+        suffix = target.suffix.lstrip(".") or ""
+        return f"\n```{suffix}\n# {rel}\n{content}\n```\n"
+
+    return _FILE_REF_PATTERN.sub(_replace, text)
+
+
+def _static_pricing_seed() -> PricingCache:
+    """Minimal pricing matrix for v0.1.0 (full fetch is Step 12).
+
+    Unit: USD per 1,000 tokens.
+    """
+    cache = PricingCache()
+    cache.set(
+        [
+            ModelPrice("anthropic/claude-sonnet-4-6", 0.003, 0.015, 200_000),
+            ModelPrice("anthropic/claude-haiku-4-5", 0.001, 0.005, 200_000),
+            ModelPrice("anthropic/claude-opus-4-1", 0.015, 0.075, 200_000),
+            ModelPrice("deepseek/deepseek-chat", 0.00028, 0.00112, 64_000),
+        ]
+    )
+    return cache
+
+
+def _now_iso() -> str:
+    return datetime.now(UTC).isoformat(timespec="seconds")
+
+
+class InteractiveSession:
+    """Holds REPL state: current persona, current model override, history, agent."""
+
+    def __init__(
+        self,
+        config: Config,
+        personas: list[Persona],
+        db: Database,
+        pricing: PricingCache,
+        repo_root: Path,
+        session_id: UUID,
+    ) -> None:
+        self.config = config
+        self.personas = personas
+        self.db = db
+        self.pricing = pricing
+        self.repo_root = repo_root
+        self.session_id = session_id
+        self._model_override: str | None = None
+        self._persona = self._default_persona()
+        self._agent: Any | None = None
+
+    def _default_persona(self) -> Persona:
+        name = self.config.default_persona
+        for p in self.personas:
+            if p.name == name:
+                return p
+        if not self.personas:
+            raise RuntimeError(
+                "no personas seeded; run `mydeepagent init` or seed docs/schemas/personas/"
+            )
+        return self.personas[0]
+
+    @property
+    def persona(self) -> Persona:
+        return self._persona
+
+    @property
+    def model_override(self) -> str | None:
+        return self._model_override
+
+    def set_persona(self, name: str) -> Persona:
+        for p in self.personas:
+            if p.name == name or f"{p.name}@{p.version}" == name:
+                self._persona = p
+                self._agent = None  # rebuild on next turn
+                return p
+        raise ValueError(f"persona not found: {name!r}")
+
+    def set_model(self, model: str | None) -> None:
+        self._model_override = model
+        self._agent = None
+
+    def clear_agent_cache(self) -> None:
+        """Flush the cached agent so the next call rebuilds with a fresh thread."""
+        self._agent = None
+
+    def build_agent_if_needed(self) -> Any:
+        if self._agent is not None:
+            return self._agent
+        budget = make_budget_tracker_from_config(self.db, self.config)
+        cost_mw = CostMiddleware(
+            pricing=self.pricing,
+            model_name=self._model_override or self._persona.model,
+            interactive_session_id=self.session_id,
+            persona_name=self._persona.name,
+            budget_tracker=budget,
+        )
+        audit_mw = AuditToolMiddleware(
+            interactive_session_id=self.session_id,
+            file_recorder=make_audit_recorder(self.config.state_dir),
+        )
+        self._agent = build_agent(
+            self._persona,
+            self.config,
+            root_dir=self.repo_root,
+            middleware=[cost_mw, audit_mw],
+            model_override=self._model_override,
+        )
+        return self._agent
+
+
+def _register_navigation_slash(reg: SlashRegistry, sess: InteractiveSession) -> None:
+    """Register /quit, /exit, /help, /clear slash handlers."""
+
+    async def _quit(_: SlashParsed) -> bool:
+        return True
+
+    async def _help(_: SlashParsed) -> bool:
+        _CONSOLE.print("[bold]Slash commands:[/]")
+        for name, desc in reg.all_help():
+            _CONSOLE.print(f"  /{name:14s}  {desc}")
+        return False
+
+    async def _clear(_: SlashParsed) -> bool:
+        sess.clear_agent_cache()
+        _CONSOLE.print("[dim]context cleared (new session thread)[/]")
+        return False
+
+    reg.register("quit", _quit, help="exit the REPL")
+    reg.register("exit", _quit, help="alias for /quit")
+    reg.register("help", _help, help="show slash commands")
+    reg.register("clear", _clear, help="clear conversation context")
+
+
+def _register_persona_slash(reg: SlashRegistry, sess: InteractiveSession) -> None:
+    """Register /agent and /model slash handlers."""
+
+    async def _agent_cmd(cmd: SlashParsed) -> bool:
+        if not cmd.args:
+            _CONSOLE.print(f"current: [cyan]{sess.persona.name}@{sess.persona.version}[/]")
+            for p in sess.personas:
+                _CONSOLE.print(f"  - {p.name}@{p.version}  ({p.backend.value})")
+            return False
+        try:
+            new = sess.set_persona(cmd.args[0])
+            _CONSOLE.print(f"[green]switched persona → {new.name}@{new.version}[/]")
+        except ValueError as e:
+            _CONSOLE.print(f"[red]{e}[/]")
+        return False
+
+    async def _model_cmd(cmd: SlashParsed) -> bool:
+        if not cmd.args:
+            cur = sess.model_override or sess.persona.model
+            _CONSOLE.print(f"current model: [cyan]{cur}[/]")
+            return False
+        if cmd.args[0] in ("-", "reset"):
+            sess.set_model(None)
+            _CONSOLE.print("[green]model override cleared[/]")
+        else:
+            sess.set_model(cmd.args[0])
+            _CONSOLE.print(f"[green]model → {cmd.args[0]}[/]")
+        return False
+
+    reg.register("agent", _agent_cmd, help="list or switch persona: /agent [name]")
+    reg.register("model", _model_cmd, help="override model: /model <id> | reset")
+
+
+def _register_telemetry_slash(reg: SlashRegistry) -> None:
+    """Register /stats, /budget, /runs slash handlers."""
+
+    async def _stats(_: SlashParsed) -> bool:
+        from .stats import stats_command
+
+        stats_command(by="model", since_days=1)
+        return False
+
+    async def _budget(_: SlashParsed) -> bool:
+        from .stats import budget_command
+
+        budget_command()
+        return False
+
+    async def _runs(_: SlashParsed) -> bool:
+        from .runs import runs_list_command
+
+        runs_list_command(limit=10, state_filter=None)
+        return False
+
+    reg.register("stats", _stats, help="LLM-call stats (last 24h)")
+    reg.register("budget", _budget, help="budget ledger")
+    reg.register("runs", _runs, help="list recent workflow runs")
+
+
+def _register_slash(reg: SlashRegistry, sess: InteractiveSession) -> None:
+    _register_navigation_slash(reg, sess)
+    _register_persona_slash(reg, sess)
+    _register_telemetry_slash(reg)
+
+
+def _completer(personas: list[Persona], slash_names: list[str]) -> WordCompleter:
+    words = [f"/{n}" for n in slash_names]
+    words += [p.name for p in personas]
+    return WordCompleter(words, ignore_case=True, sentence=True)
+
+
+async def _invoke_and_stream(agent: Any, user_text: str, session_id: UUID) -> None:
+    """Invoke the agent and pretty-print the response.
+
+    v0.1 keeps it simple — full ainvoke, then print the final message.
+    Token-level streaming via astream is a Step 16 polish.
+    """
+    result = await agent.ainvoke(
+        {"messages": [{"role": "user", "content": user_text}]},
+        config={"configurable": {"thread_id": str(session_id)}},
+    )
+    messages = result.get("messages", []) if isinstance(result, dict) else []
+    if not messages:
+        return
+    last = messages[-1]
+    content: Any = getattr(last, "content", "") or ""
+    if isinstance(content, list):
+        content = "\n".join(
+            (c.get("text", str(c)) if isinstance(c, dict) else str(c)) for c in content
+        )
+    _CONSOLE.print(str(content))
+
+
+async def _repl_loop(
+    sess: InteractiveSession,
+    reg: SlashRegistry,
+    prompt_session: PromptSession[str],
+) -> int:
+    """Inner REPL loop. Returns 0 on clean exit, non-zero on error."""
+    while True:
+        try:
+            line = await prompt_session.prompt_async("» ")
+        except (EOFError, KeyboardInterrupt):
+            _CONSOLE.print()
+            return 0
+        line = (line or "").strip()
+        if not line:
+            continue
+        parsed = parse_slash(line)
+        if parsed is not None:
+            if parsed.name == "":
+                _CONSOLE.print("[dim]empty slash command; try /help[/]")
+                continue
+            if parsed.name not in reg.names:
+                _CONSOLE.print(f"[yellow]unknown command: /{parsed.name}[/]")
+                continue
+            if await reg.dispatch(parsed):
+                return 0
+            continue
+        # Forward to agent.
+        expanded = _expand_file_refs(line, sess.repo_root)
+        agent = sess.build_agent_if_needed()
+        try:
+            await _invoke_and_stream(agent, expanded, sess.session_id)
+        except Exception as e:
+            _CONSOLE.print(f"[red]agent error:[/] {type(e).__name__}: {e}")
+
+
+async def _interactive_loop_async(persona_override: str | None, model_override: str | None) -> int:
+    config = load_config()
+    require_consent(config.data_dir)
+    db = Database(config.database_url)
+    await db.init_schema()
+    personas = load_personas_from_dir(_seed_root() / "personas")
+    if not personas:
+        _CONSOLE.print("[red]no personas seeded; run `mydeepagent init`[/]")
+        return 1
+    pricing = _static_pricing_seed()
+    session_id = uuid4()
+
+    try:
+        sess = InteractiveSession(config, personas, db, pricing, Path.cwd(), session_id)
+        if persona_override:
+            try:
+                sess.set_persona(persona_override)
+            except ValueError as e:
+                _CONSOLE.print(f"[red]{e}[/]")
+                return 1
+        if model_override:
+            sess.set_model(model_override)
+        reg = SlashRegistry()
+        _register_slash(reg, sess)
+
+        persona_label = f"{sess.persona.name}@{sess.persona.version}"
+        _CONSOLE.print(f"[bold cyan]my-deepagent[/] — persona [cyan]{persona_label}[/]")
+        _CONSOLE.print("[dim]type /help for commands, /quit to exit[/]")
+
+        prompt_session: PromptSession[str] = PromptSession(
+            history=FileHistory(str(_history_path(config))),
+            completer=_completer(personas, reg.names),
+        )
+        return await _repl_loop(sess, reg, prompt_session)
+    finally:
+        await db.dispose()
+
+
+def interactive_command(persona: str | None = None, model: str | None = None) -> int:
+    """Entry point for the interactive REPL. Returns an exit code."""
+    return asyncio.run(_interactive_loop_async(persona, model))
--- a/my-deepagent/src/my_deepagent/cli/keys_cmd.py
+++ b/my-deepagent/src/my_deepagent/cli/keys_cmd.py
@@ -0,0 +1,40 @@
+"""login / logout / keys list commands."""
+
+from __future__ import annotations
+
+import typer
+from rich.console import Console
+
+from ..i18n import t
+from ..keys import delete_api_key, get_api_key, list_providers, mask, set_api_key
+
+_CONSOLE = Console()
+
+
+def login_command(provider: str) -> None:
+    value = typer.prompt(t("login.prompt", provider=provider), hide_input=True, default="")
+    if not value.strip():
+        _CONSOLE.print(f"[yellow]{t('login.empty')}[/]")
+        raise typer.Exit(code=1)
+    set_api_key(provider, value.strip())
+    _CONSOLE.print(f"[green]{t('login.saved', provider=provider)}[/]")
+
+
+def logout_command(provider: str) -> None:
+    removed = delete_api_key(provider)
+    if removed:
+        _CONSOLE.print(f"[green]{t('logout.removed', provider=provider)}[/]")
+    else:
+        _CONSOLE.print(f"[yellow]{t('logout.not_found', provider=provider)}[/]")
+
+
+def keys_list_command() -> None:
+    _CONSOLE.print(t("keys.header"))
+    found = False
+    for provider in list_providers():
+        value = get_api_key(provider)
+        if value:
+            _CONSOLE.print(t("keys.entry", provider=provider, masked=mask(value)))
+            found = True
+    if not found:
+        _CONSOLE.print(t("keys.none"))
--- a/my-deepagent/src/my_deepagent/cli/main.py
+++ b/my-deepagent/src/my_deepagent/cli/main.py
@@ -1 +1,150 @@
-"""Typer CLI entry point. Filled in Step 6."""
+"""my-deepagent CLI entry point."""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import typer
+
+from .doctor import doctor_command
+from .init import init_command
+from .keys_cmd import keys_list_command, login_command, logout_command
+
+app = typer.Typer(no_args_is_help=False, add_completion=True)
+
+runs_app = typer.Typer(help="Inspect or resume past runs.")
+
+
+@runs_app.command("list")
+def runs_list(
+    limit: int = typer.Option(20, help="Number of runs to show"),
+    state: str | None = typer.Option(None, help="Filter by state"),
+) -> None:
+    """List recent runs."""
+    from .runs import runs_list_command
+
+    runs_list_command(limit, state)
+
+
+@runs_app.command("show")
+def runs_show(run_id: str = typer.Argument(...)) -> None:
+    """Show details for a specific run."""
+    from .runs import runs_show_command
+
+    runs_show_command(run_id)
+
+
+@runs_app.command("resume")
+def runs_resume(run_id: str = typer.Argument(...)) -> None:
+    """Resume a paused run (v0.1.0: not implemented — shows status only)."""
+    from .runs import runs_resume_command
+
+    runs_resume_command(run_id)
+
+
+app.add_typer(runs_app, name="runs")
+
+
+@app.command()
+def init() -> None:
+    """First-run setup: governance consent + API key + doctor."""
+    init_command()
+
+
+@app.command()
+def login(provider: str = typer.Argument("openrouter")) -> None:
+    """Store an API key for the given provider in the OS keyring."""
+    login_command(provider)
+
+
+@app.command()
+def logout(provider: str = typer.Argument("openrouter")) -> None:
+    """Remove a stored API key from the OS keyring."""
+    logout_command(provider)
+
+
+@app.command(name="keys")
+def keys_list() -> None:
+    """List registered providers (masked)."""
+    keys_list_command()
+
+
+@app.command()
+def doctor() -> None:
+    """Run the 8-check environment diagnostics and refresh the cached OpenRouter pricing."""
+    doctor_command()
+
+
+@app.command(name="run")
+def run(
+    workflow_path: Path = typer.Argument(..., help="Path to the workflow yaml"),  # noqa: B008
+    repo: Path = typer.Option(Path.cwd(), help="Repo root"),  # noqa: B008
+    base_branch: str = typer.Option("main", help="Base branch"),
+    no_preview: bool = typer.Option(False, "--no-preview", help="Skip cost preview"),
+) -> None:
+    """Execute a workflow template end-to-end."""
+    from .run import run_command
+
+    run_command(workflow_path, repo, base_branch, no_preview)
+
+
+@app.command()
+def stats(
+    by: str = typer.Option("model", help="model | persona | day"),
+    since_days: int = typer.Option(7, help="Window size in days"),
+) -> None:
+    """Aggregate LLM-call stats from the ledger."""
+    from .stats import stats_command
+
+    stats_command(by, since_days)
+
+
+@app.command()
+def budget() -> None:
+    """Show the current budget ledger (per-scope spend / cap)."""
+    from .stats import budget_command
+
+    budget_command()
+
+
+@app.command(name="costs")
+def costs() -> None:
+    """Alias for `stats --by day` over the last 30 days."""
+    from .stats import stats_command
+
+    stats_command(by="day", since_days=30)
+
+
+@app.command(name="pricing")
+def pricing() -> None:
+    """Show cached OpenRouter pricing matrix (populated by `doctor`)."""
+    from .stats import pricing_command
+
+    pricing_command()
+
+
+@app.callback(invoke_without_command=True)
+def main(
+    ctx: typer.Context,
+    agent: str | None = typer.Option(None, "--agent", help="Start with a specific persona"),
+    model: str | None = typer.Option(None, "--model", help="Model override"),
+) -> None:
+    from ..logging import configure_logging
+
+    try:
+        from ..config import load_config
+
+        cfg = load_config()
+        configure_logging(level=cfg.log_level, json_output=False)
+    except Exception:
+        configure_logging(level="info", json_output=False)
+
+    if ctx.invoked_subcommand is None:
+        from .interactive import interactive_command
+
+        code = interactive_command(agent, model)
+        raise typer.Exit(code=code)
+
+
+if __name__ == "__main__":
+    app()
--- a/my-deepagent/src/my_deepagent/cli/run.py
+++ b/my-deepagent/src/my_deepagent/cli/run.py
@@ -1 +1,194 @@
-"""CLI run command implementation. Implemented in Step 6."""
+"""mydeepagent run <workflow.yaml> — execute a workflow end-to-end."""
+
+from __future__ import annotations
+
+import asyncio
+from pathlib import Path
+
+import typer
+from rich.console import Console
+from rich.table import Table
+from sqlalchemy import select
+
+from ..artifact_schema import ArtifactSchemaRegistry
+from ..binding import BackendAvailability, PersonaConsentStore, bind_personas
+from ..budget import BudgetTracker, make_budget_tracker_from_config
+from ..config import Config, load_config
+from ..engine import WorkflowEngine
+from ..enums import Backend
+from ..governance import require_consent
+from ..monitoring.cost_estimator import WorkflowCostEstimate, estimate_workflow
+from ..monitoring.pricing import ModelPrice, PricingCache
+from ..persistence.db import Database
+from ..persistence.models import ModelPricingRow
+from ..persona import load_personas_from_dir
+from ..tui.approval import cli_approval_callback
+from ..workflow import load_workflow_yaml
+
+_CONSOLE = Console()
+
+
+def run_command(
+    workflow_path: Path,
+    repo: Path,
+    base_branch: str,
+    no_preview: bool = False,
+) -> None:
+    """Synchronous CLI wrapper for the async engine."""
+    asyncio.run(_run_async(workflow_path, repo, base_branch, no_preview))
+
+
+async def cli_budget_prompt(scope: str, projected: float, cap: float) -> bool:
+    """Prompt the user to extend the budget cap when it is hit."""
+    _CONSOLE.print()
+    _CONSOLE.print(
+        f"[yellow]Budget cap reached[/]: scope={scope} projected=${projected:.4f} cap=${cap:.4f}"
+    )
+    return typer.confirm("Extend cap and proceed?", default=False)
+
+
+def _static_pricing_seed_fallback() -> list[ModelPrice]:
+    """Return seed model prices used when the model_pricing DB table is empty.
+
+    Unit: USD per 1,000 tokens. (OpenRouter publishes per-token; we store per-1K to keep
+    cost arithmetic in a more readable range. ``compute_cost(model, in, out)`` divides
+    by 1000.)
+    """
+    return [
+        ModelPrice("anthropic/claude-sonnet-4-6", 0.003, 0.015, 200_000),
+        ModelPrice("anthropic/claude-haiku-4-5", 0.001, 0.005, 200_000),
+        ModelPrice("anthropic/claude-opus-4-1", 0.015, 0.075, 200_000),
+        ModelPrice("deepseek/deepseek-chat", 0.00028, 0.00112, 64_000),
+    ]
+
+
+async def _load_pricing_from_db(config: Config, db: Database) -> PricingCache:
+    """Load pricing from the persisted model_pricing table.
+
+    Falls back to the static seed when the table is empty (doctor not yet run).
+    """
+    async with db.session() as s:
+        rows = list((await s.execute(select(ModelPricingRow))).scalars().all())
+    cache = PricingCache()
+    if rows:
+        cache.set(
+            [
+                ModelPrice(
+                    model=r.model,
+                    input_per_1k_usd=r.input_per_1k_usd,
+                    output_per_1k_usd=r.output_per_1k_usd,
+                    context_length=r.context_length,
+                )
+                for r in rows
+            ]
+        )
+        return cache
+    cache.set(_static_pricing_seed_fallback())
+    return cache
+
+
+def _print_preview(estimate: WorkflowCostEstimate, config: Config) -> None:
+    cfg = config
+    table = Table(title="Cost preview")
+    table.add_column("Phase")
+    table.add_column("Persona")
+    table.add_column("Model")
+    table.add_column("In/Out tokens", justify="right")
+    table.add_column("Est. cost", justify="right")
+    for p in estimate.phases:
+        cost_str = f"${p.estimated_cost_usd:.4f}"
+        table.add_row(
+            p.phase_key,
+            p.persona_name,
+            p.model,
+            f"{p.estimated_input_tokens}/{p.estimated_output_tokens}",
+            cost_str,
+        )
+    _CONSOLE.print(table)
+    _CONSOLE.print(f"Total estimated: [bold]${estimate.total_usd:.4f}[/]")
+    _CONSOLE.print(
+        f"Run cap: [bold]${cfg.budget_run_usd}[/] | Daily cap: [bold]${cfg.budget_daily_usd}[/]"
+    )
+
+
+async def _run_async(
+    workflow_path: Path,
+    repo: Path,
+    base_branch: str,
+    no_preview: bool,
+) -> None:
+    config = load_config()
+    require_consent(config.data_dir)
+
+    template = load_workflow_yaml(workflow_path)
+
+    # Locate seed schemas under <project root>/docs/schemas (assumes a source checkout layout)
+    seed_root = Path(__file__).resolve().parents[3] / "docs" / "schemas"
+    personas_dir = seed_root / "personas"
+    artifacts_root = seed_root / "artifacts"
+
+    personas = load_personas_from_dir(personas_dir)
+    registry = ArtifactSchemaRegistry(roots=[artifacts_root])
+
+    db = Database(config.database_url)
+    await db.init_schema()
+
+    # Crash recovery: mark non-terminal runs from a previous process as failed
+    # so the active-run uniqueness slot is freed before starting new work.
+    from ..recovery import sweep_orphan_runs
+
+    report = await sweep_orphan_runs(db)
+    if report.total:
+        _CONSOLE.print(
+            f"[yellow]recovery: marked {len(report.failed_runs)} orphan run(s) "
+            f"and {len(report.failed_phases)} phase(s) as failed[/]"
+        )
+
+    try:
+        consent_store = PersonaConsentStore(config.data_dir / "persona-consents.json")
+        bindings = bind_personas(
+            template,
+            personas,
+            BackendAvailability(available_backends=frozenset(Backend)),
+            consent_store,
+        )
+
+        # Pricing + cost preview — use DB-cached prices; fall back to static seed
+        pricing = await _load_pricing_from_db(config, db)
+
+        if not no_preview:
+            estimate = estimate_workflow(template, bindings, pricing)
+            _print_preview(estimate, config)
+            if not typer.confirm("Proceed?", default=True):
+                raise typer.Exit(code=0)
+
+        budget: BudgetTracker = make_budget_tracker_from_config(
+            db, config, prompt_callback=cli_budget_prompt
+        )
+        await budget.init()
+
+        engine = WorkflowEngine(
+            db=db,
+            config=config,
+            persona_pool=personas,
+            artifact_registry=registry,
+            consent_store=consent_store,
+            available_backends=BackendAvailability(available_backends=frozenset(Backend)),
+            approval_callback=cli_approval_callback,
+            budget_tracker=budget,
+            pricing=pricing,
+        )
+        engine.install_signal_handlers()
+        result = await engine.run(
+            template,
+            repo_path=repo,
+            base_branch=base_branch,
+        )
+        _CONSOLE.print(f"[bold]{result.state.value}[/] run_id={result.run_id}")
+        if result.final_report_path:
+            _CONSOLE.print(f"report: {result.final_report_path}")
+        if result.error:
+            _CONSOLE.print(f"[red]error[/]: {result.error}")
+            raise typer.Exit(code=1)
+    finally:
+        await db.dispose()
--- a/my-deepagent/src/my_deepagent/cli/runs.py
+++ b/my-deepagent/src/my_deepagent/cli/runs.py
@@ -0,0 +1,204 @@
+"""mydeepagent runs list / show / resume — read-only-ish run history queries."""
+
+from __future__ import annotations
+
+import asyncio
+from pathlib import Path
+from uuid import UUID
+
+import typer
+from rich.console import Console
+from rich.table import Table
+from sqlalchemy import desc, select
+
+from ..config import load_config
+from ..persistence.db import Database
+from ..persistence.models import (
+    ArtifactRow,
+    RunEventRow,
+    RunPhaseRow,
+    RunRow,
+)
+
+_CONSOLE = Console()
+
+
+def runs_list_command(limit: int = 20, state_filter: str | None = None) -> None:
+    asyncio.run(_runs_list_async(limit, state_filter))
+
+
+def runs_show_command(run_id: str) -> None:
+    asyncio.run(_runs_show_async(run_id))
+
+
+def runs_resume_command(run_id: str) -> None:
+    asyncio.run(_runs_resume_async(run_id))
+
+
+async def _runs_list_async(limit: int, state_filter: str | None) -> None:
+    config = load_config()
+    db = Database(config.database_url)
+    await db.init_schema()
+    try:
+        async with db.session() as s:
+            stmt = select(RunRow).order_by(desc(RunRow.created_at)).limit(limit)
+            if state_filter:
+                stmt = stmt.where(RunRow.state == state_filter)
+            rows = (await s.execute(stmt)).scalars().all()
+        if not rows:
+            _CONSOLE.print("[dim](no runs)[/]")
+            return
+        table = Table(title=f"Recent runs (latest {len(rows)})")
+        table.add_column("Run ID")
+        table.add_column("State")
+        table.add_column("Repo")
+        table.add_column("Branch")
+        table.add_column("Created")
+        table.add_column("Ended")
+        for r in rows:
+            table.add_row(
+                str(r.id)[:8] + "…",
+                r.state,
+                Path(r.repo_path).name,
+                r.base_branch,
+                (r.created_at or "")[:19],
+                r.ended_at[:19] if r.ended_at else "—",
+            )
+        _CONSOLE.print(table)
+    finally:
+        await db.dispose()
+
+
+async def _runs_show_async(run_id: str) -> None:
+    full_id = await _resolve_run_id(run_id)
+    config = load_config()
+    db = Database(config.database_url)
+    await db.init_schema()
+    try:
+        async with db.session() as s:
+            run = await s.get(RunRow, full_id)
+            if run is None:
+                _CONSOLE.print(f"[red]run not found:[/] {run_id}")
+                raise typer.Exit(code=1)
+            phases = (
+                (
+                    await s.execute(
+                        select(RunPhaseRow)
+                        .where(RunPhaseRow.run_id == full_id)
+                        .order_by(RunPhaseRow.seq)
+                    )
+                )
+                .scalars()
+                .all()
+            )
+            artifacts = (
+                (await s.execute(select(ArtifactRow).where(ArtifactRow.run_id == full_id)))
+                .scalars()
+                .all()
+            )
+            events = (
+                (
+                    await s.execute(
+                        select(RunEventRow)
+                        .where(RunEventRow.run_id == full_id)
+                        .order_by(RunEventRow.seq)
+                        .limit(50)
+                    )
+                )
+                .scalars()
+                .all()
+            )
+
+        _CONSOLE.print(f"[bold]Run {run.id}[/]")
+        _CONSOLE.print(f"  state: [cyan]{run.state}[/]")
+        _CONSOLE.print(f"  repo: {run.repo_path}@{run.base_branch}")
+        _CONSOLE.print(f"  worktree: {run.worktree_root}")
+        _CONSOLE.print(f"  created: {run.created_at}")
+        _CONSOLE.print(f"  ended: {run.ended_at or '—'}")
+        if run.final_report_path:
+            _CONSOLE.print(f"  report: {run.final_report_path}")
+        _CONSOLE.print()
+        _CONSOLE.print("[bold]Phases[/]")
+        for ph in phases:
+            _CONSOLE.print(f"  - {ph.phase_key:20s} state={ph.state:15s} attempts={ph.attempts}")
+        if artifacts:
+            _CONSOLE.print()
+            _CONSOLE.print("[bold]Artifacts[/]")
+            for a in artifacts:
+                _CONSOLE.print(f"  - {a.path} (schema={a.schema_id}, valid={a.valid})")
+        _CONSOLE.print()
+        _CONSOLE.print(f"[bold]Events (first {len(events)})[/]")
+        for ev in events:
+            _CONSOLE.print(f"  [{ev.seq:4d}] {ev.ts}  {ev.type}")
+    finally:
+        await db.dispose()
+
+
+async def _runs_resume_async(run_id: str) -> None:
+    """v0.1.0: resume is not implemented.
+
+    Surfaces the run state and hints at next steps. Future v0.2 implementation:
+    rehydrate the workflow template by template_hash, replay phase loop from the
+    first non-completed phase using the existing checkpointer.
+    """
+    full_id = await _resolve_run_id(run_id)
+    config = load_config()
+    db = Database(config.database_url)
+    await db.init_schema()
+    try:
+        async with db.session() as s:
+            run = await s.get(RunRow, full_id)
+            if run is None:
+                _CONSOLE.print(f"[red]run not found:[/] {run_id}")
+                raise typer.Exit(code=1)
+        if run.state in ("completed", "failed", "aborted"):
+            _CONSOLE.print(
+                f"[yellow]Run {run.id} is already terminal ({run.state}). "
+                "Start a fresh run with `mydeepagent run <workflow.yaml>`.[/]"
+            )
+            raise typer.Exit(code=1)
+        _CONSOLE.print(
+            "[yellow]Resume is not implemented in v0.1.0. This run is non-terminal; the next "
+            "`mydeepagent run` will mark it as failed at startup, so relaunch the workflow.[/]"
+        )
+        raise typer.Exit(code=2)
+    finally:
+        await db.dispose()
+
+
+async def _resolve_run_id(prefix_or_full: str) -> str:
+    """Accept either a full UUID or a 6+ char prefix and return the canonical full id."""
+    try:
+        return str(UUID(prefix_or_full))
+    except ValueError:
+        pass
+
+    if len(prefix_or_full) < 6:
+        _CONSOLE.print(
+            f"[red]run id too short (need full UUID or >=6-char prefix):[/] {prefix_or_full}"
+        )
+        raise typer.Exit(code=2)
+
+    config = load_config()
+    db = Database(config.database_url)
+    await db.init_schema()
+    try:
+        async with db.session() as s:
+            rows = (
+                (
+                    await s.execute(
+                        select(RunRow.id).where(RunRow.id.like(f"{prefix_or_full}%")).limit(2)
+                    )
+                )
+                .scalars()
+                .all()
+            )
+        if not rows:
+            _CONSOLE.print(f"[red]no run matches prefix:[/] {prefix_or_full}")
+            raise typer.Exit(code=1)
+        if len(rows) > 1:
+            _CONSOLE.print(f"[red]ambiguous prefix matches >1 run:[/] {prefix_or_full}")
+            raise typer.Exit(code=1)
+        return rows[0]
+    finally:
+        await db.dispose()
--- a/my-deepagent/src/my_deepagent/cli/stats.py
+++ b/my-deepagent/src/my_deepagent/cli/stats.py
@@ -1 +1,179 @@
-"""CLI stats command for usage summary. Implemented in Step 12."""
+"""mydeepagent stats / costs / budget / pricing — read-only ledger + history queries."""
+
+from __future__ import annotations
+
+import asyncio
+from collections.abc import Sequence
+from datetime import UTC, datetime, timedelta
+from typing import Any
+
+import typer
+from rich.console import Console
+from rich.table import Table
+from sqlalchemy import func, select
+
+from ..config import load_config
+from ..persistence.db import Database
+from ..persistence.models import BudgetLedgerRow, LlmCallRow, ModelPricingRow
+
+_CONSOLE = Console()
+
+
+def stats_command(by: str = "model", since_days: int = 7) -> None:
+    """Synchronous CLI wrapper for the async stats query."""
+    asyncio.run(_stats_async(by, since_days))
+
+
+async def _stats_async(by: str, since_days: int) -> None:
+    config = load_config()
+    db = Database(config.database_url)
+    await db.init_schema()
+    try:
+        since = (datetime.now(UTC) - timedelta(days=since_days)).isoformat(timespec="seconds")
+        async with db.session() as s:
+            if by == "model":
+                rows: Sequence[Any] = (
+                    await s.execute(
+                        select(
+                            LlmCallRow.model,
+                            func.count().label("calls"),
+                            func.sum(LlmCallRow.input_tokens).label("input"),
+                            func.sum(LlmCallRow.output_tokens).label("output"),
+                            func.sum(LlmCallRow.cost_usd_total).label("cost"),
+                        )
+                        .where(LlmCallRow.ts >= since)
+                        .group_by(LlmCallRow.model)
+                    )
+                ).all()
+                _render_stats_table(
+                    "Stats by model",
+                    rows,
+                    ["Model", "Calls", "Input", "Output", "Cost ($)"],
+                )
+            elif by == "persona":
+                rows = (
+                    await s.execute(
+                        select(
+                            LlmCallRow.persona_name,
+                            func.count().label("calls"),
+                            func.sum(LlmCallRow.cost_usd_total).label("cost"),
+                        )
+                        .where(LlmCallRow.ts >= since)
+                        .group_by(LlmCallRow.persona_name)
+                    )
+                ).all()
+                _render_stats_table(
+                    "Stats by persona",
+                    rows,
+                    ["Persona", "Calls", "Cost ($)"],
+                )
+            elif by == "day":
+                rows = (
+                    await s.execute(
+                        select(
+                            func.substr(LlmCallRow.ts, 1, 10).label("day"),
+                            func.count().label("calls"),
+                            func.sum(LlmCallRow.cost_usd_total).label("cost"),
+                        )
+                        .where(LlmCallRow.ts >= since)
+                        .group_by("day")
+                    )
+                ).all()
+                _render_stats_table(
+                    "Stats by day",
+                    rows,
+                    ["Day", "Calls", "Cost ($)"],
+                )
+            else:
+                typer.echo(f"unknown --by option: {by!r}", err=True)
+                raise typer.Exit(code=2)
+    finally:
+        await db.dispose()
+
+
+def budget_command() -> None:
+    """Synchronous CLI wrapper for the async budget ledger query."""
+    asyncio.run(_budget_async())
+
+
+async def _budget_async() -> None:
+    config = load_config()
+    db = Database(config.database_url)
+    await db.init_schema()
+    try:
+        async with db.session() as s:
+            rows = list((await s.execute(select(BudgetLedgerRow))).scalars().all())
+        if not rows:
+            _CONSOLE.print("[dim](no budget activity yet)[/]")
+            return
+        table = Table(title="Budget ledger")
+        table.add_column("Scope")
+        table.add_column("Spent ($)", justify="right")
+        table.add_column("Cap ($)", justify="right")
+        table.add_column("Remaining ($)", justify="right")
+        table.add_column("Last update")
+        for row in rows:
+            remaining = (
+                "—" if row.cap_usd is None else f"{max(0.0, row.cap_usd - row.spent_usd):.4f}"
+            )
+            cap = "—" if row.cap_usd is None else f"{row.cap_usd:.4f}"
+            table.add_row(
+                row.scope,
+                f"{row.spent_usd:.4f}",
+                cap,
+                remaining,
+                row.last_updated,
+            )
+        _CONSOLE.print(table)
+    finally:
+        await db.dispose()
+
+
+def pricing_command() -> None:
+    """Show cached OpenRouter pricing matrix (populated by `doctor`)."""
+    asyncio.run(_pricing_async())
+
+
+async def _pricing_async() -> None:
+    config = load_config()
+    db = Database(config.database_url)
+    await db.init_schema()
+    try:
+        async with db.session() as s:
+            rows = list(
+                (await s.execute(select(ModelPricingRow).order_by(ModelPricingRow.model)))
+                .scalars()
+                .all()
+            )
+        if not rows:
+            _CONSOLE.print("[dim](no pricing data — run `mydeepagent doctor` to fetch)[/]")
+            return
+        table = Table(title="OpenRouter pricing (per 1K tokens, USD)")
+        table.add_column("Model")
+        table.add_column("Input", justify="right")
+        table.add_column("Output", justify="right")
+        table.add_column("Context", justify="right")
+        table.add_column("Fetched")
+        for r in rows:
+            table.add_row(
+                r.model,
+                f"{r.input_per_1k_usd:.4f}",
+                f"{r.output_per_1k_usd:.4f}",
+                str(r.context_length),
+                (r.fetched_at or "")[:19],
+            )
+        _CONSOLE.print(table)
+    finally:
+        await db.dispose()
+
+
+def _render_stats_table(title: str, rows: Sequence[Any], headers: list[str]) -> None:
+    if not rows:
+        _CONSOLE.print("[dim](no data for the past period)[/]")
+        return
+    table = Table(title=title)
+    for h in headers:
+        table.add_column(h)
+    for row in rows:
+        table.add_row(*[str(v if v is not None else "") for v in row])
+    _CONSOLE.print(table)
--- a/my-deepagent/src/my_deepagent/engine.py
+++ b/my-deepagent/src/my_deepagent/engine.py
@@ -1 +1,917 @@
-"""LangGraph run engine orchestrator. Implemented in Step 7."""
+"""WorkflowEngine: orchestrates run lifecycle, phase loop, artifact validation, approval gate."""
+
+from __future__ import annotations
+
+import asyncio
+import json
+import signal
+from contextlib import suppress
+from dataclasses import dataclass
+from datetime import UTC, datetime
+from pathlib import Path
+from typing import Any
+from uuid import UUID, uuid4
+
+from sqlalchemy import select
+
+from .artifact_schema import ArtifactSchemaRegistry
+from .audit import make_audit_recorder
+from .binding import (
+    BackendAvailability,
+    Binding,
+    BindingOverride,
+    PersonaConsentStore,
+    bind_personas,
+)
+from .budget import BudgetTracker
+from .config import Config
+from .enums import ApprovalDecisionAction, ApprovalState, RunPhaseState, RunState
+from .errors import MyDeepAgentError
+from .hash import sha256
+from .middleware.artifact_watcher import ArtifactWatcherMiddleware
+from .middleware.audit import AuditToolMiddleware
+from .middleware.cost import CostMiddleware
+from .monitoring.pricing import PricingCache
+from .persistence.db import Database
+from .persistence.models import (
+    AgentPersonaRow,
+    ApprovalDecisionRow,
+    ApprovalRequestRow,
+    ArtifactRow,
+    LlmCallRow,
+    RunBindingRow,
+    RunEventRow,
+    RunInputRow,
+    RunPhaseRow,
+    RunRow,
+    WorkflowTemplateRow,
+)
+from .persona import Persona
+from .run_event import RunEventType, run_idempotency_key
+from .session import build_agent
+from .workflow import WorkflowPhase, WorkflowTemplate
+
+# ApprovalCallback type: async (request_payload: dict, gates: list[str]) -> ApprovalDecisionAction
+ApprovalCallback = Any  # Callable[[dict, list[str]], Awaitable[ApprovalDecisionAction]]
+
+_DEFAULT_PHASE_TIMEOUT_SECONDS = 300  # 5 minutes
+
+
+@dataclass(frozen=True)
+class RunResult:
+    run_id: UUID
+    state: RunState
+    final_report_path: Path | None
+    error: str | None = None
+
+
+class _PhaseAbortedError(Exception):
+    """Stops the phase loop; ``run()`` maps it to ``RunState.ABORTED``."""
+
+    def __init__(self, reason: str) -> None:
+        self.reason = reason
+        super().__init__(reason)
+
+
+class WorkflowEngine:
+    """In-process workflow engine for v0.1.0.
+
+    For each phase: build_agent -> invoke -> wait for write_file targeting
+    expected_artifact_path -> load + jsonschema validate -> repair 1x if invalid
+    -> approval gate -> next phase.
+
+    All events are appended idempotently to run_events via the
+    (run_id, idempotency_key) UNIQUE constraint, so appends are safe under
+    concurrency and retries.
+    """
+
+    def __init__(
+        self,
+        db: Database,
+        config: Config,
+        persona_pool: list[Persona],
+        artifact_registry: ArtifactSchemaRegistry,
+        consent_store: PersonaConsentStore,
+        available_backends: BackendAvailability,
+        approval_callback: ApprovalCallback,
+        budget_tracker: BudgetTracker | None = None,
+        pricing: PricingCache | None = None,
+    ) -> None:
+        self._db = db
+        self._config = config
+        self._personas = persona_pool
+        self._artifacts = artifact_registry
+        self._consent = consent_store
+        self._backends = available_backends
+        self._approval = approval_callback
+        self._budget = budget_tracker
+        self._pricing = pricing or PricingCache()
+        self._shutdown_event: asyncio.Event = asyncio.Event()
+        self._inflight_tasks: set[asyncio.Task[Any]] = set()
+
+    def install_signal_handlers(self) -> None:
+        """Attach SIGTERM/SIGINT handlers to the running event loop.
+
+        Idempotent: calling twice replaces the previous handlers. Should be invoked
+        from ``cli/run.py`` once the asyncio loop is up. On shutdown signal:
+        in-flight ainvoke() tasks get a 30s grace, then are cancelled.
+        """
+        loop = asyncio.get_running_loop()
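+        for sig in (signal.SIGTERM, signal.SIGINT):
+            # add_signal_handler raises NotImplementedError on event loops without
+            # signal support (e.g. on Windows); skip the handler there.
+            with suppress(NotImplementedError, ValueError):
+                loop.add_signal_handler(sig, self._on_signal, sig)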
+
+    def _on_signal(self, sig: signal.Signals) -> None:
+        self._shutdown_event.set()
+        loop = asyncio.get_running_loop()
+        loop.call_later(30.0, self._force_cancel_inflight)
+
+    def _force_cancel_inflight(self) -> None:
+        for task in list(self._inflight_tasks):
+            if not task.done():
+                task.cancel()
+
+    @property
+    def shutdown_requested(self) -> bool:
+        return self._shutdown_event.is_set()
+
+    async def run(
+        self,
+        template: WorkflowTemplate,
+        *,
+        repo_path: Path,
+        base_branch: str = "main",
+        requirements_md: str = "",
+        override: BindingOverride | None = None,
+    ) -> RunResult:
+        run_id = uuid4()
+        worktree_root = self._config.workspace_root / str(run_id)
+        worktree_root.mkdir(parents=True, exist_ok=True)
+        artifacts_dir = worktree_root / "artifacts"
+        artifacts_dir.mkdir(parents=True, exist_ok=True)
+
+        bindings = bind_personas(template, self._personas, self._backends, self._consent, override)
+
+        await self._persist_run_skeleton(
+            None,
+            run_id,
+            template,
+            bindings,
+            repo_path,
+            base_branch,
+            worktree_root,
+            requirements_md,
+        )
+
+        await self._append_event(run_id, None, RunEventType.RUN_CREATED, {})
+        await self._append_event(run_id, None, RunEventType.RUN_STARTED, {})
+        await self._set_run_state(run_id, RunState.EXECUTING)
+
+        try:
+            for phase_def in template.phases:
+                role_binding = bindings[phase_def.role]
+                await self._run_phase(run_id, worktree_root, template, phase_def, role_binding)
+            await self._set_run_state(run_id, RunState.COMPLETED)
+            await self._append_event(run_id, None, RunEventType.RUN_COMPLETED, {})
+            report_path = await self._compose_final_report(
+                run_id, worktree_root, RunState.COMPLETED
+            )
+            return RunResult(run_id=run_id, state=RunState.COMPLETED, final_report_path=report_path)
+        except _PhaseAbortedError as e:
+            await self._set_run_state(run_id, RunState.ABORTED)
+            await self._append_event(run_id, None, RunEventType.RUN_ABORTED, {"reason": e.reason})
+            report_path = await self._compose_final_report(
+                run_id, worktree_root, RunState.ABORTED, error=e.reason
+            )
+            return RunResult(
+                run_id=run_id,
+                state=RunState.ABORTED,
+                final_report_path=report_path,
+                error=e.reason,
+            )
+        except MyDeepAgentError as e:
+            await self._set_run_state(run_id, RunState.FAILED)
+            await self._append_event(
+                run_id, None, RunEventType.RUN_FAILED, {"code": e.code, "message": str(e)}
+            )
+            report_path = await self._compose_final_report(
+                run_id, worktree_root, RunState.FAILED, error=str(e)
+            )
+            return RunResult(
+                run_id=run_id,
+                state=RunState.FAILED,
+                final_report_path=report_path,
+                error=str(e),
+            )
+
+    # ------------------------------------------------------------------
+    # Phase execution
+    # ------------------------------------------------------------------
+
+    async def _run_phase(
+        self,
+        run_id: UUID,
+        worktree_root: Path,
+        template: WorkflowTemplate,
+        phase_def: WorkflowPhase,
+        binding: Binding,
+    ) -> None:
+        if self.shutdown_requested:
+            await self._append_event(run_id, None, RunEventType.RUN_PAUSED, {"reason": "shutdown"})
+            await self._set_run_state(run_id, RunState.PAUSED)
+            raise _PhaseAbortedError(reason="shutdown signal received")
+
+        phase_id = await self._ensure_phase_row(run_id, phase_def)
+        await self._set_phase_state(phase_id, RunPhaseState.RUNNING)
+        await self._append_event(
+            run_id, phase_id, RunEventType.PHASE_STARTED, {"phase_key": phase_def.key}
+        )
+
+        # Phases without an expected artifact complete immediately
+        if phase_def.expected_artifact is None:
+            await self._set_phase_state(phase_id, RunPhaseState.COMPLETED)
+            await self._append_event(run_id, phase_id, RunEventType.PHASE_COMPLETED, {})
+            return
+
+        expected_path = (worktree_root / phase_def.expected_artifact.path).resolve()
+        expected_path.parent.mkdir(parents=True, exist_ok=True)
+
+        # Repair loop: at most 2 attempts (initial prompt + 1 repair retry).
+        for attempt in range(1, 3):
+            validated = await self._run_agent_and_validate(
+                run_id, phase_id, worktree_root, phase_def, binding, expected_path, attempt
+            )
+            if validated:
+                break
+            # False means attempt 1 timed out or produced an invalid artifact: retry once.
+            # On attempt 2, _run_agent_and_validate raises instead of returning False.
+
+        await self._run_approval_gate(run_id, phase_id, phase_def, expected_path)
+        await self._set_phase_state(phase_id, RunPhaseState.COMPLETED)
+        await self._append_event(run_id, phase_id, RunEventType.PHASE_COMPLETED, {})
+
+    async def _run_agent_and_validate(
+        self,
+        run_id: UUID,
+        phase_id: UUID,
+        worktree_root: Path,
+        phase_def: WorkflowPhase,
+        binding: Binding,
+        expected_path: Path,
+        attempt: int,
+    ) -> bool:
+        """Invoke agent for one attempt and validate artifact. Returns True on success.
+
+        Returns False when attempt < 2 and artifact is missing/invalid (caller retries).
+        Raises MyDeepAgentError on final failure (attempt >= 2).
+        """
+        written = await self._invoke_agent_until_artifact(
+            run_id, phase_id, worktree_root, phase_def, binding, expected_path, attempt=attempt
+        )
+
+        if not written:
+            await self._append_event(run_id, phase_id, RunEventType.ARTIFACT_TIMEOUT, {})
+            if attempt >= 2:
+                await self._set_phase_state(phase_id, RunPhaseState.FAILED)
+                await self._append_event(
+                    run_id,
+                    phase_id,
+                    RunEventType.PHASE_FAILED,
+                    {"reason": "artifact_timeout_exhausted"},
+                )
+                raise MyDeepAgentError.human_required(
+                    "artifact_timeout_exhausted",
+                    message=(
+                        f"phase '{phase_def.key}' did not produce expected artifact "
+                        f"after {attempt} attempts"
+                    ),
+                )
+            return False
+
+        # Validate the written artifact
+        await self._set_phase_state(phase_id, RunPhaseState.VALIDATING)
+        assert phase_def.expected_artifact is not None
+        schema_id = phase_def.expected_artifact.schema_id
+        try:
+            data = json.loads(expected_path.read_text(encoding="utf-8"))
+        except (OSError, json.JSONDecodeError) as exc:
+            await self._append_event(
+                run_id,
+                phase_id,
+                RunEventType.ARTIFACT_INVALID,
+                {"errors": [{"message": str(exc)}]},
+            )
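+            if attempt >= 2:
+                # Mirror the other final-failure paths: mark the phase FAILED first.
+                await self._set_phase_state(phase_id, RunPhaseState.FAILED)
+                await self._append_event(
+                    run_id,
+                    phase_id,
+                    RunEventType.PHASE_FAILED,
+                    {"reason": "artifact_invalid_after_repair"},
+                )
+                raise MyDeepAgentError.human_required(
+                    "artifact_invalid_after_repair",
+                    message=str(exc),
+                    cause=exc,
+                ) from exc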
+            await self._append_event(run_id, phase_id, RunEventType.PROMPT_REPAIRED, {})
+            return False
+
+        result = self._artifacts.validate(schema_id, data)
+        if result.ok:
+            await self._persist_artifact(run_id, phase_id, expected_path, schema_id, valid=True)
+            await self._append_event(run_id, phase_id, RunEventType.ARTIFACT_VALIDATED, {})
+            return True
+
+        error_payload = [{"path": f.path, "message": f.message} for f in result.errors[:5]]
+        await self._persist_artifact(
+            run_id,
+            phase_id,
+            expected_path,
+            schema_id,
+            valid=False,
+            errors=list(result.errors),
+        )
+        await self._append_event(
+            run_id, phase_id, RunEventType.ARTIFACT_INVALID, {"errors": error_payload}
+        )
+        if attempt >= 2:
+            await self._set_phase_state(phase_id, RunPhaseState.FAILED)
+            await self._append_event(
+                run_id,
+                phase_id,
+                RunEventType.PHASE_FAILED,
+                {"reason": "artifact_invalid_after_repair"},
+            )
+            raise MyDeepAgentError.human_required(
+                "artifact_invalid_after_repair",
+                message=f"phase '{phase_def.key}' artifact failed validation after repair",
+            )
+        await self._append_event(run_id, phase_id, RunEventType.PROMPT_REPAIRED, {})
+        return False
+
+    async def _run_approval_gate(
+        self,
+        run_id: UUID,
+        phase_id: UUID,
+        phase_def: WorkflowPhase,
+        expected_path: Path,
+    ) -> None:
+        """Run the approval gate if gates are configured. Raises on reject/abort."""
+        if not phase_def.gates:
+            return
+        await self._set_phase_state(phase_id, RunPhaseState.AWAITING_APPROVAL)
+        decision = await self._request_approval(run_id, phase_id, phase_def, expected_path)
+        if decision == ApprovalDecisionAction.ABORT:
+            raise _PhaseAbortedError(reason=f"aborted at phase {phase_def.key}")
+        if decision != ApprovalDecisionAction.APPROVE:
+            await self._set_phase_state(phase_id, RunPhaseState.FAILED)
+            await self._append_event(
+                run_id, phase_id, RunEventType.PHASE_FAILED, {"reason": decision.value}
+            )
+            raise MyDeepAgentError.human_required(
+                "approval_rejected",
+                message=f"phase '{phase_def.key}' approval was {decision.value}",
+            )
+
+    async def _invoke_agent_until_artifact(
+        self,
+        run_id: UUID,
+        phase_id: UUID,
+        worktree_root: Path,
+        phase_def: WorkflowPhase,
+        binding: Binding,
+        expected_path: Path,
+        attempt: int,
+    ) -> bool:
+        """Build and invoke the agent once; return True if expected_path exists afterwards."""
+        written_paths: list[str] = []
+
+        async def _on_written(path: str, _content: str) -> None:
+            written_paths.append(path)
+
+        watcher = ArtifactWatcherMiddleware(expected_path, _on_written)
+        cost_mw = CostMiddleware(
+            pricing=self._pricing,
+            model_name=binding.persona.model,
+            run_id=run_id,
+            phase_id=phase_id,
+            persona_name=binding.persona.name,
+            budget_tracker=self._budget,
+            recorder=self._record_llm_call,
+        )
+        audit_mw = AuditToolMiddleware(
+            run_id=run_id,
+            phase_id=phase_id,
+            file_recorder=make_audit_recorder(self._config.state_dir),
+        )
+        agent = build_agent(
+            binding.persona,
+            self._config,
+            root_dir=worktree_root,
+            middleware=[watcher, cost_mw, audit_mw],
+        )
+        envelope = self._build_envelope(run_id, phase_id, phase_def, attempt, expected_path)
+
+        await self._append_event(
+            run_id, phase_id, RunEventType.ARTIFACT_EXPECTED, {"path": str(expected_path)}
+        )
+        event_type = RunEventType.PROMPT_REPAIRED if attempt > 1 else RunEventType.PROMPT_SENT
+        await self._append_event(run_id, phase_id, event_type, {"attempt": attempt})
+
+        timeout = float(phase_def.timeout_seconds or _DEFAULT_PHASE_TIMEOUT_SECONDS)
+        try:
+            invoke_task: asyncio.Task[Any] = asyncio.create_task(
+                agent.ainvoke({"messages": [{"role": "user", "content": envelope}]})
+            )
+            self._inflight_tasks.add(invoke_task)
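+            try:
+                await asyncio.wait_for(asyncio.shield(invoke_task), timeout=timeout)
+            except TimeoutError:
+                # shield() keeps wait_for from cancelling the task, so cancel it here;
+                # otherwise it would keep running alongside the repair attempt.
+                invoke_task.cancel()
+                with suppress(asyncio.CancelledError):
+                    await invoke_task
+            finally:
+                self._inflight_tasks.discard(invoke_task)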
+        except asyncio.CancelledError:
+            pass
+
+        return expected_path.is_file()
+
+    def _build_envelope(
+        self,
+        run_id: UUID,
+        phase_id: UUID,
+        phase_def: WorkflowPhase,
+        attempt: int,
+        expected_path: Path,
+    ) -> str:
+        artifact = phase_def.expected_artifact
+        assert artifact is not None
+        try:
+            schema_def = self._artifacts.load(artifact.schema_id)
+            schema_inline = json.dumps(schema_def, indent=2, ensure_ascii=False)
+        except (MyDeepAgentError, AttributeError):
+            # AttributeError covers test scaffolding that instantiates the engine
+            # via __new__ without wiring _artifacts; production paths always have it.
+            schema_inline = "(schema not available)"
+        repair_note = (
+            "\n\n[REPAIR ATTEMPT]\n"
+            "Your previous artifact did not validate against the JSON Schema below. "
+            "Re-read the schema carefully and emit a corrected JSON object that satisfies "
+            "every `required` field and respects all `enum`, `type`, `minLength`, and "
+            "`additionalProperties: false` constraints."
+            if attempt > 1
+            else ""
+        )
+        return (
+            f"MYDEEPAGENT_PROMPT_BEGIN {phase_id}\n"
+            f"Run: {run_id}\n"
+            f"Phase: {phase_def.key}\n"
+            f"Attempt: {attempt}\n"
+            f"Expected artifact path: {expected_path}\n"
+            f"Expected schema id: {artifact.schema_id}\n"
+            f"\n"
+            f"JSON Schema 2020-12 for this artifact (you MUST satisfy it exactly):\n"
+            f"```json\n{schema_inline}\n```\n"
+            f"\n"
+            f"Use the `write_file` tool to write a JSON object that matches the schema "
+            f"to the exact path `{expected_path}`. The file must parse as valid JSON.\n"
+            f"\n"
+            f"Instructions:\n"
+            f"{phase_def.instructions}"
+            f"{repair_note}\n"
+            f"MYDEEPAGENT_PROMPT_END {phase_id}"
+        )
+
+    # ------------------------------------------------------------------
+    # Approval gate
+    # ------------------------------------------------------------------
+
+    async def _request_approval(
+        self,
+        run_id: UUID,
+        phase_id: UUID,
+        phase_def: WorkflowPhase,
+        artifact_path: Path,
+    ) -> ApprovalDecisionAction:
+        request_id = uuid4()
+        idem_key = f"{phase_def.key}:{artifact_path.name}"
+        payload: dict[str, Any] = {
+            "phase_key": phase_def.key,
+            "artifact_path": str(artifact_path),
+            "gates": list(phase_def.gates),
+        }
+        async with self._db.session() as s:
+            s.add(
+                ApprovalRequestRow(
+                    id=str(request_id),
+                    run_id=str(run_id),
+                    phase_id=str(phase_id),
+                    gate_key=phase_def.gates[0] if phase_def.gates else "default",
+                    state=ApprovalState.PENDING.value,
+                    idempotency_key=idem_key,
+                    payload=payload,
+                    created_at=_now_iso(),
+                )
+            )
+
+        await self._append_event(
+            run_id,
+            phase_id,
+            RunEventType.APPROVAL_REQUESTED,
+            {"request_id": str(request_id)},
+        )
+
+        decision: ApprovalDecisionAction = await self._approval(payload, list(phase_def.gates))
+
+        async with self._db.session() as s:
+            s.add(
+                ApprovalDecisionRow(
+                    id=str(uuid4()),
+                    approval_request_id=str(request_id),
+                    action=decision.value,
+                    decided_at=_now_iso(),
+                    idempotency_key=f"{idem_key}:{decision.value}",
+                )
+            )
+
+        await self._append_event(
+            run_id, phase_id, RunEventType.APPROVAL_RESOLVED, {"action": decision.value}
+        )
+        return decision
+
+    # ------------------------------------------------------------------
+    # Final report
+    # ------------------------------------------------------------------
+
+    async def _compose_final_report(
+        self,
+        run_id: UUID,
+        worktree_root: Path,
+        status: RunState,
+        error: str | None = None,
+    ) -> Path:
+        worktree_root.mkdir(parents=True, exist_ok=True)
+        async with self._db.session() as s:
+            run = await s.get(RunRow, str(run_id))
+            phase_rows = list(
+                (await s.execute(select(RunPhaseRow).where(RunPhaseRow.run_id == str(run_id))))
+                .scalars()
+                .all()
+            )
+            artifact_rows = list(
+                (await s.execute(select(ArtifactRow).where(ArtifactRow.run_id == str(run_id))))
+                .scalars()
+                .all()
+            )
+            event_rows = list(
+                (
+                    await s.execute(
+                        select(RunEventRow)
+                        .where(RunEventRow.run_id == str(run_id))
+                        .order_by(RunEventRow.seq.desc())
+                        .limit(20)
+                    )
+                )
+                .scalars()
+                .all()
+            )
+
+        report: dict[str, Any] = {
+            "runId": str(run_id),
+            "templateHash": run.template_hash if run else "",
+            "status": status.value,
+            "phases": [
+                {
+                    "key": p.phase_key,
+                    "state": p.state,
+                    "started_at": p.started_at,
+                    "ended_at": p.ended_at,
+                    "attempts": p.attempts,
+                }
+                for p in phase_rows
+            ],
+            "artifacts": [
+                {"path": a.path, "schema": a.schema_id, "hash": a.hash} for a in artifact_rows
+            ],
+            "events": [{"seq": e.seq, "type": e.type, "ts": e.ts} for e in reversed(event_rows)],
+            "unresolved": [],
+            "endedAt": _now_iso(),
+            "error": error,
+        }
+
+        json_path = worktree_root / f"{run_id}.report.json"
+        md_path = worktree_root / f"{run_id}.report.md"
+        json_path.write_text(json.dumps(report, indent=2, ensure_ascii=False), encoding="utf-8")
+        md_path.write_text(_render_report_md(report), encoding="utf-8")
+        return json_path
+
+    # ------------------------------------------------------------------
+    # Persistence helpers
+    # ------------------------------------------------------------------
+
+    async def _record_llm_call(self, record: dict[str, Any]) -> None:
+        """CostMiddleware recorder: persist one LlmCallRow per model call.
+
+        Fills every NOT NULL column of LlmCallRow. Per-input/output cost is computed
+        from the same PricingCache that the middleware already consulted, so the
+        ledger and the row stay consistent.
+        """
+        in_tokens = int(record.get("input_tokens") or 0)
+        out_tokens = int(record.get("output_tokens") or 0)
+        model = str(record.get("model") or "")
+        # Reproduce per-direction cost from the cached price.
+        price = self._pricing.get(model) if self._pricing is not None else None
+        if price is not None:
+            cost_input = (in_tokens / 1000.0) * price.input_per_1k_usd
+            cost_output = (out_tokens / 1000.0) * price.output_per_1k_usd
+        else:
+            cost_input = 0.0
+            cost_output = 0.0
+        cost_total = float(record.get("cost_usd_total") or (cost_input + cost_output))
+        run_id_val = record.get("run_id")
+        phase_id_val = record.get("phase_id")
+        session_id_val = record.get("interactive_session_id")
+        thread_id = (
+            f"run:{run_id_val}:phase:{phase_id_val}"
+            if run_id_val is not None
+            else f"session:{session_id_val}"
+        )
+        persona_name = str(record.get("persona_name") or "")
+        async with self._db.session() as s:
+            s.add(
+                LlmCallRow(
+                    run_id=(str(run_id_val) if run_id_val is not None else None),
+                    phase_id=(str(phase_id_val) if phase_id_val is not None else None),
+                    interactive_session_id=(
+                        str(session_id_val) if session_id_val is not None else None
+                    ),
+                    thread_id=thread_id,
+                    persona_name=persona_name,
+                    persona_version=1,
+                    model=model,
+                    role="main",
+                    turn_index=0,
+                    input_tokens=in_tokens,
+                    output_tokens=out_tokens,
+                    cached_tokens=0,
+                    reasoning_tokens=0,
+                    cost_usd_input=cost_input,
+                    cost_usd_output=cost_output,
+                    cost_usd_total=cost_total,
+                    latency_ms=int(record.get("latency_ms") or 0),
+                    status=str(record.get("status") or "ok"),
+                    error_code=record.get("error_code"),
+                    request_id=None,
+                    ts=_now_iso(),
+                )
+            )
+            try:
+                await s.commit()
+            except Exception:
+                # Best-effort: roll back and continue so a failed insert does not
+                # interrupt the run.
+                await s.rollback()
+
+    async def _persist_run_skeleton(
+        self,
+        _unused_session: Any,  # unused; kept for caller compatibility (method opens its own sessions)
+        run_id: UUID,
+        template: WorkflowTemplate,
+        bindings: dict[str, Binding],
+        repo_path: Path,
+        base_branch: str,
+        worktree_root: Path,
+        requirements_md: str,
+    ) -> None:
+        template_hash = template.compute_hash()
+        now = _now_iso()
+
+        # --- Phase 1: upsert FK targets (committed separately to satisfy FK ordering) ---
+        template_id = uuid4()
+        async with self._db.session() as s:
+            existing_tpl = (
+                await s.execute(
+                    select(WorkflowTemplateRow).where(WorkflowTemplateRow.hash == template_hash)
+                )
+            ).scalar_one_or_none()
+            if existing_tpl is None:
+                s.add(
+                    WorkflowTemplateRow(
+                        id=str(template_id),
+                        name=template.name,
+                        version=template.version,
+                        hash=template_hash,
+                        definition=template.model_dump(by_alias=True),
+                        created_at=now,
+                    )
+                )
+            else:
+                template_id = UUID(existing_tpl.id)
+
+        persona_ids: dict[str, UUID] = {}
+        for role_id, binding in bindings.items():
+            persona_hash = binding.persona.compute_hash()
+            async with self._db.session() as s:
+                existing_persona = (
+                    await s.execute(
+                        select(AgentPersonaRow).where(AgentPersonaRow.hash == persona_hash)
+                    )
+                ).scalar_one_or_none()
+                if existing_persona is None:
+                    persona_id = uuid4()
+                    s.add(
+                        AgentPersonaRow(
+                            id=str(persona_id),
+                            name=binding.persona.name,
+                            version=binding.persona.version,
+                            hash=persona_hash,
+                            definition=binding.persona.model_dump(),
+                            created_at=now,
+                        )
+                    )
+                else:
+                    persona_id = UUID(existing_persona.id)
+            persona_ids[role_id] = persona_id
+
+        # --- Phase 2: insert RunRow (FK: workflow_templates — already committed above) ---
+        async with self._db.session() as s:
+            s.add(
+                RunRow(
+                    id=str(run_id),
+                    template_id=str(template_id),
+                    template_hash=template_hash,
+                    state=RunState.CREATED.value,
+                    repo_path=str(repo_path),
+                    base_branch=base_branch,
+                    worktree_root=str(worktree_root),
+                    created_at=now,
+                    updated_at=now,
+                )
+            )
+
+        # --- Phase 3: insert RunInputRow + RunBindingRow (FK: runs — now committed) ---
+        async with self._db.session() as s:
+            s.add(
+                RunInputRow(
+                    id=str(uuid4()),
+                    run_id=str(run_id),
+                    requirements_md=requirements_md,
+                    objective={},
+                    extra={},
+                    input_hash=sha256(
+                        {"requirements": requirements_md, "template_hash": template_hash}
+                    ),
+                )
+            )
+            for role_id, binding in bindings.items():
+                persona_hash = binding.persona.compute_hash()
+                s.add(
+                    RunBindingRow(
+                        id=str(uuid4()),
+                        run_id=str(run_id),
+                        role_id=role_id,
+                        persona_id=str(persona_ids[role_id]),
+                        persona_hash=persona_hash,
+                        backend=binding.persona.backend.value,
+                        binding_hash=binding.binding_hash,
+                    )
+                )
+
+    async def _ensure_phase_row(self, run_id: UUID, phase_def: WorkflowPhase) -> UUID:
+        async with self._db.session() as s:
+            existing = (
+                await s.execute(
+                    select(RunPhaseRow).where(
+                        RunPhaseRow.run_id == str(run_id),
+                        RunPhaseRow.phase_key == phase_def.key,
+                    )
+                )
+            ).scalar_one_or_none()
+            if existing is not None:
+                return UUID(existing.id)
+            phase_id = uuid4()
+            existing_count = len(
+                (
+                    await s.execute(select(RunPhaseRow).where(RunPhaseRow.run_id == str(run_id)))
+                ).all()
+            )
+            s.add(
+                RunPhaseRow(
+                    id=str(phase_id),
+                    run_id=str(run_id),
+                    phase_key=phase_def.key,
+                    seq=existing_count,
+                    state=RunPhaseState.PENDING.value,
+                    attempts=0,
+                    started_at=_now_iso(),
+                )
+            )
+            return phase_id
+
+    async def _set_phase_state(self, phase_id: UUID, state: RunPhaseState) -> None:
+        async with self._db.session() as s:
+            row = await s.get(RunPhaseRow, str(phase_id))
+            if row is not None:
+                row.state = state.value
+                if state in (
+                    RunPhaseState.COMPLETED,
+                    RunPhaseState.FAILED,
+                    RunPhaseState.SKIPPED,
+                ):
+                    row.ended_at = _now_iso()
+
+    async def _set_run_state(self, run_id: UUID, state: RunState) -> None:
+        async with self._db.session() as s:
+            row = await s.get(RunRow, str(run_id))
+            if row is not None:
+                row.state = state.value
+                row.updated_at = _now_iso()
+                if state in (RunState.COMPLETED, RunState.FAILED, RunState.ABORTED):
+                    row.ended_at = _now_iso()
+
+    async def _append_event(
+        self,
+        run_id: UUID,
+        phase_id: UUID | None,
+        event_type: RunEventType,
+        payload: dict[str, Any],
+    ) -> None:
+        idem_extra = {
+            k: str(v)
+            for k, v in payload.items()
+            if k in ("phase_key", "attempt", "request_id", "action", "code")
+        }
+        idem = run_idempotency_key(event_type, run_id, **idem_extra)
+        async with self._db.session() as s:
+            existing_count = len(
+                (
+                    await s.execute(select(RunEventRow).where(RunEventRow.run_id == str(run_id)))
+                ).all()
+            )
+            s.add(
+                RunEventRow(
+                    run_id=str(run_id),
+                    phase_id=str(phase_id) if phase_id is not None else None,
+                    seq=existing_count + 1,
+                    type=event_type.value,
+                    payload=payload,
+                    idempotency_key=idem,
+                    ts=_now_iso(),
+                )
+            )
+            try:
+                await s.flush()
+            except Exception:
+                await s.rollback()
+
+    async def _persist_artifact(
+        self,
+        run_id: UUID,
+        phase_id: UUID,
+        path: Path,
+        schema_id: str,
+        *,
+        valid: bool,
+        errors: list[Any] | None = None,
+    ) -> None:
+        try:
+            content = path.read_bytes()
+        except OSError:
+            return
+        artifact_hash = sha256({"bytes_len": len(content), "hex": content.hex()})
+        async with self._db.session() as s:
+            s.add(
+                ArtifactRow(
+                    id=str(uuid4()),
+                    run_id=str(run_id),
+                    phase_id=str(phase_id),
+                    path=str(path),
+                    schema_id=schema_id,
+                    hash=artifact_hash,
+                    valid=valid,
+                    validation_error=(
+                        [{"path": f.path, "message": f.message} for f in errors] if errors else None
+                    ),
+                    created_at=_now_iso(),
+                )
+            )
+            try:
+                await s.flush()
+            except Exception:
+                await s.rollback()
+
+
+# ------------------------------------------------------------------
+# Module-level helpers
+# ------------------------------------------------------------------
+
+
+def _now_iso() -> str:
+    return datetime.now(UTC).isoformat(timespec="seconds")
+
+
+def _render_report_md(report: dict[str, Any]) -> str:
+    lines: list[str] = [
+        f"# Run {report['runId']}",
+        f"**Status**: {report['status']}",
+        f"**Template hash**: `{report['templateHash']}`",
+        f"**Ended at**: {report['endedAt']}",
+        "",
+        "## Phases",
+    ]
+    for p in report["phases"]:
+        lines.append(f"- **{p['key']}** — state={p['state']}, attempts={p['attempts']}")
+    lines.append("\n## Artifacts")
+    for a in report["artifacts"]:
+        lines.append(f"- `{a['path']}` (schema={a['schema']}, hash={a['hash'][:16]}...)")
+    if report.get("error"):
+        lines += ["", "## Error", str(report["error"])]
+    return "\n".join(lines) + "\n"
--- a/my-deepagent/src/my_deepagent/governance.py
+++ b/my-deepagent/src/my_deepagent/governance.py
@@ -0,0 +1,41 @@
+"""Governance consent for sending user code to external LLM providers."""
+
+from __future__ import annotations
+
+import json
+import os
+from datetime import UTC, datetime
+from pathlib import Path
+
+from .errors import MyDeepAgentError
+
+
+def consent_path(data_dir: Path) -> Path:
+    return data_dir / "governance-accepted.json"
+
+
+def has_consent(data_dir: Path) -> bool:
+    return consent_path(data_dir).is_file()
+
+
+def record_consent(data_dir: Path) -> None:
+    data_dir.mkdir(parents=True, exist_ok=True)
+    target = consent_path(data_dir)
+    payload = {"accepted_at": datetime.now(UTC).isoformat(timespec="seconds")}
+    tmp = target.with_suffix(target.suffix + ".tmp")
+    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
+    try:
+        os.write(fd, json.dumps(payload, indent=2).encode("utf-8"))
+        os.fsync(fd)
+    finally:
+        os.close(fd)
+    os.replace(tmp, target)
+
+
+def require_consent(data_dir: Path) -> None:
+    if not has_consent(data_dir):
+        raise MyDeepAgentError.human_required(
+            "governance_not_accepted",
+            message="governance consent not recorded",
+            recovery_hint="run `mydeepagent init` and accept the data-governance prompt",
+        )
--- a/my-deepagent/src/my_deepagent/i18n/init.py
+++ b/my-deepagent/src/my_deepagent/i18n/init.py
@@ -0,0 +1,45 @@
+"""Lightweight i18n catalog loader. Two languages (ko, en). Default ko per CTO decision."""
+
+from __future__ import annotations
+
+import os
+import tomllib
+from functools import lru_cache
+from pathlib import Path
+from typing import Literal
+
+Lang = Literal["ko", "en"]
+
+_CATALOG_DIR = Path(__file__).parent
+
+
+@lru_cache(maxsize=4)
+def _load(lang: Lang) -> dict[str, dict[str, str]]:
+    path = _CATALOG_DIR / f"{lang}.toml"
+    if not path.is_file():
+        return {}
+    with path.open("rb") as f:
+        data = tomllib.load(f)
+    return {section: dict(entries) for section, entries in data.items()}
+
+
+def resolve_lang(default: Lang = "ko") -> Lang:
+    env = os.environ.get("MYDEEPAGENT_LANG")
+    if env in ("ko", "en"):
+        return env  # type: ignore[return-value]
+    return default
+
+
+def t(key: str, lang: Lang | None = None, **fmt: object) -> str:
+    """Translate a key like 'section.key'. Falls back to the key itself if missing."""
+    actual_lang = lang or resolve_lang()
+    section_name, _, leaf = key.partition(".")
+    catalog = _load(actual_lang)
+    section = catalog.get(section_name, {})
+    template = section.get(leaf, key)
+    if fmt:
+        try:
+            return template.format(**fmt)
+        except (KeyError, IndexError, ValueError):
+            return template
+    return template
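Reviewer sketch, not part of the patch: the lookup order of `t()` can be exercised without the TOML files. The in-memory `CATALOG` and the name `translate` are assumptions made for illustration; the two entries are copied from en.toml. The sketch also treats a `ValueError` from an incompatible format spec as a fallback case.

```python
from __future__ import annotations

# Illustrative stand-in for the TOML catalog (hypothetical, not loaded from disk).
CATALOG: dict[str, dict[str, str]] = {
    "login": {"saved": "{provider} key saved to OS keyring."},
    "keys": {"entry": "  {provider:20s}  {masked}"},
}


def translate(key: str, **fmt: object) -> str:
    """Resolve 'section.leaf'; fall back to the key itself when it is missing."""
    section_name, _, leaf = key.partition(".")
    template = CATALOG.get(section_name, {}).get(leaf, key)
    if not fmt:
        return template
    try:
        return template.format(**fmt)
    except (KeyError, IndexError, ValueError):
        # Missing placeholder value or incompatible format spec:
        # return the unformatted template instead of raising.
        return template


print(translate("login.saved", provider="openrouter"))
print(translate("login.unknown"))
print(translate("login.saved", wrong_name="x"))
```

A missing key comes back as the key string itself, so an untranslated message is visible in the output rather than raising at the call site.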
--- a/my-deepagent/src/my_deepagent/i18n/en.toml
+++ b/my-deepagent/src/my_deepagent/i18n/en.toml
@@ -0,0 +1,34 @@
+[init]
+welcome = "Welcome — my-deepagent first-time setup"
+governance_title = "Consent to send code to external LLM providers"
+governance_body = "This tool sends file contents read via read_file and similar tools to external LLM providers (Anthropic, DeepSeek, etc.) through OpenRouter. Each persona declares its provider_origin, and a separate confirmation is shown on first use."
+governance_prompt = "Type 'yes' to agree (any other answer cancels): "
+governance_declined = "Cannot proceed without consent. Exiting."
+api_key_prompt = "OpenRouter API key (input is hidden)"
+api_key_empty = "API key was empty — nothing saved."
+api_key_saved = "Saved to OS keyring."
+doctor_running = "Running environment diagnostics..."
+done = "Setup complete. Start with `mydeepagent run <workflow.yaml>` or `mydeepagent`."
+
+[login]
+prompt = "Enter {provider} API key (hidden): "
+saved = "{provider} key saved to OS keyring."
+empty = "Empty input. Nothing saved."
+
+[logout]
+removed = "{provider} key removed from keyring."
+not_found = "{provider} key not found in keyring (already deleted)."
+
+[keys]
+header = "Registered API keys:"
+entry = "  {provider:20s}  {masked}"
+none = "  (none. Use `mydeepagent login <provider>` to register one.)"
+
+[doctor]
+header = "Environment diagnostics:"
+ok = "  ok   {name}"
+warn = "  warn {name}  ({detail})"
+fail = "  FAIL {name}  ({detail})"
+
+[errors]
+no_governance = "Governance consent is missing. Run `mydeepagent init` first."
--- a/my-deepagent/src/my_deepagent/i18n/ko.toml
+++ b/my-deepagent/src/my_deepagent/i18n/ko.toml
@@ -0,0 +1,34 @@
+[init]
+welcome = "환영합니다 — my-deepagent 첫 셋업"
+governance_title = "외부 LLM provider로 코드 전송 동의"
+governance_body = "이 도구는 read_file 등으로 읽은 파일 내용을 OpenRouter를 통해 외부 LLM provider(Anthropic, DeepSeek 등)로 전송합니다. 페르소나마다 provider_origin이 명시되며 첫 사용 시 별도 확인이 다시 한 번 표시됩니다."
+governance_prompt = "동의하시면 'yes' 입력 (그 외 모든 답은 취소): "
+governance_declined = "동의 없이는 사용할 수 없습니다. 종료합니다."
+api_key_prompt = "OpenRouter API key (입력은 가려집니다)"
+api_key_empty = "API key가 비어있어 저장하지 않았습니다."
+api_key_saved = "OS keyring에 저장되었습니다."
+doctor_running = "환경 진단 실행 중..."
+done = "셋업 완료. `mydeepagent run <workflow.yaml>` 또는 `mydeepagent` 로 시작하세요."
+
+[login]
+prompt = "{provider} API key 입력 (가려짐): "
+saved = "{provider} key가 OS keyring에 저장되었습니다."
+empty = "빈 입력입니다. 저장하지 않았습니다."
+
+[logout]
+removed = "{provider} key가 keyring에서 삭제되었습니다."
+not_found = "{provider} key가 keyring에 없습니다 (이미 삭제됨)."
+
+[keys]
+header = "등록된 API key:"
+entry = "  {provider:20s}  {masked}"
+none = "  (없음. `mydeepagent login <provider>` 로 등록하세요.)"
+
+[doctor]
+header = "환경 진단:"
+ok = "  ok   {name}"
+warn = "  warn {name}  ({detail})"
+fail = "  FAIL {name}  ({detail})"
+
+[errors]
+no_governance = "거버넌스 동의가 없습니다. `mydeepagent init` 를 먼저 실행하세요."
--- a/my-deepagent/src/my_deepagent/keys.py
+++ b/my-deepagent/src/my_deepagent/keys.py
@@ -0,0 +1,48 @@
+"""OS keyring wrapper for storing provider API keys. Service name: 'my-deepagent'."""
+
+from __future__ import annotations
+
+from typing import Final
+
+import keyring as keyring
+
+_SERVICE: Final[str] = "my-deepagent"
+
+
+def _make_username(provider: str) -> str:
+    return f"{provider}_api_key"
+
+
+def get_api_key(provider: str) -> str | None:
+    """Return the stored key for ``provider``, or None if absent."""
+    return keyring.get_password(_SERVICE, _make_username(provider))
+
+
+def set_api_key(provider: str, value: str) -> None:
+    """Persist ``value`` in the OS keyring under provider's slot."""
+    keyring.set_password(_SERVICE, _make_username(provider), value)
+
+
+def delete_api_key(provider: str) -> bool:
+    """Remove the stored key. Returns True if a key existed and was removed."""
+    if keyring.get_password(_SERVICE, _make_username(provider)) is None:
+        return False
+    keyring.delete_password(_SERVICE, _make_username(provider))
+    return True
+
+
+def list_providers() -> list[str]:
+    """Return the providers we recognise (we don't enumerate keyring contents).
+
+    Callers iterate this list and call get_api_key for each to detect presence.
+    """
+    return ["openrouter", "anthropic", "openai", "google", "langsmith"]
+
+
+def mask(value: str | None) -> str:
+    """Mask an API key for display: 'sk-or-v1...c2e7' or '(not set)' if None or empty."""
+    if not value:
+        return "(not set)"
+    if len(value) <= 8:
+        return "***"
+    return f"{value[:8]}...{value[-4:]}"
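Reviewer sketch, not part of the patch: a standalone copy of the masking rule, run against a made-up key, to show what the `keys` command would display. The key value is invented for illustration.

```python
from __future__ import annotations


def mask(value: str | None) -> str:
    """Show the first 8 and last 4 characters; hide everything in between."""
    if not value:
        return "(not set)"
    if len(value) <= 8:
        return "***"
    return f"{value[:8]}...{value[-4:]}"


# Made-up key, not a real credential.
fake_key = "sk-or-v1-0123456789abcdefc2e7"
print(mask(fake_key))
print(mask("short"))
print(mask(None))
```

Note that values of 9 to 12 characters are shown in full, because the 8-character prefix and the 4-character suffix together cover the whole string. Real provider keys are much longer, so this only matters for test fixtures.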
--- a/my-deepagent/src/my_deepagent/logging.py
+++ b/my-deepagent/src/my_deepagent/logging.py
@@ -0,0 +1,88 @@
+"""structlog configuration with built-in secret scrubbing.
+
+Scrubs known API key patterns and bearer tokens from all log output (both rich
+pretty-printed and JSON). Call ``configure_logging(level, json_output)`` once at
+process start (called from CLI entry points).
+"""
+
+from __future__ import annotations
+
+import logging
+import re
+import sys
+from typing import Any
+
+import structlog
+
+# Secret patterns. Order matters: more specific first.
+_SECRET_PATTERNS: tuple[re.Pattern[str], ...] = tuple(
+    re.compile(p)
+    for p in (
+        r"sk-or-[A-Za-z0-9_-]{20,}",  # OpenRouter
+        r"sk-ant-[A-Za-z0-9_-]{20,}",  # Anthropic
+        r"sk-proj-[A-Za-z0-9_-]{20,}",  # OpenAI project keys
+        r"sk-[A-Za-z0-9_-]{30,}",  # OpenAI (general)
+        r"lsv2_pt_[A-Za-z0-9_-]{20,}",  # LangSmith personal token
+        r"lsv2_[A-Za-z0-9_-]{30,}",  # LangSmith (other)
+        r"Bearer\s+[A-Za-z0-9._-]{20,}",  # generic bearer
+        r"ghp_[A-Za-z0-9]{30,}",  # GitHub PAT
+        r"glpat-[A-Za-z0-9-]{20,}",  # GitLab PAT
+    )
+)
+
+_REDACTED = "[REDACTED]"
+
+
+def scrub(text: str) -> str:
+    """Replace secrets in ``text`` with ``[REDACTED]``."""
+    for pat in _SECRET_PATTERNS:
+        text = pat.sub(_REDACTED, text)
+    return text
+
+
+def scrub_value(value: Any) -> Any:
+    """Recursively scrub strings inside dicts/lists/tuples/sets. Non-strings pass through."""
+    if isinstance(value, str):
+        return scrub(value)
+    if isinstance(value, dict):
+        return {k: scrub_value(v) for k, v in value.items()}
+    if isinstance(value, list):
+        return [scrub_value(v) for v in value]
+    if isinstance(value, tuple):
+        return tuple(scrub_value(v) for v in value)
+    if isinstance(value, set):
+        return {scrub_value(v) for v in value}
+    return value
+
+
+def _scrub_processor(_logger: Any, _method: str, event_dict: dict[str, Any]) -> dict[str, Any]:
+    """structlog processor: scrub every value in the event dict."""
+    return {k: scrub_value(v) for k, v in event_dict.items()}
+
+
+def configure_logging(level: str = "info", json_output: bool = False) -> None:
+    """Configure structlog with secret-scrubbing on top of the chosen renderer."""
+    log_level = getattr(logging, level.upper(), logging.INFO)
+    logging.basicConfig(level=log_level, format="%(message)s", stream=sys.stderr)
+
+    processors: list[Any] = [
+        structlog.contextvars.merge_contextvars,
+        structlog.processors.add_log_level,
+        structlog.processors.TimeStamper(fmt="iso", utc=True),
+        _scrub_processor,
+    ]
+    if json_output:
+        processors.append(structlog.processors.JSONRenderer())
+    else:
+        processors.append(structlog.dev.ConsoleRenderer(colors=True))
+
+    structlog.configure(
+        processors=processors,
+        wrapper_class=structlog.make_filtering_bound_logger(log_level),
+        logger_factory=structlog.PrintLoggerFactory(file=sys.stderr),
+        cache_logger_on_first_use=True,
+    )
+
+
+def get_logger(name: str | None = None) -> Any:
+    return structlog.get_logger(name) if name else structlog.get_logger()
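Reviewer sketch, not part of the patch: the recursive scrubbing can be tried without structlog. Only two of the patterns from the diff are reproduced, and the event dict and key values are invented for illustration.

```python
from __future__ import annotations

import re
from typing import Any

# Subset of the patterns in the diff; the full list lives in logging.py.
PATTERNS: tuple[re.Pattern[str], ...] = tuple(
    re.compile(p)
    for p in (
        r"sk-or-[A-Za-z0-9_-]{20,}",  # OpenRouter
        r"Bearer\s+[A-Za-z0-9._-]{20,}",  # generic bearer
    )
)
REDACTED = "[REDACTED]"


def scrub(text: str) -> str:
    """Replace every pattern match in ``text`` with the redaction marker."""
    for pat in PATTERNS:
        text = pat.sub(REDACTED, text)
    return text


def scrub_value(value: Any) -> Any:
    """Scrub strings nested inside dicts and lists; other values pass through."""
    if isinstance(value, str):
        return scrub(value)
    if isinstance(value, dict):
        return {k: scrub_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub_value(v) for v in value]
    return value


# Made-up secrets, not real credentials.
event = {
    "event": "request",
    "headers": ["Authorization: Bearer " + "x" * 24],
    "key": "sk-or-v1-" + "a" * 32,
    "attempt": 3,
}
print(scrub_value(event))
```

Only values are scrubbed. A secret used as a dict key would pass through unchanged, which matches the behaviour of `scrub_value` in the diff.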
--- a/my-deepagent/src/my_deepagent/middleware/artifact_watcher.py
+++ b/my-deepagent/src/my_deepagent/middleware/artifact_watcher.py
@@ -0,0 +1,115 @@
+"""ArtifactWatcherMiddleware: detect write_file / edit_file calls targeting expected artifact."""
+
+from __future__ import annotations
+
+import asyncio
+from collections.abc import Awaitable, Callable
+from pathlib import Path
+from typing import Any
+
+from langchain.agents.middleware import AgentMiddleware, ToolCallRequest
+from langchain_core.messages import ToolMessage
+
+# Async callback fired when write_file/edit_file targets the expected path.
+# Args: (absolute_path_str, content_str)
+ArtifactWriteCallback = Callable[[str, str], Awaitable[None]]
+
+# Tool names that count as "write the artifact"
+_WRITE_TOOL_NAMES: frozenset[str] = frozenset({"write_file", "edit_file"})
+
+# Candidate argument key names for the file path, in priority order
+_PATH_ARG_KEYS: tuple[str, ...] = ("file_path", "path", "file")
+
+# Candidate argument key names for the file content
+_CONTENT_ARG_KEYS: tuple[str, ...] = ("content", "text", "new_string")
+
+
+class ArtifactWatcherMiddleware(AgentMiddleware[Any, None, Any]):
+    """Intercepts write_file / edit_file tool calls and fires a callback when the
+    targeted path matches *expected_path* (after resolution to an absolute path).
+
+    The middleware never suppresses or modifies the tool call — it always forwards
+    to ``handler``.  The callback runs *after* the tool succeeds; any exception raised
+    inside the callback is caught and silently discarded so it cannot break the agent
+    loop.
+    """
+
+    def __init__(
+        self,
+        expected_path: Path,
+        on_artifact_written: ArtifactWriteCallback,
+    ) -> None:
+        super().__init__()
+        self._expected = expected_path.resolve()
+        self._callback = on_artifact_written
+        self._notified = asyncio.Event()
+        self._content: str | None = None
+
+    # ------------------------------------------------------------------
+    # Public helpers
+    # ------------------------------------------------------------------
+
+    @property
+    def notified(self) -> asyncio.Event:
+        """Set once the expected artifact has been written."""
+        return self._notified
+
+    @property
+    def content(self) -> str | None:
+        """Content string passed to the write/edit tool, or None if not yet written."""
+        return self._content
+
+    # ------------------------------------------------------------------
+    # AgentMiddleware interface
+    # ------------------------------------------------------------------
+
+    async def awrap_tool_call(
+        self,
+        request: ToolCallRequest,
+        handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Any]],
+    ) -> ToolMessage | Any:
+        result = await handler(request)
+        tool_call = request.tool_call  # ToolCall TypedDict: {"name": str, "args": dict, "id": ...}
+        name: str = tool_call["name"]
+        if name in _WRITE_TOOL_NAMES:
+            args: dict[str, Any] = dict(tool_call["args"] or {})
+            path_str = self._extract_path(args)
+            if path_str:
+                resolved = self._resolve_path(path_str)
+                if resolved == self._expected:
+                    content = self._extract_content(args)
+                    self._content = content
+                    self._notified.set()
+                    try:
+                        await self._callback(str(resolved), content)
+                    except Exception:  # noqa: S110
+                        pass  # callback must not break agent loop
+        return result
+
+    # ------------------------------------------------------------------
+    # Private helpers
+    # ------------------------------------------------------------------
+
+    def _resolve_path(self, path_str: str) -> Path:
+        """Resolve a possibly-relative path to absolute using expected_path's parent as base."""
+        p = Path(path_str)
+        if p.is_absolute():
+            return p.resolve()
+        # Relative paths are anchored to the expected artifact's directory
+        return (self._expected.parent / p).resolve()
+
+    @staticmethod
+    def _extract_path(args: dict[str, Any]) -> str:
+        for key in _PATH_ARG_KEYS:
+            val = args.get(key)
+            if isinstance(val, str) and val:
+                return val
+        return ""
+
+    @staticmethod
+    def _extract_content(args: dict[str, Any]) -> str:
+        for key in _CONTENT_ARG_KEYS:
+            val = args.get(key)
+            if isinstance(val, str):
+                return val
+        return ""
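Reviewer sketch, not part of the patch: the path comparison used by the watcher, pulled out as a plain function. The name `targets_expected` and the example paths are hypothetical; the rule it shows is the one in `_resolve_path`, where relative paths are anchored at the expected artifact's directory.

```python
from pathlib import Path


def targets_expected(expected: Path, tool_path: str) -> bool:
    """Return True when ``tool_path`` resolves to the expected artifact path."""
    expected = expected.resolve()
    p = Path(tool_path)
    if p.is_absolute():
        resolved = p.resolve()
    else:
        # Relative paths are interpreted relative to the artifact's directory.
        resolved = (expected.parent / p).resolve()
    return resolved == expected


expected = Path("/tmp/run-1/out/plan.json")
print(targets_expected(expected, "plan.json"))
print(targets_expected(expected, "../out/plan.json"))
print(targets_expected(expected, "notes.md"))
```

One consequence of this anchoring is that a relative path is never interpreted against the process working directory, so a tool that writes `plan.json` relative to some other directory would still be reported as a match.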
--- a/my-deepagent/src/my_deepagent/middleware/audit.py
+++ b/my-deepagent/src/my_deepagent/middleware/audit.py
@@ -1,66 +1,70 @@
-"""AuditToolMiddleware: capture every tool call for audit log + DB.
-
-Records: name, args, result/error, duration.
-"""
+"""AuditToolMiddleware: capture every tool call to audit.jsonl + tool_calls DB row."""

 from __future__ import annotations

 import time
+from collections.abc import Awaitable, Callable
 from typing import Any
 from uuid import UUID

 from langchain.agents.middleware import AgentMiddleware

+AuditRecorder = Callable[[dict[str, Any]], Awaitable[None]]
+

 class AuditToolMiddleware(AgentMiddleware):
-    """Record every tool invocation for the audit log and DB sink (Step 8)."""
+    """Record every tool invocation for the audit log and DB sink.
+
+    Accepts two optional recorders:
+      - ``file_recorder``: JSONL file at {state_dir}/audit.jsonl (append-only)
+      - ``db_recorder``: tool_calls DB row (optional, wired in Step 12+)
+
+    For backward compatibility, ``recorder`` is accepted as an alias for
+    ``file_recorder`` (used by pre-Step-11 unit tests).
+    """

    def __init__(
        self,
        run_id: UUID | None = None,
        phase_id: UUID | None = None,
        interactive_session_id: UUID | None = None,
-        recorder: Any | None = None,
+        file_recorder: AuditRecorder | None = None,
+        db_recorder: AuditRecorder | None = None,
+        # backward-compat alias — maps to file_recorder
+        recorder: AuditRecorder | None = None,
    ) -> None:
        super().__init__()
        self.run_id = run_id
        self.phase_id = phase_id
        self.interactive_session_id = interactive_session_id
-        self.recorder = recorder
+        # ``recorder`` is a pre-Step-11 alias for file_recorder
+        self.file_recorder: AuditRecorder | None = (
+            file_recorder if file_recorder is not None else recorder
+        )
+        self.db_recorder = db_recorder

    async def awrap_tool_call(self, request: Any, handler: Any) -> Any:
        started = time.perf_counter()
-        # ToolCallRequest exposes tool_call dict with 'name' and 'args'
        tool_call = getattr(request, "tool_call", {}) or {}
        name: str = tool_call.get("name", "unknown") if isinstance(tool_call, dict) else "unknown"
        args: dict[str, Any] = (
            tool_call.get("args", {}) if isinstance(tool_call, dict) else {}
        ) or {}
+        error: str | None = None
+        result: Any = None
        try:
            result = await handler(request)
+            return result
        except Exception as e:
-            await self._record(name, args, None, type(e).__name__, started)
+            error = type(e).__name__
            raise
-        await self._record(name, args, result, None, started)
-        return result
-
-    async def _record(
-        self,
-        name: str,
-        args: dict[str, Any],
-        result: Any,
-        error: str | None,
-        started: float,
-    ) -> None:
-        if self.recorder is None:
-            return
-        serializable_result: str | int | float | bool | dict[str, Any] | list[Any] | None
-        if isinstance(result, (str, int, float, bool, dict, list)) or result is None:
-            serializable_result = result
-        else:
-            serializable_result = str(result)
-        await self.recorder(
-            {
+        finally:
+            serializable_result: str | int | float | bool | dict[str, Any] | list[Any] | None
+            if isinstance(result, (str, int, float, bool, dict, list)) or result is None:
+                serializable_result = result
+            else:
+                serializable_result = str(result)
+            record: dict[str, Any] = {
                "tool_name": name,
                "args": args,
                "result": serializable_result,
@@ -70,4 +74,13 @@ class AuditToolMiddleware(AgentMiddleware):
                "phase_id": self.phase_id,
                "interactive_session_id": self.interactive_session_id,
            }
-        )
+            if self.file_recorder is not None:
+                try:
+                    await self.file_recorder(record)
+                except Exception:  # noqa: S110 — never let audit failure break the tool
+                    pass
+            if self.db_recorder is not None:
+                try:
+                    await self.db_recorder(record)
+                except Exception:  # noqa: S110
+                    pass
--- a/my-deepagent/src/my_deepagent/middleware/cost.py
+++ b/my-deepagent/src/my_deepagent/middleware/cost.py
@@ -1,4 +1,4 @@
-"""CostMiddleware: capture every LLM call's usage and accumulate cost into the SQLite ledger."""
+"""CostMiddleware: per-LLM-call cost tracking + optional budget enforcement."""

 from __future__ import annotations

@@ -6,15 +6,17 @@ import time
 from typing import Any
 from uuid import UUID

-from langchain.agents.middleware import AgentMiddleware
+from langchain.agents.middleware import AgentMiddleware, ToolCallRequest
+from langchain_core.messages import ToolMessage

+from ..budget import BudgetTracker
 from ..monitoring.pricing import PricingCache


 class CostMiddleware(AgentMiddleware):
-    """Wrap every model call. Compute cost from usage_metadata and persist.
+    """Wrap every model call. Compute cost from usage_metadata and persist via recorder + budget.

-    Step 8 wires the DB writer via the recorder callback.
+    Step 8 wires the BudgetTracker via the budget_tracker parameter.
    """

    def __init__(
@@ -23,18 +25,38 @@ class CostMiddleware(AgentMiddleware):
        model_name: str,
        run_id: UUID | None = None,
        phase_id: UUID | None = None,
+        interactive_session_id: UUID | None = None,
        persona_name: str | None = None,
-        recorder: Any | None = None,  # callable(record) -> Awaitable[None] for DB sink (Step 8)
+        recorder: Any | None = None,  # async callable(record) -> Awaitable[None] for DB sink
+        budget_tracker: BudgetTracker | None = None,
    ) -> None:
        super().__init__()
        self.pricing = pricing
        self.model_name = model_name
        self.run_id = run_id
        self.phase_id = phase_id
+        self.interactive_session_id = interactive_session_id
        self.persona_name = persona_name
        self.recorder = recorder
+        self.budget = budget_tracker
+
+    async def awrap_tool_call(
+        self,
+        request: ToolCallRequest,
+        handler: Any,
+    ) -> ToolMessage | Any:
+        """Pass tool calls through without modification."""
+        return await handler(request)

    async def awrap_model_call(self, request: Any, handler: Any) -> Any:
+        # Pre-call: check a rough estimate (4000 in / 1500 out tokens) against the budget
+        if self.budget is not None:
+            estimated = self.pricing.compute_cost(self.model_name, 4000, 1500)
+            await self.budget.assert_can_call(
+                run_id=self.run_id,
+                persona_name=self.persona_name,
+                estimated_cost_usd=estimated,
+            )
        started = time.perf_counter()
        try:
            response = await handler(request)
@@ -47,9 +69,27 @@ class CostMiddleware(AgentMiddleware):
                error_code=type(e).__name__,
            )
            raise
-        usage = getattr(response, "usage_metadata", None) or {}
-        in_tokens = int(usage.get("input_tokens", 0) or 0)
-        out_tokens = int(usage.get("output_tokens", 0) or 0)
+        # Token usage shows up in different places depending on the model integration.
+        # langchain-openai usually fills `usage_metadata`, but for streamed responses
+        # or some OpenAI-compatible endpoints (OpenRouter forwarding DeepSeek/etc.)
+        # the count lands in `response_metadata.token_usage` with OpenAI keys
+        # (`prompt_tokens` / `completion_tokens`).
+        usage_meta = getattr(response, "usage_metadata", None) or {}
+        response_meta = getattr(response, "response_metadata", None) or {}
+        token_usage = response_meta.get("token_usage") if isinstance(response_meta, dict) else None
+        token_usage = token_usage or {}
+        in_tokens = int(
+            usage_meta.get("input_tokens")
+            or token_usage.get("prompt_tokens")
+            or token_usage.get("input_tokens")
+            or 0
+        )
+        out_tokens = int(
+            usage_meta.get("output_tokens")
+            or token_usage.get("completion_tokens")
+            or token_usage.get("output_tokens")
+            or 0
+        )
        await self._record(
            input_tokens=in_tokens,
            output_tokens=out_tokens,
@@ -57,6 +97,14 @@ class CostMiddleware(AgentMiddleware):
            status="ok",
            error_code=None,
        )
+        # Post-call: record actual cost in budget ledger
+        if self.budget is not None and (in_tokens or out_tokens):
+            actual = self.pricing.compute_cost(self.model_name, in_tokens, out_tokens)
+            await self.budget.record(
+                run_id=self.run_id,
+                persona_name=self.persona_name,
+                actual_cost_usd=actual,
+            )
        return response

    async def _record(
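Reviewer sketch, not part of the patch: the token-count fallback chain from `awrap_model_call`, isolated so it can be checked against plain dicts. The function name `extract_tokens` is hypothetical, and the inputs stand in for a response's `usage_metadata` and `response_metadata` attributes.

```python
from __future__ import annotations

from typing import Any


def extract_tokens(
    usage_meta: dict[str, Any] | None,
    response_meta: dict[str, Any] | None,
) -> tuple[int, int]:
    """Return (input_tokens, output_tokens), preferring usage_metadata."""
    usage_meta = usage_meta or {}
    token_usage = (response_meta or {}).get("token_usage") or {}
    in_tokens = int(
        usage_meta.get("input_tokens")
        or token_usage.get("prompt_tokens")
        or token_usage.get("input_tokens")
        or 0
    )
    out_tokens = int(
        usage_meta.get("output_tokens")
        or token_usage.get("completion_tokens")
        or token_usage.get("output_tokens")
        or 0
    )
    return in_tokens, out_tokens


print(extract_tokens({"input_tokens": 120, "output_tokens": 30}, {}))
print(extract_tokens(None, {"token_usage": {"prompt_tokens": 200, "completion_tokens": 50}}))
print(extract_tokens(None, None))
```

Because the chain uses `or`, a zero in `usage_metadata` is treated the same as a missing value and the lookup falls through to `token_usage`.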
--- a/my-deepagent/src/my_deepagent/monitoring/cost_estimator.py
+++ b/my-deepagent/src/my_deepagent/monitoring/cost_estimator.py
@@ -0,0 +1,70 @@
+"""Estimate per-phase cost using pricing matrix + crude token heuristic.
+
+For accurate billing, use the actual usage_metadata after the call (see CostMiddleware).
+This module is for the *preview* shown before ``mydeepagent run`` starts.
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import TYPE_CHECKING
+
+from ..persona import Persona
+from ..workflow import WorkflowPhase, WorkflowTemplate
+from .pricing import PricingCache
+
+if TYPE_CHECKING:
+    from ..binding import Binding
+
+
+@dataclass(frozen=True)
+class PhaseCostEstimate:
+    phase_key: str
+    persona_name: str
+    model: str
+    estimated_input_tokens: int
+    estimated_output_tokens: int
+    estimated_cost_usd: float
+
+
+@dataclass(frozen=True)
+class WorkflowCostEstimate:
+    phases: list[PhaseCostEstimate]
+    total_usd: float
+
+
+_DEFAULT_INPUT_TOKENS = 4000  # generous: instructions + context + prior artifacts
+_DEFAULT_OUTPUT_TOKENS = 1500  # bounded by max_tokens; we use persona max_tokens if set
+
+
+def estimate_phase(
+    phase: WorkflowPhase,
+    persona: Persona,
+    pricing: PricingCache,
+) -> PhaseCostEstimate:
+    """Estimate the cost of a single phase based on persona model and default token counts."""
+    input_tokens = _DEFAULT_INPUT_TOKENS
+    output_tokens = int(persona.model_params.get("max_tokens") or _DEFAULT_OUTPUT_TOKENS)
+    cost = pricing.compute_cost(persona.model, input_tokens, output_tokens)
+    return PhaseCostEstimate(
+        phase_key=phase.key,
+        persona_name=f"{persona.name}@{persona.version}",
+        model=persona.model,
+        estimated_input_tokens=input_tokens,
+        estimated_output_tokens=output_tokens,
+        estimated_cost_usd=cost,
+    )
+
+
+def estimate_workflow(
+    template: WorkflowTemplate,
+    bindings: dict[str, Binding],
+    pricing: PricingCache,
+) -> WorkflowCostEstimate:
+    """Estimate the total cost of all phases in a workflow template."""
+    phases: list[PhaseCostEstimate] = []
+    for phase in template.phases:
+        binding = bindings[phase.role]
+        phases.append(estimate_phase(phase, binding.persona, pricing))
+    total = sum(p.estimated_cost_usd for p in phases)
+    return WorkflowCostEstimate(phases=phases, total_usd=total)
--- a/my-deepagent/src/my_deepagent/recovery.py
+++ b/my-deepagent/src/my_deepagent/recovery.py
@@ -0,0 +1,159 @@
+"""Crash recovery: sweep non-terminal runs at startup and mark them as failed.
+
+This v0.1.0 implementation is conservative — runs that were mid-flight at the previous
+process death are *not* resumed automatically. They are marked ``failed`` with a
+synthesized ``run.failed`` event so the active-run uniqueness slot is freed and the
+user can re-run if desired. Real Temporal-style resume is deferred to v0.2 or beyond.
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from datetime import UTC, datetime
+from uuid import UUID
+
+from sqlalchemy import func, select
+from sqlalchemy.dialects.sqlite import insert as sqlite_insert
+from sqlalchemy.ext.asyncio import AsyncSession
+
+from .enums import RunPhaseState, RunState
+from .persistence.db import Database
+from .persistence.models import RunEventRow, RunPhaseRow, RunRow
+from .run_event import RunEventType, run_idempotency_key
+
+_NON_TERMINAL_RUN_STATES: frozenset[str] = frozenset(
+    {
+        RunState.CREATED.value,
+        RunState.BOUND.value,
+        RunState.PLANNING.value,
+        RunState.AWAITING_APPROVAL.value,
+        RunState.EXECUTING.value,
+        RunState.PAUSED.value,
+    }
+)
+
+_NON_TERMINAL_PHASE_STATES: frozenset[str] = frozenset(
+    {
+        RunPhaseState.PENDING.value,
+        RunPhaseState.RUNNING.value,
+        RunPhaseState.AWAITING_ARTIFACT.value,
+        RunPhaseState.VALIDATING.value,
+        RunPhaseState.AWAITING_APPROVAL.value,
+    }
+)
+
+_FAILED_REASON = "process_restart_unrecovered"
+
+
+@dataclass(frozen=True)
+class SweepReport:
+    """Outcome of one recovery sweep."""
+
+    failed_runs: tuple[UUID, ...]
+    failed_phases: tuple[UUID, ...]
+
+    @property
+    def total(self) -> int:
+        return len(self.failed_runs) + len(self.failed_phases)
+
+
+async def sweep_orphan_runs(db: Database) -> SweepReport:
+    """Mark non-terminal runs/phases as ``failed`` and emit run.failed events.
+
+    Idempotent: rerunning when no orphans exist returns an empty SweepReport.
+    Relies on the UNIQUE (``run_id``, ``idempotency_key``) constraint on ``run_events`` so a
+    repeated sweep cannot insert a duplicate ``run.failed`` event for the same run.
+    """
+    failed_runs: list[UUID] = []
+    failed_phases: list[UUID] = []
+    now = _now_iso()
+
+    async with db.session() as s:
+        rows = (
+            (await s.execute(select(RunRow).where(RunRow.state.in_(_NON_TERMINAL_RUN_STATES))))
+            .scalars()
+            .all()
+        )
+
+        for run in rows:
+            run_uuid = UUID(run.id)
+            run.state = RunState.FAILED.value
+            run.ended_at = now
+            run.updated_at = now
+            run.final_report_path = None
+            failed_runs.append(run_uuid)
+
+            # Append a single synthesized run.failed event (idempotent).
+            await _append_event_idempotent(
+                s,
+                run_id=run.id,
+                event_type=RunEventType.RUN_FAILED,
+                payload={"reason": _FAILED_REASON},
+                extra_for_key={"reason": _FAILED_REASON},
+            )
+
+            # Cascade orphan phases.
+            phase_rows = (
+                (
+                    await s.execute(
+                        select(RunPhaseRow)
+                        .where(RunPhaseRow.run_id == run.id)
+                        .where(RunPhaseRow.state.in_(_NON_TERMINAL_PHASE_STATES))
+                    )
+                )
+                .scalars()
+                .all()
+            )
+
+            for ph in phase_rows:
+                ph.state = RunPhaseState.FAILED.value
+                ph.ended_at = now
+                failed_phases.append(UUID(ph.id))
+
+        await s.commit()
+
+    return SweepReport(
+        failed_runs=tuple(failed_runs),
+        failed_phases=tuple(failed_phases),
+    )
+
+
+async def _append_event_idempotent(
+    s: AsyncSession,
+    *,
+    run_id: str,
+    event_type: RunEventType,
+    payload: dict[str, object],
+    extra_for_key: dict[str, object] | None = None,
+) -> None:
+    """Append a run_events row using ON CONFLICT DO NOTHING on (run_id, idempotency_key)."""
+    extra = {k: str(v) for k, v in (extra_for_key or {}).items()}
+    key = run_idempotency_key(event_type, UUID(run_id), **extra)
+
+    # Compute next seq.
+    next_seq = (
+        await s.execute(
+            select(func.coalesce(func.max(RunEventRow.seq), 0) + 1).where(
+                RunEventRow.run_id == run_id
+            )
+        )
+    ).scalar_one()
+
+    stmt = (
+        sqlite_insert(RunEventRow)
+        .values(
+            run_id=run_id,
+            phase_id=None,
+            seq=int(next_seq),
+            type=event_type.value,
+            payload=payload,
+            idempotency_key=key,
+            ts=_now_iso(),
+        )
+        .on_conflict_do_nothing(index_elements=["run_id", "idempotency_key"])
+    )
+    await s.execute(stmt)
+
+
+def _now_iso() -> str:
+    return datetime.now(UTC).isoformat(timespec="seconds")
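Review note on `_append_event_idempotent`: the insert-or-skip behaviour can be reproduced with the standard-library `sqlite3` module. This is only a sketch under assumptions. The table below is a reduced, hypothetical stand-in for `run_events` (the real `RunEventRow` schema is not shown in this diff), and the composite UNIQUE constraint is inferred from the `index_elements` passed to `on_conflict_do_nothing`. It needs SQLite 3.24 or newer for the upsert syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced schema: only the columns needed to show the conflict rule.
conn.execute(
    """
    CREATE TABLE run_events (
        run_id TEXT NOT NULL,
        seq INTEGER NOT NULL,
        type TEXT NOT NULL,
        idempotency_key TEXT NOT NULL,
        UNIQUE (run_id, idempotency_key)
    )
    """
)


def append_event(run_id: str, event_type: str, key: str) -> None:
    # Next sequence number for this run: max(seq) + 1, or 1 when the run has no events.
    (next_seq,) = conn.execute(
        "SELECT COALESCE(MAX(seq), 0) + 1 FROM run_events WHERE run_id = ?",
        (run_id,),
    ).fetchone()
    # A second insert with the same (run_id, idempotency_key) is silently skipped.
    conn.execute(
        "INSERT INTO run_events (run_id, seq, type, idempotency_key) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT (run_id, idempotency_key) DO NOTHING",
        (run_id, next_seq, event_type, key),
    )


key = "run.failed:r1:reason=process_restart_unrecovered"
append_event("r1", "run.failed", key)
append_event("r1", "run.failed", key)  # duplicate sweep: no new row
rows = conn.execute("SELECT run_id, seq, type FROM run_events").fetchall()
```

After both calls `rows` holds a single event with `seq` 1, which is the property the sweep relies on when it is run more than once.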
--- a/my-deepagent/src/my_deepagent/run_event.py
+++ b/my-deepagent/src/my_deepagent/run_event.py
@@ -1 +1,39 @@
-"""Run event types for streaming progress. Implemented in Step 4."""
+"""Run event types + idempotency key generation."""
+
+from __future__ import annotations
+
+from enum import StrEnum
+from uuid import UUID
+
+
+class RunEventType(StrEnum):
+    RUN_CREATED = "run.created"
+    RUN_STARTED = "run.started"
+    RUN_PAUSED = "run.paused"
+    RUN_RESUMED = "run.resumed"
+    RUN_COMPLETED = "run.completed"
+    RUN_FAILED = "run.failed"
+    RUN_ABORTED = "run.aborted"
+    PHASE_STARTED = "phase.started"
+    PHASE_COMPLETED = "phase.completed"
+    PHASE_FAILED = "phase.failed"
+    PHASE_SKIPPED = "phase.skipped"
+    PROMPT_SENT = "prompt.sent"
+    PROMPT_REPAIRED = "prompt.repaired"
+    ARTIFACT_EXPECTED = "artifact.expected"
+    ARTIFACT_VALIDATED = "artifact.validated"
+    ARTIFACT_INVALID = "artifact.invalid"
+    ARTIFACT_TIMEOUT = "artifact.timeout"
+    APPROVAL_REQUESTED = "approval.requested"
+    APPROVAL_RESOLVED = "approval.resolved"
+
+
+def run_idempotency_key(event_type: RunEventType, run_id: UUID, **extra: object) -> str:
+    """Deterministic idempotency key per plan v2.0 §13.1.
+
+    Key format: "<event_type>:<run_id>[:<k>=<v>...]" with extra keys sorted ascending.
+    """
+    parts: list[str] = [event_type.value, str(run_id)]
+    for k in sorted(extra):
+        parts.append(f"{k}={extra[k]}")
+    return ":".join(parts)
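Review note on `run_idempotency_key`: the key format from the docstring can be checked with a small standalone sketch. It re-implements the same join logic locally so it runs without the package installed. `EventType` is a trimmed stand-in for `RunEventType`, and the `attempt` extra is a made-up example, not a field the codebase is known to use.

```python
from enum import Enum
from uuid import UUID


class EventType(str, Enum):
    # Trimmed stand-in for RunEventType; one member is enough for the demo.
    RUN_FAILED = "run.failed"


def idempotency_key(event_type: EventType, run_id: UUID, **extra: object) -> str:
    # Same shape as run_idempotency_key: "<type>:<run_id>[:k=v...]", extras sorted by key.
    parts = [event_type.value, str(run_id)]
    parts.extend(f"{k}={extra[k]}" for k in sorted(extra))
    return ":".join(parts)


run_id = UUID("12345678-1234-5678-1234-567812345678")
# Keyword order differs between the two calls, yet the keys are identical.
key_a = idempotency_key(EventType.RUN_FAILED, run_id, reason="restart", attempt=1)
key_b = idempotency_key(EventType.RUN_FAILED, run_id, attempt=1, reason="restart")
```

Because values are interpolated as-is, a value containing `:` or `=` would make the key ambiguous to split back apart, although it stays deterministic.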
--- a/my-deepagent/src/my_deepagent/secrets.py
+++ b/my-deepagent/src/my_deepagent/secrets.py
@@ -0,0 +1,28 @@
+"""Cross-cutting secret resolution. Tries config -> env -> keyring -> error."""
+
+from __future__ import annotations
+
+import os
+
+from .config import Config
+from .errors import MyDeepAgentError
+from .keys import get_api_key
+
+
+def resolve_openrouter_api_key(config: Config) -> str:
+    """Resolve the OpenRouter API key with priority: config -> env -> keyring -> error."""
+    if config.openrouter_api_key:
+        return config.openrouter_api_key
+    env_key = os.environ.get("MYDEEPAGENT_OPENROUTER_API_KEY") or os.environ.get(
+        "OPENROUTER_API_KEY"
+    )
+    if env_key:
+        return env_key
+    kr_key = get_api_key("openrouter")
+    if kr_key:
+        return kr_key
+    raise MyDeepAgentError.human_required(
+        "backend_auth_failed",
+        message="OpenRouter API key is not configured",
+        recovery_hint="run `mydeepagent login openrouter` to register one in the OS keyring",
+    )
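Review note on `resolve_openrouter_api_key`: the config, env, keyring order is easier to exercise when the three sources are injected. The sketch below is an illustration, not the module's API. `resolve_key`, its parameters, and the `LookupError` are hypothetical stand-ins for `Config`, `os.environ`, `get_api_key`, and `MyDeepAgentError`.

```python
from collections.abc import Callable, Mapping
from typing import Optional


def resolve_key(
    config_value: Optional[str],
    env: Mapping[str, str],
    keyring_lookup: Callable[[str], Optional[str]],
) -> str:
    # 1. An explicit config value wins.
    if config_value:
        return config_value
    # 2. Environment: the prefixed name is tried first; an empty string falls through.
    env_key = env.get("MYDEEPAGENT_OPENROUTER_API_KEY") or env.get("OPENROUTER_API_KEY")
    if env_key:
        return env_key
    # 3. OS keyring, represented here by an injected callable.
    kr_key = keyring_lookup("openrouter")
    if kr_key:
        return kr_key
    # Stand-in for MyDeepAgentError.human_required(...).
    raise LookupError("OpenRouter API key is not configured")


from_config = resolve_key("cfg", {"OPENROUTER_API_KEY": "env"}, lambda _name: "kr")
from_env = resolve_key(
    None,
    {"MYDEEPAGENT_OPENROUTER_API_KEY": "", "OPENROUTER_API_KEY": "env"},
    lambda _name: "kr",
)
from_keyring = resolve_key(None, {}, lambda _name: "kr")
```

Injecting the sources keeps the priority order testable without touching the real environment or keyring.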
--- a/my-deepagent/src/my_deepagent/session.py
+++ b/my-deepagent/src/my_deepagent/session.py
@@ -11,7 +11,6 @@ Connects:

 from __future__ import annotations

-import os
 from pathlib import Path
 from typing import Any, Literal
 from uuid import UUID
@@ -28,6 +27,7 @@ from langchain_openai import ChatOpenAI
 from .config import Config
 from .errors import MyDeepAgentError
 from .persona import FilesystemPermissionSpec, Persona, PersonaSubagent
+from .secrets import resolve_openrouter_api_key as _resolve_openrouter_api_key_impl

 DEFAULT_DENY_PATHS: tuple[str, ...] = (
    "/.env*",
@@ -125,24 +125,13 @@ def _subagent_to_dict(sub: PersonaSubagent) -> SubAgent:


 def _resolve_openrouter_api_key(config: Config) -> str:
-    """Pull the OpenRouter API key from config -> env -> error.
+    """Pull the OpenRouter API key from config -> env -> keyring -> error.

-    Priority: config.openrouter_api_key -> MYDEEPAGENT_OPENROUTER_API_KEY -> OPENROUTER_API_KEY.
+    Delegates to secrets.resolve_openrouter_api_key for full priority chain.
+    Priority: config.openrouter_api_key -> MYDEEPAGENT_OPENROUTER_API_KEY ->
+              OPENROUTER_API_KEY -> OS keyring -> error.
    """
-    if config.openrouter_api_key:
-        return config.openrouter_api_key
-    env_key = os.environ.get("MYDEEPAGENT_OPENROUTER_API_KEY") or os.environ.get(
-        "OPENROUTER_API_KEY"
-    )
-    if env_key:
-        return env_key
-    raise MyDeepAgentError.human_required(
-        "backend_auth_failed",
-        message="OpenRouter API key is not configured",
-        recovery_hint=(
-            "set MYDEEPAGENT_OPENROUTER_API_KEY in .env or run `mydeepagent login openrouter`"
-        ),
-    )
+    return _resolve_openrouter_api_key_impl(config)


 def resolve_model_instance(
@@ -258,7 +247,19 @@ def build_agent(
        ]
        kwargs["permissions"] = permissions

-    if persona.allowed_tools:
+    # deepagents 0.6.x: passing `tools` as a string list to create_deep_agent() triggers
+    # SubAgentMiddleware._get_subagents() → langchain create_agent() → ToolNode, which
+    # iterates the LocalShellBackend tools. Some of those tools are raw async functions
+    # (not StructuredTool instances), causing:
+    #   AttributeError: 'function' object has no attribute 'name'
+    # Workaround: skip `tools` kwarg for local_shell backend. deepagents exposes all
+    # backend-default tools (read_file, write_file, glob, grep, ls, execute, write_todos)
+    # to the LLM by default; SafetyShellMiddleware enforces path safety and blocks
+    # destructive-command execution regardless of which tools the LLM attempts to call.
+    # For non-local_shell backends (state, filesystem, composite), `tools` is passed
+    # through normally since those backends return proper StructuredTool objects.
+    use_tools_kwarg = persona.deepagents_backend != "local_shell"
+    if use_tools_kwarg and persona.allowed_tools:
        kwargs["tools"] = list(persona.allowed_tools)
    if subagents:
        kwargs["subagents"] = subagents
--- a/my-deepagent/src/my_deepagent/slash.py
+++ b/my-deepagent/src/my_deepagent/slash.py
@@ -1 +1,61 @@
-"""Slash command registry and dispatcher. Implemented in Step 10."""
+"""Parse and dispatch slash commands inside the interactive REPL.
+
+Slash commands are recognized by a leading '/'; everything else is forwarded to the agent.
+"""
+
+from __future__ import annotations
+
+from collections.abc import Awaitable, Callable
+from dataclasses import dataclass
+
+
+@dataclass(frozen=True)
+class SlashParsed:
+    """A parsed slash command. ``raw`` is the full stripped text after the slash."""
+
+    name: str
+    args: tuple[str, ...]
+    raw: str
+
+
+def parse_slash(line: str) -> SlashParsed | None:
+    """Return a SlashParsed if ``line`` starts with '/', else None."""
+    if not line.startswith("/"):
+        return None
+    body = line[1:].strip()
+    if not body:
+        return SlashParsed(name="", args=(), raw="")
+    parts = body.split()
+    return SlashParsed(name=parts[0].lower(), args=tuple(parts[1:]), raw=body)
+
+
+SlashHandler = Callable[[SlashParsed], Awaitable[bool]]
+"""A handler returns False to keep the REPL alive, True to exit it."""
+
+
+class SlashRegistry:
+    """Map slash command names to async handlers."""
+
+    def __init__(self) -> None:
+        self._handlers: dict[str, SlashHandler] = {}
+        self._help: dict[str, str] = {}
+
+    def register(self, name: str, handler: SlashHandler, *, help: str = "") -> None:
+        self._handlers[name.lower()] = handler
+        if help:
+            self._help[name.lower()] = help
+
+    async def dispatch(self, cmd: SlashParsed) -> bool:
+        if cmd.name in self._handlers:
+            return await self._handlers[cmd.name](cmd)
+        return False  # unknown → caller decides
+
+    @property
+    def names(self) -> list[str]:
+        return sorted(self._handlers)
+
+    def help_for(self, name: str) -> str:
+        return self._help.get(name.lower(), "")
+
+    def all_help(self) -> list[tuple[str, str]]:
+        return [(n, self._help.get(n, "")) for n in self.names]
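Review note on `slash.py`: the parse-then-dispatch flow, including the convention that a handler returns `True` to exit the REPL, can be sketched in a condensed, self-contained form. `parse`, `dispatch`, and the two handlers are simplified stand-ins for `parse_slash`, `SlashRegistry.dispatch`, and real REPL handlers, and `some-model` is a placeholder argument.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Optional

Handler = Callable[[tuple[str, ...]], Awaitable[bool]]  # True means "exit the REPL"


def parse(line: str) -> Optional[tuple[str, tuple[str, ...]]]:
    # Condensed stand-in for parse_slash: None for non-slash input, else (name, args).
    if not line.startswith("/"):
        return None
    parts = line[1:].split()
    if not parts:
        return ("", ())
    return (parts[0].lower(), tuple(parts[1:]))


selected: list[tuple[str, ...]] = []


async def model_handler(args: tuple[str, ...]) -> bool:
    selected.append(args)
    return False  # keep the REPL alive


async def quit_handler(args: tuple[str, ...]) -> bool:
    return True  # ask the REPL to exit


handlers: dict[str, Handler] = {"model": model_handler, "quit": quit_handler}


async def dispatch(line: str) -> bool:
    parsed = parse(line)
    if parsed is None:
        return False  # plain text; a real REPL would forward it to the agent
    name, args = parsed
    handler = handlers.get(name)
    # Unknown commands also return False, so the caller cannot tell them from handled ones.
    return await handler(args) if handler else False


results = [asyncio.run(dispatch(text)) for text in ("/MODEL some-model  fast", "hello", "/quit")]
```

Command names are lowercased and runs of whitespace collapse, so `/MODEL some-model  fast` reaches the `model` handler with two arguments.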
--- a/my-deepagent/src/my_deepagent/tui/approval.py
+++ b/my-deepagent/src/my_deepagent/tui/approval.py
@@ -1 +1,53 @@
-"""TUI approval dialog for human-in-the-loop actions. Implemented in Step 7."""
+"""TUI approval prompt: display phase result and ask for approve/reject/request_changes/abort."""
+
+from __future__ import annotations
+
+import typer
+from rich.console import Console
+
+from ..enums import ApprovalDecisionAction
+
+_CONSOLE = Console()
+
+_CHOICE_MAP: dict[str, ApprovalDecisionAction] = {
+    "approve": ApprovalDecisionAction.APPROVE,
+    "a": ApprovalDecisionAction.APPROVE,
+    "reject": ApprovalDecisionAction.REJECT,
+    "r": ApprovalDecisionAction.REJECT,
+    "request_changes": ApprovalDecisionAction.REQUEST_CHANGES,
+    "c": ApprovalDecisionAction.REQUEST_CHANGES,
+    "abort": ApprovalDecisionAction.ABORT,
+    "x": ApprovalDecisionAction.ABORT,
+}
+
+
+async def cli_approval_callback(
+    payload: dict[str, object],
+    gates: list[str],
+) -> ApprovalDecisionAction:
+    """Display the phase result and prompt the user for an approval decision.
+
+    Valid inputs (case-insensitive):
+      approve / a   → APPROVE
+      reject  / r   → REJECT
+      request_changes / c → REQUEST_CHANGES
+      abort   / x   → ABORT
+
+    Empty input takes the prompt default (approve); unrecognised input maps to REJECT.
+    """
+    _CONSOLE.print()
+    _CONSOLE.print(f"[bold cyan]Approval required[/] — gates: {', '.join(gates) or '(none)'}")
+    _CONSOLE.print(f"  phase: {payload.get('phase_key')}")
+    _CONSOLE.print(f"  artifact: {payload.get('artifact_path')}")
+    _CONSOLE.print()
+
+    raw = (
+        typer.prompt(
+            "Decision [approve / reject / request_changes / abort]",
+            default="approve",
+        )
+        .strip()
+        .lower()
+    )
+
+    return _CHOICE_MAP.get(raw, ApprovalDecisionAction.REJECT)