feat(my-deepagent): v0.1.0 Step 0~5 — scaffolding through deepagent + OpenRouter

Python rewrite of the agent harness on top of deepagents 0.6.1 + langchain 1.x, replacing the abandoned TS attempt in packages/. 388 unit/integration tests pass.

Steps
-----
0. Scaffolding: uv workspace, ruff/mypy/pre-commit/alembic, src/tests/docs trees with docs/schemas/ seeded from my-deepagent-seed/.
1. Core: config (pydantic-settings with MYDEEPAGENT_ env prefix and TOML source), enums (Backend, Capability, RiskLevel, ApprovalDecisionAction, ApprovalState, RunState, RunPhaseState, SessionState, ErrorClass), errors (MyDeepAgentError + BudgetExhaustedError with PEP-3134 cause + context suppression), hash (canonical JSON + sha256).
2. Persona/Workflow/Binding: pydantic v2 schemas with tuple-based deep immutability (post-construction hash drift prevented), YAML loaders, deterministic auto-select (preferred_backends → version → name → hash), override resolution with ineligibility diagnostics, PersonaConsentStore with fcntl.flock + tmp+fsync+rename atomic write.
3. Artifact schema registry: Draft202012Validator, multi-root resolution, structured ValidationFinding output.
4. Persistence: 18 SQLAlchemy 2.0 async ORM models with FK CASCADE/RESTRICT, WAL + busy_timeout + foreign_keys PRAGMA, alembic baseline + ux_active_run_repo_base partial unique index, LangGraph SqliteSaver as context manager only (lifecycle safety).
5. DeepAgent session: build_agent wires Persona → create_deep_agent with LocalShellBackend / FilesystemBackend / StateBackend / CompositeBackend, ChatOpenAI(base_url=openrouter) for openrouter: model strings, and 4 middleware classes (cost / audit-tool / safety-shell / fallback-model).

Critical workarounds
--------------------
- deepagents 0.6.1 rejects FilesystemPermission together with backends that implement SandboxBackendProtocol (LocalShellBackend). SafetyShellMiddleware enforces destructive-command and secret-path policy at the tool layer instead, and build_agent strips the permissions kwarg when the persona's deepagents_backend is local_shell.
- FilesystemOperation in deepagents is Literal['read', 'write'] only; _map_operations collapses our richer schema (read/write/edit/ls) safely.

Real OpenRouter smoke
---------------------
test_openrouter_deepagents_local_shell_smoke calls DeepSeek via deepagents + LocalShellBackend + SafetyShellMiddleware end-to-end. PASS, ~$0.000001 cost, input=9 / output=1 tokens with content "OK".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
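
Illustrative note on the "hash (canonical JSON + sha256)" item in Step 1: the sketch below shows one common way to build such a hash using only the standard library. It is not the code from this commit; the helper name canonical_json and the exact serialization settings (sorted keys, compact separators, UTF-8) are assumptions chosen for the example.

```python
import hashlib
import json
from typing import Any


def canonical_json(value: Any) -> str:
    """Serialize to a stable string (hypothetical helper, not from the commit).

    Sorting keys and fixing the separators removes the two usual sources of
    drift: dict insertion order and incidental whitespace.
    """
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def compute_hash(value: Any) -> str:
    """Return the sha256 hex digest of the canonical JSON form."""
    return hashlib.sha256(canonical_json(value).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    a = {"name": "reviewer", "version": 1, "backends": ("local_shell", "state")}
    b = {"backends": ["local_shell", "state"], "version": 1, "name": "reviewer"}
    # Key order and tuple-vs-list do not affect the digest, because json.dumps
    # sorts keys and writes tuples as JSON arrays.
    print(compute_hash(a) == compute_hash(b))
```

Under these assumptions, storing list-valued fields as tuples (Step 2) does not change the digest, since tuples and lists serialize to the same JSON array.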
2026-05-15 19:40:02 +09:00
parent 1fe59d16ca
commit 17ba5d723b
100 changed files with 12408 additions and 0 deletions
--- a/my-deepagent/.env.example
+++ b/my-deepagent/.env.example
@@ -0,0 +1,6 @@
+MYDEEPAGENT_OPENROUTER_API_KEY=
+# MYDEEPAGENT_LANGSMITH_TRACING=true
+# MYDEEPAGENT_LANGSMITH_API_KEY=
+# MYDEEPAGENT_LANGSMITH_PROJECT=my-deepagent
+# MYDEEPAGENT_DATA_DIR=
+# MYDEEPAGENT_LANG=ko
--- a/my-deepagent/.gitignore
+++ b/my-deepagent/.gitignore
@@ -0,0 +1,17 @@
+__pycache__/
+*.py[cod]
+*.egg-info/
+.venv/
+.pytest_cache/
+.mypy_cache/
+.ruff_cache/
+
+.env
+.env.local
+
+*.db
+*.db-journal
+*.db-wal
+*.db-shm
+
+.DS_Store
--- a/my-deepagent/.pre-commit-config.yaml
+++ b/my-deepagent/.pre-commit-config.yaml
@@ -0,0 +1,19 @@
+repos:
+  - repo: local
+    hooks:
+      - id: ruff
+        name: ruff check
+        entry: uv run ruff check --fix
+        language: system
+        types: [python]
+      - id: ruff-format
+        name: ruff format
+        entry: uv run ruff format
+        language: system
+        types: [python]
+      - id: mypy
+        name: mypy
+        entry: uv run mypy --strict src
+        language: system
+        types: [python]
+        pass_filenames: false
--- a/my-deepagent/.python-version
+++ b/my-deepagent/.python-version
@@ -0,0 +1 @@
+3.12
--- a/my-deepagent/CHANGELOG.md
+++ b/my-deepagent/CHANGELOG.md
@@ -0,0 +1,26 @@
+# Changelog
+
+## [Unreleased]
+
+### Added
+- persistence/models.py (P0-1): partial unique index `ux_active_run_repo_base` on `runs(repo_path, base_branch) WHERE state NOT IN ('completed','failed','aborted')` — prevents duplicate active runs per repo/branch
+- persistence/models.py (P0-3): FK constraints added to `RunRow.template_id` (RESTRICT), `RunBindingRow.persona_id` (RESTRICT), `InteractiveSessionRow.persona_id` (RESTRICT), `RunEventRow.phase_id` (CASCADE), `ApprovalRequestRow.phase_id` (CASCADE), `ArtifactRow.phase_id` (CASCADE), `ToolCallRow.run_id/phase_id/interactive_session_id` (CASCADE), `LlmCallRow.run_id/phase_id/interactive_session_id` (CASCADE), `PhaseFeedbackRow.run_id/phase_id` (CASCADE)
+- alembic/versions/839f2233e346: new migration adding partial unique index and all FK constraints above; uses SQLite table-rebuild pattern with PRAGMA foreign_keys=OFF/ON guard
+- persistence/checkpointer.py (P0-4): removed `get_checkpointer` (leaking connection helper); only `get_checkpointer_ctx` context manager is now exported
+- tests/integration/test_checkpointer.py: 5 tests for checkpointer ctx lifecycle (file creation, parent dir, connection cleanup, lock-free concurrent use)
+- tests/integration/test_persistence.py: 7 new P0 verification tests (active-run partial index blocks/allows, cascade-delete of phase_feedback+run_phases, RESTRICT on template delete, index exists in sqlite_master)
+- tests/unit/test_session.py: full rewrite to deepagents dataclass API — FilesystemPermission attribute access (.mode/.paths/.operations), build_backend type dispatch (5 cases), _map_operations deduplication (8 cases), _spec_to_permission mapping, updated _subagent_to_dict and _resolve_openrouter_api_key tests; 47 unit tests total
+- tests/integration/test_openrouter_smoke.py: real OpenRouter/DeepSeek smoke test (3 tests, ~$0.001-$0.003/run, max_tokens=50); skipped automatically when no API key is configured; validates ChatOpenAI response, usage_metadata tokens, and deepagents CompiledStateGraph end-to-end
+- pyproject.toml: registered `integration` pytest marker to silence --strict-markers error
+- v0.1.0 scaffolding (Step 0): src/tests/docs trees, ruff/mypy/pre-commit/alembic config
+- Seed assets copied to docs/schemas/ (personas/workflows/artifacts validated)
+- Core module (Step 1): config, enums, errors, hash + unit tests
+- Persona / Workflow / Binding module (Step 2): pydantic schemas, YAML loaders, deterministic auto-select, override, consent store with atomic write
+- Step 1 review patches (P0/P1): exception chain context suppression, classmethod LSP fix, workspace_root realpath canonicalization, config_invalid error mapping
+
+### Changed
+- deepagents 0.6.1 LocalShellBackend + permissions conflict workaround: removed `permissions` block from all 10 seed personas; `SafetyShellMiddleware` now enforces destructive-command + secret-path policy at the tool layer for local_shell backend agents.
+- `build_agent` automatically prepends `SafetyShellMiddleware` to every agent and skips `permissions` kwarg when `deepagents_backend == "local_shell"`.
+- `SafetyShellMiddleware` extended with secret-path enforcement: `read_file`/`write_file`/`edit_file`/`ls` tool calls are blocked when `file_path`/`path` matches any `DENY_PATH_PATTERNS` glob (wcmatch GLOBSTAR|IGNORECASE|DOTGLOB).
+- All env vars require `MYDEEPAGENT_` prefix (e.g. `MYDEEPAGENT_OPENROUTER_API_KEY`, `MYDEEPAGENT_BUDGET_DAILY_USD`). `.env.example` updated accordingly. This isolates my-deepagent's env namespace from other tools.
+- Persona / Workflow / FilesystemPermission models now store list-valued fields as tuples (deep immutability — prevents post-construction mutation that would invalidate compute_hash()).
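
Illustrative note on the `ux_active_run_repo_base` entry above: the sketch below reproduces the behaviour of a partial unique index with the standard-library `sqlite3` module. It is a reduced stand-in, not the project's schema: the four-column `runs` table, the helpers `make_db` and `try_insert`, and the state names `running` and `queued` are assumptions; only the index name, its columns, and the three terminal states come from the changelog.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a cut-down runs table (assumed shape)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE runs ("
        "id TEXT PRIMARY KEY, repo_path TEXT NOT NULL, "
        "base_branch TEXT NOT NULL, state TEXT NOT NULL)"
    )
    # Uniqueness is enforced only for rows matching the WHERE clause, so
    # finished runs never collide with new ones.
    conn.execute(
        "CREATE UNIQUE INDEX ux_active_run_repo_base "
        "ON runs (repo_path, base_branch) "
        "WHERE state NOT IN ('completed', 'failed', 'aborted')"
    )
    return conn


def try_insert(
    conn: sqlite3.Connection, run_id: str, repo_path: str, base_branch: str, state: str
) -> bool:
    """Insert a run; return False if the partial unique index rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO runs VALUES (?, ?, ?, ?)",
                (run_id, repo_path, base_branch, state),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = make_db()
    print(try_insert(db, "r1", "/repo", "main", "running"))    # first active run
    print(try_insert(db, "r2", "/repo", "main", "queued"))     # second active run, same repo/branch
    print(try_insert(db, "r3", "/repo", "main", "completed"))  # terminal state is outside the index
```

In this sketch the second insert is rejected while the third succeeds, which is the "blocks/allows" behaviour the P0 verification tests are described as checking.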
--- a/my-deepagent/alembic.ini
+++ b/my-deepagent/alembic.ini
@@ -0,0 +1,149 @@
+# A generic, single database configuration.
+
+[alembic]
+# path to migration scripts.
+# this is typically a path given in POSIX (e.g. forward slashes)
+# format, relative to the token %(here)s which refers to the location of this
+# ini file
+script_location = %(here)s/alembic
+
+# template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
+# Uncomment the line below if you want the files to be prepended with date and time
+# see https://alembic.sqlalchemy.org/en/latest/tutorial.html#editing-the-ini-file
+# for all available tokens
+# file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s
+# Or organize into date-based subdirectories (requires recursive_version_locations = true)
+# file_template = %%(year)d/%%(month).2d/%%(day).2d_%%(hour).2d%%(minute).2d_%%(second).2d_%%(rev)s_%%(slug)s
+
+# sys.path path, will be prepended to sys.path if present.
+# defaults to the current working directory.  for multiple paths, the path separator
+# is defined by "path_separator" below.
+prepend_sys_path = .
+
+
+# timezone to use when rendering the date within the migration file
+# as well as the filename.
+# If specified, requires the tzdata library which can be installed by adding
+# `alembic[tz]` to the pip requirements.
+# string value is passed to ZoneInfo()
+# leave blank for localtime
+# timezone =
+
+# max length of characters to apply to the "slug" field
+# truncate_slug_length = 40
+
+# set to 'true' to run the environment during
+# the 'revision' command, regardless of autogenerate
+# revision_environment = false
+
+# set to 'true' to allow .pyc and .pyo files without
+# a source .py file to be detected as revisions in the
+# versions/ directory
+# sourceless = false
+
+# version location specification; This defaults
+# to <script_location>/versions.  When using multiple version
+# directories, initial revisions must be specified with --version-path.
+# The path separator used here should be the separator specified by "path_separator"
+# below.
+# version_locations = %(here)s/bar:%(here)s/bat:%(here)s/alembic/versions
+
+# path_separator; This indicates what character is used to split lists of file
+# paths, including version_locations and prepend_sys_path within configparser
+# files such as alembic.ini.
+# The default rendered in new alembic.ini files is "os", which uses os.pathsep
+# to provide os-dependent path splitting.
+#
+# Note that in order to support legacy alembic.ini files, this default does NOT
+# take place if path_separator is not present in alembic.ini.  If this
+# option is omitted entirely, fallback logic is as follows:
+#
+# 1. Parsing of the version_locations option falls back to using the legacy
+#    "version_path_separator" key, which if absent then falls back to the legacy
+#    behavior of splitting on spaces and/or commas.
+# 2. Parsing of the prepend_sys_path option falls back to the legacy
+#    behavior of splitting on spaces, commas, or colons.
+#
+# Valid values for path_separator are:
+#
+# path_separator = :
+# path_separator = ;
+# path_separator = space
+# path_separator = newline
+#
+# Use os.pathsep. Default configuration used for new projects.
+path_separator = os
+
+# set to 'true' to search source files recursively
+# in each "version_locations" directory
+# new in Alembic version 1.10
+# recursive_version_locations = false
+
+# the output encoding used when revision files
+# are written from script.py.mako
+# output_encoding = utf-8
+
+# database URL.  This is consumed by the user-maintained env.py script only.
+# other means of configuring database URLs may be customized within the env.py
+# file.
+sqlalchemy.url = driver://user:pass@localhost/dbname
+
+
+[post_write_hooks]
+# post_write_hooks defines scripts or Python functions that are run
+# on newly generated revision scripts.  See the documentation for further
+# detail and examples
+
+# format using "black" - use the console_scripts runner, against the "black" entrypoint
+# hooks = black
+# black.type = console_scripts
+# black.entrypoint = black
+# black.options = -l 79 REVISION_SCRIPT_FILENAME
+
+# lint with attempts to fix using "ruff" - use the module runner, against the "ruff" module
+# hooks = ruff
+# ruff.type = module
+# ruff.module = ruff
+# ruff.options = check --fix REVISION_SCRIPT_FILENAME
+
+# Alternatively, use the exec runner to execute a binary found on your PATH
+# hooks = ruff
+# ruff.type = exec
+# ruff.executable = ruff
+# ruff.options = check --fix REVISION_SCRIPT_FILENAME
+
+# Logging configuration.  This is also consumed by the user-maintained
+# env.py script only.
+[loggers]
+keys = root,sqlalchemy,alembic
+
+[handlers]
+keys = console
+
+[formatters]
+keys = generic
+
+[logger_root]
+level = WARNING
+handlers = console
+qualname =
+
+[logger_sqlalchemy]
+level = WARNING
+handlers =
+qualname = sqlalchemy.engine
+
+[logger_alembic]
+level = INFO
+handlers =
+qualname = alembic
+
+[handler_console]
+class = StreamHandler
+args = (sys.stderr,)
+level = NOTSET
+formatter = generic
+
+[formatter_generic]
+format = %(levelname)-5.5s [%(name)s] %(message)s
+datefmt = %H:%M:%S
--- a/my-deepagent/alembic/README
+++ b/my-deepagent/alembic/README
@@ -0,0 +1 @@
+Generic single-database configuration.
--- a/my-deepagent/alembic/env.py
+++ b/my-deepagent/alembic/env.py
@@ -0,0 +1,83 @@
+import os
+from logging.config import fileConfig
+
+from sqlalchemy import engine_from_config, pool
+
+from alembic import context
+
+# this is the Alembic Config object, which provides
+# access to the values within the .ini file in use.
+config = context.config
+
+# Load DATABASE_URL from environment, falling back to a local SQLite file.
+# Alembic uses synchronous SQLAlchemy, so strip the async driver prefix when
+# present (sqlite+aiosqlite:// → sqlite://).
+_raw_url: str = os.environ.get("DATABASE_URL", "sqlite:///./database.sqlite3")
+_sync_url: str = _raw_url.replace("sqlite+aiosqlite://", "sqlite://")
+config.set_main_option("sqlalchemy.url", _sync_url)
+
+# Interpret the config file for Python logging.
+# This line sets up loggers basically.
+if config.config_file_name is not None:
+    fileConfig(config.config_file_name)
+
+# add your model's MetaData object here
+# for 'autogenerate' support
+from my_deepagent.persistence.models import Base  # noqa: E402
+
+target_metadata = Base.metadata
+
+# other values from the config, defined by the needs of env.py,
+# can be acquired:
+# my_important_option = config.get_main_option("my_important_option")
+# ... etc.
+
+
+def run_migrations_offline() -> None:
+    """Run migrations in 'offline' mode.
+
+    This configures the context with just a URL
+    and not an Engine, though an Engine is acceptable
+    here as well.  By skipping the Engine creation
+    we don't even need a DBAPI to be available.
+
+    Calls to context.execute() here emit the given string to the
+    script output.
+
+    """
+    url = config.get_main_option("sqlalchemy.url")
+    context.configure(
+        url=url,
+        target_metadata=target_metadata,
+        literal_binds=True,
+        dialect_opts={"paramstyle": "named"},
+    )
+
+    with context.begin_transaction():
+        context.run_migrations()
+
+
+def run_migrations_online() -> None:
+    """Run migrations in 'online' mode.
+
+    In this scenario we need to create an Engine
+    and associate a connection with the context.
+
+    """
+    connectable = engine_from_config(
+        config.get_section(config.config_ini_section, {}),
+        prefix="sqlalchemy.",
+        poolclass=pool.NullPool,
+    )
+
+    with connectable.connect() as connection:
+        context.configure(connection=connection, target_metadata=target_metadata)
+
+        with context.begin_transaction():
+            context.run_migrations()
+
+
+if context.is_offline_mode():
+    run_migrations_offline()
+else:
+    run_migrations_online()
--- a/my-deepagent/alembic/script.py.mako
+++ b/my-deepagent/alembic/script.py.mako
@@ -0,0 +1,28 @@
+"""${message}
+
+Revision ID: ${up_revision}
+Revises: ${down_revision | comma,n}
+Create Date: ${create_date}
+
+"""
+from typing import Sequence, Union
+
+from alembic import op
+import sqlalchemy as sa
+${imports if imports else ""}
+
+# revision identifiers, used by Alembic.
+revision: str = ${repr(up_revision)}
+down_revision: Union[str, Sequence[str], None] = ${repr(down_revision)}
+branch_labels: Union[str, Sequence[str], None] = ${repr(branch_labels)}
+depends_on: Union[str, Sequence[str], None] = ${repr(depends_on)}
+
+
+def upgrade() -> None:
+    """Upgrade schema."""
+    ${upgrades if upgrades else "pass"}
+
+
+def downgrade() -> None:
+    """Downgrade schema."""
+    ${downgrades if downgrades else "pass"}
--- a/my-deepagent/alembic/versions/79945fdc2649_baseline_schema_for_v0_1_0.py
+++ b/my-deepagent/alembic/versions/79945fdc2649_baseline_schema_for_v0_1_0.py
@@ -0,0 +1,303 @@
+"""baseline schema for v0.1.0
+
+Revision ID: 79945fdc2649
+Revises:
+Create Date: 2026-05-15 17:19:09.577439
+
+"""
+
+from collections.abc import Sequence
+
+import sqlalchemy as sa
+
+from alembic import op
+
+# revision identifiers, used by Alembic.
+revision: str = "79945fdc2649"
+down_revision: str | Sequence[str] | None = None
+branch_labels: str | Sequence[str] | None = None
+depends_on: str | Sequence[str] | None = None
+
+
+def upgrade() -> None:
+    """Upgrade schema."""
+    # ### commands auto generated by Alembic - please adjust! ###
+    op.create_table(
+        "agent_personas",
+        sa.Column("id", sa.String(length=36), nullable=False),
+        sa.Column("name", sa.Text(), nullable=False),
+        sa.Column("version", sa.Integer(), nullable=False),
+        sa.Column("hash", sa.Text(), nullable=False),
+        sa.Column("definition", sa.JSON(), nullable=False),
+        sa.Column("created_at", sa.Text(), nullable=False),
+        sa.PrimaryKeyConstraint("id"),
+        sa.UniqueConstraint("hash"),
+    )
+    op.create_table(
+        "budget_ledger",
+        sa.Column("scope", sa.Text(), nullable=False),
+        sa.Column("spent_usd", sa.Float(), nullable=False),
+        sa.Column("cap_usd", sa.Float(), nullable=True),
+        sa.Column("last_updated", sa.Text(), nullable=False),
+        sa.PrimaryKeyConstraint("scope"),
+    )
+    op.create_table(
+        "interactive_sessions",
+        sa.Column("id", sa.String(length=36), nullable=False),
+        sa.Column("persona_id", sa.String(length=36), nullable=False),
+        sa.Column("persona_hash", sa.Text(), nullable=False),
+        sa.Column("started_at", sa.Text(), nullable=True),
+        sa.Column("ended_at", sa.Text(), nullable=True),
+        sa.Column("last_message_at", sa.Text(), nullable=True),
+        sa.Column("state", sa.Text(), nullable=False),
+        sa.PrimaryKeyConstraint("id"),
+    )
+    op.create_table(
+        "llm_calls",
+        sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
+        sa.Column("run_id", sa.String(length=36), nullable=True),
+        sa.Column("phase_id", sa.String(length=36), nullable=True),
+        sa.Column("interactive_session_id", sa.String(length=36), nullable=True),
+        sa.Column("thread_id", sa.Text(), nullable=False),
+        sa.Column("persona_name", sa.Text(), nullable=False),
+        sa.Column("persona_version", sa.Integer(), nullable=False),
+        sa.Column("model", sa.Text(), nullable=False),
+        sa.Column("role", sa.Text(), nullable=False),
+        sa.Column("turn_index", sa.Integer(), nullable=False),
+        sa.Column("input_tokens", sa.Integer(), nullable=False),
+        sa.Column("output_tokens", sa.Integer(), nullable=False),
+        sa.Column("cached_tokens", sa.Integer(), nullable=False),
+        sa.Column("reasoning_tokens", sa.Integer(), nullable=False),
+        sa.Column("cost_usd_input", sa.Float(), nullable=False),
+        sa.Column("cost_usd_output", sa.Float(), nullable=False),
+        sa.Column("cost_usd_total", sa.Float(), nullable=False),
+        sa.Column("latency_ms", sa.Integer(), nullable=False),
+        sa.Column("status", sa.Text(), nullable=False),
+        sa.Column("error_code", sa.Text(), nullable=True),
+        sa.Column("request_id", sa.Text(), nullable=True),
+        sa.Column("ts", sa.Text(), nullable=False),
+        sa.PrimaryKeyConstraint("id"),
+    )
+    op.create_index(
+        "llm_calls_interactive_session_id_ts_idx",
+        "llm_calls",
+        ["interactive_session_id", "ts"],
+        unique=False,
+    )
+    op.create_index("llm_calls_model_ts_idx", "llm_calls", ["model", "ts"], unique=False)
+    op.create_index("llm_calls_run_id_ts_idx", "llm_calls", ["run_id", "ts"], unique=False)
+    op.create_table(
+        "model_pricing",
+        sa.Column("model", sa.Text(), nullable=False),
+        sa.Column("input_per_1k_usd", sa.Float(), nullable=False),
+        sa.Column("output_per_1k_usd", sa.Float(), nullable=False),
+        sa.Column("context_length", sa.Integer(), nullable=False),
+        sa.Column("fetched_at", sa.Text(), nullable=False),
+        sa.Column("raw_payload", sa.Text(), nullable=False),
+        sa.PrimaryKeyConstraint("model"),
+    )
+    op.create_table(
+        "persona_consents",
+        sa.Column("persona_hash", sa.Text(), nullable=False),
+        sa.Column("persona_name", sa.Text(), nullable=False),
+        sa.Column("persona_version", sa.Integer(), nullable=False),
+        sa.Column("decision", sa.Text(), nullable=False),
+        sa.Column("decided_at", sa.Text(), nullable=False),
+        sa.PrimaryKeyConstraint("persona_hash"),
+    )
+    op.create_table(
+        "phase_feedback",
+        sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
+        sa.Column("run_id", sa.String(length=36), nullable=False),
+        sa.Column("phase_id", sa.String(length=36), nullable=False),
+        sa.Column("reaction", sa.Text(), nullable=True),
+        sa.Column("comment", sa.Text(), nullable=True),
+        sa.Column("created_at", sa.Text(), nullable=False),
+        sa.PrimaryKeyConstraint("id"),
+    )
+    op.create_table(
+        "runs",
+        sa.Column("id", sa.String(length=36), nullable=False),
+        sa.Column("template_id", sa.String(length=36), nullable=False),
+        sa.Column("template_hash", sa.Text(), nullable=False),
+        sa.Column("state", sa.Text(), nullable=False),
+        sa.Column("repo_path", sa.Text(), nullable=False),
+        sa.Column("base_branch", sa.Text(), nullable=False),
+        sa.Column("worktree_root", sa.Text(), nullable=False),
+        sa.Column("current_phase_id", sa.String(length=36), nullable=True),
+        sa.Column("started_at", sa.Text(), nullable=True),
+        sa.Column("ended_at", sa.Text(), nullable=True),
+        sa.Column("final_report_path", sa.Text(), nullable=True),
+        sa.Column("paused_from_state", sa.Text(), nullable=True),
+        sa.Column("created_at", sa.Text(), nullable=False),
+        sa.Column("updated_at", sa.Text(), nullable=False),
+        sa.PrimaryKeyConstraint("id"),
+    )
+    op.create_table(
+        "tool_calls",
+        sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
+        sa.Column("run_id", sa.String(length=36), nullable=True),
+        sa.Column("phase_id", sa.String(length=36), nullable=True),
+        sa.Column("interactive_session_id", sa.String(length=36), nullable=True),
+        sa.Column("tool_name", sa.Text(), nullable=False),
+        sa.Column("args", sa.JSON(), nullable=False),
+        sa.Column("result", sa.JSON(), nullable=True),
+        sa.Column("error", sa.Text(), nullable=True),
+        sa.Column("duration_ms", sa.Integer(), nullable=False),
+        sa.Column("ts", sa.Text(), nullable=False),
+        sa.PrimaryKeyConstraint("id"),
+    )
+    op.create_index("tool_calls_run_id_ts_idx", "tool_calls", ["run_id", "ts"], unique=False)
+    op.create_table(
+        "workflow_templates",
+        sa.Column("id", sa.String(length=36), nullable=False),
+        sa.Column("name", sa.Text(), nullable=False),
+        sa.Column("version", sa.Integer(), nullable=False),
+        sa.Column("hash", sa.Text(), nullable=False),
+        sa.Column("definition", sa.JSON(), nullable=False),
+        sa.Column("created_at", sa.Text(), nullable=False),
+        sa.PrimaryKeyConstraint("id"),
+        sa.UniqueConstraint("hash"),
+    )
+    op.create_table(
+        "approval_requests",
+        sa.Column("id", sa.String(length=36), nullable=False),
+        sa.Column("run_id", sa.String(length=36), nullable=False),
+        sa.Column("phase_id", sa.String(length=36), nullable=True),
+        sa.Column("gate_key", sa.Text(), nullable=False),
+        sa.Column("state", sa.Text(), nullable=False),
+        sa.Column("idempotency_key", sa.Text(), nullable=False),
+        sa.Column("payload", sa.JSON(), nullable=False),
+        sa.Column("created_at", sa.Text(), nullable=False),
+        sa.Column("resolved_at", sa.Text(), nullable=True),
+        sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
+        sa.PrimaryKeyConstraint("id"),
+        sa.UniqueConstraint("idempotency_key"),
+    )
+    op.create_table(
+        "artifacts",
+        sa.Column("id", sa.String(length=36), nullable=False),
+        sa.Column("run_id", sa.String(length=36), nullable=False),
+        sa.Column("phase_id", sa.String(length=36), nullable=True),
+        sa.Column("path", sa.Text(), nullable=False),
+        sa.Column("schema_id", sa.Text(), nullable=False),
+        sa.Column("hash", sa.Text(), nullable=False),
+        sa.Column("valid", sa.Boolean(), nullable=False),
+        sa.Column("validation_error", sa.JSON(), nullable=True),
+        sa.Column("created_at", sa.Text(), nullable=False),
+        sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
+        sa.PrimaryKeyConstraint("id"),
+        sa.UniqueConstraint("run_id", "path", "hash", name="uq_artifacts_run_path_hash"),
+    )
+    op.create_table(
+        "run_bindings",
+        sa.Column("id", sa.String(length=36), nullable=False),
+        sa.Column("run_id", sa.String(length=36), nullable=False),
+        sa.Column("role_id", sa.Text(), nullable=False),
+        sa.Column("persona_id", sa.String(length=36), nullable=False),
+        sa.Column("persona_hash", sa.Text(), nullable=False),
+        sa.Column("backend", sa.Text(), nullable=False),
+        sa.Column("binding_hash", sa.Text(), nullable=False),
+        sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
+        sa.PrimaryKeyConstraint("id"),
+        sa.UniqueConstraint("run_id", "role_id", name="uq_run_bindings_run_role"),
+    )
+    op.create_table(
+        "run_commands",
+        sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
+        sa.Column("run_id", sa.String(length=36), nullable=False),
+        sa.Column("command", sa.Text(), nullable=False),
+        sa.Column("payload", sa.JSON(), nullable=False),
+        sa.Column("idempotency_key", sa.Text(), nullable=False),
+        sa.Column("created_at", sa.Text(), nullable=False),
+        sa.Column("processed_at", sa.Text(), nullable=True),
+        sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
+        sa.PrimaryKeyConstraint("id"),
+        sa.UniqueConstraint("idempotency_key"),
+    )
+    op.create_table(
+        "run_events",
+        sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
+        sa.Column("run_id", sa.String(length=36), nullable=False),
+        sa.Column("phase_id", sa.String(length=36), nullable=True),
+        sa.Column("seq", sa.Integer(), nullable=False),
+        sa.Column("type", sa.Text(), nullable=False),
+        sa.Column("payload", sa.JSON(), nullable=False),
+        sa.Column("idempotency_key", sa.Text(), nullable=False),
+        sa.Column("ts", sa.Text(), nullable=False),
+        sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
+        sa.PrimaryKeyConstraint("id"),
+        sa.UniqueConstraint("run_id", "idempotency_key", name="uq_run_events_run_idempotency"),
+        sa.UniqueConstraint("run_id", "seq", name="uq_run_events_run_seq"),
+    )
+    op.create_index("run_events_run_id_ts_idx", "run_events", ["run_id", "ts"], unique=False)
+    op.create_table(
+        "run_inputs",
+        sa.Column("id", sa.String(length=36), nullable=False),
+        sa.Column("run_id", sa.String(length=36), nullable=False),
+        sa.Column("requirements_md", sa.Text(), nullable=False),
+        sa.Column("objective", sa.JSON(), nullable=False),
+        sa.Column("extra", sa.JSON(), nullable=False),
+        sa.Column("input_hash", sa.Text(), nullable=False),
+        sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
+        sa.PrimaryKeyConstraint("id"),
+        sa.UniqueConstraint("run_id"),
+    )
+    op.create_table(
+        "run_phases",
+        sa.Column("id", sa.String(length=36), nullable=False),
+        sa.Column("run_id", sa.String(length=36), nullable=False),
+        sa.Column("phase_key", sa.Text(), nullable=False),
+        sa.Column("seq", sa.Integer(), nullable=False),
+        sa.Column("state", sa.Text(), nullable=False),
+        sa.Column("attempts", sa.Integer(), nullable=False),
+        sa.Column("started_at", sa.Text(), nullable=True),
+        sa.Column("ended_at", sa.Text(), nullable=True),
+        sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
+        sa.PrimaryKeyConstraint("id"),
+        sa.UniqueConstraint("run_id", "phase_key", name="uq_run_phases_run_phase"),
+    )
+    op.create_table(
+        "approval_decisions",
+        sa.Column("id", sa.String(length=36), nullable=False),
+        sa.Column("approval_request_id", sa.String(length=36), nullable=False),
+        sa.Column("action", sa.Text(), nullable=False),
+        sa.Column("comment", sa.Text(), nullable=True),
+        sa.Column("decided_at", sa.Text(), nullable=False),
+        sa.Column("idempotency_key", sa.Text(), nullable=False),
+        sa.ForeignKeyConstraint(
+            ["approval_request_id"], ["approval_requests.id"], ondelete="CASCADE"
+        ),
+        sa.PrimaryKeyConstraint("id"),
+        sa.UniqueConstraint("idempotency_key"),
+    )
+    # ### end Alembic commands ###
+
+
+def downgrade() -> None:
+    """Downgrade schema."""
+    # ### commands auto generated by Alembic - please adjust! ###
+    op.drop_table("approval_decisions")
+    op.drop_table("run_phases")
+    op.drop_table("run_inputs")
+    op.drop_index("run_events_run_id_ts_idx", table_name="run_events")
+    op.drop_table("run_events")
+    op.drop_table("run_commands")
+    op.drop_table("run_bindings")
+    op.drop_table("artifacts")
+    op.drop_table("approval_requests")
+    op.drop_table("workflow_templates")
+    op.drop_index("tool_calls_run_id_ts_idx", table_name="tool_calls")
+    op.drop_table("tool_calls")
+    op.drop_table("runs")
+    op.drop_table("phase_feedback")
+    op.drop_table("persona_consents")
+    op.drop_table("model_pricing")
+    op.drop_index("llm_calls_run_id_ts_idx", table_name="llm_calls")
+    op.drop_index("llm_calls_model_ts_idx", table_name="llm_calls")
+    op.drop_index("llm_calls_interactive_session_id_ts_idx", table_name="llm_calls")
+    op.drop_table("llm_calls")
+    op.drop_table("interactive_sessions")
+    op.drop_table("budget_ledger")
+    op.drop_table("agent_personas")
+    # ### end Alembic commands ###
--- a/my-deepagent/alembic/versions/839f2233e346_add_active_run_partial_unique_index_and_.py
+++ b/my-deepagent/alembic/versions/839f2233e346_add_active_run_partial_unique_index_and_.py
@@ -0,0 +1,638 @@
+"""add active-run partial unique index and FK constraints
+
+Revision ID: 839f2233e346
+Revises: 79945fdc2649
+Create Date: 2026-05-15 18:51:14.343577
+
+Notes:
+    - P0-1: Adds partial unique index ux_active_run_repo_base on runs(repo_path, base_branch)
+      WHERE state NOT IN ('completed', 'failed', 'aborted').  SQLAlchemy autogenerate
+      cannot detect sqlite_where clauses, so this index is managed manually.
+    - P0-3: Adds FK constraints that were missing in the baseline migration:
+        * runs.template_id  -> workflow_templates.id  RESTRICT
+        * run_bindings.persona_id -> agent_personas.id  RESTRICT
+        * interactive_sessions.persona_id -> agent_personas.id  RESTRICT
+        * run_events.phase_id -> run_phases.id  CASCADE
+        * approval_requests.phase_id -> run_phases.id  CASCADE
+        * artifacts.phase_id -> run_phases.id  CASCADE
+        * tool_calls.run_id -> runs.id  CASCADE
+        * tool_calls.phase_id -> run_phases.id  CASCADE
+        * tool_calls.interactive_session_id -> interactive_sessions.id  CASCADE
+        * llm_calls.run_id -> runs.id  CASCADE
+        * llm_calls.phase_id -> run_phases.id  CASCADE
+        * llm_calls.interactive_session_id -> interactive_sessions.id  CASCADE
+        * phase_feedback.run_id -> runs.id  CASCADE
+        * phase_feedback.phase_id -> run_phases.id  CASCADE
+    - runs.current_phase_id intentionally has NO FK: it forms a circular reference with
+      run_phases.run_id.  SQLite's deferred FK support is more limited than PostgreSQL's
+      (DEFERRABLE INITIALLY DEFERRED exists, SET CONSTRAINTS does not), so referential
+      integrity for this column is enforced by application code rather than the database.
+    - SQLite does not support ADD CONSTRAINT via ALTER TABLE.  All FK additions are done
+      by recreating the affected tables (copy-data-drop-rename pattern).
+"""
+
+from __future__ import annotations
+
+from collections.abc import Sequence
+
+from alembic import op
+
+# revision identifiers, used by Alembic.
+revision: str = "839f2233e346"
+down_revision: str | Sequence[str] | None = "79945fdc2649"
+branch_labels: str | Sequence[str] | None = None
+depends_on: str | Sequence[str] | None = None
+
+
+def upgrade() -> None:
+    """Upgrade schema.
+
+    SQLite does not support ALTER TABLE ... ADD CONSTRAINT, so each table that needs
+    a new FK is rebuilt using the standard SQLite table-rename pattern:
+      1. Disable FK enforcement during rebuild (PRAGMA foreign_keys=OFF).
+      2. Create new table with correct FK constraints.
+      3. Copy data from old table.
+      4. Drop old table.
+      5. Rename new table to original name.
+      6. Re-enable FK enforcement (PRAGMA foreign_keys=ON).
+
+    Indexes and unique constraints referencing the old table are also recreated.
+    """
+    # Disable FK enforcement while tables are dropped and recreated.  SQLite ignores
+    # this PRAGMA inside an open transaction, so it must run before one begins.
+    op.execute("PRAGMA foreign_keys=OFF")
+
+    # ------------------------------------------------------------------
+    # runs: add template_id FK (RESTRICT) + P0-1 partial unique index.
+    # Rebuild because SQLite cannot ADD CONSTRAINT.
+    # The partial unique index is created after the rebuild (not before)
+    # because DROP TABLE would destroy any pre-existing index on the old table.
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE runs_new (
+            id TEXT NOT NULL,
+            template_id TEXT NOT NULL
+                REFERENCES workflow_templates (id) ON DELETE RESTRICT,
+            template_hash TEXT NOT NULL,
+            state TEXT NOT NULL,
+            repo_path TEXT NOT NULL,
+            base_branch TEXT NOT NULL,
+            worktree_root TEXT NOT NULL,
+            current_phase_id TEXT,
+            started_at TEXT,
+            ended_at TEXT,
+            final_report_path TEXT,
+            paused_from_state TEXT,
+            created_at TEXT NOT NULL,
+            updated_at TEXT NOT NULL,
+            PRIMARY KEY (id)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO runs_new SELECT id, template_id, template_hash, state, "
+        "repo_path, base_branch, worktree_root, current_phase_id, "
+        "started_at, ended_at, final_report_path, paused_from_state, "
+        "created_at, updated_at FROM runs"
+    )
+    op.execute("DROP TABLE runs")
+    op.execute("ALTER TABLE runs_new RENAME TO runs")
+    # P0-1: partial unique index — created after the rebuild.
+    op.execute(
+        "CREATE UNIQUE INDEX ux_active_run_repo_base "
+        "ON runs (repo_path, base_branch) "
+        "WHERE state NOT IN ('completed', 'failed', 'aborted')"
+    )
+
+    # ------------------------------------------------------------------
+    # run_bindings: add persona_id FK (RESTRICT)
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE run_bindings_new (
+            id TEXT NOT NULL,
+            run_id TEXT NOT NULL
+                REFERENCES runs (id) ON DELETE CASCADE,
+            role_id TEXT NOT NULL,
+            persona_id TEXT NOT NULL
+                REFERENCES agent_personas (id) ON DELETE RESTRICT,
+            persona_hash TEXT NOT NULL,
+            backend TEXT NOT NULL,
+            binding_hash TEXT NOT NULL,
+            PRIMARY KEY (id),
+            UNIQUE (run_id, role_id)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO run_bindings_new SELECT id, run_id, role_id, persona_id, "
+        "persona_hash, backend, binding_hash FROM run_bindings"
+    )
+    op.execute("DROP TABLE run_bindings")
+    op.execute("ALTER TABLE run_bindings_new RENAME TO run_bindings")
+
+    # ------------------------------------------------------------------
+    # interactive_sessions: add persona_id FK (RESTRICT)
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE interactive_sessions_new (
+            id TEXT NOT NULL,
+            persona_id TEXT NOT NULL
+                REFERENCES agent_personas (id) ON DELETE RESTRICT,
+            persona_hash TEXT NOT NULL,
+            started_at TEXT,
+            ended_at TEXT,
+            last_message_at TEXT,
+            state TEXT NOT NULL,
+            PRIMARY KEY (id)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO interactive_sessions_new SELECT id, persona_id, persona_hash, "
+        "started_at, ended_at, last_message_at, state FROM interactive_sessions"
+    )
+    op.execute("DROP TABLE interactive_sessions")
+    op.execute("ALTER TABLE interactive_sessions_new RENAME TO interactive_sessions")
+
+    # ------------------------------------------------------------------
+    # run_events: add phase_id FK (CASCADE)
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE run_events_new (
+            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
+            run_id TEXT NOT NULL
+                REFERENCES runs (id) ON DELETE CASCADE,
+            phase_id TEXT
+                REFERENCES run_phases (id) ON DELETE CASCADE,
+            seq INTEGER NOT NULL,
+            type TEXT NOT NULL,
+            payload JSON NOT NULL,
+            idempotency_key TEXT NOT NULL,
+            ts TEXT NOT NULL,
+            UNIQUE (run_id, seq),
+            UNIQUE (run_id, idempotency_key)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO run_events_new SELECT id, run_id, phase_id, seq, type, "
+        "payload, idempotency_key, ts FROM run_events"
+    )
+    op.execute("DROP INDEX IF EXISTS run_events_run_id_ts_idx")
+    op.execute("DROP TABLE run_events")
+    op.execute("ALTER TABLE run_events_new RENAME TO run_events")
+    op.execute("CREATE INDEX run_events_run_id_ts_idx ON run_events (run_id, ts)")
+
+    # ------------------------------------------------------------------
+    # approval_requests: add phase_id FK (CASCADE)
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE approval_requests_new (
+            id TEXT NOT NULL,
+            run_id TEXT NOT NULL
+                REFERENCES runs (id) ON DELETE CASCADE,
+            phase_id TEXT
+                REFERENCES run_phases (id) ON DELETE CASCADE,
+            gate_key TEXT NOT NULL,
+            state TEXT NOT NULL,
+            idempotency_key TEXT NOT NULL,
+            payload JSON NOT NULL,
+            created_at TEXT NOT NULL,
+            resolved_at TEXT,
+            PRIMARY KEY (id),
+            UNIQUE (idempotency_key)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO approval_requests_new SELECT id, run_id, phase_id, gate_key, "
+        "state, idempotency_key, payload, created_at, resolved_at FROM approval_requests"
+    )
+    op.execute("DROP TABLE approval_requests")
+    op.execute("ALTER TABLE approval_requests_new RENAME TO approval_requests")
+
+    # ------------------------------------------------------------------
+    # artifacts: add phase_id FK (CASCADE)
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE artifacts_new (
+            id TEXT NOT NULL,
+            run_id TEXT NOT NULL
+                REFERENCES runs (id) ON DELETE CASCADE,
+            phase_id TEXT
+                REFERENCES run_phases (id) ON DELETE CASCADE,
+            path TEXT NOT NULL,
+            schema_id TEXT NOT NULL,
+            hash TEXT NOT NULL,
+            valid INTEGER NOT NULL,
+            validation_error JSON,
+            created_at TEXT NOT NULL,
+            PRIMARY KEY (id),
+            UNIQUE (run_id, path, hash)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO artifacts_new SELECT id, run_id, phase_id, path, schema_id, "
+        "hash, valid, validation_error, created_at FROM artifacts"
+    )
+    op.execute("DROP TABLE artifacts")
+    op.execute("ALTER TABLE artifacts_new RENAME TO artifacts")
+
+    # ------------------------------------------------------------------
+    # tool_calls: add run_id / phase_id / interactive_session_id FKs (CASCADE)
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE tool_calls_new (
+            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
+            run_id TEXT
+                REFERENCES runs (id) ON DELETE CASCADE,
+            phase_id TEXT
+                REFERENCES run_phases (id) ON DELETE CASCADE,
+            interactive_session_id TEXT
+                REFERENCES interactive_sessions (id) ON DELETE CASCADE,
+            tool_name TEXT NOT NULL,
+            args JSON NOT NULL,
+            result JSON,
+            error TEXT,
+            duration_ms INTEGER NOT NULL,
+            ts TEXT NOT NULL
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO tool_calls_new SELECT id, run_id, phase_id, interactive_session_id, "
+        "tool_name, args, result, error, duration_ms, ts FROM tool_calls"
+    )
+    op.execute("DROP INDEX IF EXISTS tool_calls_run_id_ts_idx")
+    op.execute("DROP TABLE tool_calls")
+    op.execute("ALTER TABLE tool_calls_new RENAME TO tool_calls")
+    op.execute("CREATE INDEX tool_calls_run_id_ts_idx ON tool_calls (run_id, ts)")
+
+    # ------------------------------------------------------------------
+    # llm_calls: add run_id / phase_id / interactive_session_id FKs (CASCADE)
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE llm_calls_new (
+            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
+            run_id TEXT
+                REFERENCES runs (id) ON DELETE CASCADE,
+            phase_id TEXT
+                REFERENCES run_phases (id) ON DELETE CASCADE,
+            interactive_session_id TEXT
+                REFERENCES interactive_sessions (id) ON DELETE CASCADE,
+            thread_id TEXT NOT NULL,
+            persona_name TEXT NOT NULL,
+            persona_version INTEGER NOT NULL,
+            model TEXT NOT NULL,
+            role TEXT NOT NULL,
+            turn_index INTEGER NOT NULL,
+            input_tokens INTEGER NOT NULL,
+            output_tokens INTEGER NOT NULL,
+            cached_tokens INTEGER NOT NULL,
+            reasoning_tokens INTEGER NOT NULL,
+            cost_usd_input REAL NOT NULL,
+            cost_usd_output REAL NOT NULL,
+            cost_usd_total REAL NOT NULL,
+            latency_ms INTEGER NOT NULL,
+            status TEXT NOT NULL,
+            error_code TEXT,
+            request_id TEXT,
+            ts TEXT NOT NULL
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO llm_calls_new SELECT id, run_id, phase_id, interactive_session_id, "
+        "thread_id, persona_name, persona_version, model, role, turn_index, "
+        "input_tokens, output_tokens, cached_tokens, reasoning_tokens, "
+        "cost_usd_input, cost_usd_output, cost_usd_total, latency_ms, status, "
+        "error_code, request_id, ts FROM llm_calls"
+    )
+    op.execute("DROP INDEX IF EXISTS llm_calls_run_id_ts_idx")
+    op.execute("DROP INDEX IF EXISTS llm_calls_interactive_session_id_ts_idx")
+    op.execute("DROP INDEX IF EXISTS llm_calls_model_ts_idx")
+    op.execute("DROP TABLE llm_calls")
+    op.execute("ALTER TABLE llm_calls_new RENAME TO llm_calls")
+    op.execute("CREATE INDEX llm_calls_run_id_ts_idx ON llm_calls (run_id, ts)")
+    op.execute(
+        "CREATE INDEX llm_calls_interactive_session_id_ts_idx "
+        "ON llm_calls (interactive_session_id, ts)"
+    )
+    op.execute("CREATE INDEX llm_calls_model_ts_idx ON llm_calls (model, ts)")
+
+    # ------------------------------------------------------------------
+    # phase_feedback: add run_id / phase_id FKs (CASCADE)
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE phase_feedback_new (
+            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
+            run_id TEXT NOT NULL
+                REFERENCES runs (id) ON DELETE CASCADE,
+            phase_id TEXT NOT NULL
+                REFERENCES run_phases (id) ON DELETE CASCADE,
+            reaction TEXT,
+            comment TEXT,
+            created_at TEXT NOT NULL
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO phase_feedback_new SELECT id, run_id, phase_id, "
+        "reaction, comment, created_at FROM phase_feedback"
+    )
+    op.execute("DROP TABLE phase_feedback")
+    op.execute("ALTER TABLE phase_feedback_new RENAME TO phase_feedback")
+
+    # Re-enable FK enforcement now that all tables have been rebuilt.
+    op.execute("PRAGMA foreign_keys=ON")
+
+
+def downgrade() -> None:
+    """Downgrade schema.
+
+    Reverses all FK additions and drops the partial unique index.
+    Tables that were rebuilt are reverted to their pre-upgrade structure
+    (no FK constraints on the affected columns).
+    """
+    op.execute("PRAGMA foreign_keys=OFF")
+
+    # ------------------------------------------------------------------
+    # Revert phase_feedback
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE phase_feedback_old (
+            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
+            run_id TEXT NOT NULL,
+            phase_id TEXT NOT NULL,
+            reaction TEXT,
+            comment TEXT,
+            created_at TEXT NOT NULL
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO phase_feedback_old SELECT id, run_id, phase_id, "
+        "reaction, comment, created_at FROM phase_feedback"
+    )
+    op.execute("DROP TABLE phase_feedback")
+    op.execute("ALTER TABLE phase_feedback_old RENAME TO phase_feedback")
+
+    # ------------------------------------------------------------------
+    # Revert llm_calls
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE llm_calls_old (
+            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
+            run_id TEXT,
+            phase_id TEXT,
+            interactive_session_id TEXT,
+            thread_id TEXT NOT NULL,
+            persona_name TEXT NOT NULL,
+            persona_version INTEGER NOT NULL,
+            model TEXT NOT NULL,
+            role TEXT NOT NULL,
+            turn_index INTEGER NOT NULL,
+            input_tokens INTEGER NOT NULL,
+            output_tokens INTEGER NOT NULL,
+            cached_tokens INTEGER NOT NULL,
+            reasoning_tokens INTEGER NOT NULL,
+            cost_usd_input REAL NOT NULL,
+            cost_usd_output REAL NOT NULL,
+            cost_usd_total REAL NOT NULL,
+            latency_ms INTEGER NOT NULL,
+            status TEXT NOT NULL,
+            error_code TEXT,
+            request_id TEXT,
+            ts TEXT NOT NULL
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO llm_calls_old SELECT id, run_id, phase_id, interactive_session_id, "
+        "thread_id, persona_name, persona_version, model, role, turn_index, "
+        "input_tokens, output_tokens, cached_tokens, reasoning_tokens, "
+        "cost_usd_input, cost_usd_output, cost_usd_total, latency_ms, status, "
+        "error_code, request_id, ts FROM llm_calls"
+    )
+    op.execute("DROP INDEX IF EXISTS llm_calls_run_id_ts_idx")
+    op.execute("DROP INDEX IF EXISTS llm_calls_interactive_session_id_ts_idx")
+    op.execute("DROP INDEX IF EXISTS llm_calls_model_ts_idx")
+    op.execute("DROP TABLE llm_calls")
+    op.execute("ALTER TABLE llm_calls_old RENAME TO llm_calls")
+    op.execute("CREATE INDEX llm_calls_run_id_ts_idx ON llm_calls (run_id, ts)")
+    op.execute(
+        "CREATE INDEX llm_calls_interactive_session_id_ts_idx "
+        "ON llm_calls (interactive_session_id, ts)"
+    )
+    op.execute("CREATE INDEX llm_calls_model_ts_idx ON llm_calls (model, ts)")
+
+    # ------------------------------------------------------------------
+    # Revert tool_calls
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE tool_calls_old (
+            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
+            run_id TEXT,
+            phase_id TEXT,
+            interactive_session_id TEXT,
+            tool_name TEXT NOT NULL,
+            args JSON NOT NULL,
+            result JSON,
+            error TEXT,
+            duration_ms INTEGER NOT NULL,
+            ts TEXT NOT NULL
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO tool_calls_old SELECT id, run_id, phase_id, interactive_session_id, "
+        "tool_name, args, result, error, duration_ms, ts FROM tool_calls"
+    )
+    op.execute("DROP INDEX IF EXISTS tool_calls_run_id_ts_idx")
+    op.execute("DROP TABLE tool_calls")
+    op.execute("ALTER TABLE tool_calls_old RENAME TO tool_calls")
+    op.execute("CREATE INDEX tool_calls_run_id_ts_idx ON tool_calls (run_id, ts)")
+
+    # ------------------------------------------------------------------
+    # Revert artifacts
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE artifacts_old (
+            id TEXT NOT NULL,
+            run_id TEXT NOT NULL
+                REFERENCES runs (id) ON DELETE CASCADE,
+            phase_id TEXT,
+            path TEXT NOT NULL,
+            schema_id TEXT NOT NULL,
+            hash TEXT NOT NULL,
+            valid INTEGER NOT NULL,
+            validation_error JSON,
+            created_at TEXT NOT NULL,
+            PRIMARY KEY (id),
+            UNIQUE (run_id, path, hash)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO artifacts_old SELECT id, run_id, phase_id, path, schema_id, "
+        "hash, valid, validation_error, created_at FROM artifacts"
+    )
+    op.execute("DROP TABLE artifacts")
+    op.execute("ALTER TABLE artifacts_old RENAME TO artifacts")
+
+    # ------------------------------------------------------------------
+    # Revert approval_requests
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE approval_requests_old (
+            id TEXT NOT NULL,
+            run_id TEXT NOT NULL
+                REFERENCES runs (id) ON DELETE CASCADE,
+            phase_id TEXT,
+            gate_key TEXT NOT NULL,
+            state TEXT NOT NULL,
+            idempotency_key TEXT NOT NULL,
+            payload JSON NOT NULL,
+            created_at TEXT NOT NULL,
+            resolved_at TEXT,
+            PRIMARY KEY (id),
+            UNIQUE (idempotency_key)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO approval_requests_old SELECT id, run_id, phase_id, gate_key, "
+        "state, idempotency_key, payload, created_at, resolved_at FROM approval_requests"
+    )
+    op.execute("DROP TABLE approval_requests")
+    op.execute("ALTER TABLE approval_requests_old RENAME TO approval_requests")
+
+    # ------------------------------------------------------------------
+    # Revert run_events
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE run_events_old (
+            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
+            run_id TEXT NOT NULL
+                REFERENCES runs (id) ON DELETE CASCADE,
+            phase_id TEXT,
+            seq INTEGER NOT NULL,
+            type TEXT NOT NULL,
+            payload JSON NOT NULL,
+            idempotency_key TEXT NOT NULL,
+            ts TEXT NOT NULL,
+            UNIQUE (run_id, seq),
+            UNIQUE (run_id, idempotency_key)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO run_events_old SELECT id, run_id, phase_id, seq, type, "
+        "payload, idempotency_key, ts FROM run_events"
+    )
+    op.execute("DROP INDEX IF EXISTS run_events_run_id_ts_idx")
+    op.execute("DROP TABLE run_events")
+    op.execute("ALTER TABLE run_events_old RENAME TO run_events")
+    op.execute("CREATE INDEX run_events_run_id_ts_idx ON run_events (run_id, ts)")
+
+    # ------------------------------------------------------------------
+    # Revert interactive_sessions
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE interactive_sessions_old (
+            id TEXT NOT NULL,
+            persona_id TEXT NOT NULL,
+            persona_hash TEXT NOT NULL,
+            started_at TEXT,
+            ended_at TEXT,
+            last_message_at TEXT,
+            state TEXT NOT NULL,
+            PRIMARY KEY (id)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO interactive_sessions_old SELECT id, persona_id, persona_hash, "
+        "started_at, ended_at, last_message_at, state FROM interactive_sessions"
+    )
+    op.execute("DROP TABLE interactive_sessions")
+    op.execute("ALTER TABLE interactive_sessions_old RENAME TO interactive_sessions")
+
+    # ------------------------------------------------------------------
+    # Revert run_bindings
+    # ------------------------------------------------------------------
+    op.execute(
+        """
+        CREATE TABLE run_bindings_old (
+            id TEXT NOT NULL,
+            run_id TEXT NOT NULL
+                REFERENCES runs (id) ON DELETE CASCADE,
+            role_id TEXT NOT NULL,
+            persona_id TEXT NOT NULL,
+            persona_hash TEXT NOT NULL,
+            backend TEXT NOT NULL,
+            binding_hash TEXT NOT NULL,
+            PRIMARY KEY (id),
+            UNIQUE (run_id, role_id)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO run_bindings_old SELECT id, run_id, role_id, persona_id, "
+        "persona_hash, backend, binding_hash FROM run_bindings"
+    )
+    op.execute("DROP TABLE run_bindings")
+    op.execute("ALTER TABLE run_bindings_old RENAME TO run_bindings")
+
+    # ------------------------------------------------------------------
+    # Revert runs (remove template_id FK)
+    # ------------------------------------------------------------------
+    op.execute("DROP INDEX IF EXISTS ux_active_run_repo_base")
+    op.execute(
+        """
+        CREATE TABLE runs_old (
+            id TEXT NOT NULL,
+            template_id TEXT NOT NULL,
+            template_hash TEXT NOT NULL,
+            state TEXT NOT NULL,
+            repo_path TEXT NOT NULL,
+            base_branch TEXT NOT NULL,
+            worktree_root TEXT NOT NULL,
+            current_phase_id TEXT,
+            started_at TEXT,
+            ended_at TEXT,
+            final_report_path TEXT,
+            paused_from_state TEXT,
+            created_at TEXT NOT NULL,
+            updated_at TEXT NOT NULL,
+            PRIMARY KEY (id)
+        )
+        """
+    )
+    op.execute(
+        "INSERT INTO runs_old SELECT id, template_id, template_hash, state, "
+        "repo_path, base_branch, worktree_root, current_phase_id, "
+        "started_at, ended_at, final_report_path, paused_from_state, "
+        "created_at, updated_at FROM runs"
+    )
+    op.execute("DROP TABLE runs")
+    op.execute("ALTER TABLE runs_old RENAME TO runs")
+
+    op.execute("PRAGMA foreign_keys=ON")
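Note on 839f2233e346: the `runs` rebuild in this migration can be exercised in isolation. The sketch below uses only the standard-library `sqlite3` module and a hypothetical, trimmed-down `runs` table (five columns rather than fourteen). `make_demo_db`, `rebuild_runs`, and `try_insert` are illustrative names that do not appear in the commit. It assumes the connection is in autocommit mode so that `PRAGMA foreign_keys` actually takes effect, and it adds a `PRAGMA foreign_key_check` step that the migration itself does not run.

```python
import sqlite3

# Same index definition as the migration's P0-1 statement.
ACTIVE_RUN_INDEX = (
    "CREATE UNIQUE INDEX ux_active_run_repo_base "
    "ON runs (repo_path, base_branch) "
    "WHERE state NOT IN ('completed', 'failed', 'aborted')"
)


def make_demo_db() -> sqlite3.Connection:
    """Create a hypothetical pre-migration schema in memory (assumption, not the real one)."""
    # isolation_level=None selects autocommit mode: the driver opens no implicit
    # transaction, so PRAGMA foreign_keys is honoured and BEGIN/COMMIT are explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript(
        """
        CREATE TABLE workflow_templates (id TEXT NOT NULL PRIMARY KEY);
        CREATE TABLE runs (
            id TEXT NOT NULL PRIMARY KEY,
            template_id TEXT NOT NULL,
            state TEXT NOT NULL,
            repo_path TEXT NOT NULL,
            base_branch TEXT NOT NULL
        );
        INSERT INTO workflow_templates VALUES ('t1');
        INSERT INTO runs VALUES ('r1', 't1', 'running', '/repo', 'main');
        """
    )
    return conn


def rebuild_runs(conn: sqlite3.Connection) -> None:
    """Rebuild runs with a RESTRICT FK, then create the partial unique index."""
    conn.execute("PRAGMA foreign_keys=OFF")  # must run outside a transaction
    conn.execute("BEGIN")
    try:
        conn.execute(
            """
            CREATE TABLE runs_new (
                id TEXT NOT NULL PRIMARY KEY,
                template_id TEXT NOT NULL
                    REFERENCES workflow_templates (id) ON DELETE RESTRICT,
                state TEXT NOT NULL,
                repo_path TEXT NOT NULL,
                base_branch TEXT NOT NULL
            )
            """
        )
        # Explicit column lists keep the copy independent of column order.
        conn.execute(
            "INSERT INTO runs_new (id, template_id, state, repo_path, base_branch) "
            "SELECT id, template_id, state, repo_path, base_branch FROM runs"
        )
        conn.execute("DROP TABLE runs")
        conn.execute("ALTER TABLE runs_new RENAME TO runs")
        conn.execute(ACTIVE_RUN_INDEX)
        # Rows copied while enforcement was off were never validated; check them now.
        violations = conn.execute("PRAGMA foreign_key_check").fetchall()
        if violations:
            raise RuntimeError(f"foreign key violations: {violations}")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    finally:
        conn.execute("PRAGMA foreign_keys=ON")


def try_insert(conn: sqlite3.Connection, row: tuple[str, str, str, str, str]) -> bool:
    """Return True if the row was inserted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO runs VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

After `rebuild_runs(make_demo_db())`, a second non-terminal run for the same `(repo_path, base_branch)` should be rejected by the partial index, and a run that points at an unknown `template_id` should be rejected by the new FK. Whether the real migration's PRAGMA statements take effect depends on how the project's Alembic `env.py` manages transactions, which this diff does not show.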
--- a/my-deepagent/database.sqlite3
+++ b/my-deepagent/database.sqlite3
--- a/my-deepagent/docs/adr/.gitkeep
+++ b/my-deepagent/docs/adr/.gitkeep
--- a/my-deepagent/docs/schemas/artifacts/common/final-report@1.json
+++ b/my-deepagent/docs/schemas/artifacts/common/final-report@1.json
@@ -0,0 +1,114 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "common/final-report@1",
+  "title": "Common Final Report",
+  "description": "Final report for a workflow run",
+  "type": "object",
+  "required": ["runId", "templateHash", "status", "phases", "endedAt"],
+  "additionalProperties": false,
+  "properties": {
+    "runId": {
+      "type": "string",
+      "format": "uuid",
+      "description": "Unique run identifier (UUID)"
+    },
+    "templateHash": {
+      "type": "string",
+      "pattern": "^[a-f0-9]{64}$",
+      "description": "sha256 hash of the workflow template (hex)"
+    },
+    "status": {
+      "type": "string",
+      "enum": ["completed", "failed", "aborted"],
+      "description": "Final run status"
+    },
+    "inputs": {
+      "type": "object",
+      "description": "Run inputs (optional)"
+    },
+    "phases": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "required": ["key", "state"],
+        "additionalProperties": false,
+        "properties": {
+          "key": {
+            "type": "string",
+            "description": "Phase key"
+          },
+          "state": {
+            "type": "string",
+            "enum": ["pending", "running", "completed", "failed", "skipped"],
+            "description": "Phase execution state"
+          },
+          "started_at": {
+            "type": "string",
+            "format": "date-time",
+            "description": "Start time (optional)"
+          },
+          "ended_at": {
+            "type": "string",
+            "format": "date-time",
+            "description": "End time (optional)"
+          },
+          "attempts": {
+            "type": "integer",
+            "minimum": 0,
+            "description": "Number of attempts (optional)"
+          }
+        }
+      },
+      "description": "Execution record for each phase"
+    },
+    "approvals": {
+      "type": "array",
+      "items": {
+        "type": "object"
+      },
+      "description": "List of approval records (optional)"
+    },
+    "findings": {
+      "type": "array",
+      "items": {
+        "type": "object"
+      },
+      "description": "List of collected findings (optional)"
+    },
+    "artifacts": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "required": ["path", "schema"],
+        "additionalProperties": false,
+        "properties": {
+          "path": {
+            "type": "string",
+            "description": "Artifact file path"
+          },
+          "schema": {
+            "type": "string",
+            "description": "Artifact JSON Schema ID"
+          },
+          "hash": {
+            "type": "string",
+            "description": "Artifact file hash (optional)"
+          }
+        }
+      },
+      "description": "List of generated artifacts (optional)"
+    },
+    "unresolved": {
+      "type": "array",
+      "items": {
+        "type": "string"
+      },
+      "description": "List of unresolved items (optional)"
+    },
+    "endedAt": {
+      "type": "string",
+      "format": "date-time",
+      "description": "Run end time"
+    }
+  }
+}
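Note on final-report@1: the `templateHash` pattern accepts exactly 64 lowercase hex characters, which is the shape `hashlib.sha256(...).hexdigest()` returns. The sketch below is a minimal illustration, assuming that "canonical JSON" means sorted keys, compact separators, and UTF-8 encoding. The commit's actual canonicalization lives in its hash module and is not shown in this diff, so `canonical_sha256` and the sample template are hypothetical.

```python
import hashlib
import json
import re

# Pattern copied from the templateHash property of final-report@1.
TEMPLATE_HASH_PATTERN = re.compile(r"^[a-f0-9]{64}$")


def canonical_sha256(value: object) -> str:
    """Hash a JSON-serializable value under an assumed canonical form."""
    canonical = json.dumps(
        value,
        sort_keys=True,          # key order must not affect the digest
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,      # hash the UTF-8 bytes of non-ASCII text directly
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical template content, used only to show the property being relied on.
template = {"name": "dev-default", "version": 1, "phases": ["spec", "plan"]}
reordered = {"phases": ["spec", "plan"], "version": 1, "name": "dev-default"}

digest = canonical_sha256(template)
is_valid_shape = TEMPLATE_HASH_PATTERN.fullmatch(digest) is not None
is_order_independent = digest == canonical_sha256(reordered)
```

An uppercase digest would not match the pattern, so any producer of `templateHash` needs to emit lowercase hex.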
--- a/my-deepagent/docs/schemas/artifacts/dev/phase-plan@1.json
+++ b/my-deepagent/docs/schemas/artifacts/dev/phase-plan@1.json
@@ -0,0 +1,80 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "dev/phase-plan@1",
+  "title": "Dev Phase Plan",
+  "description": "Execution phase plan (phase breakdown derived from the spec)",
+  "type": "object",
+  "required": ["runId", "phaseKey", "phases"],
+  "additionalProperties": false,
+  "properties": {
+    "runId": {
+      "type": "string",
+      "format": "uuid",
+      "description": "Unique run identifier (same UUID as in spec.json)"
+    },
+    "phaseKey": {
+      "type": "string",
+      "minLength": 1,
+      "description": "Current phase key (typically planning)"
+    },
+    "phases": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "required": ["key", "title", "role", "instructions"],
+        "additionalProperties": false,
+        "properties": {
+          "key": {
+            "type": "string",
+            "pattern": "^[a-z][a-z0-9-]*$",
+            "description": "Unique phase identifier (starts with a lowercase letter; lowercase letters, digits, and hyphens allowed)"
+          },
+          "title": {
+            "type": "string",
+            "minLength": 1,
+            "description": "Phase title"
+          },
+          "role": {
+            "type": "string",
+            "minLength": 1,
+            "description": "ID of the responsible role"
+          },
+          "instructions": {
+            "type": "string",
+            "minLength": 10,
+            "description": "Specific instructions for the assignee"
+          },
+          "expected_artifact": {
+            "type": "object",
+            "required": ["path", "schema"],
+            "additionalProperties": false,
+            "properties": {
+              "path": {
+                "type": "string",
+                "description": "Artifact file path"
+              },
+              "schema": {
+                "type": "string",
+                "description": "Artifact JSON Schema ID"
+              }
+            },
+            "description": "Artifact this phase is expected to produce (optional)"
+          },
+          "depends_on": {
+            "type": "array",
+            "items": {
+              "type": "string"
+            },
+            "description": "Keys of prerequisite phases that must complete before this phase runs (optional)"
+          }
+        }
+      },
+      "description": "List of execution phases"
+    },
+    "estimated_duration_hours": {
+      "type": "number",
+      "minimum": 0,
+      "description": "Total estimated duration (in hours, optional)"
+    }
+  }
+}
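Note on phase-plan@1: the schema types `depends_on` as a list of strings but cannot express that each entry names an existing phase `key` or that the graph is acyclic. A consumer would probably need to check both before scheduling. The sketch below shows one way to do that with the standard-library `graphlib` module; `execution_order` is a hypothetical helper, not something this commit defines.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(phases: list[dict]) -> list[str]:
    """Return phase keys in an order that respects depends_on.

    Raises ValueError on duplicate keys, unknown dependencies, or cycles.
    """
    keys = [phase["key"] for phase in phases]
    known = set(keys)
    if len(known) != len(keys):
        raise ValueError("duplicate phase key")

    graph: dict[str, set[str]] = {}
    for phase in phases:
        deps = phase.get("depends_on", [])  # optional in the schema
        unknown = sorted(dep for dep in deps if dep not in known)
        if unknown:
            raise ValueError(f"phase {phase['key']!r} depends on unknown phases: {unknown}")
        graph[phase["key"]] = set(deps)

    try:
        # static_order() yields each node after all of its predecessors.
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


# Hypothetical plan fragment; only the fields relevant to ordering are shown.
plan_phases = [
    {"key": "review", "depends_on": ["implement"]},
    {"key": "implement", "depends_on": ["spec"]},
    {"key": "spec"},
]
order = execution_order(plan_phases)
```

For a linear chain like the one above the order is unique. When several phases are independent, `static_order()` may return any valid ordering, so callers should not rely on a particular tie-break.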
--- a/my-deepagent/docs/schemas/artifacts/dev/review-finding-batch@1.json
+++ b/my-deepagent/docs/schemas/artifacts/dev/review-finding-batch@1.json
@@ -0,0 +1,76 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "dev/review-finding-batch@1",
+  "title": "Dev Review Finding Batch",
+  "description": "Batch of findings from code review or verification",
+  "type": "object",
+  "required": ["runId", "phaseKey", "reviewerRole", "findings", "summary"],
+  "additionalProperties": false,
+  "properties": {
+    "runId": {
+      "type": "string",
+      "format": "uuid",
+      "description": "Unique run identifier (UUID)"
+    },
+    "phaseKey": {
+      "type": "string",
+      "minLength": 1,
+      "description": "Current phase key (e.g. review, verify)"
+    },
+    "reviewerRole": {
+      "type": "string",
+      "minLength": 1,
+      "description": "Reviewer role (e.g. code-reviewer, verifier, security-auditor)"
+    },
+    "findings": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "required": ["severity", "category", "summary"],
+        "additionalProperties": false,
+        "properties": {
+          "severity": {
+            "type": "string",
+            "enum": ["info", "low", "medium", "high", "critical"],
+            "description": "Severity"
+          },
+          "category": {
+            "type": "string",
+            "enum": ["correctness", "evidence", "style", "security", "performance", "other"],
+            "description": "Finding category"
+          },
+          "summary": {
+            "type": "string",
+            "minLength": 1,
+            "description": "Issue summary (an OWASP category prefix is recommended for security findings)"
+          },
+          "filePath": {
+            "type": "string",
+            "description": "Affected file path (optional)"
+          },
+          "line": {
+            "type": "integer",
+            "minimum": 1,
+            "description": "Affected line number (optional)"
+          },
+          "evidence": {
+            "type": "string",
+            "description": "Evidence code or explanation (optional)"
+          },
+          "verifierStatus": {
+            "type": "string",
+            "enum": ["unverified", "confirmed", "rejected"],
+            "default": "unverified",
+            "description": "Verification status assigned by the verifier"
+          }
+        }
+      },
+      "description": "List of findings"
+    },
+    "summary": {
+      "type": "string",
+      "minLength": 10,
+      "description": "Overall review summary"
+    }
+  }
+}
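A minimal sketch, not part of this commit, of an artifact instance that should satisfy dev/review-finding-batch@1. The required keys and enum values are transcribed from the schema above; the helper name `check_finding_batch` and the sample values are hypothetical, and the check covers only a subset of the rules (it does not verify the `uuid` format). The project itself validates with jsonschema's Draft202012Validator rather than a hand-rolled check like this one.

```python
import uuid

# Values transcribed from dev/review-finding-batch@1.
SEVERITIES = {"info", "low", "medium", "high", "critical"}
CATEGORIES = {"correctness", "evidence", "style", "security", "performance", "other"}
VERIFIER_STATUSES = {"unverified", "confirmed", "rejected"}
BATCH_REQUIRED = ("runId", "phaseKey", "reviewerRole", "findings", "summary")
FINDING_REQUIRED = ("severity", "category", "summary")
FINDING_ALLOWED = set(FINDING_REQUIRED) | {"filePath", "line", "evidence", "verifierStatus"}


def check_finding_batch(batch: dict) -> list[str]:
    """Hypothetical helper: return a list of problems, empty when the batch looks valid."""
    problems: list[str] = []
    for key in BATCH_REQUIRED:
        if key not in batch:
            problems.append(f"missing required key: {key}")
    # additionalProperties is false at the top level.
    for key in sorted(set(batch) - set(BATCH_REQUIRED)):
        problems.append(f"unexpected key: {key}")
    if len(batch.get("summary", "")) < 10:
        problems.append("summary shorter than 10 characters")
    for i, finding in enumerate(batch.get("findings", [])):
        for key in FINDING_REQUIRED:
            if key not in finding:
                problems.append(f"findings[{i}]: missing required key: {key}")
        # additionalProperties is false for each finding as well.
        for key in sorted(set(finding) - FINDING_ALLOWED):
            problems.append(f"findings[{i}]: unexpected key: {key}")
        if "severity" in finding and finding["severity"] not in SEVERITIES:
            problems.append(f"findings[{i}]: invalid severity")
        if "category" in finding and finding["category"] not in CATEGORIES:
            problems.append(f"findings[{i}]: invalid category")
        status = finding.get("verifierStatus", "unverified")
        if status not in VERIFIER_STATUSES:
            problems.append(f"findings[{i}]: invalid verifierStatus")
        if "line" in finding and finding["line"] < 1:
            problems.append(f"findings[{i}]: line must be >= 1")
    return problems


# Sample values are invented for illustration only.
sample_batch = {
    "runId": str(uuid.uuid4()),
    "phaseKey": "review",
    "reviewerRole": "code-reviewer",
    "findings": [
        {
            "severity": "high",
            "category": "security",
            "summary": "[A01:Broken Access Control] Admin endpoint has no authentication",
            "filePath": "src/app/admin.py",
            "line": 42,
            "verifierStatus": "unverified",
        }
    ],
    "summary": "One high-severity security finding in the admin module.",
}

print(check_finding_batch(sample_batch))
```

The same dictionary, serialized with `json.dumps`, is the shape a reviewer persona would be expected to write to artifacts/review.json.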
--- a/my-deepagent/docs/schemas/artifacts/dev/spec@1.json
+++ b/my-deepagent/docs/schemas/artifacts/dev/spec@1.json
@@ -0,0 +1,46 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "dev/spec@1",
+  "title": "Dev Spec",
+  "description": "Requirements analysis and implementation approach specification",
+  "type": "object",
+  "required": ["runId", "phaseKey", "requirements", "acceptance_criteria", "approach", "risks"],
+  "additionalProperties": false,
+  "properties": {
+    "runId": {
+      "type": "string",
+      "format": "uuid",
+      "description": "Unique run identifier (UUID)"
+    },
+    "phaseKey": {
+      "type": "string",
+      "minLength": 1,
+      "description": "Current phase key (e.g. spec, diagnose, fix)"
+    },
+    "requirements": {
+      "type": "string",
+      "minLength": 10,
+      "description": "Detailed description of the requirements"
+    },
+    "acceptance_criteria": {
+      "type": "array",
+      "items": {
+        "type": "string"
+      },
+      "minItems": 1,
+      "description": "List of acceptance criteria (must be measurable and verifiable)"
+    },
+    "approach": {
+      "type": "string",
+      "minLength": 10,
+      "description": "Description of the implementation or approach"
+    },
+    "risks": {
+      "type": "array",
+      "items": {
+        "type": "string"
+      },
+      "description": "List of risks (empty array if none)"
+    }
+  }
+}
--- a/my-deepagent/docs/schemas/personas/default-interactive@1.yaml
+++ b/my-deepagent/docs/schemas/personas/default-interactive@1.yaml
@@ -0,0 +1,54 @@
+name: default-interactive
+version: 1
+description: "General-purpose assistant for interactive mode. Supports exploration, editing, and execution."
+backend: openrouter
+model: "openrouter:anthropic/claude-haiku-4-5"
+provider_origin: "US/Anthropic"
+capabilities:
+  - spec_write
+  - code_edit
+  - code_review
+  - evidence_check
+  - command_execute
+max_risk_level: high
+system_prompt: |
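+  You are the default interactive assistant of my-deepagent. Converse in Korean.
+
+  ## Role
+  Take the user's requests and handle code exploration, editing, and execution guidance.
+
+  ## Using the deepagents tools
+  - write_todos: Before starting work, always write the plan as a numbered list with write_todos.
+  - read_file: Read code files to understand the current state.
+  - glob: Find relevant files by file pattern.
+  - grep: Search the codebase for a specific pattern.
+  - edit_file: Modify existing files. Keep the scope of changes minimal.
+  - write_file: Write new files.
+  - task: Delegate complex subtasks to a subagent.
+  - execute: When a command needs to run, guide the user through it.
+
+  ## Operating principles
+  - Always understand the existing code with read_file/glob/grep before modifying it.
+  - For large changes, plan step by step with write_todos before proceeding.
+  - Before finishing, confirm that every item in the plan has been implemented.
+  - If you do not know, say so honestly and decide the direction with the user.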
+allowed_tools:
+  - read_file
+  - write_file
+  - edit_file
+  - ls
+  - glob
+  - grep
+  - write_todos
+  - task
+deepagents_backend: local_shell
+fallback_model: "openrouter:deepseek/deepseek-chat"
+max_cost_per_call_usd: 0.05
+model_params:
+  max_tokens: 2048
+  temperature: 0.3
+  top_p: 1.0
+interrupt_on:
+  execute:
+    allowed_decisions: [approve, reject]
+  write_file: false
--- a/my-deepagent/docs/schemas/personas/openrouter-claude-architect@1.yaml
+++ b/my-deepagent/docs/schemas/personas/openrouter-claude-architect@1.yaml
@@ -0,0 +1,66 @@
+name: openrouter-claude-architect
+version: 1
+description: "Senior architect. Stack selection, large refactorings, data model changes. Always states trade-offs."
+backend: openrouter
+model: "openrouter:anthropic/claude-opus-4-1"
+provider_origin: "US/Anthropic"
+capabilities:
+  - spec_write
+  - phase_planning
+  - code_edit
+max_risk_level: high
+system_prompt: |
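+  You are the senior Architect of my-deepagent. Converse in Korean.
+
+  ## Role
+  You own large, risky technical decisions:
+  - Selecting and changing the technology stack
+  - Planning large-scale refactorings
+  - Designing and changing data models
+  - Designing system boundaries and interfaces
+
+  ## Using the deepagents tools
+  - write_todos: Always start by writing the analysis scope and decision criteria with write_todos.
+  - read_file: Read the existing architecture, configuration, and code thoroughly.
+  - glob: Map the overall project structure.
+  - grep: Search for dependencies, patterns, and usages.
+  - write_file: Save architecture decision records (ADRs) under artifacts/.
+  - edit_file: Make architecture-level code changes.
+  - task: Delegate concrete implementation to code-editor or another specialist subagent.
+
+  ## Decision principles
+  - State the trade-offs of every decision.
+  - Always present 2-3 alternatives and explain the reason for the choice.
+  - Do not make speculative decisions such as "overkill right now, but we will need it later".
+  - Before deciding, gather sufficient evidence with read_file/grep.
+  - Proceed with irreversible changes only after user approval.
+
+  ## Report format
+  Decision:
+    Choice: [chosen approach]
+    Reason: [concrete rationale]
+    Alternative A: [approach] - trade-off: [pros and cons]
+    Alternative B: [approach] - trade-off: [pros and cons]
+    Risks: [known risks]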
+allowed_tools:
+  - read_file
+  - write_file
+  - edit_file
+  - ls
+  - glob
+  - grep
+  - write_todos
+  - task
+deepagents_backend: local_shell
+fallback_model: "openrouter:anthropic/claude-sonnet-4-6"
+max_cost_per_call_usd: 0.50
+model_params:
+  max_tokens: 4096
+  temperature: 0.2
+  top_p: 1.0
+interrupt_on:
+  execute:
+    allowed_decisions: [approve, reject]
+  write_file: false
+  task:
+    allowed_decisions: [approve, reject]
--- a/my-deepagent/docs/schemas/personas/openrouter-claude-code-editor@1.yaml
+++ b/my-deepagent/docs/schemas/personas/openrouter-claude-code-editor@1.yaml
@@ -0,0 +1,54 @@
+name: openrouter-claude-code-editor
+version: 1
+description: "Code editing specialist. Strictly follows read → plan → edit → verify."
+backend: openrouter
+model: "openrouter:anthropic/claude-sonnet-4-6"
+provider_origin: "US/Anthropic"
+capabilities:
+  - code_edit
+  - test_first_development
+  - command_execute
+max_risk_level: medium
+system_prompt: |
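+  You are the Code Editor of my-deepagent. Converse in Korean.
+
+  ## Role
+  Modify code safely and accurately. Always follow the order: understand context → plan → edit → verify.
+
+  ## Using the deepagents tools
+  - read_file: Always read the file to be modified and related files first.
+  - glob: Find the files affected by the change.
+  - grep: Search usages of functions and variables to determine the impact scope.
+  - write_todos: After understanding the context, always write the edit plan as a numbered list.
+  - edit_file: Modify part of an existing file. Make only minimal changes.
+  - write_file: Use when writing a new file or rewriting a file in full.
+  - task: Delegate complex subtasks to a subagent.
+  - execute: Guide the user on the command for running tests.
+
+  ## Code editing principles
+  - Before editing, always understand the current code with read_file.
+  - Write the plan with write_todos, then edit step by step.
+  - Do not make overly large changes at once. Proceed incrementally.
+  - test_first_development: Write test cases before making the change.
+  - After editing, guide the user to run the tests with execute.
+  - Do not declare completion while TODO, FIXME, or stub code remains.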
+allowed_tools:
+  - read_file
+  - write_file
+  - edit_file
+  - ls
+  - glob
+  - grep
+  - write_todos
+  - task
+deepagents_backend: local_shell
+fallback_model: "openrouter:anthropic/claude-haiku-4-5"
+max_cost_per_call_usd: 0.15
+model_params:
+  max_tokens: 4096
+  temperature: 0.2
+  top_p: 1.0
+interrupt_on:
+  execute:
+    allowed_decisions: [approve, reject]
+  write_file: false
--- a/my-deepagent/docs/schemas/personas/openrouter-claude-code-reviewer@1.yaml
+++ b/my-deepagent/docs/schemas/personas/openrouter-claude-code-reviewer@1.yaml
@@ -0,0 +1,75 @@
+name: openrouter-claude-code-reviewer
+version: 1
+description: "Senior code reviewer. Writes review.json in the dev/review-finding-batch@1 format."
+backend: openrouter
+model: "openrouter:anthropic/claude-sonnet-4-6"
+provider_origin: "US/Anthropic"
+capabilities:
+  - code_review
+  - evidence_check
+max_risk_level: low
+system_prompt: |
+  You are the senior Code Reviewer of my-deepagent. Converse in Korean.
+
+  ## Role
+  Review code carefully and write a review.json that conforms to the dev/review-finding-batch@1 JSON Schema.
+  Delegate security-related items to the security-auditor subagent via task.
+
+  ## Using the deepagents tools
+  - write_todos: Before starting the review, always write the review plan as a numbered list.
+  - read_file: Read the files under review.
+  - glob: Find the list of files to review.
+  - grep: Use pattern search to find potentially problematic code.
+  - write_file: Write the completed review.json to artifacts/review.json.
+  - task: Delegate security review to the security-auditor subagent.
+
+  ## Rules for writing review.json
+  - runId: UUID format
+  - phaseKey: "review"
+  - reviewerRole: "code-reviewer"
+  - findings: list of issues found
+    - severity: info | low | medium | high | critical
+    - category: correctness | evidence | style | security | performance | other
+    - summary: issue summary (be specific)
+    - filePath: path of the affected file (optional)
+    - line: affected line number (optional)
+    - evidence: evidence code or explanation (optional)
+    - verifierStatus: "unverified" (initial value)
+  - summary: overall review summary (at least 10 characters)
+
+  ## Review principles
+  - Do not make subjective criticism without evidence.
+  - Each finding includes a concrete file path and line number.
+  - Delegate security issues to security-auditor via task.
+  - Always save the completed review with write_file to artifacts/review.json.
+allowed_tools:
+  - read_file
+  - ls
+  - glob
+  - grep
+  - write_todos
+  - write_file
+deepagents_backend: local_shell
+fallback_model: "openrouter:anthropic/claude-haiku-4-5"
+max_cost_per_call_usd: 0.10
+model_params:
+  max_tokens: 4096
+  temperature: 0.2
+  top_p: 1.0
+subagents:
+  - name: security-auditor
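+    description: "Isolated review from a security perspective. Uses OWASP categories."
+    system_prompt: |
+      You are a subagent specializing in security review. Converse in Korean.
+      Review the code from an OWASP perspective and report security issues as findings.
+      Always prefix each finding's summary with the OWASP category.
+      Example: "[A01:Broken Access Control] Admin endpoint has no authentication"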
+    allowed_tools:
+      - read_file
+      - glob
+      - grep
+    model: "openrouter:anthropic/claude-sonnet-4-6"
+interrupt_on:
+  execute:
+    allowed_decisions: [approve, reject]
+  write_file: false
--- a/my-deepagent/docs/schemas/personas/openrouter-claude-debugger@1.yaml
+++ b/my-deepagent/docs/schemas/personas/openrouter-claude-debugger@1.yaml
@@ -0,0 +1,54 @@
+name: openrouter-claude-debugger
+version: 1
+description: "Bug diagnosis specialist. Strictly follows reproduce → hypothesize → verify → fix."
+backend: openrouter
+model: "openrouter:anthropic/claude-sonnet-4-6"
+provider_origin: "US/Anthropic"
+capabilities:
+  - code_edit
+  - evidence_check
+  - command_execute
+max_risk_level: medium
+system_prompt: |
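+  You are the Debugger of my-deepagent. Converse in Korean.
+
+  ## Role
+  Diagnose and fix bugs systematically.
+  Always follow the order: reproduce → form a hypothesis → verify the hypothesis → fix.
+
+  ## Using the deepagents tools
+  - write_todos: Before debugging, always write the reproduction conditions, hypotheses, and verification plan.
+  - read_file: Read the file where the bug occurs and related files.
+  - glob: Find the range of affected files.
+  - grep: Search related code by error message, function name, or variable name.
+  - execute: Guide the user on commands for running tests and checking logs.
+  - edit_file: Fix the bug with minimal changes.
+  - write_file: Save reproduction scripts or diagnosis results.
+  - task: Delegate to the log-analyzer subagent when log analysis is needed.
+
+  ## Debugging principles
+  - Do not fix based on guesswork alone. Always verify the hypothesis.
+  - When there are several hypotheses, verify the simplest one first.
+  - Document the root cause in artifacts/diagnosis.json in the dev/spec@1 format.
+  - After fixing, guide the user to run regression tests with execute.
+  - To claim "the bug is fixed", verification by tests must be complete.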
+allowed_tools:
+  - read_file
+  - write_file
+  - edit_file
+  - ls
+  - glob
+  - grep
+  - write_todos
+  - task
+deepagents_backend: local_shell
+fallback_model: "openrouter:anthropic/claude-haiku-4-5"
+max_cost_per_call_usd: 0.15
+model_params:
+  max_tokens: 4096
+  temperature: 0.2
+  top_p: 1.0
+interrupt_on:
+  execute:
+    allowed_decisions: [approve, reject]
+  write_file: false
--- a/my-deepagent/docs/schemas/personas/openrouter-claude-phase-planner@1.yaml
+++ b/my-deepagent/docs/schemas/personas/openrouter-claude-phase-planner@1.yaml
@@ -0,0 +1,58 @@
+name: openrouter-claude-phase-planner
+version: 1
+description: "Reads the spec and writes an execution phase plan in the dev/phase-plan@1 format."
+backend: openrouter
+model: "openrouter:anthropic/claude-sonnet-4-6"
+provider_origin: "US/Anthropic"
+capabilities:
+  - phase_planning
+  - task_dag_planning
+max_risk_level: low
+system_prompt: |
+  You are the Phase Planner of my-deepagent. Converse in Korean.
+
+  ## Role
+  Read artifacts/spec.json and write a phase-plan.json that conforms to the dev/phase-plan@1 JSON Schema.
+
+  ## Using the deepagents tools
+  - write_todos: Before starting work, always write the plan as a numbered list.
+  - read_file: Read artifacts/spec.json and related documents.
+  - glob: Find relevant files.
+  - grep: Search the codebase for patterns.
+  - write_file: Write the completed phase-plan.json to artifacts/phase-plan.json.
+
+  ## Rules for writing phase-plan.json
+  - runId: use the same UUID as spec.json
+  - phaseKey: "planning"
+  - phases: array of execution phases
+    - key: unique phase identifier (lowercase letters and hyphens)
+    - title: phase title
+    - role: responsible role (spec_writer | reviewer | verifier | debugger | fixer, etc.)
+    - instructions: concrete instructions for the phase
+    - expected_artifact: optional (path, schema)
+    - depends_on: optional (list of prerequisite phase keys)
+  - estimated_duration_hours: total estimated duration (optional)
+
+  ## Operating principles
+  - Design the phases so that the spec's acceptance_criteria can be met phase by phase.
+  - Place phases that can run in parallel without depends_on.
+  - Write each phase's instructions concretely so the assignee can understand them clearly.
+  - Always save the completed plan with write_file to artifacts/phase-plan.json.
+allowed_tools:
+  - read_file
+  - write_file
+  - ls
+  - glob
+  - grep
+  - write_todos
+deepagents_backend: local_shell
+fallback_model: "openrouter:anthropic/claude-haiku-4-5"
+max_cost_per_call_usd: 0.10
+model_params:
+  max_tokens: 4096
+  temperature: 0.2
+  top_p: 1.0
+interrupt_on:
+  execute:
+    allowed_decisions: [approve, reject]
+  write_file: false
--- a/my-deepagent/docs/schemas/personas/openrouter-claude-security-auditor@1.yaml
+++ b/my-deepagent/docs/schemas/personas/openrouter-claude-security-auditor@1.yaml
@@ -0,0 +1,61 @@
+name: openrouter-claude-security-auditor
+version: 1
+description: "Security-focused reviewer. Centers on authentication, authorization, input validation, and secret leakage per the OWASP Top 10."
+backend: openrouter
+model: "openrouter:anthropic/claude-sonnet-4-6"
+provider_origin: "US/Anthropic"
+capabilities:
+  - code_review
+  - evidence_check
+max_risk_level: low
+system_prompt: |
+  You are the Security Auditor of my-deepagent. Converse in Korean.
+
+  ## Role
+  Analyze code for security vulnerabilities against the OWASP Top 10 and write security-review.json.
+
+  ## Focus areas
+  - A01: Broken Access Control (insufficient authentication/authorization)
+  - A02: Cryptographic Failures (encryption, secret leakage)
+  - A03: Injection (SQL, Command, LDAP, etc.)
+  - A05: Security Misconfiguration (configuration errors)
+  - A06: Vulnerable Components (supply-chain risk)
+  - A07: Authentication Failures (authentication bypass)
+  - A09: Security Logging Failures (missing audit logs)
+
+  ## Using the deepagents tools
+  - write_todos: Before starting the audit, always write the audit plan as a numbered list.
+  - read_file: Read the files under security audit.
+  - glob: Find configuration files and authentication-related files.
+  - grep: Search for risky patterns (eval, exec, subprocess, os.system, sql, etc.).
+  - write_file: Write the completed security-review.json to artifacts/security-review.json.
+  - write_todos: Plan the audit steps.
+
+  ## Rules for writing findings
+  - Always prefix the summary with the OWASP category: "[A0X:Category] summary"
+  - Judge severity from a CVSS perspective (critical/high/medium/low/info)
+  - Use "security" for category
+  - evidence: quote the vulnerable code line or configuration value directly
+  - Do not write speculative findings without evidence.
+
+  ## Operating principles
+  - Search for risky patterns with grep first, then confirm the context with read_file.
+  - Focus on hardcoded secrets, environment variable leaks, and unauthorized path access.
+  - Always save the completed result with write_file.
+allowed_tools:
+  - read_file
+  - glob
+  - grep
+  - write_file
+  - write_todos
+deepagents_backend: local_shell
+fallback_model: "openrouter:anthropic/claude-haiku-4-5"
+max_cost_per_call_usd: 0.10
+model_params:
+  max_tokens: 4096
+  temperature: 0.2
+  top_p: 1.0
+interrupt_on:
+  execute:
+    allowed_decisions: [approve, reject]
+  write_file: false
--- a/my-deepagent/docs/schemas/personas/openrouter-claude-spec-writer@1.yaml
+++ b/my-deepagent/docs/schemas/personas/openrouter-claude-spec-writer@1.yaml
@@ -0,0 +1,54 @@
+name: openrouter-claude-spec-writer
+version: 1
+description: "Senior spec writer. Requirements analysis → writes JSON per the dev/spec@1 schema."
+backend: openrouter
+model: "openrouter:anthropic/claude-sonnet-4-6"
+provider_origin: "US/Anthropic"
+capabilities:
+  - spec_write
+  - phase_planning
+max_risk_level: low
+system_prompt: |
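+  You are the senior Spec Writer of my-deepagent. Converse in Korean.
+
+  ## Role
+  Analyze the user's requirements and write a spec.json that conforms to the dev/spec@1 JSON Schema.
+
+  ## Using the deepagents tools
+  - write_todos: Before starting work, always write the plan as a numbered list.
+  - read_file: Read existing code and documents to understand the context.
+  - glob: Find the list of relevant files.
+  - grep: Find specific patterns in the codebase.
+  - write_file: Write the completed spec.json to the artifacts/spec.json path.
+
+  ## Rules for writing spec.json
+  - runId: UUID format (e.g. "00000000-0000-0000-0000-000000000001")
+  - phaseKey: string key of the current phase
+  - requirements: detailed description of the user's requirements (at least 10 characters)
+  - acceptance_criteria: list of acceptance criteria (at least one, be specific)
+  - approach: description of the implementation approach (at least 10 characters)
+  - risks: list of risks (empty array [] if none)
+
+  ## Operating principles
+  - Explore the existing codebase thoroughly with read_file/glob/grep before writing the spec.
+  - Write acceptance_criteria so they are measurable and verifiable.
+  - For unclear requirements, make reasonable assumptions and state them in the approach field (dev/spec@1 allows no extra fields).
+  - Always save the completed spec with write_file to artifacts/spec.json.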
+allowed_tools:
+  - read_file
+  - write_file
+  - ls
+  - glob
+  - grep
+  - write_todos
+deepagents_backend: local_shell
+fallback_model: "openrouter:anthropic/claude-haiku-4-5"
+max_cost_per_call_usd: 0.10
+model_params:
+  max_tokens: 4096
+  temperature: 0.2
+  top_p: 1.0
+interrupt_on:
+  execute:
+    allowed_decisions: [approve, reject]
+  write_file: false
--- a/my-deepagent/docs/schemas/personas/openrouter-deepseek-log-analyzer@1.yaml
+++ b/my-deepagent/docs/schemas/personas/openrouter-deepseek-log-analyzer@1.yaml
@@ -0,0 +1,53 @@
+name: openrouter-deepseek-log-analyzer
+version: 1
+description: "Analyzes log files and stack traces. Pattern identification, frequency counts, key-line extraction."
+backend: openrouter
+model: "openrouter:deepseek/deepseek-chat"
+provider_origin: "China/DeepSeek"
+capabilities:
+  - evidence_check
+  - metric_extract
+max_risk_level: low
+system_prompt: |
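+  You are the Log Analyzer of my-deepagent. Converse in Korean.
+
+  ## Role
+  Analyze log files and stack traces to identify patterns and extract key information.
+
+  ## Using the deepagents tools
+  - write_todos: Before starting the analysis, always write the analysis plan as a numbered list.
+  - read_file: Read log files.
+  - glob: Find log files (*.log, *.txt, stderr, etc.).
+  - grep: Search for error patterns, exception classes, and specific messages.
+  - write_file: Write the analysis result to artifacts/log-analysis.json.
+
+  ## Analysis items
+  - Frequency counts by error type (most frequent errors first)
+  - Stack trace pattern identification (group by the same root cause)
+  - Timeline reconstruction (order of events)
+  - Key-line extraction (only the lines that actually matter)
+  - Related-error detection (whether one error triggers another)
+
+  ## Output principles
+  - Do not summarize the entire raw log. Extract only the essentials.
+  - Report high-frequency patterns first.
+  - Mark guesses clearly with the "추정:" ("estimate:") prefix.
+  - Always save the completed analysis with write_file to artifacts/log-analysis.json.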
+allowed_tools:
+  - read_file
+  - ls
+  - glob
+  - grep
+  - write_file
+  - write_todos
+deepagents_backend: local_shell
+fallback_model: "openrouter:anthropic/claude-haiku-4-5"
+max_cost_per_call_usd: 0.005
+model_params:
+  max_tokens: 4096
+  temperature: 0.2
+  top_p: 1.0
+interrupt_on:
+  execute:
+    allowed_decisions: [approve, reject]
+  write_file: false
--- a/my-deepagent/docs/schemas/personas/openrouter-deepseek-verifier@1.yaml
+++ b/my-deepagent/docs/schemas/personas/openrouter-deepseek-verifier@1.yaml
@@ -0,0 +1,54 @@
+name: openrouter-deepseek-verifier
+version: 1
+description: "Independently verifies each finding in review.json. Decides verifierStatus."
+backend: openrouter
+model: "openrouter:deepseek/deepseek-chat"
+provider_origin: "China/DeepSeek"
+capabilities:
+  - evidence_check
+  - objective_eval
+max_risk_level: low
+system_prompt: |
+  You are the Verifier of my-deepagent. Converse in Korean.
+
+  ## Role
+  Independently verify each finding in artifacts/review.json against code evidence and
+  set verifierStatus to confirmed or rejected.
+
+  ## Using the deepagents tools
+  - write_todos: Before starting verification, always write the list of findings and the verification plan.
+  - read_file: Read review.json, then read each finding's filePath to check the evidence.
+  - glob: Find relevant files.
+  - grep: Confirm the patterns mentioned in the findings in the actual code.
+  - write_file: Write the verification result to artifacts/verification.json.
+
+  ## Verification principles
+  - Check each finding independently and directly in the code.
+  - confirmed: the issue was confirmed to actually exist in the code
+  - rejected: on checking the code, the issue does not exist or is already handled
+  - State the basis for the verdict in the evidence field (including the code lines checked).
+  - Do not decide subjectively without evidence.
+  - Always save the completed verification result with write_file to artifacts/verification.json.
+
+  ## verification.json format
+  Same dev/review-finding-batch@1 format as review.json.
+  Update each finding's verifierStatus to confirmed or rejected.
+  Change reviewerRole to "verifier".
+allowed_tools:
+  - read_file
+  - ls
+  - glob
+  - grep
+  - write_file
+  - write_todos
+deepagents_backend: local_shell
+fallback_model: "openrouter:anthropic/claude-haiku-4-5"
+max_cost_per_call_usd: 0.005
+model_params:
+  max_tokens: 4096
+  temperature: 0.2
+  top_p: 1.0
+interrupt_on:
+  execute:
+    allowed_decisions: [approve, reject]
+  write_file: false
--- a/my-deepagent/docs/schemas/workflows/bug-fix-with-reproduction@1.yaml
+++ b/my-deepagent/docs/schemas/workflows/bug-fix-with-reproduction@1.yaml
@@ -0,0 +1,108 @@
+name: bug-fix-with-reproduction
+version: 1
+description: "Bug reproduction → diagnosis → fix → verification. Each phase produces an artifact."
+roles:
+  - id: reproducer
+    required_capabilities:
+      - evidence_check
+    preferred_backends:
+      - openrouter
+    fallback_personas:
+      - "openrouter-claude-debugger@1"
+      - "openrouter-deepseek-log-analyzer@1"
+  - id: debugger
+    required_capabilities:
+      - code_edit
+      - evidence_check
+      - command_execute
+    preferred_backends:
+      - openrouter
+    fallback_personas:
+      - "openrouter-claude-debugger@1"
+  - id: fixer
+    required_capabilities:
+      - code_edit
+      - test_first_development
+    preferred_backends:
+      - openrouter
+    fallback_personas:
+      - "openrouter-claude-code-editor@1"
+  - id: verifier
+    required_capabilities:
+      - evidence_check
+      - objective_eval
+    preferred_backends:
+      - openrouter
+    fallback_personas:
+      - "openrouter-deepseek-verifier@1"
+phases:
+  - key: reproduce
+    title: "Reproduce the bug and document reproduction conditions"
+    risk: low
+    role: reproducer
+    expected_artifact:
+      path: artifacts/reproduction.json
+      schema: dev/spec@1
+    gates:
+      - reproduce_approved
+    timeout_seconds: 300
+    instructions: |
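+      Reproduce the reported bug and document the reproduction conditions.
+      If log files exist, read them with read_file and analyze the patterns.
+      Search for related code with glob/grep.
+      Save the reproduction conditions, environment, inputs, actual output, and expected output
+      in the dev/spec@1 format to artifacts/reproduction.json with write_file.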
+    max_budget_usd: 0.20
+  - key: diagnose
+    title: "Root cause diagnosis"
+    risk: low
+    role: debugger
+    expected_artifact:
+      path: artifacts/diagnosis.json
+      schema: dev/spec@1
+    gates:
+      - diagnose_approved
+    timeout_seconds: 360
+    instructions: |
+      Read artifacts/reproduction.json with read_file and diagnose the root cause.
+      Form hypotheses and verify them in the code with read_file/grep.
+      Verify the simplest hypothesis first.
+      Save the root cause, impact scope, and proposed fix in the dev/spec@1 format
+      to artifacts/diagnosis.json with write_file.
+    max_budget_usd: 0.50
+  - key: fix
+    title: "Bug fix"
+    risk: medium
+    role: fixer
+    expected_artifact:
+      path: artifacts/fix.json
+      schema: dev/spec@1
+    gates:
+      - fix_approved
+    timeout_seconds: 600
+    instructions: |
+      Read artifacts/diagnosis.json with read_file and fix the root cause.
+      Write test cases before making the fix (test_first_development).
+      Apply only minimal changes with edit_file.
+      Save the fix details, list of changed files, and test command in the dev/spec@1 format
+      to artifacts/fix.json with write_file.
+    max_budget_usd: 1.00
+  - key: verify
+    title: "Verify the fix"
+    risk: low
+    role: verifier
+    expected_artifact:
+      path: artifacts/verification.json
+      schema: dev/review-finding-batch@1
+    gates:
+      - verify_approved
+    timeout_seconds: 300
+    instructions: |
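+      Read artifacts/fix.json with read_file and inspect the modified code directly.
+      Verify that the reproduction conditions are resolved and that there is no regression risk.
+      Save the verification result in the dev/review-finding-batch@1 format
+      to artifacts/verification.json with write_file.
+      verifierStatus: confirmed = fix verified, rejected = fix insufficient.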
+    max_budget_usd: 0.20
+default_gates: []
+max_total_budget_usd: 3.0
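A minimal sketch, not part of this commit, of how the per-phase `max_budget_usd` values in bug-fix-with-reproduction@1 relate to `max_total_budget_usd`. The figures are transcribed from the workflow above; the helper name `budget_headroom` is hypothetical, and whether the harness actually enforces this relationship is an assumption the diff does not confirm.

```python
from decimal import Decimal

# Transcribed from bug-fix-with-reproduction@1; kept as strings so Decimal stays exact.
PHASE_BUDGETS_USD = {
    "reproduce": "0.20",
    "diagnose": "0.50",
    "fix": "1.00",
    "verify": "0.20",
}
MAX_TOTAL_BUDGET_USD = "3.0"


def budget_headroom(phase_budgets: dict[str, str], max_total: str) -> Decimal:
    """Hypothetical helper: total budget minus the sum of per-phase budgets."""
    spent = sum((Decimal(v) for v in phase_budgets.values()), Decimal("0"))
    return Decimal(max_total) - spent


headroom = budget_headroom(PHASE_BUDGETS_USD, MAX_TOTAL_BUDGET_USD)
# A negative headroom would mean the phases could jointly exceed the workflow cap.
print(f"headroom: {headroom} USD")
```

Decimal is used instead of float so that sums of values such as 0.20 and 0.50 compare exactly against the cap.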
--- a/my-deepagent/docs/schemas/workflows/code-investigation@1.yaml
+++ b/my-deepagent/docs/schemas/workflows/code-investigation@1.yaml
@@ -0,0 +1,63 @@
+name: code-investigation
+version: 1
+description: "Codebase exploration → summary report. Structure mapping, dependency analysis, issue discovery."
+roles:
+  - id: explorer
+    required_capabilities:
+      - evidence_check
+      - code_review
+    preferred_backends:
+      - openrouter
+    fallback_personas:
+      - "openrouter-claude-code-reviewer@1"
+      - "openrouter-deepseek-verifier@1"
+  - id: summarizer
+    required_capabilities:
+      - evidence_check
+      - final_report_compose
+    preferred_backends:
+      - openrouter
+    fallback_personas:
+      - "openrouter-claude-spec-writer@1"
+phases:
+  - key: explore
+    title: "Explore the codebase and gather information"
+    risk: low
+    role: explorer
+    expected_artifact:
+      path: artifacts/exploration.json
+      schema: dev/spec@1
+    gates: []
+    timeout_seconds: 600
+    instructions: |
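+      Explore the codebase systematically.
+      Map the overall file structure with glob and read the key files with read_file.
+      Search for major patterns, dependencies, and entry points with grep.
+      Save what you find (structure, major components, dependencies, potential issues)
+      in the dev/spec@1 format to artifacts/exploration.json with write_file.
+      requirements field: purpose of the exploration
+      approach field: list of files explored and the method used
+      acceptance_criteria field: key facts discovered
+      risks field: potential issues discovered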
+    max_budget_usd: 0.50
+  - key: summarize
+    title: "Write the final report on the exploration results"
+    risk: low
+    role: summarizer
+    expected_artifact:
+      path: artifacts/report.json
+      schema: common/final-report@1
+    gates:
+      - report_approved
+    timeout_seconds: 300
+    instructions: |
+      Read artifacts/exploration.json with read_file and write the final report
+      in the common/final-report@1 format.
+      status: "completed"
+      phases: information on the explore and summarize phases
+      findings: convert the risks entries of exploration.json into findings
+      artifacts: include the path of exploration.json
+      Save the report to artifacts/report.json with write_file.
+    max_budget_usd: 0.30
+default_gates: []
+max_total_budget_usd: 1.0
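A minimal sketch, not part of this commit, of a capability check between a workflow role's `required_capabilities` and a persona's `capabilities`. The capability sets are transcribed from the workflow and persona files in this diff; the helper name `missing_capabilities` is hypothetical, and how the real resolver treats `fallback_personas` (for example, whether a fallback bypasses the capability match) is not shown in this section.

```python
# Capabilities transcribed from the persona YAML files in this diff.
PERSONA_CAPABILITIES = {
    "openrouter-claude-code-reviewer@1": {"code_review", "evidence_check"},
    "openrouter-deepseek-verifier@1": {"evidence_check", "objective_eval"},
    "openrouter-claude-spec-writer@1": {"spec_write", "phase_planning"},
}

# required_capabilities transcribed from code-investigation@1.
ROLE_REQUIREMENTS = {
    "explorer": {"evidence_check", "code_review"},
    "summarizer": {"evidence_check", "final_report_compose"},
}


def missing_capabilities(role: str, persona: str) -> list[str]:
    """Hypothetical helper: capabilities the role requires that the persona lacks."""
    return sorted(ROLE_REQUIREMENTS[role] - PERSONA_CAPABILITIES[persona])


for role, persona in [
    ("explorer", "openrouter-claude-code-reviewer@1"),
    ("explorer", "openrouter-deepseek-verifier@1"),
    ("summarizer", "openrouter-claude-spec-writer@1"),
]:
    print(role, persona, missing_capabilities(role, persona))
```

Under a strict subset rule, only the code-reviewer persona would fully cover the explorer role; the other two pairings would report gaps, which may be worth checking against the override and ineligibility diagnostics mentioned in the commit message.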
--- a/my-deepagent/docs/schemas/workflows/spec-and-review@1.yaml
+++ b/my-deepagent/docs/schemas/workflows/spec-and-review@1.yaml
@@ -0,0 +1,76 @@
+name: spec-and-review
+version: 1
+description: "Requirements → spec → review → verifier validation"
+roles:
+  - id: spec_writer
+    required_capabilities:
+      - spec_write
+      - phase_planning
+    preferred_backends:
+      - openrouter
+    fallback_personas:
+      - "openrouter-claude-spec-writer@1"
+  - id: reviewer
+    required_capabilities:
+      - code_review
+      - evidence_check
+    preferred_backends:
+      - openrouter
+    fallback_personas:
+      - "openrouter-claude-code-reviewer@1"
+  - id: verifier
+    required_capabilities:
+      - evidence_check
+      - objective_eval
+    preferred_backends:
+      - openrouter
+    fallback_personas:
+      - "openrouter-deepseek-verifier@1"
+phases:
+  - key: spec
+    title: "Requirements analysis and spec writing"
+    risk: low
+    role: spec_writer
+    expected_artifact:
+      path: artifacts/spec.json
+      schema: dev/spec@1
+    gates:
+      - spec_approved
+    timeout_seconds: 300
+    instructions: |
+      Analyze the user's requirements and write a spec.json that conforms to the dev/spec@1 schema.
+      Explore existing code with read_file/glob/grep.
+      Save the completed spec.json to artifacts/spec.json with write_file.
+    max_budget_usd: 0.50
+  - key: review
+    title: "Spec review"
+    risk: low
+    role: reviewer
+    expected_artifact:
+      path: artifacts/review.json
+      schema: dev/review-finding-batch@1
+    gates:
+      - review_approved
+    timeout_seconds: 300
+    instructions: |
+      Read artifacts/spec.json with read_file and write review.json in the dev/review-finding-batch@1 format.
+      Each finding must include severity, category, and summary.
+      Save the completed review.json to artifacts/review.json with write_file.
+    max_budget_usd: 0.50
+  - key: verify
+    title: "Verify the review results"
+    risk: low
+    role: verifier
+    expected_artifact:
+      path: artifacts/verification.json
+      schema: dev/review-finding-batch@1
+    gates:
+      - verify_approved
+    timeout_seconds: 180
+    instructions: |
+      Read artifacts/review.json with read_file and check each finding directly in the code.
+      Set verifierStatus to confirmed or rejected and record the basis in the evidence field.
+      Save the result to artifacts/verification.json with write_file.
+    max_budget_usd: 0.10
+default_gates: []
+max_total_budget_usd: 2.0
--- a/my-deepagent/mypy.ini
+++ b/my-deepagent/mypy.ini
@@ -0,0 +1,15 @@
+[mypy]
+python_version = 3.12
+strict = true
+warn_return_any = true
+warn_unused_configs = true
+disallow_untyped_defs = true
+disallow_incomplete_defs = true
+disallow_untyped_decorators = true
+plugins = pydantic.mypy
+
+[mypy-tests.*]
+disallow_untyped_defs = false
+
+[mypy-alembic.*]
+ignore_errors = true
--- a/my-deepagent/pyproject.toml
+++ b/my-deepagent/pyproject.toml
@@ -0,0 +1,58 @@
+[project]
+name = "my-deepagent"
+version = "0.1.0"
+description = "Workflow harness + persona library + OpenRouter on top of deepagents"
+requires-python = ">=3.12"
+dependencies = [
+    "aiosqlite>=0.20",
+    "alembic>=1.14",
+    "greenlet>=3.0",
+    "sqlalchemy[asyncio]>=2.0",
+    "httpx>=0.28",
+    "jsonschema>=4.23",
+    "keyring>=25.7",
+    "langchain>=0.3.0,<2.0.0",
+    "langchain-core>=0.3.0,<2.0.0",
+    "langchain-openai>=0.3.0,<2.0.0",
+    "langgraph>=0.2.0",
+    "langgraph-checkpoint-sqlite>=2.0.0",
+    "openai>=1.0.0",
+    "platformdirs>=4.9",
+    "prompt-toolkit>=3.0",
+    "pydantic>=2.9",
+    "pydantic-settings>=2.6",
+    "pyyaml>=6.0",
+    "rich>=13.9",
+    "structlog>=24.4",
+    "typer>=0.14",
+    "zstandard>=0.23",
+    "deepagents>=0.6.1,<0.7.0",
+]
+
+[project.scripts]
+mydeepagent = "my_deepagent.cli.main:app"
+
+[build-system]
+requires = ["uv_build>=0.9.28,<0.10.0"]
+build-backend = "uv_build"
+
+[tool.pytest.ini_options]
+asyncio_mode = "auto"
+testpaths = ["tests"]
+addopts = "-v --strict-markers"
+markers = [
+    "integration: marks tests as integration tests that make real external API calls (deselect with '-m not integration')",
+]
+
+[dependency-groups]
+dev = [
+    "mypy>=1.13",
+    "pre-commit>=4.0",
+    "pytest>=8.3",
+    "pytest-asyncio>=0.24",
+    "pytest-httpx>=0.34",
+    "respx>=0.21",
+    "ruff>=0.8",
+    "types-jsonschema>=4.26.0.20260508",
+    "types-pyyaml>=6.0.12.20260510",
+]
--- a/my-deepagent/ruff.toml
+++ b/my-deepagent/ruff.toml
@@ -0,0 +1,12 @@
+target-version = "py312"
+line-length = 100
+
+[lint]
+select = ["E", "W", "F", "I", "N", "B", "UP", "S", "C90", "RUF"]
+ignore = ["S101", "S311"]
+
+[lint.per-file-ignores]
+"tests/**" = ["S", "B"]
+
+[format]
+quote-style = "double"
--- a/my-deepagent/src/my_deepagent/__init__.py
+++ b/my-deepagent/src/my_deepagent/__init__.py
@@ -0,0 +1,3 @@
+"""my-deepagent: workflow harness + persona library + OpenRouter on top of deepagents."""
+
+__version__ = "0.1.0"
--- a/my-deepagent/src/my_deepagent/artifact_schema.py
+++ b/my-deepagent/src/my_deepagent/artifact_schema.py
@@ -0,0 +1,150 @@
+"""Artifact schema registry. Loads JSON Schema 2020-12 documents and validates artifacts.
+
+Schemas live at:
+    {data_dir}/artifacts/<schema_id>.json (user)
+    docs/schemas/artifacts/<schema_id>.json (seed)
+where <schema_id> is "<domain>/<name>@<version>" (e.g. "dev/spec@1").
+"""
+
+from __future__ import annotations
+
+import json
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any
+
+from jsonschema import Draft202012Validator, ValidationError
+from jsonschema.exceptions import SchemaError
+
+from .enums import ErrorClass
+from .errors import MyDeepAgentError
+
+
+@dataclass(frozen=True)
+class ValidationFinding:
+    """One JSON Schema validation error in a structured form."""
+
+    path: str  # JSON pointer-ish: "/findings/0/severity"
+    message: str
+    validator: str  # "enum", "required", "type", ...
+    expected: Any | None
+
+
+@dataclass(frozen=True)
+class ValidationResult:
+    ok: bool
+    errors: tuple[ValidationFinding, ...] = field(default_factory=tuple)
+
+
+class ArtifactSchemaRegistry:
+    """Loads + caches JSON Schema 2020-12 documents from one or more roots.
+
+    Roots are searched in order; first hit wins.
+    """
+
+    def __init__(self, roots: list[Path]) -> None:
+        if not roots:
+            raise MyDeepAgentError(
+                ErrorClass.FATAL,
+                "config_invalid",
+                message="ArtifactSchemaRegistry requires at least one root",
+            )
+        self._roots = [Path(r) for r in roots]
+        self._cache: dict[str, dict[str, Any]] = {}
+        self._validator_cache: dict[str, Draft202012Validator] = {}
+
+    def _resolve_path(self, schema_id: str) -> Path:
+        """Try each root for <root>/<schema_id>.json; return first existing."""
+        if "/" not in schema_id or schema_id.startswith("/") or ".." in schema_id.split("/"):
+            raise MyDeepAgentError(
+                ErrorClass.FATAL,
+                "artifact_schema_unknown",
+                message=(
+                    f"invalid schema_id format: {schema_id!r}"
+                    " (expected '<domain>/<name>@<version>')"
+                ),
+            )
+        rel = Path(f"{schema_id}.json")
+        for root in self._roots:
+            candidate = root / rel
+            if candidate.is_file():
+                return candidate
+        raise MyDeepAgentError(
+            ErrorClass.FATAL,
+            "artifact_schema_unknown",
+            message=(f"schema not found: {schema_id} (searched: {[str(r) for r in self._roots]})"),
+            recovery_hint=f"add {schema_id}.json to one of the registry roots",
+        )
+
+    def load(self, schema_id: str) -> dict[str, Any]:
+        """Return the parsed schema document. Cached after first load."""
+        if schema_id in self._cache:
+            return self._cache[schema_id]
+        path = self._resolve_path(schema_id)
+        try:
+            raw = path.read_text(encoding="utf-8")
+            schema: Any = json.loads(raw)
+        except (OSError, json.JSONDecodeError) as e:
+            raise MyDeepAgentError(
+                ErrorClass.FATAL,
+                "artifact_schema_load_failed",
+                message=f"failed to load schema {schema_id} from {path}: {e}",
+                cause=e,
+            ) from e
+        if not isinstance(schema, dict):
+            raise MyDeepAgentError(
+                ErrorClass.FATAL,
+                "artifact_schema_load_failed",
+                message=f"schema {schema_id} must be a JSON object at {path}",
+            )
+        # Verify the schema document itself is a valid Draft 2020-12 schema.
+        try:
+            Draft202012Validator.check_schema(schema)
+        except SchemaError as e:
+            raise MyDeepAgentError(
+                ErrorClass.FATAL,
+                "artifact_schema_load_failed",
+                message=(f"schema {schema_id} is not a valid Draft 2020-12 schema: {e.message}"),
+                cause=e,
+            ) from e
+        self._cache[schema_id] = schema
+        return schema
+
+    def _validator(self, schema_id: str) -> Draft202012Validator:
+        if schema_id not in self._validator_cache:
+            self._validator_cache[schema_id] = Draft202012Validator(self.load(schema_id))
+        return self._validator_cache[schema_id]
+
+    def validate(self, schema_id: str, data: Any) -> ValidationResult:
+        """Validate *data* against *schema_id*.
+
+        Returns a structured :class:`ValidationResult` — never raises for
+        invalid data.  Raises :class:`~my_deepagent.errors.MyDeepAgentError`
+        with code ``artifact_schema_unknown`` or ``artifact_schema_load_failed``
+        if the schema itself cannot be loaded.
+        """
+        validator = self._validator(schema_id)
+        raw_errors: list[ValidationError] = list(validator.iter_errors(data))
+        if not raw_errors:
+            return ValidationResult(ok=True)
+        findings = tuple(
+            ValidationFinding(
+                path="/" + "/".join(str(p) for p in err.absolute_path),
+                message=err.message,
+                validator=str(err.validator),
+                expected=err.validator_value,
+            )
+            for err in raw_errors
+        )
+        return ValidationResult(ok=False, errors=findings)
+
+    def known_schema_ids(self) -> list[str]:
+        """Enumerate all schemas found across all roots. Sorted, deduplicated."""
+        seen: set[str] = set()
+        for root in self._roots:
+            if not root.is_dir():
+                continue
+            for path in sorted(root.rglob("*.json")):
+                rel = path.relative_to(root).with_suffix("")
+                seen.add(str(rel))
+        return sorted(seen)
--- a/my-deepagent/src/my_deepagent/binding.py
+++ b/my-deepagent/src/my_deepagent/binding.py
@@ -0,0 +1,404 @@
+"""Persona binding algorithm: auto-select, override, capability/risk validation, consent gate."""
+
+from __future__ import annotations
+
+import fcntl
+import json
+import os
+from collections.abc import Iterator
+from contextlib import contextmanager
+from dataclasses import dataclass
+from datetime import UTC, datetime
+from pathlib import Path
+from typing import Any, Literal, cast
+
+from .enums import Backend, RiskLevel
+from .errors import MyDeepAgentError
+from .hash import sha256
+from .persona import Persona
+from .workflow import WorkflowRole, WorkflowTemplate
+
+ConsentDecision = Literal["approve", "block", "once"]
+
+_RISK_RANK: dict[RiskLevel, int] = {
+    RiskLevel.LOW: 0,
+    RiskLevel.MEDIUM: 1,
+    RiskLevel.HIGH: 2,
+}
+
+
+@dataclass(frozen=True)
+class BackendAvailability:
+    """Which backends are reachable in the current environment.
+
+    v0.1.0: openrouter availability is determined solely by API-key presence.
+    Other backends follow the same pattern — callers populate available_backends.
+    """
+
+    available_backends: frozenset[Backend]
+
+    def is_available(self, backend: Backend) -> bool:
+        return backend in self.available_backends
+
+
+@dataclass(frozen=True)
+class BindingOverride:
+    """Per-role persona override: role_id → "persona-name@version" spec string."""
+
+    persona_pinned: dict[str, str]
+
+    @classmethod
+    def parse(cls, raw: dict[str, str] | None) -> BindingOverride:
+        return cls(persona_pinned=dict(raw or {}))
+
+
+@dataclass(frozen=True)
+class Binding:
+    """Resolved binding of a single workflow role to a concrete persona."""
+
+    role_id: str
+    persona: Persona
+    binding_hash: str
+
+
+def is_persona_eligible_for_role(
+    persona: Persona,
+    role: WorkflowRole,
+    template: WorkflowTemplate,
+) -> tuple[bool, str | None]:
+    """Return (eligible, reason_if_not).
+
+    Checks three conditions in order:
+    1. The persona has all capabilities required by the role.
+    2. The persona's allowed_roles (if set) includes this role.
+    3. The persona's max_risk_level covers the highest phase risk for this role.
+    """
+    required = set(role.required_capabilities)
+    have = set(persona.capabilities)
+    if not required.issubset(have):
+        missing = required - have
+        return False, f"missing capabilities: {sorted(c.value for c in missing)}"
+
+    if persona.allowed_roles is not None and role.id not in persona.allowed_roles:
+        return False, f"role {role.id!r} not in persona.allowed_roles"
+
+    max_phase_risk = max(
+        (ph.risk for ph in template.phases if ph.role == role.id),
+        default=RiskLevel.LOW,
+    )
+    if _RISK_RANK[max_phase_risk] > _RISK_RANK[persona.max_risk_level]:
+        return (
+            False,
+            (
+                f"phase risk {max_phase_risk.value} > "
+                f"persona max_risk_level {persona.max_risk_level.value}"
+            ),
+        )
+
+    return True, None
+
+
+def _auto_select(candidates: list[Persona], role: WorkflowRole) -> Persona:
+    """Deterministic selection from eligible candidates.
+
+    Priority (ascending sort key):
+      1. preferred_backends index (lower = more preferred; non-preferred → last)
+      2. version descending (higher = newer)
+      3. name ascending (alphabetical tiebreak)
+      4. compute_hash ascending (hash tiebreak for identical name+version)
+    """
+
+    def _key(p: Persona) -> tuple[int, int, str, str]:
+        try:
+            pref_idx = role.preferred_backends.index(p.backend)
+        except ValueError:
+            pref_idx = len(role.preferred_backends) + 1
+        return (pref_idx, -p.version, p.name, p.compute_hash())
+
+    return sorted(candidates, key=_key)[0]
+
+
+class PersonaConsentStore:
+    """Crash-safe + multi-process-safe JSON file store for per-persona consent decisions.
+
+    Storage: {path} -> {"<persona_hash>": {"decision": "approve|block|once", "decided_at": "..."}}
+    Concurrency guarantees:
+      * Writes are atomic via tmp-file + fsync + os.replace (POSIX rename is atomic).
+      * Cross-process safety via advisory ``fcntl.flock`` on a lock-file at ``{path}.lock``.
+        ``set()`` / ``revoke()`` hold an exclusive lock for the read-modify-write cycle;
+        ``get()`` uses a shared lock for consistent reads. This prevents lost-update
+        races between concurrent ``mydeepagent`` invocations on the same machine.
+    """
+
+    def __init__(self, path: Path) -> None:
+        self._path = path
+        self._lock_path = path.with_suffix(path.suffix + ".lock")
+
+    @contextmanager
+    def _flock(self, exclusive: bool) -> Iterator[None]:
+        """Acquire a POSIX advisory lock for the duration of the block."""
+        self._lock_path.parent.mkdir(parents=True, exist_ok=True)
+        fd = os.open(self._lock_path, os.O_RDWR | os.O_CREAT, 0o600)
+        try:
+            fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
+            try:
+                yield
+            finally:
+                fcntl.flock(fd, fcntl.LOCK_UN)
+        finally:
+            os.close(fd)
+
+    def _load(self) -> dict[str, Any]:
+        if not self._path.is_file():
+            return {}
+        try:
+            raw = self._path.read_text(encoding="utf-8")
+            data: object = json.loads(raw) if raw.strip() else {}
+        except (OSError, json.JSONDecodeError) as e:
+            raise MyDeepAgentError.fatal(
+                "internal_state_corruption",
+                message=f"failed to read consent store at {self._path}: {e}",
+                recovery_hint=(
+                    f"delete {self._path} and re-run; "
+                    "previously granted consents will be re-prompted"
+                ),
+                cause=e,
+            ) from e
+        if not isinstance(data, dict):
+            raise MyDeepAgentError.fatal(
+                "internal_state_corruption",
+                message=f"consent store must be a JSON object: {self._path}",
+            )
+        return data
+
+    def _write(self, data: dict[str, Any]) -> None:
+        """Atomic crash-safe write. Caller must already hold the exclusive flock."""
+        self._path.parent.mkdir(parents=True, exist_ok=True)
+        tmp = self._path.with_suffix(self._path.suffix + ".tmp")
+        payload = json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False)
+        fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
+        try:
+            os.write(fd, payload.encode("utf-8"))
+            os.fsync(fd)
+        finally:
+            os.close(fd)
+        os.replace(tmp, self._path)
+
+    def get(self, persona_hash: str) -> ConsentDecision | None:
+        """Return stored decision or None if absent / unrecognised."""
+        with self._flock(exclusive=False):
+            entry = self._load().get(persona_hash)
+        if entry is None:
+            return None
+        decision = entry.get("decision") if isinstance(entry, dict) else None
+        if decision not in ("approve", "block", "once"):
+            return None
+        return cast(ConsentDecision, decision)
+
+    def set(self, persona_hash: str, decision: ConsentDecision) -> None:
+        """Persist a consent decision. Exclusive lock + atomic write."""
+        with self._flock(exclusive=True):
+            data = self._load()
+            data[persona_hash] = {
+                "decision": decision,
+                "decided_at": datetime.now(UTC).isoformat(timespec="seconds"),
+            }
+            self._write(data)
+
+    def revoke(self, persona_hash: str) -> None:
+        """Remove a previously stored consent decision. Exclusive lock. No-op if absent."""
+        with self._flock(exclusive=True):
+            data = self._load()
+            if persona_hash in data:
+                self._write({k: v for k, v in data.items() if k != persona_hash})
+
+
+def filter_consented_personas(
+    personas: list[Persona],
+    consent_store: PersonaConsentStore,
+) -> list[Persona]:
+    """Remove personas whose consent decision is 'block'.
+
+    'approve', 'once', and absent (None) decisions all allow the persona through.
+    """
+    return [p for p in personas if consent_store.get(p.compute_hash()) != "block"]
+
+
+def _parse_override_version(pinned_spec: str, version_str: str) -> int | None:
+    """Parse the version part of an override spec: None if empty, error if not an integer."""
+    if not version_str:
+        return None
+    try:
+        return int(version_str)
+    except ValueError as e:
+        raise MyDeepAgentError.human_required(
+            "no_eligible_persona",
+            message=(f"override spec '{pinned_spec}' has non-integer version '{version_str}'"),
+            recovery_hint="use the format '<persona-name>@<integer-version>'",
+            cause=e,
+        ) from e
+
+
+def _resolve_override(
+    role: WorkflowRole,
+    template: WorkflowTemplate,
+    pinned_spec: str,
+    eligible: list[Persona],
+    persona_pool: list[Persona],
+    consent_store: PersonaConsentStore,
+) -> Persona:
+    """Resolve an override spec to a single eligible persona or raise human_required."""
+    name, _, version_str = pinned_spec.partition("@")
+    version = _parse_override_version(pinned_spec, version_str)
+    matches = [p for p in eligible if p.name == name and (version is None or p.version == version)]
+    if matches:
+        return matches[0] if len(matches) == 1 else _auto_select(matches, role)
+    # Distinguish: blocked vs. ineligible vs. simply absent.
+    pool_matches = [
+        p for p in persona_pool if p.name == name and (version is None or p.version == version)
+    ]
+    if any(consent_store.get(p.compute_hash()) == "block" for p in pool_matches):
+        raise MyDeepAgentError.human_required(
+            "persona_blocked_by_user",
+            message=f"override persona '{pinned_spec}' is consent-blocked",
+            recovery_hint="run `mydeepagent consents revoke <persona>` to clear the block",
+        )
+    if pool_matches:
+        _, reason = is_persona_eligible_for_role(pool_matches[0], role, template)
+        raise MyDeepAgentError.human_required(
+            "no_eligible_persona",
+            message=(
+                f"override persona '{pinned_spec}' is ineligible for role '{role.id}': {reason}"
+            ),
+        )
+    raise MyDeepAgentError.human_required(
+        "no_eligible_persona",
+        message=f"no eligible persona matches override '{pinned_spec}' for role '{role.id}'",
+    )
+
+
+def _resolve_auto(
+    role: WorkflowRole,
+    template: WorkflowTemplate,
+    eligible: list[Persona],
+    persona_pool: list[Persona],
+    consent_store: PersonaConsentStore,
+) -> Persona:
+    """Auto-select from eligible or raise human_required with diagnostic context."""
+    if eligible:
+        return _auto_select(eligible, role)
+    any_blocked = any(
+        is_persona_eligible_for_role(p, role, template)[0]
+        and consent_store.get(p.compute_hash()) == "block"
+        for p in persona_pool
+    )
+    if any_blocked:
+        raise MyDeepAgentError.human_required(
+            "persona_blocked_by_user",
+            message=(f"all eligible personas for role '{role.id}' are blocked by user consent"),
+        )
+    raise MyDeepAgentError.human_required(
+        "no_eligible_persona",
+        message=f"no eligible persona for role '{role.id}'",
+        recovery_hint=(
+            f"add a persona with capabilities "
+            f"{sorted(c.value for c in role.required_capabilities)} "
+            "to docs/schemas/personas/"
+        ),
+    )
+
+
+def bind_personas(
+    template: WorkflowTemplate,
+    persona_pool: list[Persona],
+    available_backends: BackendAvailability,
+    consent_store: PersonaConsentStore,
+    override: BindingOverride | None = None,
+) -> dict[str, Binding]:
+    """Bind each workflow role to a concrete persona.
+
+    Resolution order per role:
+      1. Apply consent filter (remove 'block' personas).
+      2. Apply eligibility filter (capabilities, allowed_roles, risk level).
+      3. If override is set for this role, pick the pinned persona from eligible.
+      4. Otherwise, auto_select from eligible.
+      5. Validate backend availability.
+      6. Validate openrouter model non-empty.
+
+    Raises:
+        MyDeepAgentError (human_required, 'no_eligible_persona') — no match found.
+        MyDeepAgentError (human_required, 'persona_blocked_by_user') — all candidates blocked.
+        MyDeepAgentError (human_required, 'backend_unavailable') — backend not in environment.
+        MyDeepAgentError (human_required, 'model_unavailable') — openrouter model is blank.
+    """
+    _override = override or BindingOverride.parse(None)
+    consented_pool = filter_consented_personas(persona_pool, consent_store)
+    bindings: dict[str, Binding] = {}
+
+    for role in template.roles:
+        eligible: list[Persona] = [
+            p for p in consented_pool if is_persona_eligible_for_role(p, role, template)[0]
+        ]
+
+        if role.id in _override.persona_pinned:
+            chosen = _resolve_override(
+                role,
+                template,
+                _override.persona_pinned[role.id],
+                eligible,
+                persona_pool,
+                consent_store,
+            )
+        else:
+            chosen = _resolve_auto(role, template, eligible, persona_pool, consent_store)
+
+        # Backend availability check
+        if not available_backends.is_available(chosen.backend):
+            raise MyDeepAgentError.human_required(
+                "backend_unavailable",
+                message=(
+                    f"backend '{chosen.backend.value}' is not available "
+                    f"for persona '{chosen.name}@{chosen.version}'"
+                ),
+                recovery_hint=_backend_recovery_hint(chosen.backend),
+            )
+
+        # Openrouter model non-empty check
+        if chosen.backend == Backend.OPENROUTER and not chosen.model.strip():
+            raise MyDeepAgentError.human_required(
+                "model_unavailable",
+                message=(
+                    f"persona '{chosen.name}@{chosen.version}' "
+                    "has empty model for openrouter backend"
+                ),
+                recovery_hint=(
+                    "set `model:` field in the persona yaml "
+                    "(e.g. 'openrouter:deepseek/deepseek-chat')"
+                ),
+            )
+
+        binding_hash = sha256(
+            {
+                "role_id": role.id,
+                "template_name": template.name,
+                "template_version": template.version,
+                "persona_hash": chosen.compute_hash(),
+                "backend": chosen.backend.value,
+            }
+        )
+        bindings[role.id] = Binding(role_id=role.id, persona=chosen, binding_hash=binding_hash)
+
+    return bindings
+
+
+def _backend_recovery_hint(backend: Backend) -> str:
+    if backend == Backend.OPENROUTER:
+        return "run `mydeepagent login openrouter` to register an API key"
+    if backend in (Backend.ANTHROPIC, Backend.OPENAI, Backend.GOOGLE):
+        return f"run `mydeepagent login {backend.value}` to register an API key"
+    if backend == Backend.FAKE:
+        return (
+            "the 'fake' backend is for tests only; "
+            "add Backend.FAKE to the BackendAvailability set in your test harness"
+        )
+    return f"enable backend '{backend.value}' in config and ensure prerequisites"
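Review note: the tiebreak order documented in `_auto_select` can be checked in isolation. The sketch below is a simplified model, not the code in this patch: `Candidate` is a hypothetical stand-in for `Persona` that keeps only the fields the sort key reads, `digest` stands in for `compute_hash()`, and `min(...)` is used in place of `sorted(...)[0]`, which selects the same element.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for Persona with only the fields the sort key reads."""

    name: str
    version: int
    backend: str
    digest: str


def pick(candidates: list[Candidate], preferred_backends: list[str]) -> Candidate:
    """Pick by (backend preference, newest version, name, digest), all ascending."""

    def key(c: Candidate) -> tuple[int, int, str, str]:
        try:
            pref = preferred_backends.index(c.backend)
        except ValueError:
            # Backends the role did not list sort after every listed one.
            pref = len(preferred_backends) + 1
        return (pref, -c.version, c.name, c.digest)

    return min(candidates, key=key)


pool = [
    Candidate("reviewer", 3, "openai", "bb"),
    Candidate("reviewer", 2, "openrouter", "aa"),
    Candidate("auditor", 3, "openrouter", "cc"),
    Candidate("reviewer", 3, "openrouter", "dd"),
]
chosen = pick(pool, ["openrouter", "anthropic"])
chosen_reversed = pick(list(reversed(pool)), ["openrouter", "anthropic"])
```

Because every component of the key is a total order and the digest is the last tiebreak, the result does not depend on the order of the input pool.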
--- a/my-deepagent/src/my_deepagent/cli/__init__.py
+++ b/my-deepagent/src/my_deepagent/cli/__init__.py
--- a/my-deepagent/src/my_deepagent/cli/doctor.py
+++ b/my-deepagent/src/my_deepagent/cli/doctor.py
@@ -0,0 +1 @@
+"""CLI doctor command for environment diagnostics. Implemented in Step 12."""
--- a/my-deepagent/src/my_deepagent/cli/interactive.py
+++ b/my-deepagent/src/my_deepagent/cli/interactive.py
@@ -0,0 +1 @@
+"""CLI interactive subcommand. Implemented in Step 10."""
--- a/my-deepagent/src/my_deepagent/cli/main.py
+++ b/my-deepagent/src/my_deepagent/cli/main.py
@@ -0,0 +1 @@
+"""Typer CLI entry point. Filled in Step 6."""
--- a/my-deepagent/src/my_deepagent/cli/run.py
+++ b/my-deepagent/src/my_deepagent/cli/run.py
@@ -0,0 +1 @@
+"""CLI run command implementation. Implemented in Step 6."""
--- a/my-deepagent/src/my_deepagent/cli/seed.py
+++ b/my-deepagent/src/my_deepagent/cli/seed.py
@@ -0,0 +1 @@
+"""CLI seed command for importing persona/workflow YAML assets. Implemented in Step 6."""
--- a/my-deepagent/src/my_deepagent/cli/stats.py
+++ b/my-deepagent/src/my_deepagent/cli/stats.py
@@ -0,0 +1 @@
+"""CLI stats command for usage summary. Implemented in Step 12."""
--- a/my-deepagent/src/my_deepagent/config.py
+++ b/my-deepagent/src/my_deepagent/config.py
@@ -0,0 +1,109 @@
+"""Application configuration loaded from env, .env, and TOML file via pydantic-settings."""
+
+from __future__ import annotations
+
+from pathlib import Path
+from typing import Literal
+
+from platformdirs import PlatformDirs
+from pydantic import Field, ValidationError, field_validator
+from pydantic_settings import (
+    BaseSettings,
+    PydanticBaseSettingsSource,
+    SettingsConfigDict,
+    TomlConfigSettingsSource,
+)
+
+from .enums import ErrorClass
+from .errors import MyDeepAgentError
+
+_DIRS = PlatformDirs("my-deepagent", "user", roaming=False)
+
+
+class Config(BaseSettings):
+    """Frozen application config. Source priority (high -> low): CLI/env, .env, TOML, defaults."""
+
+    model_config = SettingsConfigDict(
+        env_prefix="MYDEEPAGENT_",
+        env_file=".env",
+        env_file_encoding="utf-8",
+        toml_file=Path(_DIRS.user_config_dir) / "config.toml",
+        frozen=True,
+        extra="ignore",
+    )
+
+    # storage
+    database_url: str = Field(
+        default_factory=lambda: (
+            f"sqlite+aiosqlite:///{Path(_DIRS.user_data_dir) / 'database.sqlite3'}"
+        )
+    )
+    workspace_root: Path = Field(default_factory=Path.cwd)
+    data_dir: Path = Field(default_factory=lambda: Path(_DIRS.user_data_dir))
+    config_dir: Path = Field(default_factory=lambda: Path(_DIRS.user_config_dir))
+    state_dir: Path = Field(default_factory=lambda: Path(_DIRS.user_state_dir))
+
+    # logging / i18n
+    log_level: Literal["trace", "debug", "info", "warn", "error"] = "info"
+    lang: Literal["ko", "en"] = "ko"
+
+    # providers
+    openrouter_api_key: str | None = None
+    openrouter_base_url: str = "https://openrouter.ai/api/v1"
+
+    # observability
+    langsmith_tracing: bool = False
+    langsmith_api_key: str | None = None
+    langsmith_project: str = "my-deepagent"
+
+    # budget
+    budget_daily_usd: float = Field(default=5.0, ge=0)
+    budget_daily_warn_usd: float = Field(default=3.0, ge=0)
+    budget_run_usd: float = Field(default=1.0, ge=0)
+    budget_run_warn_usd: float = Field(default=0.5, ge=0)
+    budget_on_hit: Literal["prompt", "block", "warn_continue"] = "prompt"
+
+    # defaults
+    default_persona: str = "default-interactive"
+
+    @field_validator("workspace_root", "data_dir", "config_dir", "state_dir")
+    @classmethod
+    def _expand(cls, v: Path) -> Path:
+        return Path(v).expanduser().resolve()
+
+    @classmethod
+    def settings_customise_sources(
+        cls,
+        settings_cls: type[BaseSettings],
+        init_settings: PydanticBaseSettingsSource,
+        env_settings: PydanticBaseSettingsSource,
+        dotenv_settings: PydanticBaseSettingsSource,
+        file_secret_settings: PydanticBaseSettingsSource,
+    ) -> tuple[PydanticBaseSettingsSource, ...]:
+        # priority: init > env > dotenv > toml > defaults
+        return (
+            init_settings,
+            env_settings,
+            dotenv_settings,
+            TomlConfigSettingsSource(settings_cls),
+            file_secret_settings,
+        )
+
+
+def load_config(**overrides: object) -> Config:
+    """Load Config with optional kwargs override.
+
+    Wraps pydantic ValidationError in MyDeepAgentError(fatal, config_invalid) per plan §18.
+    """
+    try:
+        return Config(**overrides)  # type: ignore[arg-type]
+    except ValidationError as e:
+        raise MyDeepAgentError(
+            ErrorClass.FATAL,
+            "config_invalid",
+            message=f"config validation failed: {e}",
+            recovery_hint=(
+                f"check .env, environment variables, and {_DIRS.user_config_dir}/config.toml"
+            ),
+            cause=e,
+        ) from e
--- a/my-deepagent/src/my_deepagent/engine.py
+++ b/my-deepagent/src/my_deepagent/engine.py
@@ -0,0 +1 @@
+"""LangGraph run engine orchestrator. Implemented in Step 7."""
--- a/my-deepagent/src/my_deepagent/enums.py
+++ b/my-deepagent/src/my_deepagent/enums.py
@@ -0,0 +1,92 @@
+"""All closed-set enums used across the codebase."""
+
+from enum import StrEnum
+
+
+class Backend(StrEnum):
+    OPENROUTER = "openrouter"
+    ANTHROPIC = "anthropic"
+    OPENAI = "openai"
+    GOOGLE = "google"
+    FAKE = "fake"
+
+
+class Capability(StrEnum):
+    SPEC_WRITE = "spec_write"
+    PHASE_PLANNING = "phase_planning"
+    TASK_DAG_PLANNING = "task_dag_planning"
+    CODE_EDIT = "code_edit"
+    TEST_FIRST_DEVELOPMENT = "test_first_development"
+    CODE_REVIEW = "code_review"
+    EVIDENCE_CHECK = "evidence_check"
+    COMMAND_EXECUTE = "command_execute"
+    BACKTEST_RUN = "backtest_run"
+    METRIC_EXTRACT = "metric_extract"
+    FAILURE_MINING = "failure_mining"
+    OBJECTIVE_EVAL = "objective_eval"
+    FINAL_REPORT_COMPOSE = "final_report_compose"
+
+
+class RiskLevel(StrEnum):
+    LOW = "low"
+    MEDIUM = "medium"
+    HIGH = "high"
+
+
+class ApprovalDecisionAction(StrEnum):
+    APPROVE = "approve"
+    REJECT = "reject"
+    REQUEST_CHANGES = "request_changes"
+    ABORT = "abort"
+
+
+class ApprovalState(StrEnum):
+    PENDING = "pending"
+    APPROVED = "approved"
+    REJECTED = "rejected"
+    CHANGES_REQUESTED = "changes_requested"
+    ABORTED = "aborted"
+    PAUSED = "paused"
+
+
+class RunState(StrEnum):
+    CREATED = "created"
+    BOUND = "bound"
+    PLANNING = "planning"
+    AWAITING_APPROVAL = "awaiting_approval"
+    EXECUTING = "executing"
+    PAUSED = "paused"
+    COMPLETED = "completed"
+    FAILED = "failed"
+    ABORTED = "aborted"
+
+
+class RunPhaseState(StrEnum):
+    PENDING = "pending"
+    RUNNING = "running"
+    AWAITING_ARTIFACT = "awaiting_artifact"
+    VALIDATING = "validating"
+    AWAITING_APPROVAL = "awaiting_approval"
+    COMPLETED = "completed"
+    FAILED = "failed"
+    SKIPPED = "skipped"
+
+
+class SessionState(StrEnum):
+    CREATED = "CREATED"
+    BOOTSTRAPPING = "BOOTSTRAPPING"
+    READY = "READY"
+    BUSY = "BUSY"
+    WAITING_FOR_APPROVAL = "WAITING_FOR_APPROVAL"
+    ARTIFACT_TIMEOUT = "ARTIFACT_TIMEOUT"
+    HUNG = "HUNG"
+    CRASHED = "CRASHED"
+    RESUMING = "RESUMING"
+    REBOOTSTRAPPED = "REBOOTSTRAPPED"
+    FAILED_NEEDS_HUMAN = "FAILED_NEEDS_HUMAN"
+
+
+class ErrorClass(StrEnum):
+    RECOVERABLE = "recoverable"
+    HUMAN_REQUIRED = "human_required"
+    FATAL = "fatal"
--- a/my-deepagent/src/my_deepagent/errors.py
+++ b/my-deepagent/src/my_deepagent/errors.py
@@ -0,0 +1,79 @@
+"""Domain errors. All exceptions raised by my-deepagent inherit from MyDeepAgentError."""
+
+from __future__ import annotations
+
+from uuid import UUID
+
+from .enums import ErrorClass
+
+
+class MyDeepAgentError(Exception):
+    """Base error with structured fields for classification, recovery hint, and context."""
+
+    def __init__(
+        self,
+        error_class: ErrorClass,
+        code: str,
+        *,
+        message: str | None = None,
+        run_id: UUID | None = None,
+        phase_id: UUID | None = None,
+        recovery_hint: str | None = None,
+        cause: BaseException | None = None,
+    ) -> None:
+        super().__init__(message or code)
+        self.error_class = error_class
+        self.code = code
+        self.run_id = run_id
+        self.phase_id = phase_id
+        self.recovery_hint = recovery_hint
+        if cause is not None:
+            self.__cause__ = cause
+            self.__suppress_context__ = True
+
+    def __repr__(self) -> str:
+        parts = [f"class={self.error_class}", f"code={self.code}"]
+        if self.run_id is not None:
+            parts.append(f"run_id={self.run_id}")
+        if self.phase_id is not None:
+            parts.append(f"phase_id={self.phase_id}")
+        if self.recovery_hint:
+            parts.append(f"hint={self.recovery_hint!r}")
+        return f"MyDeepAgentError({', '.join(parts)})"
+
+    @classmethod
+    def recoverable(cls, code: str, **kwargs: object) -> MyDeepAgentError:
+        return MyDeepAgentError(ErrorClass.RECOVERABLE, code, **kwargs)  # type: ignore[arg-type]
+
+    @classmethod
+    def human_required(cls, code: str, **kwargs: object) -> MyDeepAgentError:
+        return MyDeepAgentError(ErrorClass.HUMAN_REQUIRED, code, **kwargs)  # type: ignore[arg-type]
+
+    @classmethod
+    def fatal(cls, code: str, **kwargs: object) -> MyDeepAgentError:
+        return MyDeepAgentError(ErrorClass.FATAL, code, **kwargs)  # type: ignore[arg-type]
+
+
+class BudgetExhaustedError(MyDeepAgentError):
+    """Budget cap hit. Raised by BudgetTracker.assert_can_call when on_hit='block'."""
+
+    def __init__(
+        self,
+        scope: str,
+        projected_usd: float,
+        cap_usd: float,
+        *,
+        run_id: UUID | None = None,
+        recovery_hint: str | None = None,
+    ) -> None:
+        super().__init__(
+            ErrorClass.HUMAN_REQUIRED,
+            "budget_exhausted",
+            message=f"budget '{scope}' exhausted: projected={projected_usd:.4f} cap={cap_usd:.4f}",
+            run_id=run_id,
+            recovery_hint=recovery_hint
+            or f"wait until the next period or extend the cap for scope '{scope}'",
+        )
+        self.scope = scope
+        self.projected_usd = projected_usd
+        self.cap_usd = cap_usd
--- a/my-deepagent/src/my_deepagent/hash.py
+++ b/my-deepagent/src/my_deepagent/hash.py
@@ -0,0 +1,28 @@
+"""Canonical JSON serialization + sha256 hashing for content-addressed identity."""
+
+from __future__ import annotations
+
+import hashlib
+import json
+from typing import Any
+
+
+def canonicalize(value: Any) -> str:
+    """Return canonical JSON: keys sorted by code point, no insignificant whitespace.
+
+    json.dumps with sort_keys=True sorts keys with Python's str ordering, i.e. by Unicode
+    code point. That matches UTF-16 code unit order (the order we want) whenever every key
+    stays within the BMP; for keys with characters outside the BMP it is an approximation.
+    """
+    """
+    return json.dumps(
+        value,
+        sort_keys=True,
+        ensure_ascii=False,
+        separators=(",", ":"),
+        allow_nan=False,
+    )
+
+
+def sha256(value: Any) -> str:
+    """Return sha256 hex digest of canonical JSON of value."""
+    return hashlib.sha256(canonicalize(value).encode("utf-8")).hexdigest()
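Review note: two properties of this module are easy to check by hand. First, the digest does not depend on dict insertion order. Second, code point order and UTF-16 code unit order disagree only when a character outside the BMP is compared with a BMP character at or above U+E000. The sketch below re-implements the same `json.dumps` call under the hypothetical name `canonical_sha256`; the sample keys are arbitrary.

```python
import hashlib
import json


def canonical_sha256(value: object) -> str:
    """Hash the canonical JSON form: sorted keys, compact separators, UTF-8 bytes."""
    text = json.dumps(
        value, sort_keys=True, ensure_ascii=False, separators=(",", ":"), allow_nan=False
    )
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Same content, different insertion order.
a = canonical_sha256({"role_id": "dev", "backend": "openrouter"})
b = canonical_sha256({"backend": "openrouter", "role_id": "dev"})

# A pair of keys where the two sort orders disagree.
bmp_key = "\uff5e"  # U+FF5E, a single UTF-16 code unit 0xFF5E
astral_key = "\U0001f600"  # U+1F600, a surrogate pair whose first unit is 0xD83D
by_code_point = sorted([astral_key, bmp_key])
# Big-endian UTF-16 bytes compare in the same order as the code units themselves.
by_utf16 = sorted([astral_key, bmp_key], key=lambda s: s.encode("utf-16-be"))
```

For keys limited to ASCII or the rest of the BMP, which is the expected case for schema field names, the two orders coincide and the hash is stable across implementations that sort by code unit.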
--- a/my-deepagent/src/my_deepagent/i18n/__init__.py
+++ b/my-deepagent/src/my_deepagent/i18n/__init__.py
--- a/my-deepagent/src/my_deepagent/i18n/en.toml
+++ b/my-deepagent/src/my_deepagent/i18n/en.toml
--- a/my-deepagent/src/my_deepagent/i18n/ko.toml
+++ b/my-deepagent/src/my_deepagent/i18n/ko.toml
--- a/my-deepagent/src/my_deepagent/interactive.py
+++ b/my-deepagent/src/my_deepagent/interactive.py
@@ -0,0 +1 @@
+"""Interactive REPL loop for TUI sessions. Implemented in Step 10."""
--- a/my-deepagent/src/my_deepagent/middleware/__init__.py
+++ b/my-deepagent/src/my_deepagent/middleware/__init__.py
--- a/my-deepagent/src/my_deepagent/middleware/audit.py
+++ b/my-deepagent/src/my_deepagent/middleware/audit.py
@@ -0,0 +1,73 @@
+"""AuditToolMiddleware: capture every tool call for audit log + DB.
+
+Records: name, args, result/error, duration.
+"""
+
+from __future__ import annotations
+
+import time
+from typing import Any
+from uuid import UUID
+
+from langchain.agents.middleware import AgentMiddleware
+
+
+class AuditToolMiddleware(AgentMiddleware):
+    """Record every tool invocation for the audit log and DB sink (Step 8)."""
+
+    def __init__(
+        self,
+        run_id: UUID | None = None,
+        phase_id: UUID | None = None,
+        interactive_session_id: UUID | None = None,
+        recorder: Any | None = None,
+    ) -> None:
+        super().__init__()
+        self.run_id = run_id
+        self.phase_id = phase_id
+        self.interactive_session_id = interactive_session_id
+        self.recorder = recorder
+
+    async def awrap_tool_call(self, request: Any, handler: Any) -> Any:
+        started = time.perf_counter()
+        # ToolCallRequest exposes tool_call dict with 'name' and 'args'
+        tool_call = getattr(request, "tool_call", {}) or {}
+        name: str = tool_call.get("name", "unknown") if isinstance(tool_call, dict) else "unknown"
+        args: dict[str, Any] = (
+            tool_call.get("args", {}) if isinstance(tool_call, dict) else {}
+        ) or {}
+        try:
+            result = await handler(request)
+        except Exception as e:
+            await self._record(name, args, None, type(e).__name__, started)
+            raise
+        await self._record(name, args, result, None, started)
+        return result
+
+    async def _record(
+        self,
+        name: str,
+        args: dict[str, Any],
+        result: Any,
+        error: str | None,
+        started: float,
+    ) -> None:
+        if self.recorder is None:
+            return
+        serializable_result: str | int | float | bool | dict[str, Any] | list[Any] | None
+        if isinstance(result, (str, int, float, bool, dict, list)) or result is None:
+            serializable_result = result
+        else:
+            serializable_result = str(result)
+        await self.recorder(
+            {
+                "tool_name": name,
+                "args": args,
+                "result": serializable_result,
+                "error": error,
+                "duration_ms": int((time.perf_counter() - started) * 1000),
+                "run_id": self.run_id,
+                "phase_id": self.phase_id,
+                "interactive_session_id": self.interactive_session_id,
+            }
+        )
--- a/my-deepagent/src/my_deepagent/middleware/cost.py
+++ b/my-deepagent/src/my_deepagent/middleware/cost.py
@@ -0,0 +1,87 @@
+"""CostMiddleware: capture every LLM call's usage and accumulate cost into the SQLite ledger."""
+
+from __future__ import annotations
+
+import time
+from typing import Any
+from uuid import UUID
+
+from langchain.agents.middleware import AgentMiddleware
+
+from ..monitoring.pricing import PricingCache
+
+
+class CostMiddleware(AgentMiddleware):
+    """Wrap every model call. Compute cost from usage_metadata and persist.
+
+    Step 8 wires the DB writer via the recorder callback.
+    """
+
+    def __init__(
+        self,
+        pricing: PricingCache,
+        model_name: str,
+        run_id: UUID | None = None,
+        phase_id: UUID | None = None,
+        persona_name: str | None = None,
+        recorder: Any | None = None,  # callable(record) -> Awaitable[None] for DB sink (Step 8)
+    ) -> None:
+        super().__init__()
+        self.pricing = pricing
+        self.model_name = model_name
+        self.run_id = run_id
+        self.phase_id = phase_id
+        self.persona_name = persona_name
+        self.recorder = recorder
+
+    async def awrap_model_call(self, request: Any, handler: Any) -> Any:
+        started = time.perf_counter()
+        try:
+            response = await handler(request)
+        except Exception as e:
+            await self._record(
+                input_tokens=0,
+                output_tokens=0,
+                latency_ms=int((time.perf_counter() - started) * 1000),
+                status="error",
+                error_code=type(e).__name__,
+            )
+            raise
+        usage = getattr(response, "usage_metadata", None) or {}
+        in_tokens = int(usage.get("input_tokens", 0) or 0)
+        out_tokens = int(usage.get("output_tokens", 0) or 0)
+        await self._record(
+            input_tokens=in_tokens,
+            output_tokens=out_tokens,
+            latency_ms=int((time.perf_counter() - started) * 1000),
+            status="ok",
+            error_code=None,
+        )
+        return response
+
+    async def _record(
+        self,
+        *,
+        input_tokens: int,
+        output_tokens: int,
+        latency_ms: int,
+        status: str,
+        error_code: str | None,
+    ) -> None:
+        if self.recorder is None:
+            return
+        cost = self.pricing.compute_cost(self.model_name, input_tokens, output_tokens)
+        await self.recorder(
+            {
+                "model": self.model_name,
+                "run_id": self.run_id,
+                "phase_id": self.phase_id,
+                "persona_name": self.persona_name,
+                "input_tokens": input_tokens,
+                "output_tokens": output_tokens,
+                "cost_usd_total": cost,
+                "latency_ms": latency_ms,
+                "status": status,
+                "error_code": error_code,
+            }
+        )
--- a/my-deepagent/src/my_deepagent/middleware/fallback.py
+++ b/my-deepagent/src/my_deepagent/middleware/fallback.py
@@ -0,0 +1,47 @@
+"""FallbackModelMiddleware: retry the model call with a different model on transient HTTP errors."""
+
+from __future__ import annotations
+
+from typing import Any
+
+import httpx
+import openai
+from langchain.agents.middleware import AgentMiddleware
+
+
+class FallbackModelMiddleware(AgentMiddleware):
+    """When the primary model raises a transient error, retry once with the fallback model.
+
+    Transient = HTTP 429, 5xx, network errors. Auth (401/AuthenticationError) and bad request
+    (400 model_not_found) are not retried — those need human intervention.
+    """
+
+    def __init__(self, primary: Any, fallback: Any | None) -> None:
+        super().__init__()
+        self.primary = primary
+        self.fallback = fallback
+
+    async def awrap_model_call(self, request: Any, handler: Any) -> Any:
+        try:
+            return await handler(request)
+        except openai.AuthenticationError:
+            # 401 is human_required, not retryable.
+            raise
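+        except (
+            httpx.HTTPError,
+            openai.RateLimitError,
+            openai.APIConnectionError,
+            # 5xx responses surface as InternalServerError in the openai client.
+            openai.InternalServerError,
+        ):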
+            if self.fallback is None:
+                raise
+            # Best-effort: swap the model bound to the request and retry once.
+            patched = self._with_fallback_model(request)
+            return await handler(patched)
+
+    def _with_fallback_model(self, request: Any) -> Any:
+        """Swap the bound model in the request for the fallback model.
+
+        ModelRequest exposes a `model` attribute (BaseChatModel instance).
+        We replace it with the fallback, mutating the original request in place.
+        ModelRequest is a plain dataclass that allows assignment; the __setattr__
+        DeprecationWarning applies only to ToolCallRequest.
+        """
+        if hasattr(request, "model"):
+            request.model = self.fallback
+        return request
--- a/my-deepagent/src/my_deepagent/middleware/safety.py
+++ b/my-deepagent/src/my_deepagent/middleware/safety.py
@@ -0,0 +1,126 @@
+"""SafetyShellMiddleware: destructive command + secret-path enforcement at the tool layer.
+
+Replaces deepagents.FilesystemPermission for personas using LocalShellBackend,
+since deepagents 0.6.1 does not yet support permissions + execution-capable backends.
+"""
+
+from __future__ import annotations
+
+import re
+from pathlib import Path
+from typing import Any
+
+from langchain.agents.middleware import AgentMiddleware
+from wcmatch import glob as wcglob
+
+from ..errors import MyDeepAgentError
+
+DESTRUCTIVE_PATTERNS: tuple[re.Pattern[str], ...] = tuple(
+    re.compile(p, re.IGNORECASE)
+    for p in (
+        r"\brm\s+-rf\b",
+        r"\bgit\s+reset\s+--hard\b",
+        r"\bgit\s+clean\b",
+        r"\bgit\s+push\s+--force(-with-lease)?\b",
+        r"\bgit\s+branch\s+-D\b",
+        r"\bdocker\s+volume\s+rm\b",
+        r"\bdocker\s+compose\s+down\s+-v\b",
+        r"\bDROP\s+(DATABASE|SCHEMA|TABLE)\b",
+    )
+)
+
+# Mirrors session.DEFAULT_DENY_PATHS but as relative glob patterns for wcmatch.
+# Each sensitive directory is listed twice: once for the directory itself (no trailing
+# slash — Path normalises it away) and once for everything inside it (**).
+DENY_PATH_PATTERNS: tuple[str, ...] = (
+    "**/.env*",
+    "**/*.env*",
+    "**/*token*",
+    "**/*secret*",
+    "**/*credential*",
+    "**/*.pem",
+    "**/*.key",
+    "**/.ssh",
+    "**/.ssh/**",
+    "**/.aws",
+    "**/.aws/**",
+    "**/.config/gcloud",
+    "**/.config/gcloud/**",
+    "**/.kube",
+    "**/.kube/**",
+    "**/.gnupg",
+    "**/.gnupg/**",
+)
+
+_PATH_TOOLS: frozenset[str] = frozenset({"read_file", "write_file", "edit_file", "ls"})
+
+# Tool names that carry shell commands.
+_SHELL_TOOL_NAMES: frozenset[str] = frozenset({"shell", "execute", "run_command"})
+
+_GLOB_FLAGS = wcglob.GLOBSTAR | wcglob.IGNORECASE | wcglob.DOTGLOB
+
+
+def _is_denied_path(path: str) -> bool:
+    """Return True iff the path matches any deny glob pattern."""
+    normalized = str(Path(path)).replace("\\", "/").lstrip("/")
+    for pat in DENY_PATH_PATTERNS:
+        if wcglob.globmatch(normalized, pat, flags=_GLOB_FLAGS):
+            return True
+    return False
+
+
+class SafetyShellMiddleware(AgentMiddleware):
+    """Hard-block destructive shell commands and secret-path file ops at the tool layer."""
+
+    async def awrap_tool_call(self, request: Any, handler: Any) -> Any:
+        name = self._tool_name(request)
+        args = self._tool_args(request)
+        if name in _SHELL_TOOL_NAMES:
+            self._check_shell(args)
+        elif name in _PATH_TOOLS:
+            self._check_path(name, args)
+        return await handler(request)
+
+    @staticmethod
+    def _tool_name(request: Any) -> str:
+        tool_call = getattr(request, "tool_call", None)
+        if isinstance(tool_call, dict):
+            return str(tool_call.get("name") or "")
+        return str(getattr(request, "name", "") or "")
+
+    @staticmethod
+    def _tool_args(request: Any) -> dict[str, Any]:
+        tool_call = getattr(request, "tool_call", None)
+        if isinstance(tool_call, dict):
+            return dict(tool_call.get("args") or {})
+        args = getattr(request, "args", None)
+        return dict(args) if isinstance(args, dict) else {}
+
+    def _check_shell(self, args: dict[str, Any]) -> None:
+        cmd = args.get("command") or args.get("argv") or ""
+        if isinstance(cmd, list):
+            cmd = " ".join(str(x) for x in cmd)
+        cmd_str = str(cmd)
+        for pat in DESTRUCTIVE_PATTERNS:
+            if pat.search(cmd_str):
+                raise MyDeepAgentError.human_required(
+                    "destructive_command_blocked",
+                    message=f"destructive shell command blocked: {cmd_str[:120]}",
+                    recovery_hint=(
+                        "this command is hard-blocked by my-deepagent's safety policy; "
+                        "edit the persona system_prompt to avoid suggesting it"
+                    ),
+                )
+
+    def _check_path(self, tool_name: str, args: dict[str, Any]) -> None:
+        path = args.get("file_path") or args.get("path") or args.get("file") or ""
+        if not isinstance(path, str) or not path:
+            return
+        if _is_denied_path(path):
+            raise MyDeepAgentError.human_required(
+                "secret_access_blocked",
+                message=(f"access to secret-bearing path blocked: tool={tool_name} path={path!r}"),
+                recovery_hint=(
+                    "this path matches a hard-blocked deny pattern (e.g. .env, *.key, .ssh/, .aws/)"
+                ),
+            )
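A small sketch of how the destructive-command regexes behave, using three patterns copied from `DESTRUCTIVE_PATTERNS`. The sample commands are hypothetical; the second list only shows spellings that these three regexes do not match, and makes no claim about the middleware's overall coverage.

```python
import re

# Subset of DESTRUCTIVE_PATTERNS from safety.py, copied verbatim.
PATTERNS = tuple(
    re.compile(p, re.IGNORECASE)
    for p in (
        r"\brm\s+-rf\b",
        r"\bgit\s+push\s+--force(-with-lease)?\b",
        r"\bDROP\s+(DATABASE|SCHEMA|TABLE)\b",
    )
)


def is_destructive(cmd: str) -> bool:
    """Return True if any pattern is found anywhere in the command string."""
    return any(p.search(cmd) for p in PATTERNS)


blocked = [
    "rm -rf build/",
    "cd repo && git push --force-with-lease",
    "psql -c 'drop table users'",  # IGNORECASE catches lowercase SQL
]

# Spellings these particular regexes do not match.
not_blocked = [
    "rm -fr build/",                 # flags in a different order
    "git push origin main --force",  # --force not directly after 'push'
    "ls -la",
]

results = {cmd: is_destructive(cmd) for cmd in blocked + not_blocked}
```

Because matching is purely textual, reordered flags or interleaved arguments slip past; a test table like `not_blocked` is one way to decide which extra patterns are worth adding.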
--- a/my-deepagent/src/my_deepagent/monitoring/__init__.py
+++ b/my-deepagent/src/my_deepagent/monitoring/__init__.py
--- a/my-deepagent/src/my_deepagent/monitoring/langsmith.py
+++ b/my-deepagent/src/my_deepagent/monitoring/langsmith.py
@@ -0,0 +1 @@
+"""LangSmith tracing integration helpers. Implemented in Step 12."""
--- a/my-deepagent/src/my_deepagent/monitoring/pricing.py
+++ b/my-deepagent/src/my_deepagent/monitoring/pricing.py
@@ -0,0 +1,99 @@
+"""OpenRouter model pricing cache + cost computation.
+
+v0.1.0: in-process dict cache + optional DB refresh. Updates are triggered by doctor and
+by background refresh (Step 12).
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+
+import httpx
+
+from ..errors import MyDeepAgentError
+
+
+@dataclass(frozen=True)
+class ModelPrice:
+    model: str  # OpenRouter id, e.g. "deepseek/deepseek-chat"
+    input_per_1k_usd: float
+    output_per_1k_usd: float
+    context_length: int
+
+
+class PricingCache:
+    """In-memory cache of OpenRouter pricing. Caller refreshes via fetch_openrouter_pricing()."""
+
+    def __init__(self) -> None:
+        self._cache: dict[str, ModelPrice] = {}
+
+    def get(self, model: str) -> ModelPrice | None:
+        key = model.removeprefix("openrouter:")
+        return self._cache.get(key)
+
+    def set(self, prices: list[ModelPrice]) -> None:
+        for p in prices:
+            self._cache[p.model] = p
+
+    def compute_cost(self, model: str, input_tokens: int, output_tokens: int) -> float:
+        """Return USD cost. Returns 0.0 if model price is unknown (logged separately)."""
+        price = self.get(model)
+        if price is None:
+            return 0.0
+        return (input_tokens / 1000.0) * price.input_per_1k_usd + (
+            output_tokens / 1000.0
+        ) * price.output_per_1k_usd
+
+
+async def fetch_openrouter_pricing(api_key: str, base_url: str) -> list[ModelPrice]:
+    """Fetch the OpenRouter /models endpoint and parse pricing."""
+    async with httpx.AsyncClient(timeout=10.0) as client:
+        try:
+            r = await client.get(
+                f"{base_url}/models",
+                headers={"Authorization": f"Bearer {api_key}"},
+            )
+            r.raise_for_status()
+        except httpx.HTTPError as e:
+            raise MyDeepAgentError.recoverable(
+                "network_blip",
+                message=f"failed to fetch openrouter pricing: {e}",
+                cause=e,
+            ) from e
+    data: dict[str, object] = r.json()
+    return _parse_pricing_payload(data)
+
+
+def _parse_pricing_payload(data: dict[str, object]) -> list[ModelPrice]:
+    """Parse OpenRouter response.
+
+    Expected format::
+
+        {"data": [{"id": "...", "pricing": {"prompt": "...", "completion": "..."}, ...}]}
+    """
+    models = data.get("data", [])
+    if not isinstance(models, list):
+        return []
+    out: list[ModelPrice] = []
+    for m in models:
+        if not isinstance(m, dict):
+            continue
+        model_id = m.get("id")
+        pricing = m.get("pricing") or {}
+        if not isinstance(model_id, str) or not isinstance(pricing, dict):
+            continue
+        try:
+            prompt_per_token = float(pricing.get("prompt", "0") or "0")
+            completion_per_token = float(pricing.get("completion", "0") or "0")
+            ctx_len = int(m.get("context_length", 0) or 0)
+        except (TypeError, ValueError):
+            continue
+        out.append(
+            ModelPrice(
+                model=model_id,
+                input_per_1k_usd=prompt_per_token * 1000.0,
+                output_per_1k_usd=completion_per_token * 1000.0,
+                context_length=ctx_len,
+            )
+        )
+    return out
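A worked example of the unit conversion and cost formula above. The per-token prices are hypothetical values in the string form the parser expects; the arithmetic mirrors `_parse_pricing_payload` and `PricingCache.compute_cost`.

```python
import math

# Hypothetical per-token prices, as strings like the /models payload fields.
pricing = {"prompt": "0.0000005", "completion": "0.0000015"}

# Per-token -> per-1k-token, as in _parse_pricing_payload.
input_per_1k_usd = float(pricing["prompt"]) * 1000.0       # 0.0005 USD / 1k
output_per_1k_usd = float(pricing["completion"]) * 1000.0  # 0.0015 USD / 1k


def compute_cost(input_tokens: int, output_tokens: int) -> float:
    # Same formula as PricingCache.compute_cost.
    return (input_tokens / 1000.0) * input_per_1k_usd + (
        output_tokens / 1000.0
    ) * output_per_1k_usd


cost = compute_cost(input_tokens=2000, output_tokens=500)
# 2.0 * 0.0005 + 0.5 * 0.0015 = 0.001 + 0.00075 = 0.00175 USD
assert math.isclose(cost, 0.00175, rel_tol=1e-9)

# Cache keys are bare OpenRouter ids; the "openrouter:" prefix is stripped on lookup.
key = "openrouter:deepseek/deepseek-chat".removeprefix("openrouter:")
assert key == "deepseek/deepseek-chat"
```

Since an unknown model yields a cost of 0.0 rather than an error, a zero in the ledger can mean either a free call or a missing price entry.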
--- a/my-deepagent/src/my_deepagent/monitoring/stats.py
+++ b/my-deepagent/src/my_deepagent/monitoring/stats.py
@@ -0,0 +1 @@
+"""Run statistics aggregation and reporting. Implemented in Step 12."""
--- a/my-deepagent/src/my_deepagent/persistence/__init__.py
+++ b/my-deepagent/src/my_deepagent/persistence/__init__.py
@@ -0,0 +1,6 @@
+"""Persistence layer: SQLAlchemy async ORM + LangGraph checkpointer."""
+
+from .checkpointer import get_checkpointer_ctx
+from .db import Database
+
+__all__ = ["Database", "get_checkpointer_ctx"]
--- a/my-deepagent/src/my_deepagent/persistence/checkpointer.py
+++ b/my-deepagent/src/my_deepagent/persistence/checkpointer.py
@@ -0,0 +1,41 @@
+"""LangGraph SqliteSaver wrapper. Use only as a context manager to ensure connection cleanup.
+
+``SqliteSaver.from_conn_string`` is a ``@contextmanager`` classmethod that yields
+a ``SqliteSaver`` instance and closes the underlying sqlite3 connection on exit.
+Direct manual lifecycle management (entering context without ``with``) leaks connections
+and is not supported by this module.
+
+Usage::
+
+    with get_checkpointer_ctx(path) as saver:
+        graph = create_deep_agent(checkpointer=saver)
+        ...
+"""
+
+from __future__ import annotations
+
+from collections.abc import Iterator
+from contextlib import contextmanager
+from pathlib import Path
+
+from langgraph.checkpoint.sqlite import SqliteSaver
+
+
+@contextmanager
+def get_checkpointer_ctx(checkpoints_db_path: Path) -> Iterator[SqliteSaver]:
+    """Yield a SqliteSaver bound to *checkpoints_db_path*.
+
+    Creates the parent directory and the database file if they do not exist.
+    The underlying sqlite3 connection is closed automatically on context exit.
+    This is the only supported way to obtain a SqliteSaver in this project —
+    direct manual lifecycle management is not provided.
+
+    Args:
+        checkpoints_db_path: Filesystem path for the SQLite checkpoint database.
+
+    Yields:
+        SqliteSaver: Ready-to-use LangGraph checkpoint saver.
+    """
+    checkpoints_db_path.parent.mkdir(parents=True, exist_ok=True)
+    with SqliteSaver.from_conn_string(str(checkpoints_db_path)) as saver:
+        yield saver
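The lifecycle guarantee described above (parent directory created, connection closed on exit) can be illustrated with plain `sqlite3`. `open_db` is a hypothetical stand-in for `get_checkpointer_ctx`; it does not involve `SqliteSaver`.

```python
import sqlite3
import tempfile
from collections.abc import Iterator
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def open_db(path: Path) -> Iterator[sqlite3.Connection]:
    """Hypothetical stand-in for get_checkpointer_ctx using plain sqlite3."""
    path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(path))
    try:
        yield conn
    finally:
        conn.close()  # runs on normal exit and when the body raises


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "nested" / "checkpoints.sqlite3"
    with open_db(db_path) as conn:
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1)")
        conn.commit()
        rows = conn.execute("SELECT x FROM t").fetchall()
        created = db_path.exists()

    # After the block the connection is closed; further use is an error.
    try:
        conn.execute("SELECT 1")
    except sqlite3.ProgrammingError:
        closed = True
    else:
        closed = False
```

Anything built on the yielded object, such as a compiled graph, has to finish its work inside the `with` block for the same reason.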
--- a/my-deepagent/src/my_deepagent/persistence/db.py
+++ b/my-deepagent/src/my_deepagent/persistence/db.py
@@ -0,0 +1,91 @@
+"""Async SQLAlchemy engine + session factory with WAL mode and busy_timeout."""
+
+from __future__ import annotations
+
+from collections.abc import AsyncIterator
+from contextlib import asynccontextmanager
+
+from sqlalchemy import event
+from sqlalchemy.ext.asyncio import (
+    AsyncEngine,
+    AsyncSession,
+    async_sessionmaker,
+    create_async_engine,
+)
+
+from .models import Base
+
+
+def _attach_sqlite_pragmas(engine: AsyncEngine) -> None:
+    """Attach a synchronous connect-event listener that enables WAL, busy_timeout, FK."""
+
+    @event.listens_for(engine.sync_engine, "connect")
+    def _set_sqlite_pragma(dbapi_connection: object, _conn_record: object) -> None:
+        # dbapi_connection is the DBAPI connection delivered by SQLAlchemy's pool
+        # "connect" event. The signature uses `object` to match the generic
+        # listener protocol. Under aiosqlite it is SQLAlchemy's sync-style adapter
+        # rather than a raw sqlite3.Connection, but it exposes the same `.cursor()`
+        # interface, so the annotation below is for type checking only. The PRAGMAs
+        # are SQLite-specific: attach this listener to SQLite engines only.
+        import sqlite3  # local import to avoid circular or non-SQLite coupling
+
+        conn: sqlite3.Connection = dbapi_connection  # type: ignore[assignment]
+        cursor = conn.cursor()
+        cursor.execute("PRAGMA journal_mode=WAL")
+        cursor.execute("PRAGMA busy_timeout=5000")
+        cursor.execute("PRAGMA foreign_keys=ON")
+        cursor.close()
+
+
+class Database:
+    """Façade over async engine + session maker.
+
+    Usage::
+
+        db = Database("sqlite+aiosqlite:///path/to/db.sqlite3")
+        await db.init_schema()          # dev/test only; production runs alembic upgrade head
+        async with db.session() as s:
+            result = await s.execute(...)
+        await db.dispose()
+
+    For production deployments, call ``alembic upgrade head`` instead of
+    ``init_schema`` so that migration history is tracked.
+    """
+
+    def __init__(self, database_url: str) -> None:
+        self._engine: AsyncEngine = create_async_engine(
+            database_url,
+            # poolclass=None is the same as omitting it: SQLAlchemy picks the
+            poolclass=None,  # dialect's default pool for this URL
+            echo=False,
+        )
+        _attach_sqlite_pragmas(self._engine)
+        self._session_factory: async_sessionmaker[AsyncSession] = async_sessionmaker(
+            bind=self._engine,
+            expire_on_commit=False,
+            autoflush=False,
+        )
+
+    async def init_schema(self) -> None:
+        """Create all ORM-defined tables.
+
+        For production, prefer ``alembic upgrade head``.
+        For tests, this is the fastest way to get a clean schema.
+        """
+        async with self._engine.begin() as conn:
+            await conn.run_sync(Base.metadata.create_all)
+
+    @asynccontextmanager
+    async def session(self) -> AsyncIterator[AsyncSession]:
+        """Yield an async session; commit on success, rollback on exception."""
+        async with self._session_factory() as session:
+            try:
+                yield session
+                await session.commit()
+            except Exception:
+                await session.rollback()
+                raise
+
+    async def dispose(self) -> None:
+        """Dispose the engine connection pool."""
+        await self._engine.dispose()
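A minimal check of what the three PRAGMAs set by `_attach_sqlite_pragmas` do, using the standard-library `sqlite3` module directly instead of the async engine. The two-table schema is a hypothetical reduction of `runs` / `run_phases`.

```python
import sqlite3
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    conn = sqlite3.connect(str(Path(tmp) / "app.sqlite3"))
    try:
        cur = conn.cursor()
        # journal_mode returns the mode now in effect; WAL needs a file-backed DB.
        journal_mode = cur.execute("PRAGMA journal_mode=WAL").fetchone()[0]
        cur.execute("PRAGMA busy_timeout=5000")
        cur.execute("PRAGMA foreign_keys=ON")

        busy_timeout = cur.execute("PRAGMA busy_timeout").fetchone()[0]
        foreign_keys = cur.execute("PRAGMA foreign_keys").fetchone()[0]

        # With foreign_keys=ON, an orphan child row is rejected.
        cur.execute("CREATE TABLE runs (id TEXT PRIMARY KEY)")
        cur.execute(
            "CREATE TABLE run_phases ("
            "id TEXT PRIMARY KEY, "
            "run_id TEXT NOT NULL REFERENCES runs(id) ON DELETE CASCADE)"
        )
        try:
            cur.execute("INSERT INTO run_phases VALUES ('p1', 'no-such-run')")
        except sqlite3.IntegrityError:
            orphan_rejected = True
        else:
            orphan_rejected = False
        cur.close()
    finally:
        conn.close()
```

`foreign_keys` is a per-connection setting in SQLite, which is why the listener runs on every new pooled connection rather than once at startup.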
--- a/my-deepagent/src/my_deepagent/persistence/models.py
+++ b/my-deepagent/src/my_deepagent/persistence/models.py
@@ -0,0 +1,578 @@
+"""SQLAlchemy 2.0 async ORM models for my-deepagent persistence layer."""
+
+from __future__ import annotations
+
+import uuid
+from typing import Any
+
+from sqlalchemy import (
+    JSON,
+    Boolean,
+    Float,
+    ForeignKey,
+    Index,
+    Integer,
+    String,
+    Text,
+    UniqueConstraint,
+    text,
+)
+from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
+
+
+class Base(DeclarativeBase):
+    """SQLAlchemy declarative base for my-deepagent."""
+
+
+# ---------------------------------------------------------------------------
+# workflow_templates
+# ---------------------------------------------------------------------------
+
+
+class WorkflowTemplateRow(Base):
+    """Content-addressed workflow template definitions."""
+
+    __tablename__ = "workflow_templates"
+
+    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
+    name: Mapped[str] = mapped_column(Text, nullable=False)
+    version: Mapped[int] = mapped_column(Integer, nullable=False)
+    hash: Mapped[str] = mapped_column(Text, nullable=False, unique=True)
+    definition: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
+    created_at: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<WorkflowTemplateRow id={self.id!r} name={self.name!r} version={self.version!r}>"
+
+
+# ---------------------------------------------------------------------------
+# agent_personas
+# ---------------------------------------------------------------------------
+
+
+class AgentPersonaRow(Base):
+    """Content-addressed agent persona definitions."""
+
+    __tablename__ = "agent_personas"
+
+    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
+    name: Mapped[str] = mapped_column(Text, nullable=False)
+    version: Mapped[int] = mapped_column(Integer, nullable=False)
+    hash: Mapped[str] = mapped_column(Text, nullable=False, unique=True)
+    definition: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
+    created_at: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<AgentPersonaRow id={self.id!r} name={self.name!r} version={self.version!r}>"
+
+
+# ---------------------------------------------------------------------------
+# runs
+# ---------------------------------------------------------------------------
+
+
+class RunRow(Base):
+    """Top-level run record: one row per deepagent run invocation."""
+
+    __tablename__ = "runs"
+    __table_args__ = (
+        # Partial unique index: at most one active run per (repo_path, base_branch).
+        # An "active" run is any run whose state is not 'completed', 'failed', or 'aborted'.
+        # SQLite partial index uses a WHERE clause; autogenerate cannot detect this,
+        # so it is managed via a manual alembic migration.
+        Index(
+            "ux_active_run_repo_base",
+            "repo_path",
+            "base_branch",
+            unique=True,
+            sqlite_where=text("state NOT IN ('completed', 'failed', 'aborted')"),
+        ),
+    )
+
+    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
+    # FK to workflow_templates — RESTRICT prevents deleting a template that has runs.
+    template_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("workflow_templates.id", ondelete="RESTRICT"),
+        nullable=False,
+    )
+    template_hash: Mapped[str] = mapped_column(Text, nullable=False)
+    state: Mapped[str] = mapped_column(Text, nullable=False)
+    repo_path: Mapped[str] = mapped_column(Text, nullable=False)
+    base_branch: Mapped[str] = mapped_column(Text, nullable=False)
+    worktree_root: Mapped[str] = mapped_column(Text, nullable=False)
+    # current_phase_id references run_phases.id; however, runs.current_phase_id and
+    # run_phases.run_id form a circular FK pair. SQLite cannot add a foreign key to an
+    # existing table (no ALTER TABLE ... ADD CONSTRAINT), so alembic cannot safely manage
+    # this circular dependency. Therefore current_phase_id carries NO ForeignKey constraint
+    # in the ORM. Callers must maintain referential integrity manually (i.e. always point
+    # to a valid run_phases.id that belongs to this run, or NULL).
+    current_phase_id: Mapped[str | None] = mapped_column(String(36), nullable=True)
+    started_at: Mapped[str | None] = mapped_column(Text, nullable=True)
+    ended_at: Mapped[str | None] = mapped_column(Text, nullable=True)
+    final_report_path: Mapped[str | None] = mapped_column(Text, nullable=True)
+    paused_from_state: Mapped[str | None] = mapped_column(Text, nullable=True)
+    created_at: Mapped[str] = mapped_column(Text, nullable=False)
+    updated_at: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<RunRow id={self.id!r} state={self.state!r}>"
+
+
+# ---------------------------------------------------------------------------
+# run_inputs
+# ---------------------------------------------------------------------------
+
+
+class RunInputRow(Base):
+    """Input snapshot for a run (one-to-one with runs)."""
+
+    __tablename__ = "run_inputs"
+
+    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
+    run_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("runs.id", ondelete="CASCADE"),
+        nullable=False,
+        unique=True,
+    )
+    requirements_md: Mapped[str] = mapped_column(Text, nullable=False)
+    objective: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
+    extra: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
+    input_hash: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<RunInputRow id={self.id!r} run_id={self.run_id!r}>"
+
+
+# ---------------------------------------------------------------------------
+# run_bindings
+# ---------------------------------------------------------------------------
+
+
+class RunBindingRow(Base):
+    """Per-role persona binding for a run."""
+
+    __tablename__ = "run_bindings"
+    __table_args__ = (UniqueConstraint("run_id", "role_id", name="uq_run_bindings_run_role"),)
+
+    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
+    run_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("runs.id", ondelete="CASCADE"),
+        nullable=False,
+    )
+    role_id: Mapped[str] = mapped_column(Text, nullable=False)
+    # FK to agent_personas — RESTRICT prevents deleting a persona that has bindings.
+    persona_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("agent_personas.id", ondelete="RESTRICT"),
+        nullable=False,
+    )
+    persona_hash: Mapped[str] = mapped_column(Text, nullable=False)
+    backend: Mapped[str] = mapped_column(Text, nullable=False)
+    binding_hash: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<RunBindingRow id={self.id!r} run_id={self.run_id!r} role_id={self.role_id!r}>"
+
+
+# ---------------------------------------------------------------------------
+# run_phases
+# ---------------------------------------------------------------------------
+
+
+class RunPhaseRow(Base):
+    """Per-phase execution record for a run."""
+
+    __tablename__ = "run_phases"
+    __table_args__ = (UniqueConstraint("run_id", "phase_key", name="uq_run_phases_run_phase"),)
+
+    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
+    run_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("runs.id", ondelete="CASCADE"),
+        nullable=False,
+    )
+    phase_key: Mapped[str] = mapped_column(Text, nullable=False)
+    seq: Mapped[int] = mapped_column(Integer, nullable=False)
+    state: Mapped[str] = mapped_column(Text, nullable=False)
+    attempts: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
+    started_at: Mapped[str | None] = mapped_column(Text, nullable=True)
+    ended_at: Mapped[str | None] = mapped_column(Text, nullable=True)
+
+    def __repr__(self) -> str:
+        return f"<RunPhaseRow id={self.id!r} run_id={self.run_id!r} phase_key={self.phase_key!r}>"
+
+
+# ---------------------------------------------------------------------------
+# run_events
+# ---------------------------------------------------------------------------
+
+
+class RunEventRow(Base):
+    """Ordered event stream for a run."""
+
+    __tablename__ = "run_events"
+    __table_args__ = (
+        UniqueConstraint("run_id", "seq", name="uq_run_events_run_seq"),
+        UniqueConstraint("run_id", "idempotency_key", name="uq_run_events_run_idempotency"),
+        Index("run_events_run_id_ts_idx", "run_id", "ts"),
+    )
+
+    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
+    run_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("runs.id", ondelete="CASCADE"),
+        nullable=False,
+    )
+    # phase_id references run_phases.id; CASCADE so events are deleted when a phase is deleted.
+    phase_id: Mapped[str | None] = mapped_column(
+        String(36),
+        ForeignKey("run_phases.id", ondelete="CASCADE"),
+        nullable=True,
+    )
+    seq: Mapped[int] = mapped_column(Integer, nullable=False)
+    type: Mapped[str] = mapped_column(Text, nullable=False)
+    payload: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
+    idempotency_key: Mapped[str] = mapped_column(Text, nullable=False)
+    ts: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<RunEventRow id={self.id!r} run_id={self.run_id!r} seq={self.seq!r}>"
+
+
+# ---------------------------------------------------------------------------
+# approval_requests
+# ---------------------------------------------------------------------------
+
+
+class ApprovalRequestRow(Base):
+    """Human approval gate requests."""
+
+    __tablename__ = "approval_requests"
+
+    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
+    run_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("runs.id", ondelete="CASCADE"),
+        nullable=False,
+    )
+    # phase_id references run_phases.id; CASCADE so approval requests are deleted with the phase.
+    phase_id: Mapped[str | None] = mapped_column(
+        String(36),
+        ForeignKey("run_phases.id", ondelete="CASCADE"),
+        nullable=True,
+    )
+    gate_key: Mapped[str] = mapped_column(Text, nullable=False)
+    state: Mapped[str] = mapped_column(Text, nullable=False)
+    idempotency_key: Mapped[str] = mapped_column(Text, nullable=False, unique=True)
+    payload: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
+    created_at: Mapped[str] = mapped_column(Text, nullable=False)
+    resolved_at: Mapped[str | None] = mapped_column(Text, nullable=True)
+
+    def __repr__(self) -> str:
+        return f"<ApprovalRequestRow id={self.id!r} gate_key={self.gate_key!r}>"
+
+
+# ---------------------------------------------------------------------------
+# approval_decisions
+# ---------------------------------------------------------------------------
+
+
+class ApprovalDecisionRow(Base):
+    """Human decisions on approval requests."""
+
+    __tablename__ = "approval_decisions"
+
+    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
+    approval_request_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("approval_requests.id", ondelete="CASCADE"),
+        nullable=False,
+    )
+    action: Mapped[str] = mapped_column(Text, nullable=False)
+    comment: Mapped[str | None] = mapped_column(Text, nullable=True)
+    decided_at: Mapped[str] = mapped_column(Text, nullable=False)
+    idempotency_key: Mapped[str] = mapped_column(Text, nullable=False, unique=True)
+
+    def __repr__(self) -> str:
+        return f"<ApprovalDecisionRow id={self.id!r} action={self.action!r}>"
+
+
+# ---------------------------------------------------------------------------
+# artifacts
+# ---------------------------------------------------------------------------
+
+
+class ArtifactRow(Base):
+    """Content-addressed output artifacts from phases."""
+
+    __tablename__ = "artifacts"
+    __table_args__ = (
+        UniqueConstraint("run_id", "path", "hash", name="uq_artifacts_run_path_hash"),
+    )
+
+    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
+    run_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("runs.id", ondelete="CASCADE"),
+        nullable=False,
+    )
+    # phase_id references run_phases.id; CASCADE so artifacts are deleted with the phase.
+    phase_id: Mapped[str | None] = mapped_column(
+        String(36),
+        ForeignKey("run_phases.id", ondelete="CASCADE"),
+        nullable=True,
+    )
+    path: Mapped[str] = mapped_column(Text, nullable=False)
+    schema_id: Mapped[str] = mapped_column(Text, nullable=False)
+    hash: Mapped[str] = mapped_column(Text, nullable=False)
+    valid: Mapped[bool] = mapped_column(Boolean, nullable=False)
+    validation_error: Mapped[dict[str, Any] | None] = mapped_column(JSON, nullable=True)
+    created_at: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<ArtifactRow id={self.id!r} path={self.path!r} valid={self.valid!r}>"
+
+
+# ---------------------------------------------------------------------------
+# interactive_sessions
+# ---------------------------------------------------------------------------
+
+
+class InteractiveSessionRow(Base):
+    """Interactive (non-run) agent sessions."""
+
+    __tablename__ = "interactive_sessions"
+
+    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
+    # FK to agent_personas — RESTRICT prevents deleting a persona that has interactive sessions.
+    persona_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("agent_personas.id", ondelete="RESTRICT"),
+        nullable=False,
+    )
+    persona_hash: Mapped[str] = mapped_column(Text, nullable=False)
+    started_at: Mapped[str | None] = mapped_column(Text, nullable=True)
+    ended_at: Mapped[str | None] = mapped_column(Text, nullable=True)
+    last_message_at: Mapped[str | None] = mapped_column(Text, nullable=True)
+    state: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<InteractiveSessionRow id={self.id!r} state={self.state!r}>"
+
+
+# ---------------------------------------------------------------------------
+# tool_calls
+# ---------------------------------------------------------------------------
+
+
+class ToolCallRow(Base):
+    """Audit log of every tool invocation (run or interactive)."""
+
+    __tablename__ = "tool_calls"
+    __table_args__ = (Index("tool_calls_run_id_ts_idx", "run_id", "ts"),)
+
+    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
+    # run_id / phase_id / interactive_session_id: a row belongs either to a run (run_id,
+    # optionally phase_id) or to an interactive session, so all three are nullable. No CHECK
+    # in this model enforces that. CASCADE removes audit rows with the parent run or session.
+    run_id: Mapped[str | None] = mapped_column(
+        String(36),
+        ForeignKey("runs.id", ondelete="CASCADE"),
+        nullable=True,
+    )
+    phase_id: Mapped[str | None] = mapped_column(
+        String(36),
+        ForeignKey("run_phases.id", ondelete="CASCADE"),
+        nullable=True,
+    )
+    interactive_session_id: Mapped[str | None] = mapped_column(
+        String(36),
+        ForeignKey("interactive_sessions.id", ondelete="CASCADE"),
+        nullable=True,
+    )
+    tool_name: Mapped[str] = mapped_column(Text, nullable=False)
+    args: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
+    result: Mapped[dict[str, Any] | None] = mapped_column(JSON, nullable=True)
+    error: Mapped[str | None] = mapped_column(Text, nullable=True)
+    duration_ms: Mapped[int] = mapped_column(Integer, nullable=False)
+    ts: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<ToolCallRow id={self.id!r} tool_name={self.tool_name!r}>"
+
+
+# ---------------------------------------------------------------------------
+# llm_calls
+# ---------------------------------------------------------------------------
+
+
+class LlmCallRow(Base):
+    """Full LLM call telemetry: tokens, cost, latency, model."""
+
+    __tablename__ = "llm_calls"
+    __table_args__ = (
+        Index("llm_calls_run_id_ts_idx", "run_id", "ts"),
+        Index("llm_calls_interactive_session_id_ts_idx", "interactive_session_id", "ts"),
+        Index("llm_calls_model_ts_idx", "model", "ts"),
+    )
+
+    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
+    # run_id / phase_id / interactive_session_id: a row belongs either to a run (run_id,
+    # optionally phase_id) or to an interactive session, so all three are nullable. No CHECK
+    # in this model enforces that. CASCADE removes telemetry rows with the parent run or session.
+    run_id: Mapped[str | None] = mapped_column(
+        String(36),
+        ForeignKey("runs.id", ondelete="CASCADE"),
+        nullable=True,
+    )
+    phase_id: Mapped[str | None] = mapped_column(
+        String(36),
+        ForeignKey("run_phases.id", ondelete="CASCADE"),
+        nullable=True,
+    )
+    interactive_session_id: Mapped[str | None] = mapped_column(
+        String(36),
+        ForeignKey("interactive_sessions.id", ondelete="CASCADE"),
+        nullable=True,
+    )
+    thread_id: Mapped[str] = mapped_column(Text, nullable=False)
+    persona_name: Mapped[str] = mapped_column(Text, nullable=False)
+    persona_version: Mapped[int] = mapped_column(Integer, nullable=False)
+    model: Mapped[str] = mapped_column(Text, nullable=False)
+    role: Mapped[str] = mapped_column(Text, nullable=False)
+    turn_index: Mapped[int] = mapped_column(Integer, nullable=False)
+    input_tokens: Mapped[int] = mapped_column(Integer, nullable=False)
+    output_tokens: Mapped[int] = mapped_column(Integer, nullable=False)
+    cached_tokens: Mapped[int] = mapped_column(Integer, nullable=False)
+    reasoning_tokens: Mapped[int] = mapped_column(Integer, nullable=False)
+    cost_usd_input: Mapped[float] = mapped_column(Float, nullable=False)
+    cost_usd_output: Mapped[float] = mapped_column(Float, nullable=False)
+    cost_usd_total: Mapped[float] = mapped_column(Float, nullable=False)
+    latency_ms: Mapped[int] = mapped_column(Integer, nullable=False)
+    status: Mapped[str] = mapped_column(Text, nullable=False)
+    error_code: Mapped[str | None] = mapped_column(Text, nullable=True)
+    request_id: Mapped[str | None] = mapped_column(Text, nullable=True)
+    ts: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<LlmCallRow id={self.id!r} model={self.model!r} status={self.status!r}>"
+
+
+# ---------------------------------------------------------------------------
+# model_pricing
+# ---------------------------------------------------------------------------
+
+
+class ModelPricingRow(Base):
+    """Cached model pricing data (fetched from provider APIs)."""
+
+    __tablename__ = "model_pricing"
+
+    model: Mapped[str] = mapped_column(Text, primary_key=True)
+    input_per_1k_usd: Mapped[float] = mapped_column(Float, nullable=False)
+    output_per_1k_usd: Mapped[float] = mapped_column(Float, nullable=False)
+    context_length: Mapped[int] = mapped_column(Integer, nullable=False)
+    fetched_at: Mapped[str] = mapped_column(Text, nullable=False)
+    raw_payload: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<ModelPricingRow model={self.model!r}>"
+
+
+# ---------------------------------------------------------------------------
+# budget_ledger
+# ---------------------------------------------------------------------------
+
+
+class BudgetLedgerRow(Base):
+    """Per-scope budget tracking (e.g. global, per-run, per-persona)."""
+
+    __tablename__ = "budget_ledger"
+
+    scope: Mapped[str] = mapped_column(Text, primary_key=True)
+    spent_usd: Mapped[float] = mapped_column(Float, nullable=False, default=0.0)
+    cap_usd: Mapped[float | None] = mapped_column(Float, nullable=True)
+    last_updated: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<BudgetLedgerRow scope={self.scope!r} spent_usd={self.spent_usd!r}>"
+
+
+# ---------------------------------------------------------------------------
+# persona_consents
+# ---------------------------------------------------------------------------
+
+
+class PersonaConsentRow(Base):
+    """Persisted persona consent decisions (approve/block)."""
+
+    __tablename__ = "persona_consents"
+
+    persona_hash: Mapped[str] = mapped_column(Text, primary_key=True)
+    persona_name: Mapped[str] = mapped_column(Text, nullable=False)
+    persona_version: Mapped[int] = mapped_column(Integer, nullable=False)
+    decision: Mapped[str] = mapped_column(Text, nullable=False)
+    decided_at: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<PersonaConsentRow persona_hash={self.persona_hash!r} decision={self.decision!r}>"
+
+
+# ---------------------------------------------------------------------------
+# phase_feedback
+# ---------------------------------------------------------------------------
+
+
+class PhaseFeedbackRow(Base):
+    """User feedback on completed phases (reaction + optional comment)."""
+
+    __tablename__ = "phase_feedback"
+
+    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
+    # CASCADE: feedback is deleted when the run is deleted (audit data follows the run lifecycle).
+    run_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("runs.id", ondelete="CASCADE"),
+        nullable=False,
+    )
+    # CASCADE: feedback is deleted when the phase is deleted.
+    phase_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("run_phases.id", ondelete="CASCADE"),
+        nullable=False,
+    )
+    reaction: Mapped[str | None] = mapped_column(Text, nullable=True)
+    comment: Mapped[str | None] = mapped_column(Text, nullable=True)
+    created_at: Mapped[str] = mapped_column(Text, nullable=False)
+
+    def __repr__(self) -> str:
+        return f"<PhaseFeedbackRow id={self.id!r} run_id={self.run_id!r}>"
+
+
+# ---------------------------------------------------------------------------
+# run_commands  (schema-only; used in future steps)
+# ---------------------------------------------------------------------------
+
+
+class RunCommandRow(Base):
+    """Queued commands targeting a run (pause, resume, abort, etc.)."""
+
+    __tablename__ = "run_commands"
+
+    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
+    run_id: Mapped[str] = mapped_column(
+        String(36),
+        ForeignKey("runs.id", ondelete="CASCADE"),
+        nullable=False,
+    )
+    command: Mapped[str] = mapped_column(Text, nullable=False)
+    payload: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
+    idempotency_key: Mapped[str] = mapped_column(Text, nullable=False, unique=True)
+    created_at: Mapped[str] = mapped_column(Text, nullable=False)
+    processed_at: Mapped[str | None] = mapped_column(Text, nullable=True)
+
+    def __repr__(self) -> str:
+        return f"<RunCommandRow id={self.id!r} run_id={self.run_id!r} command={self.command!r}>"
--- a/my-deepagent/src/my_deepagent/persona.py
+++ b/my-deepagent/src/my_deepagent/persona.py
@@ -0,0 +1,154 @@
+"""Persona schema + YAML loader + content-addressed hash + consent helpers."""
+
+from __future__ import annotations
+
+from pathlib import Path
+from typing import Any, Literal
+
+import yaml
+from pydantic import BaseModel, ConfigDict, Field, ValidationInfo, field_validator
+
+from .enums import Backend, Capability, RiskLevel
+from .hash import sha256
+
+
+class FilesystemPermissionSpec(BaseModel):
+    """Persona-side spec for deepagents FilesystemPermission; edit/ls collapse to write/read."""
+
+    model_config = ConfigDict(frozen=True, extra="forbid")
+
+    operations: tuple[Literal["read", "write", "edit", "ls"], ...] = Field(min_length=1)
+    paths: tuple[str, ...] = Field(min_length=1)
+    mode: Literal["allow", "deny"] = "allow"
+
+    @field_validator("paths")
+    @classmethod
+    def _validate_paths(cls, v: tuple[str, ...]) -> tuple[str, ...]:
+        for p in v:
+            if not p.startswith("/"):
+                raise ValueError(f"path must start with '/': {p!r}")
+            if "\x00" in p:
+                raise ValueError(f"path must not contain null bytes: {p!r}")
+            # Check for literal ".." segment — glob paths like "/**" are OK
+            segments = p.split("/")
+            if ".." in segments:
+                raise ValueError(f"path must not contain '..': {p!r}")
+            if "~" in p:
+                raise ValueError(f"path must not contain '~': {p!r}")
+        return v
+
+
+class PersonaSubagent(BaseModel):
+    """Maps to the deepagents SubAgent TypedDict (allowed_tools is emitted as ``tools``)."""
+
+    model_config = ConfigDict(frozen=True, extra="forbid")
+
+    name: str = Field(min_length=1)
+    description: str = Field(min_length=10)
+    system_prompt: str = Field(min_length=10)
+    allowed_tools: tuple[str, ...] = Field(default_factory=tuple)
+    model: str | None = None
+    permissions: tuple[FilesystemPermissionSpec, ...] = Field(default_factory=tuple)
+    # deepagents accepts dict[str, Any] for interrupt_on — intentional Any
+    interrupt_on: dict[str, Any] = Field(default_factory=dict)
+
+
+class Persona(BaseModel):
+    """Persona definition from docs/schemas/personas/<name>@<version>.yaml.
+
+    Immutability: list-valued fields are stored as tuples to prevent post-construction
+    mutation that would invalidate compute_hash(). dict-valued fields (model_params,
+    interrupt_on) remain dict because they are pass-through to deepagents which expects
+    ``dict[str, Any]``; callers must not mutate them.
+    """
+
+    model_config = ConfigDict(frozen=True, extra="forbid")
+
+    name: str = Field(min_length=1)
+    version: int = Field(ge=1)
+    description: str | None = None
+    backend: Backend
+    model: str = Field(min_length=1)
+    provider_origin: str = Field(min_length=1)
+    capabilities: tuple[Capability, ...] = Field(min_length=1)
+    max_risk_level: RiskLevel
+    allowed_roles: tuple[str, ...] | None = None
+    system_prompt: str = Field(min_length=10)
+    allowed_tools: tuple[str, ...] | None = None
+    subagents: tuple[PersonaSubagent, ...] = Field(default_factory=tuple)
+    permissions: tuple[FilesystemPermissionSpec, ...] = Field(default_factory=tuple)
+    # deepagents accepts dict[str, Any] for interrupt_on — intentional Any
+    interrupt_on: dict[str, Any] | None = None
+    # deepagents accepts dict[str, Any] for model_params — intentional Any
+    model_params: dict[str, Any] = Field(default_factory=dict)
+    deepagents_backend: Literal["state", "local_shell", "filesystem", "composite", "langsmith"] = (
+        "local_shell"
+    )
+    skills: tuple[str, ...] = Field(default_factory=tuple)
+    memory_files: tuple[str, ...] = Field(default_factory=tuple)
+    fallback_model: str | None = None
+    max_cost_per_call_usd: float | None = Field(default=None, ge=0)
+
+    @field_validator("model")
+    @classmethod
+    def _validate_openrouter_model(cls, v: str, info: ValidationInfo) -> str:
+        backend = info.data.get("backend") if info.data else None
+        if backend == Backend.OPENROUTER and not v.strip():
+            raise ValueError("openrouter backend requires non-empty model")
+        return v
+
+    def compute_hash(self) -> str:
+        """Content-addressed identity hash (canonical JSON of normalized fields)."""
+        return sha256(
+            {
+                "name": self.name,
+                "version": self.version,
+                "backend": self.backend.value,
+                "model": self.model,
+                "provider_origin": self.provider_origin,
+                "capabilities": sorted(c.value for c in self.capabilities),
+                "max_risk_level": self.max_risk_level.value,
+                "allowed_roles": (
+                    sorted(self.allowed_roles) if self.allowed_roles is not None else None
+                ),
+                "system_prompt": self.system_prompt,
+                "allowed_tools": (
+                    sorted(self.allowed_tools) if self.allowed_tools is not None else None
+                ),
+                "subagents": [s.model_dump() for s in self.subagents],
+                "permissions": [p.model_dump() for p in self.permissions],
+                "interrupt_on": self.interrupt_on,
+                "model_params": self.model_params,
+                "deepagents_backend": self.deepagents_backend,
+                "fallback_model": self.fallback_model,
+                "max_cost_per_call_usd": self.max_cost_per_call_usd,
+                "skills": self.skills,
+                "memory_files": self.memory_files,
+            }
+        )
+
+
+def load_persona_yaml(path: Path) -> Persona:
+    """Load and validate a single persona yaml file."""
+    if not path.is_file():
+        raise FileNotFoundError(f"persona yaml not found: {path}")
+    data = yaml.safe_load(path.read_text(encoding="utf-8"))
+    return Persona.model_validate(data)
+
+
+def load_personas_from_dir(directory: Path) -> list[Persona]:
+    """Load all *.yaml files from a directory, sorted by filename for determinism.
+
+    Raises ValueError if the same (name, version) pair appears more than once.
+    Returns an empty list if the directory does not exist.
+    """
+    if not directory.is_dir():
+        return []
+    personas = [load_persona_yaml(p) for p in sorted(directory.glob("*.yaml"))]
+    seen: dict[tuple[str, int], str] = {}
+    for p in personas:
+        key = (p.name, p.version)
+        if key in seen:
+            raise ValueError(f"duplicate persona name={p.name!r} version={p.version}")
+        seen[key] = p.compute_hash()
+    return personas
--- a/my-deepagent/src/my_deepagent/prompt_envelope.py
+++ b/my-deepagent/src/my_deepagent/prompt_envelope.py
@@ -0,0 +1 @@
+"""Prompt envelope builder for LangChain messages. Implemented in Step 5."""
--- a/my-deepagent/src/my_deepagent/py.typed
+++ b/my-deepagent/src/my_deepagent/py.typed
--- a/my-deepagent/src/my_deepagent/run_event.py
+++ b/my-deepagent/src/my_deepagent/run_event.py
@@ -0,0 +1 @@
+"""Run event types for streaming progress. Implemented in Step 4."""
--- a/my-deepagent/src/my_deepagent/safety.py
+++ b/my-deepagent/src/my_deepagent/safety.py
@@ -0,0 +1 @@
+"""Safety gate for destructive command classification. Implemented in Step 11."""
--- a/my-deepagent/src/my_deepagent/session.py
+++ b/my-deepagent/src/my_deepagent/session.py
@@ -0,0 +1,274 @@
+"""Build a deepagents CompiledStateGraph from a Persona + run context.
+
+Connects:
+  - Persona (config) -> deepagents.create_deep_agent(...)
+  - OpenRouter (model="openrouter:...") -> ChatOpenAI(base_url=openrouter)
+  - Workspace dir -> LocalShellBackend (filesystem + shell execution)
+  - Persona.permissions + DEFAULT_DENY -> deepagents.FilesystemPermission list
+  - Subagents -> deepagents.SubAgent TypedDict list
+  - Middleware list -> passed to create_deep_agent
+"""
+
+from __future__ import annotations
+
+import os
+from pathlib import Path
+from typing import Any, Literal
+from uuid import UUID
+
+from deepagents import FilesystemPermission, SubAgent, create_deep_agent
+from deepagents.backends import (
+    CompositeBackend,
+    FilesystemBackend,
+    LocalShellBackend,
+    StateBackend,
+)
+from langchain_openai import ChatOpenAI
+
+from .config import Config
+from .errors import MyDeepAgentError
+from .persona import FilesystemPermissionSpec, Persona, PersonaSubagent
+
+DEFAULT_DENY_PATHS: tuple[str, ...] = (
+    "/.env*",
+    "/**/*.env*",
+    "/**/*token*",
+    "/**/*secret*",
+    "/**/*credential*",
+    "/**/*.pem",
+    "/**/*.key",
+    "/.ssh/**",
+    "/.aws/**",
+    "/.config/gcloud/**",
+    "/.kube/**",
+    "/.gnupg/**",
+)
+
+
+# Mapping from our richer operation set (read/write/edit/ls) to the deepagents
+# binary set (read/write). deepagents treats ls/grep/glob as read-side and
+# write_file/edit_file as write-side internally, so this collapse is safe.
+_OP_MAP: dict[str, Literal["read", "write"]] = {
+    "read": "read",
+    "write": "write",
+    "edit": "write",
+    "ls": "read",
+}
+
+
+def _map_operations(ops: tuple[str, ...] | list[str]) -> list[Literal["read", "write"]]:
+    """Map our ops to deepagents ops, deduplicating while preserving first-seen order."""
+    seen: set[str] = set()
+    out: list[Literal["read", "write"]] = []
+    for op in ops:
+        mapped = _OP_MAP[op]
+        if mapped not in seen:
+            seen.add(mapped)
+            out.append(mapped)
+    return out
+
+
+def default_safety_permissions() -> list[FilesystemPermission]:
+    """Deny secret-bearing paths, then allow everything else.
+
+    Returned permissions are evaluated in order; first match wins.
+    Deny comes first so the secret patterns are blocked before the catch-all
+    allow can match; every other read/write in the worktree then succeeds.
+    """
+    return [
+        FilesystemPermission(
+            operations=["read", "write"],
+            paths=list(DEFAULT_DENY_PATHS),
+            mode="deny",
+        ),
+        FilesystemPermission(
+            operations=["read", "write"],
+            paths=["/**"],
+            mode="allow",
+        ),
+    ]
+
+
+def _spec_to_permission(spec: FilesystemPermissionSpec) -> FilesystemPermission:
+    """Convert pydantic FilesystemPermissionSpec to deepagents FilesystemPermission.
+
+    Our schema accepts {read, write, edit, ls} for human-readable yaml. deepagents
+    accepts only {read, write}, so edit maps to write and ls maps to read here.
+    """
+    return FilesystemPermission(
+        operations=_map_operations(spec.operations),
+        paths=list(spec.paths),
+        mode=spec.mode,
+    )
+
+
+def _subagent_to_dict(sub: PersonaSubagent) -> SubAgent:
+    """Convert PersonaSubagent -> deepagents SubAgent TypedDict.
+
+    Only includes optional keys when set; deepagents inherits defaults from the parent
+    agent when a subagent omits ``tools`` / ``model`` / ``permissions`` / ``interrupt_on``.
+    """
+    out: dict[str, Any] = {
+        "name": sub.name,
+        "description": sub.description,
+        "system_prompt": sub.system_prompt,
+    }
+    if sub.allowed_tools:
+        out["tools"] = list(sub.allowed_tools)
+    if sub.model is not None:
+        out["model"] = sub.model
+    if sub.permissions:
+        out["permissions"] = [_spec_to_permission(p) for p in sub.permissions]
+    if sub.interrupt_on:
+        out["interrupt_on"] = sub.interrupt_on
+    return out  # type: ignore[return-value]  # TypedDict construction from dict literal
+
+
+def _resolve_openrouter_api_key(config: Config) -> str:
+    """Pull the OpenRouter API key from config -> env -> error.
+
+    Priority: config.openrouter_api_key -> MYDEEPAGENT_OPENROUTER_API_KEY -> OPENROUTER_API_KEY.
+    """
+    if config.openrouter_api_key:
+        return config.openrouter_api_key
+    env_key = os.environ.get("MYDEEPAGENT_OPENROUTER_API_KEY") or os.environ.get(
+        "OPENROUTER_API_KEY"
+    )
+    if env_key:
+        return env_key
+    raise MyDeepAgentError.human_required(
+        "backend_auth_failed",
+        message="OpenRouter API key is not configured",
+        recovery_hint=(
+            "set MYDEEPAGENT_OPENROUTER_API_KEY in .env or run `mydeepagent login openrouter`"
+        ),
+    )
+
+
+def resolve_model_instance(
+    persona: Persona, config: Config, model_override: str | None = None
+) -> Any:
+    """Persona -> langchain BaseChatModel instance or 'provider:model' string.
+
+    For ``openrouter:`` prefix, returns a ``ChatOpenAI`` with ``base_url=openrouter``.
+    For other providers (``anthropic:``, ``openai:``, ``google:``), returns the string as-is
+    so that deepagents' ``init_chat_model`` resolves it via the matching integration package.
+    """
+    model_spec = model_override or persona.model
+    if model_spec.startswith("openrouter:"):
+        params = persona.model_params
+        return ChatOpenAI(
+            model=model_spec.removeprefix("openrouter:"),
+            api_key=_resolve_openrouter_api_key(config),
+            base_url=config.openrouter_base_url,
+            max_tokens=params.get("max_tokens", 4096),
+            temperature=params.get("temperature", 0.2),
+            top_p=params.get("top_p", 1.0),
+        )
+    return model_spec
+
+
+def build_backend(persona: Persona, root_dir: Path) -> Any:
+    """Persona.deepagents_backend -> concrete deepagents backend instance.
+
+    Returns:
+        LocalShellBackend for "local_shell" (filesystem + shell execute, the default).
+        FilesystemBackend for "filesystem" (filesystem only, no shell).
+        None for "state" (deepagents default StateBackend, in-process state only).
+        CompositeBackend for "composite" (local_shell + state-backed /memories/ namespace).
+
+    Raises:
+        MyDeepAgentError(fatal, config_invalid) for unknown backend identifiers
+        or "langsmith" which is reserved for a future milestone.
+    """
+    name = persona.deepagents_backend
+    if name == "local_shell":
+        return LocalShellBackend(
+            root_dir=str(root_dir),
+            virtual_mode=False,
+            timeout=120,
+            max_output_bytes=100_000,
+            inherit_env=False,
+        )
+    if name == "filesystem":
+        return FilesystemBackend(root_dir=str(root_dir), virtual_mode=False, max_file_size_mb=10)
+    if name == "state":
+        return None  # deepagents default StateBackend
+    if name == "composite":
+        return CompositeBackend(
+            default=LocalShellBackend(root_dir=str(root_dir), virtual_mode=False),
+            routes={"/memories/": StateBackend()},
+        )
+    raise MyDeepAgentError.fatal(
+        "config_invalid",
+        message=f"unsupported deepagents_backend: {name!r}",
+        recovery_hint="use one of: local_shell, filesystem, state, composite",
+    )
+
+
+def build_agent(
+    persona: Persona,
+    config: Config,
+    *,
+    root_dir: Path,
+    middleware: list[Any] | None = None,
+    checkpointer: Any | None = None,
+    run_id: UUID | None = None,
+    phase_key: str | None = None,
+    model_override: str | None = None,
+) -> Any:
+    """Construct a deepagents CompiledStateGraph for the given persona.
+
+    Returns a CompiledStateGraph. Caller invokes via
+    ``agent.invoke / ainvoke / astream / astream_events`` with ``{"messages": [...]}`` input.
+
+    deepagents 0.6.1 limitation: FilesystemPermission is rejected when the backend
+    implements SandboxBackendProtocol (e.g. LocalShellBackend). SafetyShellMiddleware
+    enforces path + destructive-command safety in those cases instead.
+    """
+    from .middleware.safety import SafetyShellMiddleware
+
+    model = resolve_model_instance(persona, config, model_override)
+    backend = build_backend(persona, root_dir)
+
+    # SafetyShellMiddleware is always first; caller-supplied middleware is appended after it.
+    all_middleware: list[Any] = [SafetyShellMiddleware()]
+    if middleware:
+        all_middleware.extend(middleware)
+
+    subagents: list[SubAgent] = [_subagent_to_dict(s) for s in persona.subagents]
+
+    kwargs: dict[str, Any] = {
+        "model": model,
+        "system_prompt": persona.system_prompt,
+        "middleware": all_middleware,
+    }
+    if backend is not None:
+        kwargs["backend"] = backend
+
+    # deepagents 0.6.1: FilesystemPermission + SandboxBackendProtocol backend raises
+    # NotImplementedError, so the permissions kwarg is skipped for local_shell and
+    # SafetyShellMiddleware handles path enforcement instead. For the other backends, persona
+    # rules are listed before the defaults, so they take precedence under first-match-wins.
+    use_permissions = persona.deepagents_backend != "local_shell"
+    if use_permissions:
+        permissions: list[FilesystemPermission] = [
+            *(_spec_to_permission(p) for p in persona.permissions),
+            *default_safety_permissions(),
+        ]
+        kwargs["permissions"] = permissions
+
+    if persona.allowed_tools:
+        kwargs["tools"] = list(persona.allowed_tools)
+    if subagents:
+        kwargs["subagents"] = subagents
+    if persona.interrupt_on:
+        kwargs["interrupt_on"] = persona.interrupt_on
+    if checkpointer is not None:
+        kwargs["checkpointer"] = checkpointer
+    if persona.skills:
+        kwargs["skills"] = list(persona.skills)
+    if persona.memory_files:
+        kwargs["memory"] = list(persona.memory_files)
+
+    return create_deep_agent(**kwargs)
--- a/my-deepagent/src/my_deepagent/slash.py
+++ b/my-deepagent/src/my_deepagent/slash.py
@@ -0,0 +1 @@
+"""Slash command registry and dispatcher. Implemented in Step 10."""
--- a/my-deepagent/src/my_deepagent/tui/__init__.py
+++ b/my-deepagent/src/my_deepagent/tui/__init__.py
--- a/my-deepagent/src/my_deepagent/tui/approval.py
+++ b/my-deepagent/src/my_deepagent/tui/approval.py
@@ -0,0 +1 @@
+"""TUI approval dialog for human-in-the-loop actions. Implemented in Step 7."""
--- a/my-deepagent/src/my_deepagent/tui/render.py
+++ b/my-deepagent/src/my_deepagent/tui/render.py
@@ -0,0 +1 @@
+"""TUI Rich panel and table renderers. Implemented in Step 10."""
--- a/my-deepagent/src/my_deepagent/tui/stream.py
+++ b/my-deepagent/src/my_deepagent/tui/stream.py
@@ -0,0 +1 @@
+"""TUI streaming output renderer for run events. Implemented in Step 10."""
--- a/my-deepagent/src/my_deepagent/workflow.py
+++ b/my-deepagent/src/my_deepagent/workflow.py
@@ -0,0 +1,127 @@
+"""WorkflowTemplate schema + YAML loader."""
+
+from __future__ import annotations
+
+from collections import Counter
+from pathlib import Path
+
+import yaml
+from pydantic import BaseModel, ConfigDict, Field, field_validator, model_validator
+
+from .enums import Backend, Capability, RiskLevel
+from .hash import sha256
+
+
+class ExpectedArtifact(BaseModel):
+    """Expected output artifact of a workflow phase."""
+
+    model_config = ConfigDict(frozen=True, extra="forbid", populate_by_name=True)
+
+    path: str = Field(min_length=1)
+    # yaml uses 'schema' key; pydantic attribute is schema_id to avoid shadowing BaseModel.schema
+    schema_id: str = Field(min_length=1, alias="schema")
+
+
+class WorkflowPhase(BaseModel):
+    """Single phase definition inside a workflow template."""
+
+    model_config = ConfigDict(frozen=True, extra="forbid")
+
+    key: str = Field(min_length=1, pattern=r"^[a-z][a-z0-9_]*$")
+    title: str = Field(min_length=1)
+    risk: RiskLevel
+    role: str = Field(min_length=1)
+    expected_artifact: ExpectedArtifact | None = None
+    gates: tuple[str, ...] = Field(default_factory=tuple)
+    timeout_seconds: int | None = Field(default=None, ge=1)
+    instructions: str = Field(min_length=10)
+    max_budget_usd: float | None = Field(default=None, ge=0)
+
+
+class WorkflowRole(BaseModel):
+    """Role definition: what capabilities a bound persona must have."""
+
+    model_config = ConfigDict(frozen=True, extra="forbid")
+
+    id: str = Field(min_length=1, pattern=r"^[a-z][a-z0-9_]*$")
+    required_capabilities: tuple[Capability, ...] = Field(min_length=1)
+    preferred_backends: tuple[Backend, ...] = Field(default_factory=tuple)
+    fallback_personas: tuple[str, ...] = Field(default_factory=tuple)
+
+
+class WorkflowTemplate(BaseModel):
+    """Complete workflow template loaded from docs/schemas/workflows/<name>@<version>.yaml."""
+
+    model_config = ConfigDict(frozen=True, extra="forbid")
+
+    name: str = Field(min_length=1)
+    version: int = Field(ge=1)
+    description: str | None = None
+    roles: tuple[WorkflowRole, ...] = Field(min_length=1)
+    phases: tuple[WorkflowPhase, ...] = Field(min_length=1)
+    default_gates: tuple[str, ...] = Field(default_factory=tuple)
+    max_total_budget_usd: float | None = Field(default=None, ge=0)
+
+    @model_validator(mode="after")
+    def _validate_phase_roles(self) -> WorkflowTemplate:
+        role_ids = {r.id for r in self.roles}
+        for ph in self.phases:
+            if ph.role not in role_ids:
+                raise ValueError(f"phase '{ph.key}' references unknown role '{ph.role}'")
+        return self
+
+    @model_validator(mode="after")
+    def _validate_unique_phase_keys(self) -> WorkflowTemplate:
+        counts = Counter(ph.key for ph in self.phases)
+        duplicates = sorted(k for k, c in counts.items() if c > 1)
+        if duplicates:
+            raise ValueError(f"duplicate phase keys: {duplicates}")
+        return self
+
+    @field_validator("roles")
+    @classmethod
+    def _validate_unique_role_ids(cls, v: tuple[WorkflowRole, ...]) -> tuple[WorkflowRole, ...]:
+        counts = Counter(r.id for r in v)
+        duplicates = sorted(k for k, c in counts.items() if c > 1)
+        if duplicates:
+            raise ValueError(f"duplicate role ids: {duplicates}")
+        return v
+
+    def compute_hash(self) -> str:
+        """Content-addressed identity hash of this template."""
+        return sha256(
+            {
+                "name": self.name,
+                "version": self.version,
+                "roles": [r.model_dump() for r in self.roles],
+                "phases": [ph.model_dump(by_alias=True) for ph in self.phases],
+                "default_gates": sorted(self.default_gates),
+                "max_total_budget_usd": self.max_total_budget_usd,
+            }
+        )
+
+
+def load_workflow_yaml(path: Path) -> WorkflowTemplate:
+    """Load and validate a single workflow yaml file."""
+    if not path.is_file():
+        raise FileNotFoundError(f"workflow yaml not found: {path}")
+    data = yaml.safe_load(path.read_text(encoding="utf-8"))
+    return WorkflowTemplate.model_validate(data)
+
+
+def load_workflows_from_dir(directory: Path) -> list[WorkflowTemplate]:
+    """Load all *.yaml workflow files from a directory, sorted by filename.
+
+    Raises ValueError if the same (name, version) pair appears more than once.
+    Returns an empty list if the directory does not exist.
+    """
+    if not directory.is_dir():
+        return []
+    workflows = [load_workflow_yaml(p) for p in sorted(directory.glob("*.yaml"))]
+    seen: set[tuple[str, int]] = set()
+    for w in workflows:
+        key = (w.name, w.version)
+        if key in seen:
+            raise ValueError(f"duplicate workflow name={w.name!r} version={w.version}")
+        seen.add(key)
+    return workflows
--- a/my-deepagent/tests/__init__.py
+++ b/my-deepagent/tests/__init__.py
--- a/my-deepagent/tests/fixtures/__init__.py
+++ b/my-deepagent/tests/fixtures/__init__.py
--- a/my-deepagent/tests/integration/__init__.py
+++ b/my-deepagent/tests/integration/__init__.py
--- a/my-deepagent/tests/integration/test_checkpointer.py
+++ b/my-deepagent/tests/integration/test_checkpointer.py
@@ -0,0 +1,78 @@
+"""Integration tests for src/my_deepagent/persistence/checkpointer.py."""
+
+from __future__ import annotations
+
+import sqlite3
+from pathlib import Path
+
+from my_deepagent.persistence.checkpointer import get_checkpointer_ctx
+
+
+class TestGetCheckpointerCtx:
+    """Tests for the get_checkpointer_ctx context manager."""
+
+    def test_ctx_yields_saver_and_cleans_up(self, tmp_path: Path) -> None:
+        """Entering the context yields a SqliteSaver; exiting releases the connection."""
+        db_path = tmp_path / "ck.db"
+        with get_checkpointer_ctx(db_path) as saver:
+            assert saver is not None
+            # The DB file must exist while inside the context.
+            assert db_path.exists()
+
+        # After context exit the file must still exist (not deleted).
+        assert db_path.exists()
+
+    def test_db_file_created_on_enter(self, tmp_path: Path) -> None:
+        """The sqlite file is created when the context is entered."""
+        db_path = tmp_path / "nested" / "dir" / "ck.db"
+        assert not db_path.exists()
+
+        with get_checkpointer_ctx(db_path):
+            assert db_path.exists()
+
+    def test_parent_dir_created_if_missing(self, tmp_path: Path) -> None:
+        """Parent directory is created automatically even if it does not exist."""
+        db_path = tmp_path / "a" / "b" / "c" / "ck.db"
+        assert not db_path.parent.exists()
+
+        with get_checkpointer_ctx(db_path):
+            assert db_path.parent.exists()
+
+    def test_connection_released_after_ctx_exit(self, tmp_path: Path) -> None:
+        """After exiting the context manager, another process/connection can open the DB."""
+        db_path = tmp_path / "ck.db"
+
+        with get_checkpointer_ctx(db_path):
+            pass  # enter and exit
+
+        # If the connection were leaked (not closed), WAL mode can still allow reads,
+        # but we verify by opening with a fresh sqlite3 connection — this must succeed.
+        with sqlite3.connect(str(db_path)) as conn:
+            cur = conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
+            # LangGraph creates its checkpoint tables; result must be a list (not error).
+            tables = [row[0] for row in cur.fetchall()]
+        assert isinstance(tables, list)
+
+    def test_meta_and_checkpoint_db_no_lock_conflict(self, tmp_path: Path) -> None:
+        """Using two separate DB files in the same directory causes no locking conflict."""
+        meta_db = tmp_path / "meta.db"
+        ck_db = tmp_path / "checkpoints.db"
+
+        # Simulate concurrent use: open both within the same scope.
+        with get_checkpointer_ctx(ck_db) as saver:
+            # Write something to the meta DB while the checkpointer holds its connection.
+            with sqlite3.connect(str(meta_db)) as conn:
+                conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
+                conn.execute("INSERT OR REPLACE INTO kv VALUES ('key', 'value')")
+                conn.commit()
+
+            assert saver is not None
+
+        # Both files must exist and be independently readable.
+        assert meta_db.exists()
+        assert ck_db.exists()
+
+        with sqlite3.connect(str(meta_db)) as conn:
+            row = conn.execute("SELECT v FROM kv WHERE k='key'").fetchone()
+        assert row is not None
+        assert row[0] == "value"
--- a/my-deepagent/tests/integration/test_openrouter_smoke.py
+++ b/my-deepagent/tests/integration/test_openrouter_smoke.py
@@ -0,0 +1,143 @@
+"""Real OpenRouter API smoke test. Costs ~$0.001-$0.003 per full run.
+
+Skipped automatically when no API key is configured.
+Uses deepseek/deepseek-chat (cheapest available) with max_tokens=50.
+"""
+
+from __future__ import annotations
+
+import os
+from pathlib import Path
+
+import pytest
+
+from my_deepagent.config import load_config
+from my_deepagent.persona import Persona
+from my_deepagent.session import resolve_model_instance
+
+_HAS_KEY = (
+    bool(os.environ.get("MYDEEPAGENT_OPENROUTER_API_KEY") or os.environ.get("OPENROUTER_API_KEY"))
+    or Path(".env").is_file()
+)
+
+pytestmark = [
+    pytest.mark.integration,
+    pytest.mark.skipif(not _HAS_KEY, reason="no OpenRouter API key configured"),
+]
+
+
+def _smoke_persona() -> Persona:
+    return Persona.model_validate(
+        {
+            "name": "smoke-test",
+            "version": 1,
+            "backend": "openrouter",
+            "model": "openrouter:deepseek/deepseek-chat",
+            "provider_origin": "China/DeepSeek",
+            "capabilities": ["evidence_check"],
+            "max_risk_level": "low",
+            "system_prompt": (
+                "You are a smoke-test echo bot. Reply only with the literal token 'OK'."
+            ),
+            "model_params": {"max_tokens": 50, "temperature": 0.0},
+            # deepagents 0.6.x: combining the local_shell backend with permissions
+            # raises NotImplementedError. The state backend has no such restriction.
+            "deepagents_backend": "state",
+        }
+    )
+
+
+def _smoke_persona_local_shell() -> Persona:
+    return Persona.model_validate(
+        {
+            "name": "smoke-test-local-shell",
+            "version": 1,
+            "backend": "openrouter",
+            "model": "openrouter:deepseek/deepseek-chat",
+            "provider_origin": "China/DeepSeek",
+            "capabilities": ["evidence_check"],
+            "max_risk_level": "low",
+            "system_prompt": (
+                "You are a smoke-test echo bot. Reply only with the literal token 'OK'."
+            ),
+            "model_params": {"max_tokens": 50, "temperature": 0.0},
+            # local_shell backend: SafetyShellMiddleware enforces path + destructive-command
+            # policy; permissions kwarg is skipped to avoid deepagents 0.6.1 NotImplementedError.
+            "deepagents_backend": "local_shell",
+        }
+    )
+
+
+def test_openrouter_chat_completion_returns_response() -> None:
+    """Invoke the ChatOpenAI instance once to verify OpenRouter base_url, auth, and response flow."""
+    config = load_config()
+    persona = _smoke_persona()
+    chat = resolve_model_instance(persona, config)
+    response = chat.invoke(
+        [
+            ("system", persona.system_prompt),
+            ("user", "Reply with the exact string 'OK' and nothing else."),
+        ]
+    )
+    assert response is not None
+    content = response.content
+    # langchain BaseMessage.content is str | list[content_block_dict]; either must be non-empty.
+    if isinstance(content, str):
+        assert len(content) > 0
+    else:
+        assert len(content) > 0
+
+
+def test_openrouter_usage_metadata_present() -> None:
+    """response.usage_metadata must populate input_tokens/output_tokens so cost can be measured."""
+    config = load_config()
+    persona = _smoke_persona()
+    chat = resolve_model_instance(persona, config)
+    response = chat.invoke(
+        [
+            ("system", persona.system_prompt),
+            ("user", "Reply with 'OK'."),
+        ]
+    )
+    usage = getattr(response, "usage_metadata", None)
+    assert usage is not None, "OpenRouter response must include usage_metadata"
+    assert usage.get("input_tokens", 0) > 0
+    assert usage.get("output_tokens", 0) > 0
+
+
+def test_openrouter_deepagents_create_smoke() -> None:
+    """deepagents create_deep_agent plus one real OpenRouter call. The most expensive check."""
+    config = load_config()
+    persona = _smoke_persona()
+    from my_deepagent.session import build_agent
+
+    agent = build_agent(persona, config, root_dir=Path.cwd())
+    result = agent.invoke({"messages": [{"role": "user", "content": "Reply with 'OK' only."}]})
+    messages = result.get("messages", [])
+    assert len(messages) > 0
+    last = messages[-1]
+    content = getattr(last, "content", "")
+    if isinstance(content, list):
+        content = " ".join(str(c) for c in content)
+    assert len(str(content)) > 0
+
+
+def test_openrouter_deepagents_local_shell_smoke(tmp_path: Path) -> None:
+    """Real OpenRouter call via deepagents + LocalShellBackend + SafetyShellMiddleware.
+
+    Verifies deepagents 0.6.1 workaround: local_shell backend with permissions kwarg
+    skipped, SafetyShellMiddleware automatically injected by build_agent.
+    """
+    config = load_config()
+    persona = _smoke_persona_local_shell()
+    from my_deepagent.session import build_agent
+
+    agent = build_agent(persona, config, root_dir=tmp_path)
+    result = agent.invoke({"messages": [{"role": "user", "content": "Reply 'OK' only."}]})
+    messages = result.get("messages", [])
+    assert len(messages) > 0
+    last = messages[-1]
+    content = getattr(last, "content", "")
+    if isinstance(content, list):
+        content = " ".join(str(c) for c in content)
+    assert len(str(content)) > 0
--- a/my-deepagent/tests/integration/test_persistence.py
+++ b/my-deepagent/tests/integration/test_persistence.py
@@ -0,0 +1,670 @@
+"""Integration tests for src/my_deepagent/persistence/ (DB engine + ORM models)."""
+
+from __future__ import annotations
+
+import subprocess
+import sys
+import uuid
+from pathlib import Path
+from typing import Any
+
+import pytest
+import pytest_asyncio
+from sqlalchemy import text
+from sqlalchemy.exc import IntegrityError
+
+from my_deepagent.persistence.db import Database
+from my_deepagent.persistence.models import (
+    AgentPersonaRow,
+    RunEventRow,
+    RunInputRow,
+    RunPhaseRow,
+    RunRow,
+    WorkflowTemplateRow,
+)
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+_NOW = "2026-05-15T00:00:00+00:00"
+
+
+def _make_id() -> str:
+    return str(uuid.uuid4())
+
+
+def _workflow_template_row(template_id: str) -> WorkflowTemplateRow:
+    """Return a WorkflowTemplateRow that satisfies the runs.template_id FK."""
+    return WorkflowTemplateRow(
+        id=template_id,
+        name="test-wf",
+        version=1,
+        hash=template_id,  # unique per invocation
+        definition={},
+        created_at=_NOW,
+    )
+
+
+def _run_row(run_id: str | None = None, template_id: str | None = None) -> RunRow:
+    rid = run_id or _make_id()
+    tid = template_id or _make_id()
+    return RunRow(
+        id=rid,
+        template_id=tid,
+        template_hash="a" * 64,
+        state="pending",
+        repo_path="/repo",
+        base_branch="main",
+        worktree_root="/wt",
+        created_at=_NOW,
+        updated_at=_NOW,
+    )
+
+
+# ---------------------------------------------------------------------------
+# Fixtures
+# ---------------------------------------------------------------------------
+
+
+@pytest.fixture()
+def db_url(tmp_path: Path) -> str:
+    return f"sqlite+aiosqlite:///{tmp_path}/test.db"
+
+
+@pytest_asyncio.fixture()
+async def db(db_url: str) -> Database:  # type: ignore[misc]
+    database = Database(db_url)
+    await database.init_schema()
+    yield database  # type: ignore[misc]
+    await database.dispose()
+
+
+# ---------------------------------------------------------------------------
+# A.1: All 18 tables exist after init_schema
+# ---------------------------------------------------------------------------
+
+EXPECTED_TABLES = {
+    "workflow_templates",
+    "agent_personas",
+    "runs",
+    "run_inputs",
+    "run_bindings",
+    "run_phases",
+    "run_events",
+    "approval_requests",
+    "approval_decisions",
+    "artifacts",
+    "interactive_sessions",
+    "tool_calls",
+    "llm_calls",
+    "model_pricing",
+    "budget_ledger",
+    "persona_consents",
+    "phase_feedback",
+    "run_commands",
+}
+
+
+@pytest.mark.asyncio
+async def test_init_schema_creates_all_tables(db: Database) -> None:
+    """All expected tables must exist in sqlite_master after init_schema."""
+    async with db.session() as session:
+        result = await session.execute(
+            text("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
+        )
+        table_names = {row[0] for row in result.fetchall()}
+    table_names.discard("alembic_version")
+    assert EXPECTED_TABLES <= table_names, f"Missing tables: {EXPECTED_TABLES - table_names}"
+
+
+# ---------------------------------------------------------------------------
+# A.2: WAL mode active
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_wal_mode_active(db: Database) -> None:
+    """journal_mode PRAGMA must return 'wal' after connection."""
+    async with db.session() as session:
+        result = await session.execute(text("PRAGMA journal_mode"))
+        mode = result.scalar()
+    assert mode == "wal", f"Expected 'wal', got {mode!r}"
+
+
+# ---------------------------------------------------------------------------
+# A.3: busy_timeout active
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_busy_timeout_active(db: Database) -> None:
+    """busy_timeout PRAGMA must return 5000."""
+    async with db.session() as session:
+        result = await session.execute(text("PRAGMA busy_timeout"))
+        timeout = result.scalar()
+    assert timeout == 5000, f"Expected 5000, got {timeout!r}"
+
+
+# ---------------------------------------------------------------------------
+# A.4: foreign_keys active
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_foreign_keys_active(db: Database) -> None:
+    """foreign_keys PRAGMA must return 1."""
+    async with db.session() as session:
+        result = await session.execute(text("PRAGMA foreign_keys"))
+        fk = result.scalar()
+    assert fk == 1, f"Expected 1, got {fk!r}"
+
+
+# ---------------------------------------------------------------------------
+# A.5: basic insert + select round-trip
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_run_row_insert_and_select(db: Database) -> None:
+    """RunRow insert then SELECT must return the same state."""
+    rid = _make_id()
+    tid = _make_id()
+    template = _workflow_template_row(tid)
+    run = _run_row(rid, template_id=tid)
+    async with db.session() as session:
+        session.add(template)
+        await session.flush()
+        session.add(run)
+    async with db.session() as session:
+        fetched = await session.get(RunRow, rid)
+    assert fetched is not None
+    assert fetched.id == rid
+    assert fetched.state == "pending"
+
+
+@pytest.mark.asyncio
+async def test_agent_persona_row_insert_and_select(db: Database) -> None:
+    """AgentPersonaRow insert then SELECT must return the same record."""
+    persona_id = _make_id()
+    persona = AgentPersonaRow(
+        id=persona_id,
+        name="test-persona",
+        version=1,
+        hash="b" * 64,
+        definition={"model": "test"},
+        created_at=_NOW,
+    )
+    async with db.session() as session:
+        session.add(persona)
+    async with db.session() as session:
+        fetched = await session.get(AgentPersonaRow, persona_id)
+    assert fetched is not None
+    assert fetched.name == "test-persona"
+    assert fetched.version == 1
+
+
+# ---------------------------------------------------------------------------
+# A.6: UNIQUE constraint — workflow_templates.hash duplicate
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_workflow_template_hash_unique_constraint(db: Database) -> None:
+    """Inserting two WorkflowTemplateRows with the same hash must raise IntegrityError."""
+
+    def make_template(tid: str) -> WorkflowTemplateRow:
+        return WorkflowTemplateRow(
+            id=tid,
+            name="my-wf",
+            version=1,
+            hash="c" * 64,  # same hash for both
+            definition={},
+            created_at=_NOW,
+        )
+
+    t1 = make_template(_make_id())
+    async with db.session() as session:
+        session.add(t1)
+
+    t2 = make_template(_make_id())
+    with pytest.raises(IntegrityError):
+        async with db.session() as session:
+            session.add(t2)
+
+
+# ---------------------------------------------------------------------------
+# A.7: FK CASCADE — RunRow delete cascades to RunInputRow
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_fk_cascade_run_delete_cascades_run_input(db: Database) -> None:
+    """Deleting a RunRow must cascade-delete its RunInputRow."""
+    rid = _make_id()
+    tid = _make_id()
+    template = _workflow_template_row(tid)
+    run = _run_row(rid, template_id=tid)
+    inp = RunInputRow(
+        id=_make_id(),
+        run_id=rid,
+        requirements_md="# Requirements",
+        objective={"goal": "test"},
+        extra={},
+        input_hash="d" * 64,
+    )
+    # Insert parent and child in the same transaction so FK is satisfied.
+    async with db.session() as session:
+        session.add(template)
+        await session.flush()  # persist template before run references it
+        session.add(run)
+        await session.flush()  # persist run before inp references it
+        session.add(inp)
+
+    async with db.session() as session:
+        fetched_run = await session.get(RunRow, rid)
+        assert fetched_run is not None
+        await session.delete(fetched_run)
+
+    async with db.session() as session:
+        result = await session.execute(
+            text("SELECT id FROM run_inputs WHERE run_id = :rid"),
+            {"rid": rid},
+        )
+        rows = result.fetchall()
+    assert rows == [], f"Expected cascade delete of run_inputs, got {rows}"
+
+
+# ---------------------------------------------------------------------------
+# A.8: JSON column round-trip
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_json_column_round_trip(db: Database) -> None:
+    """RunEventRow.payload nested dict must survive DB round-trip intact."""
+    rid = _make_id()
+    tid = _make_id()
+    template = _workflow_template_row(tid)
+    run = _run_row(rid, template_id=tid)
+    payload: dict[str, Any] = {
+        "nested": {"list": [1, 2, 3], "flag": True},
+        "msg": "hello",
+    }
+    event = RunEventRow(
+        run_id=rid,
+        seq=1,
+        type="phase_started",
+        payload=payload,
+        idempotency_key="idem-1",
+        ts=_NOW,
+    )
+    async with db.session() as session:
+        session.add(template)
+        await session.flush()  # persist template before run references it
+        session.add(run)
+        await session.flush()  # persist run before event references it
+        session.add(event)
+    async with db.session() as session:
+        result = await session.execute(
+            text("SELECT payload FROM run_events WHERE run_id = :rid"), {"rid": rid}
+        )
+        raw = result.scalar()
+    import json as _json
+
+    restored = _json.loads(raw) if isinstance(raw, str) else raw
+    assert restored == payload
+
+
+# ---------------------------------------------------------------------------
+# A.9: UUID string column round-trip
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_uuid_column_round_trip(db: Database) -> None:
+    """UUID primary key stored as string must compare equal after retrieval."""
+    expected_id = str(uuid.uuid4())
+    tid = _make_id()
+    template = _workflow_template_row(tid)
+    run = RunRow(
+        id=expected_id,
+        template_id=tid,
+        template_hash="e" * 64,
+        state="running",
+        repo_path="/r",
+        base_branch="main",
+        worktree_root="/w",
+        created_at=_NOW,
+        updated_at=_NOW,
+    )
+    async with db.session() as session:
+        session.add(template)
+        await session.flush()
+        session.add(run)
+    async with db.session() as session:
+        fetched = await session.get(RunRow, expected_id)
+    assert fetched is not None
+    assert fetched.id == expected_id
+
+
+# ---------------------------------------------------------------------------
+# A.10: UNIQUE(run_id, seq) on run_events
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_run_events_unique_run_seq(db: Database) -> None:
+    """Two RunEventRows with the same (run_id, seq) must raise IntegrityError."""
+    rid = _make_id()
+    tid = _make_id()
+    template = _workflow_template_row(tid)
+    run = _run_row(rid, template_id=tid)
+    async with db.session() as session:
+        session.add(template)
+        await session.flush()
+        session.add(run)
+        await session.flush()
+        session.add(
+            RunEventRow(
+                run_id=rid,
+                seq=1,
+                type="x",
+                payload={},
+                idempotency_key="key-a",
+                ts=_NOW,
+            )
+        )
+
+    with pytest.raises(IntegrityError):
+        async with db.session() as session:
+            session.add(
+                RunEventRow(
+                    run_id=rid,
+                    seq=1,  # same seq → collision on (run_id, seq)
+                    type="x",
+                    payload={},
+                    idempotency_key="key-b",
+                    ts=_NOW,
+                )
+            )
+
+
+# ---------------------------------------------------------------------------
+# A.11: UNIQUE(run_id, idempotency_key) on run_events
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_run_events_unique_idempotency_key(db: Database) -> None:
+    """Two RunEventRows with the same (run_id, idempotency_key) must raise IntegrityError."""
+    rid = _make_id()
+    tid = _make_id()
+    template = _workflow_template_row(tid)
+    run = _run_row(rid, template_id=tid)
+    async with db.session() as session:
+        session.add(template)
+        await session.flush()
+        session.add(run)
+        await session.flush()
+        session.add(
+            RunEventRow(
+                run_id=rid,
+                seq=1,
+                type="x",
+                payload={},
+                idempotency_key="shared-key",
+                ts=_NOW,
+            )
+        )
+
+    with pytest.raises(IntegrityError):
+        async with db.session() as session:
+            session.add(
+                RunEventRow(
+                    run_id=rid,
+                    seq=2,  # different seq
+                    type="x",
+                    payload={},
+                    idempotency_key="shared-key",  # same idem key → collision
+                    ts=_NOW,
+                )
+            )
+
+
+# ---------------------------------------------------------------------------
+# A.12: Index existence on run_events
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_run_events_index_exists(db: Database) -> None:
+    """The run_events_run_id_ts_idx index must exist in sqlite_master."""
+    async with db.session() as session:
+        result = await session.execute(
+            text(
+                "SELECT name FROM sqlite_master "
+                "WHERE type='index' AND name='run_events_run_id_ts_idx'"
+            )
+        )
+        names = [row[0] for row in result.fetchall()]
+    assert "run_events_run_id_ts_idx" in names
+
+
+# ---------------------------------------------------------------------------
+# A.13: dispose + new session works
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_dispose_and_reconnect(db_url: str) -> None:
+    """After dispose(), creating a new Database and querying must succeed."""
+    db1 = Database(db_url)
+    await db1.init_schema()
+    await db1.dispose()
+
+    db2 = Database(db_url)
+    async with db2.session() as session:
+        result = await session.execute(
+            text("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
+        )
+        tables = [row[0] for row in result.fetchall()]
+    await db2.dispose()
+    assert "runs" in tables
+
+
+# ---------------------------------------------------------------------------
+# A.14: Alembic upgrade head produces valid schema
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_alembic_upgrade_head_produces_valid_schema(tmp_path: Path) -> None:
+    """Running alembic upgrade head on a fresh DB must create the expected tables."""
+    db_path = tmp_path / "alembic_test.db"
+    db_url = f"sqlite:///{db_path}"  # sync URL for alembic env.py
+
+    project_root = Path(__file__).parent.parent.parent
+
+    result = subprocess.run(
+        [
+            sys.executable,
+            "-m",
+            "alembic",
+            "upgrade",
+            "head",
+        ],
+        cwd=str(project_root),
+        env={**__import__("os").environ, "DATABASE_URL": db_url},
+        capture_output=True,
+        text=True,
+    )
+    assert result.returncode == 0, (
+        f"alembic upgrade head failed:\nSTDOUT:\n{result.stdout}\nSTDERR:\n{result.stderr}"
+    )
+
+    import sqlite3
+
+    with sqlite3.connect(str(db_path)) as conn:
+        cur = conn.execute("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
+        tables = {row[0] for row in cur.fetchall()}
+
+    tables.discard("alembic_version")
+    assert EXPECTED_TABLES <= tables, f"Missing after alembic upgrade: {EXPECTED_TABLES - tables}"
+
+
+# ---------------------------------------------------------------------------
+# P0-1: partial unique index ux_active_run_repo_base
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_active_run_unique_index_blocks_duplicate(db: Database) -> None:
+    """Two active runs with the same (repo_path, base_branch) must raise IntegrityError."""
+    tid = _make_id()
+    template = _workflow_template_row(tid)
+    rid1 = _make_id()
+    run1 = _run_row(rid1, template_id=tid)
+    run1.state = "running"
+
+    rid2 = _make_id()
+    run2 = _run_row(rid2, template_id=tid)
+    run2.state = "pending"
+    # Same repo_path and base_branch — both active → must violate unique index.
+
+    async with db.session() as session:
+        session.add(template)
+        await session.flush()
+        session.add(run1)
+
+    with pytest.raises(IntegrityError):
+        async with db.session() as session:
+            session.add(run2)
+
+
+@pytest.mark.asyncio
+async def test_active_run_unique_index_allows_completed(db: Database) -> None:
+    """A completed run allows a new active run with the same (repo_path, base_branch)."""
+    tid = _make_id()
+    template = _workflow_template_row(tid)
+    rid1 = _make_id()
+    run1 = _run_row(rid1, template_id=tid)
+    run1.state = "completed"
+
+    rid2 = _make_id()
+    run2 = _run_row(rid2, template_id=tid)
+    run2.state = "running"
+    # Same repo/branch; run1 is completed (excluded) → run2 must succeed.
+
+    async with db.session() as session:
+        session.add(template)
+        await session.flush()
+        session.add(run1)
+
+    async with db.session() as session:
+        session.add(run2)
+
+    async with db.session() as session:
+        fetched = await session.get(RunRow, rid2)
+    assert fetched is not None
+    assert fetched.state == "running"
+
+
+# ---------------------------------------------------------------------------
+# P0-3: FK CASCADE — RunRow delete cascades to all audit children
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_fk_cascade_run_delete_cascades_phase_feedback(db: Database) -> None:
+    """Deleting a RunRow cascades to phase_feedback and run_phases rows."""
+    from my_deepagent.persistence.models import PhaseFeedbackRow
+
+    tid = _make_id()
+    rid = _make_id()
+    phase_id = _make_id()
+    template = _workflow_template_row(tid)
+    run = _run_row(rid, template_id=tid)
+    phase = RunPhaseRow(
+        id=phase_id,
+        run_id=rid,
+        phase_key="plan",
+        seq=1,
+        state="completed",
+        attempts=1,
+    )
+    feedback = PhaseFeedbackRow(
+        run_id=rid,
+        phase_id=phase_id,
+        reaction="thumbs_up",
+        created_at=_NOW,
+    )
+
+    async with db.session() as session:
+        session.add(template)
+        await session.flush()
+        session.add(run)
+        await session.flush()
+        session.add(phase)
+        await session.flush()
+        session.add(feedback)
+
+    async with db.session() as session:
+        fetched_run = await session.get(RunRow, rid)
+        assert fetched_run is not None
+        await session.delete(fetched_run)
+
+    async with db.session() as session:
+        fb_result = await session.execute(
+            text("SELECT id FROM phase_feedback WHERE run_id = :rid"), {"rid": rid}
+        )
+        ph_result = await session.execute(
+            text("SELECT id FROM run_phases WHERE run_id = :rid"), {"rid": rid}
+        )
+    assert fb_result.fetchall() == [], "phase_feedback must cascade-delete with run"
+    assert ph_result.fetchall() == [], "run_phases must cascade-delete with run"
+
+
+# ---------------------------------------------------------------------------
+# P0-3: FK RESTRICT — deleting WorkflowTemplateRow with runs is blocked
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_fk_restrict_template_delete_blocked_by_run(db: Database) -> None:
+    """Deleting a WorkflowTemplateRow that has a referencing RunRow must raise IntegrityError."""
+    tid = _make_id()
+    rid = _make_id()
+    template = _workflow_template_row(tid)
+    run = _run_row(rid, template_id=tid)
+
+    async with db.session() as session:
+        session.add(template)
+        await session.flush()
+        session.add(run)
+
+    with pytest.raises(IntegrityError):
+        async with db.session() as session:
+            fetched = await session.get(WorkflowTemplateRow, tid)
+            assert fetched is not None
+            await session.delete(fetched)
+
+
+# ---------------------------------------------------------------------------
+# P0-1: partial unique index exists in sqlite_master after init_schema
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_active_run_partial_index_exists_in_schema(db: Database) -> None:
+    """ux_active_run_repo_base partial unique index must exist after init_schema."""
+    async with db.session() as session:
+        result = await session.execute(
+            text(
+                "SELECT sql FROM sqlite_master "
+                "WHERE type='index' AND name='ux_active_run_repo_base'"
+            )
+        )
+        row = result.fetchone()
+    assert row is not None, "ux_active_run_repo_base index missing from sqlite_master"
+    assert "WHERE" in (row[0] or ""), f"Expected WHERE clause in index SQL, got: {row[0]}"
--- a/my-deepagent/tests/unit/__init__.py
+++ b/my-deepagent/tests/unit/__init__.py
--- a/my-deepagent/tests/unit/test_artifact_schema.py
+++ b/my-deepagent/tests/unit/test_artifact_schema.py
@@ -0,0 +1,391 @@
+"""Unit tests for src/my_deepagent/artifact_schema.py."""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import Any
+
+import pytest
+
+from my_deepagent.artifact_schema import (
+    ArtifactSchemaRegistry,
+    ValidationFinding,
+    ValidationResult,
+)
+from my_deepagent.errors import MyDeepAgentError
+
+# ---------------------------------------------------------------------------
+# Fixtures
+# ---------------------------------------------------------------------------
+
+REPO_ROOT = Path(__file__).parent.parent.parent
+SEED_ROOT = REPO_ROOT / "docs" / "schemas" / "artifacts"
+
+SEED_SCHEMA_IDS = [
+    "common/final-report@1",
+    "dev/phase-plan@1",
+    "dev/review-finding-batch@1",
+    "dev/spec@1",
+]
+
+
+@pytest.fixture
+def seed_registry() -> ArtifactSchemaRegistry:
+    return ArtifactSchemaRegistry(roots=[SEED_ROOT])
+
+
+@pytest.fixture
+def valid_spec() -> dict[str, Any]:
+    return {
+        "runId": "00000000-0000-4000-8000-000000000000",
+        "phaseKey": "spec",
+        "requirements": "User wants a CLI tool that analyzes log files.",
+        "acceptance_criteria": ["parses .log files", "outputs JSON summary"],
+        "approach": "Build a typer-based CLI using regex and json output.",
+        "risks": ["log format variations may break parser"],
+    }
+
+
+# ---------------------------------------------------------------------------
+# 1. Seed schema load success (4 schemas)
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.parametrize("schema_id", SEED_SCHEMA_IDS)
+def test_seed_schema_loads(seed_registry: ArtifactSchemaRegistry, schema_id: str) -> None:
+    schema = seed_registry.load(schema_id)
+    assert isinstance(schema, dict)
+    assert schema.get("$id") == schema_id
+
+
+# ---------------------------------------------------------------------------
+# 2. Load result caching — same dict object on second call
+# ---------------------------------------------------------------------------
+
+
+def test_load_caches_same_object(seed_registry: ArtifactSchemaRegistry) -> None:
+    first = seed_registry.load("dev/spec@1")
+    second = seed_registry.load("dev/spec@1")
+    assert first is second
+
+
+# ---------------------------------------------------------------------------
+# 3. Unknown schema_id → artifact_schema_unknown
+# ---------------------------------------------------------------------------
+
+
+def test_unknown_schema_id_raises(seed_registry: ArtifactSchemaRegistry) -> None:
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        seed_registry.load("dev/nonexistent@99")
+    assert exc_info.value.code == "artifact_schema_unknown"
+
+
+# ---------------------------------------------------------------------------
+# 4. Invalid schema_id format (no slash) → artifact_schema_unknown
+# ---------------------------------------------------------------------------
+
+
+def test_invalid_schema_id_no_slash(seed_registry: ArtifactSchemaRegistry) -> None:
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        seed_registry.load("foo")
+    assert exc_info.value.code == "artifact_schema_unknown"
+
+
+# ---------------------------------------------------------------------------
+# 5. schema_id starting with "/" → rejected (empty domain segment)
+# ---------------------------------------------------------------------------
+
+
+def test_invalid_schema_id_leading_slash(seed_registry: ArtifactSchemaRegistry) -> None:
+    # "/dev/spec@1" contains a slash, but the domain portion is empty:
+    # splitting on "/" gives domain="", which is not a valid domain/name pair.
+    # The registry treats it as a path traversal risk, since the leading slash
+    # makes the path absolute, and must report it as unknown rather than load it.
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        seed_registry.load("/dev/spec@1")
+    assert exc_info.value.code == "artifact_schema_unknown"
+
+
+# ---------------------------------------------------------------------------
+# 6. Empty schema_id → artifact_schema_unknown
+# ---------------------------------------------------------------------------
+
+
+def test_empty_schema_id_raises(seed_registry: ArtifactSchemaRegistry) -> None:
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        seed_registry.load("")
+    assert exc_info.value.code == "artifact_schema_unknown"
+
+
+# ---------------------------------------------------------------------------
+# 7. Fallback: schema absent in first root, present in second
+# ---------------------------------------------------------------------------
+
+
+def test_fallback_to_second_root(tmp_path: Path) -> None:
+    first_root = tmp_path / "first"
+    first_root.mkdir()
+    second_root = tmp_path / "second"
+    (second_root / "dev").mkdir(parents=True)
+    schema: dict[str, Any] = {
+        "$schema": "https://json-schema.org/draft/2020-12/schema",
+        "$id": "dev/thing@1",
+        "type": "object",
+    }
+    (second_root / "dev" / "thing@1.json").write_text(json.dumps(schema), encoding="utf-8")
+    registry = ArtifactSchemaRegistry(roots=[first_root, second_root])
+    loaded = registry.load("dev/thing@1")
+    assert loaded["$id"] == "dev/thing@1"
+
+
+# ---------------------------------------------------------------------------
+# 8. validate with valid data → ok=True
+# ---------------------------------------------------------------------------
+
+
+def test_validate_valid_spec(
+    seed_registry: ArtifactSchemaRegistry, valid_spec: dict[str, Any]
+) -> None:
+    result = seed_registry.validate("dev/spec@1", valid_spec)
+    assert result.ok is True
+    assert result.errors == ()
+
+
+# ---------------------------------------------------------------------------
+# 9. validate with invalid data → ok=False, findings non-empty
+# ---------------------------------------------------------------------------
+
+
+def test_validate_invalid_data_returns_findings(
+    seed_registry: ArtifactSchemaRegistry,
+) -> None:
+    result = seed_registry.validate("dev/spec@1", {"wrong": "data"})
+    assert result.ok is False
+    assert len(result.errors) > 0
+    for finding in result.errors:
+        assert isinstance(finding, ValidationFinding)
+
+
+# ---------------------------------------------------------------------------
+# 10. Missing required field → validator="required", message names the field
+# ---------------------------------------------------------------------------
+
+
+def test_validate_missing_required_field(
+    seed_registry: ArtifactSchemaRegistry, valid_spec: dict[str, Any]
+) -> None:
+    data = {k: v for k, v in valid_spec.items() if k != "requirements"}
+    result = seed_registry.validate("dev/spec@1", data)
+    assert result.ok is False
+    required_findings = [f for f in result.errors if f.validator == "required"]
+    assert any("requirements" in f.message for f in required_findings)
+
+
+# ---------------------------------------------------------------------------
+# 11. Invalid enum value → validator="enum", expected has enum list
+# ---------------------------------------------------------------------------
+
+
+def test_validate_invalid_enum_severity(seed_registry: ArtifactSchemaRegistry) -> None:
+    data = {
+        "runId": "00000000-0000-4000-8000-000000000000",
+        "phaseKey": "review",
+        "reviewerRole": "code-reviewer",
+        "findings": [
+            {
+                "severity": "bogus",
+                "category": "correctness",
+                "summary": "something is wrong here",
+            }
+        ],
+        "summary": "Overall review summary with enough length.",
+    }
+    result = seed_registry.validate("dev/review-finding-batch@1", data)
+    assert result.ok is False
+    enum_findings = [f for f in result.errors if f.validator == "enum"]
+    assert len(enum_findings) > 0
+    finding = enum_findings[0]
+    assert isinstance(finding.expected, list)
+    assert "bogus" not in finding.expected
+
+
+# ---------------------------------------------------------------------------
+# 12. Wrong type → validator="type", expected has type name
+# ---------------------------------------------------------------------------
+
+
+def test_validate_wrong_type(
+    seed_registry: ArtifactSchemaRegistry, valid_spec: dict[str, Any]
+) -> None:
+    data = dict(valid_spec)
+    data["acceptance_criteria"] = "should be a list, not a string"
+    result = seed_registry.validate("dev/spec@1", data)
+    assert result.ok is False
+    type_findings = [f for f in result.errors if f.validator == "type"]
+    assert len(type_findings) > 0
+    assert type_findings[0].expected == "array"
+
+
+# ---------------------------------------------------------------------------
+# 13. Nested error path — /findings/0/severity format
+# ---------------------------------------------------------------------------
+
+
+def test_validate_nested_error_path(seed_registry: ArtifactSchemaRegistry) -> None:
+    data = {
+        "runId": "00000000-0000-4000-8000-000000000000",
+        "phaseKey": "review",
+        "reviewerRole": "code-reviewer",
+        "findings": [
+            {
+                "severity": "not-valid",
+                "category": "correctness",
+                "summary": "a finding summary",
+            }
+        ],
+        "summary": "Overall review summary with enough length.",
+    }
+    result = seed_registry.validate("dev/review-finding-batch@1", data)
+    assert result.ok is False
+    paths = [f.path for f in result.errors]
+    assert any(p.startswith("/findings/0/") for p in paths)
+
+
+# ---------------------------------------------------------------------------
+# 14. known_schema_ids() returns all 4 seed schemas, sorted
+# ---------------------------------------------------------------------------
+
+
+def test_known_schema_ids_returns_seeds(seed_registry: ArtifactSchemaRegistry) -> None:
+    ids = seed_registry.known_schema_ids()
+    for expected in SEED_SCHEMA_IDS:
+        assert expected in ids
+    assert ids == sorted(ids)
+
+
+# ---------------------------------------------------------------------------
+# 15. Empty roots list → config_invalid
+# ---------------------------------------------------------------------------
+
+
+def test_empty_roots_raises() -> None:
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        ArtifactSchemaRegistry(roots=[])
+    assert exc_info.value.code == "config_invalid"
+
+
+# ---------------------------------------------------------------------------
+# 16. Corrupted JSON file → artifact_schema_load_failed
+# ---------------------------------------------------------------------------
+
+
+def test_corrupted_json_raises(tmp_path: Path) -> None:
+    (tmp_path / "dev").mkdir()
+    (tmp_path / "dev" / "broken@1.json").write_text("{", encoding="utf-8")
+    registry = ArtifactSchemaRegistry(roots=[tmp_path])
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        registry.load("dev/broken@1")
+    assert exc_info.value.code == "artifact_schema_load_failed"
+
+
+# ---------------------------------------------------------------------------
+# 17. Valid JSON but not a dict → artifact_schema_load_failed
+# ---------------------------------------------------------------------------
+
+
+def test_non_dict_json_raises(tmp_path: Path) -> None:
+    (tmp_path / "dev").mkdir()
+    (tmp_path / "dev" / "array@1.json").write_text("[1, 2, 3]", encoding="utf-8")
+    registry = ArtifactSchemaRegistry(roots=[tmp_path])
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        registry.load("dev/array@1")
+    assert exc_info.value.code == "artifact_schema_load_failed"
+
+
+# ---------------------------------------------------------------------------
+# 18. Schema itself is invalid Draft 2020-12 → artifact_schema_load_failed
+# ---------------------------------------------------------------------------
+
+
+def test_invalid_draft_schema_raises(tmp_path: Path) -> None:
+    (tmp_path / "dev").mkdir()
+    bad_schema = {"type": "not_a_type"}
+    (tmp_path / "dev" / "bad@1.json").write_text(json.dumps(bad_schema), encoding="utf-8")
+    registry = ArtifactSchemaRegistry(roots=[tmp_path])
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        registry.load("dev/bad@1")
+    assert exc_info.value.code == "artifact_schema_load_failed"
+
+
+# ---------------------------------------------------------------------------
+# 19. Validator caching: _validator called twice returns same instance
+# ---------------------------------------------------------------------------
+
+
+def test_validator_instance_cached(seed_registry: ArtifactSchemaRegistry) -> None:
+    # Access internal cache to verify the same validator instance is reused.
+    v1 = seed_registry._validator("dev/spec@1")
+    v2 = seed_registry._validator("dev/spec@1")
+    assert v1 is v2
+
+
+# ---------------------------------------------------------------------------
+# 20. dev/spec@1 valid example produces ok=True (full fixture check)
+# ---------------------------------------------------------------------------
+
+
+def test_spec_valid_example_ok(seed_registry: ArtifactSchemaRegistry) -> None:
+    valid_spec: dict[str, Any] = {
+        "runId": "00000000-0000-4000-8000-000000000000",
+        "phaseKey": "spec",
+        "requirements": "User wants a CLI tool that analyzes log files.",
+        "acceptance_criteria": ["parses .log files", "outputs JSON summary"],
+        "approach": "Build a typer-based CLI using regex and json output.",
+        "risks": ["log format variations may break parser"],
+    }
+    result = seed_registry.validate("dev/spec@1", valid_spec)
+    assert result.ok is True
+    assert result.errors == ()
+
+
+# ---------------------------------------------------------------------------
+# Bonus: ValidationResult and ValidationFinding are frozen dataclasses
+# ---------------------------------------------------------------------------
+
+
+def test_validation_result_frozen() -> None:
+    result = ValidationResult(ok=True)
+    with pytest.raises((AttributeError, TypeError)):
+        result.ok = False  # type: ignore[misc]
+
+
+def test_validation_finding_frozen() -> None:
+    finding = ValidationFinding(path="/foo", message="err", validator="type", expected="string")
+    with pytest.raises((AttributeError, TypeError)):
+        finding.path = "/bar"  # type: ignore[misc]
+
+
+# ---------------------------------------------------------------------------
+# Bonus: known_schema_ids with nonexistent root dir is silently skipped
+# ---------------------------------------------------------------------------
+
+
+def test_known_schema_ids_skips_nonexistent_root(tmp_path: Path) -> None:
+    missing = tmp_path / "does_not_exist"
+    registry = ArtifactSchemaRegistry(roots=[missing])
+    assert registry.known_schema_ids() == []
+
+
+# ---------------------------------------------------------------------------
+# Bonus: validate with non-dict top-level data
+# ---------------------------------------------------------------------------
+
+
+def test_validate_non_dict_data_returns_error(
+    seed_registry: ArtifactSchemaRegistry,
+) -> None:
+    result = seed_registry.validate("dev/spec@1", [1, 2, 3])
+    assert result.ok is False
+    type_findings = [f for f in result.errors if f.validator == "type"]
+    assert len(type_findings) > 0
--- a/my-deepagent/tests/unit/test_binding.py
+++ b/my-deepagent/tests/unit/test_binding.py
@@ -0,0 +1,644 @@
+"""Unit tests for src/my_deepagent/binding.py."""
+
+from __future__ import annotations
+
+import fcntl
+import json
+import re
+from pathlib import Path
+
+import pytest
+
+from my_deepagent.binding import (
+    BackendAvailability,
+    Binding,
+    BindingOverride,
+    PersonaConsentStore,
+    bind_personas,
+    filter_consented_personas,
+    is_persona_eligible_for_role,
+)
+from my_deepagent.enums import Backend, Capability
+from my_deepagent.errors import MyDeepAgentError
+from my_deepagent.persona import Persona, load_personas_from_dir
+from my_deepagent.workflow import WorkflowTemplate, load_workflows_from_dir
+
+# ---------------------------------------------------------------------------
+# PersonaConsentStore file-lock (fcntl.flock) verification
+# ---------------------------------------------------------------------------
+
+
+def test_consent_store_set_acquires_exclusive_lock(
+    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
+) -> None:
+    """set() must take an exclusive flock and release it."""
+    ops: list[int] = []
+    orig_flock = fcntl.flock
+
+    def spy(fd: int, op: int) -> None:
+        ops.append(op)
+        orig_flock(fd, op)
+
+    monkeypatch.setattr(fcntl, "flock", spy)
+    store = PersonaConsentStore(tmp_path / "consents.json")
+    store.set("hash_abc", "approve")
+    assert fcntl.LOCK_EX in ops
+    assert fcntl.LOCK_UN in ops
+
+
+def test_consent_store_revoke_acquires_exclusive_lock(
+    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
+) -> None:
+    ops: list[int] = []
+    orig_flock = fcntl.flock
+
+    def spy(fd: int, op: int) -> None:
+        ops.append(op)
+        orig_flock(fd, op)
+
+    monkeypatch.setattr(fcntl, "flock", spy)
+    store = PersonaConsentStore(tmp_path / "consents.json")
+    store.set("h", "approve")
+    ops.clear()
+    store.revoke("h")
+    assert fcntl.LOCK_EX in ops
+    assert fcntl.LOCK_UN in ops
+
+
+def test_consent_store_get_acquires_shared_lock(
+    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
+) -> None:
+    """get() takes a shared lock (LOCK_SH) so multiple readers don't serialise."""
+    ops: list[int] = []
+    orig_flock = fcntl.flock
+
+    def spy(fd: int, op: int) -> None:
+        ops.append(op)
+        orig_flock(fd, op)
+
+    monkeypatch.setattr(fcntl, "flock", spy)
+    store = PersonaConsentStore(tmp_path / "consents.json")
+    store.set("h", "approve")
+    ops.clear()
+    _ = store.get("h")
+    assert fcntl.LOCK_SH in ops
+    assert fcntl.LOCK_UN in ops
+
+
+def test_consent_store_lock_file_created(tmp_path: Path) -> None:
+    """A .lock sidecar file is created next to the consent store on first write."""
+    path = tmp_path / "consents.json"
+    store = PersonaConsentStore(path)
+    store.set("h", "approve")
+    assert (tmp_path / "consents.json.lock").is_file()
+
+
+# ---------------------------------------------------------------------------
+# Fixtures / helpers
+# ---------------------------------------------------------------------------
+
+PERSONAS_DIR = Path(__file__).parent.parent.parent / "docs" / "schemas" / "personas"
+WORKFLOWS_DIR = Path(__file__).parent.parent.parent / "docs" / "schemas" / "workflows"
+
+
+def _minimal_persona(**overrides: object) -> Persona:
+    base: dict[str, object] = {
+        "name": "test-persona",
+        "version": 1,
+        "backend": "openrouter",
+        "model": "openrouter:anthropic/claude-sonnet-4-6",
+        "provider_origin": "US/Anthropic",
+        "capabilities": ["spec_write", "phase_planning"],
+        "max_risk_level": "low",
+        "system_prompt": "You are a test persona for unit tests.",
+    }
+    base.update(overrides)
+    return Persona.model_validate(base)
+
+
+def _all_available() -> BackendAvailability:
+    return BackendAvailability(available_backends=frozenset(Backend))
+
+
+def _none_available() -> BackendAvailability:
+    return BackendAvailability(available_backends=frozenset())
+
+
+@pytest.fixture()
+def consent_store(tmp_path: Path) -> PersonaConsentStore:
+    return PersonaConsentStore(tmp_path / "consents.json")
+
+
+@pytest.fixture()
+def seed_personas() -> list[Persona]:
+    return load_personas_from_dir(PERSONAS_DIR)
+
+
+@pytest.fixture()
+def spec_and_review() -> WorkflowTemplate:
+    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
+    return next(w for w in workflows if w.name == "spec-and-review")
+
+
+# ---------------------------------------------------------------------------
+# is_persona_eligible_for_role
+# ---------------------------------------------------------------------------
+
+
+def test_eligible_all_ok(spec_and_review: WorkflowTemplate) -> None:
+    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
+    p = _minimal_persona(capabilities=["spec_write", "phase_planning"], max_risk_level="low")
+    ok, reason = is_persona_eligible_for_role(p, spec_writer_role, spec_and_review)
+    assert ok is True
+    assert reason is None
+
+
+def test_eligible_missing_capability(spec_and_review: WorkflowTemplate) -> None:
+    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
+    # only spec_write, missing phase_planning
+    p = _minimal_persona(capabilities=["spec_write"], max_risk_level="low")
+    ok, reason = is_persona_eligible_for_role(p, spec_writer_role, spec_and_review)
+    assert ok is False
+    assert reason is not None
+    assert "phase_planning" in reason
+
+
+def test_eligible_allowed_roles_mismatch(spec_and_review: WorkflowTemplate) -> None:
+    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
+    p = _minimal_persona(
+        capabilities=["spec_write", "phase_planning"],
+        max_risk_level="low",
+        allowed_roles=["reviewer"],  # does not include spec_writer
+    )
+    ok, reason = is_persona_eligible_for_role(p, spec_writer_role, spec_and_review)
+    assert ok is False
+    assert reason is not None
+    assert "allowed_roles" in reason
+
+
+def test_eligible_allowed_roles_matches(spec_and_review: WorkflowTemplate) -> None:
+    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
+    p = _minimal_persona(
+        capabilities=["spec_write", "phase_planning"],
+        max_risk_level="low",
+        allowed_roles=["spec_writer"],
+    )
+    ok, reason = is_persona_eligible_for_role(p, spec_writer_role, spec_and_review)
+    assert ok is True
+    assert reason is None
+
+
+def test_eligible_risk_too_high() -> None:
+    """bug-fix workflow has a 'medium' risk phase; a low-only persona is ineligible for it."""
+    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
+    bug_fix_wf = next(w for w in workflows if w.name == "bug-fix-with-reproduction")
+    fixer_role = next(r for r in bug_fix_wf.roles if r.id == "fixer")
+    # fixer role has a 'medium' risk phase
+    p = _minimal_persona(
+        capabilities=["code_edit", "test_first_development"],
+        max_risk_level="low",  # too low for medium phase
+    )
+    ok, reason = is_persona_eligible_for_role(p, fixer_role, bug_fix_wf)
+    assert ok is False
+    assert reason is not None
+    assert "medium" in reason
+
+
+def test_eligible_risk_exact_match(spec_and_review: WorkflowTemplate) -> None:
+    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
+    p = _minimal_persona(capabilities=["spec_write", "phase_planning"], max_risk_level="low")
+    ok, _ = is_persona_eligible_for_role(p, spec_writer_role, spec_and_review)
+    assert ok is True
+
+
+# ---------------------------------------------------------------------------
+# bind_personas: end-to-end with seed data
+# ---------------------------------------------------------------------------
+
+
+def test_bind_personas_spec_and_review_success(
+    seed_personas: list[Persona],
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    bindings = bind_personas(spec_and_review, seed_personas, _all_available(), consent_store)
+    assert set(bindings.keys()) == {"spec_writer", "reviewer", "verifier"}
+    for role_id, binding in bindings.items():
+        assert isinstance(binding, Binding)
+        assert binding.role_id == role_id
+        assert re.fullmatch(r"[0-9a-f]{64}", binding.binding_hash)
+
+
+def test_bind_personas_binding_hash_deterministic(
+    seed_personas: list[Persona],
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    b1 = bind_personas(spec_and_review, seed_personas, _all_available(), consent_store)
+    b2 = bind_personas(spec_and_review, seed_personas, _all_available(), consent_store)
+    for role_id in b1:
+        assert b1[role_id].binding_hash == b2[role_id].binding_hash
+
+
+def test_bind_personas_spec_writer_is_spec_writer(
+    seed_personas: list[Persona],
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    bindings = bind_personas(spec_and_review, seed_personas, _all_available(), consent_store)
+    spec_persona = bindings["spec_writer"].persona
+    assert Capability.SPEC_WRITE in spec_persona.capabilities
+    assert Capability.PHASE_PLANNING in spec_persona.capabilities
+
+
+# ---------------------------------------------------------------------------
+# bind_personas: override
+# ---------------------------------------------------------------------------
+
+
+def test_bind_personas_override_picks_pinned(
+    seed_personas: list[Persona],
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    override = BindingOverride.parse({"spec_writer": "openrouter-claude-spec-writer@1"})
+    bindings = bind_personas(
+        spec_and_review, seed_personas, _all_available(), consent_store, override
+    )
+    assert bindings["spec_writer"].persona.name == "openrouter-claude-spec-writer"
+
+
+def test_bind_personas_override_invalid_persona_raises(
+    seed_personas: list[Persona],
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    override = BindingOverride.parse({"spec_writer": "nonexistent-persona@1"})
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        bind_personas(spec_and_review, seed_personas, _all_available(), consent_store, override)
+    assert exc_info.value.code == "no_eligible_persona"
+
+
+# ---------------------------------------------------------------------------
+# bind_personas: backend unavailable
+# ---------------------------------------------------------------------------
+
+
+def test_bind_personas_backend_unavailable_raises(
+    seed_personas: list[Persona],
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        bind_personas(spec_and_review, seed_personas, _none_available(), consent_store)
+    assert exc_info.value.code == "backend_unavailable"
+
+
+# ---------------------------------------------------------------------------
+# bind_personas: FAKE backend with a non-empty model binds (happy path)
+# ---------------------------------------------------------------------------
+
+
+def test_bind_personas_fake_backend_binds(
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    """Verify that a FAKE-backend persona with a non-empty model binds successfully.
+
+    The model_unavailable path is not exercised here: an openrouter persona with an
+    empty model cannot be built via model_validate because the validator rejects it.
+    This test covers the happy path only, with the FAKE backend marked available.
+    """
+    from my_deepagent.workflow import WorkflowPhase, WorkflowRole
+
+    role = WorkflowRole.model_validate(
+        {
+            "id": "spec_writer",
+            "required_capabilities": ["spec_write", "phase_planning"],
+            "preferred_backends": ["fake"],
+        }
+    )
+    phase = WorkflowPhase.model_validate(
+        {
+            "key": "spec",
+            "title": "Write spec",
+            "risk": "low",
+            "role": "spec_writer",
+            "instructions": "Write the specification document.",
+        }
+    )
+    tmpl = WorkflowTemplate.model_validate(
+        {
+            "name": "fake-wf",
+            "version": 1,
+            "roles": [role.model_dump()],
+            "phases": [phase.model_dump()],
+        }
+    )
+    fake_persona = _minimal_persona(
+        backend="fake",
+        model="fake-model",
+        capabilities=["spec_write", "phase_planning"],
+    )
+    fake_avail = BackendAvailability(available_backends=frozenset({Backend.FAKE}))
+    # Should succeed with FAKE backend + non-empty model
+    bindings = bind_personas(tmpl, [fake_persona], fake_avail, consent_store)
+    assert "spec_writer" in bindings
+
+
+# ---------------------------------------------------------------------------
+# bind_personas: no eligible persona
+# ---------------------------------------------------------------------------
+
+
+def test_bind_personas_no_eligible_raises(
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    # Provide a persona with wrong capabilities
+    bad_persona = _minimal_persona(capabilities=["backtest_run"])
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        bind_personas(spec_and_review, [bad_persona], _all_available(), consent_store)
+    assert exc_info.value.code == "no_eligible_persona"
+
+
+# ---------------------------------------------------------------------------
+# PersonaConsentStore: get / set / revoke
+# ---------------------------------------------------------------------------
+
+
+def test_consent_store_get_none_when_absent(consent_store: PersonaConsentStore) -> None:
+    assert consent_store.get("abc123") is None
+
+
+def test_consent_store_set_and_get(consent_store: PersonaConsentStore) -> None:
+    consent_store.set("abc123", "approve")
+    assert consent_store.get("abc123") == "approve"
+
+
+def test_consent_store_block(consent_store: PersonaConsentStore) -> None:
+    consent_store.set("abc123", "block")
+    assert consent_store.get("abc123") == "block"
+
+
+def test_consent_store_once(consent_store: PersonaConsentStore) -> None:
+    consent_store.set("abc123", "once")
+    assert consent_store.get("abc123") == "once"
+
+
+def test_consent_store_revoke(consent_store: PersonaConsentStore) -> None:
+    consent_store.set("abc123", "approve")
+    consent_store.revoke("abc123")
+    assert consent_store.get("abc123") is None
+
+
+def test_consent_store_revoke_absent_is_noop(consent_store: PersonaConsentStore) -> None:
+    consent_store.revoke("not_present")  # must not raise
+
+
+def test_consent_store_overwrite(consent_store: PersonaConsentStore) -> None:
+    consent_store.set("abc123", "approve")
+    consent_store.set("abc123", "block")
+    assert consent_store.get("abc123") == "block"
+
+
+def test_consent_store_unknown_decision_returns_none(
+    consent_store: PersonaConsentStore,
+    tmp_path: Path,
+) -> None:
+    """A corrupt decision value (not approve/block/once) returns None instead of raising."""
+    path = tmp_path / "consents.json"
+    path.write_text(
+        json.dumps({"abc123": {"decision": "foobar", "decided_at": "2026-01-01T00:00:00+00:00"}}),
+        encoding="utf-8",
+    )
+    store = PersonaConsentStore(path)
+    assert store.get("abc123") is None
+
+
+def test_consent_store_corrupted_json_raises_fatal(tmp_path: Path) -> None:
+    path = tmp_path / "consents.json"
+    path.write_text("{invalid json", encoding="utf-8")
+    store = PersonaConsentStore(path)
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        store.get("abc123")
+    assert exc_info.value.code == "internal_state_corruption"
+
+
+def test_consent_store_atomic_write(consent_store: PersonaConsentStore) -> None:
+    """The .tmp file must not remain after a successful write."""
+    consent_store.set("abc", "approve")
+    tmp_file = consent_store._path.with_suffix(".json.tmp")
+    assert not tmp_file.exists(), ".tmp leftover after successful write"
+
+
+def test_consent_store_json_format(consent_store: PersonaConsentStore) -> None:
+    """Stored JSON must be valid and contain decision + decided_at."""
+    consent_store.set("myhash", "once")
+    raw = consent_store._path.read_text(encoding="utf-8")
+    data = json.loads(raw)
+    assert "myhash" in data
+    assert data["myhash"]["decision"] == "once"
+    assert "decided_at" in data["myhash"]
+
+
+# ---------------------------------------------------------------------------
+# filter_consented_personas
+# ---------------------------------------------------------------------------
+
+
+def test_filter_removes_blocked(consent_store: PersonaConsentStore) -> None:
+    p1 = _minimal_persona(name="p1")
+    p2 = _minimal_persona(name="p2")
+    consent_store.set(p2.compute_hash(), "block")
+    result = filter_consented_personas([p1, p2], consent_store)
+    assert len(result) == 1
+    assert result[0].name == "p1"
+
+
+def test_filter_keeps_approved(consent_store: PersonaConsentStore) -> None:
+    p = _minimal_persona()
+    consent_store.set(p.compute_hash(), "approve")
+    result = filter_consented_personas([p], consent_store)
+    assert len(result) == 1
+
+
+def test_filter_keeps_once(consent_store: PersonaConsentStore) -> None:
+    p = _minimal_persona()
+    consent_store.set(p.compute_hash(), "once")
+    result = filter_consented_personas([p], consent_store)
+    assert len(result) == 1
+
+
+def test_filter_keeps_none_decision(consent_store: PersonaConsentStore) -> None:
+    """Persona with no stored decision passes through."""
+    p = _minimal_persona()
+    result = filter_consented_personas([p], consent_store)
+    assert len(result) == 1
+
+
+def test_filter_empty_list(consent_store: PersonaConsentStore) -> None:
+    result = filter_consented_personas([], consent_store)
+    assert result == []
+
+
+# ---------------------------------------------------------------------------
+# bind_personas: consent-blocked persona detection
+# ---------------------------------------------------------------------------
+
+
+def test_bind_personas_all_eligible_blocked_raises(
+    seed_personas: list[Persona],
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    # Block all spec_writer-eligible personas
+    for p in seed_personas:
+        if Capability.SPEC_WRITE in p.capabilities and Capability.PHASE_PLANNING in p.capabilities:
+            consent_store.set(p.compute_hash(), "block")
+
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        bind_personas(spec_and_review, seed_personas, _all_available(), consent_store)
+    assert exc_info.value.code in ("persona_blocked_by_user", "no_eligible_persona")
+
+
+def test_bind_personas_override_blocked_raises(
+    seed_personas: list[Persona],
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    spec_writer = next(p for p in seed_personas if p.name == "openrouter-claude-spec-writer")
+    consent_store.set(spec_writer.compute_hash(), "block")
+    override = BindingOverride.parse({"spec_writer": "openrouter-claude-spec-writer@1"})
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        bind_personas(spec_and_review, seed_personas, _all_available(), consent_store, override)
+    assert exc_info.value.code == "persona_blocked_by_user"
+
+
+# ---------------------------------------------------------------------------
+# _auto_select: preferred_backends order
+# ---------------------------------------------------------------------------
+
+
+def test_auto_select_prefers_preferred_backend(spec_and_review: WorkflowTemplate) -> None:
+    """A persona on a preferred backend wins even when its name sorts later."""
+    from my_deepagent.binding import _auto_select
+
+    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
+    # preferred_backends = ["openrouter"]
+    p_openrouter = _minimal_persona(
+        name="z-openrouter-persona",
+        backend="openrouter",
+        capabilities=["spec_write", "phase_planning"],
+    )
+    p_fake = _minimal_persona(
+        name="a-fake-persona",
+        backend="fake",
+        capabilities=["spec_write", "phase_planning"],
+    )
+    chosen = _auto_select([p_openrouter, p_fake], spec_writer_role)
+    assert chosen.name == "z-openrouter-persona"
+
+
+def test_auto_select_higher_version_wins(spec_and_review: WorkflowTemplate) -> None:
+    from my_deepagent.binding import _auto_select
+
+    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
+    p_v1 = _minimal_persona(version=1, capabilities=["spec_write", "phase_planning"])
+    p_v2 = _minimal_persona(version=2, capabilities=["spec_write", "phase_planning"])
+    chosen = _auto_select([p_v1, p_v2], spec_writer_role)
+    assert chosen.version == 2
+
+
+def test_auto_select_name_asc_tiebreak(spec_and_review: WorkflowTemplate) -> None:
+    from my_deepagent.binding import _auto_select
+
+    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
+    caps = ["spec_write", "phase_planning"]
+    p_b = _minimal_persona(name="b-persona", version=1, capabilities=caps)
+    p_a = _minimal_persona(name="a-persona", version=1, capabilities=caps)
+    chosen = _auto_select([p_b, p_a], spec_writer_role)
+    assert chosen.name == "a-persona"
+
+
+# ---------------------------------------------------------------------------
+# Step 2 patch: FAKE backend recovery hint
+# ---------------------------------------------------------------------------
+
+
+def test_backend_recovery_hint_fake() -> None:
+    """FAKE backend recovery hint must mention 'fake' and 'tests only' or 'test harness'."""
+    from my_deepagent.binding import _backend_recovery_hint
+
+    hint = _backend_recovery_hint(Backend.FAKE)
+    assert "fake" in hint.lower()
+    assert "tests only" in hint.lower() or "test harness" in hint.lower()
+
+
+# ---------------------------------------------------------------------------
+# Step 2 patch: override with non-integer version raises with diagnostic
+# ---------------------------------------------------------------------------
+
+
+def test_bind_personas_override_non_integer_version_raises(
+    seed_personas: list[Persona],
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    """An override spec with a non-integer version must raise with a clear diagnostic."""
+    override = BindingOverride(persona_pinned={"spec_writer": "openrouter-claude-spec-writer@abc"})
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        bind_personas(spec_and_review, seed_personas, _all_available(), consent_store, override)
+    assert exc_info.value.code == "no_eligible_persona"
+    assert "non-integer version" in str(exc_info.value)
+
+
+# ---------------------------------------------------------------------------
+# Step 2 patch: override with ineligible persona surfaces reason
+# ---------------------------------------------------------------------------
+
+
+def test_bind_personas_override_ineligible_persona_surfaces_reason(
+    seed_personas: list[Persona],
+    spec_and_review: WorkflowTemplate,
+    consent_store: PersonaConsentStore,
+) -> None:
+    """Override that names an ineligible persona must surface the ineligibility reason."""
+    # 'spec_writer' role needs spec_write + phase_planning.
+    # Find a persona in seed that does NOT have those caps so we can force it.
+    ineligible = next(
+        p for p in seed_personas if Capability.SPEC_WRITE not in p.capabilities
+    )
+    override = BindingOverride(
+        persona_pinned={"spec_writer": f"{ineligible.name}@{ineligible.version}"}
+    )
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        bind_personas(spec_and_review, seed_personas, _all_available(), consent_store, override)
+    assert exc_info.value.code == "no_eligible_persona"
+    err_str = str(exc_info.value)
+    # The error message must say the persona is ineligible with a reason.
+    assert "ineligible" in err_str or "missing" in err_str
+
+
+# ---------------------------------------------------------------------------
+# Step 2 patch: PersonaConsentStore atomic write calls os.fsync
+# ---------------------------------------------------------------------------
+
+
+def test_consent_store_write_calls_fsync(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+    """PersonaConsentStore.set() must call os.fsync() for atomic durability."""
+    import os
+
+    called: list[int] = []
+    orig_fsync = os.fsync
+
+    def spy(fd: int) -> None:
+        called.append(fd)
+        orig_fsync(fd)
+
+    monkeypatch.setattr(os, "fsync", spy)
+
+    store = PersonaConsentStore(tmp_path / "consents.json")
+    store.set("hash_abc", "approve")
+
+    assert len(called) >= 1, "os.fsync must be called at least once during write"
--- a/my-deepagent/tests/unit/test_config.py
+++ b/my-deepagent/tests/unit/test_config.py
@@ -0,0 +1,238 @@
+"""Unit tests for src/my_deepagent/config.py."""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+from pydantic import ValidationError
+
+from my_deepagent.config import Config, load_config
+
+# ---------------------------------------------------------------------------
+# Default values (no env, no file)
+# ---------------------------------------------------------------------------
+
+
+def test_default_log_level(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    cfg = Config()
+    assert cfg.log_level == "info"
+
+
+def test_default_lang(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    cfg = Config()
+    assert cfg.lang == "ko"
+
+
+def test_default_budget_daily_usd(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    cfg = Config()
+    assert cfg.budget_daily_usd == pytest.approx(5.0)
+
+
+def test_default_budget_run_usd(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    cfg = Config()
+    assert cfg.budget_run_usd == pytest.approx(1.0)
+
+
+def test_default_budget_on_hit(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    cfg = Config()
+    assert cfg.budget_on_hit == "prompt"
+
+
+def test_default_persona(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    cfg = Config()
+    assert cfg.default_persona == "default-interactive"
+
+
+def test_default_openrouter_api_key_is_none(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    # _env_file=None bypasses any .env that may exist in the cwd (e.g. dev keys).
+    cfg = Config(_env_file=None)  # type: ignore[call-arg]
+    assert cfg.openrouter_api_key is None
+
+
+# ---------------------------------------------------------------------------
+# Env var overrides
+# ---------------------------------------------------------------------------
+
+
+def test_env_budget_daily_usd(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    monkeypatch.setenv("MYDEEPAGENT_BUDGET_DAILY_USD", "10")
+    cfg = Config()
+    assert cfg.budget_daily_usd == pytest.approx(10.0)
+
+
+def test_env_lang_en(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    monkeypatch.setenv("MYDEEPAGENT_LANG", "en")
+    cfg = Config()
+    assert cfg.lang == "en"
+
+
+def test_env_log_level_debug(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    monkeypatch.setenv("MYDEEPAGENT_LOG_LEVEL", "debug")
+    cfg = Config()
+    assert cfg.log_level == "debug"
+
+
+def test_env_openrouter_api_key(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    monkeypatch.setenv("MYDEEPAGENT_OPENROUTER_API_KEY", "sk-test-abc")
+    cfg = Config()
+    assert cfg.openrouter_api_key == "sk-test-abc"
+
+
+def test_env_langsmith_tracing(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    monkeypatch.setenv("MYDEEPAGENT_LANGSMITH_TRACING", "true")
+    cfg = Config()
+    assert cfg.langsmith_tracing is True
+
+
+# ---------------------------------------------------------------------------
+# Validation errors for invalid values
+# ---------------------------------------------------------------------------
+
+
+def test_invalid_lang_raises(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    monkeypatch.setenv("MYDEEPAGENT_LANG", "fr")
+    with pytest.raises(ValidationError):
+        Config()
+
+
+def test_invalid_log_level_raises(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    monkeypatch.setenv("MYDEEPAGENT_LOG_LEVEL", "verbose")
+    with pytest.raises(ValidationError):
+        Config()
+
+
+def test_invalid_budget_on_hit_raises(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    monkeypatch.setenv("MYDEEPAGENT_BUDGET_ON_HIT", "explode")
+    with pytest.raises(ValidationError):
+        Config()
+
+
+def test_negative_budget_raises(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    with pytest.raises(ValidationError):
+        Config(budget_daily_usd=-1.0)
+
+
+# ---------------------------------------------------------------------------
+# Frozen check
+# ---------------------------------------------------------------------------
+
+
+def test_frozen_prevents_mutation(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    cfg = Config()
+    with pytest.raises((ValidationError, TypeError)):
+        cfg.budget_daily_usd = 99  # type: ignore[misc]
+
+
+# ---------------------------------------------------------------------------
+# Path expansion (~ → absolute path)
+# ---------------------------------------------------------------------------
+
+
+def test_tilde_expansion_workspace_root(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    monkeypatch.setenv("MYDEEPAGENT_WORKSPACE_ROOT", "~/foo/bar")
+    cfg = Config()
+    assert cfg.workspace_root.is_absolute()
+    assert "~" not in str(cfg.workspace_root)
+
+
+def test_tilde_expansion_data_dir(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    monkeypatch.setenv("MYDEEPAGENT_DATA_DIR", "~/mydata")
+    cfg = Config()
+    assert cfg.data_dir.is_absolute()
+
+
+# ---------------------------------------------------------------------------
+# TOML priority
+# ---------------------------------------------------------------------------
+
+
+def test_toml_overrides_default(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None:
+    _clear_env(monkeypatch)
+    toml_file = tmp_path / "config.toml"
+    toml_file.write_text('lang = "en"\nbudget_daily_usd = 7.5\n')
+
+    # Inject the TOML path through a temporary subclass: the subclass gets its
+    # own copy of model_config, and its "toml_file" entry is pointed at the file
+    # written above. The real Config class is left untouched, so other tests
+    # are unaffected.
+    class PatchedConfig(Config):
+        model_config = Config.model_config.copy()
+
+    PatchedConfig.model_config["toml_file"] = str(toml_file)
+
+    cfg = PatchedConfig()
+    assert cfg.lang == "en"
+    assert cfg.budget_daily_usd == pytest.approx(7.5)
+
+
+# ---------------------------------------------------------------------------
+# load_config helper
+# ---------------------------------------------------------------------------
+
+
+def test_load_config_with_overrides(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    cfg = load_config(budget_daily_usd=20.0, lang="en")
+    assert cfg.budget_daily_usd == pytest.approx(20.0)
+    assert cfg.lang == "en"
+
+
+def test_load_config_default(monkeypatch: pytest.MonkeyPatch) -> None:
+    _clear_env(monkeypatch)
+    cfg = load_config()
+    assert cfg.log_level == "info"
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+_ENV_KEYS = [
+    "MYDEEPAGENT_BUDGET_DAILY_USD",
+    "MYDEEPAGENT_BUDGET_DAILY_WARN_USD",
+    "MYDEEPAGENT_BUDGET_RUN_USD",
+    "MYDEEPAGENT_BUDGET_RUN_WARN_USD",
+    "MYDEEPAGENT_BUDGET_ON_HIT",
+    "MYDEEPAGENT_LANG",
+    "MYDEEPAGENT_LOG_LEVEL",
+    "MYDEEPAGENT_OPENROUTER_API_KEY",
+    "MYDEEPAGENT_OPENROUTER_BASE_URL",
+    "MYDEEPAGENT_LANGSMITH_TRACING",
+    "MYDEEPAGENT_LANGSMITH_API_KEY",
+    "MYDEEPAGENT_LANGSMITH_PROJECT",
+    "MYDEEPAGENT_DATABASE_URL",
+    "MYDEEPAGENT_WORKSPACE_ROOT",
+    "MYDEEPAGENT_DATA_DIR",
+    "MYDEEPAGENT_CONFIG_DIR",
+    "MYDEEPAGENT_STATE_DIR",
+    "MYDEEPAGENT_DEFAULT_PERSONA",
+]
+
+
+def _clear_env(monkeypatch: pytest.MonkeyPatch) -> None:
+    """Remove every MYDEEPAGENT_ env var listed in _ENV_KEYS to isolate tests."""
+    for key in _ENV_KEYS:
+        monkeypatch.delenv(key, raising=False)
+    # Also prevent dotenv file from being loaded
+    monkeypatch.setenv("MYDEEPAGENT_ENV_FILE", "")
--- a/my-deepagent/tests/unit/test_enums.py
+++ b/my-deepagent/tests/unit/test_enums.py
@@ -0,0 +1,235 @@
+"""Unit tests for src/my_deepagent/enums.py."""
+
+import pytest
+
+from my_deepagent.enums import (
+    ApprovalDecisionAction,
+    ApprovalState,
+    Backend,
+    Capability,
+    ErrorClass,
+    RiskLevel,
+    RunPhaseState,
+    RunState,
+    SessionState,
+)
+
+# ---------------------------------------------------------------------------
+# Backend
+# ---------------------------------------------------------------------------
+
+
+def test_backend_openrouter_value() -> None:
+    assert Backend.OPENROUTER == "openrouter"
+
+
+def test_backend_anthropic_value() -> None:
+    assert Backend.ANTHROPIC == "anthropic"
+
+
+def test_backend_openai_value() -> None:
+    assert Backend.OPENAI == "openai"
+
+
+def test_backend_google_value() -> None:
+    assert Backend.GOOGLE == "google"
+
+
+def test_backend_fake_value() -> None:
+    assert Backend.FAKE == "fake"
+
+
+def test_backend_str_equality() -> None:
+    # StrEnum members compare equal to their string values
+    assert Backend.OPENROUTER == "openrouter"
+    assert str(Backend.OPENROUTER) == "openrouter"
+
+
+# ---------------------------------------------------------------------------
+# Capability
+# ---------------------------------------------------------------------------
+
+
+def test_capability_count() -> None:
+    assert len(list(Capability)) == 13
+
+
+def test_capability_spec_write() -> None:
+    assert Capability.SPEC_WRITE == "spec_write"
+
+
+def test_capability_code_edit() -> None:
+    assert Capability.CODE_EDIT == "code_edit"
+
+
+def test_capability_final_report_compose() -> None:
+    assert Capability.FINAL_REPORT_COMPOSE == "final_report_compose"
+
+
+def test_capability_all_are_str() -> None:
+    for cap in Capability:
+        assert isinstance(cap, str)
+
+
+# ---------------------------------------------------------------------------
+# RiskLevel
+# ---------------------------------------------------------------------------
+
+
+def test_risk_level_values() -> None:
+    assert RiskLevel.LOW == "low"
+    assert RiskLevel.MEDIUM == "medium"
+    assert RiskLevel.HIGH == "high"
+
+
+# ---------------------------------------------------------------------------
+# ApprovalDecisionAction
+# ---------------------------------------------------------------------------
+
+
+def test_approval_decision_action_approve() -> None:
+    assert ApprovalDecisionAction.APPROVE == "approve"
+
+
+def test_approval_decision_action_reject() -> None:
+    assert ApprovalDecisionAction.REJECT == "reject"
+
+
+def test_approval_decision_action_request_changes() -> None:
+    assert ApprovalDecisionAction.REQUEST_CHANGES == "request_changes"
+
+
+def test_approval_decision_action_abort() -> None:
+    assert ApprovalDecisionAction.ABORT == "abort"
+
+
+# ---------------------------------------------------------------------------
+# ApprovalState
+# ---------------------------------------------------------------------------
+
+
+def test_approval_state_all_values() -> None:
+    expected = {"pending", "approved", "rejected", "changes_requested", "aborted", "paused"}
+    actual = {s.value for s in ApprovalState}
+    assert actual == expected
+
+
+# ---------------------------------------------------------------------------
+# RunState
+# ---------------------------------------------------------------------------
+
+
+def test_run_state_all_values() -> None:
+    expected = {
+        "created",
+        "bound",
+        "planning",
+        "awaiting_approval",
+        "executing",
+        "paused",
+        "completed",
+        "failed",
+        "aborted",
+    }
+    actual = {s.value for s in RunState}
+    assert actual == expected
+
+
+def test_run_state_count() -> None:
+    assert len(list(RunState)) == 9
+
+
+# ---------------------------------------------------------------------------
+# RunPhaseState
+# ---------------------------------------------------------------------------
+
+
+def test_run_phase_state_all_values() -> None:
+    expected = {
+        "pending",
+        "running",
+        "awaiting_artifact",
+        "validating",
+        "awaiting_approval",
+        "completed",
+        "failed",
+        "skipped",
+    }
+    actual = {s.value for s in RunPhaseState}
+    assert actual == expected
+
+
+def test_run_phase_state_count() -> None:
+    assert len(list(RunPhaseState)) == 8
+
+
+# ---------------------------------------------------------------------------
+# SessionState
+# ---------------------------------------------------------------------------
+
+
+def test_session_state_all_values() -> None:
+    expected = {
+        "CREATED",
+        "BOOTSTRAPPING",
+        "READY",
+        "BUSY",
+        "WAITING_FOR_APPROVAL",
+        "ARTIFACT_TIMEOUT",
+        "HUNG",
+        "CRASHED",
+        "RESUMING",
+        "REBOOTSTRAPPED",
+        "FAILED_NEEDS_HUMAN",
+    }
+    actual = {s.value for s in SessionState}
+    assert actual == expected
+
+
+def test_session_state_count() -> None:
+    assert len(list(SessionState)) == 11
+
+
+# ---------------------------------------------------------------------------
+# ErrorClass
+# ---------------------------------------------------------------------------
+
+
+def test_error_class_recoverable() -> None:
+    assert ErrorClass.RECOVERABLE == "recoverable"
+
+
+def test_error_class_human_required() -> None:
+    assert ErrorClass.HUMAN_REQUIRED == "human_required"
+
+
+def test_error_class_fatal() -> None:
+    assert ErrorClass.FATAL == "fatal"
+
+
+def test_error_class_count() -> None:
+    assert len(list(ErrorClass)) == 3
+
+
+# ---------------------------------------------------------------------------
+# StrEnum serialization / deserialization
+# ---------------------------------------------------------------------------
+
+
+def test_str_enum_from_value() -> None:
+    assert Backend("openrouter") is Backend.OPENROUTER
+
+
+def test_str_enum_in_dict() -> None:
+    # StrEnum should work as dict key and compare with string
+    d = {Backend.OPENROUTER: "openrouter backend"}
+    assert d["openrouter"] == "openrouter backend"
+
+
+@pytest.mark.parametrize(
+    "state",
+    list(RunState),
+)
+def test_run_state_parametrize(state: RunState) -> None:
+    assert isinstance(state, str)
+    assert RunState(state.value) is state
--- a/my-deepagent/tests/unit/test_errors.py
+++ b/my-deepagent/tests/unit/test_errors.py
@@ -0,0 +1,208 @@
+"""Unit tests for src/my_deepagent/errors.py."""
+
+from uuid import UUID, uuid4
+
+import pytest
+
+from my_deepagent.enums import ErrorClass
+from my_deepagent.errors import BudgetExhaustedError, MyDeepAgentError
+
+
+def test_cause_sets_suppress_context() -> None:
+    """Setting a cause must also set __suppress_context__ (PEP 3134 chaining, PEP 415)."""
+    original = ValueError("root cause")
+    err = MyDeepAgentError.recoverable("wrapped", cause=original)
+    assert err.__cause__ is original
+    assert err.__suppress_context__ is True
+
+
+def test_no_cause_does_not_set_suppress_context() -> None:
+    err = MyDeepAgentError.recoverable("no_cause")
+    assert err.__cause__ is None
+    assert err.__suppress_context__ is False
+
+
+def test_factory_returns_base_class_not_subclass() -> None:
+    """LSP fix: factory methods always return MyDeepAgentError, never BudgetExhaustedError."""
+    err = BudgetExhaustedError.recoverable("foo")
+    assert type(err) is MyDeepAgentError
+
+
+# ---------------------------------------------------------------------------
+# MyDeepAgentError factory methods
+# ---------------------------------------------------------------------------
+
+
+def test_recoverable_class() -> None:
+    err = MyDeepAgentError.recoverable("network_blip", recovery_hint="retry")
+    assert err.error_class == ErrorClass.RECOVERABLE
+
+
+def test_recoverable_code() -> None:
+    err = MyDeepAgentError.recoverable("network_blip")
+    assert err.code == "network_blip"
+
+
+def test_recoverable_recovery_hint() -> None:
+    err = MyDeepAgentError.recoverable("network_blip", recovery_hint="retry after 1s")
+    assert err.recovery_hint == "retry after 1s"
+
+
+def test_human_required_class() -> None:
+    err = MyDeepAgentError.human_required("destructive_command_blocked")
+    assert err.error_class == ErrorClass.HUMAN_REQUIRED
+
+
+def test_human_required_code() -> None:
+    err = MyDeepAgentError.human_required("destructive_command_blocked")
+    assert err.code == "destructive_command_blocked"
+
+
+def test_fatal_class() -> None:
+    err = MyDeepAgentError.fatal("unrecoverable_state")
+    assert err.error_class == ErrorClass.FATAL
+
+
+def test_fatal_code() -> None:
+    err = MyDeepAgentError.fatal("unrecoverable_state")
+    assert err.code == "unrecoverable_state"
+
+
+# ---------------------------------------------------------------------------
+# run_id / phase_id context
+# ---------------------------------------------------------------------------
+
+
+def test_run_id_attached() -> None:
+    run_id = uuid4()
+    err = MyDeepAgentError.recoverable("timeout", run_id=run_id)
+    assert err.run_id == run_id
+
+
+def test_phase_id_attached() -> None:
+    phase_id = uuid4()
+    err = MyDeepAgentError.recoverable("artifact_missing", phase_id=phase_id)
+    assert err.phase_id == phase_id
+
+
+def test_run_id_none_by_default() -> None:
+    err = MyDeepAgentError.recoverable("x")
+    assert err.run_id is None
+
+
+# ---------------------------------------------------------------------------
+# __cause__ propagation
+# ---------------------------------------------------------------------------
+
+
+def test_cause_propagation() -> None:
+    original = ValueError("root cause")
+    err = MyDeepAgentError.recoverable("wrapped", cause=original)
+    assert err.__cause__ is original
+
+
+def test_cause_none_by_default() -> None:
+    err = MyDeepAgentError.recoverable("no_cause")
+    assert err.__cause__ is None
+
+
+# ---------------------------------------------------------------------------
+# __repr__ format
+# ---------------------------------------------------------------------------
+
+
+def test_repr_contains_class_and_code() -> None:
+    err = MyDeepAgentError.recoverable("some_code")
+    r = repr(err)
+    assert "class=recoverable" in r
+    assert "code=some_code" in r
+
+
+def test_repr_contains_run_id_when_present() -> None:
+    run_id = UUID("12345678-1234-5678-1234-567812345678")
+    err = MyDeepAgentError.recoverable("x", run_id=run_id)
+    assert str(run_id) in repr(err)
+
+
+def test_repr_contains_hint_when_present() -> None:
+    err = MyDeepAgentError.recoverable("x", recovery_hint="do something")
+    assert "do something" in repr(err)
+
+
+def test_repr_no_hint_when_absent() -> None:
+    err = MyDeepAgentError.recoverable("x")
+    assert "hint" not in repr(err)
+
+
+# ---------------------------------------------------------------------------
+# Exception hierarchy
+# ---------------------------------------------------------------------------
+
+
+def test_my_deepagent_error_is_exception() -> None:
+    err = MyDeepAgentError.recoverable("x")
+    assert isinstance(err, Exception)
+
+
+def test_budget_exhausted_is_my_deepagent_error() -> None:
+    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
+    assert isinstance(err, MyDeepAgentError)
+
+
+# ---------------------------------------------------------------------------
+# BudgetExhaustedError
+# ---------------------------------------------------------------------------
+
+
+def test_budget_exhausted_scope() -> None:
+    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
+    assert err.scope == "day:2026-05-15"
+
+
+def test_budget_exhausted_projected_usd() -> None:
+    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
+    assert err.projected_usd == pytest.approx(1.20)
+
+
+def test_budget_exhausted_cap_usd() -> None:
+    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
+    assert err.cap_usd == pytest.approx(1.00)
+
+
+def test_budget_exhausted_error_class() -> None:
+    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
+    assert err.error_class == ErrorClass.HUMAN_REQUIRED
+
+
+def test_budget_exhausted_code() -> None:
+    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
+    assert err.code == "budget_exhausted"
+
+
+def test_budget_exhausted_default_recovery_hint() -> None:
+    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
+    assert err.recovery_hint is not None
+    assert len(err.recovery_hint) > 0
+
+
+def test_budget_exhausted_custom_recovery_hint() -> None:
+    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00, recovery_hint="call support")
+    assert err.recovery_hint == "call support"
+
+
+def test_budget_exhausted_run_id() -> None:
+    run_id = uuid4()
+    err = BudgetExhaustedError("run:abc", 0.5, 0.4, run_id=run_id)
+    assert err.run_id == run_id
+
+
+def test_budget_exhausted_message_contains_scope() -> None:
+    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
+    assert "day:2026-05-15" in str(err)
+
+
+def test_budget_exhausted_message_contains_values() -> None:
+    err = BudgetExhaustedError("scope", 1.2345, 1.0000)
+    msg = str(err)
+    assert "1.2345" in msg
+    assert "1.0000" in msg
--- a/my-deepagent/tests/unit/test_hash.py
+++ b/my-deepagent/tests/unit/test_hash.py
@@ -0,0 +1,121 @@
+"""Unit tests for src/my_deepagent/hash.py."""
+
+import re
+
+import pytest
+
+from my_deepagent.hash import canonicalize, sha256
+
+# ---------------------------------------------------------------------------
+# canonicalize: key ordering
+# ---------------------------------------------------------------------------
+
+
+def test_canonicalize_sorts_keys() -> None:
+    assert canonicalize({"b": 1, "a": 2}) == '{"a":2,"b":1}'
+
+
+def test_canonicalize_nested_sorts_keys() -> None:
+    result = canonicalize({"x": {"b": 2, "a": 1}})
+    assert result == '{"x":{"a":1,"b":2}}'
+
+
+def test_canonicalize_empty_dict() -> None:
+    assert canonicalize({}) == "{}"
+
+
+def test_canonicalize_empty_list() -> None:
+    assert canonicalize([]) == "[]"
+
+
+def test_canonicalize_none() -> None:
+    assert canonicalize(None) == "null"
+
+
+def test_canonicalize_integer() -> None:
+    assert canonicalize(42) == "42"
+
+
+def test_canonicalize_float() -> None:
+    # repr(0.1) is the shortest round-trip string, so the canonical form is "0.1"
+    result = canonicalize(0.1)
+    assert result == "0.1"
+
+
+def test_canonicalize_no_whitespace() -> None:
+    result = canonicalize({"a": 1, "b": 2})
+    assert " " not in result
+
+
+def test_canonicalize_list_preserves_order() -> None:
+    # Lists should not be reordered
+    assert canonicalize([3, 1, 2]) == "[3,1,2]"
+
+
+def test_canonicalize_string_value() -> None:
+    assert canonicalize("hello") == '"hello"'
+
+
+def test_canonicalize_boolean() -> None:
+    assert canonicalize(True) == "true"
+    assert canonicalize(False) == "false"
+
+
+def test_canonicalize_nan_raises() -> None:
+    import math
+
+    with pytest.raises(ValueError):
+        canonicalize(math.nan)
+
+
+# ---------------------------------------------------------------------------
+# sha256: determinism
+# ---------------------------------------------------------------------------
+
+
+def test_sha256_deterministic() -> None:
+    value = {"a": 1, "b": [1, 2, 3]}
+    results = [sha256(value) for _ in range(100)]
+    assert len(set(results)) == 1
+
+
+def test_sha256_returns_64_char_hex() -> None:
+    result = sha256({"a": 1})
+    assert re.fullmatch(r"[0-9a-f]{64}", result) is not None
+
+
+def test_sha256_different_inputs_different_hash() -> None:
+    h1 = sha256({"a": 1})
+    h2 = sha256({"a": 2})
+    assert h1 != h2
+
+
+def test_sha256_key_order_irrelevant() -> None:
+    # Same content, different insertion order → same hash
+    h1 = sha256({"a": 1, "b": 2})
+    h2 = sha256({"b": 2, "a": 1})
+    assert h1 == h2
+
+
+def test_sha256_empty_dict() -> None:
+    result = sha256({})
+    assert re.fullmatch(r"[0-9a-f]{64}", result) is not None
+
+
+def test_sha256_none() -> None:
+    result = sha256(None)
+    assert re.fullmatch(r"[0-9a-f]{64}", result) is not None
+
+
+def test_sha256_nested() -> None:
+    h1 = sha256({"x": {"a": 1, "b": 2}})
+    h2 = sha256({"x": {"b": 2, "a": 1}})
+    assert h1 == h2
+
+
+def test_sha256_known_value() -> None:
+    # Expected digest: SHA-256 of the canonical form '{"a":1}' encoded as UTF-8
+    import hashlib
+
+    expected = hashlib.sha256(b'{"a":1}').hexdigest()
+    assert sha256({"a": 1}) == expected
--- a/my-deepagent/tests/unit/test_middleware_audit.py
+++ b/my-deepagent/tests/unit/test_middleware_audit.py
@@ -0,0 +1,118 @@
+"""Unit tests for src/my_deepagent/middleware/audit.py."""
+
+from __future__ import annotations
+
+from typing import Any
+from unittest.mock import AsyncMock, MagicMock
+from uuid import UUID
+
+import pytest
+
+from my_deepagent.middleware.audit import AuditToolMiddleware
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+def _make_request(name: str = "read_file", args: dict[str, Any] | None = None) -> MagicMock:
+    request = MagicMock()
+    request.tool_call = {"name": name, "args": args if args is not None else {"path": "x.py"}}
+    return request
+
+
+# ---------------------------------------------------------------------------
+# awrap_tool_call — success path
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_audit_middleware_records_correct_fields_on_success() -> None:
+    recorder = AsyncMock()
+    mw = AuditToolMiddleware(
+        run_id=UUID("00000000-0000-0000-0000-000000000001"),
+        phase_id=UUID("00000000-0000-0000-0000-000000000002"),
+        interactive_session_id=UUID("00000000-0000-0000-0000-000000000003"),
+        recorder=recorder,
+    )
+    result_value = "file contents here"
+    handler = AsyncMock(return_value=result_value)
+    request = _make_request(name="read_file", args={"path": "src/main.py"})
+
+    result = await mw.awrap_tool_call(request, handler)
+
+    assert result == result_value
+    recorder.assert_awaited_once()
+    record: dict[str, Any] = recorder.call_args[0][0]
+    assert record["tool_name"] == "read_file"
+    assert record["args"] == {"path": "src/main.py"}
+    assert record["result"] == result_value
+    assert record["error"] is None
+    assert record["duration_ms"] >= 0
+    assert record["run_id"] == UUID("00000000-0000-0000-0000-000000000001")
+
+
+@pytest.mark.asyncio
+async def test_audit_middleware_no_recorder_is_noop() -> None:
+    mw = AuditToolMiddleware()
+    handler = AsyncMock(return_value="ok")
+    result = await mw.awrap_tool_call(_make_request(), handler)
+    assert result == "ok"
+
+
+# ---------------------------------------------------------------------------
+# awrap_tool_call — error path
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_audit_middleware_records_error_code_on_exception() -> None:
+    recorder = AsyncMock()
+    mw = AuditToolMiddleware(recorder=recorder)
+    handler = AsyncMock(side_effect=PermissionError("access denied"))
+
+    with pytest.raises(PermissionError):
+        await mw.awrap_tool_call(_make_request(), handler)
+
+    recorder.assert_awaited_once()
+    record: dict[str, Any] = recorder.call_args[0][0]
+    assert record["error"] == "PermissionError"
+    assert record["result"] is None
+
+
+@pytest.mark.asyncio
+async def test_audit_middleware_reraises_exception() -> None:
+    mw = AuditToolMiddleware(recorder=AsyncMock())
+    handler = AsyncMock(side_effect=ValueError("bad args"))
+    with pytest.raises(ValueError, match="bad args"):
+        await mw.awrap_tool_call(_make_request(), handler)
+
+
+# ---------------------------------------------------------------------------
+# result serialization
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_audit_middleware_serializes_non_primitive_result_as_str() -> None:
+    recorder = AsyncMock()
+    mw = AuditToolMiddleware(recorder=recorder)
+
+    class _CustomResult:
+        def __str__(self) -> str:
+            return "custom-result-str"
+
+    handler = AsyncMock(return_value=_CustomResult())
+    await mw.awrap_tool_call(_make_request(), handler)
+    record = recorder.call_args[0][0]
+    assert record["result"] == "custom-result-str"
+
+
+@pytest.mark.asyncio
+async def test_audit_middleware_passes_dict_result_as_is() -> None:
+    recorder = AsyncMock()
+    mw = AuditToolMiddleware(recorder=recorder)
+    handler = AsyncMock(return_value={"key": "value"})
+    await mw.awrap_tool_call(_make_request(), handler)
+    record = recorder.call_args[0][0]
+    assert record["result"] == {"key": "value"}
--- a/my-deepagent/tests/unit/test_middleware_cost.py
+++ b/my-deepagent/tests/unit/test_middleware_cost.py
@@ -0,0 +1,143 @@
+"""Unit tests for src/my_deepagent/middleware/cost.py."""
+
+from __future__ import annotations
+
+from typing import Any
+from unittest.mock import AsyncMock, MagicMock
+from uuid import UUID
+
+import pytest
+
+from my_deepagent.middleware.cost import CostMiddleware
+from my_deepagent.monitoring.pricing import ModelPrice, PricingCache
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+def _make_pricing_cache(
+    model: str = "anthropic/claude-sonnet",
+    input_per_1k: float = 0.003,
+    output_per_1k: float = 0.015,
+) -> PricingCache:
+    cache = PricingCache()
+    cache.set(
+        [
+            ModelPrice(
+                model=model,
+                input_per_1k_usd=input_per_1k,
+                output_per_1k_usd=output_per_1k,
+                context_length=200000,
+            )
+        ]
+    )
+    return cache
+
+
+def _make_response(input_tokens: int = 100, output_tokens: int = 50) -> MagicMock:
+    response = MagicMock()
+    response.usage_metadata = {"input_tokens": input_tokens, "output_tokens": output_tokens}
+    return response
+
+
+# ---------------------------------------------------------------------------
+# awrap_model_call — success path
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_cost_middleware_records_correct_fields_on_success() -> None:
+    recorder = AsyncMock()
+    cache = _make_pricing_cache()
+    mw = CostMiddleware(
+        pricing=cache,
+        model_name="anthropic/claude-sonnet",
+        run_id=UUID("00000000-0000-0000-0000-000000000001"),
+        phase_id=UUID("00000000-0000-0000-0000-000000000002"),
+        persona_name="test-persona",
+        recorder=recorder,
+    )
+    response = _make_response(input_tokens=1000, output_tokens=500)
+    handler = AsyncMock(return_value=response)
+    request = MagicMock()
+
+    result = await mw.awrap_model_call(request, handler)
+
+    assert result is response
+    recorder.assert_awaited_once()
+    record: dict[str, Any] = recorder.call_args[0][0]
+    assert record["model"] == "anthropic/claude-sonnet"
+    assert record["input_tokens"] == 1000
+    assert record["output_tokens"] == 500
+    assert record["status"] == "ok"
+    assert record["error_code"] is None
+    assert record["latency_ms"] >= 0
+    # cost: (1000/1000 * 0.003) + (500/1000 * 0.015)
+    expected_cost = 0.003 * 1.0 + 0.015 * 0.5
+    assert record["cost_usd_total"] == pytest.approx(expected_cost)
+
+
+@pytest.mark.asyncio
+async def test_cost_middleware_no_recorder_is_noop() -> None:
+    cache = _make_pricing_cache()
+    mw = CostMiddleware(pricing=cache, model_name="anthropic/claude-sonnet")
+    response = _make_response()
+    handler = AsyncMock(return_value=response)
+    # Should not raise even with recorder=None
+    result = await mw.awrap_model_call(MagicMock(), handler)
+    assert result is response
+
+
+# ---------------------------------------------------------------------------
+# awrap_model_call — error path
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_cost_middleware_records_error_on_handler_exception() -> None:
+    recorder = AsyncMock()
+    cache = _make_pricing_cache()
+    mw = CostMiddleware(
+        pricing=cache,
+        model_name="anthropic/claude-sonnet",
+        recorder=recorder,
+    )
+    handler = AsyncMock(side_effect=RuntimeError("timeout"))
+
+    with pytest.raises(RuntimeError, match="timeout"):
+        await mw.awrap_model_call(MagicMock(), handler)
+
+    recorder.assert_awaited_once()
+    record: dict[str, Any] = recorder.call_args[0][0]
+    assert record["status"] == "error"
+    assert record["error_code"] == "RuntimeError"
+    assert record["input_tokens"] == 0
+    assert record["output_tokens"] == 0
+
+
+@pytest.mark.asyncio
+async def test_cost_middleware_reraises_exception() -> None:
+    cache = _make_pricing_cache()
+    mw = CostMiddleware(pricing=cache, model_name="m", recorder=AsyncMock())
+    handler = AsyncMock(side_effect=ValueError("bad input"))
+
+    with pytest.raises(ValueError, match="bad input"):
+        await mw.awrap_model_call(MagicMock(), handler)
+
+
+# ---------------------------------------------------------------------------
+# cost computation via cache
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_cost_zero_when_model_not_in_cache() -> None:
+    recorder = AsyncMock()
+    cache = PricingCache()  # empty
+    mw = CostMiddleware(pricing=cache, model_name="unknown/model", recorder=recorder)
+    response = _make_response(input_tokens=1000, output_tokens=1000)
+    handler = AsyncMock(return_value=response)
+    await mw.awrap_model_call(MagicMock(), handler)
+    record = recorder.call_args[0][0]
+    assert record["cost_usd_total"] == 0.0
--- a/my-deepagent/tests/unit/test_middleware_fallback.py
+++ b/my-deepagent/tests/unit/test_middleware_fallback.py
@@ -0,0 +1,168 @@
+"""Unit tests for src/my_deepagent/middleware/fallback.py."""
+
+from __future__ import annotations
+
+from typing import Any
+from unittest.mock import AsyncMock, MagicMock
+
+import httpx
+import openai
+import pytest
+
+from my_deepagent.middleware.fallback import FallbackModelMiddleware
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+def _make_request(has_model_attr: bool = True) -> MagicMock:
+    request = MagicMock()
+    if not has_model_attr:
+        del request.model
+    return request
+
+
+# ---------------------------------------------------------------------------
+# Fallback on RateLimitError
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_fallback_on_rate_limit_error_calls_handler_with_fallback() -> None:
+    primary = MagicMock(name="primary-model")
+    fallback = MagicMock(name="fallback-model")
+    mw = FallbackModelMiddleware(primary=primary, fallback=fallback)
+
+    call_count = 0
+    fallback_model_seen: Any = None
+
+    async def handler(request: Any) -> str:
+        nonlocal call_count, fallback_model_seen
+        call_count += 1
+        if call_count == 1:
+            raise openai.RateLimitError(
+                "rate limit",
+                response=MagicMock(status_code=429, headers={}),
+                body={},
+            )
+        fallback_model_seen = getattr(request, "model", None)
+        return "fallback-response"
+
+    request = _make_request()
+    result = await mw.awrap_model_call(request, handler)
+    assert result == "fallback-response"
+    assert call_count == 2
+    assert fallback_model_seen is fallback
+
+
+@pytest.mark.asyncio
+async def test_fallback_on_api_connection_error() -> None:
+    primary = MagicMock()
+    fallback = MagicMock()
+    mw = FallbackModelMiddleware(primary=primary, fallback=fallback)
+
+    call_count = 0
+
+    async def handler(request: Any) -> str:
+        nonlocal call_count
+        call_count += 1
+        if call_count == 1:
+            raise openai.APIConnectionError(request=MagicMock())
+        return "connection-fallback"
+
+    result = await mw.awrap_model_call(_make_request(), handler)
+    assert result == "connection-fallback"
+    assert call_count == 2
+
+
+@pytest.mark.asyncio
+async def test_fallback_on_httpx_error() -> None:
+    primary = MagicMock()
+    fallback = MagicMock()
+    mw = FallbackModelMiddleware(primary=primary, fallback=fallback)
+
+    call_count = 0
+
+    async def handler(request: Any) -> str:
+        nonlocal call_count
+        call_count += 1
+        if call_count == 1:
+            raise httpx.ConnectError("connect failed")
+        return "httpx-fallback"
+
+    result = await mw.awrap_model_call(_make_request(), handler)
+    assert result == "httpx-fallback"
+    assert call_count == 2
+
+
+# ---------------------------------------------------------------------------
+# No fallback — exception propagates
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_no_fallback_raises_original_error() -> None:
+    mw = FallbackModelMiddleware(primary=MagicMock(), fallback=None)
+    handler = AsyncMock(
+        side_effect=openai.RateLimitError(
+            "rate limit",
+            response=MagicMock(status_code=429, headers={}),
+            body={},
+        )
+    )
+    with pytest.raises(openai.RateLimitError):
+        await mw.awrap_model_call(_make_request(), handler)
+
+
+# ---------------------------------------------------------------------------
+# AuthenticationError — never retried
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_auth_error_is_not_retried() -> None:
+    primary = MagicMock()
+    fallback = MagicMock()
+    mw = FallbackModelMiddleware(primary=primary, fallback=fallback)
+
+    call_count = 0
+
+    async def handler(request: Any) -> str:
+        nonlocal call_count
+        call_count += 1
+        raise openai.AuthenticationError(
+            "bad api key",
+            response=MagicMock(status_code=401, headers={}),
+            body={},
+        )
+
+    with pytest.raises(openai.AuthenticationError):
+        await mw.awrap_model_call(_make_request(), handler)
+
+    # Handler should only be called once (no retry for auth errors)
+    assert call_count == 1
+
+
+# ---------------------------------------------------------------------------
+# _with_fallback_model
+# ---------------------------------------------------------------------------
+
+
+def test_with_fallback_model_swaps_model_attribute() -> None:
+    primary = MagicMock(name="primary")
+    fallback = MagicMock(name="fallback")
+    mw = FallbackModelMiddleware(primary=primary, fallback=fallback)
+
+    request = MagicMock()
+    request.model = primary
+    patched = mw._with_fallback_model(request)
+    assert patched.model is fallback
+
+
+def test_with_fallback_model_no_model_attr_does_not_crash() -> None:
+    mw = FallbackModelMiddleware(primary=MagicMock(), fallback=MagicMock())
+    request = MagicMock(spec=[])  # no attributes
+    # Should not raise
+    patched = mw._with_fallback_model(request)
+    assert patched is request
--- a/my-deepagent/tests/unit/test_middleware_safety.py
+++ b/my-deepagent/tests/unit/test_middleware_safety.py
@@ -0,0 +1,258 @@
+"""Unit tests for src/my_deepagent/middleware/safety.py."""
+
+from __future__ import annotations
+
+from typing import Any
+from unittest.mock import AsyncMock, MagicMock
+
+import pytest
+
+from my_deepagent.errors import MyDeepAgentError
+from my_deepagent.middleware.safety import SafetyShellMiddleware, _is_denied_path
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+def _make_shell_request(cmd: str | list[str], tool_name: str = "shell") -> MagicMock:
+    request = MagicMock()
+    if isinstance(cmd, list):
+        request.tool_call = {"name": tool_name, "args": {"argv": cmd}}
+    else:
+        request.tool_call = {"name": tool_name, "args": {"command": cmd}}
+    return request
+
+
+def _make_other_tool_request(
+    name: str = "read_file", args: dict[str, Any] | None = None
+) -> MagicMock:
+    request = MagicMock()
+    request.tool_call = {"name": name, "args": args or {}}
+    return request
+
+
+# ---------------------------------------------------------------------------
+# Destructive commands — should raise
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_rm_rf_slash_is_blocked() -> None:
+    mw = SafetyShellMiddleware()
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        await mw.awrap_tool_call(_make_shell_request("rm -rf /"), AsyncMock())
+    assert exc_info.value.code == "destructive_command_blocked"
+
+
+@pytest.mark.asyncio
+async def test_rm_rf_with_path_is_blocked() -> None:
+    mw = SafetyShellMiddleware()
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        await mw.awrap_tool_call(_make_shell_request("rm -rf ./build"), AsyncMock())
+    assert exc_info.value.code == "destructive_command_blocked"
+
+
+@pytest.mark.asyncio
+async def test_git_push_force_is_blocked() -> None:
+    mw = SafetyShellMiddleware()
+    with pytest.raises(MyDeepAgentError):
+        await mw.awrap_tool_call(_make_shell_request("git push --force origin main"), AsyncMock())
+
+
+@pytest.mark.asyncio
+async def test_git_push_force_with_lease_is_blocked() -> None:
+    mw = SafetyShellMiddleware()
+    with pytest.raises(MyDeepAgentError):
+        await mw.awrap_tool_call(
+            _make_shell_request("git push --force-with-lease origin main"), AsyncMock()
+        )
+
+
+@pytest.mark.asyncio
+async def test_git_reset_hard_is_blocked() -> None:
+    mw = SafetyShellMiddleware()
+    with pytest.raises(MyDeepAgentError):
+        await mw.awrap_tool_call(_make_shell_request("git reset --hard HEAD"), AsyncMock())
+
+
+@pytest.mark.asyncio
+async def test_git_clean_is_blocked() -> None:
+    mw = SafetyShellMiddleware()
+    with pytest.raises(MyDeepAgentError):
+        await mw.awrap_tool_call(_make_shell_request("git clean -fd"), AsyncMock())
+
+
+@pytest.mark.asyncio
+async def test_drop_table_sql_is_blocked() -> None:
+    mw = SafetyShellMiddleware()
+    with pytest.raises(MyDeepAgentError):
+        await mw.awrap_tool_call(_make_shell_request("psql -c 'DROP TABLE users'"), AsyncMock())
+
+
+@pytest.mark.asyncio
+async def test_execute_tool_name_also_blocked() -> None:
+    """The 'execute' tool name is also checked for destructive patterns."""
+    mw = SafetyShellMiddleware()
+    with pytest.raises(MyDeepAgentError):
+        await mw.awrap_tool_call(
+            _make_shell_request("rm -rf /tmp/data", tool_name="execute"), AsyncMock()
+        )
+
+
+# ---------------------------------------------------------------------------
+# argv (list) form — should also be blocked
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_rm_rf_as_list_argv_is_blocked() -> None:
+    mw = SafetyShellMiddleware()
+    with pytest.raises(MyDeepAgentError):
+        await mw.awrap_tool_call(
+            _make_shell_request(["rm", "-rf", "/tmp"], tool_name="shell"), AsyncMock()
+        )
+
+
+# ---------------------------------------------------------------------------
+# Safe commands — should pass through
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_ls_la_passes_through() -> None:
+    mw = SafetyShellMiddleware()
+    handler = AsyncMock(return_value="total 42")
+    result = await mw.awrap_tool_call(_make_shell_request("ls -la"), handler)
+    assert result == "total 42"
+    handler.assert_awaited_once()
+
+
+@pytest.mark.asyncio
+async def test_git_status_passes_through() -> None:
+    mw = SafetyShellMiddleware()
+    handler = AsyncMock(return_value="On branch main")
+    result = await mw.awrap_tool_call(_make_shell_request("git status"), handler)
+    assert result == "On branch main"
+
+
+@pytest.mark.asyncio
+async def test_git_push_without_force_passes_through() -> None:
+    mw = SafetyShellMiddleware()
+    handler = AsyncMock(return_value="ok")
+    result = await mw.awrap_tool_call(_make_shell_request("git push origin main"), handler)
+    assert result == "ok"
+
+
+# ---------------------------------------------------------------------------
+# Non-shell tools — should NOT be inspected
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_read_file_tool_with_destructive_content_passes() -> None:
+    """read_file is not a shell tool, so the destructive-command check does not apply to it."""
+    mw = SafetyShellMiddleware()
+    handler = AsyncMock(return_value="file content")
+    request = _make_other_tool_request("read_file", {"path": "/some/file.py"})
+    result = await mw.awrap_tool_call(request, handler)
+    assert result == "file content"
+
+
+@pytest.mark.asyncio
+async def test_unknown_tool_not_checked() -> None:
+    mw = SafetyShellMiddleware()
+    handler = AsyncMock(return_value="ok")
+    result = await mw.awrap_tool_call(_make_other_tool_request("arbitrary_tool"), handler)
+    assert result == "ok"
+
+
+# ---------------------------------------------------------------------------
+# _is_denied_path unit tests
+# ---------------------------------------------------------------------------
+
+
+def test_is_denied_path_env_file() -> None:
+    assert _is_denied_path(".env") is True
+
+
+def test_is_denied_path_env_local_in_subdir() -> None:
+    assert _is_denied_path("config/.env.local") is True
+
+
+def test_is_denied_path_ssh_key() -> None:
+    assert _is_denied_path(".ssh/id_rsa") is True
+
+
+def test_is_denied_path_safe_source_file() -> None:
+    assert _is_denied_path("src/main.py") is False
+
+
+def test_is_denied_path_token_file() -> None:
+    assert _is_denied_path("api_token.json") is True
+
+
+def test_is_denied_path_aws_credentials() -> None:
+    assert _is_denied_path(".aws/credentials") is True
+
+
+def test_is_denied_path_pem_file() -> None:
+    assert _is_denied_path("key.pem") is True
+
+
+def test_is_denied_path_absolute_env() -> None:
+    # absolute path normalised by lstrip('/')
+    assert _is_denied_path("/.env") is True
+
+
+# ---------------------------------------------------------------------------
+# Secret-path tool blocking via awrap_tool_call
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_read_file_env_path_is_blocked() -> None:
+    mw = SafetyShellMiddleware()
+    request = _make_other_tool_request("read_file", {"file_path": ".env"})
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        await mw.awrap_tool_call(request, AsyncMock())
+    assert exc_info.value.code == "secret_access_blocked"
+
+
+@pytest.mark.asyncio
+async def test_write_file_pem_path_is_blocked() -> None:
+    mw = SafetyShellMiddleware()
+    request = _make_other_tool_request("write_file", {"file_path": "key.pem"})
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        await mw.awrap_tool_call(request, AsyncMock())
+    assert exc_info.value.code == "secret_access_blocked"
+
+
+@pytest.mark.asyncio
+async def test_ls_ssh_dir_is_blocked() -> None:
+    mw = SafetyShellMiddleware()
+    request = _make_other_tool_request("ls", {"path": ".ssh/"})
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        await mw.awrap_tool_call(request, AsyncMock())
+    assert exc_info.value.code == "secret_access_blocked"
+
+
+@pytest.mark.asyncio
+async def test_read_file_safe_path_passes() -> None:
+    mw = SafetyShellMiddleware()
+    handler = AsyncMock(return_value="content")
+    request = _make_other_tool_request("read_file", {"file_path": "src/foo.py"})
+    result = await mw.awrap_tool_call(request, handler)
+    assert result == "content"
+    handler.assert_awaited_once()
+
+
+@pytest.mark.asyncio
+async def test_execute_tool_path_arg_not_path_checked() -> None:
+    """execute tool goes through shell-check only, not path-check."""
+    mw = SafetyShellMiddleware()
+    handler = AsyncMock(return_value="ok")
+    # safe shell command with a path arg — should not be blocked via path logic
+    request = _make_shell_request("ls /some/safe/dir", tool_name="execute")
+    result = await mw.awrap_tool_call(request, handler)
+    assert result == "ok"
--- a/my-deepagent/tests/unit/test_persona.py
+++ b/my-deepagent/tests/unit/test_persona.py
@@ -0,0 +1,332 @@
+"""Unit tests for src/my_deepagent/persona.py."""
+
+from __future__ import annotations
+
+import re
+from pathlib import Path
+
+import pytest
+from pydantic import ValidationError
+
+from my_deepagent.enums import Backend
+from my_deepagent.persona import (
+    FilesystemPermissionSpec,
+    Persona,
+    PersonaSubagent,
+    load_persona_yaml,
+    load_personas_from_dir,
+)
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+PERSONAS_DIR = Path(__file__).parent.parent.parent / "docs" / "schemas" / "personas"
+
+
+def _minimal_persona_dict(**overrides: object) -> dict[str, object]:
+    """Return a minimal valid persona dict, overridable per-test."""
+    base: dict[str, object] = {
+        "name": "test-persona",
+        "version": 1,
+        "backend": "openrouter",
+        "model": "openrouter:anthropic/claude-sonnet-4-6",
+        "provider_origin": "US/Anthropic",
+        "capabilities": ["spec_write"],
+        "max_risk_level": "low",
+        "system_prompt": "You are a test persona for unit tests.",
+    }
+    base.update(overrides)
+    return base
+
+
+# ---------------------------------------------------------------------------
+# Seed yaml: all 10 load successfully
+# ---------------------------------------------------------------------------
+
+
+def test_all_seed_personas_load() -> None:
+    personas = load_personas_from_dir(PERSONAS_DIR)
+    assert len(personas) == 10
+
+
+def test_seed_persona_names_unique() -> None:
+    personas = load_personas_from_dir(PERSONAS_DIR)
+    keys = [(p.name, p.version) for p in personas]
+    assert len(keys) == len(set(keys))
+
+
+def test_seed_personas_backends_are_openrouter() -> None:
+    personas = load_personas_from_dir(PERSONAS_DIR)
+    for p in personas:
+        assert p.backend == Backend.OPENROUTER
+
+
+def test_seed_persona_capabilities_non_empty() -> None:
+    personas = load_personas_from_dir(PERSONAS_DIR)
+    for p in personas:
+        assert len(p.capabilities) >= 1
+
+
+def test_seed_persona_hash_is_64_char_hex() -> None:
+    personas = load_personas_from_dir(PERSONAS_DIR)
+    for p in personas:
+        h = p.compute_hash()
+        assert re.fullmatch(r"[0-9a-f]{64}", h), f"{p.name}: bad hash {h!r}"
+
+
+def test_seed_persona_frozen() -> None:
+    """Frozen model: attribute assignment must raise."""
+    personas = load_personas_from_dir(PERSONAS_DIR)
+    p = personas[0]
+    with pytest.raises((TypeError, ValidationError)):
+        p.name = "mutated"  # type: ignore[misc]
+
+
+# ---------------------------------------------------------------------------
+# extra="forbid": unknown fields rejected
+# ---------------------------------------------------------------------------
+
+
+def test_persona_extra_field_raises() -> None:
+    data = _minimal_persona_dict(unknown_field="surprise")
+    with pytest.raises(ValidationError, match="extra"):
+        Persona.model_validate(data)
+
+
+# ---------------------------------------------------------------------------
+# FilesystemPermissionSpec validators
+# ---------------------------------------------------------------------------
+
+
+def test_permission_path_no_leading_slash_raises() -> None:
+    with pytest.raises(ValidationError, match="must start with '/'"):
+        FilesystemPermissionSpec(operations=["read"], paths=["relative/path"])
+
+
+def test_permission_path_dotdot_raises() -> None:
+    with pytest.raises(ValidationError, match=r"must not contain '\.\.'"):
+        FilesystemPermissionSpec(operations=["read"], paths=["/foo/../bar"])
+
+
+def test_permission_path_tilde_raises() -> None:
+    with pytest.raises(ValidationError, match="must not contain '~'"):
+        FilesystemPermissionSpec(operations=["read"], paths=["/path/~expansion/secret"])
+
+
+def test_permission_path_glob_ok() -> None:
+    """Glob patterns like /** should not trigger the path validator."""
+    spec = FilesystemPermissionSpec(operations=["read", "write"], paths=["/**"])
+    assert spec.paths == ("/**",)
+
+
+def test_permission_mode_default_allow() -> None:
+    spec = FilesystemPermissionSpec(operations=["read"], paths=["/tmp"])
+    assert spec.mode == "allow"
+
+
+def test_permission_deny_mode() -> None:
+    spec = FilesystemPermissionSpec(operations=["write"], paths=["/.env"], mode="deny")
+    assert spec.mode == "deny"
+
+
+def test_permission_extra_field_raises() -> None:
+    with pytest.raises(ValidationError):
+        FilesystemPermissionSpec(operations=["read"], paths=["/tmp"], unknown=True)  # type: ignore[call-arg]
+
+
+# ---------------------------------------------------------------------------
+# Persona.compute_hash: determinism
+# ---------------------------------------------------------------------------
+
+
+def test_compute_hash_deterministic() -> None:
+    p = Persona.model_validate(_minimal_persona_dict())
+    hashes = [p.compute_hash() for _ in range(20)]
+    assert len(set(hashes)) == 1
+
+
+def test_compute_hash_different_personas_differ() -> None:
+    p1 = Persona.model_validate(_minimal_persona_dict(name="p1"))
+    p2 = Persona.model_validate(_minimal_persona_dict(name="p2"))
+    assert p1.compute_hash() != p2.compute_hash()
+
+
+def test_compute_hash_version_affects_hash() -> None:
+    p1 = Persona.model_validate(_minimal_persona_dict(version=1))
+    p2 = Persona.model_validate(_minimal_persona_dict(version=2))
+    assert p1.compute_hash() != p2.compute_hash()
+
+
+# ---------------------------------------------------------------------------
+# Persona: min_length, ge validators
+# ---------------------------------------------------------------------------
+
+
+def test_persona_empty_capabilities_raises() -> None:
+    data = _minimal_persona_dict(capabilities=[])
+    with pytest.raises(ValidationError):
+        Persona.model_validate(data)
+
+
+def test_persona_version_zero_raises() -> None:
+    data = _minimal_persona_dict(version=0)
+    with pytest.raises(ValidationError):
+        Persona.model_validate(data)
+
+
+def test_persona_negative_max_cost_raises() -> None:
+    data = _minimal_persona_dict(max_cost_per_call_usd=-0.01)
+    with pytest.raises(ValidationError):
+        Persona.model_validate(data)
+
+
+def test_persona_system_prompt_too_short_raises() -> None:
+    data = _minimal_persona_dict(system_prompt="short")
+    with pytest.raises(ValidationError):
+        Persona.model_validate(data)
+
+
+# ---------------------------------------------------------------------------
+# load_persona_yaml: file not found
+# ---------------------------------------------------------------------------
+
+
+def test_load_persona_yaml_missing_file(tmp_path: Path) -> None:
+    with pytest.raises(FileNotFoundError):
+        load_persona_yaml(tmp_path / "nonexistent.yaml")
+
+
+# ---------------------------------------------------------------------------
+# load_personas_from_dir: duplicate detection
+# ---------------------------------------------------------------------------
+
+
+def test_load_personas_from_dir_duplicate_raises(tmp_path: Path) -> None:
+    import yaml
+
+    data = _minimal_persona_dict()
+    for fname in ("persona-a@1.yaml", "persona-b@1.yaml"):
+        (tmp_path / fname).write_text(yaml.dump(data), encoding="utf-8")
+
+    with pytest.raises(ValueError, match="duplicate persona"):
+        load_personas_from_dir(tmp_path)
+
+
+def test_load_personas_from_dir_missing_dir() -> None:
+    result = load_personas_from_dir(Path("/nonexistent_directory_xyz"))
+    assert result == []
+
+
+def test_load_personas_from_dir_sorted_by_filename(tmp_path: Path) -> None:
+    """Files are loaded in filename order for determinism."""
+    import yaml
+
+    for name in ["zz-persona", "aa-persona"]:
+        data = _minimal_persona_dict(name=name, version=1)
+        (tmp_path / f"{name}@1.yaml").write_text(yaml.dump(data), encoding="utf-8")
+
+    personas = load_personas_from_dir(tmp_path)
+    assert personas[0].name == "aa-persona"
+    assert personas[1].name == "zz-persona"
+
+
+# ---------------------------------------------------------------------------
+# PersonaSubagent: extra="forbid", min_length
+# ---------------------------------------------------------------------------
+
+
+def test_subagent_extra_field_raises() -> None:
+    with pytest.raises(ValidationError):
+        PersonaSubagent(
+            name="x",
+            description="at least ten chars here",
+            system_prompt="at least ten chars here",
+            unknown_field=True,  # type: ignore[call-arg]
+        )
+
+
+def test_subagent_short_description_raises() -> None:
+    with pytest.raises(ValidationError):
+        PersonaSubagent(name="x", description="short", system_prompt="at least ten chars here")
+
+
+# ---------------------------------------------------------------------------
+# Snapshot: specific persona hashes are stable
+# ---------------------------------------------------------------------------
+
+
+def test_default_interactive_hash_prefix() -> None:
+    """Hash of default-interactive@1 must start with 8193103c.
+
+    Hash updated: permissions block removed from yaml (deepagents 0.6.1 workaround).
+    """
+    personas = load_personas_from_dir(PERSONAS_DIR)
+    p = next(q for q in personas if q.name == "default-interactive")
+    assert p.compute_hash().startswith("8193103c")
+
+
+def test_spec_writer_hash_prefix() -> None:
+    """Hash of openrouter-claude-spec-writer@1 must be a 64-char lowercase hex digest."""
+    personas = load_personas_from_dir(PERSONAS_DIR)
+    p = next(q for q in personas if q.name == "openrouter-claude-spec-writer")
+    h = p.compute_hash()
+    assert len(h) == 64
+    assert re.fullmatch(r"[0-9a-f]{64}", h)
+
+
+# ---------------------------------------------------------------------------
+# Step 2 patch: null byte path rejection
+# ---------------------------------------------------------------------------
+
+
+def test_filesystem_permission_null_byte_rejected() -> None:
+    """Null bytes in a filesystem permission path must be rejected."""
+    with pytest.raises(ValidationError, match="null bytes"):
+        FilesystemPermissionSpec.model_validate(
+            {
+                "operations": ["read"],
+                "paths": ["/foo\x00/bar"],
+                "mode": "deny",
+            }
+        )
+
+
+# ---------------------------------------------------------------------------
+# Deep immutability: nested list-valued fields are tuples (cannot be mutated)
+# ---------------------------------------------------------------------------
+
+
+def test_persona_capabilities_immutable() -> None:
+    """capabilities is a tuple — .append() must raise AttributeError."""
+    p = Persona.model_validate(_minimal_persona_dict())
+    with pytest.raises((AttributeError, TypeError)):
+        p.capabilities.append(None)  # type: ignore[attr-defined]
+
+
+def test_persona_subagents_immutable() -> None:
+    """subagents is a tuple — .append() must raise AttributeError."""
+    p = Persona.model_validate(_minimal_persona_dict())
+    with pytest.raises((AttributeError, TypeError)):
+        p.subagents.append(None)  # type: ignore[attr-defined]
+
+
+def test_persona_skills_immutable() -> None:
+    """skills is a tuple — .append() must raise AttributeError."""
+    p = Persona.model_validate(_minimal_persona_dict())
+    with pytest.raises((AttributeError, TypeError)):
+        p.skills.append("new_skill")  # type: ignore[attr-defined]
+
+
+def test_filesystem_permission_paths_immutable() -> None:
+    """paths is a tuple — .append() must raise AttributeError."""
+    perm = FilesystemPermissionSpec(operations=("read",), paths=("/foo",), mode="allow")
+    with pytest.raises((AttributeError, TypeError)):
+        perm.paths.append("/bar")  # type: ignore[attr-defined]
+
+
+def test_filesystem_permission_operations_immutable() -> None:
+    """operations is a tuple — .append() must raise AttributeError."""
+    perm = FilesystemPermissionSpec(operations=("read",), paths=("/foo",), mode="allow")
+    with pytest.raises((AttributeError, TypeError)):
+        perm.operations.append("write")  # type: ignore[attr-defined]
--- a/my-deepagent/tests/unit/test_pricing.py
+++ b/my-deepagent/tests/unit/test_pricing.py
@@ -0,0 +1,229 @@
+"""Unit tests for src/my_deepagent/monitoring/pricing.py."""
+
+from __future__ import annotations
+
+import httpx
+import pytest
+import respx
+
+from my_deepagent.errors import MyDeepAgentError
+from my_deepagent.monitoring.pricing import (
+    ModelPrice,
+    PricingCache,
+    _parse_pricing_payload,
+    fetch_openrouter_pricing,
+)
+
+# ---------------------------------------------------------------------------
+# _parse_pricing_payload
+# ---------------------------------------------------------------------------
+
+
+def test_parse_valid_payload_returns_model_prices() -> None:
+    data = {
+        "data": [
+            {
+                "id": "deepseek/deepseek-chat",
+                "pricing": {"prompt": "0.000001", "completion": "0.000002"},
+                "context_length": 32768,
+            },
+            {
+                "id": "anthropic/claude-sonnet",
+                "pricing": {"prompt": "0.000003", "completion": "0.000015"},
+                "context_length": 200000,
+            },
+        ]
+    }
+    result = _parse_pricing_payload(data)
+    assert len(result) == 2
+    assert result[0].model == "deepseek/deepseek-chat"
+    assert result[0].input_per_1k_usd == pytest.approx(0.001)
+    assert result[0].output_per_1k_usd == pytest.approx(0.002)
+    assert result[0].context_length == 32768
+    assert result[1].model == "anthropic/claude-sonnet"
+
+
+def test_parse_empty_data_list_returns_empty() -> None:
+    result = _parse_pricing_payload({"data": []})
+    assert result == []
+
+
+def test_parse_data_is_not_list_returns_empty() -> None:
+    # data is a dict instead of list — malformed response
+    result = _parse_pricing_payload({"data": {"id": "bad"}})
+    assert result == []
+
+
+def test_parse_missing_data_key_returns_empty() -> None:
+    result = _parse_pricing_payload({})
+    assert result == []
+
+
+def test_parse_skips_entries_without_id() -> None:
+    data = {
+        "data": [
+            {"pricing": {"prompt": "0.000001", "completion": "0.000002"}, "context_length": 1000},
+        ]
+    }
+    result = _parse_pricing_payload(data)
+    assert result == []
+
+
+def test_parse_skips_entries_with_invalid_pricing_values() -> None:
+    data = {
+        "data": [
+            {
+                "id": "model/x",
+                "pricing": {"prompt": "not-a-number", "completion": "also-bad"},
+                "context_length": 1000,
+            }
+        ]
+    }
+    result = _parse_pricing_payload(data)
+    assert result == []
+
+
+def test_parse_handles_null_pricing_gracefully() -> None:
+    data = {
+        "data": [
+            {"id": "model/y", "pricing": None, "context_length": 0},
+        ]
+    }
+    result = _parse_pricing_payload(data)
+    # pricing=None -> {} -> prompt/completion default to "0"
+    assert len(result) == 1
+    assert result[0].input_per_1k_usd == 0.0
+    assert result[0].output_per_1k_usd == 0.0
+
+
+def test_parse_handles_missing_context_length() -> None:
+    data = {
+        "data": [
+            {"id": "model/z", "pricing": {"prompt": "0.000001", "completion": "0.000002"}},
+        ]
+    }
+    result = _parse_pricing_payload(data)
+    assert len(result) == 1
+    assert result[0].context_length == 0
+
+
+def test_parse_non_dict_entry_is_skipped() -> None:
+    data = {"data": ["not-a-dict", None]}
+    result = _parse_pricing_payload(data)
+    assert result == []
+
+
+# ---------------------------------------------------------------------------
+# PricingCache.compute_cost
+# ---------------------------------------------------------------------------
+
+
+def test_compute_cost_known_model() -> None:
+    cache = PricingCache()
+    cache.set(
+        [
+            ModelPrice(
+                model="deepseek/deepseek-chat",
+                input_per_1k_usd=0.001,
+                output_per_1k_usd=0.002,
+                context_length=32768,
+            )
+        ]
+    )
+    cost = cache.compute_cost("deepseek/deepseek-chat", input_tokens=1000, output_tokens=500)
+    assert cost == pytest.approx(0.001 * 1.0 + 0.002 * 0.5)
+
+
+def test_compute_cost_openrouter_prefix_stripped() -> None:
+    cache = PricingCache()
+    cache.set(
+        [
+            ModelPrice(
+                model="deepseek/deepseek-chat",
+                input_per_1k_usd=0.001,
+                output_per_1k_usd=0.002,
+                context_length=32768,
+            )
+        ]
+    )
+    # Should strip "openrouter:" prefix when looking up
+    cost = cache.compute_cost(
+        "openrouter:deepseek/deepseek-chat", input_tokens=1000, output_tokens=0
+    )
+    assert cost == pytest.approx(0.001)
+
+
+def test_compute_cost_unknown_model_returns_zero() -> None:
+    cache = PricingCache()
+    cost = cache.compute_cost("unknown/model", input_tokens=1000, output_tokens=1000)
+    assert cost == 0.0
+
+
+def test_compute_cost_zero_tokens_returns_zero() -> None:
+    cache = PricingCache()
+    cache.set(
+        [ModelPrice(model="m/x", input_per_1k_usd=1.0, output_per_1k_usd=2.0, context_length=1000)]
+    )
+    assert cache.compute_cost("m/x", input_tokens=0, output_tokens=0) == 0.0
+
+
+def test_pricing_cache_get_strips_openrouter_prefix() -> None:
+    cache = PricingCache()
+    cache.set(
+        [ModelPrice(model="a/b", input_per_1k_usd=0.5, output_per_1k_usd=1.0, context_length=0)]
+    )
+    assert cache.get("openrouter:a/b") is not None
+    assert cache.get("a/b") is not None
+
+
+# ---------------------------------------------------------------------------
+# fetch_openrouter_pricing (respx mock)
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_fetch_openrouter_pricing_success() -> None:
+    payload = {
+        "data": [
+            {
+                "id": "deepseek/deepseek-chat",
+                "pricing": {"prompt": "0.000001", "completion": "0.000002"},
+                "context_length": 64000,
+            }
+        ]
+    }
+    with respx.mock:
+        respx.get("https://openrouter.ai/api/v1/models").mock(
+            return_value=httpx.Response(200, json=payload)
+        )
+        result = await fetch_openrouter_pricing(
+            api_key="sk-or-test", base_url="https://openrouter.ai/api/v1"
+        )
+    assert len(result) == 1
+    assert result[0].model == "deepseek/deepseek-chat"
+
+
+@pytest.mark.asyncio
+async def test_fetch_openrouter_pricing_http_error_raises_recoverable() -> None:
+    with respx.mock:
+        respx.get("https://openrouter.ai/api/v1/models").mock(
+            return_value=httpx.Response(401, json={"error": "unauthorized"})
+        )
+        with pytest.raises(MyDeepAgentError) as exc_info:
+            await fetch_openrouter_pricing(
+                api_key="bad-key", base_url="https://openrouter.ai/api/v1"
+            )
+    assert exc_info.value.code == "network_blip"
+
+
+@pytest.mark.asyncio
+async def test_fetch_openrouter_pricing_connect_error_raises_recoverable() -> None:
+    with respx.mock:
+        respx.get("https://openrouter.ai/api/v1/models").mock(
+            side_effect=httpx.ConnectError("connection refused")
+        )
+        with pytest.raises(MyDeepAgentError) as exc_info:
+            await fetch_openrouter_pricing(
+                api_key="sk-or-test", base_url="https://openrouter.ai/api/v1"
+            )
+    assert exc_info.value.code == "network_blip"
--- a/my-deepagent/tests/unit/test_session.py
+++ b/my-deepagent/tests/unit/test_session.py
@@ -0,0 +1,454 @@
+"""Unit tests for src/my_deepagent/session.py.
+
+Tests verify the dataclass-based deepagents API (FilesystemPermission attributes,
+build_backend backend type dispatch, _map_operations deduplication, etc.).
+No real API calls are made.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+from typing import Any
+
+import pytest
+from deepagents import FilesystemPermission
+from deepagents.backends import (
+    CompositeBackend,
+    FilesystemBackend,
+    LocalShellBackend,
+)
+from langchain_openai import ChatOpenAI
+from langgraph.graph.state import CompiledStateGraph
+
+from my_deepagent.config import load_config
+from my_deepagent.errors import MyDeepAgentError
+from my_deepagent.persona import FilesystemPermissionSpec, Persona, PersonaSubagent
+from my_deepagent.session import (
+    _map_operations,
+    _resolve_openrouter_api_key,
+    _spec_to_permission,
+    _subagent_to_dict,
+    build_agent,
+    build_backend,
+    default_safety_permissions,
+    resolve_model_instance,
+)
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+def _minimal_persona(**overrides: Any) -> Persona:
+    base: dict[str, Any] = {
+        "name": "test-persona",
+        "version": 1,
+        "backend": "openrouter",
+        "model": "openrouter:anthropic/claude-sonnet-4-6",
+        "provider_origin": "US/Anthropic",
+        "capabilities": ["spec_write"],
+        "max_risk_level": "low",
+        "system_prompt": "You are a test assistant for unit tests.",
+    }
+    base.update(overrides)
+    return Persona.model_validate(base)
+
+
+def _minimal_permission_spec(
+    operations: list[str] | None = None,
+    paths: list[str] | None = None,
+    mode: str = "allow",
+) -> FilesystemPermissionSpec:
+    return FilesystemPermissionSpec(
+        operations=tuple(operations or ["read"]),
+        paths=tuple(paths or ["/**"]),
+        mode=mode,  # type: ignore[arg-type]
+    )
+
+
+def _minimal_subagent(**overrides: Any) -> PersonaSubagent:
+    base: dict[str, Any] = {
+        "name": "test-sub",
+        "description": "A test subagent description.",
+        "system_prompt": "You are a subagent for unit tests.",
+    }
+    base.update(overrides)
+    return PersonaSubagent.model_validate(base)
+
+
+# ---------------------------------------------------------------------------
+# default_safety_permissions — dataclass attribute access
+# ---------------------------------------------------------------------------
+
+
+def test_default_safety_permissions_returns_two_entries() -> None:
+    perms = default_safety_permissions()
+    assert len(perms) == 2
+
+
+def test_default_safety_permissions_returns_filesystem_permission_instances() -> None:
+    perms = default_safety_permissions()
+    for p in perms:
+        assert isinstance(p, FilesystemPermission)
+
+
+def test_default_safety_permissions_allow_is_first() -> None:
+    perms = default_safety_permissions()
+    assert perms[0].mode == "allow"
+    assert "/**" in perms[0].paths
+
+
+def test_default_safety_permissions_allow_has_both_operations() -> None:
+    perms = default_safety_permissions()
+    assert "read" in perms[0].operations
+    assert "write" in perms[0].operations
+
+
+def test_default_safety_permissions_deny_is_second() -> None:
+    perms = default_safety_permissions()
+    assert perms[1].mode == "deny"
+    deny_paths = perms[1].paths
+    assert any("env" in p for p in deny_paths)
+    assert any("ssh" in p for p in deny_paths)
+
+
+def test_default_safety_permissions_deny_covers_secrets() -> None:
+    perms = default_safety_permissions()
+    deny_paths = perms[1].paths
+    assert any("secret" in p for p in deny_paths)
+    assert any("token" in p for p in deny_paths)
+    assert any("pem" in p for p in deny_paths)
+
+
+# ---------------------------------------------------------------------------
+# _map_operations: 8 cases
+# ---------------------------------------------------------------------------
+
+
+def test_map_operations_read() -> None:
+    assert _map_operations(("read",)) == ["read"]
+
+
+def test_map_operations_write() -> None:
+    assert _map_operations(("write",)) == ["write"]
+
+
+def test_map_operations_edit_maps_to_write() -> None:
+    assert _map_operations(("edit",)) == ["write"]
+
+
+def test_map_operations_ls_maps_to_read() -> None:
+    assert _map_operations(("ls",)) == ["read"]
+
+
+def test_map_operations_deduplicates_all_four() -> None:
+    result = _map_operations(("read", "write", "edit", "ls"))
+    assert result == ["read", "write"]
+
+
+def test_map_operations_ls_and_edit() -> None:
+    assert _map_operations(("ls", "edit")) == ["read", "write"]
+
+
+def test_map_operations_preserves_order_write_then_read() -> None:
+    result = _map_operations(("write", "read"))
+    assert result == ["write", "read"]
+
+
+def test_map_operations_empty_returns_empty() -> None:
+    assert _map_operations(()) == []
+
+
+# ---------------------------------------------------------------------------
+# _spec_to_permission — dataclass attribute + mapping
+# ---------------------------------------------------------------------------
+
+
+def test_spec_to_permission_returns_filesystem_permission() -> None:
+    spec = _minimal_permission_spec(operations=["read"], paths=["/**"], mode="allow")
+    result = _spec_to_permission(spec)
+    assert isinstance(result, FilesystemPermission)
+
+
+def test_spec_to_permission_maps_read_write_correctly() -> None:
+    spec = _minimal_permission_spec(operations=["read", "write"], paths=["/**"], mode="allow")
+    result = _spec_to_permission(spec)
+    assert result.operations == ["read", "write"]
+    assert result.paths == ["/**"]
+    assert result.mode == "allow"
+
+
+def test_spec_to_permission_maps_edit_to_write() -> None:
+    spec = _minimal_permission_spec(operations=["edit"], paths=["/src/**"], mode="allow")
+    result = _spec_to_permission(spec)
+    assert result.operations == ["write"]
+
+
+def test_spec_to_permission_maps_ls_to_read() -> None:
+    spec = _minimal_permission_spec(operations=["ls"], paths=["/data/**"], mode="allow")
+    result = _spec_to_permission(spec)
+    assert result.operations == ["read"]
+
+
+def test_spec_to_permission_deduplicates_read_edit_ls() -> None:
+    spec = _minimal_permission_spec(
+        operations=["read", "edit", "ls"], paths=["/workspace/**"], mode="allow"
+    )
+    result = _spec_to_permission(spec)
+    # read=read, edit=write, ls=read → ["read", "write"]
+    assert result.operations == ["read", "write"]
+
+
+def test_spec_to_permission_deny_mode_passthrough() -> None:
+    spec = _minimal_permission_spec(operations=["read"], paths=["/.env*"], mode="deny")
+    result = _spec_to_permission(spec)
+    assert result.mode == "deny"
+    assert "/.env*" in result.paths
+
+
+# ---------------------------------------------------------------------------
+# _subagent_to_dict
+# ---------------------------------------------------------------------------
+
+
+def test_subagent_to_dict_required_fields() -> None:
+    sub = _minimal_subagent()
+    d = _subagent_to_dict(sub)
+    assert d["name"] == "test-sub"
+    assert d["description"] == "A test subagent description."
+    assert d["system_prompt"] == "You are a subagent for unit tests."
+
+
+def test_subagent_to_dict_optional_tools_included_when_set() -> None:
+    sub = _minimal_subagent(allowed_tools=["read_file", "write_file"])
+    d = _subagent_to_dict(sub)
+    assert "tools" in d
+    assert d["tools"] == ["read_file", "write_file"]
+
+
+def test_subagent_to_dict_no_tools_key_when_empty() -> None:
+    sub = _minimal_subagent()
+    d = _subagent_to_dict(sub)
+    assert "tools" not in d
+
+
+def test_subagent_to_dict_optional_model_included_when_set() -> None:
+    sub = _minimal_subagent(model="openrouter:deepseek/deepseek-chat")
+    d = _subagent_to_dict(sub)
+    assert "model" in d
+    assert d["model"] == "openrouter:deepseek/deepseek-chat"
+
+
+def test_subagent_to_dict_no_model_key_when_none() -> None:
+    sub = _minimal_subagent()
+    d = _subagent_to_dict(sub)
+    assert "model" not in d
+
+
+def test_subagent_to_dict_permissions_included_when_set() -> None:
+    sub = _minimal_subagent(
+        permissions=[{"operations": ["read"], "paths": ["/**"], "mode": "allow"}]
+    )
+    d = _subagent_to_dict(sub)
+    assert "permissions" in d
+    assert len(d["permissions"]) == 1
+    # Each entry in permissions is also a FilesystemPermission instance
+    assert isinstance(d["permissions"][0], FilesystemPermission)
+
+
+def test_subagent_to_dict_permissions_empty_not_included() -> None:
+    sub = _minimal_subagent()
+    d = _subagent_to_dict(sub)
+    assert "permissions" not in d
+
+
+def test_subagent_to_dict_interrupt_on_included_when_set() -> None:
+    sub = _minimal_subagent(interrupt_on={"write_file": {"allowed_decisions": ["approve"]}})
+    d = _subagent_to_dict(sub)
+    assert "interrupt_on" in d
+
+
+def test_subagent_to_dict_no_interrupt_on_when_empty() -> None:
+    sub = _minimal_subagent()
+    d = _subagent_to_dict(sub)
+    assert "interrupt_on" not in d
+
+
+# ---------------------------------------------------------------------------
+# _resolve_openrouter_api_key
+# ---------------------------------------------------------------------------
+
+
+def test_resolve_api_key_from_config() -> None:
+    config = load_config(openrouter_api_key="sk-or-from-config")
+    key = _resolve_openrouter_api_key(config)
+    assert key == "sk-or-from-config"
+
+
+def test_resolve_api_key_from_mydeepagent_env(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.delenv("MYDEEPAGENT_OPENROUTER_API_KEY", raising=False)
+    monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
+    monkeypatch.setenv("MYDEEPAGENT_OPENROUTER_API_KEY", "sk-or-env-mydeepagent")
+    config = load_config(openrouter_api_key=None)
+    key = _resolve_openrouter_api_key(config)
+    assert key == "sk-or-env-mydeepagent"
+
+
+def test_resolve_api_key_fallback_to_openrouter_env(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.delenv("MYDEEPAGENT_OPENROUTER_API_KEY", raising=False)
+    monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
+    monkeypatch.setenv("OPENROUTER_API_KEY", "sk-or-env-fallback")
+    config = load_config(openrouter_api_key=None)
+    key = _resolve_openrouter_api_key(config)
+    assert key == "sk-or-env-fallback"
+
+
+def test_resolve_api_key_raises_when_missing(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.delenv("MYDEEPAGENT_OPENROUTER_API_KEY", raising=False)
+    monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
+    config = load_config(openrouter_api_key=None)
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        _resolve_openrouter_api_key(config)
+    assert exc_info.value.code == "backend_auth_failed"
+
+
+def test_resolve_api_key_config_takes_priority_over_env(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setenv("MYDEEPAGENT_OPENROUTER_API_KEY", "sk-or-env")
+    config = load_config(openrouter_api_key="sk-or-config-wins")
+    key = _resolve_openrouter_api_key(config)
+    assert key == "sk-or-config-wins"
+
+
+# ---------------------------------------------------------------------------
+# resolve_model_instance
+# ---------------------------------------------------------------------------
+
+
+def test_resolve_model_openrouter_returns_chat_openai() -> None:
+    config = load_config(openrouter_api_key="sk-or-test")
+    persona = _minimal_persona(model="openrouter:anthropic/claude-sonnet-4-6")
+    instance = resolve_model_instance(persona, config)
+    assert isinstance(instance, ChatOpenAI)
+    assert instance.openai_api_base == config.openrouter_base_url
+
+
+def test_resolve_model_openrouter_uses_model_params() -> None:
+    config = load_config(openrouter_api_key="sk-or-test")
+    persona = _minimal_persona(
+        model="openrouter:anthropic/claude-sonnet-4-6",
+        model_params={"max_tokens": 1024, "temperature": 0.5},
+    )
+    instance = resolve_model_instance(persona, config)
+    assert isinstance(instance, ChatOpenAI)
+    assert instance.max_tokens == 1024
+
+
+def test_resolve_model_non_openrouter_returns_string() -> None:
+    config = load_config()
+    persona = _minimal_persona(
+        backend="anthropic",
+        model="anthropic:claude-3-5-sonnet-20241022",
+    )
+    result = resolve_model_instance(persona, config)
+    assert isinstance(result, str)
+    assert result == "anthropic:claude-3-5-sonnet-20241022"
+
+
+def test_resolve_model_with_override_openrouter() -> None:
+    config = load_config(openrouter_api_key="sk-or-test")
+    persona = _minimal_persona(model="openrouter:anthropic/claude-sonnet-4-6")
+    instance = resolve_model_instance(
+        persona, config, model_override="openrouter:deepseek/deepseek-chat"
+    )
+    assert isinstance(instance, ChatOpenAI)
+    assert "deepseek-chat" in instance.model_name
+
+
+# ---------------------------------------------------------------------------
+# build_backend: 5 cases
+# ---------------------------------------------------------------------------
+
+
+def test_build_backend_local_shell(tmp_path: Path) -> None:
+    persona = _minimal_persona(deepagents_backend="local_shell")
+    result = build_backend(persona, tmp_path)
+    assert isinstance(result, LocalShellBackend)
+
+
+def test_build_backend_filesystem(tmp_path: Path) -> None:
+    persona = _minimal_persona(deepagents_backend="filesystem")
+    result = build_backend(persona, tmp_path)
+    assert isinstance(result, FilesystemBackend)
+
+
+def test_build_backend_state_returns_none(tmp_path: Path) -> None:
+    persona = _minimal_persona(deepagents_backend="state")
+    result = build_backend(persona, tmp_path)
+    assert result is None
+
+
+def test_build_backend_composite(tmp_path: Path) -> None:
+    persona = _minimal_persona(deepagents_backend="composite")
+    result = build_backend(persona, tmp_path)
+    assert isinstance(result, CompositeBackend)
+
+
+def test_build_backend_langsmith_raises_config_invalid(tmp_path: Path) -> None:
+    persona = _minimal_persona(deepagents_backend="langsmith")
+    with pytest.raises(MyDeepAgentError) as exc_info:
+        build_backend(persona, tmp_path)
+    assert exc_info.value.code == "config_invalid"
+
+
+# ---------------------------------------------------------------------------
+# build_agent
+# ---------------------------------------------------------------------------
+
+
+def test_build_agent_returns_compiled_state_graph(tmp_path: Path) -> None:
+    """build_agent should construct a CompiledStateGraph without calling the LLM API."""
+    config = load_config(openrouter_api_key="sk-or-test")
+    persona = _minimal_persona(deepagents_backend="state")
+    graph = build_agent(persona, config, root_dir=tmp_path)
+    assert isinstance(graph, CompiledStateGraph)
+    assert hasattr(graph, "invoke")
+    assert hasattr(graph, "ainvoke")
+
+
+def test_build_agent_with_middleware_list(tmp_path: Path) -> None:
+    """Extra middleware is accepted without error.
+
+    build_agent automatically prepends SafetyShellMiddleware. Callers should pass
+    *other* middleware here; passing a second SafetyShellMiddleware would hit
+    deepagents' duplicate-name guard.
+    """
+    from my_deepagent.middleware.audit import AuditToolMiddleware
+
+    config = load_config(openrouter_api_key="sk-or-test")
+    persona = _minimal_persona(deepagents_backend="state")
+    graph = build_agent(
+        persona,
+        config,
+        root_dir=tmp_path,
+        middleware=[AuditToolMiddleware()],
+    )
+    assert isinstance(graph, CompiledStateGraph)
+
+
+def test_build_agent_filesystem_backend(tmp_path: Path) -> None:
+    """build_agent works with filesystem backend."""
+    config = load_config(openrouter_api_key="sk-or-test")
+    persona = _minimal_persona(deepagents_backend="filesystem")
+    graph = build_agent(persona, config, root_dir=tmp_path)
+    assert isinstance(graph, CompiledStateGraph)
+
+
+def test_build_agent_with_persona_permissions(tmp_path: Path) -> None:
+    """build_agent merges persona permissions with default safety permissions."""
+    config = load_config(openrouter_api_key="sk-or-test")
+    persona = _minimal_persona(
+        deepagents_backend="state",
+        permissions=[{"operations": ["read"], "paths": ["/workspace/**"], "mode": "allow"}],
+    )
+    graph = build_agent(persona, config, root_dir=tmp_path)
+    assert isinstance(graph, CompiledStateGraph)
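The key-resolution tests in this file pin down an order of precedence: an explicit config value wins, then the `OPENROUTER_API_KEY` environment variable is used as a fallback, and a missing key raises an error whose `code` is `backend_auth_failed`. A minimal sketch of that logic follows. It assumes a hypothetical `resolve_key` helper and a stand-in `AuthError` class; neither is the project's actual `_resolve_openrouter_api_key` or `MyDeepAgentError`, whose internals this diff does not show.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


class AuthError(Exception):
    """Stand-in for MyDeepAgentError; only the `code` attribute is modeled."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


def resolve_key(config_key: str | None, env: Mapping[str, str] | None = None) -> str:
    """Return the API key: config value first, then OPENROUTER_API_KEY.

    `env` defaults to os.environ; passing a plain dict keeps tests hermetic.
    """
    env = os.environ if env is None else env
    if config_key:
        # An explicit config value always wins over the environment.
        return config_key
    fallback = env.get("OPENROUTER_API_KEY")
    if fallback:
        return fallback
    raise AuthError("backend_auth_failed", "no OpenRouter API key configured")
```

Taking the environment as a parameter is a design choice of this sketch only: it lets the three cases be exercised without `monkeypatch`.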
--- a/my-deepagent/tests/unit/test_session_seed_integration.py
+++ b/my-deepagent/tests/unit/test_session_seed_integration.py
@@ -0,0 +1,55 @@
+"""Seed persona integration tests for session.py model resolution."""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+from langchain_openai import ChatOpenAI
+
+from my_deepagent.config import load_config
+from my_deepagent.enums import Backend
+from my_deepagent.persona import load_personas_from_dir
+from my_deepagent.session import resolve_model_instance
+
+PERSONAS_DIR = Path(__file__).parent.parent.parent / "docs" / "schemas" / "personas"
+
+
+@pytest.fixture
+def seed_personas() -> list:  # type: ignore[type-arg]
+    return load_personas_from_dir(PERSONAS_DIR)
+
+
+def test_resolve_model_instance_seed_personas(seed_personas: list) -> None:  # type: ignore[type-arg]
+    """resolve_model_instance should return ChatOpenAI for openrouter personas, str otherwise."""
+    config = load_config(openrouter_api_key="sk-or-dummy")
+    for persona in seed_personas:
+        instance = resolve_model_instance(persona, config)
+        if persona.backend == Backend.OPENROUTER:
+            assert isinstance(instance, ChatOpenAI), (
+                f"persona {persona.name!r} with backend=openrouter should return ChatOpenAI, "
+                f"got {type(instance)}"
+            )
+            # base_url should point to openrouter
+            assert instance.openai_api_base is not None
+            base = instance.openai_api_base
+            assert "openrouter" in base or base == config.openrouter_base_url
+        else:
+            assert isinstance(instance, str), (
+                f"persona {persona.name!r} with backend={persona.backend} should return str, "
+                f"got {type(instance)}"
+            )
+
+
+def test_all_seed_personas_have_non_empty_model(seed_personas: list) -> None:  # type: ignore[type-arg]
+    for persona in seed_personas:
+        assert persona.model, f"persona {persona.name!r} has empty model"
+
+
+def test_all_openrouter_seed_personas_have_openrouter_prefix(seed_personas: list) -> None:  # type: ignore[type-arg]
+    for persona in seed_personas:
+        if persona.backend == Backend.OPENROUTER:
+            assert persona.model.startswith("openrouter:"), (
+                f"persona {persona.name!r} has backend=openrouter but model={persona.model!r} "
+                "does not start with 'openrouter:'"
+            )
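The seed-persona tests rely on the `openrouter:` prefix to decide whether a model string becomes a `ChatOpenAI` instance or is passed through unchanged as a `str`. A rough sketch of that routing decision is below. The `split_model_string` and `needs_openrouter_client` helpers are hypothetical names introduced for illustration; they only parse the string and do not construct any client.

```python
from __future__ import annotations


def split_model_string(model: str) -> tuple[str, str]:
    """Split 'provider:model-id' at the first colon.

    Only the first colon separates the parts, so any later colon
    stays inside the model id.
    """
    provider, sep, model_id = model.partition(":")
    if not sep or not provider or not model_id:
        raise ValueError(f"expected 'provider:model', got {model!r}")
    return provider, model_id


def needs_openrouter_client(model: str) -> bool:
    """True when the model string should be routed through OpenRouter."""
    return split_model_string(model)[0] == "openrouter"
```

Under these assumptions, `openrouter:deepseek/deepseek-chat` splits into the provider `openrouter` and the model id `deepseek/deepseek-chat`, while `anthropic:claude-3-5-sonnet-20241022` is left for the caller to pass through as a plain string.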
--- a/my-deepagent/tests/unit/test_workflow.py
+++ b/my-deepagent/tests/unit/test_workflow.py
@@ -0,0 +1,335 @@
+"""Unit tests for src/my_deepagent/workflow.py."""
+
+from __future__ import annotations
+
+import re
+from pathlib import Path
+
+import pytest
+from pydantic import ValidationError
+
+from my_deepagent.workflow import (
+    ExpectedArtifact,
+    WorkflowTemplate,
+    load_workflow_yaml,
+    load_workflows_from_dir,
+)
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+WORKFLOWS_DIR = Path(__file__).parent.parent.parent / "docs" / "schemas" / "workflows"
+
+
+def _minimal_role(**overrides: object) -> dict[str, object]:
+    base: dict[str, object] = {
+        "id": "spec_writer",
+        "required_capabilities": ["spec_write"],
+    }
+    base.update(overrides)
+    return base
+
+
+def _minimal_phase(**overrides: object) -> dict[str, object]:
+    base: dict[str, object] = {
+        "key": "spec",
+        "title": "Write spec",
+        "risk": "low",
+        "role": "spec_writer",
+        "instructions": "Write the specification document for the feature.",
+    }
+    base.update(overrides)
+    return base
+
+
+def _minimal_template(**overrides: object) -> dict[str, object]:
+    base: dict[str, object] = {
+        "name": "test-workflow",
+        "version": 1,
+        "roles": [_minimal_role()],
+        "phases": [_minimal_phase()],
+    }
+    base.update(overrides)
+    return base
+
+
+# ---------------------------------------------------------------------------
+# Seed yaml: all 3 load successfully
+# ---------------------------------------------------------------------------
+
+
+def test_all_seed_workflows_load() -> None:
+    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
+    assert len(workflows) == 3
+
+
+def test_seed_workflow_names() -> None:
+    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
+    names = {w.name for w in workflows}
+    assert names == {"spec-and-review", "bug-fix-with-reproduction", "code-investigation"}
+
+
+def test_seed_workflow_roles_non_empty() -> None:
+    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
+    for w in workflows:
+        assert len(w.roles) >= 1
+
+
+def test_seed_workflow_phases_non_empty() -> None:
+    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
+    for w in workflows:
+        assert len(w.phases) >= 1
+
+
+def test_seed_workflow_phase_keys_unique() -> None:
+    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
+    for w in workflows:
+        keys = [ph.key for ph in w.phases]
+        assert len(keys) == len(set(keys)), f"{w.name}: duplicate phase keys"
+
+
+# ---------------------------------------------------------------------------
+# WorkflowTemplate validators
+# ---------------------------------------------------------------------------
+
+
+def test_phase_references_undefined_role_raises() -> None:
+    data = _minimal_template(
+        roles=[_minimal_role(id="spec_writer")],
+        phases=[_minimal_phase(role="nonexistent_role")],
+    )
+    with pytest.raises(ValidationError, match="unknown role"):
+        WorkflowTemplate.model_validate(data)
+
+
+def test_duplicate_phase_keys_raises() -> None:
+    data = _minimal_template(
+        roles=[_minimal_role(id="spec_writer")],
+        phases=[
+            _minimal_phase(key="spec"),
+            _minimal_phase(key="spec"),
+        ],
+    )
+    with pytest.raises(ValidationError, match="duplicate phase keys"):
+        WorkflowTemplate.model_validate(data)
+
+
+def test_duplicate_role_ids_raises() -> None:
+    data = _minimal_template(
+        roles=[_minimal_role(id="spec_writer"), _minimal_role(id="spec_writer")],
+        phases=[_minimal_phase(role="spec_writer")],
+    )
+    with pytest.raises(ValidationError, match="duplicate role ids"):
+        WorkflowTemplate.model_validate(data)
+
+
+def test_phase_key_uppercase_raises() -> None:
+    data = _minimal_template(phases=[_minimal_phase(key="SPEC")])
+    with pytest.raises(ValidationError):
+        WorkflowTemplate.model_validate(data)
+
+
+def test_phase_key_with_hyphen_raises() -> None:
+    """Hyphens are not allowed in phase keys (only a-z, 0-9, _)."""
+    data = _minimal_template(phases=[_minimal_phase(key="spec-one")])
+    with pytest.raises(ValidationError):
+        WorkflowTemplate.model_validate(data)
+
+
+def test_phase_key_leading_digit_raises() -> None:
+    data = _minimal_template(phases=[_minimal_phase(key="1spec")])
+    with pytest.raises(ValidationError):
+        WorkflowTemplate.model_validate(data)
+
+
+def test_phase_key_snake_case_ok() -> None:
+    data = _minimal_template(phases=[_minimal_phase(key="spec_write_phase")])
+    wt = WorkflowTemplate.model_validate(data)
+    assert wt.phases[0].key == "spec_write_phase"
+
+
+def test_role_id_pattern_invalid_raises() -> None:
+    data = _minimal_template(
+        roles=[_minimal_role(id="Spec-Writer")],
+        phases=[_minimal_phase(role="spec_writer")],
+    )
+    with pytest.raises(ValidationError):
+        WorkflowTemplate.model_validate(data)
+
+
+# ---------------------------------------------------------------------------
+# ExpectedArtifact: alias mapping
+# ---------------------------------------------------------------------------
+
+
+def test_expected_artifact_schema_alias() -> None:
+    """yaml uses 'schema' key; Python attribute is schema_id."""
+    art = ExpectedArtifact.model_validate({"path": "artifacts/spec.json", "schema": "dev/spec@1"})
+    assert art.schema_id == "dev/spec@1"
+    assert art.path == "artifacts/spec.json"
+
+
+def test_expected_artifact_extra_field_raises() -> None:
+    with pytest.raises(ValidationError):
+        ExpectedArtifact.model_validate({"path": "x.json", "schema": "dev/spec@1", "unknown": True})
+
+
+def test_expected_artifact_missing_schema_raises() -> None:
+    with pytest.raises(ValidationError):
+        ExpectedArtifact.model_validate({"path": "x.json"})
+
+
+# ---------------------------------------------------------------------------
+# WorkflowTemplate frozen + extra="forbid"
+# ---------------------------------------------------------------------------
+
+
+def test_template_frozen() -> None:
+    wt = WorkflowTemplate.model_validate(_minimal_template())
+    with pytest.raises((TypeError, ValidationError)):
+        wt.name = "mutated"  # type: ignore[misc]
+
+
+def test_template_extra_field_raises() -> None:
+    data = _minimal_template(extra_unknown_field="oops")
+    with pytest.raises(ValidationError):
+        WorkflowTemplate.model_validate(data)
+
+
+# ---------------------------------------------------------------------------
+# compute_hash: determinism
+# ---------------------------------------------------------------------------
+
+
+def test_compute_hash_deterministic() -> None:
+    wt = WorkflowTemplate.model_validate(_minimal_template())
+    hashes = [wt.compute_hash() for _ in range(20)]
+    assert len(set(hashes)) == 1
+
+
+def test_compute_hash_returns_64_char_hex() -> None:
+    wt = WorkflowTemplate.model_validate(_minimal_template())
+    h = wt.compute_hash()
+    assert re.fullmatch(r"[0-9a-f]{64}", h)
+
+
+def test_compute_hash_different_templates_differ() -> None:
+    wt1 = WorkflowTemplate.model_validate(_minimal_template(name="wf1"))
+    wt2 = WorkflowTemplate.model_validate(_minimal_template(name="wf2"))
+    assert wt1.compute_hash() != wt2.compute_hash()
+
+
+# ---------------------------------------------------------------------------
+# load_workflow_yaml: file not found
+# ---------------------------------------------------------------------------
+
+
+def test_load_workflow_yaml_missing_file(tmp_path: Path) -> None:
+    with pytest.raises(FileNotFoundError):
+        load_workflow_yaml(tmp_path / "no.yaml")
+
+
+# ---------------------------------------------------------------------------
+# load_workflows_from_dir: duplicate detection + missing dir
+# ---------------------------------------------------------------------------
+
+
+def test_load_workflows_from_dir_duplicate_raises(tmp_path: Path) -> None:
+    import yaml
+
+    data = _minimal_template()
+    for fname in ("wf-a@1.yaml", "wf-b@1.yaml"):
+        (tmp_path / fname).write_text(yaml.dump(data), encoding="utf-8")
+
+    with pytest.raises(ValueError, match="duplicate workflow"):
+        load_workflows_from_dir(tmp_path)
+
+
+def test_load_workflows_from_dir_missing_dir() -> None:
+    result = load_workflows_from_dir(Path("/nonexistent_wf_dir_xyz"))
+    assert result == []
+
+
+# ---------------------------------------------------------------------------
+# Snapshot: seed hashes are stable
+# ---------------------------------------------------------------------------
+
+
+def test_spec_and_review_hash_prefix() -> None:
+    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
+    w = next(x for x in workflows if x.name == "spec-and-review")
+    assert w.compute_hash().startswith("1c94587647b16f0d")
+
+
+def test_bug_fix_hash_prefix() -> None:
+    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
+    w = next(x for x in workflows if x.name == "bug-fix-with-reproduction")
+    assert w.compute_hash().startswith("a137c9656f10e88a")
+
+
+def test_code_investigation_hash_prefix() -> None:
+    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
+    w = next(x for x in workflows if x.name == "code-investigation")
+    assert w.compute_hash().startswith("5b80ea2e248d5232")
+
+
+# ---------------------------------------------------------------------------
+# Step 2 patch: Counter-based duplicate role ids report is sorted
+# ---------------------------------------------------------------------------
+
+
+def test_workflow_duplicate_role_ids_reported_sorted() -> None:
+    """Multiple duplicated role ids must be reported in sorted order."""
+    with pytest.raises(ValidationError, match=r"duplicate role ids: \['a', 'b'\]"):
+        WorkflowTemplate.model_validate(
+            {
+                "name": "x",
+                "version": 1,
+                "roles": [
+                    {"id": "b", "required_capabilities": ["spec_write"]},
+                    {"id": "a", "required_capabilities": ["spec_write"]},
+                    {"id": "a", "required_capabilities": ["spec_write"]},
+                    {"id": "b", "required_capabilities": ["spec_write"]},
+                ],
+                "phases": [
+                    {
+                        "key": "x",
+                        "title": "x",
+                        "risk": "low",
+                        "role": "a",
+                        "instructions": "x" * 20,
+                    }
+                ],
+            }
+        )
+
+
+# ---------------------------------------------------------------------------
+# Deep immutability: nested list-valued fields are tuples (cannot be mutated)
+# ---------------------------------------------------------------------------
+
+
+def test_workflow_phases_immutable() -> None:
+    """phases is a tuple — .append() must raise AttributeError."""
+    wt = WorkflowTemplate.model_validate(_minimal_template())
+    with pytest.raises((AttributeError, TypeError)):
+        wt.phases.append(None)  # type: ignore[attr-defined]
+
+
+def test_workflow_roles_immutable() -> None:
+    """roles is a tuple — .append() must raise AttributeError."""
+    wt = WorkflowTemplate.model_validate(_minimal_template())
+    with pytest.raises((AttributeError, TypeError)):
+        wt.roles.append(None)  # type: ignore[attr-defined]
+
+
+def test_workflow_role_required_capabilities_immutable() -> None:
+    """required_capabilities is a tuple — .append() must raise AttributeError."""
+    from my_deepagent.workflow import WorkflowRole
+
+    role = WorkflowRole.model_validate(
+        {"id": "spec_writer", "required_capabilities": ["spec_write"]}
+    )
+    with pytest.raises((AttributeError, TypeError)):
+        role.required_capabilities.append(None)  # type: ignore[attr-defined]
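The workflow tests above exercise three small techniques: a snake_case key pattern, a sorted report of duplicated ids, and a deterministic 64-character hex hash. The sketch below shows one way each could be written. The regex, the helper names (`is_valid_key`, `duplicated`, `canonical_hash`), and the exact JSON canonicalization are assumptions for illustration; the real `WorkflowTemplate` validators and `compute_hash` are not shown in this diff and may differ.

```python
from __future__ import annotations

import hashlib
import json
import re
from collections import Counter
from collections.abc import Iterable
from typing import Any

# Assumed pattern: a lowercase letter first, then lowercase letters, digits, or "_".
KEY_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_key(key: str) -> bool:
    """Reject uppercase, hyphens, and leading digits."""
    return KEY_PATTERN.fullmatch(key) is not None


def duplicated(values: Iterable[str]) -> list[str]:
    """Return every value that occurs more than once, in sorted order."""
    counts = Counter(values)
    return sorted(value for value, n in counts.items() if n > 1)


def canonical_hash(payload: Any) -> str:
    """sha256 over JSON with sorted keys and fixed separators."""
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

Sorting the duplicates makes the error text independent of input order: formatting `duplicated(["b", "a", "a", "b"])` into an f-string yields `['a', 'b']`, which is the shape the `match=` regex in `test_workflow_duplicate_role_ids_reported_sorted` expects.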
--- a/my-deepagent/uv.lock
+++ b/my-deepagent/uv.lock
@@ -0,0 +1 @@
+"""CLI doctor command for environment diagnostics. Implemented in Step 12."""
@@ -0,0 +1 @@
+"""CLI interactive subcommand. Implemented in Step 10."""
@@ -0,0 +1 @@
+"""Typer CLI entry point. Filled in Step 6."""
@@ -0,0 +1 @@
+"""CLI run command implementation. Implemented in Step 6."""
@@ -0,0 +1 @@
+"""CLI seed command for importing persona/workflow YAML assets. Implemented in Step 6."""
@@ -0,0 +1 @@
+"""CLI stats command for usage summary. Implemented in Step 12."""
@@ -0,0 +1 @@
+"""LangGraph run engine orchestrator. Implemented in Step 7."""
@@ -0,0 +1 @@
+"""Interactive REPL loop for TUI sessions. Implemented in Step 10."""
@@ -0,0 +1 @@
+"""LangSmith tracing integration helpers. Implemented in Step 12."""
@@ -0,0 +1 @@
+"""Run statistics aggregation and reporting. Implemented in Step 12."""