feat(my-deepagent): v0.1.0 Step 0~5 — scaffolding through deepagent + OpenRouter
Python rewrite of the agent harness on top of deepagents 0.6.1 + langchain 1.x,
replacing the abandoned TS attempt in packages/. 388 unit/integration tests pass.

Steps
-----
0. Scaffolding: uv workspace, ruff/mypy/pre-commit/alembic, src/tests/docs trees
   with docs/schemas/ seeded from my-deepagent-seed/.
1. Core: config (pydantic-settings with MYDEEPAGENT_ env prefix and TOML source),
   enums (Backend, Capability, RiskLevel, ApprovalDecisionAction, ApprovalState,
   RunState, RunPhaseState, SessionState, ErrorClass), errors (MyDeepAgentError +
   BudgetExhaustedError with PEP-3134 cause + context suppression), hash
   (canonical JSON + sha256).
2. Persona/Workflow/Binding: pydantic v2 schemas with tuple-based deep
   immutability (post-construction hash drift prevented), YAML loaders,
   deterministic auto-select (preferred_backends → version → name → hash),
   override resolution with ineligibility diagnostics, PersonaConsentStore with
   fcntl.flock + tmp+fsync+rename atomic write.
3. Artifact schema registry: Draft202012Validator, multi-root resolution,
   structured ValidationFinding output.
4. Persistence: 18 SQLAlchemy 2.0 async ORM models with FK CASCADE/RESTRICT,
   WAL + busy_timeout + foreign_keys PRAGMA, alembic baseline +
   ux_active_run_repo_base partial unique index, LangGraph SqliteSaver as
   context manager only (lifecycle safety).
5. DeepAgent session: build_agent wires Persona → create_deep_agent with
   LocalShellBackend / FilesystemBackend / StateBackend / CompositeBackend,
   ChatOpenAI(base_url=openrouter) for openrouter: model strings, and 4
   middleware classes (cost / audit-tool / safety-shell / fallback-model).

Critical workarounds
--------------------
- deepagents 0.6.1 rejects FilesystemPermission together with backends that
  implement SandboxBackendProtocol (LocalShellBackend). SafetyShellMiddleware
  enforces destructive-command and secret-path policy at the tool layer
  instead, and build_agent strips the permissions kwarg when the persona's
  deepagents_backend is local_shell.
- FilesystemOperation in deepagents is Literal['read', 'write'] only;
  _map_operations collapses our richer schema (read/write/edit/ls) safely.

Real OpenRouter smoke
---------------------
test_openrouter_deepagents_local_shell_smoke calls DeepSeek via deepagents +
LocalShellBackend + SafetyShellMiddleware end-to-end. PASS, ~$0.000001 cost,
input=9 / output=1 tokens with content "OK".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
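The "canonical JSON + sha256" hash from Step 1 and the tuple-based immutability from Step 2 can be illustrated with a minimal sketch. This is not the project's code: `canonical_json` and `compute_hash_sketch` are hypothetical names, and the canonicalization rules (sorted keys, compact separators) are an assumption, since the commit page does not show the real implementation.

```python
import hashlib
import json
from typing import Any


def canonical_json(value: Any) -> str:
    """Serialize with sorted keys and compact separators (assumed rules)."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def compute_hash_sketch(value: Any) -> str:
    """Return the sha256 hex digest of the canonical JSON form."""
    return hashlib.sha256(canonical_json(value).encode("utf-8")).hexdigest()


# Key order does not affect the digest.
a = compute_hash_sketch({"name": "reviewer", "version": 1})
b = compute_hash_sketch({"version": 1, "name": "reviewer"})

# Tuples serialize like lists but cannot be mutated after construction,
# so a stored hash cannot silently drift from the data it was computed on.
backends = ("local_shell", "filesystem")
c = compute_hash_sketch({"preferred_backends": backends})
d = compute_hash_sketch({"preferred_backends": list(backends)})
```

Storing list-valued fields as tuples means any change requires building a new object, at which point the hash is recomputed rather than left stale.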
6
my-deepagent/.env.example
Normal file
@@ -0,0 +1,6 @@
MYDEEPAGENT_OPENROUTER_API_KEY=
# MYDEEPAGENT_LANGSMITH_TRACING=true
# MYDEEPAGENT_LANGSMITH_API_KEY=
# MYDEEPAGENT_LANGSMITH_PROJECT=my-deepagent
# MYDEEPAGENT_DATA_DIR=
# MYDEEPAGENT_LANG=ko
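To show how a `MYDEEPAGENT_` prefix isolates these variables from other tools, here is a stdlib-only sketch. The commit message says the project uses pydantic-settings for this; the helper below is a stand-in with a hypothetical name (`load_prefixed_settings`), and lowercasing the suffix is an assumption made for the example.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

ENV_PREFIX = "MYDEEPAGENT_"


def load_prefixed_settings(environ: Mapping[str, str] | None = None) -> dict[str, str]:
    """Collect variables that start with ENV_PREFIX, keyed by lowercased suffix."""
    source = os.environ if environ is None else environ
    return {
        key[len(ENV_PREFIX):].lower(): value
        for key, value in source.items()
        if key.startswith(ENV_PREFIX)
    }


# Example values only; an unprefixed variable is ignored.
example_env = {
    "MYDEEPAGENT_OPENROUTER_API_KEY": "sk-example",
    "MYDEEPAGENT_LANG": "ko",
    "OPENROUTER_API_KEY": "ignored-because-unprefixed",
}
settings = load_prefixed_settings(example_env)
```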
17
my-deepagent/.gitignore
vendored
Normal file
@@ -0,0 +1,17 @@
__pycache__/
*.py[cod]
*.egg-info/
.venv/
.pytest_cache/
.mypy_cache/
.ruff_cache/

.env
.env.local

*.db
*.db-journal
*.db-wal
*.db-shm

.DS_Store
19
my-deepagent/.pre-commit-config.yaml
Normal file
@@ -0,0 +1,19 @@
repos:
  - repo: local
    hooks:
      - id: ruff
        name: ruff check
        entry: uv run ruff check --fix
        language: system
        types: [python]
      - id: ruff-format
        name: ruff format
        entry: uv run ruff format
        language: system
        types: [python]
      - id: mypy
        name: mypy
        entry: uv run mypy --strict src
        language: system
        types: [python]
        pass_filenames: false
1
my-deepagent/.python-version
Normal file
@@ -0,0 +1 @@
3.12
26
my-deepagent/CHANGELOG.md
Normal file
@@ -0,0 +1,26 @@
# Changelog

## [Unreleased]

### Added
- persistence/models.py (P0-1): partial unique index `ux_active_run_repo_base` on `runs(repo_path, base_branch) WHERE state NOT IN ('completed','failed','aborted')`; prevents duplicate active runs per repo/branch
- persistence/models.py (P0-3): FK constraints added to `RunRow.template_id` (RESTRICT), `RunBindingRow.persona_id` (RESTRICT), `InteractiveSessionRow.persona_id` (RESTRICT), `RunEventRow.phase_id` (CASCADE), `ApprovalRequestRow.phase_id` (CASCADE), `ArtifactRow.phase_id` (CASCADE), `ToolCallRow.run_id/phase_id/interactive_session_id` (CASCADE), `LlmCallRow.run_id/phase_id/interactive_session_id` (CASCADE), `PhaseFeedbackRow.run_id/phase_id` (CASCADE)
- alembic/versions/839f2233e346: new migration adding partial unique index and all FK constraints above; uses SQLite table-rebuild pattern with PRAGMA foreign_keys=OFF/ON guard
- persistence/checkpointer.py (P0-4): removed `get_checkpointer` (leaking connection helper); only `get_checkpointer_ctx` context manager is now exported
- tests/integration/test_checkpointer.py: 5 tests for checkpointer ctx lifecycle (file creation, parent dir, connection cleanup, lock-free concurrent use)
- tests/integration/test_persistence.py: 7 new P0 verification tests (active-run partial index blocks/allows, cascade-delete of phase_feedback+run_phases, RESTRICT on template delete, index exists in sqlite_master)
- tests/unit/test_session.py: full rewrite to deepagents dataclass API: FilesystemPermission attribute access (.mode/.paths/.operations), build_backend type dispatch (5 cases), _map_operations deduplication (8 cases), _spec_to_permission mapping, updated _subagent_to_dict and _resolve_openrouter_api_key tests; 47 unit tests total
- tests/integration/test_openrouter_smoke.py: real OpenRouter/DeepSeek smoke test (3 tests, ~$0.001-$0.003/run, max_tokens=50); skipped automatically when no API key is configured; validates ChatOpenAI response, usage_metadata tokens, and deepagents CompiledStateGraph end-to-end
- pyproject.toml: registered `integration` pytest marker to silence --strict-markers error
- v0.1.0 scaffolding (Step 0): src/tests/docs trees, ruff/mypy/pre-commit/alembic config
- Seed assets copied to docs/schemas/ (personas/workflows/artifacts validated)
- Core module (Step 1): config, enums, errors, hash + unit tests
- Persona / Workflow / Binding module (Step 2): pydantic schemas, YAML loaders, deterministic auto-select, override, consent store with atomic write
- Step 1 review patches (P0/P1): exception chain context suppression, classmethod LSP fix, workspace_root realpath canonicalization, config_invalid error mapping

### Changed
- deepagents 0.6.1 LocalShellBackend + permissions conflict workaround: removed `permissions` block from all 10 seed personas; `SafetyShellMiddleware` now enforces destructive-command + secret-path policy at the tool layer for local_shell backend agents.
- `build_agent` automatically prepends `SafetyShellMiddleware` to every agent and skips `permissions` kwarg when `deepagents_backend == "local_shell"`.
- `SafetyShellMiddleware` extended with secret-path enforcement: `read_file`/`write_file`/`edit_file`/`ls` tool calls are blocked when `file_path`/`path` matches any `DENY_PATH_PATTERNS` glob (wcmatch GLOBSTAR|IGNORECASE|DOTGLOB).
- All env vars require `MYDEEPAGENT_` prefix (e.g. `MYDEEPAGENT_OPENROUTER_API_KEY`, `MYDEEPAGENT_BUDGET_DAILY_USD`). `.env.example` updated accordingly. This isolates my-deepagent's env namespace from other tools.
- Persona / Workflow / FilesystemPermission models now store list-valued fields as tuples (deep immutability: prevents post-construction mutation that would invalidate compute_hash()).
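The behavior of the `ux_active_run_repo_base` partial unique index described in the changelog can be reproduced with the standard-library `sqlite3` module. This is a reduced sketch: the `runs` table here has only four columns, and the `'running'` state value is an assumption (the actual `RunState` members are not listed on this page).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE runs ("
    " id TEXT PRIMARY KEY,"
    " repo_path TEXT NOT NULL,"
    " base_branch TEXT NOT NULL,"
    " state TEXT NOT NULL)"
)
# Uniqueness applies only to rows whose state is not terminal.
conn.execute(
    "CREATE UNIQUE INDEX ux_active_run_repo_base"
    " ON runs (repo_path, base_branch)"
    " WHERE state NOT IN ('completed', 'failed', 'aborted')"
)

conn.execute("INSERT INTO runs VALUES ('r1', '/repo', 'main', 'running')")

# A second active run for the same repo/branch violates the partial index.
try:
    conn.execute("INSERT INTO runs VALUES ('r2', '/repo', 'main', 'running')")
    duplicate_blocked = False
except sqlite3.IntegrityError:
    duplicate_blocked = True

# Once the first run reaches a terminal state it leaves the index,
# so a new active run for the same repo/branch is accepted.
conn.execute("UPDATE runs SET state = 'completed' WHERE id = 'r1'")
conn.execute("INSERT INTO runs VALUES ('r3', '/repo', 'main', 'running')")
active_count = conn.execute(
    "SELECT COUNT(*) FROM runs WHERE state = 'running'"
).fetchone()[0]
```

Enforcing the rule in the index rather than in application code means concurrent writers cannot both create an active run for the same repo/branch.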
149
my-deepagent/alembic.ini
Normal file
@@ -0,0 +1,149 @@
# A generic, single database configuration.

[alembic]
# path to migration scripts.
# this is typically a path given in POSIX (e.g. forward slashes)
# format, relative to the token %(here)s which refers to the location of this
# ini file
script_location = %(here)s/alembic

# template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
# Uncomment the line below if you want the files to be prepended with date and time
# see https://alembic.sqlalchemy.org/en/latest/tutorial.html#editing-the-ini-file
# for all available tokens
# file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s
# Or organize into date-based subdirectories (requires recursive_version_locations = true)
# file_template = %%(year)d/%%(month).2d/%%(day).2d_%%(hour).2d%%(minute).2d_%%(second).2d_%%(rev)s_%%(slug)s

# sys.path path, will be prepended to sys.path if present.
# defaults to the current working directory. for multiple paths, the path separator
# is defined by "path_separator" below.
prepend_sys_path = .


# timezone to use when rendering the date within the migration file
# as well as the filename.
# If specified, requires the tzdata library which can be installed by adding
# `alembic[tz]` to the pip requirements.
# string value is passed to ZoneInfo()
# leave blank for localtime
# timezone =

# max length of characters to apply to the "slug" field
# truncate_slug_length = 40

# set to 'true' to run the environment during
# the 'revision' command, regardless of autogenerate
# revision_environment = false

# set to 'true' to allow .pyc and .pyo files without
# a source .py file to be detected as revisions in the
# versions/ directory
# sourceless = false

# version location specification; This defaults
# to <script_location>/versions. When using multiple version
# directories, initial revisions must be specified with --version-path.
# The path separator used here should be the separator specified by "path_separator"
# below.
# version_locations = %(here)s/bar:%(here)s/bat:%(here)s/alembic/versions

# path_separator; This indicates what character is used to split lists of file
# paths, including version_locations and prepend_sys_path within configparser
# files such as alembic.ini.
# The default rendered in new alembic.ini files is "os", which uses os.pathsep
# to provide os-dependent path splitting.
#
# Note that in order to support legacy alembic.ini files, this default does NOT
# take place if path_separator is not present in alembic.ini. If this
# option is omitted entirely, fallback logic is as follows:
#
# 1. Parsing of the version_locations option falls back to using the legacy
# "version_path_separator" key, which if absent then falls back to the legacy
# behavior of splitting on spaces and/or commas.
# 2. Parsing of the prepend_sys_path option falls back to the legacy
# behavior of splitting on spaces, commas, or colons.
#
# Valid values for path_separator are:
#
# path_separator = :
# path_separator = ;
# path_separator = space
# path_separator = newline
#
# Use os.pathsep. Default configuration used for new projects.
path_separator = os

# set to 'true' to search source files recursively
# in each "version_locations" directory
# new in Alembic version 1.10
# recursive_version_locations = false

# the output encoding used when revision files
# are written from script.py.mako
# output_encoding = utf-8

# database URL. This is consumed by the user-maintained env.py script only.
# other means of configuring database URLs may be customized within the env.py
# file.
sqlalchemy.url = driver://user:pass@localhost/dbname


[post_write_hooks]
# post_write_hooks defines scripts or Python functions that are run
# on newly generated revision scripts. See the documentation for further
# detail and examples

# format using "black" - use the console_scripts runner, against the "black" entrypoint
# hooks = black
# black.type = console_scripts
# black.entrypoint = black
# black.options = -l 79 REVISION_SCRIPT_FILENAME

# lint with attempts to fix using "ruff" - use the module runner, against the "ruff" module
# hooks = ruff
# ruff.type = module
# ruff.module = ruff
# ruff.options = check --fix REVISION_SCRIPT_FILENAME

# Alternatively, use the exec runner to execute a binary found on your PATH
# hooks = ruff
# ruff.type = exec
# ruff.executable = ruff
# ruff.options = check --fix REVISION_SCRIPT_FILENAME

# Logging configuration. This is also consumed by the user-maintained
# env.py script only.
[loggers]
keys = root,sqlalchemy,alembic

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = WARNING
handlers = console
qualname =

[logger_sqlalchemy]
level = WARNING
handlers =
qualname = sqlalchemy.engine

[logger_alembic]
level = INFO
handlers =
qualname = alembic

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S
1
my-deepagent/alembic/README
Normal file
@@ -0,0 +1 @@
Generic single-database configuration.
83
my-deepagent/alembic/env.py
Normal file
@@ -0,0 +1,83 @@
import os
from logging.config import fileConfig

from sqlalchemy import engine_from_config, pool

from alembic import context

# this is the Alembic Config object, which provides
# access to the values within the .ini file in use.
config = context.config

# Load DATABASE_URL from environment, falling back to a local SQLite file.
# Alembic uses synchronous SQLAlchemy, so strip the async driver prefix when
# present (sqlite+aiosqlite:// → sqlite://).
_raw_url: str = os.environ.get("DATABASE_URL", "sqlite:///./database.sqlite3")
_sync_url: str = _raw_url.replace("sqlite+aiosqlite://", "sqlite://")
config.set_main_option("sqlalchemy.url", _sync_url)

# Interpret the config file for Python logging.
# This line sets up loggers basically.
if config.config_file_name is not None:
    fileConfig(config.config_file_name)

# add your model's MetaData object here
# for 'autogenerate' support
from my_deepagent.persistence.models import Base  # noqa: E402

target_metadata = Base.metadata

# other values from the config, defined by the needs of env.py,
# can be acquired:
# my_important_option = config.get_main_option("my_important_option")
# ... etc.


def run_migrations_offline() -> None:
    """Run migrations in 'offline' mode.

    This configures the context with just a URL
    and not an Engine, though an Engine is acceptable
    here as well. By skipping the Engine creation
    we don't even need a DBAPI to be available.

    Calls to context.execute() here emit the given string to the
    script output.

    """
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        dialect_opts={"paramstyle": "named"},
    )

    with context.begin_transaction():
        context.run_migrations()


def run_migrations_online() -> None:
    """Run migrations in 'online' mode.

    In this scenario we need to create an Engine
    and associate a connection with the context.

    """
    connectable = engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )

    with connectable.connect() as connection:
        context.configure(connection=connection, target_metadata=target_metadata)

        with context.begin_transaction():
            context.run_migrations()


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()
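The URL handling at the top of env.py can be checked in isolation. The sketch below wraps the same `str.replace` call in a helper; the name `to_sync_url` is hypothetical and does not appear in the commit.

```python
def to_sync_url(raw_url: str) -> str:
    """Swap the aiosqlite async driver prefix for the plain sqlite one."""
    return raw_url.replace("sqlite+aiosqlite://", "sqlite://")


# An async URL, as an application using aiosqlite might configure it.
async_url = "sqlite+aiosqlite:///./database.sqlite3"
sync_url = to_sync_url(async_url)

# A URL that is already synchronous passes through unchanged.
unchanged = to_sync_url("sqlite:///./database.sqlite3")
```

Because the replacement only targets the `sqlite+aiosqlite://` prefix, URLs for other drivers are left as they are.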
28
my-deepagent/alembic/script.py.mako
Normal file
@@ -0,0 +1,28 @@
"""${message}

Revision ID: ${up_revision}
Revises: ${down_revision | comma,n}
Create Date: ${create_date}

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa
${imports if imports else ""}

# revision identifiers, used by Alembic.
revision: str = ${repr(up_revision)}
down_revision: Union[str, Sequence[str], None] = ${repr(down_revision)}
branch_labels: Union[str, Sequence[str], None] = ${repr(branch_labels)}
depends_on: Union[str, Sequence[str], None] = ${repr(depends_on)}


def upgrade() -> None:
    """Upgrade schema."""
    ${upgrades if upgrades else "pass"}


def downgrade() -> None:
    """Downgrade schema."""
    ${downgrades if downgrades else "pass"}
@@ -0,0 +1,303 @@
|
|||||||
|
"""baseline schema for v0.1.0
|
||||||
|
|
||||||
|
Revision ID: 79945fdc2649
|
||||||
|
Revises:
|
||||||
|
Create Date: 2026-05-15 17:19:09.577439
|
||||||
|
|
||||||
|
"""
|
||||||
|
|
||||||
|
from collections.abc import Sequence
|
||||||
|
|
||||||
|
import sqlalchemy as sa
|
||||||
|
|
||||||
|
from alembic import op
|
||||||
|
|
||||||
|
# revision identifiers, used by Alembic.
|
||||||
|
revision: str = "79945fdc2649"
|
||||||
|
down_revision: str | Sequence[str] | None = None
|
||||||
|
branch_labels: str | Sequence[str] | None = None
|
||||||
|
depends_on: str | Sequence[str] | None = None
|
||||||
|
|
||||||
|
|
||||||
|
def upgrade() -> None:
|
||||||
|
"""Upgrade schema."""
|
||||||
|
# ### commands auto generated by Alembic - please adjust! ###
|
||||||
|
op.create_table(
|
||||||
|
"agent_personas",
|
||||||
|
sa.Column("id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("name", sa.Text(), nullable=False),
|
||||||
|
sa.Column("version", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("hash", sa.Text(), nullable=False),
|
||||||
|
sa.Column("definition", sa.JSON(), nullable=False),
|
||||||
|
sa.Column("created_at", sa.Text(), nullable=False),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
sa.UniqueConstraint("hash"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"budget_ledger",
|
||||||
|
sa.Column("scope", sa.Text(), nullable=False),
|
||||||
|
sa.Column("spent_usd", sa.Float(), nullable=False),
|
||||||
|
sa.Column("cap_usd", sa.Float(), nullable=True),
|
||||||
|
sa.Column("last_updated", sa.Text(), nullable=False),
|
||||||
|
sa.PrimaryKeyConstraint("scope"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"interactive_sessions",
|
||||||
|
sa.Column("id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("persona_id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("persona_hash", sa.Text(), nullable=False),
|
||||||
|
sa.Column("started_at", sa.Text(), nullable=True),
|
||||||
|
sa.Column("ended_at", sa.Text(), nullable=True),
|
||||||
|
sa.Column("last_message_at", sa.Text(), nullable=True),
|
||||||
|
sa.Column("state", sa.Text(), nullable=False),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"llm_calls",
|
||||||
|
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
|
||||||
|
sa.Column("run_id", sa.String(length=36), nullable=True),
|
||||||
|
sa.Column("phase_id", sa.String(length=36), nullable=True),
|
||||||
|
sa.Column("interactive_session_id", sa.String(length=36), nullable=True),
|
||||||
|
sa.Column("thread_id", sa.Text(), nullable=False),
|
||||||
|
sa.Column("persona_name", sa.Text(), nullable=False),
|
||||||
|
sa.Column("persona_version", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("model", sa.Text(), nullable=False),
|
||||||
|
sa.Column("role", sa.Text(), nullable=False),
|
||||||
|
sa.Column("turn_index", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("input_tokens", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("output_tokens", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("cached_tokens", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("reasoning_tokens", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("cost_usd_input", sa.Float(), nullable=False),
|
||||||
|
sa.Column("cost_usd_output", sa.Float(), nullable=False),
|
||||||
|
sa.Column("cost_usd_total", sa.Float(), nullable=False),
|
||||||
|
sa.Column("latency_ms", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("status", sa.Text(), nullable=False),
|
||||||
|
sa.Column("error_code", sa.Text(), nullable=True),
|
||||||
|
sa.Column("request_id", sa.Text(), nullable=True),
|
||||||
|
sa.Column("ts", sa.Text(), nullable=False),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
)
|
||||||
|
op.create_index(
|
||||||
|
"llm_calls_interactive_session_id_ts_idx",
|
||||||
|
"llm_calls",
|
||||||
|
["interactive_session_id", "ts"],
|
||||||
|
unique=False,
|
||||||
|
)
|
||||||
|
op.create_index("llm_calls_model_ts_idx", "llm_calls", ["model", "ts"], unique=False)
|
||||||
|
op.create_index("llm_calls_run_id_ts_idx", "llm_calls", ["run_id", "ts"], unique=False)
|
||||||
|
op.create_table(
|
||||||
|
"model_pricing",
|
||||||
|
sa.Column("model", sa.Text(), nullable=False),
|
||||||
|
sa.Column("input_per_1k_usd", sa.Float(), nullable=False),
|
||||||
|
sa.Column("output_per_1k_usd", sa.Float(), nullable=False),
|
||||||
|
sa.Column("context_length", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("fetched_at", sa.Text(), nullable=False),
|
||||||
|
sa.Column("raw_payload", sa.Text(), nullable=False),
|
||||||
|
sa.PrimaryKeyConstraint("model"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"persona_consents",
|
||||||
|
sa.Column("persona_hash", sa.Text(), nullable=False),
|
||||||
|
sa.Column("persona_name", sa.Text(), nullable=False),
|
||||||
|
sa.Column("persona_version", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("decision", sa.Text(), nullable=False),
|
||||||
|
sa.Column("decided_at", sa.Text(), nullable=False),
|
||||||
|
sa.PrimaryKeyConstraint("persona_hash"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"phase_feedback",
|
||||||
|
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
|
||||||
|
sa.Column("run_id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("phase_id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("reaction", sa.Text(), nullable=True),
|
||||||
|
sa.Column("comment", sa.Text(), nullable=True),
|
||||||
|
sa.Column("created_at", sa.Text(), nullable=False),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"runs",
|
||||||
|
sa.Column("id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("template_id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("template_hash", sa.Text(), nullable=False),
|
||||||
|
sa.Column("state", sa.Text(), nullable=False),
|
||||||
|
sa.Column("repo_path", sa.Text(), nullable=False),
|
||||||
|
sa.Column("base_branch", sa.Text(), nullable=False),
|
||||||
|
sa.Column("worktree_root", sa.Text(), nullable=False),
|
||||||
|
sa.Column("current_phase_id", sa.String(length=36), nullable=True),
|
||||||
|
sa.Column("started_at", sa.Text(), nullable=True),
|
||||||
|
sa.Column("ended_at", sa.Text(), nullable=True),
|
||||||
|
sa.Column("final_report_path", sa.Text(), nullable=True),
|
||||||
|
sa.Column("paused_from_state", sa.Text(), nullable=True),
|
||||||
|
sa.Column("created_at", sa.Text(), nullable=False),
|
||||||
|
sa.Column("updated_at", sa.Text(), nullable=False),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"tool_calls",
|
||||||
|
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
|
||||||
|
sa.Column("run_id", sa.String(length=36), nullable=True),
|
||||||
|
sa.Column("phase_id", sa.String(length=36), nullable=True),
|
||||||
|
sa.Column("interactive_session_id", sa.String(length=36), nullable=True),
|
||||||
|
sa.Column("tool_name", sa.Text(), nullable=False),
|
||||||
|
sa.Column("args", sa.JSON(), nullable=False),
|
||||||
|
sa.Column("result", sa.JSON(), nullable=True),
|
||||||
|
sa.Column("error", sa.Text(), nullable=True),
|
||||||
|
sa.Column("duration_ms", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("ts", sa.Text(), nullable=False),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
)
|
||||||
|
op.create_index("tool_calls_run_id_ts_idx", "tool_calls", ["run_id", "ts"], unique=False)
|
||||||
|
op.create_table(
|
||||||
|
"workflow_templates",
|
||||||
|
sa.Column("id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("name", sa.Text(), nullable=False),
|
||||||
|
sa.Column("version", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("hash", sa.Text(), nullable=False),
|
||||||
|
sa.Column("definition", sa.JSON(), nullable=False),
|
||||||
|
sa.Column("created_at", sa.Text(), nullable=False),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
sa.UniqueConstraint("hash"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"approval_requests",
|
||||||
|
sa.Column("id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("run_id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("phase_id", sa.String(length=36), nullable=True),
|
||||||
|
sa.Column("gate_key", sa.Text(), nullable=False),
|
||||||
|
sa.Column("state", sa.Text(), nullable=False),
|
||||||
|
sa.Column("idempotency_key", sa.Text(), nullable=False),
|
||||||
|
sa.Column("payload", sa.JSON(), nullable=False),
|
||||||
|
sa.Column("created_at", sa.Text(), nullable=False),
|
||||||
|
sa.Column("resolved_at", sa.Text(), nullable=True),
|
||||||
|
sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
sa.UniqueConstraint("idempotency_key"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"artifacts",
|
||||||
|
sa.Column("id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("run_id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("phase_id", sa.String(length=36), nullable=True),
|
||||||
|
sa.Column("path", sa.Text(), nullable=False),
|
||||||
|
sa.Column("schema_id", sa.Text(), nullable=False),
|
||||||
|
sa.Column("hash", sa.Text(), nullable=False),
|
||||||
|
sa.Column("valid", sa.Boolean(), nullable=False),
|
||||||
|
sa.Column("validation_error", sa.JSON(), nullable=True),
|
||||||
|
sa.Column("created_at", sa.Text(), nullable=False),
|
||||||
|
sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
sa.UniqueConstraint("run_id", "path", "hash", name="uq_artifacts_run_path_hash"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"run_bindings",
|
||||||
|
sa.Column("id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("run_id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("role_id", sa.Text(), nullable=False),
|
||||||
|
sa.Column("persona_id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("persona_hash", sa.Text(), nullable=False),
|
||||||
|
sa.Column("backend", sa.Text(), nullable=False),
|
||||||
|
sa.Column("binding_hash", sa.Text(), nullable=False),
|
||||||
|
sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
sa.UniqueConstraint("run_id", "role_id", name="uq_run_bindings_run_role"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"run_commands",
|
||||||
|
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
|
||||||
|
sa.Column("run_id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("command", sa.Text(), nullable=False),
|
||||||
|
sa.Column("payload", sa.JSON(), nullable=False),
|
||||||
|
sa.Column("idempotency_key", sa.Text(), nullable=False),
|
||||||
|
sa.Column("created_at", sa.Text(), nullable=False),
|
||||||
|
sa.Column("processed_at", sa.Text(), nullable=True),
|
||||||
|
sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
sa.UniqueConstraint("idempotency_key"),
|
||||||
|
)
|
||||||
|
op.create_table(
|
||||||
|
"run_events",
|
||||||
|
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
|
||||||
|
sa.Column("run_id", sa.String(length=36), nullable=False),
|
||||||
|
sa.Column("phase_id", sa.String(length=36), nullable=True),
|
||||||
|
sa.Column("seq", sa.Integer(), nullable=False),
|
||||||
|
sa.Column("type", sa.Text(), nullable=False),
|
||||||
|
sa.Column("payload", sa.JSON(), nullable=False),
|
||||||
|
sa.Column("idempotency_key", sa.Text(), nullable=False),
|
||||||
|
sa.Column("ts", sa.Text(), nullable=False),
|
||||||
|
sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
|
||||||
|
sa.PrimaryKeyConstraint("id"),
|
||||||
|
sa.UniqueConstraint("run_id", "idempotency_key", name="uq_run_events_run_idempotency"),
|
||||||
|
sa.UniqueConstraint("run_id", "seq", name="uq_run_events_run_seq"),
|
||||||
|
)
|
||||||
|
op.create_index("run_events_run_id_ts_idx", "run_events", ["run_id", "ts"], unique=False)
|
||||||
|
    op.create_table(
        "run_inputs",
        sa.Column("id", sa.String(length=36), nullable=False),
        sa.Column("run_id", sa.String(length=36), nullable=False),
        sa.Column("requirements_md", sa.Text(), nullable=False),
        sa.Column("objective", sa.JSON(), nullable=False),
        sa.Column("extra", sa.JSON(), nullable=False),
        sa.Column("input_hash", sa.Text(), nullable=False),
        sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
        sa.PrimaryKeyConstraint("id"),
        sa.UniqueConstraint("run_id"),
    )
    op.create_table(
        "run_phases",
        sa.Column("id", sa.String(length=36), nullable=False),
        sa.Column("run_id", sa.String(length=36), nullable=False),
        sa.Column("phase_key", sa.Text(), nullable=False),
        sa.Column("seq", sa.Integer(), nullable=False),
        sa.Column("state", sa.Text(), nullable=False),
        sa.Column("attempts", sa.Integer(), nullable=False),
        sa.Column("started_at", sa.Text(), nullable=True),
        sa.Column("ended_at", sa.Text(), nullable=True),
        sa.ForeignKeyConstraint(["run_id"], ["runs.id"], ondelete="CASCADE"),
        sa.PrimaryKeyConstraint("id"),
        sa.UniqueConstraint("run_id", "phase_key", name="uq_run_phases_run_phase"),
    )
    op.create_table(
        "approval_decisions",
        sa.Column("id", sa.String(length=36), nullable=False),
        sa.Column("approval_request_id", sa.String(length=36), nullable=False),
        sa.Column("action", sa.Text(), nullable=False),
        sa.Column("comment", sa.Text(), nullable=True),
        sa.Column("decided_at", sa.Text(), nullable=False),
        sa.Column("idempotency_key", sa.Text(), nullable=False),
        sa.ForeignKeyConstraint(
            ["approval_request_id"], ["approval_requests.id"], ondelete="CASCADE"
        ),
        sa.PrimaryKeyConstraint("id"),
        sa.UniqueConstraint("idempotency_key"),
    )
    # ### end Alembic commands ###


def downgrade() -> None:
    """Downgrade schema."""
    # ### commands auto generated by Alembic - please adjust! ###
    op.drop_table("approval_decisions")
    op.drop_table("run_phases")
    op.drop_table("run_inputs")
    op.drop_index("run_events_run_id_ts_idx", table_name="run_events")
    op.drop_table("run_events")
    op.drop_table("run_commands")
    op.drop_table("run_bindings")
    op.drop_table("artifacts")
    op.drop_table("approval_requests")
    op.drop_table("workflow_templates")
    op.drop_index("tool_calls_run_id_ts_idx", table_name="tool_calls")
    op.drop_table("tool_calls")
    op.drop_table("runs")
    op.drop_table("phase_feedback")
    op.drop_table("persona_consents")
    op.drop_table("model_pricing")
    op.drop_index("llm_calls_run_id_ts_idx", table_name="llm_calls")
    op.drop_index("llm_calls_model_ts_idx", table_name="llm_calls")
    op.drop_index("llm_calls_interactive_session_id_ts_idx", table_name="llm_calls")
    op.drop_table("llm_calls")
    op.drop_table("interactive_sessions")
    op.drop_table("budget_ledger")
    op.drop_table("agent_personas")
    # ### end Alembic commands ###
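The baseline `run_events` table carries two unique constraints, `(run_id, seq)` and `(run_id, idempotency_key)`. The following is a minimal sketch of how the second constraint could make event appends idempotent, using only the standard-library `sqlite3` module. The `append_event` helper and the trimmed-down table are hypothetical illustrations, not code from this commit, and the `ON CONFLICT` clause assumes SQLite 3.24 or newer.

```python
from __future__ import annotations

import sqlite3

# Hypothetical, trimmed-down copy of run_events (no FK, fewer columns).
SCHEMA = """
CREATE TABLE run_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    run_id TEXT NOT NULL,
    seq INTEGER NOT NULL,
    idempotency_key TEXT NOT NULL,
    UNIQUE (run_id, seq),
    UNIQUE (run_id, idempotency_key)
)
"""


def append_event(conn: sqlite3.Connection, run_id: str, seq: int, key: str) -> bool:
    """Insert an event; return False if this idempotency key was already stored."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO run_events (run_id, seq, idempotency_key) VALUES (?, ?, ?) "
            "ON CONFLICT (run_id, idempotency_key) DO NOTHING",
            (run_id, seq, key),
        )
    # rowcount is 0 when the conflict clause skipped the row.
    return cur.rowcount == 1


def demo() -> tuple[bool, bool, int]:
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    first = append_event(conn, "run-1", 1, "evt-a")
    # A retry with the same key is skipped, even though seq differs.
    second = append_event(conn, "run-1", 2, "evt-a")
    count = conn.execute("SELECT COUNT(*) FROM run_events").fetchone()[0]
    conn.close()
    return first, second, count
```

In this sketch a duplicate `seq` with a fresh key would still raise `sqlite3.IntegrityError`, because the conflict target names only the idempotency constraint.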
@@ -0,0 +1,638 @@
"""add active-run partial unique index and FK constraints

Revision ID: 839f2233e346
Revises: 79945fdc2649
Create Date: 2026-05-15 18:51:14.343577

Notes:
- P0-1: Adds partial unique index ux_active_run_repo_base on runs(repo_path, base_branch)
  WHERE state NOT IN ('completed', 'failed', 'aborted'). Alembic autogenerate
  cannot detect sqlite_where clauses, so this index is managed manually.
- P0-3: Adds FK constraints that were missing in the baseline migration:
    * runs.template_id -> workflow_templates.id RESTRICT
    * run_bindings.persona_id -> agent_personas.id RESTRICT
    * interactive_sessions.persona_id -> agent_personas.id RESTRICT
    * run_events.phase_id -> run_phases.id CASCADE
    * approval_requests.phase_id -> run_phases.id CASCADE
    * artifacts.phase_id -> run_phases.id CASCADE
    * tool_calls.run_id -> runs.id CASCADE
    * tool_calls.phase_id -> run_phases.id CASCADE
    * tool_calls.interactive_session_id -> interactive_sessions.id CASCADE
    * llm_calls.run_id -> runs.id CASCADE
    * llm_calls.phase_id -> run_phases.id CASCADE
    * llm_calls.interactive_session_id -> interactive_sessions.id CASCADE
    * phase_feedback.run_id -> runs.id CASCADE
    * phase_feedback.phase_id -> run_phases.id CASCADE
- runs.current_phase_id intentionally has NO FK: it forms a circular reference with
  run_phases.run_id. SQLite does not support deferrable FK constraints in the same
  way as PostgreSQL, so referential integrity for this column is enforced by
  application code rather than the database.
- SQLite does not support ADD CONSTRAINT via ALTER TABLE. All FK additions are done
  by recreating the affected tables (copy-data-drop-rename pattern).
"""

from __future__ import annotations

from collections.abc import Sequence

from alembic import op

# revision identifiers, used by Alembic.
revision: str = "839f2233e346"
down_revision: str | Sequence[str] | None = "79945fdc2649"
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None


def upgrade() -> None:
    """Upgrade schema.

    SQLite does not support ALTER TABLE ... ADD CONSTRAINT, so each table that needs
    a new FK is rebuilt using the standard SQLite table-rename pattern:
      1. Disable FK enforcement during rebuild (PRAGMA foreign_keys=OFF).
      2. Create new table with correct FK constraints.
      3. Copy data from old table.
      4. Drop old table.
      5. Rename new table to original name.
      6. Re-enable FK enforcement (PRAGMA foreign_keys=ON).

    Steps 1 and 6 run once, around the whole migration; steps 2-5 repeat per table.
    Indexes on each rebuilt table are dropped and recreated; unique constraints
    are declared inline in the new table definition.
    """
    # Disable FK enforcement during the table rebuilds to avoid constraint violations
    # while the tables are being dropped and recreated.
    op.execute("PRAGMA foreign_keys=OFF")

    # ------------------------------------------------------------------
    # runs: add template_id FK (RESTRICT) + P0-1 partial unique index.
    # Rebuild because SQLite cannot ADD CONSTRAINT.
    # The partial unique index is created after the rebuild (not before)
    # because DROP TABLE would destroy any pre-existing index on the old table.
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE runs_new (
            id TEXT NOT NULL,
            template_id TEXT NOT NULL
                REFERENCES workflow_templates (id) ON DELETE RESTRICT,
            template_hash TEXT NOT NULL,
            state TEXT NOT NULL,
            repo_path TEXT NOT NULL,
            base_branch TEXT NOT NULL,
            worktree_root TEXT NOT NULL,
            current_phase_id TEXT,
            started_at TEXT,
            ended_at TEXT,
            final_report_path TEXT,
            paused_from_state TEXT,
            created_at TEXT NOT NULL,
            updated_at TEXT NOT NULL,
            PRIMARY KEY (id)
        )
        """
    )
    op.execute(
        "INSERT INTO runs_new SELECT id, template_id, template_hash, state, "
        "repo_path, base_branch, worktree_root, current_phase_id, "
        "started_at, ended_at, final_report_path, paused_from_state, "
        "created_at, updated_at FROM runs"
    )
    op.execute("DROP TABLE runs")
    op.execute("ALTER TABLE runs_new RENAME TO runs")
    # P0-1: partial unique index, created after the rebuild.
    op.execute(
        "CREATE UNIQUE INDEX ux_active_run_repo_base "
        "ON runs (repo_path, base_branch) "
        "WHERE state NOT IN ('completed', 'failed', 'aborted')"
    )

    # ------------------------------------------------------------------
    # run_bindings: add persona_id FK (RESTRICT)
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE run_bindings_new (
            id TEXT NOT NULL,
            run_id TEXT NOT NULL
                REFERENCES runs (id) ON DELETE CASCADE,
            role_id TEXT NOT NULL,
            persona_id TEXT NOT NULL
                REFERENCES agent_personas (id) ON DELETE RESTRICT,
            persona_hash TEXT NOT NULL,
            backend TEXT NOT NULL,
            binding_hash TEXT NOT NULL,
            PRIMARY KEY (id),
            UNIQUE (run_id, role_id)
        )
        """
    )
    op.execute(
        "INSERT INTO run_bindings_new SELECT id, run_id, role_id, persona_id, "
        "persona_hash, backend, binding_hash FROM run_bindings"
    )
    op.execute("DROP TABLE run_bindings")
    op.execute("ALTER TABLE run_bindings_new RENAME TO run_bindings")

    # ------------------------------------------------------------------
    # interactive_sessions: add persona_id FK (RESTRICT)
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE interactive_sessions_new (
            id TEXT NOT NULL,
            persona_id TEXT NOT NULL
                REFERENCES agent_personas (id) ON DELETE RESTRICT,
            persona_hash TEXT NOT NULL,
            started_at TEXT,
            ended_at TEXT,
            last_message_at TEXT,
            state TEXT NOT NULL,
            PRIMARY KEY (id)
        )
        """
    )
    op.execute(
        "INSERT INTO interactive_sessions_new SELECT id, persona_id, persona_hash, "
        "started_at, ended_at, last_message_at, state FROM interactive_sessions"
    )
    op.execute("DROP TABLE interactive_sessions")
    op.execute("ALTER TABLE interactive_sessions_new RENAME TO interactive_sessions")

    # ------------------------------------------------------------------
    # run_events: add phase_id FK (CASCADE)
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE run_events_new (
            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
            run_id TEXT NOT NULL
                REFERENCES runs (id) ON DELETE CASCADE,
            phase_id TEXT
                REFERENCES run_phases (id) ON DELETE CASCADE,
            seq INTEGER NOT NULL,
            type TEXT NOT NULL,
            payload JSON NOT NULL,
            idempotency_key TEXT NOT NULL,
            ts TEXT NOT NULL,
            UNIQUE (run_id, seq),
            UNIQUE (run_id, idempotency_key)
        )
        """
    )
    op.execute(
        "INSERT INTO run_events_new SELECT id, run_id, phase_id, seq, type, "
        "payload, idempotency_key, ts FROM run_events"
    )
    op.execute("DROP INDEX IF EXISTS run_events_run_id_ts_idx")
    op.execute("DROP TABLE run_events")
    op.execute("ALTER TABLE run_events_new RENAME TO run_events")
    op.execute("CREATE INDEX run_events_run_id_ts_idx ON run_events (run_id, ts)")

    # ------------------------------------------------------------------
    # approval_requests: add phase_id FK (CASCADE)
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE approval_requests_new (
            id TEXT NOT NULL,
            run_id TEXT NOT NULL
                REFERENCES runs (id) ON DELETE CASCADE,
            phase_id TEXT
                REFERENCES run_phases (id) ON DELETE CASCADE,
            gate_key TEXT NOT NULL,
            state TEXT NOT NULL,
            idempotency_key TEXT NOT NULL,
            payload JSON NOT NULL,
            created_at TEXT NOT NULL,
            resolved_at TEXT,
            PRIMARY KEY (id),
            UNIQUE (idempotency_key)
        )
        """
    )
    op.execute(
        "INSERT INTO approval_requests_new SELECT id, run_id, phase_id, gate_key, "
        "state, idempotency_key, payload, created_at, resolved_at FROM approval_requests"
    )
    op.execute("DROP TABLE approval_requests")
    op.execute("ALTER TABLE approval_requests_new RENAME TO approval_requests")

    # ------------------------------------------------------------------
    # artifacts: add phase_id FK (CASCADE)
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE artifacts_new (
            id TEXT NOT NULL,
            run_id TEXT NOT NULL
                REFERENCES runs (id) ON DELETE CASCADE,
            phase_id TEXT
                REFERENCES run_phases (id) ON DELETE CASCADE,
            path TEXT NOT NULL,
            schema_id TEXT NOT NULL,
            hash TEXT NOT NULL,
            valid INTEGER NOT NULL,
            validation_error JSON,
            created_at TEXT NOT NULL,
            PRIMARY KEY (id),
            UNIQUE (run_id, path, hash)
        )
        """
    )
    op.execute(
        "INSERT INTO artifacts_new SELECT id, run_id, phase_id, path, schema_id, "
        "hash, valid, validation_error, created_at FROM artifacts"
    )
    op.execute("DROP TABLE artifacts")
    op.execute("ALTER TABLE artifacts_new RENAME TO artifacts")

    # ------------------------------------------------------------------
    # tool_calls: add run_id / phase_id / interactive_session_id FKs (CASCADE)
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE tool_calls_new (
            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
            run_id TEXT
                REFERENCES runs (id) ON DELETE CASCADE,
            phase_id TEXT
                REFERENCES run_phases (id) ON DELETE CASCADE,
            interactive_session_id TEXT
                REFERENCES interactive_sessions (id) ON DELETE CASCADE,
            tool_name TEXT NOT NULL,
            args JSON NOT NULL,
            result JSON,
            error TEXT,
            duration_ms INTEGER NOT NULL,
            ts TEXT NOT NULL
        )
        """
    )
    op.execute(
        "INSERT INTO tool_calls_new SELECT id, run_id, phase_id, interactive_session_id, "
        "tool_name, args, result, error, duration_ms, ts FROM tool_calls"
    )
    op.execute("DROP INDEX IF EXISTS tool_calls_run_id_ts_idx")
    op.execute("DROP TABLE tool_calls")
    op.execute("ALTER TABLE tool_calls_new RENAME TO tool_calls")
    op.execute("CREATE INDEX tool_calls_run_id_ts_idx ON tool_calls (run_id, ts)")

    # ------------------------------------------------------------------
    # llm_calls: add run_id / phase_id / interactive_session_id FKs (CASCADE)
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE llm_calls_new (
            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
            run_id TEXT
                REFERENCES runs (id) ON DELETE CASCADE,
            phase_id TEXT
                REFERENCES run_phases (id) ON DELETE CASCADE,
            interactive_session_id TEXT
                REFERENCES interactive_sessions (id) ON DELETE CASCADE,
            thread_id TEXT NOT NULL,
            persona_name TEXT NOT NULL,
            persona_version INTEGER NOT NULL,
            model TEXT NOT NULL,
            role TEXT NOT NULL,
            turn_index INTEGER NOT NULL,
            input_tokens INTEGER NOT NULL,
            output_tokens INTEGER NOT NULL,
            cached_tokens INTEGER NOT NULL,
            reasoning_tokens INTEGER NOT NULL,
            cost_usd_input REAL NOT NULL,
            cost_usd_output REAL NOT NULL,
            cost_usd_total REAL NOT NULL,
            latency_ms INTEGER NOT NULL,
            status TEXT NOT NULL,
            error_code TEXT,
            request_id TEXT,
            ts TEXT NOT NULL
        )
        """
    )
    op.execute(
        "INSERT INTO llm_calls_new SELECT id, run_id, phase_id, interactive_session_id, "
        "thread_id, persona_name, persona_version, model, role, turn_index, "
        "input_tokens, output_tokens, cached_tokens, reasoning_tokens, "
        "cost_usd_input, cost_usd_output, cost_usd_total, latency_ms, status, "
        "error_code, request_id, ts FROM llm_calls"
    )
    op.execute("DROP INDEX IF EXISTS llm_calls_run_id_ts_idx")
    op.execute("DROP INDEX IF EXISTS llm_calls_interactive_session_id_ts_idx")
    op.execute("DROP INDEX IF EXISTS llm_calls_model_ts_idx")
    op.execute("DROP TABLE llm_calls")
    op.execute("ALTER TABLE llm_calls_new RENAME TO llm_calls")
    op.execute("CREATE INDEX llm_calls_run_id_ts_idx ON llm_calls (run_id, ts)")
    op.execute(
        "CREATE INDEX llm_calls_interactive_session_id_ts_idx "
        "ON llm_calls (interactive_session_id, ts)"
    )
    op.execute("CREATE INDEX llm_calls_model_ts_idx ON llm_calls (model, ts)")

    # ------------------------------------------------------------------
    # phase_feedback: add run_id / phase_id FKs (CASCADE)
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE phase_feedback_new (
            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
            run_id TEXT NOT NULL
                REFERENCES runs (id) ON DELETE CASCADE,
            phase_id TEXT NOT NULL
                REFERENCES run_phases (id) ON DELETE CASCADE,
            reaction TEXT,
            comment TEXT,
            created_at TEXT NOT NULL
        )
        """
    )
    op.execute(
        "INSERT INTO phase_feedback_new SELECT id, run_id, phase_id, "
        "reaction, comment, created_at FROM phase_feedback"
    )
    op.execute("DROP TABLE phase_feedback")
    op.execute("ALTER TABLE phase_feedback_new RENAME TO phase_feedback")

    # Re-enable FK enforcement now that all tables have been rebuilt.
    op.execute("PRAGMA foreign_keys=ON")
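The partial unique index created in `upgrade()` allows at most one non-terminal run per `(repo_path, base_branch)` pair while leaving finished runs unconstrained. The sketch below exercises the same `CREATE UNIQUE INDEX` statement against a trimmed-down, hypothetical `runs` table with the standard-library `sqlite3` module. The state names `running` and `pending` are assumptions for illustration; only the three terminal states appear in the migration.

```python
from __future__ import annotations

import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory DB with a reduced runs table and the partial index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE runs (id TEXT PRIMARY KEY, state TEXT NOT NULL, "
        "repo_path TEXT NOT NULL, base_branch TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX ux_active_run_repo_base "
        "ON runs (repo_path, base_branch) "
        "WHERE state NOT IN ('completed', 'failed', 'aborted')"
    )
    return conn


def try_insert(conn: sqlite3.Connection, run_id: str, state: str) -> bool:
    """Insert a run for the same repo/branch; False if the index rejects it."""
    try:
        conn.execute(
            "INSERT INTO runs VALUES (?, ?, '/repo', 'main')", (run_id, state)
        )
    except sqlite3.IntegrityError:
        return False
    return True


def demo() -> list[bool]:
    conn = make_db()
    results = [
        try_insert(conn, "r1", "completed"),  # terminal rows are not indexed
        try_insert(conn, "r2", "failed"),     # so duplicates are allowed
        try_insert(conn, "r3", "running"),    # first active run is accepted
        try_insert(conn, "r4", "pending"),    # second active run is rejected
    ]
    conn.close()
    return results
```

Because the predicate is evaluated per row, moving a run into a terminal state removes it from the index and frees the pair for a new run.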


def downgrade() -> None:
    """Downgrade schema.

    Reverses all FK additions and drops the partial unique index.
    Rebuilt tables are recreated without FK constraints on the affected
    columns. This is not a byte-for-byte restore of the baseline DDL: the
    rebuilt tables declare unnamed UNIQUE constraints and TEXT columns where
    the baseline used named constraints (e.g. uq_run_events_run_seq) and
    sa.String(length=36).
    """
    op.execute("PRAGMA foreign_keys=OFF")

    # ------------------------------------------------------------------
    # Revert phase_feedback
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE phase_feedback_old (
            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
            run_id TEXT NOT NULL,
            phase_id TEXT NOT NULL,
            reaction TEXT,
            comment TEXT,
            created_at TEXT NOT NULL
        )
        """
    )
    op.execute(
        "INSERT INTO phase_feedback_old SELECT id, run_id, phase_id, "
        "reaction, comment, created_at FROM phase_feedback"
    )
    op.execute("DROP TABLE phase_feedback")
    op.execute("ALTER TABLE phase_feedback_old RENAME TO phase_feedback")

    # ------------------------------------------------------------------
    # Revert llm_calls
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE llm_calls_old (
            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
            run_id TEXT,
            phase_id TEXT,
            interactive_session_id TEXT,
            thread_id TEXT NOT NULL,
            persona_name TEXT NOT NULL,
            persona_version INTEGER NOT NULL,
            model TEXT NOT NULL,
            role TEXT NOT NULL,
            turn_index INTEGER NOT NULL,
            input_tokens INTEGER NOT NULL,
            output_tokens INTEGER NOT NULL,
            cached_tokens INTEGER NOT NULL,
            reasoning_tokens INTEGER NOT NULL,
            cost_usd_input REAL NOT NULL,
            cost_usd_output REAL NOT NULL,
            cost_usd_total REAL NOT NULL,
            latency_ms INTEGER NOT NULL,
            status TEXT NOT NULL,
            error_code TEXT,
            request_id TEXT,
            ts TEXT NOT NULL
        )
        """
    )
    op.execute(
        "INSERT INTO llm_calls_old SELECT id, run_id, phase_id, interactive_session_id, "
        "thread_id, persona_name, persona_version, model, role, turn_index, "
        "input_tokens, output_tokens, cached_tokens, reasoning_tokens, "
        "cost_usd_input, cost_usd_output, cost_usd_total, latency_ms, status, "
        "error_code, request_id, ts FROM llm_calls"
    )
    op.execute("DROP INDEX IF EXISTS llm_calls_run_id_ts_idx")
    op.execute("DROP INDEX IF EXISTS llm_calls_interactive_session_id_ts_idx")
    op.execute("DROP INDEX IF EXISTS llm_calls_model_ts_idx")
    op.execute("DROP TABLE llm_calls")
    op.execute("ALTER TABLE llm_calls_old RENAME TO llm_calls")
    op.execute("CREATE INDEX llm_calls_run_id_ts_idx ON llm_calls (run_id, ts)")
    op.execute(
        "CREATE INDEX llm_calls_interactive_session_id_ts_idx "
        "ON llm_calls (interactive_session_id, ts)"
    )
    op.execute("CREATE INDEX llm_calls_model_ts_idx ON llm_calls (model, ts)")

    # ------------------------------------------------------------------
    # Revert tool_calls
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE tool_calls_old (
            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
            run_id TEXT,
            phase_id TEXT,
            interactive_session_id TEXT,
            tool_name TEXT NOT NULL,
            args JSON NOT NULL,
            result JSON,
            error TEXT,
            duration_ms INTEGER NOT NULL,
            ts TEXT NOT NULL
        )
        """
    )
    op.execute(
        "INSERT INTO tool_calls_old SELECT id, run_id, phase_id, interactive_session_id, "
        "tool_name, args, result, error, duration_ms, ts FROM tool_calls"
    )
    op.execute("DROP INDEX IF EXISTS tool_calls_run_id_ts_idx")
    op.execute("DROP TABLE tool_calls")
    op.execute("ALTER TABLE tool_calls_old RENAME TO tool_calls")
    op.execute("CREATE INDEX tool_calls_run_id_ts_idx ON tool_calls (run_id, ts)")

    # ------------------------------------------------------------------
    # Revert artifacts
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE artifacts_old (
            id TEXT NOT NULL,
            run_id TEXT NOT NULL
                REFERENCES runs (id) ON DELETE CASCADE,
            phase_id TEXT,
            path TEXT NOT NULL,
            schema_id TEXT NOT NULL,
            hash TEXT NOT NULL,
            valid INTEGER NOT NULL,
            validation_error JSON,
            created_at TEXT NOT NULL,
            PRIMARY KEY (id),
            UNIQUE (run_id, path, hash)
        )
        """
    )
    op.execute(
        "INSERT INTO artifacts_old SELECT id, run_id, phase_id, path, schema_id, "
        "hash, valid, validation_error, created_at FROM artifacts"
    )
    op.execute("DROP TABLE artifacts")
    op.execute("ALTER TABLE artifacts_old RENAME TO artifacts")

    # ------------------------------------------------------------------
    # Revert approval_requests
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE approval_requests_old (
            id TEXT NOT NULL,
            run_id TEXT NOT NULL
                REFERENCES runs (id) ON DELETE CASCADE,
            phase_id TEXT,
            gate_key TEXT NOT NULL,
            state TEXT NOT NULL,
            idempotency_key TEXT NOT NULL,
            payload JSON NOT NULL,
            created_at TEXT NOT NULL,
            resolved_at TEXT,
            PRIMARY KEY (id),
            UNIQUE (idempotency_key)
        )
        """
    )
    op.execute(
        "INSERT INTO approval_requests_old SELECT id, run_id, phase_id, gate_key, "
        "state, idempotency_key, payload, created_at, resolved_at FROM approval_requests"
    )
    op.execute("DROP TABLE approval_requests")
    op.execute("ALTER TABLE approval_requests_old RENAME TO approval_requests")

    # ------------------------------------------------------------------
    # Revert run_events
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE run_events_old (
            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
            run_id TEXT NOT NULL
                REFERENCES runs (id) ON DELETE CASCADE,
            phase_id TEXT,
            seq INTEGER NOT NULL,
            type TEXT NOT NULL,
            payload JSON NOT NULL,
            idempotency_key TEXT NOT NULL,
            ts TEXT NOT NULL,
            UNIQUE (run_id, seq),
            UNIQUE (run_id, idempotency_key)
        )
        """
    )
    op.execute(
        "INSERT INTO run_events_old SELECT id, run_id, phase_id, seq, type, "
        "payload, idempotency_key, ts FROM run_events"
    )
    op.execute("DROP INDEX IF EXISTS run_events_run_id_ts_idx")
    op.execute("DROP TABLE run_events")
    op.execute("ALTER TABLE run_events_old RENAME TO run_events")
    op.execute("CREATE INDEX run_events_run_id_ts_idx ON run_events (run_id, ts)")

    # ------------------------------------------------------------------
    # Revert interactive_sessions
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE interactive_sessions_old (
            id TEXT NOT NULL,
            persona_id TEXT NOT NULL,
            persona_hash TEXT NOT NULL,
            started_at TEXT,
            ended_at TEXT,
            last_message_at TEXT,
            state TEXT NOT NULL,
            PRIMARY KEY (id)
        )
        """
    )
    op.execute(
        "INSERT INTO interactive_sessions_old SELECT id, persona_id, persona_hash, "
        "started_at, ended_at, last_message_at, state FROM interactive_sessions"
    )
    op.execute("DROP TABLE interactive_sessions")
    op.execute("ALTER TABLE interactive_sessions_old RENAME TO interactive_sessions")

    # ------------------------------------------------------------------
    # Revert run_bindings
    # ------------------------------------------------------------------
    op.execute(
        """
        CREATE TABLE run_bindings_old (
            id TEXT NOT NULL,
            run_id TEXT NOT NULL
                REFERENCES runs (id) ON DELETE CASCADE,
            role_id TEXT NOT NULL,
            persona_id TEXT NOT NULL,
            persona_hash TEXT NOT NULL,
            backend TEXT NOT NULL,
            binding_hash TEXT NOT NULL,
            PRIMARY KEY (id),
            UNIQUE (run_id, role_id)
        )
        """
    )
    op.execute(
        "INSERT INTO run_bindings_old SELECT id, run_id, role_id, persona_id, "
        "persona_hash, backend, binding_hash FROM run_bindings"
    )
    op.execute("DROP TABLE run_bindings")
    op.execute("ALTER TABLE run_bindings_old RENAME TO run_bindings")

    # ------------------------------------------------------------------
    # Revert runs (remove template_id FK)
    # ------------------------------------------------------------------
    op.execute("DROP INDEX IF EXISTS ux_active_run_repo_base")
    op.execute(
        """
        CREATE TABLE runs_old (
            id TEXT NOT NULL,
            template_id TEXT NOT NULL,
            template_hash TEXT NOT NULL,
            state TEXT NOT NULL,
            repo_path TEXT NOT NULL,
            base_branch TEXT NOT NULL,
            worktree_root TEXT NOT NULL,
            current_phase_id TEXT,
            started_at TEXT,
            ended_at TEXT,
            final_report_path TEXT,
            paused_from_state TEXT,
            created_at TEXT NOT NULL,
            updated_at TEXT NOT NULL,
            PRIMARY KEY (id)
        )
        """
    )
    op.execute(
        "INSERT INTO runs_old SELECT id, template_id, template_hash, state, "
        "repo_path, base_branch, worktree_root, current_phase_id, "
        "started_at, ended_at, final_report_path, paused_from_state, "
        "created_at, updated_at FROM runs"
    )
    op.execute("DROP TABLE runs")
    op.execute("ALTER TABLE runs_old RENAME TO runs")

    op.execute("PRAGMA foreign_keys=ON")
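Both `upgrade()` and `downgrade()` bracket their rebuilds with `PRAGMA foreign_keys=OFF` / `ON`. SQLite documents this pragma as a no-op while a transaction is open, so the toggles only take effect if they execute outside one. Whether that holds for this migration depends on how the Alembic environment and the driver manage transactions, which is not shown here. The sketch below demonstrates only the SQLite behaviour itself, using the standard-library `sqlite3` module; `fk_flag` and `demo` are hypothetical helper names.

```python
from __future__ import annotations

import sqlite3


def fk_flag(conn: sqlite3.Connection) -> int:
    """Return 1 if FK enforcement is on for this connection, else 0."""
    return conn.execute("PRAGMA foreign_keys").fetchone()[0]


def demo() -> tuple[int, int]:
    # isolation_level=None: the sqlite3 module issues no implicit BEGIN,
    # so transaction boundaries below are exactly the ones written out.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys=ON")

    conn.execute("BEGIN")
    conn.execute("PRAGMA foreign_keys=OFF")  # silently ignored inside a transaction
    inside = fk_flag(conn)
    conn.execute("COMMIT")

    conn.execute("PRAGMA foreign_keys=OFF")  # takes effect outside a transaction
    outside = fk_flag(conn)
    conn.close()
    return inside, outside
```

SQLite's documented table-rebuild procedure also suggests running `PRAGMA foreign_key_check` before committing; a migration like this one could use it to confirm that the copied rows satisfy the newly added constraints.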
BIN
my-deepagent/database.sqlite3
Normal file
Binary file not shown.
0
my-deepagent/docs/adr/.gitkeep
Normal file
114
my-deepagent/docs/schemas/artifacts/common/final-report@1.json
Normal file
@@ -0,0 +1,114 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "common/final-report@1",
  "title": "Common Final Report",
  "description": "Final report for a workflow run",
  "type": "object",
  "required": ["runId", "templateHash", "status", "phases", "endedAt"],
  "additionalProperties": false,
  "properties": {
    "runId": {
      "type": "string",
      "format": "uuid",
      "description": "Unique run identifier (UUID)"
    },
    "templateHash": {
      "type": "string",
      "pattern": "^[a-f0-9]{64}$",
      "description": "sha256 hash of the workflow template (hex)"
    },
    "status": {
      "type": "string",
      "enum": ["completed", "failed", "aborted"],
      "description": "Final run status"
    },
    "inputs": {
      "type": "object",
      "description": "Run inputs (optional)"
    },
    "phases": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["key", "state"],
        "additionalProperties": false,
        "properties": {
          "key": {
            "type": "string",
            "description": "Phase key"
          },
          "state": {
            "type": "string",
            "enum": ["pending", "running", "completed", "failed", "skipped"],
            "description": "Phase execution state"
          },
          "started_at": {
            "type": "string",
            "format": "date-time",
            "description": "Start time (optional)"
          },
          "ended_at": {
            "type": "string",
            "format": "date-time",
            "description": "End time (optional)"
          },
          "attempts": {
            "type": "integer",
            "minimum": 0,
            "description": "Number of attempts (optional)"
          }
        }
      },
      "description": "Execution record for each phase"
    },
    "approvals": {
      "type": "array",
      "items": {
        "type": "object"
      },
      "description": "List of approval records (optional)"
    },
    "findings": {
      "type": "array",
      "items": {
        "type": "object"
      },
      "description": "List of collected findings (optional)"
    },
    "artifacts": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["path", "schema"],
        "additionalProperties": false,
        "properties": {
          "path": {
            "type": "string",
            "description": "Artifact file path"
          },
          "schema": {
            "type": "string",
            "description": "Artifact JSON Schema ID"
          },
          "hash": {
            "type": "string",
            "description": "Artifact file hash (optional)"
          }
        }
      },
      "description": "List of generated artifacts (optional)"
    },
    "unresolved": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "List of unresolved items (optional)"
    },
    "endedAt": {
      "type": "string",
      "format": "date-time",
      "description": "Run end time"
    }
  }
}
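The `templateHash` pattern in this schema accepts only a 64-character lowercase hex string, which is the shape of a sha256 hex digest. A minimal sketch of producing such a value, assuming the canonical form is JSON with sorted keys and compact separators (the commit message mentions "canonical JSON + sha256" but does not spell out the canonicalization); `canonical_hash` is a hypothetical helper name, not the project's API:

```python
import hashlib
import json
import re

# Pattern copied from the schema's templateHash property.
TEMPLATE_HASH_RE = re.compile(r"^[a-f0-9]{64}$")


def canonical_hash(obj: object) -> str:
    """Hash a JSON-serializable object independently of key order (assumed canonical form)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two dicts with the same content but different key order hash identically.
template_a = {"name": "bug-fix-with-reproduction", "version": 1}
template_b = {"version": 1, "name": "bug-fix-with-reproduction"}
digest = canonical_hash(template_a)
```

Sorting keys before hashing is what makes the digest stable across loaders that build dictionaries in different orders.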
80
my-deepagent/docs/schemas/artifacts/dev/phase-plan@1.json
Normal file
@@ -0,0 +1,80 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "dev/phase-plan@1",
  "title": "Dev Phase Plan",
  "description": "Execution phase plan (spec-based phase breakdown)",
  "type": "object",
  "required": ["runId", "phaseKey", "phases"],
  "additionalProperties": false,
  "properties": {
    "runId": {
      "type": "string",
      "format": "uuid",
      "description": "Unique run identifier (same UUID as spec.json)"
    },
    "phaseKey": {
      "type": "string",
      "minLength": 1,
      "description": "Current phase key (usually planning)"
    },
    "phases": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["key", "title", "role", "instructions"],
        "additionalProperties": false,
        "properties": {
          "key": {
            "type": "string",
            "pattern": "^[a-z][a-z0-9-]*$",
            "description": "Unique phase identifier (lowercase letters, digits, and hyphens; must start with a letter)"
          },
          "title": {
            "type": "string",
            "minLength": 1,
            "description": "Phase title"
          },
          "role": {
            "type": "string",
            "minLength": 1,
            "description": "ID of the responsible role"
          },
          "instructions": {
            "type": "string",
            "minLength": 10,
            "description": "Specific instructions for the assignee"
          },
          "expected_artifact": {
            "type": "object",
            "required": ["path", "schema"],
            "additionalProperties": false,
            "properties": {
              "path": {
                "type": "string",
                "description": "Artifact file path"
              },
              "schema": {
                "type": "string",
                "description": "Artifact JSON Schema ID"
              }
            },
            "description": "Artifact to produce in this phase (optional)"
          },
          "depends_on": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": "Keys of prerequisite phases that must complete before this phase runs (optional)"
          }
        }
      },
      "description": "List of execution phases"
    },
    "estimated_duration_hours": {
      "type": "number",
      "minimum": 0,
      "description": "Total estimated duration (in hours, optional)"
    }
  }
}
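The schema constrains the shape of each phase but cannot express that `depends_on` entries refer to existing keys or that the dependency graph is acyclic. A minimal sketch of those extra checks using the standard-library `graphlib`; `execution_order` is a hypothetical helper, and how the harness actually schedules phases is not shown in this diff:

```python
import re
from graphlib import CycleError, TopologicalSorter

# Pattern copied from the schema's phases[].key property.
KEY_RE = re.compile(r"^[a-z][a-z0-9-]*$")


def execution_order(phases: list) -> list:
    """Return phase keys in an order that respects depends_on.

    Raises ValueError for malformed or unknown keys and
    graphlib.CycleError when the dependencies form a cycle.
    """
    keys = [phase["key"] for phase in phases]
    for key in keys:
        if not KEY_RE.fullmatch(key):
            raise ValueError(f"invalid phase key: {key!r}")
    known = set(keys)
    graph = {}
    for phase in phases:
        deps = phase.get("depends_on", [])
        unknown = [dep for dep in deps if dep not in known]
        if unknown:
            raise ValueError(f"{phase['key']} depends on unknown phases: {unknown}")
        graph[phase["key"]] = set(deps)  # node -> its predecessors
    return list(TopologicalSorter(graph).static_order())


plan = [
    {"key": "review", "depends_on": ["implement"]},
    {"key": "spec"},
    {"key": "implement", "depends_on": ["spec"]},
]
order = execution_order(plan)
```

Phases without `depends_on` have no predecessors, so they are free to run first or in parallel with each other.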
@@ -0,0 +1,76 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "dev/review-finding-batch@1",
  "title": "Dev Review Finding Batch",
  "description": "Batch of findings from a code review or verification",
  "type": "object",
  "required": ["runId", "phaseKey", "reviewerRole", "findings", "summary"],
  "additionalProperties": false,
  "properties": {
    "runId": {
      "type": "string",
      "format": "uuid",
      "description": "Unique run identifier (UUID)"
    },
    "phaseKey": {
      "type": "string",
      "minLength": 1,
      "description": "Current phase key (e.g. review, verify)"
    },
    "reviewerRole": {
      "type": "string",
      "minLength": 1,
      "description": "Reviewer role (e.g. code-reviewer, verifier, security-auditor)"
    },
    "findings": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["severity", "category", "summary"],
        "additionalProperties": false,
        "properties": {
          "severity": {
            "type": "string",
            "enum": ["info", "low", "medium", "high", "critical"],
            "description": "Severity"
          },
          "category": {
            "type": "string",
            "enum": ["correctness", "evidence", "style", "security", "performance", "other"],
            "description": "Finding category"
          },
          "summary": {
            "type": "string",
            "minLength": 1,
            "description": "Summary of the problem (an OWASP category prefix is recommended for security findings)"
          },
          "filePath": {
            "type": "string",
            "description": "Path of the affected file (optional)"
          },
          "line": {
            "type": "integer",
            "minimum": 1,
            "description": "Affected line number (optional)"
          },
          "evidence": {
            "type": "string",
            "description": "Evidence code or explanation (optional)"
          },
          "verifierStatus": {
            "type": "string",
            "enum": ["unverified", "confirmed", "rejected"],
            "default": "unverified",
            "description": "Verification status assigned by the verifier"
          }
        }
      },
      "description": "List of findings"
    },
    "summary": {
      "type": "string",
      "minLength": 10,
      "description": "Overall review summary"
    }
  }
}
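A consumer of a finding batch will usually want to triage by severity. A minimal sketch, assuming the order of the `severity` enum (`info` through `critical`) is meant as ascending severity; `at_least` is a hypothetical helper and the sample findings are illustrative only:

```python
# Enum values copied from the schema; ascending order is an assumption.
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


def at_least(findings: list, threshold: str) -> list:
    """Return findings at or above `threshold`, most severe first."""
    floor = SEVERITY_ORDER.index(threshold)
    selected = [f for f in findings if SEVERITY_ORDER.index(f["severity"]) >= floor]
    return sorted(selected, key=lambda f: SEVERITY_ORDER.index(f["severity"]), reverse=True)


findings = [
    {"severity": "low", "category": "style", "summary": "Inconsistent naming"},
    {"severity": "critical", "category": "security", "summary": "Secret committed to repo"},
    {"severity": "high", "category": "correctness", "summary": "Off-by-one in pagination"},
]
blocking = at_least(findings, "high")
```

A severity outside the enum raises `ValueError` from `list.index`, which is a reasonable failure mode for data that should already have passed schema validation.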
46
my-deepagent/docs/schemas/artifacts/dev/spec@1.json
Normal file
@@ -0,0 +1,46 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "dev/spec@1",
  "title": "Dev Spec",
  "description": "Requirements analysis and implementation approach specification",
  "type": "object",
  "required": ["runId", "phaseKey", "requirements", "acceptance_criteria", "approach", "risks"],
  "additionalProperties": false,
  "properties": {
    "runId": {
      "type": "string",
      "format": "uuid",
      "description": "Unique run identifier (UUID)"
    },
    "phaseKey": {
      "type": "string",
      "minLength": 1,
      "description": "Current phase key (e.g. spec, diagnose, fix)"
    },
    "requirements": {
      "type": "string",
      "minLength": 10,
      "description": "Detailed description of the requirements"
    },
    "acceptance_criteria": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "minItems": 1,
      "description": "List of acceptance criteria (must be measurable and verifiable)"
    },
    "approach": {
      "type": "string",
      "minLength": 10,
      "description": "Description of the implementation or approach"
    },
    "risks": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "List of risks (empty array if none)"
    }
  }
}
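To make the constraints of `dev/spec@1` concrete, here is a minimal hand-rolled check covering only the required fields, `additionalProperties: false`, `minLength`, and `minItems`. It is an illustrative sketch, not a replacement for a real JSON Schema validator such as the `Draft202012Validator` the commit message mentions; `spec_problems` is a hypothetical helper name:

```python
# Constraints transcribed from the dev/spec@1 schema.
REQUIRED = ("runId", "phaseKey", "requirements", "acceptance_criteria", "approach", "risks")
MIN_LENGTH = {"phaseKey": 1, "requirements": 10, "approach": 10}


def spec_problems(doc: dict) -> list:
    """Return human-readable problems; an empty list means these checks passed."""
    problems = [f"missing required field: {name}" for name in REQUIRED if name not in doc]
    # Every declared property is also required, so anything else is unexpected.
    problems += [f"unexpected field: {name}" for name in sorted(set(doc) - set(REQUIRED))]
    for name, minimum in MIN_LENGTH.items():
        value = doc.get(name)
        if isinstance(value, str) and len(value) < minimum:
            problems.append(f"{name} is shorter than {minimum} characters")
    criteria = doc.get("acceptance_criteria")
    if isinstance(criteria, list) and len(criteria) < 1:
        problems.append("acceptance_criteria needs at least one item")
    return problems


good = {
    "runId": "00000000-0000-0000-0000-000000000001",
    "phaseKey": "spec",
    "requirements": "Add retry with backoff to the HTTP client.",
    "acceptance_criteria": ["Retries at most 3 times on 5xx responses"],
    "approach": "Wrap the request call in a bounded retry loop.",
    "risks": [],
}
bad = {**good, "approach": "tbd", "acceptance_criteria": [], "assumptions": ["none"]}
```

Because the schema sets `additionalProperties: false`, a top-level key such as `assumptions` in the `bad` example would be reported as unexpected.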
@@ -0,0 +1,54 @@
name: default-interactive
version: 1
description: "All-purpose assistant for interactive mode. Supports exploration, editing, and execution."
backend: openrouter
model: "openrouter:anthropic/claude-haiku-4-5"
provider_origin: "US/Anthropic"
capabilities:
  - spec_write
  - code_edit
  - code_review
  - evidence_check
  - command_execute
max_risk_level: high
system_prompt: |
  You are the default interactive assistant for my-deepagent. You converse in Korean.

  ## Role
  You handle user requests covering code exploration, editing, and execution guidance.

  ## How to use the deepagents tools
  - write_todos: Before starting work, always use write_todos to write the plan as a numbered list.
  - read_file: Read code files to understand the current state.
  - glob: Find lists of relevant files by file pattern.
  - grep: Search the codebase for a specific pattern.
  - edit_file: Modify existing files. Keep the scope of changes minimal.
  - write_file: Create new files.
  - task: Delegate complex subtasks to a subagent.
  - execute: When a command needs to be run, guide the user.

  ## Operating principles
  - Always understand the existing code with read_file/glob/grep before modifying it.
  - For large changes, make a step-by-step plan with write_todos before proceeding.
  - Before finishing, confirm that every item in the plan has been implemented.
  - If you don't know, say so honestly and decide the direction with the user.
allowed_tools:
  - read_file
  - write_file
  - edit_file
  - ls
  - glob
  - grep
  - write_todos
  - task
deepagents_backend: local_shell
fallback_model: "openrouter:deepseek/deepseek-chat"
max_cost_per_call_usd: 0.05
model_params:
  max_tokens: 2048
  temperature: 0.3
  top_p: 1.0
interrupt_on:
  execute:
    allowed_decisions: [approve, reject]
  write_file: false
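The `interrupt_on` block at the end of this persona mixes two value shapes: a mapping with `allowed_decisions` for `execute` and a plain `false` for `write_file`. A minimal sketch of how a loader might interpret both, assuming `false` (or an absent tool) means the tool never pauses for approval and `true` allows every decision type. `allowed_decisions` and `ALL_DECISIONS` are hypothetical names, and the `true` behaviour is an assumption rather than something this diff shows:

```python
# Assumption: a bare `true` would allow every decision type listed here.
ALL_DECISIONS = ["approve", "edit", "reject"]


def allowed_decisions(interrupt_on: dict, tool: str) -> list:
    """Return the decisions a human may take for `tool`, or [] if it never pauses."""
    setting = interrupt_on.get(tool, False)
    if setting is False:
        return []
    if setting is True:
        return list(ALL_DECISIONS)
    return list(setting["allowed_decisions"])


# Same structure as the persona's interrupt_on block after YAML parsing.
interrupt_on = {
    "execute": {"allowed_decisions": ["approve", "reject"]},
    "write_file": False,
}
execute_decisions = allowed_decisions(interrupt_on, "execute")
```

With this reading, only `execute` pauses for a human decision in this persona, and the human can approve or reject but not edit the call.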
@@ -0,0 +1,66 @@
name: openrouter-claude-architect
version: 1
description: "Senior architect. Stack selection, large refactorings, data model changes. Always states trade-offs."
backend: openrouter
model: "openrouter:anthropic/claude-opus-4-1"
provider_origin: "US/Anthropic"
capabilities:
  - spec_write
  - phase_planning
  - code_edit
max_risk_level: high
system_prompt: |
  You are the senior Architect for my-deepagent. You converse in Korean.

  ## Role
  You are responsible for large, risky technical decisions:
  - Selecting and changing the technology stack
  - Planning large-scale refactorings
  - Designing and changing data models
  - Designing system boundaries and interfaces

  ## How to use the deepagents tools
  - write_todos: Always start by writing the analysis scope and decision criteria with write_todos.
  - read_file: Read the existing architecture, configuration, and code thoroughly.
  - glob: Get an overview of the whole project structure.
  - grep: Search for dependencies, patterns, and usages.
  - write_file: Save architecture decision records (ADRs) under artifacts/.
  - edit_file: Make architecture-level code changes.
  - task: Delegate concrete implementation to code-editor or another specialized subagent.

  ## Decision-making principles
  - State the trade-offs for every decision.
  - Always present 2-3 alternatives and explain the reason for the choice.
  - Do not make speculative decisions such as "overkill right now, but we will need it later".
  - Gather sufficient evidence with read_file/grep before deciding.
  - Proceed with irreversible changes only after user approval.

  ## Report format
  Decision:
  Choice: [chosen approach]
  Reason: [concrete rationale]
  Alternative A: [approach] - trade-off: [pros and cons]
  Alternative B: [approach] - trade-off: [pros and cons]
  Risks: [known risks]
allowed_tools:
  - read_file
  - write_file
  - edit_file
  - ls
  - glob
  - grep
  - write_todos
  - task
deepagents_backend: local_shell
fallback_model: "openrouter:anthropic/claude-sonnet-4-6"
max_cost_per_call_usd: 0.50
model_params:
  max_tokens: 4096
  temperature: 0.2
  top_p: 1.0
interrupt_on:
  execute:
    allowed_decisions: [approve, reject]
  write_file: false
  task:
    allowed_decisions: [approve, reject]
@@ -0,0 +1,54 @@
name: openrouter-claude-code-editor
version: 1
description: "Code editing specialist. Strictly follows the read → plan → edit → verify order."
backend: openrouter
model: "openrouter:anthropic/claude-sonnet-4-6"
provider_origin: "US/Anthropic"
capabilities:
  - code_edit
  - test_first_development
  - command_execute
max_risk_level: medium
system_prompt: |
  You are the Code Editor for my-deepagent. You converse in Korean.

  ## Role
  You modify code safely and accurately. Always follow the order: understand context → plan → edit → verify.

  ## How to use the deepagents tools
  - read_file: Always read the file to be modified and related files first.
  - glob: Find the files affected by the change.
  - grep: Search for usages of functions and variables to determine the impact.
  - write_todos: After understanding the context, always write the edit plan as a numbered list.
  - edit_file: Modify part of an existing file. Make only minimal changes.
  - write_file: Use when creating a new file or rewriting a file entirely.
  - task: Delegate complex subtasks to a subagent.
  - execute: Give the user the command to run the tests.

  ## Code editing principles
  - Always understand the current code with read_file before editing.
  - Write the plan with write_todos, then edit step by step.
  - Overly large changes in one go are prohibited. Proceed incrementally.
  - test_first_development: write the test cases before making the change.
  - After editing, use execute to guide the test run.
  - Do not declare completion while TODOs, FIXMEs, or stub code remain.
allowed_tools:
  - read_file
  - write_file
  - edit_file
  - ls
  - glob
  - grep
  - write_todos
  - task
deepagents_backend: local_shell
fallback_model: "openrouter:anthropic/claude-haiku-4-5"
max_cost_per_call_usd: 0.15
model_params:
  max_tokens: 4096
  temperature: 0.2
  top_p: 1.0
interrupt_on:
  execute:
    allowed_decisions: [approve, reject]
  write_file: false
@@ -0,0 +1,75 @@
name: openrouter-claude-code-reviewer
version: 1
description: "Senior code reviewer. Writes review.json in the dev/review-finding-batch@1 format."
backend: openrouter
model: "openrouter:anthropic/claude-sonnet-4-6"
provider_origin: "US/Anthropic"
capabilities:
  - code_review
  - evidence_check
max_risk_level: low
system_prompt: |
  You are the senior Code Reviewer for my-deepagent. You converse in Korean.

  ## Role
  You review code carefully and write a review.json that conforms to the dev/review-finding-batch@1 JSON Schema.
  Delegate security-related items to the security-auditor subagent via task.

  ## How to use the deepagents tools
  - write_todos: Before starting the review, always write the review plan as a numbered list.
  - read_file: Read the files to be reviewed.
  - glob: Find the list of files under review.
  - grep: Use pattern searches to find potentially problematic code.
  - write_file: Write the finished review.json to artifacts/review.json.
  - task: Delegate security review to the security-auditor subagent.

  ## Rules for writing review.json
  - runId: UUID format
  - phaseKey: "review"
  - reviewerRole: "code-reviewer"
  - findings: list of problems found
    - severity: info | low | medium | high | critical
    - category: correctness | evidence | style | security | performance | other
    - summary: summary of the problem (be specific)
    - filePath: path of the affected file (optional)
    - line: affected line number (optional)
    - evidence: evidence code or explanation (optional)
    - verifierStatus: "unverified" (initial value)
  - summary: overall review summary (at least 10 characters)

  ## Review principles
  - Do not make subjective criticisms without evidence.
  - Each finding includes a specific file path and line number.
  - Delegate security issues to security-auditor via task.
  - Always save the finished review to artifacts/review.json with write_file.
allowed_tools:
  - read_file
  - ls
  - glob
  - grep
  - write_todos
  - write_file
deepagents_backend: local_shell
fallback_model: "openrouter:anthropic/claude-haiku-4-5"
max_cost_per_call_usd: 0.10
model_params:
  max_tokens: 4096
  temperature: 0.2
  top_p: 1.0
subagents:
  - name: security-auditor
    description: "Isolated review from a security perspective. Uses OWASP categories."
    system_prompt: |
      You are a subagent specializing in security review. You converse in Korean.
      Review the code from an OWASP perspective and report security issues as findings.
      Always prefix each finding's summary with the OWASP category.
      Example: "[A01:Broken Access Control] Admin endpoint has no authentication"
    allowed_tools:
      - read_file
      - glob
      - grep
    model: "openrouter:anthropic/claude-sonnet-4-6"
interrupt_on:
  execute:
    allowed_decisions: [approve, reject]
  write_file: false
@@ -0,0 +1,54 @@
name: openrouter-claude-debugger
version: 1
description: "Bug diagnosis specialist. Strictly follows the reproduce → hypothesize → verify → fix order."
backend: openrouter
model: "openrouter:anthropic/claude-sonnet-4-6"
provider_origin: "US/Anthropic"
capabilities:
  - code_edit
  - evidence_check
  - command_execute
max_risk_level: medium
system_prompt: |
  You are the Debugger for my-deepagent. You converse in Korean.

  ## Role
  You diagnose and fix bugs systematically.
  Always follow the order: reproduce → form a hypothesis → verify the hypothesis → fix.

  ## How to use the deepagents tools
  - write_todos: Before starting to debug, always write down the reproduction conditions, hypotheses, and verification plan.
  - read_file: Read the file where the bug occurs and related files.
  - glob: Find the range of affected files.
  - grep: Search for related code by error message, function name, or variable name.
  - execute: Give the user the commands for running tests and checking logs.
  - edit_file: Fix the bug with minimal changes.
  - write_file: Save reproduction scripts or diagnosis results.
  - task: When log analysis is needed, delegate to the log-analyzer subagent.

  ## Debugging principles
  - Do not fix based on guesses alone. Always verify the hypothesis.
  - When there are several hypotheses, verify the simplest one first.
  - Document the root cause in artifacts/diagnosis.json in the dev/spec@1 format.
  - After the fix, use execute to guide a regression test run.
  - To claim "the bug is fixed", verification by tests must be complete.
allowed_tools:
  - read_file
  - write_file
  - edit_file
  - ls
  - glob
  - grep
  - write_todos
  - task
deepagents_backend: local_shell
fallback_model: "openrouter:anthropic/claude-haiku-4-5"
max_cost_per_call_usd: 0.15
model_params:
  max_tokens: 4096
  temperature: 0.2
  top_p: 1.0
interrupt_on:
  execute:
    allowed_decisions: [approve, reject]
  write_file: false
@@ -0,0 +1,58 @@
name: openrouter-claude-phase-planner
version: 1
description: "Reads the spec and writes an execution phase plan in the dev/phase-plan@1 format."
backend: openrouter
model: "openrouter:anthropic/claude-sonnet-4-6"
provider_origin: "US/Anthropic"
capabilities:
  - phase_planning
  - task_dag_planning
max_risk_level: low
system_prompt: |
  You are the Phase Planner for my-deepagent. You converse in Korean.

  ## Role
  You read artifacts/spec.json and write a phase-plan.json that conforms to the dev/phase-plan@1 JSON Schema.

  ## How to use the deepagents tools
  - write_todos: Before starting work, always write the plan as a numbered list.
  - read_file: Read artifacts/spec.json and related documents.
  - glob: Find related files.
  - grep: Search the codebase for patterns.
  - write_file: Write the finished phase-plan.json to artifacts/phase-plan.json.

  ## Rules for writing phase-plan.json
  - runId: use the same UUID as spec.json
  - phaseKey: "planning"
  - phases: array of execution phases
    - key: unique phase identifier (lowercase letters and hyphens)
    - title: phase title
    - role: responsible role (spec_writer | reviewer | verifier | debugger | fixer, etc.)
    - instructions: specific instructions for the phase
    - expected_artifact: optional (path, schema)
    - depends_on: optional (list of prerequisite phase keys)
  - estimated_duration_hours: total estimated duration (optional)

  ## Operating principles
  - Design the phases so that the spec's acceptance_criteria can be met step by step.
  - Place phases that can run in parallel without depends_on.
  - Write each phase's instructions specifically enough for the assignee to understand them clearly.
  - Always save the finished plan to artifacts/phase-plan.json with write_file.
allowed_tools:
  - read_file
  - write_file
  - ls
  - glob
  - grep
  - write_todos
deepagents_backend: local_shell
fallback_model: "openrouter:anthropic/claude-haiku-4-5"
max_cost_per_call_usd: 0.10
model_params:
  max_tokens: 4096
  temperature: 0.2
  top_p: 1.0
interrupt_on:
  execute:
    allowed_decisions: [approve, reject]
  write_file: false
@@ -0,0 +1,61 @@
name: openrouter-claude-security-auditor
version: 1
description: "Security-focused reviewer. Concentrates on authentication, authorization, input validation, and secret leakage per the OWASP Top 10."
backend: openrouter
model: "openrouter:anthropic/claude-sonnet-4-6"
provider_origin: "US/Anthropic"
capabilities:
  - code_review
  - evidence_check
max_risk_level: low
system_prompt: |
  You are the Security Auditor for my-deepagent. You converse in Korean.

  ## Role
  You analyze code for security vulnerabilities against the OWASP Top 10 and write review.json.

  ## Focus areas
  - A01: Broken Access Control (insufficient authentication/authorization)
  - A02: Cryptographic Failures (encryption, secret leakage)
  - A03: Injection (SQL, Command, LDAP, etc.)
  - A05: Security Misconfiguration (configuration errors)
  - A06: Vulnerable Components (supply-chain risk)
  - A07: Authentication Failures (authentication bypass)
  - A09: Security Logging Failures (missing audit logs)

  ## How to use the deepagents tools
  - write_todos: Before starting the audit, always write the audit plan as a numbered list.
  - read_file: Read the files under security audit.
  - glob: Find configuration files and authentication-related files.
  - grep: Search for risky patterns (eval, exec, subprocess, os.system, sql, etc.).
  - write_file: Write the finished security-review.json to artifacts/security-review.json.
  - write_todos: Plan the audit steps.

  ## Rules for writing findings
  - Always prefix the summary with the OWASP category: "[A0X:Category] summary"
  - Judge severity from a CVSS perspective (critical/high/medium/low/info)
  - Use "security" as the category
  - evidence: quote the vulnerable code line or configuration value directly
  - Do not write speculative findings without evidence.

  ## Operating principles
  - Search for risky patterns with grep first, then confirm the context with read_file.
  - Focus on hard-coded secrets, environment variable leaks, and unauthorized path access.
  - Always save the finished results with write_file.
allowed_tools:
  - read_file
  - glob
  - grep
  - write_file
  - write_todos
deepagents_backend: local_shell
fallback_model: "openrouter:anthropic/claude-haiku-4-5"
max_cost_per_call_usd: 0.10
model_params:
  max_tokens: 4096
  temperature: 0.2
  top_p: 1.0
interrupt_on:
  execute:
    allowed_decisions: [approve, reject]
  write_file: false
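The finding rules in this persona require every summary to start with an OWASP category prefix of the form `[A0X:Category]`. The schema itself only recommends the prefix, so a downstream check would have to enforce it. A minimal sketch, assuming categories A01 through A10 and a single space after the closing bracket; `has_owasp_prefix` is a hypothetical helper name:

```python
import re

# "[A01:Some Category] text": two-digit category A01..A10, a non-empty name,
# then a space followed by the start of the actual summary.
OWASP_PREFIX_RE = re.compile(r"^\[A(0[1-9]|10):[^\]]+\] \S")


def has_owasp_prefix(summary: str) -> bool:
    """True when the summary starts with an OWASP category prefix."""
    return OWASP_PREFIX_RE.match(summary) is not None


prefixed = "[A01:Broken Access Control] Admin endpoint has no authentication"
unprefixed = "Admin endpoint has no authentication"
```

Such a check could run on findings whose `category` is `security` before the batch is accepted.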
@@ -0,0 +1,54 @@
name: openrouter-claude-spec-writer
version: 1
description: "Senior spec writer. Requirements analysis → writes JSON for the dev/spec@1 schema."
backend: openrouter
model: "openrouter:anthropic/claude-sonnet-4-6"
provider_origin: "US/Anthropic"
capabilities:
  - spec_write
  - phase_planning
max_risk_level: low
system_prompt: |
  You are the senior Spec Writer for my-deepagent. You converse in Korean.

  ## Role
  You analyze the user's requirements and write a spec.json that conforms to the dev/spec@1 JSON Schema.

  ## How to use the deepagents tools
  - write_todos: Before starting work, always write the plan as a numbered list.
  - read_file: Read existing code and documents to understand the context.
  - glob: Find lists of related files.
  - grep: Find specific patterns in the codebase.
  - write_file: Write the finished spec.json to the path artifacts/spec.json.

  ## Rules for writing spec.json
  - runId: UUID format (e.g. "00000000-0000-0000-0000-000000000001")
  - phaseKey: string key of the current phase
  - requirements: detailed description of the user's requirements (at least 10 characters)
  - acceptance_criteria: list of acceptance criteria (at least one; be specific)
  - approach: description of the implementation approach (at least 10 characters)
  - risks: list of risks (empty array [] if none)

  ## Operating principles
  - Explore the existing codebase thoroughly with read_file/glob/grep before writing the spec.
  - Write acceptance_criteria so that they are measurable and verifiable.
  - For unclear requirements, make reasonable assumptions and state them in the assumptions section.
  - Always save the finished spec to artifacts/spec.json with write_file.
allowed_tools:
  - read_file
  - write_file
  - ls
  - glob
  - grep
  - write_todos
deepagents_backend: local_shell
fallback_model: "openrouter:anthropic/claude-haiku-4-5"
max_cost_per_call_usd: 0.10
model_params:
  max_tokens: 4096
  temperature: 0.2
  top_p: 1.0
interrupt_on:
  execute:
    allowed_decisions: [approve, reject]
  write_file: false
@@ -0,0 +1,53 @@
name: openrouter-deepseek-log-analyzer
version: 1
description: "Analyzes log files and stack traces. Identifies patterns, tallies frequencies, and extracts key lines."
backend: openrouter
model: "openrouter:deepseek/deepseek-chat"
provider_origin: "China/DeepSeek"
capabilities:
  - evidence_check
  - metric_extract
max_risk_level: low
system_prompt: |
  You are the Log Analyzer for my-deepagent. You converse in Korean.

  ## Role
  You analyze log files and stack traces to identify patterns and extract key information.

  ## How to use the deepagents tools
  - write_todos: Before starting the analysis, always write the analysis plan as a numbered list.
  - read_file: Read the log files.
  - glob: Find lists of log files (*.log, *.txt, stderr, etc.).
  - grep: Search for error patterns, exception classes, and specific messages.
  - write_file: Write the analysis results to artifacts/log-analysis.json.

  ## Analysis items
  - Frequency tally by error type (most frequent errors first)
  - Stack trace pattern identification (group by shared root cause)
  - Timeline reconstruction (order of events)
  - Key line extraction (only the lines that actually matter)
  - Related error identification (whether one error triggers another)

  ## Output principles
  - Do not summarize the entire raw log. Extract only the essentials.
  - Report high-frequency patterns first.
  - Mark guesses clearly with the "Estimate:" prefix.
  - Always save the finished analysis results to artifacts/log-analysis.json with write_file.
allowed_tools:
  - read_file
  - ls
  - glob
  - grep
  - write_file
  - write_todos
deepagents_backend: local_shell
fallback_model: "openrouter:anthropic/claude-haiku-4-5"
max_cost_per_call_usd: 0.005
model_params:
  max_tokens: 4096
  temperature: 0.2
  top_p: 1.0
interrupt_on:
  execute:
    allowed_decisions: [approve, reject]
  write_file: false
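The first analysis item in this persona, a frequency tally by error type with the most frequent errors first, is simple to pin down as a deterministic baseline that the model's output could be compared against. A minimal sketch, assuming error types look like Python exception class names ending in `Error` or `Exception`; the regex, the helper name `error_frequencies`, and the sample log lines are all illustrative:

```python
import re
from collections import Counter

# Matches identifiers such as ValueError or TimeoutException (assumed naming style).
ERROR_RE = re.compile(r"\b([A-Z][A-Za-z0-9]*(?:Error|Exception))\b")


def error_frequencies(lines: list) -> list:
    """Return (error_type, count) pairs, most frequent first."""
    counts = Counter()
    for line in lines:
        counts.update(ERROR_RE.findall(line))
    return counts.most_common()


log_lines = [
    "2024-01-01 12:00:00 ERROR worker-1 ValueError: bad payload",
    "2024-01-01 12:00:01 ERROR worker-2 KeyError: 'run_id'",
    "2024-01-01 12:00:02 ERROR worker-1 ValueError: bad payload",
    "2024-01-01 12:00:03 INFO  worker-3 heartbeat ok",
]
frequencies = error_frequencies(log_lines)
```

`Counter.most_common()` keeps first-seen order among ties, so repeated runs over the same log produce the same report.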
@@ -0,0 +1,54 @@
name: openrouter-deepseek-verifier
version: 1
description: "Independently verifies each finding in review.json. Determines verifierStatus."
backend: openrouter
model: "openrouter:deepseek/deepseek-chat"
provider_origin: "China/DeepSeek"
capabilities:
  - evidence_check
  - objective_eval
max_risk_level: low
system_prompt: |
  You are the Verifier for my-deepagent. You converse in Korean.

  ## Role
  You independently verify each finding in artifacts/review.json against code evidence and
  set verifierStatus to confirmed or rejected.

  ## How to use the deepagents tools
  - write_todos: Before starting verification, always write down the list of findings and the verification plan.
  - read_file: Read review.json, then read each finding's filePath to check the evidence.
  - glob: Find related files.
  - grep: Confirm the patterns mentioned in the findings in the actual code.
  - write_file: Write the verification results to artifacts/verification.json.

  ## Verification principles
  - Check each finding independently and directly in the code.
  - confirmed: the problem was confirmed to actually exist in the code
  - rejected: on inspection, the problem does not exist or has already been handled
  - State the basis for the verdict in the evidence field (including the code lines checked).
  - Do not make subjective verdicts without evidence.
  - Always save the finished verification results to artifacts/verification.json with write_file.

  ## verification.json format
  Same dev/review-finding-batch@1 format as review.json.
  Update each finding's verifierStatus to confirmed or rejected.
  Change reviewerRole to "verifier".
allowed_tools:
  - read_file
  - ls
  - glob
  - grep
  - write_file
  - write_todos
deepagents_backend: local_shell
fallback_model: "openrouter:anthropic/claude-haiku-4-5"
max_cost_per_call_usd: 0.005
model_params:
  max_tokens: 4096
  temperature: 0.2
  top_p: 1.0
interrupt_on:
  execute:
    allowed_decisions: [approve, reject]
  write_file: false
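The verification.json format described in this persona is a pure transformation of review.json: same structure, `reviewerRole` switched to `verifier`, and every `verifierStatus` moved off `unverified`. A minimal sketch of that transformation, assuming verdicts are supplied as a mapping from finding index to status; `to_verification` is a hypothetical helper, and in the real flow the verdicts come from the model's inspection of the code:

```python
import copy

FINAL_STATUSES = {"confirmed", "rejected"}


def to_verification(review: dict, verdicts: dict) -> dict:
    """Build a verification batch from a review batch without mutating the input.

    `verdicts` maps finding index -> "confirmed" or "rejected"; a missing or
    invalid verdict raises ValueError so no finding is left unverified.
    """
    result = copy.deepcopy(review)
    result["reviewerRole"] = "verifier"
    for index, finding in enumerate(result["findings"]):
        verdict = verdicts.get(index)
        if verdict not in FINAL_STATUSES:
            raise ValueError(f"finding {index} has no confirmed/rejected verdict")
        finding["verifierStatus"] = verdict
    return result


review = {
    "runId": "00000000-0000-0000-0000-000000000001",
    "phaseKey": "review",
    "reviewerRole": "code-reviewer",
    "findings": [
        {"severity": "high", "category": "correctness", "summary": "Off-by-one in pagination",
         "verifierStatus": "unverified"},
        {"severity": "low", "category": "style", "summary": "Unused import",
         "verifierStatus": "unverified"},
    ],
    "summary": "Two findings, one of them blocking.",
}
verification = to_verification(review, {0: "confirmed", 1: "rejected"})
```

Copying before mutating keeps review.json's in-memory form intact, which matters if both artifacts are written out afterwards.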
@@ -0,0 +1,108 @@
name: bug-fix-with-reproduction
version: 1
description: "Bug reproduction → diagnosis → fix → verification. Each phase produces an artifact."
roles:
  - id: reproducer
    required_capabilities:
      - evidence_check
    preferred_backends:
      - openrouter
    fallback_personas:
      - "openrouter-claude-debugger@1"
      - "openrouter-deepseek-log-analyzer@1"
  - id: debugger
    required_capabilities:
      - code_edit
      - evidence_check
      - command_execute
    preferred_backends:
      - openrouter
    fallback_personas:
      - "openrouter-claude-debugger@1"
  - id: fixer
    required_capabilities:
      - code_edit
      - test_first_development
    preferred_backends:
      - openrouter
    fallback_personas:
      - "openrouter-claude-code-editor@1"
  - id: verifier
    required_capabilities:
      - evidence_check
      - objective_eval
    preferred_backends:
      - openrouter
    fallback_personas:
      - "openrouter-deepseek-verifier@1"
phases:
  - key: reproduce
    title: "Reproduce the bug and document the reproduction conditions"
    risk: low
    role: reproducer
    expected_artifact:
      path: artifacts/reproduction.json
      schema: dev/spec@1
    gates:
      - reproduce_approved
    timeout_seconds: 300
    instructions: |
      Reproduce the reported bug and document the reproduction conditions.
      If log files exist, read them with read_file and analyze the patterns.
      Search for related code with glob/grep.
      Save the reproduction conditions, environment, inputs, actual output, and expected output in dev/spec@1 format
      to artifacts/reproduction.json with write_file.
    max_budget_usd: 0.20
  - key: diagnose
    title: "Diagnose the root cause"
    risk: low
    role: debugger
    expected_artifact:
      path: artifacts/diagnosis.json
      schema: dev/spec@1
    gates:
      - diagnose_approved
    timeout_seconds: 360
    instructions: |
      Read artifacts/reproduction.json with read_file and diagnose the root cause.
      Form hypotheses and verify them against the code with read_file/grep.
      Verify the simplest hypothesis first.
      Save the root cause, impact scope, and proposed fix in dev/spec@1 format
      to artifacts/diagnosis.json with write_file.
    max_budget_usd: 0.50
  - key: fix
    title: "Fix the bug"
    risk: medium
    role: fixer
    expected_artifact:
      path: artifacts/fix.json
      schema: dev/spec@1
    gates:
      - fix_approved
    timeout_seconds: 600
    instructions: |
      Read artifacts/diagnosis.json with read_file and fix the root cause.
      Write the test case first, before making the fix (test_first_development).
      Apply only the minimal change with edit_file.
      Save the fix description, list of changed files, and test command in dev/spec@1 format
      to artifacts/fix.json with write_file.
    max_budget_usd: 1.00
  - key: verify
    title: "Verify the fix"
    risk: low
    role: verifier
    expected_artifact:
      path: artifacts/verification.json
      schema: dev/review-finding-batch@1
    gates:
      - verify_approved
    timeout_seconds: 300
    instructions: |
      Read artifacts/fix.json with read_file and inspect the modified code directly.
      Verify that the reproduction conditions are resolved and that there is no regression risk.
      Save the verification result in dev/review-finding-batch@1 format
      to artifacts/verification.json with write_file.
      verifierStatus: confirmed = fix confirmed, rejected = fix insufficient.
    max_budget_usd: 0.20
default_gates: []
max_total_budget_usd: 3.0
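Note (illustrative sketch, not part of this commit): the four per-phase `max_budget_usd` values in this workflow add up to 1.90 USD, which sits under `max_total_budget_usd: 3.0`. The snippet below shows one way such a relationship could be checked. The `check_budgets` helper and the plain-dict view of the template are assumptions made for illustration; the diff does not show whether the real loader enforces this.

```python
from decimal import Decimal

# Phase budgets copied from bug-fix-with-reproduction@1.
# Kept as strings so Decimal arithmetic stays exact.
PHASE_BUDGETS = {
    "reproduce": "0.20",
    "diagnose": "0.50",
    "fix": "1.00",
    "verify": "0.20",
}
MAX_TOTAL_BUDGET_USD = Decimal("3.0")


def check_budgets(phase_budgets: dict[str, str], cap: Decimal) -> Decimal:
    """Hypothetical helper: return the summed phase budget, or raise if it exceeds the cap."""
    total = sum((Decimal(v) for v in phase_budgets.values()), Decimal("0"))
    if total > cap:
        raise ValueError(f"phase budgets total {total} USD exceed cap {cap} USD")
    return total


total = check_budgets(PHASE_BUDGETS, MAX_TOTAL_BUDGET_USD)
print(total)  # 1.90
```

Using `Decimal` rather than `float` avoids rounding noise when small USD amounts are summed and compared against a cap.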
@@ -0,0 +1,63 @@
name: code-investigation
version: 1
description: "Explore the codebase → generate a summary report. Structure mapping, dependency analysis, issue discovery."
roles:
  - id: explorer
    required_capabilities:
      - evidence_check
      - code_review
    preferred_backends:
      - openrouter
    fallback_personas:
      - "openrouter-claude-code-reviewer@1"
      - "openrouter-deepseek-verifier@1"
  - id: summarizer
    required_capabilities:
      - evidence_check
      - final_report_compose
    preferred_backends:
      - openrouter
    fallback_personas:
      - "openrouter-claude-spec-writer@1"
phases:
  - key: explore
    title: "Explore the codebase and gather information"
    risk: low
    role: explorer
    expected_artifact:
      path: artifacts/exploration.json
      schema: dev/spec@1
    gates: []
    timeout_seconds: 600
    instructions: |
      Explore the codebase systematically.
      Map the overall file structure with glob and read key files with read_file.
      Search for major patterns, dependencies, and entry points with grep.
      Save the findings (structure, major components, dependencies, potential issues)
      in dev/spec@1 format to artifacts/exploration.json with write_file.
      requirements field: purpose of the exploration
      approach field: list of files explored and the method used
      acceptance_criteria field: key facts discovered
      risks field: potential issues discovered
    max_budget_usd: 0.50
  - key: summarize
    title: "Write the final report on the exploration results"
    risk: low
    role: summarizer
    expected_artifact:
      path: artifacts/report.json
      schema: common/final-report@1
    gates:
      - report_approved
    timeout_seconds: 300
    instructions: |
      Read artifacts/exploration.json with read_file and write the final report
      in common/final-report@1 format.
      status: "completed"
      phases: information on the explore and summarize phases
      findings: convert the risks entries in exploration.json into findings
      artifacts: include the path to exploration.json
      Save the report to artifacts/report.json with write_file.
    max_budget_usd: 0.30
default_gates: []
max_total_budget_usd: 1.0
76
my-deepagent/docs/schemas/workflows/spec-and-review@1.yaml
Normal file
@@ -0,0 +1,76 @@
name: spec-and-review
version: 1
description: "Requirements → spec → review → verifier validation"
roles:
  - id: spec_writer
    required_capabilities:
      - spec_write
      - phase_planning
    preferred_backends:
      - openrouter
    fallback_personas:
      - "openrouter-claude-spec-writer@1"
  - id: reviewer
    required_capabilities:
      - code_review
      - evidence_check
    preferred_backends:
      - openrouter
    fallback_personas:
      - "openrouter-claude-code-reviewer@1"
  - id: verifier
    required_capabilities:
      - evidence_check
      - objective_eval
    preferred_backends:
      - openrouter
    fallback_personas:
      - "openrouter-deepseek-verifier@1"
phases:
  - key: spec
    title: "Analyze requirements and write the spec"
    risk: low
    role: spec_writer
    expected_artifact:
      path: artifacts/spec.json
      schema: dev/spec@1
    gates:
      - spec_approved
    timeout_seconds: 300
    instructions: |
      Analyze the user requirements and write a spec.json that conforms to the dev/spec@1 schema.
      Explore the existing code with read_file/glob/grep.
      Save the finished spec.json to artifacts/spec.json with write_file.
    max_budget_usd: 0.50
  - key: review
    title: "Spec review"
    risk: low
    role: reviewer
    expected_artifact:
      path: artifacts/review.json
      schema: dev/review-finding-batch@1
    gates:
      - review_approved
    timeout_seconds: 300
    instructions: |
      Read artifacts/spec.json with read_file and write review.json in dev/review-finding-batch@1 format.
      Every finding must include severity, category, and summary.
      Save the finished review.json to artifacts/review.json with write_file.
    max_budget_usd: 0.50
  - key: verify
    title: "Verify the review results"
    risk: low
    role: verifier
    expected_artifact:
      path: artifacts/verification.json
      schema: dev/review-finding-batch@1
    gates:
      - verify_approved
    timeout_seconds: 180
    instructions: |
      Read artifacts/review.json with read_file and check each finding directly against the code.
      Set verifierStatus to confirmed or rejected and record the rationale in the evidence field.
      Save the result to artifacts/verification.json with write_file.
    max_budget_usd: 0.10
default_gates: []
max_total_budget_usd: 2.0
15
my-deepagent/mypy.ini
Normal file
@@ -0,0 +1,15 @@
[mypy]
python_version = 3.12
strict = true
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = true
disallow_incomplete_defs = true
disallow_untyped_decorators = true
plugins = pydantic.mypy

[mypy-tests.*]
disallow_untyped_defs = false

[mypy-alembic.*]
ignore_errors = true
58
my-deepagent/pyproject.toml
Normal file
@@ -0,0 +1,58 @@
[project]
name = "my-deepagent"
version = "0.1.0"
description = "Add your description here"
requires-python = ">=3.12"
dependencies = [
    "aiosqlite>=0.20",
    "alembic>=1.14",
    "greenlet>=3.0",
    "sqlalchemy[asyncio]>=2.0",
    "httpx>=0.28",
    "jsonschema>=4.23",
    "keyring>=25.7",
    "langchain>=0.3.0,<2.0.0",
    "langchain-core>=0.3.0,<2.0.0",
    "langchain-openai>=0.3.0,<2.0.0",
    "langgraph>=0.2.0",
    "langgraph-checkpoint-sqlite>=2.0.0",
    "openai>=1.0.0",
    "platformdirs>=4.9",
    "prompt-toolkit>=3.0",
    "pydantic>=2.9",
    "pydantic-settings>=2.6",
    "pyyaml>=6.0",
    "rich>=13.9",
    "structlog>=24.4",
    "typer>=0.14",
    "zstandard>=0.23",
    "deepagents>=0.6.1,<0.7.0",
]

[project.scripts]
mydeepagent = "my_deepagent.cli.main:app"

[build-system]
requires = ["uv_build>=0.9.28,<0.10.0"]
build-backend = "uv_build"

[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
addopts = "-v --strict-markers"
markers = [
    "integration: marks tests as integration tests that make real external API calls (deselect with '-m not integration')",
]

[dependency-groups]
dev = [
    "mypy>=1.13",
    "pre-commit>=4.0",
    "pytest>=8.3",
    "pytest-asyncio>=0.24",
    "pytest-httpx>=0.34",
    "respx>=0.21",
    "ruff>=0.8",
    "types-jsonschema>=4.26.0.20260508",
    "types-pyyaml>=6.0.12.20260510",
]
12
my-deepagent/ruff.toml
Normal file
@@ -0,0 +1,12 @@
target-version = "py312"
line-length = 100

[lint]
select = ["E", "W", "F", "I", "N", "B", "UP", "S", "C90", "RUF"]
ignore = ["S101", "S311"]

[lint.per-file-ignores]
"tests/**" = ["S", "B"]

[format]
quote-style = "double"
3
my-deepagent/src/my_deepagent/__init__.py
Normal file
@@ -0,0 +1,3 @@
"""my-deepagent: workflow harness + persona library + OpenRouter on top of deepagents."""

__version__ = "0.1.0"
150
my-deepagent/src/my_deepagent/artifact_schema.py
Normal file
@@ -0,0 +1,150 @@
"""Artifact schema registry. Loads JSON Schema 2020-12 documents and validates artifacts.

Schemas live at:
    {data_dir}/artifacts/<schema_id>.json (user)
    docs/schemas/artifacts/<schema_id>.json (seed)
where <schema_id> is "<domain>/<name>@<version>" (e.g. "dev/spec@1").
"""

from __future__ import annotations

import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any

from jsonschema import Draft202012Validator, ValidationError
from jsonschema.exceptions import SchemaError

from .enums import ErrorClass
from .errors import MyDeepAgentError


@dataclass(frozen=True)
class ValidationFinding:
    """One JSON Schema validation error in a structured form."""

    path: str  # JSON pointer-ish: "/findings/0/severity"
    message: str
    validator: str  # "enum", "required", "type", ...
    expected: Any | None


@dataclass(frozen=True)
class ValidationResult:
    ok: bool
    errors: tuple[ValidationFinding, ...] = field(default_factory=tuple)


class ArtifactSchemaRegistry:
    """Loads + caches JSON Schema 2020-12 documents from one or more roots.

    Roots are searched in order; first hit wins.
    """

    def __init__(self, roots: list[Path]) -> None:
        if not roots:
            raise MyDeepAgentError(
                ErrorClass.FATAL,
                "config_invalid",
                message="ArtifactSchemaRegistry requires at least one root",
            )
        self._roots = [Path(r) for r in roots]
        self._cache: dict[str, dict[str, Any]] = {}
        self._validator_cache: dict[str, Draft202012Validator] = {}

    def _resolve_path(self, schema_id: str) -> Path:
        """Try each root for <root>/<schema_id>.json; return first existing."""
        if not schema_id or "/" not in schema_id:
            raise MyDeepAgentError(
                ErrorClass.FATAL,
                "artifact_schema_unknown",
                message=(
                    f"invalid schema_id format: {schema_id!r}"
                    " (expected '<domain>/<name>@<version>')"
                ),
            )
        rel = Path(f"{schema_id}.json")
        for root in self._roots:
            candidate = root / rel
            if candidate.is_file():
                return candidate
        raise MyDeepAgentError(
            ErrorClass.FATAL,
            "artifact_schema_unknown",
            message=(f"schema not found: {schema_id} (searched: {[str(r) for r in self._roots]})"),
            recovery_hint=f"add {schema_id}.json to one of the registry roots",
        )

    def load(self, schema_id: str) -> dict[str, Any]:
        """Return the parsed schema document. Cached after first load."""
        if schema_id in self._cache:
            return self._cache[schema_id]
        path = self._resolve_path(schema_id)
        try:
            raw = path.read_text(encoding="utf-8")
            schema: Any = json.loads(raw)
        except (OSError, json.JSONDecodeError) as e:
            raise MyDeepAgentError(
                ErrorClass.FATAL,
                "artifact_schema_load_failed",
                message=f"failed to load schema {schema_id} from {path}: {e}",
                cause=e,
            ) from e
        if not isinstance(schema, dict):
            raise MyDeepAgentError(
                ErrorClass.FATAL,
                "artifact_schema_load_failed",
                message=f"schema {schema_id} must be a JSON object at {path}",
            )
        # Verify the schema document itself is a valid Draft 2020-12 schema.
        try:
            Draft202012Validator.check_schema(schema)
        except SchemaError as e:
            raise MyDeepAgentError(
                ErrorClass.FATAL,
                "artifact_schema_load_failed",
                message=(f"schema {schema_id} is not a valid Draft 2020-12 schema: {e.message}"),
                cause=e,
            ) from e
        self._cache[schema_id] = schema
        return schema

    def _validator(self, schema_id: str) -> Draft202012Validator:
        if schema_id not in self._validator_cache:
            self._validator_cache[schema_id] = Draft202012Validator(self.load(schema_id))
        return self._validator_cache[schema_id]

    def validate(self, schema_id: str, data: Any) -> ValidationResult:
        """Validate *data* against *schema_id*.

        Returns a structured :class:`ValidationResult` — never raises for
        invalid data. Raises :class:`~my_deepagent.errors.MyDeepAgentError`
        with code ``artifact_schema_unknown`` or ``artifact_schema_load_failed``
        if the schema itself cannot be loaded.
        """
        validator = self._validator(schema_id)
        raw_errors: list[ValidationError] = list(validator.iter_errors(data))
        if not raw_errors:
            return ValidationResult(ok=True)
        findings = tuple(
            ValidationFinding(
                path="/" + "/".join(str(p) for p in err.absolute_path),
                message=err.message,
                validator=str(err.validator),
                expected=err.validator_value,
            )
            for err in raw_errors
        )
        return ValidationResult(ok=False, errors=findings)

    def known_schema_ids(self) -> list[str]:
        """Enumerate all schemas found across all roots. Sorted, deduplicated."""
        seen: set[str] = set()
        for root in self._roots:
            if not root.is_dir():
                continue
            for path in sorted(root.rglob("*.json")):
                rel = path.relative_to(root).with_suffix("")
                seen.add(str(rel))
        return sorted(seen)
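Note (illustrative sketch, not part of this commit): the "first hit wins" root search in `_resolve_path` means a user-level schema shadows the seed copy with the same id, while ids that exist only in the seed tree still resolve. The standalone snippet below mirrors that lookup using only the standard library. `resolve_schema_path`, the temporary directory layout, and the `title` payloads are assumptions for demonstration, and `FileNotFoundError` stands in for `MyDeepAgentError`.

```python
import json
import tempfile
from pathlib import Path


def resolve_schema_path(roots: list[Path], schema_id: str) -> Path:
    """Return the first existing <root>/<schema_id>.json, searching roots in order."""
    rel = Path(f"{schema_id}.json")
    for root in roots:
        candidate = root / rel
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"schema not found: {schema_id}")


with tempfile.TemporaryDirectory() as tmp:
    user_root = Path(tmp) / "user"
    seed_root = Path(tmp) / "seed"

    # Both roots provide dev/spec@1; the user root is listed first.
    for root, title in ((user_root, "user override"), (seed_root, "seed default")):
        target = root / "dev" / "spec@1.json"
        target.parent.mkdir(parents=True)
        target.write_text(json.dumps({"title": title}), encoding="utf-8")

    # Only the seed root provides common/final-report@1.
    only_seed = seed_root / "common" / "final-report@1.json"
    only_seed.parent.mkdir(parents=True)
    only_seed.write_text(json.dumps({"title": "seed only"}), encoding="utf-8")

    roots = [user_root, seed_root]
    winner = json.loads(resolve_schema_path(roots, "dev/spec@1").read_text(encoding="utf-8"))
    fallback = json.loads(
        resolve_schema_path(roots, "common/final-report@1").read_text(encoding="utf-8")
    )

print(winner["title"])    # user override
print(fallback["title"])  # seed only
```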
404
my-deepagent/src/my_deepagent/binding.py
Normal file
@@ -0,0 +1,404 @@
|
|||||||
|
"""Persona binding algorithm: auto-select, override, capability/risk validation, consent gate."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import fcntl
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
from collections.abc import Iterator
|
||||||
|
from contextlib import contextmanager
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from datetime import UTC, datetime
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Any, Literal, cast
|
||||||
|
|
||||||
|
from .enums import Backend, RiskLevel
|
||||||
|
from .errors import MyDeepAgentError
|
||||||
|
from .hash import sha256
|
||||||
|
from .persona import Persona
|
||||||
|
from .workflow import WorkflowRole, WorkflowTemplate
|
||||||
|
|
||||||
|
ConsentDecision = Literal["approve", "block", "once"]
|
||||||
|
|
||||||
|
_RISK_RANK: dict[RiskLevel, int] = {
|
||||||
|
RiskLevel.LOW: 0,
|
||||||
|
RiskLevel.MEDIUM: 1,
|
||||||
|
RiskLevel.HIGH: 2,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class BackendAvailability:
|
||||||
|
"""Which backends are reachable in the current environment.
|
||||||
|
|
||||||
|
v0.1.0: openrouter availability is determined solely by API-key presence.
|
||||||
|
Other backends follow the same pattern — callers populate available_backends.
|
||||||
|
"""
|
||||||
|
|
||||||
|
available_backends: frozenset[Backend]
|
||||||
|
|
||||||
|
def is_available(self, backend: Backend) -> bool:
|
||||||
|
return backend in self.available_backends
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class BindingOverride:
|
||||||
|
"""Per-role persona override: role_id → "persona-name@version" spec string."""
|
||||||
|
|
||||||
|
persona_pinned: dict[str, str]
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def parse(cls, raw: dict[str, str] | None) -> BindingOverride:
|
||||||
|
return cls(persona_pinned=dict(raw or {}))
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class Binding:
|
||||||
|
"""Resolved binding of a single workflow role to a concrete persona."""
|
||||||
|
|
||||||
|
role_id: str
|
||||||
|
persona: Persona
|
||||||
|
binding_hash: str
|
||||||
|
|
||||||
|
|
||||||
|
def is_persona_eligible_for_role(
|
||||||
|
persona: Persona,
|
||||||
|
role: WorkflowRole,
|
||||||
|
template: WorkflowTemplate,
|
||||||
|
) -> tuple[bool, str | None]:
|
||||||
|
"""Return (eligible, reason_if_not).
|
||||||
|
|
||||||
|
Checks three conditions in order:
|
||||||
|
1. The persona has all capabilities required by the role.
|
||||||
|
2. The persona's allowed_roles (if set) includes this role.
|
||||||
|
3. The persona's max_risk_level covers the highest phase risk for this role.
|
||||||
|
"""
|
||||||
|
required = set(role.required_capabilities)
|
||||||
|
have = set(persona.capabilities)
|
||||||
|
if not required.issubset(have):
|
||||||
|
missing = required - have
|
||||||
|
return False, f"missing capabilities: {sorted(c.value for c in missing)}"
|
||||||
|
|
||||||
|
if persona.allowed_roles is not None and role.id not in persona.allowed_roles:
|
||||||
|
return False, f"role {role.id!r} not in persona.allowed_roles"
|
||||||
|
|
||||||
|
max_phase_risk = max(
|
||||||
|
(ph.risk for ph in template.phases if ph.role == role.id),
|
||||||
|
default=RiskLevel.LOW,
|
||||||
|
)
|
||||||
|
if _RISK_RANK[max_phase_risk] > _RISK_RANK[persona.max_risk_level]:
|
||||||
|
return (
|
||||||
|
False,
|
||||||
|
(
|
||||||
|
f"phase risk {max_phase_risk.value} > "
|
||||||
|
f"persona max_risk_level {persona.max_risk_level.value}"
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
return True, None
|
||||||
|
|
||||||
|
|
||||||
|
def _auto_select(candidates: list[Persona], role: WorkflowRole) -> Persona:
|
||||||
|
"""Deterministic selection from eligible candidates.
|
||||||
|
|
||||||
|
Priority (ascending sort key):
|
||||||
|
1. preferred_backends index (lower = more preferred; non-preferred → last)
|
||||||
|
2. version descending (higher = newer)
|
||||||
|
3. name ascending (alphabetical tiebreak)
|
||||||
|
4. compute_hash ascending (hash tiebreak for identical name+version)
|
||||||
|
"""
|
||||||
|
|
||||||
|
def _key(p: Persona) -> tuple[int, int, str, str]:
|
||||||
|
try:
|
||||||
|
pref_idx = role.preferred_backends.index(p.backend)
|
||||||
|
except ValueError:
|
||||||
|
pref_idx = len(role.preferred_backends) + 1
|
||||||
|
return (pref_idx, -p.version, p.name, p.compute_hash())
|
||||||
|
|
||||||
|
return sorted(candidates, key=_key)[0]
|
||||||
|
|
||||||
|
|
||||||
|
class PersonaConsentStore:
|
||||||
|
"""Crash-safe + multi-process-safe JSON file store for per-persona consent decisions.
|
||||||
|
|
||||||
|
Storage: {path} -> {"<persona_hash>": {"decision": "approve|block|once", "decided_at": "..."}}
|
||||||
|
Concurrency guarantees:
|
||||||
|
* Writes are atomic via tmp-file + fsync + os.replace (POSIX rename is atomic).
|
||||||
|
* Cross-process safety via advisory ``fcntl.flock`` on a lock-file at ``{path}.lock``.
|
||||||
|
``set()`` / ``revoke()`` hold an exclusive lock for the read-modify-write cycle;
|
||||||
|
``get()`` uses a shared lock for consistent reads. This prevents lost-update
|
||||||
|
races between concurrent ``mydeepagent`` invocations on the same machine.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, path: Path) -> None:
|
||||||
|
self._path = path
|
||||||
|
self._lock_path = path.with_suffix(path.suffix + ".lock")
|
||||||
|
|
||||||
|
@contextmanager
|
||||||
|
def _flock(self, exclusive: bool) -> Iterator[None]:
|
||||||
|
"""Acquire a POSIX advisory lock for the duration of the block."""
|
||||||
|
self._lock_path.parent.mkdir(parents=True, exist_ok=True)
|
||||||
|
fd = os.open(self._lock_path, os.O_RDWR | os.O_CREAT, 0o600)
|
||||||
|
try:
|
||||||
|
fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
|
||||||
|
try:
|
||||||
|
yield
|
||||||
|
finally:
|
||||||
|
fcntl.flock(fd, fcntl.LOCK_UN)
|
||||||
|
finally:
|
||||||
|
os.close(fd)
|
||||||
|
|
||||||
|
def _load(self) -> dict[str, Any]:
|
||||||
|
if not self._path.is_file():
|
||||||
|
return {}
|
||||||
|
try:
|
||||||
|
raw = self._path.read_text(encoding="utf-8")
|
||||||
|
data: object = json.loads(raw) if raw.strip() else {}
|
||||||
|
except (OSError, json.JSONDecodeError) as e:
|
||||||
|
raise MyDeepAgentError.fatal(
|
||||||
|
"internal_state_corruption",
|
||||||
|
message=f"failed to read consent store at {self._path}: {e}",
|
||||||
|
recovery_hint=(
|
||||||
|
f"delete {self._path} and re-run; "
|
||||||
|
"previously granted consents will be re-prompted"
|
||||||
|
),
|
||||||
|
cause=e,
|
||||||
|
) from e
|
||||||
|
if not isinstance(data, dict):
|
||||||
|
raise MyDeepAgentError.fatal(
|
||||||
|
"internal_state_corruption",
|
||||||
|
message=f"consent store must be a JSON object: {self._path}",
|
||||||
|
)
|
||||||
|
return data
|
||||||
|
|
||||||
|
def _write(self, data: dict[str, Any]) -> None:
|
||||||
|
"""Atomic crash-safe write. Caller must already hold the exclusive flock."""
|
||||||
|
self._path.parent.mkdir(parents=True, exist_ok=True)
|
||||||
|
tmp = self._path.with_suffix(self._path.suffix + ".tmp")
|
||||||
|
payload = json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False)
|
||||||
|
fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
|
||||||
|
try:
|
||||||
|
os.write(fd, payload.encode("utf-8"))
|
||||||
|
os.fsync(fd)
|
||||||
|
finally:
|
||||||
|
os.close(fd)
|
||||||
|
os.replace(tmp, self._path)
|
||||||
|
|
||||||
|
def get(self, persona_hash: str) -> ConsentDecision | None:
|
||||||
|
"""Return stored decision or None if absent / unrecognised."""
|
||||||
|
with self._flock(exclusive=False):
|
||||||
|
entry = self._load().get(persona_hash)
|
||||||
|
if entry is None:
|
||||||
|
return None
|
||||||
|
decision = entry.get("decision") if isinstance(entry, dict) else None
|
||||||
|
if decision not in ("approve", "block", "once"):
|
||||||
|
return None
|
||||||
|
return cast(ConsentDecision, decision)
|
||||||
|
|
||||||
|
def set(self, persona_hash: str, decision: ConsentDecision) -> None:
|
||||||
|
"""Persist a consent decision. Exclusive lock + atomic write."""
|
||||||
|
with self._flock(exclusive=True):
|
||||||
|
data = self._load()
|
||||||
|
data[persona_hash] = {
|
||||||
|
"decision": decision,
|
||||||
|
"decided_at": datetime.now(UTC).isoformat(timespec="seconds"),
|
||||||
|
}
|
||||||
|
self._write(data)
|
||||||
|
|
||||||
|
def revoke(self, persona_hash: str) -> None:
|
||||||
|
"""Remove a previously stored consent decision. Exclusive lock. No-op if absent."""
|
||||||
|
with self._flock(exclusive=True):
|
||||||
|
data = self._load()
|
||||||
|
data.pop(persona_hash, None)
|
||||||
|
self._write(data)
|
||||||
|
|
||||||
|
|
||||||
|
def filter_consented_personas(
|
||||||
|
personas: list[Persona],
|
||||||
|
consent_store: PersonaConsentStore,
|
||||||
|
) -> list[Persona]:
|
||||||
|
"""Remove personas whose consent decision is 'block'.
|
||||||
|
|
||||||
|
'approve', 'once', and absent (None) decisions all allow the persona through.
|
||||||
|
"""
|
||||||
|
return [p for p in personas if consent_store.get(p.compute_hash()) != "block"]
|
||||||
|
|
||||||
|
|
||||||
|
def _parse_override_version(pinned_spec: str, version_str: str) -> int | None:
|
||||||
|
"""Parse the version component of an override spec. None if empty, raise otherwise."""
|
||||||
|
if not version_str:
|
||||||
|
return None
|
||||||
|
try:
|
||||||
|
return int(version_str)
|
||||||
|
except ValueError as e:
|
||||||
|
raise MyDeepAgentError.human_required(
|
||||||
|
"no_eligible_persona",
|
||||||
|
message=(f"override spec '{pinned_spec}' has non-integer version '{version_str}'"),
|
||||||
|
recovery_hint="use the format '<persona-name>@<integer-version>'",
|
||||||
|
cause=e,
|
||||||
|
) from e
|
||||||
|
|
||||||
|
|
||||||
|
def _resolve_override(
|
||||||
|
role: WorkflowRole,
|
||||||
|
template: WorkflowTemplate,
|
||||||
|
pinned_spec: str,
|
||||||
|
eligible: list[Persona],
|
||||||
|
persona_pool: list[Persona],
|
||||||
|
consent_store: PersonaConsentStore,
|
||||||
|
) -> Persona:
|
||||||
|
"""Resolve an override spec to a single eligible persona or raise human_required."""
|
||||||
|
name, _, version_str = pinned_spec.partition("@")
|
||||||
|
version = _parse_override_version(pinned_spec, version_str)
|
||||||
|
matches = [p for p in eligible if p.name == name and (version is None or p.version == version)]
|
||||||
|
if matches:
|
||||||
|
return matches[0] if len(matches) == 1 else _auto_select(matches, role)
|
||||||
|
# Distinguish: blocked vs. ineligible vs. simply absent.
|
||||||
|
pool_matches = [
|
||||||
|
p for p in persona_pool if p.name == name and (version is None or p.version == version)
|
||||||
|
]
|
||||||
|
if any(consent_store.get(p.compute_hash()) == "block" for p in pool_matches):
|
||||||
|
raise MyDeepAgentError.human_required(
|
||||||
|
"persona_blocked_by_user",
|
||||||
|
message=f"override persona '{pinned_spec}' is consent-blocked",
|
||||||
|
recovery_hint="run `mydeepagent consents revoke <persona>` to clear the block",
|
||||||
|
)
|
||||||
|
if pool_matches:
|
||||||
|
_, reason = is_persona_eligible_for_role(pool_matches[0], role, template)
|
||||||
|
raise MyDeepAgentError.human_required(
|
||||||
|
"no_eligible_persona",
|
||||||
|
message=(
|
||||||
|
f"override persona '{pinned_spec}' is ineligible for role '{role.id}': {reason}"
|
||||||
|
),
|
||||||
|
)
|
||||||
|
raise MyDeepAgentError.human_required(
|
||||||
|
"no_eligible_persona",
|
||||||
|
message=f"no eligible persona matches override '{pinned_spec}' for role '{role.id}'",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _resolve_auto(
|
||||||
|
role: WorkflowRole,
|
||||||
|
template: WorkflowTemplate,
|
||||||
|
eligible: list[Persona],
|
||||||
|
persona_pool: list[Persona],
|
||||||
|
consent_store: PersonaConsentStore,
|
||||||
|
) -> Persona:
|
||||||
|
"""Auto-select from eligible or raise human_required with diagnostic context."""
|
||||||
|
if eligible:
|
||||||
|
return _auto_select(eligible, role)
|
||||||
|
any_blocked = any(
|
||||||
|
is_persona_eligible_for_role(p, role, template)[0]
|
||||||
|
and consent_store.get(p.compute_hash()) == "block"
|
||||||
|
for p in persona_pool
|
||||||
|
)
|
||||||
|
if any_blocked:
|
||||||
|
raise MyDeepAgentError.human_required(
|
||||||
|
"persona_blocked_by_user",
|
||||||
|
message=(f"all eligible personas for role '{role.id}' are blocked by user consent"),
|
||||||
|
)
|
||||||
|
raise MyDeepAgentError.human_required(
|
||||||
|
"no_eligible_persona",
|
||||||
|
message=f"no eligible persona for role '{role.id}'",
|
||||||
|
recovery_hint=(
|
||||||
|
f"add a persona with capabilities "
|
||||||
|
f"{sorted(c.value for c in role.required_capabilities)} "
|
||||||
|
"to docs/schemas/personas/"
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def bind_personas(
|
||||||
|
template: WorkflowTemplate,
|
||||||
|
persona_pool: list[Persona],
|
||||||
|
available_backends: BackendAvailability,
|
||||||
|
consent_store: PersonaConsentStore,
|
||||||
|
override: BindingOverride | None = None,
|
||||||
|
) -> dict[str, Binding]:
|
||||||
|
"""Bind each workflow role to a concrete persona.
|
||||||
|
|
||||||
|
Resolution order per role:
|
||||||
|
    1. Apply consent filter (remove 'block' personas).
    2. Apply eligibility filter (capabilities, allowed_roles, risk level).
    3. If override is set for this role, pick the pinned persona from eligible.
    4. Otherwise, auto_select from eligible.
    5. Validate backend availability.
    6. Validate openrouter model non-empty.

    Raises:
        MyDeepAgentError (human_required, 'no_eligible_persona'): no match found.
        MyDeepAgentError (human_required, 'persona_blocked_by_user'): all candidates blocked.
        MyDeepAgentError (human_required, 'backend_unavailable'): backend not in environment.
        MyDeepAgentError (human_required, 'model_unavailable'): openrouter model is blank.
    """
    _override = override or BindingOverride.parse(None)
    consented_pool = filter_consented_personas(persona_pool, consent_store)
    bindings: dict[str, Binding] = {}

    for role in template.roles:
        eligible: list[Persona] = [
            p for p in consented_pool if is_persona_eligible_for_role(p, role, template)[0]
        ]

        if role.id in _override.persona_pinned:
            chosen = _resolve_override(
                role,
                template,
                _override.persona_pinned[role.id],
                eligible,
                persona_pool,
                consent_store,
            )
        else:
            chosen = _resolve_auto(role, template, eligible, persona_pool, consent_store)

        # Backend availability check
        if not available_backends.is_available(chosen.backend):
            raise MyDeepAgentError.human_required(
                "backend_unavailable",
                message=(
                    f"backend '{chosen.backend.value}' is not available "
                    f"for persona '{chosen.name}@{chosen.version}'"
                ),
                recovery_hint=_backend_recovery_hint(chosen.backend),
            )

        # Openrouter model non-empty check
        if chosen.backend == Backend.OPENROUTER and not chosen.model.strip():
            raise MyDeepAgentError.human_required(
                "model_unavailable",
                message=(
                    f"persona '{chosen.name}@{chosen.version}' "
                    "has empty model for openrouter backend"
                ),
                recovery_hint=(
                    "set `model:` field in the persona yaml "
                    "(e.g. 'openrouter:deepseek/deepseek-chat')"
                ),
            )

        binding_hash = sha256(
            {
                "role_id": role.id,
                "template_name": template.name,
                "template_version": template.version,
                "persona_hash": chosen.compute_hash(),
                "backend": chosen.backend.value,
            }
        )
        bindings[role.id] = Binding(role_id=role.id, persona=chosen, binding_hash=binding_hash)

    return bindings


def _backend_recovery_hint(backend: Backend) -> str:
    if backend == Backend.OPENROUTER:
        return "run `mydeepagent login openrouter` to register an API key"
    if backend in (Backend.ANTHROPIC, Backend.OPENAI, Backend.GOOGLE):
        return f"run `mydeepagent login {backend.value}` to register an API key"
    if backend == Backend.FAKE:
        return (
            "the 'fake' backend is for tests only; "
            "add Backend.FAKE to the BackendAvailability set in your test harness"
        )
    return f"enable backend '{backend.value}' in config and ensure prerequisites"
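The six resolution steps in the docstring above can be sketched for a single role as follows. This is a minimal, self-contained illustration rather than the module's actual logic: `SketchPersona`, `SketchBindingError`, and `resolve_one` are hypothetical stand-ins, eligibility is reduced to a capability subset check, and alphabetical order stands in for the real auto-select ordering.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class SketchPersona:
    """Hypothetical stand-in for Persona with only the fields this sketch needs."""

    name: str
    backend: str
    model: str
    capabilities: frozenset
    blocked: bool = False


class SketchBindingError(Exception):
    """Hypothetical error that carries the same string codes as the docstring."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


def resolve_one(
    required: frozenset,
    pool: List[SketchPersona],
    available_backends: Set[str],
    pinned: Optional[str] = None,
) -> SketchPersona:
    # 1. Consent filter: drop personas the user has blocked.
    consented = [p for p in pool if not p.blocked]
    if pool and not consented:
        raise SketchBindingError("persona_blocked_by_user")
    # 2. Eligibility filter (assumption: capability subset check only).
    eligible = [p for p in consented if required <= p.capabilities]
    # 3. A pinned override must name a persona that is still eligible.
    if pinned is not None:
        eligible = [p for p in eligible if p.name == pinned]
    if not eligible:
        raise SketchBindingError("no_eligible_persona")
    # 4. Deterministic pick (assumption: alphabetical, not the real ordering).
    chosen = min(eligible, key=lambda p: p.name)
    # 5. Backend availability.
    if chosen.backend not in available_backends:
        raise SketchBindingError("backend_unavailable")
    # 6. OpenRouter personas need a non-blank model string.
    if chosen.backend == "openrouter" and not chosen.model.strip():
        raise SketchBindingError("model_unavailable")
    return chosen


def error_code(*args, **kwargs) -> Optional[str]:
    """Return the error code raised by resolve_one, or None on success."""
    try:
        resolve_one(*args, **kwargs)
    except SketchBindingError as e:
        return e.code
    return None
```

Validation runs after selection, so a pinned persona with an unavailable backend fails loudly instead of silently falling back to another candidate.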
0
my-deepagent/src/my_deepagent/cli/__init__.py
Normal file
1
my-deepagent/src/my_deepagent/cli/doctor.py
Normal file
@@ -0,0 +1 @@
"""CLI doctor command for environment diagnostics. Implemented in Step 12."""
1
my-deepagent/src/my_deepagent/cli/interactive.py
Normal file
@@ -0,0 +1 @@
"""CLI interactive subcommand. Implemented in Step 10."""
1
my-deepagent/src/my_deepagent/cli/main.py
Normal file
@@ -0,0 +1 @@
"""Typer CLI entry point. Filled in Step 6."""
1
my-deepagent/src/my_deepagent/cli/run.py
Normal file
@@ -0,0 +1 @@
"""CLI run command implementation. Implemented in Step 6."""
1
my-deepagent/src/my_deepagent/cli/seed.py
Normal file
@@ -0,0 +1 @@
"""CLI seed command for importing persona/workflow YAML assets. Implemented in Step 6."""
1
my-deepagent/src/my_deepagent/cli/stats.py
Normal file
@@ -0,0 +1 @@
"""CLI stats command for usage summary. Implemented in Step 12."""
109
my-deepagent/src/my_deepagent/config.py
Normal file
@@ -0,0 +1,109 @@
"""Application configuration loaded from env, .env, and TOML file via pydantic-settings."""

from __future__ import annotations

from pathlib import Path
from typing import Literal

from platformdirs import PlatformDirs
from pydantic import Field, ValidationError, field_validator
from pydantic_settings import (
    BaseSettings,
    PydanticBaseSettingsSource,
    SettingsConfigDict,
    TomlConfigSettingsSource,
)

from .enums import ErrorClass
from .errors import MyDeepAgentError

_DIRS = PlatformDirs("my-deepagent", "user", roaming=False)


class Config(BaseSettings):
    """Frozen application config. Source priority (high -> low): CLI/env, .env, TOML, defaults."""

    model_config = SettingsConfigDict(
        env_prefix="MYDEEPAGENT_",
        env_file=".env",
        env_file_encoding="utf-8",
        toml_file=Path(_DIRS.user_config_dir) / "config.toml",
        frozen=True,
        extra="ignore",
    )

    # storage
    database_url: str = Field(
        default_factory=lambda: (
            f"sqlite+aiosqlite:///{Path(_DIRS.user_data_dir) / 'database.sqlite3'}"
        )
    )
    workspace_root: Path = Field(default_factory=Path.cwd)
    data_dir: Path = Field(default_factory=lambda: Path(_DIRS.user_data_dir))
    config_dir: Path = Field(default_factory=lambda: Path(_DIRS.user_config_dir))
    state_dir: Path = Field(default_factory=lambda: Path(_DIRS.user_state_dir))

    # logging / i18n
    log_level: Literal["trace", "debug", "info", "warn", "error"] = "info"
    lang: Literal["ko", "en"] = "ko"

    # providers
    openrouter_api_key: str | None = None
    openrouter_base_url: str = "https://openrouter.ai/api/v1"

    # observability
    langsmith_tracing: bool = False
    langsmith_api_key: str | None = None
    langsmith_project: str = "my-deepagent"

    # budget
    budget_daily_usd: float = Field(default=5.0, ge=0)
    budget_daily_warn_usd: float = Field(default=3.0, ge=0)
    budget_run_usd: float = Field(default=1.0, ge=0)
    budget_run_warn_usd: float = Field(default=0.5, ge=0)
    budget_on_hit: Literal["prompt", "block", "warn_continue"] = "prompt"

    # defaults
    default_persona: str = "default-interactive"

    @field_validator("workspace_root", "data_dir", "config_dir", "state_dir")
    @classmethod
    def _expand(cls, v: Path) -> Path:
        return Path(v).expanduser().resolve()

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> tuple[PydanticBaseSettingsSource, ...]:
        # priority: init > env > dotenv > toml > file secrets > defaults
        return (
            init_settings,
            env_settings,
            dotenv_settings,
            TomlConfigSettingsSource(settings_cls),
            file_secret_settings,
        )


def load_config(**overrides: object) -> Config:
    """Load Config with optional kwargs override.

    Wraps pydantic ValidationError in MyDeepAgentError(fatal, config_invalid) per plan §18.
    """
    try:
        return Config(**overrides)  # type: ignore[arg-type]
    except ValidationError as e:
        raise MyDeepAgentError(
            ErrorClass.FATAL,
            "config_invalid",
            message=f"config validation failed: {e}",
            recovery_hint=(
                "check .env, environment variables, and ~/.config/my-deepagent/config.toml"
            ),
            cause=e,
        ) from e
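The source ordering returned by `settings_customise_sources` means that the first source to define a field wins. The following is a stdlib-only sketch of that precedence rule, not of pydantic-settings itself: `strip_prefix` and `layered` are hypothetical helpers, values are left as raw strings (no type coercion), and the file-secrets source is omitted for brevity.

```python
from collections import ChainMap
from typing import Any, Dict, Mapping

PREFIX = "MYDEEPAGENT_"  # same prefix as env_prefix in Config above


def strip_prefix(env: Mapping[str, str]) -> Dict[str, str]:
    """Keep only prefixed variables and map them to lower-case field names."""
    return {k[len(PREFIX):].lower(): v for k, v in env.items() if k.startswith(PREFIX)}


def layered(
    init: Mapping[str, Any],
    env: Mapping[str, str],
    dotenv: Mapping[str, str],
    toml: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> Dict[str, Any]:
    """Merge sources so that init > env > dotenv > toml > defaults."""
    # ChainMap looks keys up left to right, so the first mapping that has a key wins.
    return dict(
        ChainMap(dict(init), strip_prefix(env), strip_prefix(dotenv), dict(toml), dict(defaults))
    )


merged = layered(
    init={"lang": "en"},
    env={"MYDEEPAGENT_LANG": "ko", "MYDEEPAGENT_LOG_LEVEL": "debug", "HOME": "/home/u"},
    dotenv={"MYDEEPAGENT_LOG_LEVEL": "warn", "MYDEEPAGENT_BUDGET_RUN_USD": "2.0"},
    toml={"budget_run_usd": 0.25, "default_persona": "reviewer"},
    defaults={"lang": "ko", "log_level": "info", "budget_run_usd": 1.0},
)
```

Here `lang` comes from the init kwargs, `log_level` from the environment, `budget_run_usd` from the dotenv mapping, and `default_persona` from the TOML mapping.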
1
my-deepagent/src/my_deepagent/engine.py
Normal file
@@ -0,0 +1 @@
"""LangGraph run engine orchestrator. Implemented in Step 7."""
92
my-deepagent/src/my_deepagent/enums.py
Normal file
@@ -0,0 +1,92 @@
"""All closed-set enums used across the codebase."""

from enum import StrEnum


class Backend(StrEnum):
    OPENROUTER = "openrouter"
    ANTHROPIC = "anthropic"
    OPENAI = "openai"
    GOOGLE = "google"
    FAKE = "fake"


class Capability(StrEnum):
    SPEC_WRITE = "spec_write"
    PHASE_PLANNING = "phase_planning"
    TASK_DAG_PLANNING = "task_dag_planning"
    CODE_EDIT = "code_edit"
    TEST_FIRST_DEVELOPMENT = "test_first_development"
    CODE_REVIEW = "code_review"
    EVIDENCE_CHECK = "evidence_check"
    COMMAND_EXECUTE = "command_execute"
    BACKTEST_RUN = "backtest_run"
    METRIC_EXTRACT = "metric_extract"
    FAILURE_MINING = "failure_mining"
    OBJECTIVE_EVAL = "objective_eval"
    FINAL_REPORT_COMPOSE = "final_report_compose"


class RiskLevel(StrEnum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class ApprovalDecisionAction(StrEnum):
    APPROVE = "approve"
    REJECT = "reject"
    REQUEST_CHANGES = "request_changes"
    ABORT = "abort"


class ApprovalState(StrEnum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    CHANGES_REQUESTED = "changes_requested"
    ABORTED = "aborted"
    PAUSED = "paused"


class RunState(StrEnum):
    CREATED = "created"
    BOUND = "bound"
    PLANNING = "planning"
    AWAITING_APPROVAL = "awaiting_approval"
    EXECUTING = "executing"
    PAUSED = "paused"
    COMPLETED = "completed"
    FAILED = "failed"
    ABORTED = "aborted"


class RunPhaseState(StrEnum):
    PENDING = "pending"
    RUNNING = "running"
    AWAITING_ARTIFACT = "awaiting_artifact"
    VALIDATING = "validating"
    AWAITING_APPROVAL = "awaiting_approval"
    COMPLETED = "completed"
    FAILED = "failed"
    SKIPPED = "skipped"


class SessionState(StrEnum):
    CREATED = "CREATED"
    BOOTSTRAPPING = "BOOTSTRAPPING"
    READY = "READY"
    BUSY = "BUSY"
    WAITING_FOR_APPROVAL = "WAITING_FOR_APPROVAL"
    ARTIFACT_TIMEOUT = "ARTIFACT_TIMEOUT"
    HUNG = "HUNG"
    CRASHED = "CRASHED"
    RESUMING = "RESUMING"
    REBOOTSTRAPPED = "REBOOTSTRAPPED"
    FAILED_NEEDS_HUMAN = "FAILED_NEEDS_HUMAN"


class ErrorClass(StrEnum):
    RECOVERABLE = "recoverable"
    HUMAN_REQUIRED = "human_required"
    FATAL = "fatal"
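Because every enum above derives from `StrEnum`, members behave like their string values when compared, formatted, or serialized. A small sketch of that behaviour follows; `SketchRunState` and `parse_state` are hypothetical names used only here, and the `ImportError` fallback is an assumption for interpreters older than Python 3.11.

```python
import json

try:
    from enum import StrEnum  # available from Python 3.11
except ImportError:  # assumption: minimal fallback for older interpreters
    from enum import Enum

    class StrEnum(str, Enum):
        def __str__(self) -> str:
            return str(self.value)


class SketchRunState(StrEnum):
    """Hypothetical subset of RunState, for illustration only."""

    CREATED = "created"
    PAUSED = "paused"
    COMPLETED = "completed"


def parse_state(raw: str) -> SketchRunState:
    """Look a member up by value; an unknown string raises ValueError."""
    return SketchRunState(raw)


def is_known_state(raw: str) -> bool:
    """Return True when raw is one of the closed set of values."""
    try:
        parse_state(raw)
    except ValueError:
        return False
    return True


# Members compare equal to plain strings and serialize as plain strings.
payload = json.dumps({"state": SketchRunState.PAUSED})
```

One consequence of the closed set is that a stray value read from a database row or YAML file is rejected when parsed instead of propagating further.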
79
my-deepagent/src/my_deepagent/errors.py
Normal file
@@ -0,0 +1,79 @@
"""Domain errors. All exceptions raised by my-deepagent inherit from MyDeepAgentError."""

from __future__ import annotations

from uuid import UUID

from .enums import ErrorClass


class MyDeepAgentError(Exception):
    """Base error with structured fields for classification, recovery hint, and context."""

    def __init__(
        self,
        error_class: ErrorClass,
        code: str,
        *,
        message: str | None = None,
        run_id: UUID | None = None,
        phase_id: UUID | None = None,
        recovery_hint: str | None = None,
        cause: BaseException | None = None,
    ) -> None:
        super().__init__(message or code)
        self.error_class = error_class
        self.code = code
        self.run_id = run_id
        self.phase_id = phase_id
        self.recovery_hint = recovery_hint
        if cause is not None:
            self.__cause__ = cause
            self.__suppress_context__ = True

    def __repr__(self) -> str:
        parts = [f"class={self.error_class}", f"code={self.code}"]
        if self.run_id is not None:
            parts.append(f"run_id={self.run_id}")
        if self.phase_id is not None:
            parts.append(f"phase_id={self.phase_id}")
        if self.recovery_hint:
            parts.append(f"hint={self.recovery_hint!r}")
        return f"MyDeepAgentError({', '.join(parts)})"

    @classmethod
    def recoverable(cls, code: str, **kwargs: object) -> MyDeepAgentError:
        return MyDeepAgentError(ErrorClass.RECOVERABLE, code, **kwargs)  # type: ignore[arg-type]

    @classmethod
    def human_required(cls, code: str, **kwargs: object) -> MyDeepAgentError:
        return MyDeepAgentError(ErrorClass.HUMAN_REQUIRED, code, **kwargs)  # type: ignore[arg-type]

    @classmethod
    def fatal(cls, code: str, **kwargs: object) -> MyDeepAgentError:
        return MyDeepAgentError(ErrorClass.FATAL, code, **kwargs)  # type: ignore[arg-type]


class BudgetExhaustedError(MyDeepAgentError):
    """Budget cap hit. Raised by BudgetTracker.assert_can_call when on_hit='block'."""

    def __init__(
        self,
        scope: str,
        projected_usd: float,
        cap_usd: float,
        *,
        run_id: UUID | None = None,
        recovery_hint: str | None = None,
    ) -> None:
        super().__init__(
            ErrorClass.HUMAN_REQUIRED,
            "budget_exhausted",
            message=f"budget '{scope}' exhausted: projected={projected_usd:.4f} cap={cap_usd:.4f}",
            run_id=run_id,
            recovery_hint=recovery_hint
            or f"wait until the next period or extend the cap for scope '{scope}'",
        )
        self.scope = scope
        self.projected_usd = projected_usd
        self.cap_usd = cap_usd
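The `cause` handling in `MyDeepAgentError.__init__` follows the PEP 3134 chaining model: setting `__cause__` marks the original exception as the direct cause, and `__suppress_context__` hides the implicit context from rendered tracebacks. A minimal sketch of the effect, using a hypothetical `SketchError` rather than the real class:

```python
import traceback
from typing import Optional


class SketchError(Exception):
    """Hypothetical miniature of a domain error that records an explicit cause."""

    def __init__(self, code: str, *, cause: Optional[BaseException] = None) -> None:
        super().__init__(code)
        self.code = code
        if cause is not None:
            # Same two assignments as in MyDeepAgentError.__init__ above.
            self.__cause__ = cause
            self.__suppress_context__ = True


def wrap_parse_failure(raw: str) -> Optional[SketchError]:
    """Return a SketchError wrapping the ValueError from int(raw), or None if raw parses."""
    try:
        int(raw)
    except ValueError as e:
        return SketchError("config_invalid", cause=e)
    return None


def render(err: BaseException) -> str:
    """Format an exception and its chained cause the way an uncaught error would print."""
    return "".join(traceback.format_exception(type(err), err, err.__traceback__))


err = wrap_parse_failure("not-a-number")
```

Rendering `err` prints the original `ValueError` first, followed by the standard "direct cause" separator and then the wrapping error, so the underlying failure is not lost when the domain error is logged.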
28
my-deepagent/src/my_deepagent/hash.py
Normal file
@@ -0,0 +1,28 @@
"""Canonical JSON serialization + sha256 hashing for content-addressed identity."""

from __future__ import annotations

import hashlib
import json
from typing import Any


def canonicalize(value: Any) -> str:
    """Return canonical JSON: keys sorted, no insignificant whitespace, UTF-16 code unit order.

    json.dumps with sort_keys=True sorts keys with Python's str comparison, which is by Unicode
    code point. For ASCII keys this is identical to UTF-16 code unit order, which is what we
    want. For keys with characters outside the BMP the two can differ (documented approximation).
    """
    return json.dumps(
        value,
        sort_keys=True,
        ensure_ascii=False,
        separators=(",", ":"),
        allow_nan=False,
    )


def sha256(value: Any) -> str:
    """Return sha256 hex digest of canonical JSON of value."""
    return hashlib.sha256(canonicalize(value).encode("utf-8")).hexdigest()
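The point of canonicalizing before hashing is that two dicts with the same content but different key insertion order produce the same digest. The sketch below repeats the same `json.dumps` options under hypothetical names (`canonical_json`, `content_hash`, `try_hash`) so it runs on its own; the sample keys are illustrative.

```python
import hashlib
import json
from typing import Any, Optional


def canonical_json(value: Any) -> str:
    """Serialize with sorted keys and no insignificant whitespace."""
    return json.dumps(
        value, sort_keys=True, ensure_ascii=False, separators=(",", ":"), allow_nan=False
    )


def content_hash(value: Any) -> str:
    """sha256 hex digest of the canonical JSON text, encoded as UTF-8."""
    return hashlib.sha256(canonical_json(value).encode("utf-8")).hexdigest()


def try_hash(value: Any) -> Optional[str]:
    """Return the digest, or None when the value cannot be canonicalized (e.g. NaN)."""
    try:
        return content_hash(value)
    except ValueError:
        return None


a = {"role_id": "planner", "backend": "openrouter"}
b = {"backend": "openrouter", "role_id": "planner"}  # same content, different order
```

With `allow_nan=False`, values such as `float("nan")` raise `ValueError` instead of emitting non-standard JSON, so they can never end up inside a digest.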
0
my-deepagent/src/my_deepagent/i18n/__init__.py
Normal file
0
my-deepagent/src/my_deepagent/i18n/en.toml
Normal file
0
my-deepagent/src/my_deepagent/i18n/ko.toml
Normal file
1
my-deepagent/src/my_deepagent/interactive.py
Normal file
@@ -0,0 +1 @@
"""Interactive REPL loop for TUI sessions. Implemented in Step 10."""
73
my-deepagent/src/my_deepagent/middleware/audit.py
Normal file
@@ -0,0 +1,73 @@
"""AuditToolMiddleware: capture every tool call for audit log + DB.

Records: name, args, result/error, duration.
"""

from __future__ import annotations

import time
from typing import Any
from uuid import UUID

from langchain.agents.middleware import AgentMiddleware


class AuditToolMiddleware(AgentMiddleware):
    """Record every tool invocation for the audit log and DB sink (Step 8)."""

    def __init__(
        self,
        run_id: UUID | None = None,
        phase_id: UUID | None = None,
        interactive_session_id: UUID | None = None,
        recorder: Any | None = None,
    ) -> None:
        super().__init__()
        self.run_id = run_id
        self.phase_id = phase_id
        self.interactive_session_id = interactive_session_id
        self.recorder = recorder

    async def awrap_tool_call(self, request: Any, handler: Any) -> Any:
        started = time.perf_counter()
        # ToolCallRequest exposes tool_call dict with 'name' and 'args'
        tool_call = getattr(request, "tool_call", {}) or {}
        name: str = tool_call.get("name", "unknown") if isinstance(tool_call, dict) else "unknown"
        args: dict[str, Any] = (
            tool_call.get("args", {}) if isinstance(tool_call, dict) else {}
        ) or {}
        try:
            result = await handler(request)
        except Exception as e:
            await self._record(name, args, None, type(e).__name__, started)
            raise
        await self._record(name, args, result, None, started)
        return result

    async def _record(
        self,
        name: str,
        args: dict[str, Any],
        result: Any,
        error: str | None,
        started: float,
    ) -> None:
        if self.recorder is None:
            return
        serializable_result: str | int | float | bool | dict[str, Any] | list[Any] | None
        if isinstance(result, (str, int, float, bool, dict, list)) or result is None:
            serializable_result = result
        else:
            serializable_result = str(result)
        await self.recorder(
            {
                "tool_name": name,
                "args": args,
                "result": serializable_result,
                "error": error,
                "duration_ms": int((time.perf_counter() - started) * 1000),
                "run_id": self.run_id,
                "phase_id": self.phase_id,
                "interactive_session_id": self.interactive_session_id,
            }
        )
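The `recorder` argument is an awaitable callable that receives one dict per tool call. As a rough illustration of that contract, here is a self-contained sketch without langchain: `ListRecorder`, `audited`, and `run_demo` are hypothetical names, and the in-memory list stands in for the DB sink that the docstring says arrives in Step 8.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict, List


class ListRecorder:
    """Hypothetical in-memory sink; a real recorder would write to the database."""

    def __init__(self) -> None:
        self.records: List[Dict[str, Any]] = []

    async def __call__(self, record: Dict[str, Any]) -> None:
        self.records.append(record)


async def audited(
    recorder: Callable[[Dict[str, Any]], Awaitable[None]],
    name: str,
    args: Dict[str, Any],
    handler: Callable[[Dict[str, Any]], Awaitable[Any]],
) -> Any:
    """Run handler(args) and record name, args, result or error, and duration."""
    started = time.perf_counter()

    def entry(result: Any, error: Any) -> Dict[str, Any]:
        elapsed_ms = int((time.perf_counter() - started) * 1000)
        return {"tool_name": name, "args": args, "result": result,
                "error": error, "duration_ms": elapsed_ms}

    try:
        result = await handler(args)
    except Exception as e:
        # Record the failure by exception class name, then re-raise unchanged.
        await recorder(entry(None, type(e).__name__))
        raise
    await recorder(entry(result, None))
    return result


def run_demo() -> List[Dict[str, Any]]:
    """Call one succeeding and one failing tool, returning what was recorded."""
    recorder = ListRecorder()

    async def add_one(args: Dict[str, Any]) -> int:
        return args["x"] + 1

    async def boom(args: Dict[str, Any]) -> None:
        raise RuntimeError("tool failed")

    async def main() -> None:
        await audited(recorder, "add_one", {"x": 1}, add_one)
        try:
            await audited(recorder, "boom", {}, boom)
        except RuntimeError:
            pass

    asyncio.run(main())
    return recorder.records
```

Recording before re-raising means a failed call still leaves an audit entry, which matches the try/except shape of `awrap_tool_call` above.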
87
my-deepagent/src/my_deepagent/middleware/cost.py
Normal file
@@ -0,0 +1,87 @@
"""CostMiddleware: capture every LLM call's usage and accumulate cost into the SQLite ledger."""

from __future__ import annotations

import time
from typing import Any
from uuid import UUID

from langchain.agents.middleware import AgentMiddleware

from ..monitoring.pricing import PricingCache


class CostMiddleware(AgentMiddleware):
    """Wrap every model call. Compute cost from usage_metadata and persist.

    Step 8 wires the DB writer via the recorder callback.
    """

    def __init__(
        self,
        pricing: PricingCache,
        model_name: str,
        run_id: UUID | None = None,
        phase_id: UUID | None = None,
        persona_name: str | None = None,
        recorder: Any | None = None,  # callable(record) -> Awaitable[None] for DB sink (Step 8)
    ) -> None:
        super().__init__()
        self.pricing = pricing
        self.model_name = model_name
        self.run_id = run_id
        self.phase_id = phase_id
        self.persona_name = persona_name
        self.recorder = recorder

    async def awrap_model_call(self, request: Any, handler: Any) -> Any:
        started = time.perf_counter()
        try:
            response = await handler(request)
        except Exception as e:
            await self._record(
                input_tokens=0,
                output_tokens=0,
                latency_ms=int((time.perf_counter() - started) * 1000),
                status="error",
                error_code=type(e).__name__,
            )
            raise
        usage = getattr(response, "usage_metadata", None) or {}
        in_tokens = int(usage.get("input_tokens", 0) or 0)
        out_tokens = int(usage.get("output_tokens", 0) or 0)
        await self._record(
            input_tokens=in_tokens,
            output_tokens=out_tokens,
            latency_ms=int((time.perf_counter() - started) * 1000),
            status="ok",
            error_code=None,
        )
        return response

    async def _record(
        self,
        *,
        input_tokens: int,
        output_tokens: int,
        latency_ms: int,
        status: str,
        error_code: str | None,
    ) -> None:
        if self.recorder is None:
            return
        cost = self.pricing.compute_cost(self.model_name, input_tokens, output_tokens)
        await self.recorder(
            {
                "model": self.model_name,
                "run_id": self.run_id,
                "phase_id": self.phase_id,
                "persona_name": self.persona_name,
                "input_tokens": input_tokens,
                "output_tokens": output_tokens,
                "cost_usd_total": cost,
                "latency_ms": latency_ms,
                "status": status,
                "error_code": error_code,
            }
        )
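The cost recorded per call is a linear function of the two token counts. A worked sketch of that arithmetic follows, using the same defaulting as `awrap_model_call` for missing or `None` usage fields. The helper names are hypothetical and the per-1k prices are made-up numbers chosen only to keep the arithmetic easy to follow, not real OpenRouter prices.

```python
from typing import Any, Mapping, Optional, Tuple


def tokens_from_usage(usage: Optional[Mapping[str, Any]]) -> Tuple[int, int]:
    """Extract (input_tokens, output_tokens), treating missing or None values as 0."""
    usage = usage or {}
    return (
        int(usage.get("input_tokens", 0) or 0),
        int(usage.get("output_tokens", 0) or 0),
    )


def cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_per_1k_usd: float,
    output_per_1k_usd: float,
) -> float:
    """cost = (in / 1000) * input price per 1k + (out / 1000) * output price per 1k."""
    return (input_tokens / 1000.0) * input_per_1k_usd + (
        output_tokens / 1000.0
    ) * output_per_1k_usd


# A call shaped like the smoke test in the commit message: 9 input tokens, 1 output token.
in_tokens, out_tokens = tokens_from_usage({"input_tokens": 9, "output_tokens": 1})
# Hypothetical prices: $0.0001 per 1k input tokens, $0.0002 per 1k output tokens.
example_cost = cost_usd(in_tokens, out_tokens, 0.0001, 0.0002)
```

With these assumed prices the call costs about 9e-7 + 2e-7 = 1.1e-6 USD. An error path that records zero tokens always yields a cost of zero.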
47
my-deepagent/src/my_deepagent/middleware/fallback.py
Normal file
@@ -0,0 +1,47 @@
"""FallbackModelMiddleware: retry the model call with a different model on transient HTTP errors."""

from __future__ import annotations

from typing import Any

import httpx
import openai
from langchain.agents.middleware import AgentMiddleware


class FallbackModelMiddleware(AgentMiddleware):
    """When the primary model raises a transient error, retry once with the fallback model.

    Transient = HTTP 429, 5xx, network errors. Auth (401/AuthenticationError) and bad request
    (400 model_not_found) are not retried; those need human intervention.
    """

    def __init__(self, primary: Any, fallback: Any | None) -> None:
        super().__init__()
        self.primary = primary
        self.fallback = fallback

    async def awrap_model_call(self, request: Any, handler: Any) -> Any:
        try:
            return await handler(request)
        except openai.AuthenticationError:
            # 401 is human_required, not retryable.
            raise
        except (
            httpx.HTTPError,
            openai.RateLimitError,
            openai.APIConnectionError,
            openai.InternalServerError,  # 5xx from the SDK, listed as transient above
        ):
            if self.fallback is None:
                raise
            # Best-effort: swap the model bound to the request and retry once.
            patched = self._with_fallback_model(request)
            return await handler(patched)

    def _with_fallback_model(self, request: Any) -> Any:
        """Swap the bound model in the request for the fallback model.

        ModelRequest exposes a `model` attribute (BaseChatModel instance), which we
        replace with the fallback. The request is mutated in place: ModelRequest is a
        plain dataclass that allows assignment, and the DeprecationWarning raised on
        attribute assignment applies only to ToolCallRequest.
        """
        if hasattr(request, "model"):
            request.model = self.fallback
        return request
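The retry policy described in the class docstring (retry once on transient errors, never on auth errors) can be shown without any SDK. In the sketch below, `SketchTransientError`, `SketchAuthError`, and `call_with_fallback` are hypothetical stand-ins for the openai/httpx exception types and for the middleware hook. It illustrates the control flow only.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional, Tuple


class SketchTransientError(Exception):
    """Stand-in for 429, 5xx, and network errors."""


class SketchAuthError(Exception):
    """Stand-in for a 401 authentication failure."""


async def call_with_fallback(
    call: Callable[[str], Awaitable[Any]],
    primary: str,
    fallback: Optional[str] = None,
) -> Any:
    """Try the primary model; on a transient error retry exactly once with the fallback."""
    try:
        return await call(primary)
    except SketchAuthError:
        # Not retryable: a different model will not fix bad credentials.
        raise
    except SketchTransientError:
        if fallback is None:
            raise
        return await call(fallback)


def demo_transient() -> Tuple[str, List[str]]:
    """Primary fails with a transient error; the fallback answers."""
    attempts: List[str] = []

    async def flaky(model: str) -> str:
        attempts.append(model)
        if model == "primary":
            raise SketchTransientError("429")
        return "ok from " + model

    result = asyncio.run(call_with_fallback(flaky, "primary", "backup"))
    return result, attempts


def demo_auth() -> List[str]:
    """An auth failure propagates after a single attempt, even with a fallback set."""
    attempts: List[str] = []

    async def denied(model: str) -> str:
        attempts.append(model)
        raise SketchAuthError("401")

    try:
        asyncio.run(call_with_fallback(denied, "primary", "backup"))
    except SketchAuthError:
        pass
    return attempts
```

The fallback call is not wrapped in a second try block, so a failure of the fallback model surfaces to the caller unchanged.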
126
my-deepagent/src/my_deepagent/middleware/safety.py
Normal file
@@ -0,0 +1,126 @@
"""SafetyShellMiddleware: destructive command + secret-path enforcement at the tool layer.

Replaces deepagents.FilesystemPermission for personas using LocalShellBackend,
since deepagents 0.6.1 does not yet support permissions + execution-capable backends.
"""

from __future__ import annotations

import re
from pathlib import Path
from typing import Any

from langchain.agents.middleware import AgentMiddleware
from wcmatch import glob as wcglob

from ..errors import MyDeepAgentError

DESTRUCTIVE_PATTERNS: tuple[re.Pattern[str], ...] = tuple(
    re.compile(p, re.IGNORECASE)
    for p in (
        r"\brm\s+-rf\b",
        r"\bgit\s+reset\s+--hard\b",
        r"\bgit\s+clean\b",
        r"\bgit\s+push\s+--force(-with-lease)?\b",
        r"\bgit\s+branch\s+-D\b",
        r"\bdocker\s+volume\s+rm\b",
        r"\bdocker\s+compose\s+down\s+-v\b",
        r"\bDROP\s+(DATABASE|SCHEMA|TABLE)\b",
    )
)

# Mirrors session.DEFAULT_DENY_PATHS but as relative glob patterns for wcmatch.
# Each sensitive directory is listed twice: once for the directory itself (no trailing
# slash, since Path normalises it away) and once for everything inside it (**).
DENY_PATH_PATTERNS: tuple[str, ...] = (
    "**/.env*",
    "**/*.env*",
    "**/*token*",
    "**/*secret*",
    "**/*credential*",
    "**/*.pem",
    "**/*.key",
    "**/.ssh",
    "**/.ssh/**",
    "**/.aws",
    "**/.aws/**",
    "**/.config/gcloud",
    "**/.config/gcloud/**",
    "**/.kube",
    "**/.kube/**",
    "**/.gnupg",
    "**/.gnupg/**",
)

_PATH_TOOLS: frozenset[str] = frozenset({"read_file", "write_file", "edit_file", "ls"})

# Tool names that carry shell commands.
_SHELL_TOOL_NAMES: frozenset[str] = frozenset({"shell", "execute", "run_command"})

_GLOB_FLAGS = wcglob.GLOBSTAR | wcglob.IGNORECASE | wcglob.DOTGLOB


def _is_denied_path(path: str) -> bool:
    """Return True iff the path matches any deny glob pattern."""
    normalized = str(Path(path)).replace("\\", "/").lstrip("/")
    for pat in DENY_PATH_PATTERNS:
        if wcglob.globmatch(normalized, pat, flags=_GLOB_FLAGS):
            return True
    return False


class SafetyShellMiddleware(AgentMiddleware):
    """Hard-block destructive shell commands and secret-path file ops at the tool layer."""

    async def awrap_tool_call(self, request: Any, handler: Any) -> Any:
        name = self._tool_name(request)
        args = self._tool_args(request)
        if name in _SHELL_TOOL_NAMES:
            self._check_shell(args)
        elif name in _PATH_TOOLS:
            self._check_path(name, args)
        return await handler(request)

    @staticmethod
    def _tool_name(request: Any) -> str:
        tool_call = getattr(request, "tool_call", None)
        if isinstance(tool_call, dict):
            return str(tool_call.get("name") or "")
        return str(getattr(request, "name", "") or "")

    @staticmethod
    def _tool_args(request: Any) -> dict[str, Any]:
        tool_call = getattr(request, "tool_call", None)
        if isinstance(tool_call, dict):
            return dict(tool_call.get("args") or {})
        args = getattr(request, "args", None)
        return dict(args) if isinstance(args, dict) else {}

    def _check_shell(self, args: dict[str, Any]) -> None:
        cmd = args.get("command") or args.get("argv") or ""
        if isinstance(cmd, list):
            cmd = " ".join(str(x) for x in cmd)
        cmd_str = str(cmd)
        for pat in DESTRUCTIVE_PATTERNS:
            if pat.search(cmd_str):
                raise MyDeepAgentError.human_required(
                    "destructive_command_blocked",
                    message=f"destructive shell command blocked: {cmd_str[:120]}",
                    recovery_hint=(
                        "this command is hard-blocked by my-deepagent's safety policy; "
                        "edit the persona system_prompt to avoid suggesting it"
                    ),
                )

    def _check_path(self, tool_name: str, args: dict[str, Any]) -> None:
        path = args.get("file_path") or args.get("path") or args.get("file") or ""
        if not isinstance(path, str) or not path:
            return
        if _is_denied_path(path):
            raise MyDeepAgentError.human_required(
                "secret_access_blocked",
                message=(f"access to secret-bearing path blocked: tool={tool_name} path={path!r}"),
                recovery_hint=(
                    "this path matches a hard-blocked deny pattern (e.g. .env, *.key, .ssh/, .aws/)"
                ),
            )
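To see what the destructive-command check does and does not catch, the sketch below reuses three of the patterns from `DESTRUCTIVE_PATTERNS` with only the standard library. `SKETCH_PATTERNS` and `is_destructive` are hypothetical names. The path deny-list is left out because it depends on `wcmatch`.

```python
import re
from typing import List, Union

# Three patterns copied from DESTRUCTIVE_PATTERNS above; the subset is arbitrary.
SKETCH_PATTERNS = tuple(
    re.compile(p, re.IGNORECASE)
    for p in (
        r"\brm\s+-rf\b",
        r"\bgit\s+push\s+--force(-with-lease)?\b",
        r"\bDROP\s+(DATABASE|SCHEMA|TABLE)\b",
    )
)


def is_destructive(command: Union[str, List[str]]) -> bool:
    """Return True when any pattern matches the command (argv lists are space-joined)."""
    if isinstance(command, list):
        command = " ".join(str(x) for x in command)
    return any(p.search(str(command)) for p in SKETCH_PATTERNS)


checks = {
    "rm -rf /tmp/build": is_destructive("rm -rf /tmp/build"),
    "argv git push --force": is_destructive(["git", "push", "--force"]),
    "drop table users": is_destructive("psql -c 'drop table users'"),
    "git push origin main": is_destructive("git push origin main"),
    # Flag reordering is not covered by the literal `-rf` pattern.
    "rm -fr build": is_destructive("rm -fr build"),
}
```

The last entry evaluates to `False`: a regex deny-list like this matches specific spellings, so variants such as `rm -fr` or `rm -r -f` pass through. That is a property of the pattern list shown here, and a reader extending it may want to add those variants.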
1
my-deepagent/src/my_deepagent/monitoring/langsmith.py
Normal file
@@ -0,0 +1 @@
"""LangSmith tracing integration helpers. Implemented in Step 12."""
99
my-deepagent/src/my_deepagent/monitoring/pricing.py
Normal file
99
my-deepagent/src/my_deepagent/monitoring/pricing.py
Normal file
@@ -0,0 +1,99 @@
|
|||||||
|
"""OpenRouter model pricing cache + cost computation.
|
||||||
|
|
||||||
|
v0.1.0: in-process dict cache + optional DB refresh. doctor와 background refresh가
|
||||||
|
업데이트 trigger (Step 12).
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from dataclasses import dataclass
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
|
||||||
|
from ..errors import MyDeepAgentError
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class ModelPrice:
|
||||||
|
model: str # OpenRouter id, e.g. "deepseek/deepseek-chat"
|
||||||
|
input_per_1k_usd: float
|
||||||
|
output_per_1k_usd: float
|
||||||
|
context_length: int
|
||||||
|
|
||||||
|
|
||||||
|
class PricingCache:
|
||||||
|
"""In-memory cache of OpenRouter pricing. Caller refreshes via fetch_openrouter_pricing()."""
|
||||||
|
|
||||||
|
def __init__(self) -> None:
|
||||||
|
self._cache: dict[str, ModelPrice] = {}
|
||||||
|
|
||||||
|
def get(self, model: str) -> ModelPrice | None:
|
||||||
|
key = model.removeprefix("openrouter:")
|
||||||
|
return self._cache.get(key)
|
||||||
|
|
||||||
|
def set(self, prices: list[ModelPrice]) -> None:
|
||||||
|
for p in prices:
|
||||||
|
self._cache[p.model] = p
|
||||||
|
|
||||||
|
def compute_cost(self, model: str, input_tokens: int, output_tokens: int) -> float:
|
||||||
|
"""Return USD cost. Returns 0.0 if model price is unknown (logged separately)."""
|
||||||
|
price = self.get(model)
|
||||||
|
if price is None:
|
||||||
|
return 0.0
|
||||||
|
return (input_tokens / 1000.0) * price.input_per_1k_usd + (
|
||||||
|
output_tokens / 1000.0
|
||||||
|
) * price.output_per_1k_usd
|
||||||
|
|
||||||
|
|
||||||
|
async def fetch_openrouter_pricing(api_key: str, base_url: str) -> list[ModelPrice]:
    """Fetch the OpenRouter /models endpoint and parse pricing."""
    async with httpx.AsyncClient(timeout=10.0) as client:
        try:
            r = await client.get(
                f"{base_url}/models",
                headers={"Authorization": f"Bearer {api_key}"},
            )
            r.raise_for_status()
        except httpx.HTTPError as e:
            raise MyDeepAgentError.recoverable(
                "network_blip",
                message=f"failed to fetch openrouter pricing: {e}",
                cause=e,
            ) from e
        data: dict[str, object] = r.json()
        return _parse_pricing_payload(data)


def _parse_pricing_payload(data: dict[str, object]) -> list[ModelPrice]:
    """Parse OpenRouter response.

    Expected format::

        {"data": [{"id": "...", "pricing": {"prompt": "...", "completion": "..."}, ...}]}
    """
    models = data.get("data", [])
    if not isinstance(models, list):
        return []
    out: list[ModelPrice] = []
    for m in models:
        if not isinstance(m, dict):
            continue
        model_id = m.get("id")
        pricing = m.get("pricing") or {}
        if not isinstance(model_id, str) or not isinstance(pricing, dict):
            continue
        try:
            prompt_per_token = float(pricing.get("prompt", "0") or "0")
            completion_per_token = float(pricing.get("completion", "0") or "0")
            ctx_len = int(m.get("context_length", 0) or 0)
        except (TypeError, ValueError):
            continue
        out.append(
            ModelPrice(
                model=model_id,
                input_per_1k_usd=prompt_per_token * 1000.0,
                output_per_1k_usd=completion_per_token * 1000.0,
                context_length=ctx_len,
            )
        )
    return out
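The parser above converts per-token price strings into per-1k figures and silently skips malformed entries. A minimal standalone sketch of that behaviour follows; the `per_1k_prices` helper, the model ids, and the payload values are invented for illustration and do not come from a real OpenRouter response.

```python
from __future__ import annotations


def per_1k_prices(payload: dict) -> dict[str, tuple[float, float]]:
    """Map model id -> (input, output) USD per 1k tokens, skipping bad entries."""
    out: dict[str, tuple[float, float]] = {}
    models = payload.get("data", [])
    if not isinstance(models, list):
        return out
    for m in models:
        if not isinstance(m, dict) or not isinstance(m.get("id"), str):
            continue
        pricing = m.get("pricing") or {}
        if not isinstance(pricing, dict):
            continue
        try:
            prompt = float(pricing.get("prompt", "0") or "0")
            completion = float(pricing.get("completion", "0") or "0")
        except (TypeError, ValueError):
            continue  # e.g. a non-numeric price string
        out[m["id"]] = (prompt * 1000.0, completion * 1000.0)
    return out


# Hypothetical payload: one valid entry, one bad price, one non-dict item.
sample = {
    "data": [
        {"id": "example/model-a", "pricing": {"prompt": "0.000002", "completion": "0.000004"}},
        {"id": "example/model-b", "pricing": {"prompt": "not-a-number"}},
        "garbage",
    ]
}
prices = per_1k_prices(sample)
```

Only `example/model-a` survives; the other two entries are dropped without raising, which matches the skip-on-error style of the parser.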
1
my-deepagent/src/my_deepagent/monitoring/stats.py
Normal file
@@ -0,0 +1 @@
"""Run statistics aggregation and reporting. Implemented in Step 12."""
6
my-deepagent/src/my_deepagent/persistence/__init__.py
Normal file
@@ -0,0 +1,6 @@
"""Persistence layer: SQLAlchemy async ORM + LangGraph checkpointer."""

from .checkpointer import get_checkpointer_ctx
from .db import Database

__all__ = ["Database", "get_checkpointer_ctx"]
41
my-deepagent/src/my_deepagent/persistence/checkpointer.py
Normal file
@@ -0,0 +1,41 @@
"""LangGraph SqliteSaver wrapper. Use only as a context manager to ensure connection cleanup.

``SqliteSaver.from_conn_string`` is a ``@contextmanager`` classmethod that yields
a ``SqliteSaver`` instance and closes the underlying sqlite3 connection on exit.
Direct manual lifecycle management (entering context without ``with``) leaks connections
and is not supported by this module.

Usage::

    with get_checkpointer_ctx(path) as saver:
        graph = create_deep_agent(checkpointer=saver)
        ...
"""

from __future__ import annotations

from collections.abc import Iterator
from contextlib import contextmanager
from pathlib import Path

from langgraph.checkpoint.sqlite import SqliteSaver


@contextmanager
def get_checkpointer_ctx(checkpoints_db_path: Path) -> Iterator[SqliteSaver]:
    """Yield a SqliteSaver bound to *checkpoints_db_path*.

    Creates the parent directory and the database file if they do not exist.
    The underlying sqlite3 connection is closed automatically on context exit.
    This is the only supported way to obtain a SqliteSaver in this project;
    direct manual lifecycle management is not provided.

    Args:
        checkpoints_db_path: Filesystem path for the SQLite checkpoint database.

    Yields:
        SqliteSaver: Ready-to-use LangGraph checkpoint saver.
    """
    checkpoints_db_path.parent.mkdir(parents=True, exist_ok=True)
    with SqliteSaver.from_conn_string(str(checkpoints_db_path)) as saver:
        yield saver
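The context-manager-only lifecycle can be sketched with the standard library alone. The `open_db` helper below is a hypothetical analogue of `get_checkpointer_ctx` that uses plain `sqlite3` instead of `SqliteSaver`, so it runs without LangGraph installed; it is not the project's code.

```python
import sqlite3
import tempfile
from collections.abc import Iterator
from contextlib import closing, contextmanager
from pathlib import Path


@contextmanager
def open_db(db_path: Path) -> Iterator[sqlite3.Connection]:
    """Hypothetical analogue of get_checkpointer_ctx using plain sqlite3."""
    db_path.parent.mkdir(parents=True, exist_ok=True)
    # closing() guarantees conn.close() even if the caller's block raises.
    with closing(sqlite3.connect(str(db_path))) as conn:
        yield conn


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "nested" / "checkpoints.sqlite3"
    with open_db(path) as conn:
        conn.execute("CREATE TABLE t (x INTEGER)")
        created = path.exists()
    # Once the block exits the connection is closed; further use raises.
    try:
        conn.execute("SELECT 1")
        closed = False
    except sqlite3.ProgrammingError:
        closed = True
```

Tying the connection's lifetime to a `with` block means there is no code path on which a caller can forget to close it.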
91
my-deepagent/src/my_deepagent/persistence/db.py
Normal file
@@ -0,0 +1,91 @@
"""Async SQLAlchemy engine + session factory with WAL mode and busy_timeout."""

from __future__ import annotations

from collections.abc import AsyncIterator
from contextlib import asynccontextmanager

from sqlalchemy import event
from sqlalchemy.ext.asyncio import (
    AsyncEngine,
    AsyncSession,
    async_sessionmaker,
    create_async_engine,
)

from .models import Base


def _attach_sqlite_pragmas(engine: AsyncEngine) -> None:
    """Attach a synchronous connect-event listener that enables WAL, busy_timeout, FK."""

    @event.listens_for(engine.sync_engine, "connect")
    def _set_sqlite_pragma(dbapi_connection: object, _conn_record: object) -> None:
        # dbapi_connection is the DBAPI connection delivered by SQLAlchemy's
        # pool event callback. The signature uses `object` to match the generic
        # listener protocol; the annotated assignment below exists only so type
        # checkers allow the DBAPI calls. The PRAGMA statements are
        # SQLite-specific and are not valid on other dialects.
        import sqlite3  # local import to avoid circular or non-SQLite coupling

        conn: sqlite3.Connection = dbapi_connection  # type: ignore[assignment]
        cursor = conn.cursor()
        cursor.execute("PRAGMA journal_mode=WAL")
        cursor.execute("PRAGMA busy_timeout=5000")
        cursor.execute("PRAGMA foreign_keys=ON")
        cursor.close()


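To see what the three PRAGMA statements do, here is a small sketch using only the standard-library `sqlite3` module against a temporary file. The file name is arbitrary and the snippet is independent of SQLAlchemy; it only illustrates the settings the listener applies.

```python
import sqlite3
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    conn = sqlite3.connect(str(Path(tmp) / "demo.sqlite3"))
    cur = conn.cursor()

    # journal_mode=WAL reports the resulting mode as a one-row result.
    cur.execute("PRAGMA journal_mode=WAL")
    journal_mode = cur.fetchone()[0]

    # Wait up to 5000 ms for a lock before reporting "database is locked".
    cur.execute("PRAGMA busy_timeout=5000")

    # Foreign key enforcement is off by default and is set per connection.
    cur.execute("PRAGMA foreign_keys=ON")

    busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
    foreign_keys = conn.execute("PRAGMA foreign_keys").fetchone()[0]
    cur.close()
    conn.close()
```

Because `foreign_keys` applies only to the connection that sets it, the listener has to run on every new connection rather than once per database file.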
class Database:
    """Façade over async engine + session maker.

    Usage::

        db = Database("sqlite+aiosqlite:///path/to/db.sqlite3")
        await db.init_schema()  # dev/test only: create all tables directly
        async with db.session() as s:
            result = await s.execute(...)
        await db.dispose()

    For production deployments, call ``alembic upgrade head`` instead of
    ``init_schema`` so that migration history is tracked.
    """

    def __init__(self, database_url: str) -> None:
        self._engine: AsyncEngine = create_async_engine(
            database_url,
            # poolclass=None lets SQLAlchemy choose the dialect's default pool.
            poolclass=None,
            echo=False,
        )
        _attach_sqlite_pragmas(self._engine)
        self._session_factory: async_sessionmaker[AsyncSession] = async_sessionmaker(
            bind=self._engine,
            expire_on_commit=False,
            autoflush=False,
        )

    async def init_schema(self) -> None:
        """Create all ORM-defined tables.

        For production, prefer ``alembic upgrade head``.
        For tests, this is the fastest way to get a clean schema.
        """
        async with self._engine.begin() as conn:
            await conn.run_sync(Base.metadata.create_all)

    @asynccontextmanager
    async def session(self) -> AsyncIterator[AsyncSession]:
        """Yield an async session; commit on success, rollback on exception."""
        async with self._session_factory() as session:
            try:
                yield session
                await session.commit()
            except Exception:
                await session.rollback()
                raise

    async def dispose(self) -> None:
        """Dispose the engine connection pool."""
        await self._engine.dispose()
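The commit-on-success, rollback-on-exception contract of `Database.session()` can be exercised without a database. In the sketch below, `FakeSession` and `unit_of_work` are hypothetical stand-ins that only record which method was called; they are not part of SQLAlchemy or of this module.

```python
import asyncio
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager


class FakeSession:
    """Hypothetical stand-in for AsyncSession that records calls."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def commit(self) -> None:
        self.calls.append("commit")

    async def rollback(self) -> None:
        self.calls.append("rollback")


@asynccontextmanager
async def unit_of_work(session: FakeSession) -> AsyncIterator[FakeSession]:
    """Same shape as Database.session(): commit, or rollback and re-raise."""
    try:
        yield session
        await session.commit()
    except Exception:
        await session.rollback()
        raise


async def demo() -> tuple[list[str], list[str]]:
    ok, failed = FakeSession(), FakeSession()
    async with unit_of_work(ok):
        pass  # body succeeds, so commit runs
    try:
        async with unit_of_work(failed):
            raise ValueError("boom")  # body fails, so rollback runs
    except ValueError:
        pass  # the original exception still propagates to the caller
    return ok.calls, failed.calls


ok_calls, failed_calls = asyncio.run(demo())
```

The caller never calls `commit()` explicitly; leaving the block cleanly is the commit signal.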
578
my-deepagent/src/my_deepagent/persistence/models.py
Normal file
@@ -0,0 +1,578 @@
"""SQLAlchemy 2.0 async ORM models for my-deepagent persistence layer."""

from __future__ import annotations

import uuid
from typing import Any

from sqlalchemy import (
    JSON,
    Boolean,
    Float,
    ForeignKey,
    Index,
    Integer,
    String,
    Text,
    UniqueConstraint,
    text,
)
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    """SQLAlchemy declarative base for my-deepagent."""


# ---------------------------------------------------------------------------
# workflow_templates
# ---------------------------------------------------------------------------


class WorkflowTemplateRow(Base):
    """Content-addressed workflow template definitions."""

    __tablename__ = "workflow_templates"

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    name: Mapped[str] = mapped_column(Text, nullable=False)
    version: Mapped[int] = mapped_column(Integer, nullable=False)
    hash: Mapped[str] = mapped_column(Text, nullable=False, unique=True)
    definition: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
    created_at: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<WorkflowTemplateRow id={self.id!r} name={self.name!r} version={self.version!r}>"


# ---------------------------------------------------------------------------
# agent_personas
# ---------------------------------------------------------------------------


class AgentPersonaRow(Base):
    """Content-addressed agent persona definitions."""

    __tablename__ = "agent_personas"

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    name: Mapped[str] = mapped_column(Text, nullable=False)
    version: Mapped[int] = mapped_column(Integer, nullable=False)
    hash: Mapped[str] = mapped_column(Text, nullable=False, unique=True)
    definition: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
    created_at: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<AgentPersonaRow id={self.id!r} name={self.name!r} version={self.version!r}>"


# ---------------------------------------------------------------------------
# runs
# ---------------------------------------------------------------------------


class RunRow(Base):
    """Top-level run record: one row per deepagent run invocation."""

    __tablename__ = "runs"
    __table_args__ = (
        # Partial unique index: at most one active run per (repo_path, base_branch).
        # An "active" run is any run whose state is not 'completed', 'failed', or 'aborted'.
        # SQLite partial index uses a WHERE clause; autogenerate cannot detect this,
        # so it is managed via a manual alembic migration.
        Index(
            "ux_active_run_repo_base",
            "repo_path",
            "base_branch",
            unique=True,
            sqlite_where=text("state NOT IN ('completed', 'failed', 'aborted')"),
        ),
    )

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    # FK to workflow_templates: RESTRICT prevents deleting a template that has runs.
    template_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("workflow_templates.id", ondelete="RESTRICT"),
        nullable=False,
    )
    template_hash: Mapped[str] = mapped_column(Text, nullable=False)
    state: Mapped[str] = mapped_column(Text, nullable=False)
    repo_path: Mapped[str] = mapped_column(Text, nullable=False)
    base_branch: Mapped[str] = mapped_column(Text, nullable=False)
    worktree_root: Mapped[str] = mapped_column(Text, nullable=False)
    # current_phase_id references run_phases.id; however, runs.current_phase_id and
    # run_phases.run_id form a circular FK pair. SQLite cannot add a foreign key to an
    # existing table via ALTER TABLE, which is how circular FK pairs are usually
    # created, so alembic cannot safely manage this circular dependency.
    # Therefore current_phase_id carries NO ForeignKey constraint in the ORM.
    # Callers must maintain referential integrity manually (i.e. always point to a valid
    # run_phases.id that belongs to this run, or NULL).
    current_phase_id: Mapped[str | None] = mapped_column(String(36), nullable=True)
    started_at: Mapped[str | None] = mapped_column(Text, nullable=True)
    ended_at: Mapped[str | None] = mapped_column(Text, nullable=True)
    final_report_path: Mapped[str | None] = mapped_column(Text, nullable=True)
    paused_from_state: Mapped[str | None] = mapped_column(Text, nullable=True)
    created_at: Mapped[str] = mapped_column(Text, nullable=False)
    updated_at: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<RunRow id={self.id!r} state={self.state!r}>"


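The effect of the `ux_active_run_repo_base` partial unique index can be reproduced with the standard-library `sqlite3` module. The table below is a cut-down, hypothetical version of `runs`, and the state name `'running'` is an assumption used only to stand for any non-terminal state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE runs (id TEXT PRIMARY KEY, state TEXT, repo_path TEXT, base_branch TEXT)"
)
# Uniqueness is enforced only for rows matching the WHERE clause.
conn.execute(
    "CREATE UNIQUE INDEX ux_active_run_repo_base ON runs (repo_path, base_branch) "
    "WHERE state NOT IN ('completed', 'failed', 'aborted')"
)

# A finished run and one active run on the same repo/branch can coexist.
conn.execute("INSERT INTO runs VALUES ('r1', 'completed', '/repo', 'main')")
conn.execute("INSERT INTO runs VALUES ('r2', 'running', '/repo', 'main')")

# A second active run on the same repo/branch violates the partial index.
try:
    conn.execute("INSERT INTO runs VALUES ('r3', 'running', '/repo', 'main')")
    second_active_rejected = False
except sqlite3.IntegrityError:
    second_active_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM runs").fetchone()[0]
conn.close()
```

Terminal rows fall outside the index, so any number of completed, failed, or aborted runs may share a repo and branch.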
# ---------------------------------------------------------------------------
# run_inputs
# ---------------------------------------------------------------------------


class RunInputRow(Base):
    """Input snapshot for a run (one-to-one with runs)."""

    __tablename__ = "run_inputs"

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    run_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("runs.id", ondelete="CASCADE"),
        nullable=False,
        unique=True,
    )
    requirements_md: Mapped[str] = mapped_column(Text, nullable=False)
    objective: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
    extra: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
    input_hash: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<RunInputRow id={self.id!r} run_id={self.run_id!r}>"


# ---------------------------------------------------------------------------
# run_bindings
# ---------------------------------------------------------------------------


class RunBindingRow(Base):
    """Per-role persona binding for a run."""

    __tablename__ = "run_bindings"
    __table_args__ = (UniqueConstraint("run_id", "role_id", name="uq_run_bindings_run_role"),)

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    run_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("runs.id", ondelete="CASCADE"),
        nullable=False,
    )
    role_id: Mapped[str] = mapped_column(Text, nullable=False)
    # FK to agent_personas: RESTRICT prevents deleting a persona that has bindings.
    persona_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("agent_personas.id", ondelete="RESTRICT"),
        nullable=False,
    )
    persona_hash: Mapped[str] = mapped_column(Text, nullable=False)
    backend: Mapped[str] = mapped_column(Text, nullable=False)
    binding_hash: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<RunBindingRow id={self.id!r} run_id={self.run_id!r} role_id={self.role_id!r}>"


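`RunBindingRow` mixes both delete policies: CASCADE towards `runs` and RESTRICT towards `agent_personas`. The sketch below shows how the two behave in SQLite, using a simplified, hypothetical three-table schema with only the key columns; it is not the project's migration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Without this PRAGMA SQLite parses but does not enforce foreign keys.
conn.execute("PRAGMA foreign_keys=ON")
conn.executescript(
    """
    CREATE TABLE agent_personas (id TEXT PRIMARY KEY);
    CREATE TABLE runs (id TEXT PRIMARY KEY);
    CREATE TABLE run_bindings (
        id TEXT PRIMARY KEY,
        run_id TEXT NOT NULL REFERENCES runs(id) ON DELETE CASCADE,
        persona_id TEXT NOT NULL REFERENCES agent_personas(id) ON DELETE RESTRICT
    );
    INSERT INTO agent_personas VALUES ('p1');
    INSERT INTO runs VALUES ('r1');
    INSERT INTO run_bindings VALUES ('b1', 'r1', 'p1');
    """
)

# RESTRICT: a persona that is still bound to a run cannot be deleted.
try:
    conn.execute("DELETE FROM agent_personas WHERE id = 'p1'")
    persona_delete_blocked = False
except sqlite3.IntegrityError:
    persona_delete_blocked = True

# CASCADE: deleting the run removes its bindings automatically.
conn.execute("DELETE FROM runs WHERE id = 'r1'")
remaining_bindings = conn.execute("SELECT COUNT(*) FROM run_bindings").fetchone()[0]
conn.close()
```

Both behaviours depend on `PRAGMA foreign_keys=ON` being set on the connection, which is why the engine listener in `db.py` enables it on every connect.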
# ---------------------------------------------------------------------------
# run_phases
# ---------------------------------------------------------------------------


class RunPhaseRow(Base):
    """Per-phase execution record for a run."""

    __tablename__ = "run_phases"
    __table_args__ = (UniqueConstraint("run_id", "phase_key", name="uq_run_phases_run_phase"),)

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    run_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("runs.id", ondelete="CASCADE"),
        nullable=False,
    )
    phase_key: Mapped[str] = mapped_column(Text, nullable=False)
    seq: Mapped[int] = mapped_column(Integer, nullable=False)
    state: Mapped[str] = mapped_column(Text, nullable=False)
    attempts: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
    started_at: Mapped[str | None] = mapped_column(Text, nullable=True)
    ended_at: Mapped[str | None] = mapped_column(Text, nullable=True)

    def __repr__(self) -> str:
        return f"<RunPhaseRow id={self.id!r} run_id={self.run_id!r} phase_key={self.phase_key!r}>"


# ---------------------------------------------------------------------------
# run_events
# ---------------------------------------------------------------------------


class RunEventRow(Base):
    """Ordered event stream for a run."""

    __tablename__ = "run_events"
    __table_args__ = (
        UniqueConstraint("run_id", "seq", name="uq_run_events_run_seq"),
        UniqueConstraint("run_id", "idempotency_key", name="uq_run_events_run_idempotency"),
        Index("run_events_run_id_ts_idx", "run_id", "ts"),
    )

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    run_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("runs.id", ondelete="CASCADE"),
        nullable=False,
    )
    # phase_id references run_phases.id; CASCADE so events are deleted when a phase is deleted.
    phase_id: Mapped[str | None] = mapped_column(
        String(36),
        ForeignKey("run_phases.id", ondelete="CASCADE"),
        nullable=True,
    )
    seq: Mapped[int] = mapped_column(Integer, nullable=False)
    type: Mapped[str] = mapped_column(Text, nullable=False)
    payload: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
    idempotency_key: Mapped[str] = mapped_column(Text, nullable=False)
    ts: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<RunEventRow id={self.id!r} run_id={self.run_id!r} seq={self.seq!r}>"


# ---------------------------------------------------------------------------
# approval_requests
# ---------------------------------------------------------------------------


class ApprovalRequestRow(Base):
    """Human approval gate requests."""

    __tablename__ = "approval_requests"

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    run_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("runs.id", ondelete="CASCADE"),
        nullable=False,
    )
    # phase_id references run_phases.id; CASCADE so approval requests are deleted with the phase.
    phase_id: Mapped[str | None] = mapped_column(
        String(36),
        ForeignKey("run_phases.id", ondelete="CASCADE"),
        nullable=True,
    )
    gate_key: Mapped[str] = mapped_column(Text, nullable=False)
    state: Mapped[str] = mapped_column(Text, nullable=False)
    idempotency_key: Mapped[str] = mapped_column(Text, nullable=False, unique=True)
    payload: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
    created_at: Mapped[str] = mapped_column(Text, nullable=False)
    resolved_at: Mapped[str | None] = mapped_column(Text, nullable=True)

    def __repr__(self) -> str:
        return f"<ApprovalRequestRow id={self.id!r} gate_key={self.gate_key!r}>"


# ---------------------------------------------------------------------------
# approval_decisions
# ---------------------------------------------------------------------------


class ApprovalDecisionRow(Base):
    """Human decisions on approval requests."""

    __tablename__ = "approval_decisions"

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    approval_request_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("approval_requests.id", ondelete="CASCADE"),
        nullable=False,
    )
    action: Mapped[str] = mapped_column(Text, nullable=False)
    comment: Mapped[str | None] = mapped_column(Text, nullable=True)
    decided_at: Mapped[str] = mapped_column(Text, nullable=False)
    idempotency_key: Mapped[str] = mapped_column(Text, nullable=False, unique=True)

    def __repr__(self) -> str:
        return f"<ApprovalDecisionRow id={self.id!r} action={self.action!r}>"


# ---------------------------------------------------------------------------
# artifacts
# ---------------------------------------------------------------------------


class ArtifactRow(Base):
    """Content-addressed output artifacts from phases."""

    __tablename__ = "artifacts"
    __table_args__ = (
        UniqueConstraint("run_id", "path", "hash", name="uq_artifacts_run_path_hash"),
    )

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    run_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("runs.id", ondelete="CASCADE"),
        nullable=False,
    )
    # phase_id references run_phases.id; CASCADE so artifacts are deleted with the phase.
    phase_id: Mapped[str | None] = mapped_column(
        String(36),
        ForeignKey("run_phases.id", ondelete="CASCADE"),
        nullable=True,
    )
    path: Mapped[str] = mapped_column(Text, nullable=False)
    schema_id: Mapped[str] = mapped_column(Text, nullable=False)
    hash: Mapped[str] = mapped_column(Text, nullable=False)
    valid: Mapped[bool] = mapped_column(Boolean, nullable=False)
    validation_error: Mapped[dict[str, Any] | None] = mapped_column(JSON, nullable=True)
    created_at: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<ArtifactRow id={self.id!r} path={self.path!r} valid={self.valid!r}>"


# ---------------------------------------------------------------------------
# interactive_sessions
# ---------------------------------------------------------------------------


class InteractiveSessionRow(Base):
    """Interactive (non-run) agent sessions."""

    __tablename__ = "interactive_sessions"

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    # FK to agent_personas: RESTRICT prevents deleting a persona that has interactive sessions.
    persona_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("agent_personas.id", ondelete="RESTRICT"),
        nullable=False,
    )
    persona_hash: Mapped[str] = mapped_column(Text, nullable=False)
    started_at: Mapped[str | None] = mapped_column(Text, nullable=True)
    ended_at: Mapped[str | None] = mapped_column(Text, nullable=True)
    last_message_at: Mapped[str | None] = mapped_column(Text, nullable=True)
    state: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<InteractiveSessionRow id={self.id!r} state={self.state!r}>"


# ---------------------------------------------------------------------------
# tool_calls
# ---------------------------------------------------------------------------


class ToolCallRow(Base):
    """Audit log of every tool invocation (run or interactive)."""

    __tablename__ = "tool_calls"
    __table_args__ = (Index("tool_calls_run_id_ts_idx", "run_id", "ts"),)

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    # run_id / phase_id / interactive_session_id: exactly one must be non-NULL per row,
    # but all three are nullable because tool_calls covers both run and interactive contexts.
    # CASCADE ensures audit rows are removed when the parent run or session is deleted.
    run_id: Mapped[str | None] = mapped_column(
        String(36),
        ForeignKey("runs.id", ondelete="CASCADE"),
        nullable=True,
    )
    phase_id: Mapped[str | None] = mapped_column(
        String(36),
        ForeignKey("run_phases.id", ondelete="CASCADE"),
        nullable=True,
    )
    interactive_session_id: Mapped[str | None] = mapped_column(
        String(36),
        ForeignKey("interactive_sessions.id", ondelete="CASCADE"),
        nullable=True,
    )
    tool_name: Mapped[str] = mapped_column(Text, nullable=False)
    args: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
    result: Mapped[dict[str, Any] | None] = mapped_column(JSON, nullable=True)
    error: Mapped[str | None] = mapped_column(Text, nullable=True)
    duration_ms: Mapped[int] = mapped_column(Integer, nullable=False)
    ts: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<ToolCallRow id={self.id!r} tool_name={self.tool_name!r}>"


# ---------------------------------------------------------------------------
# llm_calls
# ---------------------------------------------------------------------------


class LlmCallRow(Base):
    """Full LLM call telemetry: tokens, cost, latency, model."""

    __tablename__ = "llm_calls"
    __table_args__ = (
        Index("llm_calls_run_id_ts_idx", "run_id", "ts"),
        Index("llm_calls_interactive_session_id_ts_idx", "interactive_session_id", "ts"),
        Index("llm_calls_model_ts_idx", "model", "ts"),
    )

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    # run_id / phase_id / interactive_session_id: exactly one must be non-NULL per row,
    # but all three are nullable because llm_calls covers both run and interactive contexts.
    # CASCADE ensures telemetry rows are removed when the parent run or session is deleted.
    run_id: Mapped[str | None] = mapped_column(
        String(36),
        ForeignKey("runs.id", ondelete="CASCADE"),
        nullable=True,
    )
    phase_id: Mapped[str | None] = mapped_column(
        String(36),
        ForeignKey("run_phases.id", ondelete="CASCADE"),
        nullable=True,
    )
    interactive_session_id: Mapped[str | None] = mapped_column(
        String(36),
        ForeignKey("interactive_sessions.id", ondelete="CASCADE"),
        nullable=True,
    )
    thread_id: Mapped[str] = mapped_column(Text, nullable=False)
    persona_name: Mapped[str] = mapped_column(Text, nullable=False)
    persona_version: Mapped[int] = mapped_column(Integer, nullable=False)
    model: Mapped[str] = mapped_column(Text, nullable=False)
    role: Mapped[str] = mapped_column(Text, nullable=False)
    turn_index: Mapped[int] = mapped_column(Integer, nullable=False)
    input_tokens: Mapped[int] = mapped_column(Integer, nullable=False)
    output_tokens: Mapped[int] = mapped_column(Integer, nullable=False)
    cached_tokens: Mapped[int] = mapped_column(Integer, nullable=False)
    reasoning_tokens: Mapped[int] = mapped_column(Integer, nullable=False)
    cost_usd_input: Mapped[float] = mapped_column(Float, nullable=False)
    cost_usd_output: Mapped[float] = mapped_column(Float, nullable=False)
    cost_usd_total: Mapped[float] = mapped_column(Float, nullable=False)
    latency_ms: Mapped[int] = mapped_column(Integer, nullable=False)
    status: Mapped[str] = mapped_column(Text, nullable=False)
    error_code: Mapped[str | None] = mapped_column(Text, nullable=True)
    request_id: Mapped[str | None] = mapped_column(Text, nullable=True)
    ts: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<LlmCallRow id={self.id!r} model={self.model!r} status={self.status!r}>"


# ---------------------------------------------------------------------------
# model_pricing
# ---------------------------------------------------------------------------


class ModelPricingRow(Base):
    """Cached model pricing data (fetched from provider APIs)."""

    __tablename__ = "model_pricing"

    model: Mapped[str] = mapped_column(Text, primary_key=True)
    input_per_1k_usd: Mapped[float] = mapped_column(Float, nullable=False)
    output_per_1k_usd: Mapped[float] = mapped_column(Float, nullable=False)
    context_length: Mapped[int] = mapped_column(Integer, nullable=False)
    fetched_at: Mapped[str] = mapped_column(Text, nullable=False)
    raw_payload: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<ModelPricingRow model={self.model!r}>"


# ---------------------------------------------------------------------------
|
||||||
|
# budget_ledger
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class BudgetLedgerRow(Base):
|
||||||
|
"""Per-scope budget tracking (e.g. global, per-run, per-persona)."""
|
||||||
|
|
||||||
|
__tablename__ = "budget_ledger"
|
||||||
|
|
||||||
|
scope: Mapped[str] = mapped_column(Text, primary_key=True)
|
||||||
|
spent_usd: Mapped[float] = mapped_column(Float, nullable=False, default=0.0)
|
||||||
|
cap_usd: Mapped[float | None] = mapped_column(Float, nullable=True)
|
||||||
|
last_updated: Mapped[str] = mapped_column(Text, nullable=False)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return f"<BudgetLedgerRow scope={self.scope!r} spent_usd={self.spent_usd!r}>"


# ---------------------------------------------------------------------------
# persona_consents
# ---------------------------------------------------------------------------


class PersonaConsentRow(Base):
    """Persisted persona consent decisions (approve/block)."""

    __tablename__ = "persona_consents"

    persona_hash: Mapped[str] = mapped_column(Text, primary_key=True)
    persona_name: Mapped[str] = mapped_column(Text, nullable=False)
    persona_version: Mapped[int] = mapped_column(Integer, nullable=False)
    decision: Mapped[str] = mapped_column(Text, nullable=False)
    decided_at: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<PersonaConsentRow persona_hash={self.persona_hash!r} decision={self.decision!r}>"


# ---------------------------------------------------------------------------
# phase_feedback
# ---------------------------------------------------------------------------


class PhaseFeedbackRow(Base):
    """User feedback on completed phases (reaction + optional comment)."""

    __tablename__ = "phase_feedback"

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    # CASCADE: feedback is deleted when the run is deleted (audit data follows the run lifecycle).
    run_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("runs.id", ondelete="CASCADE"),
        nullable=False,
    )
    # CASCADE: feedback is deleted when the phase is deleted.
    phase_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("run_phases.id", ondelete="CASCADE"),
        nullable=False,
    )
    reaction: Mapped[str | None] = mapped_column(Text, nullable=True)
    comment: Mapped[str | None] = mapped_column(Text, nullable=True)
    created_at: Mapped[str] = mapped_column(Text, nullable=False)

    def __repr__(self) -> str:
        return f"<PhaseFeedbackRow id={self.id!r} run_id={self.run_id!r}>"
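
The CASCADE rules on phase_feedback only fire in SQLite when foreign-key enforcement is switched on for the connection, which is presumably why the commit message lists a foreign_keys PRAGMA. A minimal stdlib sketch of that behaviour follows; the two-table schema is a simplified stand-in for illustration, not the project's real DDL.

```python
import sqlite3

# Simplified stand-ins for the runs / phase_feedback tables (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE runs (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE phase_feedback ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " run_id TEXT NOT NULL REFERENCES runs(id) ON DELETE CASCADE,"
    " reaction TEXT)"
)
conn.execute("INSERT INTO runs (id) VALUES ('run-1')")
conn.execute("INSERT INTO phase_feedback (run_id, reaction) VALUES ('run-1', 'up')")

# Deleting the parent run removes its feedback rows as well.
conn.execute("DELETE FROM runs WHERE id = 'run-1'")
remaining = conn.execute("SELECT COUNT(*) FROM phase_feedback").fetchone()[0]
print(remaining)
```

Without the PRAGMA line the delete would still succeed, but the feedback row would be left behind as an orphan.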


# ---------------------------------------------------------------------------
# run_commands (schema-only; used in future steps)
# ---------------------------------------------------------------------------


class RunCommandRow(Base):
    """Queued commands targeting a run (pause, resume, abort, etc.)."""

    __tablename__ = "run_commands"

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    run_id: Mapped[str] = mapped_column(
        String(36),
        ForeignKey("runs.id", ondelete="CASCADE"),
        nullable=False,
    )
    command: Mapped[str] = mapped_column(Text, nullable=False)
    payload: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False)
    idempotency_key: Mapped[str] = mapped_column(Text, nullable=False, unique=True)
    created_at: Mapped[str] = mapped_column(Text, nullable=False)
    processed_at: Mapped[str | None] = mapped_column(Text, nullable=True)

    def __repr__(self) -> str:
        return f"<RunCommandRow id={self.id!r} run_id={self.run_id!r} command={self.command!r}>"
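
The unique idempotency_key column is what would let a retried command be detected and dropped instead of queued twice. The sketch below shows one way that could work with plain sqlite3. The `enqueue` helper and the trimmed-down table are hypothetical, since this commit ships the table as schema-only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE run_commands ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " run_id TEXT NOT NULL,"
    " command TEXT NOT NULL,"
    " idempotency_key TEXT NOT NULL UNIQUE)"
)


def enqueue(run_id: str, command: str, key: str) -> bool:
    """Insert a command; return False when the key was already used (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO run_commands (run_id, command, idempotency_key) VALUES (?, ?, ?)",
                (run_id, command, key),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = enqueue("run-1", "pause", "key-abc")
retry = enqueue("run-1", "pause", "key-abc")  # same key: rejected by the UNIQUE constraint
count = conn.execute("SELECT COUNT(*) FROM run_commands").fetchone()[0]
```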
154
my-deepagent/src/my_deepagent/persona.py
Normal file
@@ -0,0 +1,154 @@
"""Persona schema + YAML loader + content-addressed hash + consent helpers."""

from __future__ import annotations

from pathlib import Path
from typing import Any, Literal

import yaml
from pydantic import BaseModel, ConfigDict, Field, ValidationInfo, field_validator

from .enums import Backend, Capability, RiskLevel
from .hash import sha256


class FilesystemPermissionSpec(BaseModel):
    """Persona-side mirror of deepagents ``FilesystemPermission``.

    Accepts a richer operation set (read/write/edit/ls) than deepagents, which
    only distinguishes read and write; session._map_operations performs the collapse.
    """

    model_config = ConfigDict(frozen=True, extra="forbid")

    operations: tuple[Literal["read", "write", "edit", "ls"], ...] = Field(min_length=1)
    paths: tuple[str, ...] = Field(min_length=1)
    mode: Literal["allow", "deny"] = "allow"

    @field_validator("paths")
    @classmethod
    def _validate_paths(cls, v: tuple[str, ...]) -> tuple[str, ...]:
        for p in v:
            if not p.startswith("/"):
                raise ValueError(f"path must start with '/': {p!r}")
            if "\x00" in p:
                raise ValueError(f"path must not contain null bytes: {p!r}")
            # Reject a literal ".." segment; glob paths like "/**" are OK.
            segments = p.split("/")
            if ".." in segments:
                raise ValueError(f"path must not contain '..': {p!r}")
            if "~" in p:
                raise ValueError(f"path must not contain '~': {p!r}")
        return v
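
To make the path rules concrete, here is a stdlib-only restatement of the same checks that returns a reason string instead of raising. `check_path` is a hypothetical helper written for this illustration and is not part of the module.

```python
from __future__ import annotations


def check_path(p: str) -> str | None:
    """Return a rejection reason, or None when the path is acceptable."""
    if not p.startswith("/"):
        return "must start with '/'"
    if "\x00" in p:
        return "null byte"
    if ".." in p.split("/"):  # only a whole ".." segment counts
        return "'..' segment"
    if "~" in p:
        return "'~'"
    return None


samples = ("/src/**", "src/app.py", "/a/../etc/passwd", "/~user/x", "/notes..txt")
results = {p: check_path(p) for p in samples}
```

Note that "/notes..txt" passes: the check splits on "/" and rejects only a segment that is exactly "..", so dots inside a file name are fine.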


class PersonaSubagent(BaseModel):
    """1:1 mapping to deepagents SubAgent TypedDict."""

    model_config = ConfigDict(frozen=True, extra="forbid")

    name: str = Field(min_length=1)
    description: str = Field(min_length=10)
    system_prompt: str = Field(min_length=10)
    allowed_tools: tuple[str, ...] = Field(default_factory=tuple)
    model: str | None = None
    permissions: tuple[FilesystemPermissionSpec, ...] = Field(default_factory=tuple)
    # deepagents accepts dict[str, Any] for interrupt_on; the Any is intentional.
    interrupt_on: dict[str, Any] = Field(default_factory=dict)


class Persona(BaseModel):
    """Persona definition from docs/schemas/personas/<name>@<version>.yaml.

    Immutability: list-valued fields are stored as tuples to prevent post-construction
    mutation that would invalidate compute_hash(). dict-valued fields (model_params,
    interrupt_on) remain dict because they are pass-through to deepagents which expects
    ``dict[str, Any]``; callers must not mutate them.
    """

    model_config = ConfigDict(frozen=True, extra="forbid")

    name: str = Field(min_length=1)
    version: int = Field(ge=1)
    description: str | None = None
    backend: Backend
    model: str = Field(min_length=1)
    provider_origin: str = Field(min_length=1)
    capabilities: tuple[Capability, ...] = Field(min_length=1)
    max_risk_level: RiskLevel
    allowed_roles: tuple[str, ...] | None = None
    system_prompt: str = Field(min_length=10)
    allowed_tools: tuple[str, ...] | None = None
    subagents: tuple[PersonaSubagent, ...] = Field(default_factory=tuple)
    permissions: tuple[FilesystemPermissionSpec, ...] = Field(default_factory=tuple)
    # deepagents accepts dict[str, Any] for interrupt_on; the Any is intentional.
    interrupt_on: dict[str, Any] | None = None
    # deepagents accepts dict[str, Any] for model_params; the Any is intentional.
    model_params: dict[str, Any] = Field(default_factory=dict)
    deepagents_backend: Literal["state", "local_shell", "filesystem", "composite", "langsmith"] = (
        "local_shell"
    )
    skills: tuple[str, ...] = Field(default_factory=tuple)
    memory_files: tuple[str, ...] = Field(default_factory=tuple)
    fallback_model: str | None = None
    max_cost_per_call_usd: float | None = Field(default=None, ge=0)

    @field_validator("model")
    @classmethod
    def _validate_openrouter_model(cls, v: str, info: ValidationInfo) -> str:
        backend = info.data.get("backend") if info.data else None
        if backend == Backend.OPENROUTER and not v.strip():
            raise ValueError("openrouter backend requires non-empty model")
        return v

    def compute_hash(self) -> str:
        """Content-addressed identity hash (canonical JSON of normalized fields)."""
        return sha256(
            {
                "name": self.name,
                "version": self.version,
                "backend": self.backend.value,
                "model": self.model,
                "provider_origin": self.provider_origin,
                "capabilities": sorted(c.value for c in self.capabilities),
                "max_risk_level": self.max_risk_level.value,
                "allowed_roles": (
                    sorted(self.allowed_roles) if self.allowed_roles is not None else None
                ),
                "system_prompt": self.system_prompt,
                "allowed_tools": (
                    sorted(self.allowed_tools) if self.allowed_tools is not None else None
                ),
                "subagents": [s.model_dump() for s in self.subagents],
                "permissions": [p.model_dump() for p in self.permissions],
                "interrupt_on": self.interrupt_on,
                "model_params": self.model_params,
                "deepagents_backend": self.deepagents_backend,
                "fallback_model": self.fallback_model,
                "max_cost_per_call_usd": self.max_cost_per_call_usd,
                "skills": self.skills,
                "memory_files": self.memory_files,
            }
        )
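
The `sha256` helper imported from `.hash` is not shown in this part of the diff. Going by the commit message ("canonical JSON + sha256"), it could look roughly like the sketch below. The name `canonical_sha256` and the exact `json.dumps` options are assumptions, not the project's implementation.

```python
import hashlib
import json
from typing import Any


def canonical_sha256(payload: Any) -> str:
    """Hash a JSON-serializable payload via a key-sorted, whitespace-free encoding."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


# Key order and list order (after sorting) do not affect the digest.
a = canonical_sha256({"name": "reviewer", "capabilities": sorted(["write", "read"])})
b = canonical_sha256({"capabilities": sorted(["read", "write"]), "name": "reviewer"})
# A different capability set yields a different digest.
c = canonical_sha256({"name": "reviewer", "capabilities": ["read"]})
```

Sorting set-like fields (capabilities, allowed_roles, allowed_tools) before hashing, as compute_hash does, is what keeps the identity stable when a YAML author reorders them.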


def load_persona_yaml(path: Path) -> Persona:
    """Load and validate a single persona yaml file."""
    if not path.is_file():
        raise FileNotFoundError(f"persona yaml not found: {path}")
    data = yaml.safe_load(path.read_text(encoding="utf-8"))
    return Persona.model_validate(data)


def load_personas_from_dir(directory: Path) -> list[Persona]:
    """Load all *.yaml files from a directory, sorted by filename for determinism.

    Raises ValueError if the same (name, version) pair appears more than once.
    Returns an empty list if the directory does not exist.
    """
    if not directory.is_dir():
        return []
    personas = [load_persona_yaml(p) for p in sorted(directory.glob("*.yaml"))]
    seen: dict[tuple[str, int], str] = {}
    for p in personas:
        key = (p.name, p.version)
        if key in seen:
            raise ValueError(f"duplicate persona name={p.name!r} version={p.version}")
        seen[key] = p.compute_hash()
    return personas
1
my-deepagent/src/my_deepagent/prompt_envelope.py
Normal file
@@ -0,0 +1 @@
"""Prompt envelope builder for LangChain messages. Implemented in Step 5."""
0
my-deepagent/src/my_deepagent/py.typed
Normal file
1
my-deepagent/src/my_deepagent/run_event.py
Normal file
@@ -0,0 +1 @@
"""Run event types for streaming progress. Implemented in Step 4."""
1
my-deepagent/src/my_deepagent/safety.py
Normal file
@@ -0,0 +1 @@
"""Safety gate for destructive command classification. Implemented in Step 11."""
274
my-deepagent/src/my_deepagent/session.py
Normal file
@@ -0,0 +1,274 @@
"""Build a deepagents CompiledStateGraph from a Persona + run context.

Connects:
- Persona (config) -> deepagents.create_deep_agent(...)
- OpenRouter (model="openrouter:...") -> ChatOpenAI(base_url=openrouter)
- Workspace dir -> LocalShellBackend (filesystem + shell execution)
- Persona.permissions + DEFAULT_DENY -> deepagents.FilesystemPermission list
- Subagents -> deepagents.SubAgent TypedDict list
- Middleware list -> passed to create_deep_agent
"""

from __future__ import annotations

import os
from pathlib import Path
from typing import Any, Literal
from uuid import UUID

from deepagents import FilesystemPermission, SubAgent, create_deep_agent
from deepagents.backends import (
    CompositeBackend,
    FilesystemBackend,
    LocalShellBackend,
    StateBackend,
)
from langchain_openai import ChatOpenAI

from .config import Config
from .errors import MyDeepAgentError
from .persona import FilesystemPermissionSpec, Persona, PersonaSubagent

DEFAULT_DENY_PATHS: tuple[str, ...] = (
    "/.env*",
    "/**/*.env*",
    "/**/*token*",
    "/**/*secret*",
    "/**/*credential*",
    "/**/*.pem",
    "/**/*.key",
    "/.ssh/**",
    "/.aws/**",
    "/.config/gcloud/**",
    "/.kube/**",
    "/.gnupg/**",
)


# Mapping from our richer operation set (read/write/edit/ls) to the deepagents
# binary set (read/write). deepagents treats ls/grep/glob as read-side and
# write_file/edit_file as write-side internally, so this collapse is safe.
_OP_MAP: dict[str, Literal["read", "write"]] = {
    "read": "read",
    "write": "write",
    "edit": "write",
    "ls": "read",
}


def _map_operations(ops: tuple[str, ...] | list[str]) -> list[Literal["read", "write"]]:
    """Deduplicate-preserve-order mapping of our ops to deepagents ops."""
    seen: set[str] = set()
    out: list[Literal["read", "write"]] = []
    for op in ops:
        mapped = _OP_MAP[op]
        if mapped not in seen:
            seen.add(mapped)
            out.append(mapped)
    return out
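
As a quick illustration of the collapse, the same order-preserving dedupe can be written with `dict.fromkeys`. This is an equivalent sketch for reading purposes, not a proposed replacement; `OP_MAP` and `map_operations` are local copies made up for the example.

```python
# Local copy of the mapping so the example runs on its own.
OP_MAP = {"read": "read", "write": "write", "edit": "write", "ls": "read"}


def map_operations(ops):
    # dict.fromkeys keeps first-seen order while dropping repeats.
    return list(dict.fromkeys(OP_MAP[op] for op in ops))


collapsed = map_operations(("ls", "read", "edit", "write"))
only_edit = map_operations(("edit",))
```

Here four persona-level operations collapse to the two deepagents ones, in first-seen order. An unknown operation name raises KeyError in both this sketch and the function above it.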


def default_safety_permissions() -> list[FilesystemPermission]:
    """Deny secret-bearing paths, then allow everything else.

    Returned permissions are evaluated in order; first match wins. The deny
    rule therefore comes first, so the secret patterns are blocked before the
    catch-all allow is reached. With the allow rule first, "/**" would match
    every path and the deny rule would never be consulted.
    """
    return [
        FilesystemPermission(
            operations=["read", "write"],
            paths=list(DEFAULT_DENY_PATHS),
            mode="deny",
        ),
        FilesystemPermission(
            operations=["read", "write"],
            paths=["/**"],
            mode="allow",
        ),
    ]
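
To see why rule order matters under first-match-wins evaluation, here is a small stdlib model of such an evaluator. It is an approximation: `decide` is a hypothetical function, and `fnmatch` lets `*` cross "/" boundaries, which may differ from the glob matching deepagents actually uses.

```python
from fnmatch import fnmatchcase


def decide(rules, path, default="allow"):
    """Return the mode of the first rule whose pattern matches, else the default."""
    for mode, patterns in rules:
        if any(fnmatchcase(path, pat) for pat in patterns):
            return mode
    return default


deny = ("deny", ["/.env*", "/**/*.env*", "/**/*.pem"])
allow_all = ("allow", ["/**"])

allow_first = decide([allow_all, deny], "/app/.env")   # catch-all allow shadows the deny
deny_first = decide([deny, allow_all], "/app/.env")    # deny is consulted first
plain_file = decide([deny, allow_all], "/app/main.py") # falls through to the allow
```

Under this model a catch-all allow placed ahead of the deny rule makes the deny unreachable, so secret paths are only protected when the deny rule is listed first.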


def _spec_to_permission(spec: FilesystemPermissionSpec) -> FilesystemPermission:
    """Convert pydantic FilesystemPermissionSpec to deepagents FilesystemPermission.

    Our schema accepts {read, write, edit, ls} for human-readable yaml. deepagents
    collapses these to {read, write} internally; we apply the same collapse here.
    """
    return FilesystemPermission(
        operations=_map_operations(spec.operations),
        paths=list(spec.paths),
        mode=spec.mode,
    )


def _subagent_to_dict(sub: PersonaSubagent) -> SubAgent:
    """Convert PersonaSubagent -> deepagents SubAgent TypedDict.

    Only includes optional keys when set; deepagents inherits defaults from the parent
    agent when a subagent omits ``tools`` / ``model`` / ``permissions`` / ``interrupt_on``.
    """
    out: dict[str, Any] = {
        "name": sub.name,
        "description": sub.description,
        "system_prompt": sub.system_prompt,
    }
    if sub.allowed_tools:
        out["tools"] = list(sub.allowed_tools)
    if sub.model is not None:
        out["model"] = sub.model
    if sub.permissions:
        out["permissions"] = [_spec_to_permission(p) for p in sub.permissions]
    if sub.interrupt_on:
        out["interrupt_on"] = sub.interrupt_on
    return out  # type: ignore[return-value]  # TypedDict construction from dict literal


def _resolve_openrouter_api_key(config: Config) -> str:
    """Pull the OpenRouter API key from config -> env -> error.

    Priority: config.openrouter_api_key -> MYDEEPAGENT_OPENROUTER_API_KEY -> OPENROUTER_API_KEY.
    """
    if config.openrouter_api_key:
        return config.openrouter_api_key
    env_key = os.environ.get("MYDEEPAGENT_OPENROUTER_API_KEY") or os.environ.get(
        "OPENROUTER_API_KEY"
    )
    if env_key:
        return env_key
    raise MyDeepAgentError.human_required(
        "backend_auth_failed",
        message="OpenRouter API key is not configured",
        recovery_hint=(
            "set MYDEEPAGENT_OPENROUTER_API_KEY in .env or run `mydeepagent login openrouter`"
        ),
    )
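
The lookup priority is easier to check against a plain mapping than against the real process environment. The sketch below restates the same order with an injected `env` mapping. `resolve_key` and the use of `LookupError` are illustrative stand-ins for the function and the `MyDeepAgentError` raised by it.

```python
from __future__ import annotations

from collections.abc import Mapping


def resolve_key(config_key: str | None, env: Mapping[str, str]) -> str:
    """Config value first, then the prefixed env var, then the bare env var."""
    if config_key:
        return config_key
    env_key = env.get("MYDEEPAGENT_OPENROUTER_API_KEY") or env.get("OPENROUTER_API_KEY")
    if env_key:
        return env_key
    raise LookupError("OpenRouter API key is not configured")


from_config = resolve_key("cfg", {"OPENROUTER_API_KEY": "bare"})
from_prefixed = resolve_key(
    None, {"MYDEEPAGENT_OPENROUTER_API_KEY": "prefixed", "OPENROUTER_API_KEY": "bare"}
)
from_bare = resolve_key("", {"OPENROUTER_API_KEY": "bare"})  # empty config value is skipped
```

Because the checks use truthiness, an empty string in the config or in the prefixed variable falls through to the next source instead of being returned.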


def resolve_model_instance(
    persona: Persona, config: Config, model_override: str | None = None
) -> Any:
    """Persona -> langchain BaseChatModel instance or 'provider:model' string.

    For ``openrouter:`` prefix, returns a ``ChatOpenAI`` with ``base_url=openrouter``.
    For other providers (``anthropic:``, ``openai:``, ``google_genai:``), returns the
    string as-is so that deepagents' ``init_chat_model`` resolves it via the matching
    integration package.
    """
    model_spec = model_override or persona.model
    if model_spec.startswith("openrouter:"):
        params = persona.model_params
        return ChatOpenAI(
            model=model_spec.removeprefix("openrouter:"),
            api_key=_resolve_openrouter_api_key(config),
            base_url=config.openrouter_base_url,
            max_tokens=params.get("max_tokens", 4096),
            temperature=params.get("temperature", 0.2),
            top_p=params.get("top_p", 1.0),
        )
    return model_spec
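
The routing decision itself is just a prefix check. The sketch below isolates it from the `ChatOpenAI` construction. `route` is a hypothetical helper, and the model identifiers are placeholders, not names taken from this commit.

```python
def route(spec: str) -> tuple[str, str]:
    """Return (handler, model_name) for a 'provider:model' spec."""
    if spec.startswith("openrouter:"):
        # Only the leading prefix is stripped; any later colon stays in the model id.
        return ("chat_openai", spec.removeprefix("openrouter:"))
    return ("passthrough", spec)


via_openrouter = route("openrouter:vendor/model")
with_suffix = route("openrouter:vendor/model:free")
other_provider = route("anthropic:some-model")
```

Using `removeprefix` instead of splitting on ":" matters if a model id carries its own colon-separated suffix, since a split would truncate it.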


def build_backend(persona: Persona, root_dir: Path) -> Any:
    """Persona.deepagents_backend -> concrete deepagents backend instance.

    Returns:
        LocalShellBackend for "local_shell" (filesystem + shell execute, the default).
        FilesystemBackend for "filesystem" (filesystem only, no shell).
        None for "state" (deepagents default StateBackend, in-process state only).
        CompositeBackend for "composite" (local_shell + state-backed /memories/ namespace).

    Raises:
        MyDeepAgentError(fatal, config_invalid) for unknown backend identifiers
        or "langsmith" which is reserved for a future milestone.
    """
    name = persona.deepagents_backend
    if name == "local_shell":
        return LocalShellBackend(
            root_dir=str(root_dir),
            virtual_mode=False,
            timeout=120,
            max_output_bytes=100_000,
            inherit_env=False,
        )
    if name == "filesystem":
        return FilesystemBackend(root_dir=str(root_dir), virtual_mode=False, max_file_size_mb=10)
    if name == "state":
        return None  # deepagents default StateBackend
    if name == "composite":
        return CompositeBackend(
            default=LocalShellBackend(root_dir=str(root_dir), virtual_mode=False),
            routes={"/memories/": StateBackend()},
        )
    raise MyDeepAgentError.fatal(
        "config_invalid",
        message=f"unsupported deepagents_backend: {name!r}",
        recovery_hint="use one of: local_shell, filesystem, state, composite",
    )


def build_agent(
    persona: Persona,
    config: Config,
    *,
    root_dir: Path,
    middleware: list[Any] | None = None,
    checkpointer: Any | None = None,
    run_id: UUID | None = None,
    phase_key: str | None = None,
    model_override: str | None = None,
) -> Any:
    """Construct a deepagents CompiledStateGraph for the given persona.

    Returns a CompiledStateGraph. Caller invokes via
    ``agent.invoke / ainvoke / astream / astream_events`` with ``{"messages": [...]}`` input.

    ``run_id`` and ``phase_key`` are accepted but not used by this function yet.

    deepagents 0.6.1 limitation: FilesystemPermission is rejected when the backend
    implements SandboxBackendProtocol (e.g. LocalShellBackend). SafetyShellMiddleware
    enforces path + destructive-command safety in those cases instead.
    """
    from .middleware.safety import SafetyShellMiddleware

    model = resolve_model_instance(persona, config, model_override)
    backend = build_backend(persona, root_dir)

    # SafetyShellMiddleware is always first; caller-supplied middleware is appended after it.
    all_middleware: list[Any] = [SafetyShellMiddleware()]
    if middleware:
        all_middleware.extend(middleware)

    subagents: list[SubAgent] = [_subagent_to_dict(s) for s in persona.subagents]

    kwargs: dict[str, Any] = {
        "model": model,
        "system_prompt": persona.system_prompt,
        "middleware": all_middleware,
    }
    if backend is not None:
        kwargs["backend"] = backend

    # deepagents 0.6.1: FilesystemPermission + SandboxBackendProtocol backend raises
    # NotImplementedError. Skip permissions kwarg for local_shell; SafetyShellMiddleware
    # handles path enforcement instead. Other backends (state, filesystem, composite)
    # still use the deepagents permissions system.
    use_permissions = persona.deepagents_backend != "local_shell"
    if use_permissions:
        permissions: list[FilesystemPermission] = [
            *(_spec_to_permission(p) for p in persona.permissions),
            *default_safety_permissions(),
        ]
        kwargs["permissions"] = permissions

    if persona.allowed_tools:
        kwargs["tools"] = list(persona.allowed_tools)
    if subagents:
        kwargs["subagents"] = subagents
    if persona.interrupt_on:
        kwargs["interrupt_on"] = persona.interrupt_on
    if checkpointer is not None:
        kwargs["checkpointer"] = checkpointer
    if persona.skills:
        kwargs["skills"] = list(persona.skills)
    if persona.memory_files:
        kwargs["memory"] = list(persona.memory_files)

    return create_deep_agent(**kwargs)
1
my-deepagent/src/my_deepagent/slash.py
Normal file
@@ -0,0 +1 @@
"""Slash command registry and dispatcher. Implemented in Step 10."""
0
my-deepagent/src/my_deepagent/tui/__init__.py
Normal file
1
my-deepagent/src/my_deepagent/tui/approval.py
Normal file
@@ -0,0 +1 @@
"""TUI approval dialog for human-in-the-loop actions. Implemented in Step 7."""
1
my-deepagent/src/my_deepagent/tui/render.py
Normal file
@@ -0,0 +1 @@
"""TUI Rich panel and table renderers. Implemented in Step 10."""
1
my-deepagent/src/my_deepagent/tui/stream.py
Normal file
@@ -0,0 +1 @@
"""TUI streaming output renderer for run events. Implemented in Step 10."""
127
my-deepagent/src/my_deepagent/workflow.py
Normal file
@@ -0,0 +1,127 @@
"""WorkflowTemplate schema + YAML loader."""

from __future__ import annotations

from collections import Counter
from pathlib import Path

import yaml
from pydantic import BaseModel, ConfigDict, Field, field_validator, model_validator

from .enums import Backend, Capability, RiskLevel
from .hash import sha256


class ExpectedArtifact(BaseModel):
    """Expected output artifact of a workflow phase."""

    model_config = ConfigDict(frozen=True, extra="forbid", populate_by_name=True)

    path: str = Field(min_length=1)
    # yaml uses 'schema' key; pydantic attribute is schema_id to avoid shadowing BaseModel.schema
    schema_id: str = Field(min_length=1, alias="schema")


class WorkflowPhase(BaseModel):
    """Single phase definition inside a workflow template."""

    model_config = ConfigDict(frozen=True, extra="forbid")

    key: str = Field(min_length=1, pattern=r"^[a-z][a-z0-9_]*$")
    title: str = Field(min_length=1)
    risk: RiskLevel
    role: str = Field(min_length=1)
    expected_artifact: ExpectedArtifact | None = None
    gates: tuple[str, ...] = Field(default_factory=tuple)
    timeout_seconds: int | None = Field(default=None, ge=1)
    instructions: str = Field(min_length=10)
    max_budget_usd: float | None = Field(default=None, ge=0)


class WorkflowRole(BaseModel):
    """Role definition: what capabilities a bound persona must have."""

    model_config = ConfigDict(frozen=True, extra="forbid")

    id: str = Field(min_length=1, pattern=r"^[a-z][a-z0-9_]*$")
    required_capabilities: tuple[Capability, ...] = Field(min_length=1)
    preferred_backends: tuple[Backend, ...] = Field(default_factory=tuple)
    fallback_personas: tuple[str, ...] = Field(default_factory=tuple)


class WorkflowTemplate(BaseModel):
    """Complete workflow template loaded from docs/schemas/workflows/<name>@<version>.yaml."""

    model_config = ConfigDict(frozen=True, extra="forbid")

    name: str = Field(min_length=1)
    version: int = Field(ge=1)
    description: str | None = None
    roles: tuple[WorkflowRole, ...] = Field(min_length=1)
    phases: tuple[WorkflowPhase, ...] = Field(min_length=1)
    default_gates: tuple[str, ...] = Field(default_factory=tuple)
    max_total_budget_usd: float | None = Field(default=None, ge=0)

    @model_validator(mode="after")
    def _validate_phase_roles(self) -> WorkflowTemplate:
        role_ids = {r.id for r in self.roles}
        for ph in self.phases:
            if ph.role not in role_ids:
                raise ValueError(f"phase '{ph.key}' references unknown role '{ph.role}'")
        return self

    @model_validator(mode="after")
    def _validate_unique_phase_keys(self) -> WorkflowTemplate:
        counts = Counter(ph.key for ph in self.phases)
        duplicates = sorted(k for k, c in counts.items() if c > 1)
        if duplicates:
            raise ValueError(f"duplicate phase keys: {duplicates}")
        return self

    @field_validator("roles")
    @classmethod
    def _validate_unique_role_ids(cls, v: tuple[WorkflowRole, ...]) -> tuple[WorkflowRole, ...]:
        counts = Counter(r.id for r in v)
        duplicates = sorted(k for k, c in counts.items() if c > 1)
        if duplicates:
            raise ValueError(f"duplicate role ids: {duplicates}")
        return v

    def compute_hash(self) -> str:
        """Content-addressed identity hash of this template."""
        return sha256(
            {
                "name": self.name,
                "version": self.version,
                "roles": [r.model_dump() for r in self.roles],
                "phases": [ph.model_dump(by_alias=True) for ph in self.phases],
                "default_gates": sorted(self.default_gates),
                "max_total_budget_usd": self.max_total_budget_usd,
            }
        )
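
Both uniqueness validators in WorkflowTemplate rely on the same Counter idiom. Pulled out on its own it looks like the sketch below. `find_duplicates` is a hypothetical name and the phase keys are made-up sample data.

```python
from collections import Counter


def find_duplicates(keys):
    """Return the sorted list of keys that occur more than once."""
    counts = Counter(keys)
    return sorted(k for k, c in counts.items() if c > 1)


dupes = find_duplicates(["plan", "implement", "review", "plan", "review"])
clean = find_duplicates(["plan", "implement", "review"])
```

Sorting the duplicates keeps the resulting error message deterministic regardless of the order in which phases or roles appear in the YAML.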


def load_workflow_yaml(path: Path) -> WorkflowTemplate:
    """Load and validate a single workflow yaml file."""
    if not path.is_file():
        raise FileNotFoundError(f"workflow yaml not found: {path}")
    data = yaml.safe_load(path.read_text(encoding="utf-8"))
    return WorkflowTemplate.model_validate(data)


def load_workflows_from_dir(directory: Path) -> list[WorkflowTemplate]:
    """Load all *.yaml workflow files from a directory, sorted by filename.

    Raises ValueError if the same (name, version) pair appears more than once.
    Returns an empty list if the directory does not exist.
    """
    if not directory.is_dir():
        return []
    workflows = [load_workflow_yaml(p) for p in sorted(directory.glob("*.yaml"))]
    seen: set[tuple[str, int]] = set()
    for w in workflows:
        key = (w.name, w.version)
        if key in seen:
            raise ValueError(f"duplicate workflow name={w.name!r} version={w.version}")
        seen.add(key)
    return workflows
0
my-deepagent/tests/__init__.py
Normal file
0
my-deepagent/tests/fixtures/__init__.py
vendored
Normal file
0
my-deepagent/tests/integration/__init__.py
Normal file
78
my-deepagent/tests/integration/test_checkpointer.py
Normal file
@@ -0,0 +1,78 @@
"""Integration tests for src/my_deepagent/persistence/checkpointer.py."""

from __future__ import annotations

import sqlite3
from pathlib import Path

from my_deepagent.persistence.checkpointer import get_checkpointer_ctx
|
||||||
|
|
||||||
|
|
||||||
|
class TestGetCheckpointerCtx:
|
||||||
|
"""Tests for the get_checkpointer_ctx context manager."""
|
||||||
|
|
||||||
|
def test_ctx_yields_saver_and_cleans_up(self, tmp_path: Path) -> None:
|
||||||
|
"""Entering the context yields a SqliteSaver; exiting releases the connection."""
|
||||||
|
db_path = tmp_path / "ck.db"
|
||||||
|
with get_checkpointer_ctx(db_path) as saver:
|
||||||
|
assert saver is not None
|
||||||
|
# The DB file must exist while inside the context.
|
||||||
|
assert db_path.exists()
|
||||||
|
|
||||||
|
# After context exit the file must still exist (not deleted).
|
||||||
|
assert db_path.exists()
|
||||||
|
|
||||||
|
def test_db_file_created_on_enter(self, tmp_path: Path) -> None:
|
||||||
|
"""The sqlite file is created when the context is entered."""
|
||||||
|
db_path = tmp_path / "nested" / "dir" / "ck.db"
|
||||||
|
assert not db_path.exists()
|
||||||
|
|
||||||
|
with get_checkpointer_ctx(db_path):
|
||||||
|
assert db_path.exists()
|
||||||
|
|
||||||
|
def test_parent_dir_created_if_missing(self, tmp_path: Path) -> None:
|
||||||
|
"""Parent directory is created automatically even if it does not exist."""
|
||||||
|
db_path = tmp_path / "a" / "b" / "c" / "ck.db"
|
||||||
|
assert not db_path.parent.exists()
|
||||||
|
|
||||||
|
with get_checkpointer_ctx(db_path):
|
||||||
|
assert db_path.parent.exists()
|
||||||
|
|
||||||
|
def test_connection_released_after_ctx_exit(self, tmp_path: Path) -> None:
|
||||||
|
"""After exiting the context manager, another process/connection can open the DB."""
|
||||||
|
db_path = tmp_path / "ck.db"
|
||||||
|
|
||||||
|
with get_checkpointer_ctx(db_path):
|
||||||
|
pass # enter and exit
|
||||||
|
|
||||||
|
# WAL mode can allow concurrent readers, so this read could succeed even with a leaked
|
||||||
|
# connection; opening a fresh sqlite3 connection only confirms the DB is still readable.
|
||||||
|
with sqlite3.connect(str(db_path)) as conn:
|
||||||
|
cur = conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
|
||||||
|
# LangGraph creates its checkpoint tables; result must be a list (not error).
|
||||||
|
tables = [row[0] for row in cur.fetchall()]
|
||||||
|
assert isinstance(tables, list)
|
||||||
|
|
||||||
|
def test_meta_and_checkpoint_db_no_lock_conflict(self, tmp_path: Path) -> None:
|
||||||
|
"""Using two separate DB files in the same directory causes no locking conflict."""
|
||||||
|
meta_db = tmp_path / "meta.db"
|
||||||
|
ck_db = tmp_path / "checkpoints.db"
|
||||||
|
|
||||||
|
# Simulate concurrent use: open both within the same scope.
|
||||||
|
with get_checkpointer_ctx(ck_db) as saver:
|
||||||
|
# Write something to the meta DB while the checkpointer holds its connection.
|
||||||
|
with sqlite3.connect(str(meta_db)) as conn:
|
||||||
|
conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
|
||||||
|
conn.execute("INSERT OR REPLACE INTO kv VALUES ('key', 'value')")
|
||||||
|
conn.commit()
|
||||||
|
|
||||||
|
assert saver is not None
|
||||||
|
|
||||||
|
# Both files must exist and be independently readable.
|
||||||
|
assert meta_db.exists()
|
||||||
|
assert ck_db.exists()
|
||||||
|
|
||||||
|
with sqlite3.connect(str(meta_db)) as conn:
|
||||||
|
row = conn.execute("SELECT v FROM kv WHERE k='key'").fetchone()
|
||||||
|
assert row is not None
|
||||||
|
assert row[0] == "value"
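The lifecycle these tests rely on (parent directory created on enter, connection released on exit, file left on disk) can be sketched with the standard library alone. This is an assumption-laden illustration: `open_db_ctx` is a hypothetical helper, not the real `get_checkpointer_ctx`, and it yields a plain `sqlite3.Connection` rather than a LangGraph saver.

```python
import sqlite3
import tempfile
from collections.abc import Iterator
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def open_db_ctx(db_path: Path) -> Iterator[sqlite3.Connection]:
    """Hypothetical helper: create parent dirs, open the DB, always close on exit."""
    db_path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(db_path))
    try:
        yield conn
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "a" / "b" / "ck.db"
    with open_db_ctx(path) as conn:
        conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
        conn.commit()
        existed_inside = path.exists()

    # A closed connection rejects further use with ProgrammingError.
    try:
        conn.execute("SELECT 1")
        closed = False
    except sqlite3.ProgrammingError:
        closed = True

    # Exiting the context closes the connection but does not delete the file.
    still_exists = path.exists()
```

Probing the closed connection directly is a stricter release check than re-opening the file, since a second connection can usually open a SQLite database whether or not the first was closed.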
|
||||||
143
my-deepagent/tests/integration/test_openrouter_smoke.py
Normal file
@@ -0,0 +1,143 @@
|
|||||||
|
"""Real OpenRouter API smoke test. Costs ~$0.001-$0.003 per full run.
|
||||||
|
|
||||||
|
Skipped automatically when no API key is configured.
|
||||||
|
Uses deepseek/deepseek-chat (cheapest available) with max_tokens=50.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import os
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
from my_deepagent.config import load_config
|
||||||
|
from my_deepagent.persona import Persona
|
||||||
|
from my_deepagent.session import resolve_model_instance
|
||||||
|
|
||||||
|
_HAS_KEY = (
|
||||||
|
bool(os.environ.get("MYDEEPAGENT_OPENROUTER_API_KEY") or os.environ.get("OPENROUTER_API_KEY"))
|
||||||
|
or Path(".env").is_file()
|
||||||
|
)
|
||||||
|
|
||||||
|
pytestmark = [
|
||||||
|
pytest.mark.integration,
|
||||||
|
pytest.mark.skipif(not _HAS_KEY, reason="no OpenRouter API key configured"),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def _smoke_persona() -> Persona:
|
||||||
|
return Persona.model_validate(
|
||||||
|
{
|
||||||
|
"name": "smoke-test",
|
||||||
|
"version": 1,
|
||||||
|
"backend": "openrouter",
|
||||||
|
"model": "openrouter:deepseek/deepseek-chat",
|
||||||
|
"provider_origin": "China/DeepSeek",
|
||||||
|
"capabilities": ["evidence_check"],
|
||||||
|
"max_risk_level": "low",
|
||||||
|
"system_prompt": (
|
||||||
|
"You are a smoke-test echo bot. Reply only with the literal token 'OK'."
|
||||||
|
),
|
||||||
|
"model_params": {"max_tokens": 50, "temperature": 0.0},
|
||||||
|
# deepagents 0.6.x: using the local_shell backend together with permissions
|
||||||
|
# raises NotImplementedError. The state backend has no such permissions restriction.
|
||||||
|
"deepagents_backend": "state",
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _smoke_persona_local_shell() -> Persona:
|
||||||
|
return Persona.model_validate(
|
||||||
|
{
|
||||||
|
"name": "smoke-test-local-shell",
|
||||||
|
"version": 1,
|
||||||
|
"backend": "openrouter",
|
||||||
|
"model": "openrouter:deepseek/deepseek-chat",
|
||||||
|
"provider_origin": "China/DeepSeek",
|
||||||
|
"capabilities": ["evidence_check"],
|
||||||
|
"max_risk_level": "low",
|
||||||
|
"system_prompt": (
|
||||||
|
"You are a smoke-test echo bot. Reply only with the literal token 'OK'."
|
||||||
|
),
|
||||||
|
"model_params": {"max_tokens": 50, "temperature": 0.0},
|
||||||
|
# local_shell backend: SafetyShellMiddleware enforces path + destructive-command
|
||||||
|
# policy; permissions kwarg is skipped to avoid deepagents 0.6.1 NotImplementedError.
|
||||||
|
"deepagents_backend": "local_shell",
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def test_openrouter_chat_completion_returns_response() -> None:
|
||||||
|
"""Make one call through the ChatOpenAI instance to verify the OpenRouter base_url, auth, and response flow."""
|
||||||
|
config = load_config()
|
||||||
|
persona = _smoke_persona()
|
||||||
|
chat = resolve_model_instance(persona, config)
|
||||||
|
response = chat.invoke(
|
||||||
|
[
|
||||||
|
("system", persona.system_prompt),
|
||||||
|
("user", "Reply with the exact string 'OK' and nothing else."),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
assert response is not None
|
||||||
|
content = response.content
|
||||||
|
# langchain BaseMessage.content is str | list[content_block_dict];
|
||||||
|
# either form must be non-empty.
|
||||||
|
assert len(content) > 0
|
||||||
|
|
||||||
|
|
||||||
|
def test_openrouter_usage_metadata_present() -> None:
|
||||||
|
"""response.usage_metadata must populate input_tokens/output_tokens so that cost can be measured."""
|
||||||
|
config = load_config()
|
||||||
|
persona = _smoke_persona()
|
||||||
|
chat = resolve_model_instance(persona, config)
|
||||||
|
response = chat.invoke(
|
||||||
|
[
|
||||||
|
("system", persona.system_prompt),
|
||||||
|
("user", "Reply with 'OK'."),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
usage = getattr(response, "usage_metadata", None)
|
||||||
|
assert usage is not None, "OpenRouter response must include usage_metadata"
|
||||||
|
assert usage.get("input_tokens", 0) > 0
|
||||||
|
assert usage.get("output_tokens", 0) > 0
|
||||||
|
|
||||||
|
|
||||||
|
def test_openrouter_deepagents_create_smoke() -> None:
|
||||||
|
"""deepagents create_deep_agent plus one real OpenRouter call. The most expensive check."""
|
||||||
|
config = load_config()
|
||||||
|
persona = _smoke_persona()
|
||||||
|
from my_deepagent.session import build_agent
|
||||||
|
|
||||||
|
agent = build_agent(persona, config, root_dir=Path.cwd())
|
||||||
|
result = agent.invoke({"messages": [{"role": "user", "content": "Reply with 'OK' only."}]})
|
||||||
|
messages = result.get("messages", [])
|
||||||
|
assert len(messages) > 0
|
||||||
|
last = messages[-1]
|
||||||
|
content = getattr(last, "content", "")
|
||||||
|
if isinstance(content, list):
|
||||||
|
content = " ".join(str(c) for c in content)
|
||||||
|
assert len(str(content)) > 0
|
||||||
|
|
||||||
|
|
||||||
|
def test_openrouter_deepagents_local_shell_smoke(tmp_path: Path) -> None:
|
||||||
|
"""Real OpenRouter call via deepagents + LocalShellBackend + SafetyShellMiddleware.
|
||||||
|
|
||||||
|
Verifies deepagents 0.6.1 workaround: local_shell backend with permissions kwarg
|
||||||
|
skipped, SafetyShellMiddleware automatically injected by build_agent.
|
||||||
|
"""
|
||||||
|
config = load_config()
|
||||||
|
persona = _smoke_persona_local_shell()
|
||||||
|
from my_deepagent.session import build_agent
|
||||||
|
|
||||||
|
agent = build_agent(persona, config, root_dir=tmp_path)
|
||||||
|
result = agent.invoke({"messages": [{"role": "user", "content": "Reply 'OK' only."}]})
|
||||||
|
messages = result.get("messages", [])
|
||||||
|
assert len(messages) > 0
|
||||||
|
last = messages[-1]
|
||||||
|
content = getattr(last, "content", "")
|
||||||
|
if isinstance(content, list):
|
||||||
|
content = " ".join(str(c) for c in content)
|
||||||
|
assert len(str(content)) > 0
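The smoke tests flatten list-valued message content with `" ".join(str(c) for c in content)`, which also stringifies dict blocks wholesale. If a stricter check on the reply text were wanted, a helper along these lines could be used. It is a sketch under assumptions: `content_to_text` is a hypothetical name, and the `{"type": "text", "text": ...}` block shape is assumed rather than taken from this commit.

```python
from typing import Any


def content_to_text(content: Any) -> str:
    """Flatten message content that may be a str or a list of blocks into one string."""
    if isinstance(content, str):
        return content
    parts: list[str] = []
    for block in content:
        if isinstance(block, dict):
            # Assumed block shape: {"type": "text", "text": "..."}; other types are skipped.
            if block.get("type") == "text":
                parts.append(str(block.get("text", "")))
        else:
            parts.append(str(block))
    return " ".join(parts)


plain = content_to_text("OK")
from_blocks = content_to_text(
    [{"type": "text", "text": "OK"}, {"type": "image_url", "image_url": {}}]
)
from_strings = content_to_text(["a", "b"])
```

With such a helper the smoke test could assert on the extracted text instead of only on a non-zero length.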
|
||||||
670
my-deepagent/tests/integration/test_persistence.py
Normal file
@@ -0,0 +1,670 @@
|
|||||||
|
"""Integration tests for src/my_deepagent/persistence/ (DB engine + ORM models)."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import subprocess
|
||||||
|
import sys
|
||||||
|
import uuid
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
import pytest_asyncio
|
||||||
|
from sqlalchemy import text
|
||||||
|
from sqlalchemy.exc import IntegrityError
|
||||||
|
|
||||||
|
from my_deepagent.persistence.db import Database
|
||||||
|
from my_deepagent.persistence.models import (
|
||||||
|
AgentPersonaRow,
|
||||||
|
RunEventRow,
|
||||||
|
RunInputRow,
|
||||||
|
RunPhaseRow,
|
||||||
|
RunRow,
|
||||||
|
WorkflowTemplateRow,
|
||||||
|
)
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Helpers
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
_NOW = "2026-05-15T00:00:00+00:00"
|
||||||
|
|
||||||
|
|
||||||
|
def _make_id() -> str:
|
||||||
|
return str(uuid.uuid4())
|
||||||
|
|
||||||
|
|
||||||
|
def _workflow_template_row(template_id: str) -> WorkflowTemplateRow:
|
||||||
|
"""Return a WorkflowTemplateRow that satisfies the runs.template_id FK."""
|
||||||
|
return WorkflowTemplateRow(
|
||||||
|
id=template_id,
|
||||||
|
name="test-wf",
|
||||||
|
version=1,
|
||||||
|
hash=template_id, # unique per invocation
|
||||||
|
definition={},
|
||||||
|
created_at=_NOW,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _run_row(run_id: str | None = None, template_id: str | None = None) -> RunRow:
|
||||||
|
rid = run_id or _make_id()
|
||||||
|
tid = template_id or _make_id()
|
||||||
|
return RunRow(
|
||||||
|
id=rid,
|
||||||
|
template_id=tid,
|
||||||
|
template_hash="a" * 64,
|
||||||
|
state="pending",
|
||||||
|
repo_path="/repo",
|
||||||
|
base_branch="main",
|
||||||
|
worktree_root="/wt",
|
||||||
|
created_at=_NOW,
|
||||||
|
updated_at=_NOW,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Fixtures
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture()
|
||||||
|
def db_url(tmp_path: Path) -> str:
|
||||||
|
return f"sqlite+aiosqlite:///{tmp_path}/test.db"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest_asyncio.fixture()
|
||||||
|
async def db(db_url: str) -> Database: # type: ignore[misc]
|
||||||
|
database = Database(db_url)
|
||||||
|
await database.init_schema()
|
||||||
|
yield database # type: ignore[misc]
|
||||||
|
await database.dispose()
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.1: All 18 tables exist after init_schema
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
EXPECTED_TABLES = {
|
||||||
|
"workflow_templates",
|
||||||
|
"agent_personas",
|
||||||
|
"runs",
|
||||||
|
"run_inputs",
|
||||||
|
"run_bindings",
|
||||||
|
"run_phases",
|
||||||
|
"run_events",
|
||||||
|
"approval_requests",
|
||||||
|
"approval_decisions",
|
||||||
|
"artifacts",
|
||||||
|
"interactive_sessions",
|
||||||
|
"tool_calls",
|
||||||
|
"llm_calls",
|
||||||
|
"model_pricing",
|
||||||
|
"budget_ledger",
|
||||||
|
"persona_consents",
|
||||||
|
"phase_feedback",
|
||||||
|
"run_commands",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_init_schema_creates_all_tables(db: Database) -> None:
|
||||||
|
"""All expected tables must exist in sqlite_master after init_schema."""
|
||||||
|
async with db.session() as session:
|
||||||
|
result = await session.execute(
|
||||||
|
text("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
|
||||||
|
)
|
||||||
|
table_names = {row[0] for row in result.fetchall()}
|
||||||
|
table_names.discard("alembic_version")
|
||||||
|
assert EXPECTED_TABLES <= table_names, f"Missing tables: {EXPECTED_TABLES - table_names}"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.2: WAL mode active
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_wal_mode_active(db: Database) -> None:
|
||||||
|
"""journal_mode PRAGMA must return 'wal' after connection."""
|
||||||
|
async with db.session() as session:
|
||||||
|
result = await session.execute(text("PRAGMA journal_mode"))
|
||||||
|
mode = result.scalar()
|
||||||
|
assert mode == "wal", f"Expected 'wal', got {mode!r}"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.3: busy_timeout active
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_busy_timeout_active(db: Database) -> None:
|
||||||
|
"""busy_timeout PRAGMA must return 5000."""
|
||||||
|
async with db.session() as session:
|
||||||
|
result = await session.execute(text("PRAGMA busy_timeout"))
|
||||||
|
timeout = result.scalar()
|
||||||
|
assert timeout == 5000, f"Expected 5000, got {timeout!r}"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.4: foreign_keys active
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_foreign_keys_active(db: Database) -> None:
|
||||||
|
"""foreign_keys PRAGMA must return 1."""
|
||||||
|
async with db.session() as session:
|
||||||
|
result = await session.execute(text("PRAGMA foreign_keys"))
|
||||||
|
fk = result.scalar()
|
||||||
|
assert fk == 1, f"Expected 1, got {fk!r}"
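The three PRAGMA checks above (A.2 to A.4) can be reproduced against a plain `sqlite3` connection. The sketch assumes the PRAGMAs are issued once per new connection; `connect_with_pragmas` is a hypothetical helper, and how the real `Database` class wires them into SQLAlchemy is not shown in this diff.

```python
import sqlite3
import tempfile
from pathlib import Path


def connect_with_pragmas(db_path: Path) -> sqlite3.Connection:
    """Hypothetical helper: open a connection and apply the three PRAGMAs under test."""
    conn = sqlite3.connect(str(db_path))
    conn.execute("PRAGMA journal_mode=WAL")    # persisted in the database file
    conn.execute("PRAGMA busy_timeout=5000")   # per connection, in milliseconds
    conn.execute("PRAGMA foreign_keys=ON")     # per connection, off by default
    return conn


with tempfile.TemporaryDirectory() as tmp:
    conn = connect_with_pragmas(Path(tmp) / "test.db")
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
    foreign_keys = conn.execute("PRAGMA foreign_keys").fetchone()[0]
    conn.close()
```

`busy_timeout` and `foreign_keys` are connection-scoped, so they must be reapplied to every pooled connection; only `journal_mode=WAL` survives in the file itself.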
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.5: basic insert + select round-trip
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_run_row_insert_and_select(db: Database) -> None:
|
||||||
|
"""RunRow insert then SELECT must return the same state."""
|
||||||
|
rid = _make_id()
|
||||||
|
tid = _make_id()
|
||||||
|
template = _workflow_template_row(tid)
|
||||||
|
run = _run_row(rid, template_id=tid)
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(template)
|
||||||
|
await session.flush()
|
||||||
|
session.add(run)
|
||||||
|
async with db.session() as session:
|
||||||
|
fetched = await session.get(RunRow, rid)
|
||||||
|
assert fetched is not None
|
||||||
|
assert fetched.id == rid
|
||||||
|
assert fetched.state == "pending"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_agent_persona_row_insert_and_select(db: Database) -> None:
|
||||||
|
"""AgentPersonaRow insert then SELECT must return the same record."""
|
||||||
|
persona_id = _make_id()
|
||||||
|
persona = AgentPersonaRow(
|
||||||
|
id=persona_id,
|
||||||
|
name="test-persona",
|
||||||
|
version=1,
|
||||||
|
hash="b" * 64,
|
||||||
|
definition={"model": "test"},
|
||||||
|
created_at=_NOW,
|
||||||
|
)
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(persona)
|
||||||
|
async with db.session() as session:
|
||||||
|
fetched = await session.get(AgentPersonaRow, persona_id)
|
||||||
|
assert fetched is not None
|
||||||
|
assert fetched.name == "test-persona"
|
||||||
|
assert fetched.version == 1
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.6: UNIQUE constraint — workflow_templates.hash duplicate
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_workflow_template_hash_unique_constraint(db: Database) -> None:
|
||||||
|
"""Inserting two WorkflowTemplateRows with the same hash must raise IntegrityError."""
|
||||||
|
|
||||||
|
def make_template(tid: str) -> WorkflowTemplateRow:
|
||||||
|
return WorkflowTemplateRow(
|
||||||
|
id=tid,
|
||||||
|
name="my-wf",
|
||||||
|
version=1,
|
||||||
|
hash="c" * 64, # same hash for both
|
||||||
|
definition={},
|
||||||
|
created_at=_NOW,
|
||||||
|
)
|
||||||
|
|
||||||
|
t1 = make_template(_make_id())
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(t1)
|
||||||
|
|
||||||
|
t2 = make_template(_make_id())
|
||||||
|
with pytest.raises(IntegrityError):
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(t2)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.7: FK CASCADE — RunRow delete cascades to RunInputRow
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_fk_cascade_run_delete_cascades_run_input(db: Database) -> None:
|
||||||
|
"""Deleting a RunRow must cascade-delete its RunInputRow."""
|
||||||
|
rid = _make_id()
|
||||||
|
tid = _make_id()
|
||||||
|
template = _workflow_template_row(tid)
|
||||||
|
run = _run_row(rid, template_id=tid)
|
||||||
|
inp = RunInputRow(
|
||||||
|
id=_make_id(),
|
||||||
|
run_id=rid,
|
||||||
|
requirements_md="# Requirements",
|
||||||
|
objective={"goal": "test"},
|
||||||
|
extra={},
|
||||||
|
input_hash="d" * 64,
|
||||||
|
)
|
||||||
|
# Insert parent and child in the same transaction so FK is satisfied.
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(template)
|
||||||
|
await session.flush() # persist template before run references it
|
||||||
|
session.add(run)
|
||||||
|
await session.flush() # persist run before inp references it
|
||||||
|
session.add(inp)
|
||||||
|
|
||||||
|
async with db.session() as session:
|
||||||
|
fetched_run = await session.get(RunRow, rid)
|
||||||
|
assert fetched_run is not None
|
||||||
|
await session.delete(fetched_run)
|
||||||
|
|
||||||
|
async with db.session() as session:
|
||||||
|
result = await session.execute(
|
||||||
|
text("SELECT id FROM run_inputs WHERE run_id = :rid"),
|
||||||
|
{"rid": rid},
|
||||||
|
)
|
||||||
|
rows = result.fetchall()
|
||||||
|
assert rows == [], f"Expected cascade delete of run_inputs, got {rows}"
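The cascade behaviour checked in A.7 depends on both the `ON DELETE CASCADE` clause and the `foreign_keys` PRAGMA. A minimal standalone sketch follows, using a simplified two-column schema that is an assumption and not the project's actual DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys=ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE runs (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE run_inputs ("
    "id TEXT PRIMARY KEY, "
    "run_id TEXT NOT NULL REFERENCES runs(id) ON DELETE CASCADE)"
)
conn.execute("INSERT INTO runs VALUES ('r1')")
conn.execute("INSERT INTO run_inputs VALUES ('i1', 'r1')")

# Deleting the parent row removes the dependent child row as well.
conn.execute("DELETE FROM runs WHERE id = 'r1'")
remaining = conn.execute("SELECT id FROM run_inputs").fetchall()
conn.close()
```

Without the PRAGMA the delete would still succeed but leave the child row orphaned, which is why the `foreign_keys` test (A.4) matters for this one.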
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.8: JSON column round-trip
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_json_column_round_trip(db: Database) -> None:
|
||||||
|
"""RunEventRow.payload nested dict must survive DB round-trip intact."""
|
||||||
|
rid = _make_id()
|
||||||
|
tid = _make_id()
|
||||||
|
template = _workflow_template_row(tid)
|
||||||
|
run = _run_row(rid, template_id=tid)
|
||||||
|
payload: dict[str, Any] = {
|
||||||
|
"nested": {"list": [1, 2, 3], "flag": True},
|
||||||
|
"msg": "hello",
|
||||||
|
}
|
||||||
|
event = RunEventRow(
|
||||||
|
run_id=rid,
|
||||||
|
seq=1,
|
||||||
|
type="phase_started",
|
||||||
|
payload=payload,
|
||||||
|
idempotency_key="idem-1",
|
||||||
|
ts=_NOW,
|
||||||
|
)
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(template)
|
||||||
|
await session.flush() # persist template before run references it
|
||||||
|
session.add(run)
|
||||||
|
await session.flush() # persist run before event references it
|
||||||
|
session.add(event)
|
||||||
|
async with db.session() as session:
|
||||||
|
result = await session.execute(
|
||||||
|
text("SELECT payload FROM run_events WHERE run_id = :rid"), {"rid": rid}
|
||||||
|
)
|
||||||
|
raw = result.scalar()
|
||||||
|
import json as _json
|
||||||
|
|
||||||
|
restored = _json.loads(raw) if isinstance(raw, str) else raw
|
||||||
|
assert restored == payload
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.9: UUID string column round-trip
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_uuid_column_round_trip(db: Database) -> None:
|
||||||
|
"""UUID primary key stored as string must compare equal after retrieval."""
|
||||||
|
expected_id = str(uuid.uuid4())
|
||||||
|
tid = _make_id()
|
||||||
|
template = _workflow_template_row(tid)
|
||||||
|
run = RunRow(
|
||||||
|
id=expected_id,
|
||||||
|
template_id=tid,
|
||||||
|
template_hash="e" * 64,
|
||||||
|
state="running",
|
||||||
|
repo_path="/r",
|
||||||
|
base_branch="main",
|
||||||
|
worktree_root="/w",
|
||||||
|
created_at=_NOW,
|
||||||
|
updated_at=_NOW,
|
||||||
|
)
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(template)
|
||||||
|
await session.flush()
|
||||||
|
session.add(run)
|
||||||
|
async with db.session() as session:
|
||||||
|
fetched = await session.get(RunRow, expected_id)
|
||||||
|
assert fetched is not None
|
||||||
|
assert fetched.id == expected_id
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.10: UNIQUE(run_id, seq) on run_events
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_run_events_unique_run_seq(db: Database) -> None:
|
||||||
|
"""Two RunEventRows with the same (run_id, seq) must raise IntegrityError."""
|
||||||
|
rid = _make_id()
|
||||||
|
tid = _make_id()
|
||||||
|
template = _workflow_template_row(tid)
|
||||||
|
run = _run_row(rid, template_id=tid)
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(template)
|
||||||
|
await session.flush()
|
||||||
|
session.add(run)
|
||||||
|
await session.flush()
|
||||||
|
session.add(
|
||||||
|
RunEventRow(
|
||||||
|
run_id=rid,
|
||||||
|
seq=1,
|
||||||
|
type="x",
|
||||||
|
payload={},
|
||||||
|
idempotency_key="key-a",
|
||||||
|
ts=_NOW,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
with pytest.raises(IntegrityError):
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(
|
||||||
|
RunEventRow(
|
||||||
|
run_id=rid,
|
||||||
|
seq=1, # same seq → collision on (run_id, seq)
|
||||||
|
type="x",
|
||||||
|
payload={},
|
||||||
|
idempotency_key="key-b",
|
||||||
|
ts=_NOW,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.11: UNIQUE(run_id, idempotency_key) on run_events
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_run_events_unique_idempotency_key(db: Database) -> None:
|
||||||
|
"""Two RunEventRows with the same (run_id, idempotency_key) must raise IntegrityError."""
|
||||||
|
rid = _make_id()
|
||||||
|
tid = _make_id()
|
||||||
|
template = _workflow_template_row(tid)
|
||||||
|
run = _run_row(rid, template_id=tid)
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(template)
|
||||||
|
await session.flush()
|
||||||
|
session.add(run)
|
||||||
|
await session.flush()
|
||||||
|
session.add(
|
||||||
|
RunEventRow(
|
||||||
|
run_id=rid,
|
||||||
|
seq=1,
|
||||||
|
type="x",
|
||||||
|
payload={},
|
||||||
|
idempotency_key="shared-key",
|
||||||
|
ts=_NOW,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
with pytest.raises(IntegrityError):
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(
|
||||||
|
RunEventRow(
|
||||||
|
run_id=rid,
|
||||||
|
seq=2, # different seq
|
||||||
|
type="x",
|
||||||
|
payload={},
|
||||||
|
idempotency_key="shared-key", # same idem key → collision
|
||||||
|
ts=_NOW,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.12: Index existence on run_events
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_run_events_index_exists(db: Database) -> None:
|
||||||
|
"""The run_events_run_id_ts_idx index must exist in sqlite_master."""
|
||||||
|
async with db.session() as session:
|
||||||
|
result = await session.execute(
|
||||||
|
text(
|
||||||
|
"SELECT name FROM sqlite_master "
|
||||||
|
"WHERE type='index' AND name='run_events_run_id_ts_idx'"
|
||||||
|
)
|
||||||
|
)
|
||||||
|
names = [row[0] for row in result.fetchall()]
|
||||||
|
assert "run_events_run_id_ts_idx" in names
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.13: dispose + new session works
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_dispose_and_reconnect(db_url: str) -> None:
|
||||||
|
"""After dispose(), creating a new Database and querying must succeed."""
|
||||||
|
db1 = Database(db_url)
|
||||||
|
await db1.init_schema()
|
||||||
|
await db1.dispose()
|
||||||
|
|
||||||
|
db2 = Database(db_url)
|
||||||
|
async with db2.session() as session:
|
||||||
|
result = await session.execute(
|
||||||
|
text("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
|
||||||
|
)
|
||||||
|
tables = [row[0] for row in result.fetchall()]
|
||||||
|
await db2.dispose()
|
||||||
|
assert "runs" in tables
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# A.14: Alembic upgrade head produces valid schema
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_alembic_upgrade_head_produces_valid_schema(tmp_path: Path) -> None:
|
||||||
|
"""Running alembic upgrade head on a fresh DB must create the expected tables."""
|
||||||
|
db_path = tmp_path / "alembic_test.db"
|
||||||
|
db_url = f"sqlite:///{db_path}" # sync URL for alembic env.py
|
||||||
|
|
||||||
|
project_root = Path(__file__).parent.parent.parent
|
||||||
|
|
||||||
|
result = subprocess.run(
|
||||||
|
[
|
||||||
|
sys.executable,
|
||||||
|
"-m",
|
||||||
|
"alembic",
|
||||||
|
"upgrade",
|
||||||
|
"head",
|
||||||
|
],
|
||||||
|
cwd=str(project_root),
|
||||||
|
env={**__import__("os").environ, "DATABASE_URL": db_url},
|
||||||
|
capture_output=True,
|
||||||
|
text=True,
|
||||||
|
)
|
||||||
|
assert result.returncode == 0, (
|
||||||
|
f"alembic upgrade head failed:\nSTDOUT:\n{result.stdout}\nSTDERR:\n{result.stderr}"
|
||||||
|
)
|
||||||
|
|
||||||
|
import sqlite3
|
||||||
|
|
||||||
|
with sqlite3.connect(str(db_path)) as conn:
|
||||||
|
cur = conn.execute("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
|
||||||
|
tables = {row[0] for row in cur.fetchall()}
|
||||||
|
|
||||||
|
tables.discard("alembic_version")
|
||||||
|
assert EXPECTED_TABLES <= tables, f"Missing after alembic upgrade: {EXPECTED_TABLES - tables}"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# P0-1: partial unique index ux_active_run_repo_base
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_active_run_unique_index_blocks_duplicate(db: Database) -> None:
|
||||||
|
"""Two active runs with the same (repo_path, base_branch) must raise IntegrityError."""
|
||||||
|
tid = _make_id()
|
||||||
|
template = _workflow_template_row(tid)
|
||||||
|
rid1 = _make_id()
|
||||||
|
run1 = _run_row(rid1, template_id=tid)
|
||||||
|
run1.state = "running"
|
||||||
|
|
||||||
|
rid2 = _make_id()
|
||||||
|
run2 = _run_row(rid2, template_id=tid)
|
||||||
|
run2.state = "pending"
|
||||||
|
# Same repo_path and base_branch — both active → must violate unique index.
|
||||||
|
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(template)
|
||||||
|
await session.flush()
|
||||||
|
session.add(run1)
|
||||||
|
|
||||||
|
with pytest.raises(IntegrityError):
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(run2)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_active_run_unique_index_allows_completed(db: Database) -> None:
|
||||||
|
"""A completed run allows a new active run with the same (repo_path, base_branch)."""
|
||||||
|
tid = _make_id()
|
||||||
|
template = _workflow_template_row(tid)
|
||||||
|
rid1 = _make_id()
|
||||||
|
run1 = _run_row(rid1, template_id=tid)
|
||||||
|
run1.state = "completed"
|
||||||
|
|
||||||
|
rid2 = _make_id()
|
||||||
|
run2 = _run_row(rid2, template_id=tid)
|
||||||
|
run2.state = "running"
|
||||||
|
# Same repo/branch; run1 is completed (excluded) → run2 must succeed.
|
||||||
|
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(template)
|
||||||
|
await session.flush()
|
||||||
|
session.add(run1)
|
||||||
|
|
||||||
|
async with db.session() as session:
|
||||||
|
session.add(run2)
|
||||||
|
|
||||||
|
async with db.session() as session:
|
||||||
|
fetched = await session.get(RunRow, rid2)
|
||||||
|
assert fetched is not None
|
||||||
|
assert fetched.state == "running"
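The two tests above describe a partial unique index: uniqueness on `(repo_path, base_branch)` applies only to rows in an active state. The sketch below shows the mechanism in plain SQLite. The predicate (`'pending'` and `'running'` counted as active) and the reduced column set are assumptions; the real definition lives in the alembic baseline, which this section does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE runs (id TEXT PRIMARY KEY, state TEXT, repo_path TEXT, base_branch TEXT)"
)
# Assumed predicate: only 'pending' and 'running' rows take part in the index.
conn.execute(
    "CREATE UNIQUE INDEX ux_active_run_repo_base ON runs (repo_path, base_branch) "
    "WHERE state IN ('pending', 'running')"
)

# A completed run is outside the index, so an active run may share its repo/branch.
conn.execute("INSERT INTO runs VALUES ('r1', 'completed', '/repo', 'main')")
conn.execute("INSERT INTO runs VALUES ('r2', 'running', '/repo', 'main')")

# A second active run on the same repo/branch violates the index.
try:
    conn.execute("INSERT INTO runs VALUES ('r3', 'pending', '/repo', 'main')")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True

row_count = conn.execute("SELECT COUNT(*) FROM runs").fetchone()[0]
conn.close()
```

Enforcing this in the database rather than in application code closes the race where two processes both check for an active run and then both insert one.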
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
# P0-3: FK CASCADE — RunRow delete cascades to all audit children
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_fk_cascade_run_delete_cascades_phase_feedback(db: Database) -> None:
    """Deleting a RunRow cascades to phase_feedback and run_phases rows."""
    from my_deepagent.persistence.models import PhaseFeedbackRow

    tid = _make_id()
    rid = _make_id()
    phase_id = _make_id()
    template = _workflow_template_row(tid)
    run = _run_row(rid, template_id=tid)
    phase = RunPhaseRow(
        id=phase_id,
        run_id=rid,
        phase_key="plan",
        seq=1,
        state="completed",
        attempts=1,
    )
    feedback = PhaseFeedbackRow(
        run_id=rid,
        phase_id=phase_id,
        reaction="thumbs_up",
        created_at=_NOW,
    )

    async with db.session() as session:
        session.add(template)
        await session.flush()
        session.add(run)
        await session.flush()
        session.add(phase)
        await session.flush()
        session.add(feedback)

    async with db.session() as session:
        fetched_run = await session.get(RunRow, rid)
        assert fetched_run is not None
        await session.delete(fetched_run)

    async with db.session() as session:
        fb_result = await session.execute(
            text("SELECT id FROM phase_feedback WHERE run_id = :rid"), {"rid": rid}
        )
        ph_result = await session.execute(
            text("SELECT id FROM run_phases WHERE run_id = :rid"), {"rid": rid}
        )
        assert fb_result.fetchall() == [], "phase_feedback must cascade-delete with run"
        assert ph_result.fetchall() == [], "run_phases must cascade-delete with run"


# ---------------------------------------------------------------------------
# P0-3: FK RESTRICT — deleting WorkflowTemplateRow with runs is blocked
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_fk_restrict_template_delete_blocked_by_run(db: Database) -> None:
    """Deleting a WorkflowTemplateRow that has a referencing RunRow must raise IntegrityError."""
    tid = _make_id()
    rid = _make_id()
    template = _workflow_template_row(tid)
    run = _run_row(rid, template_id=tid)

    async with db.session() as session:
        session.add(template)
        await session.flush()
        session.add(run)

    with pytest.raises(IntegrityError):
        async with db.session() as session:
            fetched = await session.get(WorkflowTemplateRow, tid)
            assert fetched is not None
            await session.delete(fetched)


# ---------------------------------------------------------------------------
# P0-1: partial unique index exists in sqlite_master after init_schema
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_active_run_partial_index_exists_in_schema(db: Database) -> None:
    """ux_active_run_repo_base partial unique index must exist after init_schema."""
    async with db.session() as session:
        result = await session.execute(
            text(
                "SELECT sql FROM sqlite_master "
                "WHERE type='index' AND name='ux_active_run_repo_base'"
            )
        )
        row = result.fetchone()
        assert row is not None, "ux_active_run_repo_base index missing from sqlite_master"
        assert "WHERE" in (row[0] or ""), f"Expected WHERE clause in index SQL, got: {row[0]}"
0
my-deepagent/tests/unit/__init__.py
Normal file
391
my-deepagent/tests/unit/test_artifact_schema.py
Normal file
@@ -0,0 +1,391 @@
"""Unit tests for src/my_deepagent/artifact_schema.py."""

from __future__ import annotations

import json
from pathlib import Path
from typing import Any

import pytest

from my_deepagent.artifact_schema import (
    ArtifactSchemaRegistry,
    ValidationFinding,
    ValidationResult,
)
from my_deepagent.errors import MyDeepAgentError

# ---------------------------------------------------------------------------
# Fixtures
# ---------------------------------------------------------------------------

REPO_ROOT = Path(__file__).parent.parent.parent
SEED_ROOT = REPO_ROOT / "docs" / "schemas" / "artifacts"

SEED_SCHEMA_IDS = [
    "common/final-report@1",
    "dev/phase-plan@1",
    "dev/review-finding-batch@1",
    "dev/spec@1",
]


@pytest.fixture
def seed_registry() -> ArtifactSchemaRegistry:
    return ArtifactSchemaRegistry(roots=[SEED_ROOT])


@pytest.fixture
def valid_spec() -> dict[str, Any]:
    return {
        "runId": "00000000-0000-4000-8000-000000000000",
        "phaseKey": "spec",
        "requirements": "User wants a CLI tool that analyzes log files.",
        "acceptance_criteria": ["parses .log files", "outputs JSON summary"],
        "approach": "Build a typer-based CLI using regex and json output.",
        "risks": ["log format variations may break parser"],
    }


# ---------------------------------------------------------------------------
# 1. Seed schema load success (4 schemas)
# ---------------------------------------------------------------------------


@pytest.mark.parametrize("schema_id", SEED_SCHEMA_IDS)
def test_seed_schema_loads(seed_registry: ArtifactSchemaRegistry, schema_id: str) -> None:
    schema = seed_registry.load(schema_id)
    assert isinstance(schema, dict)
    assert schema.get("$id") == schema_id


# ---------------------------------------------------------------------------
# 2. Load result caching — same dict object on second call
# ---------------------------------------------------------------------------


def test_load_caches_same_object(seed_registry: ArtifactSchemaRegistry) -> None:
    first = seed_registry.load("dev/spec@1")
    second = seed_registry.load("dev/spec@1")
    assert first is second


# ---------------------------------------------------------------------------
# 3. Unknown schema_id → artifact_schema_unknown
# ---------------------------------------------------------------------------


def test_unknown_schema_id_raises(seed_registry: ArtifactSchemaRegistry) -> None:
    with pytest.raises(MyDeepAgentError) as exc_info:
        seed_registry.load("dev/nonexistent@99")
    assert exc_info.value.code == "artifact_schema_unknown"


# ---------------------------------------------------------------------------
# 4. Invalid schema_id format (no slash) → artifact_schema_unknown
# ---------------------------------------------------------------------------


def test_invalid_schema_id_no_slash(seed_registry: ArtifactSchemaRegistry) -> None:
    with pytest.raises(MyDeepAgentError) as exc_info:
        seed_registry.load("foo")
    assert exc_info.value.code == "artifact_schema_unknown"


# ---------------------------------------------------------------------------
# 5. schema_id starting with "/" → rejected (empty domain portion)
# ---------------------------------------------------------------------------


def test_invalid_schema_id_leading_slash(seed_registry: ArtifactSchemaRegistry) -> None:
    # "/foo/bar" has a slash but the domain portion would be empty
    # After splitting on "/", domain="" which is not a valid domain/name pair.
    # The registry treats it as a path traversal risk: Path("/foo/bar.json")
    # is absolute and will never exist under a root directory (is_file() → False).
    with pytest.raises(MyDeepAgentError) as exc_info:
        seed_registry.load("/dev/spec@1")
    assert exc_info.value.code == "artifact_schema_unknown"


# ---------------------------------------------------------------------------
# 6. Empty schema_id → artifact_schema_unknown
# ---------------------------------------------------------------------------


def test_empty_schema_id_raises(seed_registry: ArtifactSchemaRegistry) -> None:
    with pytest.raises(MyDeepAgentError) as exc_info:
        seed_registry.load("")
    assert exc_info.value.code == "artifact_schema_unknown"


# ---------------------------------------------------------------------------
# 7. Fallback: schema absent in first root, present in second
# ---------------------------------------------------------------------------


def test_fallback_to_second_root(tmp_path: Path) -> None:
    first_root = tmp_path / "first"
    first_root.mkdir()
    second_root = tmp_path / "second"
    (second_root / "dev").mkdir(parents=True)
    schema: dict[str, Any] = {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "$id": "dev/thing@1",
        "type": "object",
    }
    (second_root / "dev" / "thing@1.json").write_text(json.dumps(schema), encoding="utf-8")
    registry = ArtifactSchemaRegistry(roots=[first_root, second_root])
    loaded = registry.load("dev/thing@1")
    assert loaded["$id"] == "dev/thing@1"


# ---------------------------------------------------------------------------
# 8. validate with valid data → ok=True
# ---------------------------------------------------------------------------


def test_validate_valid_spec(
    seed_registry: ArtifactSchemaRegistry, valid_spec: dict[str, Any]
) -> None:
    result = seed_registry.validate("dev/spec@1", valid_spec)
    assert result.ok is True
    assert result.errors == ()


# ---------------------------------------------------------------------------
# 9. validate with invalid data → ok=False, findings non-empty
# ---------------------------------------------------------------------------


def test_validate_invalid_data_returns_findings(
    seed_registry: ArtifactSchemaRegistry,
) -> None:
    result = seed_registry.validate("dev/spec@1", {"wrong": "data"})
    assert result.ok is False
    assert len(result.errors) > 0
    for finding in result.errors:
        assert isinstance(finding, ValidationFinding)


# ---------------------------------------------------------------------------
# 10. Missing required field → validator="required", message names the field
# ---------------------------------------------------------------------------


def test_validate_missing_required_field(
    seed_registry: ArtifactSchemaRegistry, valid_spec: dict[str, Any]
) -> None:
    data = {k: v for k, v in valid_spec.items() if k != "requirements"}
    result = seed_registry.validate("dev/spec@1", data)
    assert result.ok is False
    required_findings = [f for f in result.errors if f.validator == "required"]
    assert any("requirements" in f.message for f in required_findings)


# ---------------------------------------------------------------------------
# 11. Invalid enum value → validator="enum", expected has enum list
# ---------------------------------------------------------------------------


def test_validate_invalid_enum_severity(seed_registry: ArtifactSchemaRegistry) -> None:
    data = {
        "runId": "00000000-0000-4000-8000-000000000000",
        "phaseKey": "review",
        "reviewerRole": "code-reviewer",
        "findings": [
            {
                "severity": "bogus",
                "category": "correctness",
                "summary": "something is wrong here",
            }
        ],
        "summary": "Overall review summary with enough length.",
    }
    result = seed_registry.validate("dev/review-finding-batch@1", data)
    assert result.ok is False
    enum_findings = [f for f in result.errors if f.validator == "enum"]
    assert len(enum_findings) > 0
    finding = enum_findings[0]
    assert isinstance(finding.expected, list)
    assert "bogus" not in finding.expected


# ---------------------------------------------------------------------------
# 12. Wrong type → validator="type", expected has type name
# ---------------------------------------------------------------------------


def test_validate_wrong_type(
    seed_registry: ArtifactSchemaRegistry, valid_spec: dict[str, Any]
) -> None:
    data = dict(valid_spec)
    data["acceptance_criteria"] = "should be a list, not a string"
    result = seed_registry.validate("dev/spec@1", data)
    assert result.ok is False
    type_findings = [f for f in result.errors if f.validator == "type"]
    assert len(type_findings) > 0
    assert type_findings[0].expected == "array"


# ---------------------------------------------------------------------------
# 13. Nested error path — /findings/0/severity format
# ---------------------------------------------------------------------------


def test_validate_nested_error_path(seed_registry: ArtifactSchemaRegistry) -> None:
    data = {
        "runId": "00000000-0000-4000-8000-000000000000",
        "phaseKey": "review",
        "reviewerRole": "code-reviewer",
        "findings": [
            {
                "severity": "not-valid",
                "category": "correctness",
                "summary": "a finding summary",
            }
        ],
        "summary": "Overall review summary with enough length.",
    }
    result = seed_registry.validate("dev/review-finding-batch@1", data)
    assert result.ok is False
    paths = [f.path for f in result.errors]
    assert any(p.startswith("/findings/0/") for p in paths)


# ---------------------------------------------------------------------------
# 14. known_schema_ids() returns all 4 seed schemas, sorted
# ---------------------------------------------------------------------------


def test_known_schema_ids_returns_seeds(seed_registry: ArtifactSchemaRegistry) -> None:
    ids = seed_registry.known_schema_ids()
    for expected in SEED_SCHEMA_IDS:
        assert expected in ids
    assert ids == sorted(ids)


# ---------------------------------------------------------------------------
# 15. Empty roots list → config_invalid
# ---------------------------------------------------------------------------


def test_empty_roots_raises() -> None:
    with pytest.raises(MyDeepAgentError) as exc_info:
        ArtifactSchemaRegistry(roots=[])
    assert exc_info.value.code == "config_invalid"


# ---------------------------------------------------------------------------
# 16. Corrupted JSON file → artifact_schema_load_failed
# ---------------------------------------------------------------------------


def test_corrupted_json_raises(tmp_path: Path) -> None:
    (tmp_path / "dev").mkdir()
    (tmp_path / "dev" / "broken@1.json").write_text("{", encoding="utf-8")
    registry = ArtifactSchemaRegistry(roots=[tmp_path])
    with pytest.raises(MyDeepAgentError) as exc_info:
        registry.load("dev/broken@1")
    assert exc_info.value.code == "artifact_schema_load_failed"


# ---------------------------------------------------------------------------
# 17. Valid JSON but not a dict → artifact_schema_load_failed
# ---------------------------------------------------------------------------


def test_non_dict_json_raises(tmp_path: Path) -> None:
    (tmp_path / "dev").mkdir()
    (tmp_path / "dev" / "array@1.json").write_text("[1, 2, 3]", encoding="utf-8")
    registry = ArtifactSchemaRegistry(roots=[tmp_path])
    with pytest.raises(MyDeepAgentError) as exc_info:
        registry.load("dev/array@1")
    assert exc_info.value.code == "artifact_schema_load_failed"


# ---------------------------------------------------------------------------
# 18. Schema itself is invalid Draft 2020-12 → artifact_schema_load_failed
# ---------------------------------------------------------------------------


def test_invalid_draft_schema_raises(tmp_path: Path) -> None:
    (tmp_path / "dev").mkdir()
    bad_schema = {"type": "not_a_type"}
    (tmp_path / "dev" / "bad@1.json").write_text(json.dumps(bad_schema), encoding="utf-8")
    registry = ArtifactSchemaRegistry(roots=[tmp_path])
    with pytest.raises(MyDeepAgentError) as exc_info:
        registry.load("dev/bad@1")
    assert exc_info.value.code == "artifact_schema_load_failed"


# ---------------------------------------------------------------------------
# 19. Validator caching: _validator called twice returns same instance
# ---------------------------------------------------------------------------


def test_validator_instance_cached(seed_registry: ArtifactSchemaRegistry) -> None:
    # Access internal cache to verify the same validator instance is reused.
    v1 = seed_registry._validator("dev/spec@1")
    v2 = seed_registry._validator("dev/spec@1")
    assert v1 is v2


# ---------------------------------------------------------------------------
# 20. dev/spec@1 valid example produces ok=True (full fixture check)
# ---------------------------------------------------------------------------


def test_spec_valid_example_ok(seed_registry: ArtifactSchemaRegistry) -> None:
    valid_spec: dict[str, Any] = {
        "runId": "00000000-0000-4000-8000-000000000000",
        "phaseKey": "spec",
        "requirements": "User wants a CLI tool that analyzes log files.",
        "acceptance_criteria": ["parses .log files", "outputs JSON summary"],
        "approach": "Build a typer-based CLI using regex and json output.",
        "risks": ["log format variations may break parser"],
    }
    result = seed_registry.validate("dev/spec@1", valid_spec)
    assert result.ok is True
    assert result.errors == ()


# ---------------------------------------------------------------------------
# Bonus: ValidationResult and ValidationFinding are frozen dataclasses
# ---------------------------------------------------------------------------


def test_validation_result_frozen() -> None:
    result = ValidationResult(ok=True)
    with pytest.raises((AttributeError, TypeError)):
        result.ok = False  # type: ignore[misc]


def test_validation_finding_frozen() -> None:
    finding = ValidationFinding(path="/foo", message="err", validator="type", expected="string")
    with pytest.raises((AttributeError, TypeError)):
        finding.path = "/bar"  # type: ignore[misc]


# ---------------------------------------------------------------------------
# Bonus: known_schema_ids with nonexistent root dir is silently skipped
# ---------------------------------------------------------------------------


def test_known_schema_ids_skips_nonexistent_root(tmp_path: Path) -> None:
    missing = tmp_path / "does_not_exist"
    registry = ArtifactSchemaRegistry(roots=[missing])
    assert registry.known_schema_ids() == []


# ---------------------------------------------------------------------------
# Bonus: validate with non-dict top-level data
# ---------------------------------------------------------------------------


def test_validate_non_dict_data_returns_error(
    seed_registry: ArtifactSchemaRegistry,
) -> None:
    result = seed_registry.validate("dev/spec@1", [1, 2, 3])
    assert result.ok is False
    type_findings = [f for f in result.errors if f.validator == "type"]
    assert len(type_findings) > 0
644
my-deepagent/tests/unit/test_binding.py
Normal file
@@ -0,0 +1,644 @@
"""Unit tests for src/my_deepagent/binding.py."""

from __future__ import annotations

import fcntl
import json
import re
from pathlib import Path

import pytest

from my_deepagent.binding import (
    BackendAvailability,
    Binding,
    BindingOverride,
    PersonaConsentStore,
    bind_personas,
    filter_consented_personas,
    is_persona_eligible_for_role,
)
from my_deepagent.enums import Backend, Capability
from my_deepagent.errors import MyDeepAgentError
from my_deepagent.persona import Persona, load_personas_from_dir
from my_deepagent.workflow import WorkflowTemplate, load_workflows_from_dir

# ---------------------------------------------------------------------------
# PersonaConsentStore file-lock (fcntl.flock) verification
# ---------------------------------------------------------------------------


def test_consent_store_set_acquires_exclusive_lock(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    """set() must take an exclusive flock and release it."""
    ops: list[int] = []
    orig_flock = fcntl.flock

    def spy(fd: int, op: int) -> None:
        ops.append(op)
        orig_flock(fd, op)

    monkeypatch.setattr(fcntl, "flock", spy)
    store = PersonaConsentStore(tmp_path / "consents.json")
    store.set("hash_abc", "approve")
    assert fcntl.LOCK_EX in ops
    assert fcntl.LOCK_UN in ops


def test_consent_store_revoke_acquires_exclusive_lock(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    ops: list[int] = []
    orig_flock = fcntl.flock

    def spy(fd: int, op: int) -> None:
        ops.append(op)
        orig_flock(fd, op)

    monkeypatch.setattr(fcntl, "flock", spy)
    store = PersonaConsentStore(tmp_path / "consents.json")
    store.set("h", "approve")
    ops.clear()
    store.revoke("h")
    assert fcntl.LOCK_EX in ops
    assert fcntl.LOCK_UN in ops


def test_consent_store_get_acquires_shared_lock(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    """get() takes a shared lock (LOCK_SH) so multiple readers don't serialise."""
    ops: list[int] = []
    orig_flock = fcntl.flock

    def spy(fd: int, op: int) -> None:
        ops.append(op)
        orig_flock(fd, op)

    monkeypatch.setattr(fcntl, "flock", spy)
    store = PersonaConsentStore(tmp_path / "consents.json")
    store.set("h", "approve")
    ops.clear()
    _ = store.get("h")
    assert fcntl.LOCK_SH in ops
    assert fcntl.LOCK_UN in ops


def test_consent_store_lock_file_created(tmp_path: Path) -> None:
    """A .lock sidecar file is created next to the consent store on first write."""
    path = tmp_path / "consents.json"
    store = PersonaConsentStore(path)
    store.set("h", "approve")
    assert (tmp_path / "consents.json.lock").is_file()


# ---------------------------------------------------------------------------
# Fixtures / helpers
# ---------------------------------------------------------------------------

PERSONAS_DIR = Path(__file__).parent.parent.parent / "docs" / "schemas" / "personas"
WORKFLOWS_DIR = Path(__file__).parent.parent.parent / "docs" / "schemas" / "workflows"


def _minimal_persona(**overrides: object) -> Persona:
    base: dict[str, object] = {
        "name": "test-persona",
        "version": 1,
        "backend": "openrouter",
        "model": "openrouter:anthropic/claude-sonnet-4-6",
        "provider_origin": "US/Anthropic",
        "capabilities": ["spec_write", "phase_planning"],
        "max_risk_level": "low",
        "system_prompt": "You are a test persona for unit tests.",
    }
    base.update(overrides)
    return Persona.model_validate(base)


def _all_available() -> BackendAvailability:
    return BackendAvailability(available_backends=frozenset(Backend))


def _none_available() -> BackendAvailability:
    return BackendAvailability(available_backends=frozenset())


@pytest.fixture()
def consent_store(tmp_path: Path) -> PersonaConsentStore:
    return PersonaConsentStore(tmp_path / "consents.json")


@pytest.fixture()
def seed_personas() -> list[Persona]:
    return load_personas_from_dir(PERSONAS_DIR)


@pytest.fixture()
def spec_and_review() -> WorkflowTemplate:
    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
    return next(w for w in workflows if w.name == "spec-and-review")


# ---------------------------------------------------------------------------
# is_persona_eligible_for_role
# ---------------------------------------------------------------------------


def test_eligible_all_ok(spec_and_review: WorkflowTemplate) -> None:
    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
    p = _minimal_persona(capabilities=["spec_write", "phase_planning"], max_risk_level="low")
    ok, reason = is_persona_eligible_for_role(p, spec_writer_role, spec_and_review)
    assert ok is True
    assert reason is None


def test_eligible_missing_capability(spec_and_review: WorkflowTemplate) -> None:
    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
    # only spec_write, missing phase_planning
    p = _minimal_persona(capabilities=["spec_write"], max_risk_level="low")
    ok, reason = is_persona_eligible_for_role(p, spec_writer_role, spec_and_review)
    assert ok is False
    assert reason is not None
    assert "phase_planning" in reason


def test_eligible_allowed_roles_mismatch(spec_and_review: WorkflowTemplate) -> None:
    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
    p = _minimal_persona(
        capabilities=["spec_write", "phase_planning"],
        max_risk_level="low",
        allowed_roles=["reviewer"],  # does not include spec_writer
    )
    ok, reason = is_persona_eligible_for_role(p, spec_writer_role, spec_and_review)
    assert ok is False
    assert reason is not None
    assert "allowed_roles" in reason


def test_eligible_allowed_roles_matches(spec_and_review: WorkflowTemplate) -> None:
    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
    p = _minimal_persona(
        capabilities=["spec_write", "phase_planning"],
        max_risk_level="low",
        allowed_roles=["spec_writer"],
    )
    ok, reason = is_persona_eligible_for_role(p, spec_writer_role, spec_and_review)
    assert ok is True
    assert reason is None


def test_eligible_risk_too_high(spec_and_review: WorkflowTemplate) -> None:
    """bug-fix workflow has a 'medium' risk phase; a low-only persona is ineligible for it."""
    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
    bug_fix_wf = next(w for w in workflows if w.name == "bug-fix-with-reproduction")
    fixer_role = next(r for r in bug_fix_wf.roles if r.id == "fixer")
    # fixer role has a 'medium' risk phase
    p = _minimal_persona(
        capabilities=["code_edit", "test_first_development"],
        max_risk_level="low",  # too low for medium phase
    )
    ok, reason = is_persona_eligible_for_role(p, fixer_role, bug_fix_wf)
    assert ok is False
    assert reason is not None
    assert "medium" in reason


def test_eligible_risk_exact_match(spec_and_review: WorkflowTemplate) -> None:
    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
    p = _minimal_persona(capabilities=["spec_write", "phase_planning"], max_risk_level="low")
    ok, _ = is_persona_eligible_for_role(p, spec_writer_role, spec_and_review)
    assert ok is True


# ---------------------------------------------------------------------------
|
||||||
|
# bind_personas: end-to-end with seed data
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_bind_personas_spec_and_review_success(
|
||||||
|
seed_personas: list[Persona],
|
||||||
|
spec_and_review: WorkflowTemplate,
|
||||||
|
consent_store: PersonaConsentStore,
|
||||||
|
) -> None:
|
||||||
|
bindings = bind_personas(spec_and_review, seed_personas, _all_available(), consent_store)
|
||||||
|
assert set(bindings.keys()) == {"spec_writer", "reviewer", "verifier"}
|
||||||
|
for role_id, binding in bindings.items():
|
||||||
|
assert isinstance(binding, Binding)
|
||||||
|
assert binding.role_id == role_id
|
||||||
|
assert re.fullmatch(r"[0-9a-f]{64}", binding.binding_hash)
|
||||||
|
|
||||||
|
|
||||||
|
def test_bind_personas_binding_hash_deterministic(
|
||||||
|
seed_personas: list[Persona],
|
||||||
|
spec_and_review: WorkflowTemplate,
|
||||||
|
consent_store: PersonaConsentStore,
|
||||||
|
) -> None:
|
||||||
|
b1 = bind_personas(spec_and_review, seed_personas, _all_available(), consent_store)
|
||||||
|
b2 = bind_personas(spec_and_review, seed_personas, _all_available(), consent_store)
|
||||||
|
for role_id in b1:
|
||||||
|
assert b1[role_id].binding_hash == b2[role_id].binding_hash
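

# Illustrative sketch, not part of the commit: the determinism asserted above is what
# one would expect from hashing a canonical JSON form with sha256. `canonical_hash`
# and the sample payload are hypothetical; the real my_deepagent hash helper may
# serialize differently.

```python
import hashlib
import json
from typing import Any


def canonical_hash(payload: Any) -> str:
    """Hash a JSON-serializable payload independently of dict key order."""
    # sort_keys plus fixed separators yield one byte sequence per logical value.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different insertion order: the digests should match.
first = canonical_hash({"role_id": "spec_writer", "persona": "p@1"})
second = canonical_hash({"persona": "p@1", "role_id": "spec_writer"})
```

# Sorting keys before hashing is one common way to keep a hash stable across runs;
# whether binding_hash is built exactly this way is an assumption.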


def test_bind_personas_spec_writer_is_spec_writer(
    seed_personas: list[Persona],
    spec_and_review: WorkflowTemplate,
    consent_store: PersonaConsentStore,
) -> None:
    bindings = bind_personas(spec_and_review, seed_personas, _all_available(), consent_store)
    spec_persona = bindings["spec_writer"].persona
    assert Capability.SPEC_WRITE in spec_persona.capabilities
    assert Capability.PHASE_PLANNING in spec_persona.capabilities


# ---------------------------------------------------------------------------
# bind_personas: override
# ---------------------------------------------------------------------------


def test_bind_personas_override_picks_pinned(
    seed_personas: list[Persona],
    spec_and_review: WorkflowTemplate,
    consent_store: PersonaConsentStore,
) -> None:
    override = BindingOverride.parse({"spec_writer": "openrouter-claude-spec-writer@1"})
    bindings = bind_personas(
        spec_and_review, seed_personas, _all_available(), consent_store, override
    )
    assert bindings["spec_writer"].persona.name == "openrouter-claude-spec-writer"


def test_bind_personas_override_invalid_persona_raises(
    seed_personas: list[Persona],
    spec_and_review: WorkflowTemplate,
    consent_store: PersonaConsentStore,
) -> None:
    override = BindingOverride.parse({"spec_writer": "nonexistent-persona@1"})
    with pytest.raises(MyDeepAgentError) as exc_info:
        bind_personas(spec_and_review, seed_personas, _all_available(), consent_store, override)
    assert exc_info.value.code == "no_eligible_persona"


# ---------------------------------------------------------------------------
# bind_personas: backend unavailable
# ---------------------------------------------------------------------------


def test_bind_personas_backend_unavailable_raises(
    seed_personas: list[Persona],
    spec_and_review: WorkflowTemplate,
    consent_store: PersonaConsentStore,
) -> None:
    with pytest.raises(MyDeepAgentError) as exc_info:
        bind_personas(spec_and_review, seed_personas, _none_available(), consent_store)
    assert exc_info.value.code == "backend_unavailable"


# ---------------------------------------------------------------------------
# bind_personas: model_unavailable (openrouter with empty model is rejected by
# the validator, so only the FAKE-backend happy path is exercised here)
# ---------------------------------------------------------------------------


def test_bind_personas_model_unavailable_raises(
    spec_and_review: WorkflowTemplate,
    consent_store: PersonaConsentStore,
) -> None:
    """Verify FAKE backend binds successfully (positive path for non-openrouter backends).

    We cannot construct an openrouter persona with an empty model via model_validate because
    the validator rejects it. Instead, verify the happy path: FAKE backend + non-empty
    model should bind without errors when the FAKE backend is available.
    """
    from my_deepagent.workflow import WorkflowPhase, WorkflowRole

    role = WorkflowRole.model_validate(
        {
            "id": "spec_writer",
            "required_capabilities": ["spec_write", "phase_planning"],
            "preferred_backends": ["fake"],
        }
    )
    phase = WorkflowPhase.model_validate(
        {
            "key": "spec",
            "title": "Write spec",
            "risk": "low",
            "role": "spec_writer",
            "instructions": "Write the specification document.",
        }
    )
    tmpl = WorkflowTemplate.model_validate(
        {
            "name": "fake-wf",
            "version": 1,
            "roles": [role.model_dump()],
            "phases": [phase.model_dump()],
        }
    )
    fake_persona = _minimal_persona(
        backend="fake",
        model="fake-model",
        capabilities=["spec_write", "phase_planning"],
    )
    fake_avail = BackendAvailability(available_backends=frozenset({Backend.FAKE}))
    # Should succeed with FAKE backend + non-empty model
    bindings = bind_personas(tmpl, [fake_persona], fake_avail, consent_store)
    assert "spec_writer" in bindings


# ---------------------------------------------------------------------------
# bind_personas: no eligible persona
# ---------------------------------------------------------------------------


def test_bind_personas_no_eligible_raises(
    spec_and_review: WorkflowTemplate,
    consent_store: PersonaConsentStore,
) -> None:
    # Provide a persona with wrong capabilities
    bad_persona = _minimal_persona(capabilities=["backtest_run"])
    with pytest.raises(MyDeepAgentError) as exc_info:
        bind_personas(spec_and_review, [bad_persona], _all_available(), consent_store)
    assert exc_info.value.code == "no_eligible_persona"


# ---------------------------------------------------------------------------
# PersonaConsentStore: get / set / revoke
# ---------------------------------------------------------------------------


def test_consent_store_get_none_when_absent(consent_store: PersonaConsentStore) -> None:
    assert consent_store.get("abc123") is None


def test_consent_store_set_and_get(consent_store: PersonaConsentStore) -> None:
    consent_store.set("abc123", "approve")
    assert consent_store.get("abc123") == "approve"


def test_consent_store_block(consent_store: PersonaConsentStore) -> None:
    consent_store.set("abc123", "block")
    assert consent_store.get("abc123") == "block"


def test_consent_store_once(consent_store: PersonaConsentStore) -> None:
    consent_store.set("abc123", "once")
    assert consent_store.get("abc123") == "once"


def test_consent_store_revoke(consent_store: PersonaConsentStore) -> None:
    consent_store.set("abc123", "approve")
    consent_store.revoke("abc123")
    assert consent_store.get("abc123") is None


def test_consent_store_revoke_absent_is_noop(consent_store: PersonaConsentStore) -> None:
    consent_store.revoke("not_present")  # must not raise


def test_consent_store_overwrite(consent_store: PersonaConsentStore) -> None:
    consent_store.set("abc123", "approve")
    consent_store.set("abc123", "block")
    assert consent_store.get("abc123") == "block"


def test_consent_store_unknown_decision_returns_none(
    consent_store: PersonaConsentStore,
    tmp_path: Path,
) -> None:
    """Corrupt decision value (not approve/block/once) returns None, not raise."""
    path = tmp_path / "consents.json"
    path.write_text(
        json.dumps({"abc123": {"decision": "foobar", "decided_at": "2026-01-01T00:00:00+00:00"}}),
        encoding="utf-8",
    )
    store = PersonaConsentStore(path)
    assert store.get("abc123") is None


def test_consent_store_corrupted_json_raises_fatal(tmp_path: Path) -> None:
    path = tmp_path / "consents.json"
    path.write_text("{invalid json", encoding="utf-8")
    store = PersonaConsentStore(path)
    with pytest.raises(MyDeepAgentError) as exc_info:
        store.get("abc123")
    assert exc_info.value.code == "internal_state_corruption"
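

# Illustrative sketch, not part of the commit: the two tests above separate an
# unrecognized decision value (returns None) from an unparseable file (fatal error).
# One way a reader could implement that split is shown below. `ConsentStoreError`
# and `read_decision` are hypothetical names standing in for MyDeepAgentError and
# PersonaConsentStore.get; the real implementation may differ.

```python
import json
from pathlib import Path
from typing import Optional

VALID_DECISIONS = frozenset({"approve", "block", "once"})


class ConsentStoreError(Exception):
    """Hypothetical error type carrying a machine-readable code."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


def read_decision(path: Path, persona_hash: str) -> Optional[str]:
    """Return the stored decision, or None when it is absent or unrecognized."""
    if not path.exists():
        return None
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        # An unparseable file is treated as fatal rather than as "no consent".
        raise ConsentStoreError("internal_state_corruption", f"cannot parse {path}") from exc
    if not isinstance(data, dict):
        raise ConsentStoreError("internal_state_corruption", f"unexpected root in {path}")
    entry = data.get(persona_hash)
    if not isinstance(entry, dict):
        return None
    decision = entry.get("decision")
    # Unknown values degrade to None so callers re-prompt instead of crashing.
    return decision if decision in VALID_DECISIONS else None
```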


def test_consent_store_atomic_write(consent_store: PersonaConsentStore) -> None:
    """The .tmp file must not remain after a successful write."""
    consent_store.set("abc", "approve")
    tmp_file = consent_store._path.with_suffix(".json.tmp")
    assert not tmp_file.exists(), ".tmp leftover after successful write"


def test_consent_store_json_format(consent_store: PersonaConsentStore) -> None:
    """Stored JSON must be valid and contain decision + decided_at."""
    consent_store.set("myhash", "once")
    raw = consent_store._path.read_text(encoding="utf-8")
    data = json.loads(raw)
    assert "myhash" in data
    assert data["myhash"]["decision"] == "once"
    assert "decided_at" in data["myhash"]


# ---------------------------------------------------------------------------
# filter_consented_personas
# ---------------------------------------------------------------------------


def test_filter_removes_blocked(consent_store: PersonaConsentStore) -> None:
    p1 = _minimal_persona(name="p1")
    p2 = _minimal_persona(name="p2")
    consent_store.set(p2.compute_hash(), "block")
    result = filter_consented_personas([p1, p2], consent_store)
    assert len(result) == 1
    assert result[0].name == "p1"


def test_filter_keeps_approved(consent_store: PersonaConsentStore) -> None:
    p = _minimal_persona()
    consent_store.set(p.compute_hash(), "approve")
    result = filter_consented_personas([p], consent_store)
    assert len(result) == 1


def test_filter_keeps_once(consent_store: PersonaConsentStore) -> None:
    p = _minimal_persona()
    consent_store.set(p.compute_hash(), "once")
    result = filter_consented_personas([p], consent_store)
    assert len(result) == 1


def test_filter_keeps_none_decision(consent_store: PersonaConsentStore) -> None:
    """Persona with no stored decision passes through."""
    p = _minimal_persona()
    result = filter_consented_personas([p], consent_store)
    assert len(result) == 1


def test_filter_empty_list(consent_store: PersonaConsentStore) -> None:
    result = filter_consented_personas([], consent_store)
    assert result == []
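

# Illustrative sketch, not part of the commit: the filter tests above amount to
# "drop only what is explicitly blocked". `filter_not_blocked` is a hypothetical,
# dict-backed analogue of filter_consented_personas that works on hashes only.

```python
from typing import List, Mapping, Optional, Sequence


def filter_not_blocked(
    persona_hashes: Sequence[str], decisions: Mapping[str, Optional[str]]
) -> List[str]:
    """Keep every hash whose stored decision is not 'block'; missing decisions pass."""
    return [h for h in persona_hashes if decisions.get(h) != "block"]


# h1 has no decision, h2 is blocked, h3 is approved, h4 is allowed once.
kept = filter_not_blocked(
    ["h1", "h2", "h3", "h4"],
    {"h2": "block", "h3": "approve", "h4": "once"},
)
```

# Input order is preserved, which keeps any later deterministic selection stable.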


# ---------------------------------------------------------------------------
# bind_personas: consent-blocked persona detection
# ---------------------------------------------------------------------------


def test_bind_personas_all_eligible_blocked_raises(
    seed_personas: list[Persona],
    spec_and_review: WorkflowTemplate,
    consent_store: PersonaConsentStore,
) -> None:
    # Block all spec_writer-eligible personas
    for p in seed_personas:
        if Capability.SPEC_WRITE in p.capabilities and Capability.PHASE_PLANNING in p.capabilities:
            consent_store.set(p.compute_hash(), "block")

    with pytest.raises(MyDeepAgentError) as exc_info:
        bind_personas(spec_and_review, seed_personas, _all_available(), consent_store)
    assert exc_info.value.code in ("persona_blocked_by_user", "no_eligible_persona")


def test_bind_personas_override_blocked_raises(
    seed_personas: list[Persona],
    spec_and_review: WorkflowTemplate,
    consent_store: PersonaConsentStore,
) -> None:
    spec_writer = next(p for p in seed_personas if p.name == "openrouter-claude-spec-writer")
    consent_store.set(spec_writer.compute_hash(), "block")
    override = BindingOverride.parse({"spec_writer": "openrouter-claude-spec-writer@1"})
    with pytest.raises(MyDeepAgentError) as exc_info:
        bind_personas(spec_and_review, seed_personas, _all_available(), consent_store, override)
    assert exc_info.value.code == "persona_blocked_by_user"


# ---------------------------------------------------------------------------
# _auto_select: preferred_backends order
# ---------------------------------------------------------------------------


def test_auto_select_prefers_preferred_backend(spec_and_review: WorkflowTemplate) -> None:
    """Persona with preferred backend wins over non-preferred even if alphabetically later."""
    from my_deepagent.binding import _auto_select

    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
    # preferred_backends = ["openrouter"]
    p_openrouter = _minimal_persona(
        name="z-openrouter-persona",
        backend="openrouter",
        capabilities=["spec_write", "phase_planning"],
    )
    p_fake = _minimal_persona(
        name="a-fake-persona",
        backend="fake",
        capabilities=["spec_write", "phase_planning"],
    )
    chosen = _auto_select([p_openrouter, p_fake], spec_writer_role)
    assert chosen.name == "z-openrouter-persona"


def test_auto_select_higher_version_wins(spec_and_review: WorkflowTemplate) -> None:
    from my_deepagent.binding import _auto_select

    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
    p_v1 = _minimal_persona(version=1, capabilities=["spec_write", "phase_planning"])
    p_v2 = _minimal_persona(version=2, capabilities=["spec_write", "phase_planning"])
    chosen = _auto_select([p_v1, p_v2], spec_writer_role)
    assert chosen.version == 2


def test_auto_select_name_asc_tiebreak(spec_and_review: WorkflowTemplate) -> None:
    from my_deepagent.binding import _auto_select

    spec_writer_role = next(r for r in spec_and_review.roles if r.id == "spec_writer")
    caps = ["spec_write", "phase_planning"]
    p_b = _minimal_persona(name="b-persona", version=1, capabilities=caps)
    p_a = _minimal_persona(name="a-persona", version=1, capabilities=caps)
    chosen = _auto_select([p_b, p_a], spec_writer_role)
    assert chosen.name == "a-persona"
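

# Illustrative sketch, not part of the commit: the three tests above are consistent
# with a single composite sort key (preferred-backend rank, version descending,
# name ascending, hash as a final tiebreak). `PersonaStub` and `auto_select_sketch`
# are hypothetical; the real _auto_select takes a Persona list and a WorkflowRole.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class PersonaStub:
    """Hypothetical stand-in for Persona with only the fields used for ordering."""

    name: str
    version: int
    backend: str
    content_hash: str = ""


def auto_select_sketch(
    candidates: Sequence[PersonaStub], preferred: Sequence[str]
) -> PersonaStub:
    """Pick one candidate by (backend rank, version desc, name asc, hash asc)."""

    def sort_key(p: PersonaStub) -> Tuple[int, int, str, str]:
        # Backends absent from the preference list rank after every listed backend.
        rank = list(preferred).index(p.backend) if p.backend in preferred else len(preferred)
        # Negating the version makes min() favor the highest version.
        return (rank, -p.version, p.name, p.content_hash)

    return min(candidates, key=sort_key)
```

# A total order over the key makes the choice independent of input order, which is
# the property the tiebreak test checks by passing the candidates reversed.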


# ---------------------------------------------------------------------------
# Step 2 patch: FAKE backend recovery hint
# ---------------------------------------------------------------------------


def test_backend_recovery_hint_fake() -> None:
    """FAKE backend recovery hint must mention 'fake' and either 'tests only' or 'test harness'."""
    from my_deepagent.binding import _backend_recovery_hint

    hint = _backend_recovery_hint(Backend.FAKE)
    assert "fake" in hint.lower()
    assert "tests only" in hint.lower() or "test harness" in hint.lower()


# ---------------------------------------------------------------------------
# Step 2 patch: override with non-integer version raises with diagnostic
# ---------------------------------------------------------------------------


def test_bind_personas_override_non_integer_version_raises(
    seed_personas: list[Persona],
    spec_and_review: WorkflowTemplate,
    consent_store: PersonaConsentStore,
) -> None:
    """An override spec with a non-integer version must raise with a clear diagnostic."""
    override = BindingOverride(persona_pinned={"spec_writer": "openrouter-claude-spec-writer@abc"})
    with pytest.raises(MyDeepAgentError) as exc_info:
        bind_personas(spec_and_review, seed_personas, _all_available(), consent_store, override)
    assert exc_info.value.code == "no_eligible_persona"
    assert "non-integer version" in str(exc_info.value)


# ---------------------------------------------------------------------------
# Step 2 patch: override with ineligible persona surfaces reason
# ---------------------------------------------------------------------------


def test_bind_personas_override_ineligible_persona_surfaces_reason(
    seed_personas: list[Persona],
    spec_and_review: WorkflowTemplate,
    consent_store: PersonaConsentStore,
) -> None:
    """Override that names an ineligible persona must surface the ineligibility reason."""
    # 'spec_writer' role needs spec_write + phase_planning.
    # Find a seed persona that lacks spec_write so the override is forced onto it.
    ineligible = next(
        p for p in seed_personas if "spec_write" not in [c.value for c in p.capabilities]
    )
    override = BindingOverride(
        persona_pinned={"spec_writer": f"{ineligible.name}@{ineligible.version}"}
    )
    with pytest.raises(MyDeepAgentError) as exc_info:
        bind_personas(spec_and_review, seed_personas, _all_available(), consent_store, override)
    assert exc_info.value.code == "no_eligible_persona"
    err_str = str(exc_info.value)
    # The message must either call the persona ineligible or name what is missing.
    assert "ineligible" in err_str or "missing" in err_str


# ---------------------------------------------------------------------------
# Step 2 patch: PersonaConsentStore atomic write calls os.fsync
# ---------------------------------------------------------------------------


def test_consent_store_write_calls_fsync(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
    """PersonaConsentStore.set() must call os.fsync() for atomic durability."""
    import os

    called: list[int] = []
    orig_fsync = os.fsync

    def spy(fd: int) -> None:
        called.append(fd)
        orig_fsync(fd)

    monkeypatch.setattr(os, "fsync", spy)

    store = PersonaConsentStore(tmp_path / "consents.json")
    store.set("hash_abc", "approve")

    assert len(called) >= 1, "os.fsync must be called at least once during write"
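

# Illustrative sketch, not part of the commit: the commit message describes the
# consent store write as tmp + fsync + rename. A minimal stdlib version of that
# pattern is shown below. `atomic_write_json` is a hypothetical helper, and the
# fcntl.flock locking the commit message also mentions is deliberately left out.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict


def atomic_write_json(path: Path, data: Dict[str, Any]) -> None:
    """Write JSON so readers see either the old file or the complete new one."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    with open(tmp, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
        handle.flush()
        # Force the bytes to disk before the rename makes them visible.
        os.fsync(handle.fileno())
    # os.replace swaps the file in one step and overwrites an existing target.
    os.replace(tmp, path)
```

# After a successful call no "<name>.tmp" sibling should remain, which is the
# condition test_consent_store_atomic_write checks against the real store.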
238
my-deepagent/tests/unit/test_config.py
Normal file
@@ -0,0 +1,238 @@
"""Unit tests for src/my_deepagent/config.py."""

from __future__ import annotations

from pathlib import Path

import pytest
from pydantic import ValidationError

from my_deepagent.config import Config, load_config

# ---------------------------------------------------------------------------
# Default values (no env, no file)
# ---------------------------------------------------------------------------


def test_default_log_level(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    cfg = Config()
    assert cfg.log_level == "info"


def test_default_lang(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    cfg = Config()
    assert cfg.lang == "ko"


def test_default_budget_daily_usd(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    cfg = Config()
    assert cfg.budget_daily_usd == pytest.approx(5.0)


def test_default_budget_run_usd(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    cfg = Config()
    assert cfg.budget_run_usd == pytest.approx(1.0)


def test_default_budget_on_hit(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    cfg = Config()
    assert cfg.budget_on_hit == "prompt"


def test_default_persona(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    cfg = Config()
    assert cfg.default_persona == "default-interactive"


def test_default_openrouter_api_key_is_none(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    # _env_file=None bypasses any .env that may exist in the cwd (e.g. dev keys).
    cfg = Config(_env_file=None)  # type: ignore[call-arg]
    assert cfg.openrouter_api_key is None


# ---------------------------------------------------------------------------
# Env var overrides
# ---------------------------------------------------------------------------


def test_env_budget_daily_usd(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    monkeypatch.setenv("MYDEEPAGENT_BUDGET_DAILY_USD", "10")
    cfg = Config()
    assert cfg.budget_daily_usd == pytest.approx(10.0)


def test_env_lang_en(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    monkeypatch.setenv("MYDEEPAGENT_LANG", "en")
    cfg = Config()
    assert cfg.lang == "en"


def test_env_log_level_debug(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    monkeypatch.setenv("MYDEEPAGENT_LOG_LEVEL", "debug")
    cfg = Config()
    assert cfg.log_level == "debug"


def test_env_openrouter_api_key(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    monkeypatch.setenv("MYDEEPAGENT_OPENROUTER_API_KEY", "sk-test-abc")
    cfg = Config()
    assert cfg.openrouter_api_key == "sk-test-abc"


def test_env_langsmith_tracing(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    monkeypatch.setenv("MYDEEPAGENT_LANGSMITH_TRACING", "true")
    cfg = Config()
    assert cfg.langsmith_tracing is True


# ---------------------------------------------------------------------------
# Validation errors for invalid values
# ---------------------------------------------------------------------------


def test_invalid_lang_raises(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    monkeypatch.setenv("MYDEEPAGENT_LANG", "fr")
    with pytest.raises(ValidationError):
        Config()


def test_invalid_log_level_raises(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    monkeypatch.setenv("MYDEEPAGENT_LOG_LEVEL", "verbose")
    with pytest.raises(ValidationError):
        Config()


def test_invalid_budget_on_hit_raises(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    monkeypatch.setenv("MYDEEPAGENT_BUDGET_ON_HIT", "explode")
    with pytest.raises(ValidationError):
        Config()


def test_negative_budget_raises(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    with pytest.raises(ValidationError):
        Config(budget_daily_usd=-1.0)


# ---------------------------------------------------------------------------
# Frozen check
# ---------------------------------------------------------------------------


def test_frozen_prevents_mutation(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    cfg = Config()
    with pytest.raises((ValidationError, TypeError)):
        cfg.budget_daily_usd = 99  # type: ignore[misc]


# ---------------------------------------------------------------------------
# Path expansion (~ → absolute path)
# ---------------------------------------------------------------------------


def test_tilde_expansion_workspace_root(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    monkeypatch.setenv("MYDEEPAGENT_WORKSPACE_ROOT", "~/foo/bar")
    cfg = Config()
    assert cfg.workspace_root.is_absolute()
    assert "~" not in str(cfg.workspace_root)


def test_tilde_expansion_data_dir(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    monkeypatch.setenv("MYDEEPAGENT_DATA_DIR", "~/mydata")
    cfg = Config()
    assert cfg.data_dir.is_absolute()
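

# Illustrative sketch, not part of the commit: the tilde-expansion tests above only
# require that a "~/..." value ends up absolute. With the standard library this
# can be done as below. `expand_path` is a hypothetical helper; how Config wires
# the expansion into its field validation is not shown in this diff.

```python
from pathlib import Path


def expand_path(raw: str) -> Path:
    """Expand a leading '~' and make the result absolute without touching the disk."""
    # expanduser() resolves '~' from the environment; absolute() anchors relative input.
    return Path(raw).expanduser().absolute()


workspace_root = expand_path("~/foo/bar")
```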


# ---------------------------------------------------------------------------
# TOML priority
# ---------------------------------------------------------------------------


def test_toml_overrides_default(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None:
    _clear_env(monkeypatch)
    toml_file = tmp_path / "config.toml"
    toml_file.write_text('lang = "en"\nbudget_daily_usd = 7.5\n')

    # Config takes its TOML path from model_config (SettingsConfigDict), so point a
    # temporary subclass at the test file instead of mutating Config itself.
    class PatchedConfig(Config):
        model_config = Config.model_config.copy()

    PatchedConfig.model_config["toml_file"] = str(toml_file)

    cfg = PatchedConfig()
    assert cfg.lang == "en"
    assert cfg.budget_daily_usd == pytest.approx(7.5)


# ---------------------------------------------------------------------------
# load_config helper
# ---------------------------------------------------------------------------


def test_load_config_with_overrides(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    cfg = load_config(budget_daily_usd=20.0, lang="en")
    assert cfg.budget_daily_usd == pytest.approx(20.0)
    assert cfg.lang == "en"


def test_load_config_default(monkeypatch: pytest.MonkeyPatch) -> None:
    _clear_env(monkeypatch)
    cfg = load_config()
    assert cfg.log_level == "info"


# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------


_ENV_KEYS = [
    "MYDEEPAGENT_BUDGET_DAILY_USD",
    "MYDEEPAGENT_BUDGET_DAILY_WARN_USD",
    "MYDEEPAGENT_BUDGET_RUN_USD",
    "MYDEEPAGENT_BUDGET_RUN_WARN_USD",
    "MYDEEPAGENT_BUDGET_ON_HIT",
    "MYDEEPAGENT_LANG",
    "MYDEEPAGENT_LOG_LEVEL",
    "MYDEEPAGENT_OPENROUTER_API_KEY",
    "MYDEEPAGENT_OPENROUTER_BASE_URL",
    "MYDEEPAGENT_LANGSMITH_TRACING",
    "MYDEEPAGENT_LANGSMITH_API_KEY",
    "MYDEEPAGENT_LANGSMITH_PROJECT",
    "MYDEEPAGENT_DATABASE_URL",
    "MYDEEPAGENT_WORKSPACE_ROOT",
    "MYDEEPAGENT_DATA_DIR",
    "MYDEEPAGENT_CONFIG_DIR",
    "MYDEEPAGENT_STATE_DIR",
    "MYDEEPAGENT_DEFAULT_PERSONA",
]


def _clear_env(monkeypatch: pytest.MonkeyPatch) -> None:
    """Remove the MYDEEPAGENT_ env vars listed in _ENV_KEYS to isolate tests from the real environment."""
    for key in _ENV_KEYS:
        monkeypatch.delenv(key, raising=False)
    # Also prevent a dotenv file from being loaded
    monkeypatch.setenv("MYDEEPAGENT_ENV_FILE", "")
235
my-deepagent/tests/unit/test_enums.py
Normal file
@@ -0,0 +1,235 @@
"""Unit tests for src/my_deepagent/enums.py."""

import pytest

from my_deepagent.enums import (
    ApprovalDecisionAction,
    ApprovalState,
    Backend,
    Capability,
    ErrorClass,
    RiskLevel,
    RunPhaseState,
    RunState,
    SessionState,
)

# ---------------------------------------------------------------------------
# Backend
# ---------------------------------------------------------------------------


def test_backend_openrouter_value() -> None:
    assert Backend.OPENROUTER == "openrouter"


def test_backend_anthropic_value() -> None:
    assert Backend.ANTHROPIC == "anthropic"


def test_backend_openai_value() -> None:
    assert Backend.OPENAI == "openai"


def test_backend_google_value() -> None:
    assert Backend.GOOGLE == "google"


def test_backend_fake_value() -> None:
    assert Backend.FAKE == "fake"


def test_backend_str_equality() -> None:
    # StrEnum members compare equal to their string values
    assert Backend.OPENROUTER == "openrouter"
    assert str(Backend.OPENROUTER) == "openrouter"


# ---------------------------------------------------------------------------
# Capability
# ---------------------------------------------------------------------------


def test_capability_count() -> None:
    assert len(list(Capability)) == 13


def test_capability_spec_write() -> None:
    assert Capability.SPEC_WRITE == "spec_write"


def test_capability_code_edit() -> None:
    assert Capability.CODE_EDIT == "code_edit"


def test_capability_final_report_compose() -> None:
    assert Capability.FINAL_REPORT_COMPOSE == "final_report_compose"


def test_capability_all_are_str() -> None:
    for cap in Capability:
        assert isinstance(cap, str)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# RiskLevel
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_risk_level_values() -> None:
|
||||||
|
assert RiskLevel.LOW == "low"
|
||||||
|
assert RiskLevel.MEDIUM == "medium"
|
||||||
|
assert RiskLevel.HIGH == "high"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# ApprovalDecisionAction
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_approval_decision_action_approve() -> None:
|
||||||
|
assert ApprovalDecisionAction.APPROVE == "approve"
|
||||||
|
|
||||||
|
|
||||||
|
def test_approval_decision_action_reject() -> None:
|
||||||
|
assert ApprovalDecisionAction.REJECT == "reject"
|
||||||
|
|
||||||
|
|
||||||
|
def test_approval_decision_action_request_changes() -> None:
|
||||||
|
assert ApprovalDecisionAction.REQUEST_CHANGES == "request_changes"
|
||||||
|
|
||||||
|
|
||||||
|
def test_approval_decision_action_abort() -> None:
|
||||||
|
assert ApprovalDecisionAction.ABORT == "abort"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# ApprovalState
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_approval_state_all_values() -> None:
|
||||||
|
expected = {"pending", "approved", "rejected", "changes_requested", "aborted", "paused"}
|
||||||
|
actual = {s.value for s in ApprovalState}
|
||||||
|
assert actual == expected
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# RunState
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_run_state_all_values() -> None:
|
||||||
|
expected = {
|
||||||
|
"created",
|
||||||
|
"bound",
|
||||||
|
"planning",
|
||||||
|
"awaiting_approval",
|
||||||
|
"executing",
|
||||||
|
"paused",
|
||||||
|
"completed",
|
||||||
|
"failed",
|
||||||
|
"aborted",
|
||||||
|
}
|
||||||
|
actual = {s.value for s in RunState}
|
||||||
|
assert actual == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_run_state_count() -> None:
|
||||||
|
assert len(list(RunState)) == 9
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# RunPhaseState
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_run_phase_state_all_values() -> None:
|
||||||
|
expected = {
|
||||||
|
"pending",
|
||||||
|
"running",
|
||||||
|
"awaiting_artifact",
|
||||||
|
"validating",
|
||||||
|
"awaiting_approval",
|
||||||
|
"completed",
|
||||||
|
"failed",
|
||||||
|
"skipped",
|
||||||
|
}
|
||||||
|
actual = {s.value for s in RunPhaseState}
|
||||||
|
assert actual == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_run_phase_state_count() -> None:
|
||||||
|
assert len(list(RunPhaseState)) == 8
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# SessionState
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_session_state_all_values() -> None:
|
||||||
|
expected = {
|
||||||
|
"CREATED",
|
||||||
|
"BOOTSTRAPPING",
|
||||||
|
"READY",
|
||||||
|
"BUSY",
|
||||||
|
"WAITING_FOR_APPROVAL",
|
||||||
|
"ARTIFACT_TIMEOUT",
|
||||||
|
"HUNG",
|
||||||
|
"CRASHED",
|
||||||
|
"RESUMING",
|
||||||
|
"REBOOTSTRAPPED",
|
||||||
|
"FAILED_NEEDS_HUMAN",
|
||||||
|
}
|
||||||
|
actual = {s.value for s in SessionState}
|
||||||
|
assert actual == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_session_state_count() -> None:
|
||||||
|
assert len(list(SessionState)) == 11
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# ErrorClass
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_error_class_recoverable() -> None:
|
||||||
|
assert ErrorClass.RECOVERABLE == "recoverable"
|
||||||
|
|
||||||
|
|
||||||
|
def test_error_class_human_required() -> None:
|
||||||
|
assert ErrorClass.HUMAN_REQUIRED == "human_required"
|
||||||
|
|
||||||
|
|
||||||
|
def test_error_class_fatal() -> None:
|
||||||
|
assert ErrorClass.FATAL == "fatal"
|
||||||
|
|
||||||
|
|
||||||
|
def test_error_class_count() -> None:
|
||||||
|
assert len(list(ErrorClass)) == 3
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# StrEnum serialization / deserialization
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_str_enum_from_value() -> None:
|
||||||
|
assert Backend("openrouter") is Backend.OPENROUTER
|
||||||
|
|
||||||
|
|
||||||
|
def test_str_enum_in_dict() -> None:
|
||||||
|
# StrEnum should work as dict key and compare with string
|
||||||
|
d = {Backend.OPENROUTER: "openrouter backend"}
|
||||||
|
assert d["openrouter"] == "openrouter backend"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.parametrize(
|
||||||
|
"state",
|
||||||
|
list(RunState),
|
||||||
|
)
|
||||||
|
def test_run_state_parametrize(state: RunState) -> None:
|
||||||
|
assert isinstance(state, str)
|
||||||
|
assert RunState(state.value) is state
208
my-deepagent/tests/unit/test_errors.py
Normal file
@@ -0,0 +1,208 @@
"""Unit tests for src/my_deepagent/errors.py."""

from uuid import UUID, uuid4

import pytest

from my_deepagent.enums import ErrorClass
from my_deepagent.errors import BudgetExhaustedError, MyDeepAgentError


def test_cause_sets_suppress_context() -> None:
    """Wrapping a cause must suppress the implicit context per PEP 3134."""
    original = ValueError("root cause")
    err = MyDeepAgentError.recoverable("wrapped", cause=original)
    assert err.__cause__ is original
    assert err.__suppress_context__ is True


def test_no_cause_does_not_set_suppress_context() -> None:
    err = MyDeepAgentError.recoverable("no_cause")
    assert err.__cause__ is None
    assert err.__suppress_context__ is False


def test_factory_returns_base_class_not_subclass() -> None:
    """LSP fix: factory methods always return MyDeepAgentError, never BudgetExhaustedError."""
    err = BudgetExhaustedError.recoverable("foo")
    assert type(err) is MyDeepAgentError


# ---------------------------------------------------------------------------
# MyDeepAgentError factory methods
# ---------------------------------------------------------------------------


def test_recoverable_class() -> None:
    err = MyDeepAgentError.recoverable("network_blip", recovery_hint="retry")
    assert err.error_class == ErrorClass.RECOVERABLE


def test_recoverable_code() -> None:
    err = MyDeepAgentError.recoverable("network_blip")
    assert err.code == "network_blip"


def test_recoverable_recovery_hint() -> None:
    err = MyDeepAgentError.recoverable("network_blip", recovery_hint="retry after 1s")
    assert err.recovery_hint == "retry after 1s"


def test_human_required_class() -> None:
    err = MyDeepAgentError.human_required("destructive_command_blocked")
    assert err.error_class == ErrorClass.HUMAN_REQUIRED


def test_human_required_code() -> None:
    err = MyDeepAgentError.human_required("destructive_command_blocked")
    assert err.code == "destructive_command_blocked"


def test_fatal_class() -> None:
    err = MyDeepAgentError.fatal("unrecoverable_state")
    assert err.error_class == ErrorClass.FATAL


def test_fatal_code() -> None:
    err = MyDeepAgentError.fatal("unrecoverable_state")
    assert err.code == "unrecoverable_state"


# ---------------------------------------------------------------------------
# run_id / phase_id context
# ---------------------------------------------------------------------------


def test_run_id_attached() -> None:
    run_id = uuid4()
    err = MyDeepAgentError.recoverable("timeout", run_id=run_id)
    assert err.run_id == run_id


def test_phase_id_attached() -> None:
    phase_id = uuid4()
    err = MyDeepAgentError.recoverable("artifact_missing", phase_id=phase_id)
    assert err.phase_id == phase_id


def test_run_id_none_by_default() -> None:
    err = MyDeepAgentError.recoverable("x")
    assert err.run_id is None


# ---------------------------------------------------------------------------
# __cause__ propagation
# ---------------------------------------------------------------------------


def test_cause_propagation() -> None:
    original = ValueError("root cause")
    err = MyDeepAgentError.recoverable("wrapped", cause=original)
    assert err.__cause__ is original


def test_cause_none_by_default() -> None:
    err = MyDeepAgentError.recoverable("no_cause")
    assert err.__cause__ is None


# ---------------------------------------------------------------------------
# __repr__ format
# ---------------------------------------------------------------------------


def test_repr_contains_class_and_code() -> None:
    err = MyDeepAgentError.recoverable("some_code")
    r = repr(err)
    assert "class=recoverable" in r
    assert "code=some_code" in r


def test_repr_contains_run_id_when_present() -> None:
    run_id = UUID("12345678-1234-5678-1234-567812345678")
    err = MyDeepAgentError.recoverable("x", run_id=run_id)
    assert str(run_id) in repr(err)


def test_repr_contains_hint_when_present() -> None:
    err = MyDeepAgentError.recoverable("x", recovery_hint="do something")
    assert "do something" in repr(err)


def test_repr_no_hint_when_absent() -> None:
    err = MyDeepAgentError.recoverable("x")
    assert "hint" not in repr(err)


# ---------------------------------------------------------------------------
# Exception hierarchy
# ---------------------------------------------------------------------------


def test_my_deepagent_error_is_exception() -> None:
    err = MyDeepAgentError.recoverable("x")
    assert isinstance(err, Exception)


def test_budget_exhausted_is_my_deepagent_error() -> None:
    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
    assert isinstance(err, MyDeepAgentError)


# ---------------------------------------------------------------------------
# BudgetExhaustedError
# ---------------------------------------------------------------------------


def test_budget_exhausted_scope() -> None:
    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
    assert err.scope == "day:2026-05-15"


def test_budget_exhausted_projected_usd() -> None:
    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
    assert err.projected_usd == pytest.approx(1.20)


def test_budget_exhausted_cap_usd() -> None:
    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
    assert err.cap_usd == pytest.approx(1.00)


def test_budget_exhausted_error_class() -> None:
    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
    assert err.error_class == ErrorClass.HUMAN_REQUIRED


def test_budget_exhausted_code() -> None:
    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
    assert err.code == "budget_exhausted"


def test_budget_exhausted_default_recovery_hint() -> None:
    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
    assert err.recovery_hint is not None
    assert len(err.recovery_hint) > 0


def test_budget_exhausted_custom_recovery_hint() -> None:
    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00, recovery_hint="call support")
    assert err.recovery_hint == "call support"


def test_budget_exhausted_run_id() -> None:
    run_id = uuid4()
    err = BudgetExhaustedError("run:abc", 0.5, 0.4, run_id=run_id)
    assert err.run_id == run_id


def test_budget_exhausted_message_contains_scope() -> None:
    err = BudgetExhaustedError("day:2026-05-15", 1.20, 1.00)
    assert "day:2026-05-15" in str(err)


def test_budget_exhausted_message_contains_values() -> None:
    err = BudgetExhaustedError("scope", 1.2345, 1.0000)
    msg = str(err)
    assert "1.2345" in msg
    assert "1.0000" in msg
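The cause and context tests in test_errors.py rely on a standard Python behavior: assigning `__cause__` on an exception also sets `__suppress_context__` to True, just as `raise ... from cause` does. The sketch below shows that mechanism in isolation. `SketchError` is a hypothetical stand-in written for illustration; it is not the project's `MyDeepAgentError`, whose constructor and factory signatures are not shown in this diff.

```python
from __future__ import annotations


class SketchError(Exception):
    """Hypothetical stand-in for an error type that can wrap a cause."""

    def __init__(self, code: str, cause: BaseException | None = None) -> None:
        super().__init__(code)
        self.code = code
        if cause is not None:
            # Assigning __cause__ implicitly sets __suppress_context__ to True,
            # the same effect as `raise SketchError(...) from cause`.
            self.__cause__ = cause


wrapped = SketchError("wrapped", cause=ValueError("root cause"))
bare = SketchError("no_cause")
```

When no cause is given, both attributes keep their defaults (`None` and `False`), which is what test_no_cause_does_not_set_suppress_context checks on the real class.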
121
my-deepagent/tests/unit/test_hash.py
Normal file
@@ -0,0 +1,121 @@
"""Unit tests for src/my_deepagent/hash.py."""

import re

import pytest

from my_deepagent.hash import canonicalize, sha256

# ---------------------------------------------------------------------------
# canonicalize: key ordering
# ---------------------------------------------------------------------------


def test_canonicalize_sorts_keys() -> None:
    assert canonicalize({"b": 1, "a": 2}) == '{"a":2,"b":1}'


def test_canonicalize_nested_sorts_keys() -> None:
    result = canonicalize({"x": {"b": 2, "a": 1}})
    assert result == '{"x":{"a":1,"b":2}}'


def test_canonicalize_empty_dict() -> None:
    assert canonicalize({}) == "{}"


def test_canonicalize_empty_list() -> None:
    assert canonicalize([]) == "[]"


def test_canonicalize_none() -> None:
    assert canonicalize(None) == "null"


def test_canonicalize_integer() -> None:
    assert canonicalize(42) == "42"


def test_canonicalize_float() -> None:
    # 0.1 has a known floating-point representation
    result = canonicalize(0.1)
    assert result == "0.1"


def test_canonicalize_no_whitespace() -> None:
    result = canonicalize({"a": 1, "b": 2})
    assert " " not in result


def test_canonicalize_list_preserves_order() -> None:
    # Lists should not be reordered
    assert canonicalize([3, 1, 2]) == "[3,1,2]"


def test_canonicalize_string_value() -> None:
    assert canonicalize("hello") == '"hello"'


def test_canonicalize_boolean() -> None:
    assert canonicalize(True) == "true"
    assert canonicalize(False) == "false"


def test_canonicalize_nan_raises() -> None:
    import math

    with pytest.raises(ValueError):
        canonicalize(math.nan)


# ---------------------------------------------------------------------------
# sha256: determinism
# ---------------------------------------------------------------------------


def test_sha256_deterministic() -> None:
    value = {"a": 1, "b": [1, 2, 3]}
    results = [sha256(value) for _ in range(100)]
    assert len(set(results)) == 1


def test_sha256_returns_64_char_hex() -> None:
    result = sha256({"a": 1})
    assert re.fullmatch(r"[0-9a-f]{64}", result) is not None


def test_sha256_different_inputs_different_hash() -> None:
    h1 = sha256({"a": 1})
    h2 = sha256({"a": 2})
    assert h1 != h2


def test_sha256_key_order_irrelevant() -> None:
    # Same content, different insertion order → same hash
    h1 = sha256({"a": 1, "b": 2})
    h2 = sha256({"b": 2, "a": 1})
    assert h1 == h2


def test_sha256_empty_dict() -> None:
    result = sha256({})
    assert re.fullmatch(r"[0-9a-f]{64}", result) is not None


def test_sha256_none() -> None:
    result = sha256(None)
    assert re.fullmatch(r"[0-9a-f]{64}", result) is not None


def test_sha256_nested() -> None:
    h1 = sha256({"x": {"a": 1, "b": 2}})
    h2 = sha256({"x": {"b": 2, "a": 1}})
    assert h1 == h2


def test_sha256_known_value() -> None:
    # Pre-computed: sha256('{"a":1}') in UTF-8
    import hashlib

    expected = hashlib.sha256(b'{"a":1}').hexdigest()
    assert sha256({"a": 1}) == expected
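The behavior these tests pin down (sorted keys, no whitespace, list order preserved, NaN rejected, SHA-256 over the UTF-8 bytes) can be reproduced with the standard library alone. The following is a minimal sketch under that assumption, not the contents of my_deepagent/hash.py, which this diff does not show. The names `canonicalize_sketch` and `sha256_sketch` are hypothetical, and non-ASCII handling is left at the `json.dumps` default because the tests above do not constrain it.

```python
import hashlib
import json
from typing import Any


def canonicalize_sketch(value: Any) -> str:
    """Serialize to compact JSON with sorted keys; NaN/Infinity raise ValueError."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), allow_nan=False)


def sha256_sketch(value: Any) -> str:
    """Hex SHA-256 digest of the canonical JSON form, encoded as UTF-8."""
    return hashlib.sha256(canonicalize_sketch(value).encode("utf-8")).hexdigest()
```

Because keys are sorted at every nesting level before hashing, two dicts with the same content but different insertion order produce the same digest.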
118
my-deepagent/tests/unit/test_middleware_audit.py
Normal file
@@ -0,0 +1,118 @@
"""Unit tests for src/my_deepagent/middleware/audit.py."""

from __future__ import annotations

from typing import Any
from unittest.mock import AsyncMock, MagicMock
from uuid import UUID

import pytest

from my_deepagent.middleware.audit import AuditToolMiddleware

# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------


def _make_request(name: str = "read_file", args: dict[str, Any] | None = None) -> MagicMock:
    request = MagicMock()
    request.tool_call = {"name": name, "args": args or {"path": "x.py"}}
    return request


# ---------------------------------------------------------------------------
# awrap_tool_call - success path
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_audit_middleware_records_correct_fields_on_success() -> None:
    recorder = AsyncMock()
    mw = AuditToolMiddleware(
        run_id=UUID("00000000-0000-0000-0000-000000000001"),
        phase_id=UUID("00000000-0000-0000-0000-000000000002"),
        interactive_session_id=UUID("00000000-0000-0000-0000-000000000003"),
        recorder=recorder,
    )
    result_value = "file contents here"
    handler = AsyncMock(return_value=result_value)
    request = _make_request(name="read_file", args={"path": "src/main.py"})

    result = await mw.awrap_tool_call(request, handler)

    assert result == result_value
    recorder.assert_awaited_once()
    record: dict[str, Any] = recorder.call_args[0][0]
    assert record["tool_name"] == "read_file"
    assert record["args"] == {"path": "src/main.py"}
    assert record["result"] == result_value
    assert record["error"] is None
    assert record["duration_ms"] >= 0
    assert record["run_id"] == UUID("00000000-0000-0000-0000-000000000001")


@pytest.mark.asyncio
async def test_audit_middleware_no_recorder_is_noop() -> None:
    mw = AuditToolMiddleware()
    handler = AsyncMock(return_value="ok")
    result = await mw.awrap_tool_call(_make_request(), handler)
    assert result == "ok"


# ---------------------------------------------------------------------------
# awrap_tool_call - error path
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_audit_middleware_records_error_code_on_exception() -> None:
    recorder = AsyncMock()
    mw = AuditToolMiddleware(recorder=recorder)
    handler = AsyncMock(side_effect=PermissionError("access denied"))

    with pytest.raises(PermissionError):
        await mw.awrap_tool_call(_make_request(), handler)

    recorder.assert_awaited_once()
    record: dict[str, Any] = recorder.call_args[0][0]
    assert record["error"] == "PermissionError"
    assert record["result"] is None


@pytest.mark.asyncio
async def test_audit_middleware_reraises_exception() -> None:
    mw = AuditToolMiddleware(recorder=AsyncMock())
    handler = AsyncMock(side_effect=ValueError("bad args"))
    with pytest.raises(ValueError, match="bad args"):
        await mw.awrap_tool_call(_make_request(), handler)


# ---------------------------------------------------------------------------
# result serialization
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_audit_middleware_serializes_non_primitive_result_as_str() -> None:
    recorder = AsyncMock()
    mw = AuditToolMiddleware(recorder=recorder)

    class _CustomResult:
        def __str__(self) -> str:
            return "custom-result-str"

    handler = AsyncMock(return_value=_CustomResult())
    await mw.awrap_tool_call(_make_request(), handler)
    record = recorder.call_args[0][0]
    assert record["result"] == "custom-result-str"


@pytest.mark.asyncio
async def test_audit_middleware_passes_dict_result_as_is() -> None:
    recorder = AsyncMock()
    mw = AuditToolMiddleware(recorder=recorder)
    handler = AsyncMock(return_value={"key": "value"})
    await mw.awrap_tool_call(_make_request(), handler)
    record = recorder.call_args[0][0]
    assert record["result"] == {"key": "value"}
143
my-deepagent/tests/unit/test_middleware_cost.py
Normal file
@@ -0,0 +1,143 @@
"""Unit tests for src/my_deepagent/middleware/cost.py."""

from __future__ import annotations

from typing import Any
from unittest.mock import AsyncMock, MagicMock
from uuid import UUID

import pytest

from my_deepagent.middleware.cost import CostMiddleware
from my_deepagent.monitoring.pricing import ModelPrice, PricingCache

# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------


def _make_pricing_cache(
    model: str = "anthropic/claude-sonnet",
    input_per_1k: float = 0.003,
    output_per_1k: float = 0.015,
) -> PricingCache:
    cache = PricingCache()
    cache.set(
        [
            ModelPrice(
                model=model,
                input_per_1k_usd=input_per_1k,
                output_per_1k_usd=output_per_1k,
                context_length=200000,
            )
        ]
    )
    return cache


def _make_response(input_tokens: int = 100, output_tokens: int = 50) -> MagicMock:
    response = MagicMock()
    response.usage_metadata = {"input_tokens": input_tokens, "output_tokens": output_tokens}
    return response


# ---------------------------------------------------------------------------
# awrap_model_call - success path
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_cost_middleware_records_correct_fields_on_success() -> None:
    recorder = AsyncMock()
    cache = _make_pricing_cache()
    mw = CostMiddleware(
        pricing=cache,
        model_name="anthropic/claude-sonnet",
        run_id=UUID("00000000-0000-0000-0000-000000000001"),
        phase_id=UUID("00000000-0000-0000-0000-000000000002"),
        persona_name="test-persona",
        recorder=recorder,
    )
    response = _make_response(input_tokens=1000, output_tokens=500)
    handler = AsyncMock(return_value=response)
    request = MagicMock()

    result = await mw.awrap_model_call(request, handler)

    assert result is response
    recorder.assert_awaited_once()
    record: dict[str, Any] = recorder.call_args[0][0]
    assert record["model"] == "anthropic/claude-sonnet"
    assert record["input_tokens"] == 1000
    assert record["output_tokens"] == 500
    assert record["status"] == "ok"
    assert record["error_code"] is None
    assert record["latency_ms"] >= 0
    # cost: (1000/1000 * 0.003) + (500/1000 * 0.015)
    expected_cost = 0.003 * 1.0 + 0.015 * 0.5
    assert record["cost_usd_total"] == pytest.approx(expected_cost)


@pytest.mark.asyncio
async def test_cost_middleware_no_recorder_is_noop() -> None:
    cache = _make_pricing_cache()
    mw = CostMiddleware(pricing=cache, model_name="anthropic/claude-sonnet")
    response = _make_response()
    handler = AsyncMock(return_value=response)
    # Should not raise even with recorder=None
    result = await mw.awrap_model_call(MagicMock(), handler)
    assert result is response


# ---------------------------------------------------------------------------
# awrap_model_call - error path
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_cost_middleware_records_error_on_handler_exception() -> None:
    recorder = AsyncMock()
    cache = _make_pricing_cache()
    mw = CostMiddleware(
        pricing=cache,
        model_name="anthropic/claude-sonnet",
        recorder=recorder,
    )
    handler = AsyncMock(side_effect=RuntimeError("timeout"))

    with pytest.raises(RuntimeError, match="timeout"):
        await mw.awrap_model_call(MagicMock(), handler)

    recorder.assert_awaited_once()
    record: dict[str, Any] = recorder.call_args[0][0]
    assert record["status"] == "error"
    assert record["error_code"] == "RuntimeError"
    assert record["input_tokens"] == 0
    assert record["output_tokens"] == 0


@pytest.mark.asyncio
async def test_cost_middleware_reraises_exception() -> None:
    cache = _make_pricing_cache()
    mw = CostMiddleware(pricing=cache, model_name="m", recorder=AsyncMock())
    handler = AsyncMock(side_effect=ValueError("bad input"))

    with pytest.raises(ValueError, match="bad input"):
        await mw.awrap_model_call(MagicMock(), handler)


# ---------------------------------------------------------------------------
# cost computation via cache
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_cost_zero_when_model_not_in_cache() -> None:
    recorder = AsyncMock()
    cache = PricingCache()  # empty
    mw = CostMiddleware(pricing=cache, model_name="unknown/model", recorder=recorder)
    response = _make_response(input_tokens=1000, output_tokens=1000)
    handler = AsyncMock(return_value=response)
    await mw.awrap_model_call(MagicMock(), handler)
    record = recorder.call_args[0][0]
    assert record["cost_usd_total"] == 0.0
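The cost assertions above imply a simple per-1k-token formula with a zero fallback when the model has no price entry. The sketch below restates that arithmetic on its own. `PriceSketch` and `cost_usd` are hypothetical names for illustration; they are not the project's `ModelPrice` or `CostMiddleware`, whose internals are not part of this diff.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PriceSketch:
    """Hypothetical per-1k-token price pair, in USD."""

    input_per_1k_usd: float
    output_per_1k_usd: float


def cost_usd(price: PriceSketch | None, input_tokens: int, output_tokens: int) -> float:
    """Return the call cost in USD, or 0.0 when no price is known for the model."""
    if price is None:
        return 0.0
    return (
        (input_tokens / 1000) * price.input_per_1k_usd
        + (output_tokens / 1000) * price.output_per_1k_usd
    )
```

With the values used in the success-path test (1000 input tokens at 0.003 and 500 output tokens at 0.015 per 1k), this evaluates to roughly 0.0105 USD.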
168
my-deepagent/tests/unit/test_middleware_fallback.py
Normal file
@@ -0,0 +1,168 @@
|
|||||||
|
"""Unit tests for src/my_deepagent/middleware/fallback.py."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from typing import Any
|
||||||
|
from unittest.mock import AsyncMock, MagicMock
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
import openai
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
from my_deepagent.middleware.fallback import FallbackModelMiddleware
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Helpers
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def _make_request(has_model_attr: bool = True) -> MagicMock:
|
||||||
|
request = MagicMock()
|
||||||
|
if not has_model_attr:
|
||||||
|
del request.model
|
||||||
|
return request
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
# Fallback on RateLimitError
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_fallback_on_rate_limit_error_calls_handler_with_fallback() -> None:
    primary = MagicMock(name="primary-model")
    fallback = MagicMock(name="fallback-model")
    mw = FallbackModelMiddleware(primary=primary, fallback=fallback)

    call_count = 0
    fallback_model_seen: Any = None

    async def handler(request: Any) -> str:
        nonlocal call_count, fallback_model_seen
        call_count += 1
        if call_count == 1:
            raise openai.RateLimitError(
                "rate limit",
                response=MagicMock(status_code=429, headers={}),
                body={},
            )
        fallback_model_seen = getattr(request, "model", None)
        return "fallback-response"

    request = _make_request()
    result = await mw.awrap_model_call(request, handler)
    assert result == "fallback-response"
    assert call_count == 2
    assert fallback_model_seen is fallback


@pytest.mark.asyncio
async def test_fallback_on_api_connection_error() -> None:
    primary = MagicMock()
    fallback = MagicMock()
    mw = FallbackModelMiddleware(primary=primary, fallback=fallback)

    call_count = 0

    async def handler(request: Any) -> str:
        nonlocal call_count
        call_count += 1
        if call_count == 1:
            raise openai.APIConnectionError(request=MagicMock())
        return "connection-fallback"

    result = await mw.awrap_model_call(_make_request(), handler)
    assert result == "connection-fallback"
    assert call_count == 2


@pytest.mark.asyncio
async def test_fallback_on_httpx_error() -> None:
    primary = MagicMock()
    fallback = MagicMock()
    mw = FallbackModelMiddleware(primary=primary, fallback=fallback)

    call_count = 0

    async def handler(request: Any) -> str:
        nonlocal call_count
        call_count += 1
        if call_count == 1:
            raise httpx.ConnectError("connect failed")
        return "httpx-fallback"

    result = await mw.awrap_model_call(_make_request(), handler)
    assert result == "httpx-fallback"
    assert call_count == 2


# ---------------------------------------------------------------------------
# No fallback: exception propagates
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_no_fallback_raises_original_error() -> None:
    mw = FallbackModelMiddleware(primary=MagicMock(), fallback=None)
    handler = AsyncMock(
        side_effect=openai.RateLimitError(
            "rate limit",
            response=MagicMock(status_code=429, headers={}),
            body={},
        )
    )
    with pytest.raises(openai.RateLimitError):
        await mw.awrap_model_call(_make_request(), handler)


# ---------------------------------------------------------------------------
# AuthenticationError: never retried
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_auth_error_is_not_retried() -> None:
    primary = MagicMock()
    fallback = MagicMock()
    mw = FallbackModelMiddleware(primary=primary, fallback=fallback)

    call_count = 0

    async def handler(request: Any) -> str:
        nonlocal call_count
        call_count += 1
        raise openai.AuthenticationError(
            "bad api key",
            response=MagicMock(status_code=401, headers={}),
            body={},
        )

    with pytest.raises(openai.AuthenticationError):
        await mw.awrap_model_call(_make_request(), handler)

    # Handler should only be called once (no retry for auth errors)
    assert call_count == 1


# ---------------------------------------------------------------------------
# _with_fallback_model
# ---------------------------------------------------------------------------


def test_with_fallback_model_swaps_model_attribute() -> None:
    primary = MagicMock(name="primary")
    fallback = MagicMock(name="fallback")
    mw = FallbackModelMiddleware(primary=primary, fallback=fallback)

    request = MagicMock()
    request.model = primary
    patched = mw._with_fallback_model(request)
    assert patched.model is fallback


def test_with_fallback_model_no_model_attr_does_not_crash() -> None:
    mw = FallbackModelMiddleware(primary=MagicMock(), fallback=MagicMock())
    request = MagicMock(spec=[])  # no attributes
    # Should not raise
    patched = mw._with_fallback_model(request)
    assert patched is request
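The tests above pin down a retry-once contract: transient failures are retried with the fallback model, a missing fallback re-raises the original error, and authentication errors are never retried. A minimal sketch of that contract follows. It is not the project's `FallbackModelMiddleware`: the class `FallbackSketch` and the exception types `TransientError` and `AuthError` are hypothetical stand-ins chosen so the example runs with the standard library only.

```python
from __future__ import annotations

import asyncio
from types import SimpleNamespace
from typing import Any, Awaitable, Callable


class TransientError(Exception):
    """Hypothetical stand-in for rate-limit and connection failures."""


class AuthError(Exception):
    """Hypothetical stand-in for authentication failures (never retried)."""


class FallbackSketch:
    """Retry a model call once with a fallback model on transient errors."""

    def __init__(self, primary: Any, fallback: Any | None) -> None:
        self.primary = primary
        self.fallback = fallback

    def _with_fallback_model(self, request: Any) -> Any:
        # Swap the model only when the request carries a `model` attribute.
        if hasattr(request, "model"):
            request.model = self.fallback
        return request

    async def awrap_model_call(
        self, request: Any, handler: Callable[[Any], Awaitable[Any]]
    ) -> Any:
        try:
            return await handler(request)
        except TransientError:
            # AuthError is not caught here, so it propagates after one call.
            if self.fallback is None:
                raise
            return await handler(self._with_fallback_model(request))


seen_models: list[str] = []


async def flaky_handler(request: Any) -> str:
    # Fail on the first call only, recording which model each call saw.
    seen_models.append(request.model)
    if len(seen_models) == 1:
        raise TransientError("simulated 429")
    return "fallback-response"


sketch = FallbackSketch(primary="primary", fallback="fallback")
result = asyncio.run(
    sketch.awrap_model_call(SimpleNamespace(model="primary"), flaky_handler)
)
```

Catching only the transient error type keeps the no-retry rule for authentication failures implicit in the `except` clause, which is one plausible way to satisfy `test_auth_error_is_not_retried`.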
258
my-deepagent/tests/unit/test_middleware_safety.py
Normal file
@@ -0,0 +1,258 @@
"""Unit tests for src/my_deepagent/middleware/safety.py."""

from __future__ import annotations

from typing import Any
from unittest.mock import AsyncMock, MagicMock

import pytest

from my_deepagent.errors import MyDeepAgentError
from my_deepagent.middleware.safety import SafetyShellMiddleware, _is_denied_path

# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------


def _make_shell_request(cmd: str | list[str], tool_name: str = "shell") -> MagicMock:
    request = MagicMock()
    if isinstance(cmd, list):
        request.tool_call = {"name": tool_name, "args": {"argv": cmd}}
    else:
        request.tool_call = {"name": tool_name, "args": {"command": cmd}}
    return request


def _make_other_tool_request(
    name: str = "read_file", args: dict[str, Any] | None = None
) -> MagicMock:
    request = MagicMock()
    request.tool_call = {"name": name, "args": args or {}}
    return request


# ---------------------------------------------------------------------------
# Destructive commands: should raise
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_rm_rf_slash_is_blocked() -> None:
    mw = SafetyShellMiddleware()
    with pytest.raises(MyDeepAgentError) as exc_info:
        await mw.awrap_tool_call(_make_shell_request("rm -rf /"), AsyncMock())
    assert exc_info.value.code == "destructive_command_blocked"


@pytest.mark.asyncio
async def test_rm_rf_with_path_is_blocked() -> None:
    mw = SafetyShellMiddleware()
    with pytest.raises(MyDeepAgentError) as exc_info:
        await mw.awrap_tool_call(_make_shell_request("rm -rf ./build"), AsyncMock())
    assert exc_info.value.code == "destructive_command_blocked"


@pytest.mark.asyncio
async def test_git_push_force_is_blocked() -> None:
    mw = SafetyShellMiddleware()
    with pytest.raises(MyDeepAgentError):
        await mw.awrap_tool_call(_make_shell_request("git push --force origin main"), AsyncMock())


@pytest.mark.asyncio
async def test_git_push_force_with_lease_is_blocked() -> None:
    mw = SafetyShellMiddleware()
    with pytest.raises(MyDeepAgentError):
        await mw.awrap_tool_call(
            _make_shell_request("git push --force-with-lease origin main"), AsyncMock()
        )


@pytest.mark.asyncio
async def test_git_reset_hard_is_blocked() -> None:
    mw = SafetyShellMiddleware()
    with pytest.raises(MyDeepAgentError):
        await mw.awrap_tool_call(_make_shell_request("git reset --hard HEAD"), AsyncMock())


@pytest.mark.asyncio
async def test_git_clean_is_blocked() -> None:
    mw = SafetyShellMiddleware()
    with pytest.raises(MyDeepAgentError):
        await mw.awrap_tool_call(_make_shell_request("git clean -fd"), AsyncMock())


@pytest.mark.asyncio
async def test_drop_table_sql_is_blocked() -> None:
    mw = SafetyShellMiddleware()
    with pytest.raises(MyDeepAgentError):
        await mw.awrap_tool_call(_make_shell_request("psql -c 'DROP TABLE users'"), AsyncMock())


@pytest.mark.asyncio
async def test_execute_tool_name_also_blocked() -> None:
    """The 'execute' tool name is also checked for destructive patterns."""
    mw = SafetyShellMiddleware()
    with pytest.raises(MyDeepAgentError):
        await mw.awrap_tool_call(
            _make_shell_request("rm -rf /tmp/data", tool_name="execute"), AsyncMock()
        )


# ---------------------------------------------------------------------------
# argv (list) form: should also be blocked
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_rm_rf_as_list_argv_is_blocked() -> None:
    mw = SafetyShellMiddleware()
    with pytest.raises(MyDeepAgentError):
        await mw.awrap_tool_call(
            _make_shell_request(["rm", "-rf", "/tmp"], tool_name="shell"), AsyncMock()
        )


# ---------------------------------------------------------------------------
# Safe commands: should pass through
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_ls_la_passes_through() -> None:
    mw = SafetyShellMiddleware()
    handler = AsyncMock(return_value="total 42")
    result = await mw.awrap_tool_call(_make_shell_request("ls -la"), handler)
    assert result == "total 42"
    handler.assert_awaited_once()


@pytest.mark.asyncio
async def test_git_status_passes_through() -> None:
    mw = SafetyShellMiddleware()
    handler = AsyncMock(return_value="On branch main")
    result = await mw.awrap_tool_call(_make_shell_request("git status"), handler)
    assert result == "On branch main"


@pytest.mark.asyncio
async def test_git_push_without_force_passes_through() -> None:
    mw = SafetyShellMiddleware()
    handler = AsyncMock(return_value="ok")
    result = await mw.awrap_tool_call(_make_shell_request("git push origin main"), handler)
    assert result == "ok"


# ---------------------------------------------------------------------------
# Non-shell tools: should NOT be inspected
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_read_file_tool_with_destructive_content_passes() -> None:
    """read_file is not a shell tool; its content should not be blocked."""
    mw = SafetyShellMiddleware()
    handler = AsyncMock(return_value="file content")
    request = _make_other_tool_request("read_file", {"path": "/some/file.py"})
    result = await mw.awrap_tool_call(request, handler)
    assert result == "file content"


@pytest.mark.asyncio
async def test_unknown_tool_not_checked() -> None:
    mw = SafetyShellMiddleware()
    handler = AsyncMock(return_value="ok")
    result = await mw.awrap_tool_call(_make_other_tool_request("arbitrary_tool"), handler)
    assert result == "ok"


# ---------------------------------------------------------------------------
# _is_denied_path unit tests
# ---------------------------------------------------------------------------


def test_is_denied_path_env_file() -> None:
    assert _is_denied_path(".env") is True


def test_is_denied_path_env_local_in_subdir() -> None:
    assert _is_denied_path("config/.env.local") is True


def test_is_denied_path_ssh_key() -> None:
    assert _is_denied_path(".ssh/id_rsa") is True


def test_is_denied_path_safe_source_file() -> None:
    assert _is_denied_path("src/main.py") is False


def test_is_denied_path_token_file() -> None:
    assert _is_denied_path("api_token.json") is True


def test_is_denied_path_aws_credentials() -> None:
    assert _is_denied_path(".aws/credentials") is True


def test_is_denied_path_pem_file() -> None:
    assert _is_denied_path("key.pem") is True


def test_is_denied_path_absolute_env() -> None:
    # absolute path normalised by lstrip('/')
    assert _is_denied_path("/.env") is True


# ---------------------------------------------------------------------------
# Secret-path tool blocking via awrap_tool_call
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_read_file_env_path_is_blocked() -> None:
    mw = SafetyShellMiddleware()
    request = _make_other_tool_request("read_file", {"file_path": ".env"})
    with pytest.raises(MyDeepAgentError) as exc_info:
        await mw.awrap_tool_call(request, AsyncMock())
    assert exc_info.value.code == "secret_access_blocked"


@pytest.mark.asyncio
async def test_write_file_pem_path_is_blocked() -> None:
    mw = SafetyShellMiddleware()
    request = _make_other_tool_request("write_file", {"file_path": "key.pem"})
    with pytest.raises(MyDeepAgentError) as exc_info:
        await mw.awrap_tool_call(request, AsyncMock())
    assert exc_info.value.code == "secret_access_blocked"


@pytest.mark.asyncio
async def test_ls_ssh_dir_is_blocked() -> None:
    mw = SafetyShellMiddleware()
    request = _make_other_tool_request("ls", {"path": ".ssh/"})
    with pytest.raises(MyDeepAgentError) as exc_info:
        await mw.awrap_tool_call(request, AsyncMock())
    assert exc_info.value.code == "secret_access_blocked"


@pytest.mark.asyncio
async def test_read_file_safe_path_passes() -> None:
    mw = SafetyShellMiddleware()
    handler = AsyncMock(return_value="content")
    request = _make_other_tool_request("read_file", {"file_path": "src/foo.py"})
    result = await mw.awrap_tool_call(request, handler)
    assert result == "content"
    handler.assert_awaited_once()


@pytest.mark.asyncio
async def test_execute_tool_path_arg_not_path_checked() -> None:
    """execute tool goes through shell-check only, not path-check."""
    mw = SafetyShellMiddleware()
    handler = AsyncMock(return_value="ok")
    # safe shell command with a path arg; should not be blocked via path logic
    request = _make_shell_request("ls /some/safe/dir", tool_name="execute")
    result = await mw.awrap_tool_call(request, handler)
    assert result == "ok"
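The `_is_denied_path` cases above suggest a deny-list keyed on directory names and file-name patterns, applied after stripping a leading slash. One way such a check could look is sketched below. The function `is_denied_path_sketch` and both constants are hypothetical and were chosen only to reproduce the cases the tests exercise; the real rule set in `middleware/safety.py` is not shown in this diff and may differ.

```python
from __future__ import annotations

from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Assumed deny-lists, derived only from the test cases in this file.
DENIED_DIRS = frozenset({".ssh", ".aws"})
DENIED_NAME_GLOBS = (".env", ".env.*", "*.pem", "*token*")


def is_denied_path_sketch(path: str) -> bool:
    """Return True when any path component looks like a secret location."""
    # Treat absolute and relative spellings alike, as the tests expect.
    parts = PurePosixPath(path.lstrip("/")).parts
    if any(part in DENIED_DIRS for part in parts):
        return True
    return any(
        fnmatchcase(part, glob) for part in parts for glob in DENIED_NAME_GLOBS
    )


checked = {
    p: is_denied_path_sketch(p)
    for p in (".env", "config/.env.local", ".ssh/", "src/main.py", "/.env")
}
```

Matching on every component rather than only the file name means a request for a directory such as `.ssh/` is rejected as well, which lines up with `test_ls_ssh_dir_is_blocked`.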
332
my-deepagent/tests/unit/test_persona.py
Normal file
@@ -0,0 +1,332 @@
"""Unit tests for src/my_deepagent/persona.py."""

from __future__ import annotations

import re
from pathlib import Path

import pytest
from pydantic import ValidationError

from my_deepagent.enums import Backend
from my_deepagent.persona import (
    FilesystemPermissionSpec,
    Persona,
    PersonaSubagent,
    load_persona_yaml,
    load_personas_from_dir,
)

# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------

PERSONAS_DIR = Path(__file__).parent.parent.parent / "docs" / "schemas" / "personas"


def _minimal_persona_dict(**overrides: object) -> dict[str, object]:
    """Return a minimal valid persona dict, overridable per-test."""
    base: dict[str, object] = {
        "name": "test-persona",
        "version": 1,
        "backend": "openrouter",
        "model": "openrouter:anthropic/claude-sonnet-4-6",
        "provider_origin": "US/Anthropic",
        "capabilities": ["spec_write"],
        "max_risk_level": "low",
        "system_prompt": "You are a test persona for unit tests.",
    }
    base.update(overrides)
    return base


# ---------------------------------------------------------------------------
# Seed yaml: all 10 load successfully
# ---------------------------------------------------------------------------


def test_all_seed_personas_load() -> None:
    personas = load_personas_from_dir(PERSONAS_DIR)
    assert len(personas) == 10


def test_seed_persona_names_unique() -> None:
    personas = load_personas_from_dir(PERSONAS_DIR)
    keys = [(p.name, p.version) for p in personas]
    assert len(keys) == len(set(keys))


def test_seed_personas_backends_are_openrouter() -> None:
    personas = load_personas_from_dir(PERSONAS_DIR)
    for p in personas:
        assert p.backend == Backend.OPENROUTER


def test_seed_persona_capabilities_non_empty() -> None:
    personas = load_personas_from_dir(PERSONAS_DIR)
    for p in personas:
        assert len(p.capabilities) >= 1


def test_seed_persona_hash_is_64_char_hex() -> None:
    personas = load_personas_from_dir(PERSONAS_DIR)
    for p in personas:
        h = p.compute_hash()
        assert re.fullmatch(r"[0-9a-f]{64}", h), f"{p.name}: bad hash {h!r}"


def test_seed_persona_frozen() -> None:
    """Frozen model: attribute assignment must raise."""
    personas = load_personas_from_dir(PERSONAS_DIR)
    p = personas[0]
    with pytest.raises((TypeError, ValidationError)):
        p.name = "mutated"  # type: ignore[misc]


# ---------------------------------------------------------------------------
# extra="forbid": unknown fields rejected
# ---------------------------------------------------------------------------


def test_persona_extra_field_raises() -> None:
    data = _minimal_persona_dict(unknown_field="surprise")
    with pytest.raises(ValidationError, match="extra"):
        Persona.model_validate(data)


# ---------------------------------------------------------------------------
# FilesystemPermissionSpec validators
# ---------------------------------------------------------------------------


def test_permission_path_no_leading_slash_raises() -> None:
    with pytest.raises(ValidationError, match="must start with '/'"):
        FilesystemPermissionSpec(operations=["read"], paths=["relative/path"])


def test_permission_path_dotdot_raises() -> None:
    with pytest.raises(ValidationError, match=r"must not contain '\.\.'"):
        FilesystemPermissionSpec(operations=["read"], paths=["/foo/../bar"])


def test_permission_path_tilde_raises() -> None:
    with pytest.raises(ValidationError, match="must not contain '~'"):
        FilesystemPermissionSpec(operations=["read"], paths=["/path/~expansion/secret"])


def test_permission_path_glob_ok() -> None:
    """Glob patterns like /** should not trigger the path validator."""
    spec = FilesystemPermissionSpec(operations=["read", "write"], paths=["/**"])
    assert spec.paths == ("/**",)


def test_permission_mode_default_allow() -> None:
    spec = FilesystemPermissionSpec(operations=["read"], paths=["/tmp"])
    assert spec.mode == "allow"


def test_permission_deny_mode() -> None:
    spec = FilesystemPermissionSpec(operations=["write"], paths=["/.env"], mode="deny")
    assert spec.mode == "deny"


def test_permission_extra_field_raises() -> None:
    with pytest.raises(ValidationError):
        FilesystemPermissionSpec(operations=["read"], paths=["/tmp"], unknown=True)  # type: ignore[call-arg]


# ---------------------------------------------------------------------------
# Persona.compute_hash: determinism
# ---------------------------------------------------------------------------


def test_compute_hash_deterministic() -> None:
    p = Persona.model_validate(_minimal_persona_dict())
    hashes = [p.compute_hash() for _ in range(20)]
    assert len(set(hashes)) == 1


def test_compute_hash_different_personas_differ() -> None:
    p1 = Persona.model_validate(_minimal_persona_dict(name="p1"))
    p2 = Persona.model_validate(_minimal_persona_dict(name="p2"))
    assert p1.compute_hash() != p2.compute_hash()


def test_compute_hash_version_affects_hash() -> None:
    p1 = Persona.model_validate(_minimal_persona_dict(version=1))
    p2 = Persona.model_validate(_minimal_persona_dict(version=2))
    assert p1.compute_hash() != p2.compute_hash()


# ---------------------------------------------------------------------------
# Persona: min_length, ge validators
# ---------------------------------------------------------------------------


def test_persona_empty_capabilities_raises() -> None:
    data = _minimal_persona_dict(capabilities=[])
    with pytest.raises(ValidationError):
        Persona.model_validate(data)


def test_persona_version_zero_raises() -> None:
    data = _minimal_persona_dict(version=0)
    with pytest.raises(ValidationError):
        Persona.model_validate(data)


def test_persona_negative_max_cost_raises() -> None:
    data = _minimal_persona_dict(max_cost_per_call_usd=-0.01)
    with pytest.raises(ValidationError):
        Persona.model_validate(data)


def test_persona_system_prompt_too_short_raises() -> None:
    data = _minimal_persona_dict(system_prompt="short")
    with pytest.raises(ValidationError):
        Persona.model_validate(data)


# ---------------------------------------------------------------------------
# load_persona_yaml: file not found
# ---------------------------------------------------------------------------


def test_load_persona_yaml_missing_file(tmp_path: Path) -> None:
    with pytest.raises(FileNotFoundError):
        load_persona_yaml(tmp_path / "nonexistent.yaml")


# ---------------------------------------------------------------------------
# load_personas_from_dir: duplicate detection
# ---------------------------------------------------------------------------


def test_load_personas_from_dir_duplicate_raises(tmp_path: Path) -> None:
    import yaml

    # Two files with identical (name, version) content must be rejected.
    data = _minimal_persona_dict()
    for fname in ("persona-a@1.yaml", "persona-b@1.yaml"):
        (tmp_path / fname).write_text(yaml.dump(data), encoding="utf-8")

    with pytest.raises(ValueError, match="duplicate persona"):
        load_personas_from_dir(tmp_path)


def test_load_personas_from_dir_missing_dir() -> None:
    result = load_personas_from_dir(Path("/nonexistent_directory_xyz"))
    assert result == []


def test_load_personas_from_dir_sorted_by_filename(tmp_path: Path) -> None:
    """Files are loaded in filename order for determinism."""
    import yaml

    # Written in reverse order on purpose; the loader must sort by filename.
    for name in ("zz-persona", "aa-persona"):
        data = _minimal_persona_dict(name=name, version=1)
        (tmp_path / f"{name}@1.yaml").write_text(yaml.dump(data), encoding="utf-8")

    personas = load_personas_from_dir(tmp_path)
    assert personas[0].name == "aa-persona"
    assert personas[1].name == "zz-persona"


# ---------------------------------------------------------------------------
# PersonaSubagent: extra="forbid", min_length
# ---------------------------------------------------------------------------


def test_subagent_extra_field_raises() -> None:
    with pytest.raises(ValidationError):
        PersonaSubagent(
            name="x",
            description="at least ten chars here",
            system_prompt="at least ten chars here",
            unknown_field=True,  # type: ignore[call-arg]
        )


def test_subagent_short_description_raises() -> None:
    with pytest.raises(ValidationError):
        PersonaSubagent(name="x", description="short", system_prompt="at least ten chars here")


# ---------------------------------------------------------------------------
# Snapshot: specific persona hashes are stable
# ---------------------------------------------------------------------------


def test_default_interactive_hash_prefix() -> None:
    """Hash of default-interactive@1 must start with 8193103c.

    Hash updated: permissions block removed from yaml (deepagents 0.6.1 workaround).
    """
    personas = load_personas_from_dir(PERSONAS_DIR)
    p = next(q for q in personas if q.name == "default-interactive")
    assert p.compute_hash().startswith("8193103c")


def test_spec_writer_hash_prefix() -> None:
    """Hash of openrouter-claude-spec-writer@1 must be a well-formed 64-char hex digest."""
    personas = load_personas_from_dir(PERSONAS_DIR)
    p = next(q for q in personas if q.name == "openrouter-claude-spec-writer")
    h = p.compute_hash()
    assert len(h) == 64
    assert re.fullmatch(r"[0-9a-f]{64}", h)


# ---------------------------------------------------------------------------
# Step 2 patch: null byte path rejection
# ---------------------------------------------------------------------------


def test_filesystem_permission_null_byte_rejected() -> None:
    """Null bytes in a filesystem permission path must be rejected."""
    with pytest.raises(ValidationError, match="null bytes"):
        FilesystemPermissionSpec.model_validate(
            {
                "operations": ["read"],
                "paths": ["/foo\x00/bar"],
                "mode": "deny",
            }
        )


# ---------------------------------------------------------------------------
# Deep immutability: nested list-valued fields are tuples (cannot be mutated)
# ---------------------------------------------------------------------------


def test_persona_capabilities_immutable() -> None:
    """capabilities is a tuple, so .append() must raise AttributeError."""
    p = Persona.model_validate(_minimal_persona_dict())
    with pytest.raises((AttributeError, TypeError)):
        p.capabilities.append(None)  # type: ignore[attr-defined]


def test_persona_subagents_immutable() -> None:
    """subagents is a tuple, so .append() must raise AttributeError."""
    p = Persona.model_validate(_minimal_persona_dict())
    with pytest.raises((AttributeError, TypeError)):
        p.subagents.append(None)  # type: ignore[attr-defined]


def test_persona_skills_immutable() -> None:
    """skills is a tuple, so .append() must raise AttributeError."""
    p = Persona.model_validate(_minimal_persona_dict())
    with pytest.raises((AttributeError, TypeError)):
        p.skills.append("new_skill")  # type: ignore[attr-defined]


def test_filesystem_permission_paths_immutable() -> None:
    """paths is a tuple, so .append() must raise AttributeError."""
    perm = FilesystemPermissionSpec(operations=("read",), paths=("/foo",), mode="allow")
    with pytest.raises((AttributeError, TypeError)):
        perm.paths.append("/bar")  # type: ignore[attr-defined]


def test_filesystem_permission_operations_immutable() -> None:
    """operations is a tuple, so .append() must raise AttributeError."""
    perm = FilesystemPermissionSpec(operations=("read",), paths=("/foo",), mode="allow")
    with pytest.raises((AttributeError, TypeError)):
        perm.operations.append("write")  # type: ignore[attr-defined]
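The hash and immutability tests above rely on two properties: the digest is computed from a canonical serialisation, and list-valued fields are tuples so the digest cannot drift after construction. A minimal standard-library sketch of that idea follows. `PersonaSketch` is a hypothetical frozen dataclass, not the project's pydantic `Persona`, and the exact canonical-JSON settings used by the real `compute_hash` are an assumption here.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PersonaSketch:
    """Hypothetical stand-in for a frozen persona model."""

    name: str
    version: int
    capabilities: tuple[str, ...]  # tuple, so it cannot be appended to

    def compute_hash(self) -> str:
        # Canonical JSON (assumed settings): sorted keys, no insignificant
        # whitespace, UTF-8 bytes.
        payload = json.dumps(
            asdict(self),
            sort_keys=True,
            separators=(",", ":"),
            ensure_ascii=False,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


p1 = PersonaSketch(name="p1", version=1, capabilities=("spec_write",))
p1_again = PersonaSketch(name="p1", version=1, capabilities=("spec_write",))
p2 = PersonaSketch(name="p2", version=1, capabilities=("spec_write",))
digest = p1.compute_hash()
```

Sorting keys and fixing the separators removes the two sources of byte-level variation in `json.dumps` output, so equal field values always hash to the same digest.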
229
my-deepagent/tests/unit/test_pricing.py
Normal file
@@ -0,0 +1,229 @@
"""Unit tests for src/my_deepagent/monitoring/pricing.py."""

from __future__ import annotations

import httpx
import pytest
import respx

from my_deepagent.errors import MyDeepAgentError
from my_deepagent.monitoring.pricing import (
    ModelPrice,
    PricingCache,
    _parse_pricing_payload,
    fetch_openrouter_pricing,
)

# ---------------------------------------------------------------------------
# _parse_pricing_payload
# ---------------------------------------------------------------------------


def test_parse_valid_payload_returns_model_prices() -> None:
    data = {
        "data": [
            {
                "id": "deepseek/deepseek-chat",
                "pricing": {"prompt": "0.000001", "completion": "0.000002"},
                "context_length": 32768,
            },
            {
                "id": "anthropic/claude-sonnet",
                "pricing": {"prompt": "0.000003", "completion": "0.000015"},
                "context_length": 200000,
            },
        ]
    }
    result = _parse_pricing_payload(data)
    assert len(result) == 2
    assert result[0].model == "deepseek/deepseek-chat"
    assert result[0].input_per_1k_usd == pytest.approx(0.001)
    assert result[0].output_per_1k_usd == pytest.approx(0.002)
    assert result[0].context_length == 32768
    assert result[1].model == "anthropic/claude-sonnet"


def test_parse_empty_data_list_returns_empty() -> None:
    result = _parse_pricing_payload({"data": []})
    assert result == []


def test_parse_data_is_not_list_returns_empty() -> None:
    # data is a dict instead of a list (malformed response)
    result = _parse_pricing_payload({"data": {"id": "bad"}})
    assert result == []


def test_parse_missing_data_key_returns_empty() -> None:
    result = _parse_pricing_payload({})
    assert result == []


def test_parse_skips_entries_without_id() -> None:
    data = {
        "data": [
            {"pricing": {"prompt": "0.000001", "completion": "0.000002"}, "context_length": 1000},
        ]
    }
    result = _parse_pricing_payload(data)
    assert result == []


|
def test_parse_skips_entries_with_invalid_pricing_values() -> None:
|
||||||
|
data = {
|
||||||
|
"data": [
|
||||||
|
{
|
||||||
|
"id": "model/x",
|
||||||
|
"pricing": {"prompt": "not-a-number", "completion": "also-bad"},
|
||||||
|
"context_length": 1000,
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
result = _parse_pricing_payload(data)
|
||||||
|
assert result == []
|
||||||
|
|
||||||
|
|
||||||
|
def test_parse_handles_null_pricing_gracefully() -> None:
|
||||||
|
data = {
|
||||||
|
"data": [
|
||||||
|
{"id": "model/y", "pricing": None, "context_length": 0},
|
||||||
|
]
|
||||||
|
}
|
||||||
|
result = _parse_pricing_payload(data)
|
||||||
|
# pricing=None -> {} -> prompt/completion default to "0"
|
||||||
|
assert len(result) == 1
|
||||||
|
assert result[0].input_per_1k_usd == 0.0
|
||||||
|
assert result[0].output_per_1k_usd == 0.0
|
||||||
|
|
||||||
|
|
||||||
|
def test_parse_handles_missing_context_length() -> None:
|
||||||
|
data = {
|
||||||
|
"data": [
|
||||||
|
{"id": "model/z", "pricing": {"prompt": "0.000001", "completion": "0.000002"}},
|
||||||
|
]
|
||||||
|
}
|
||||||
|
result = _parse_pricing_payload(data)
|
||||||
|
assert len(result) == 1
|
||||||
|
assert result[0].context_length == 0
|
||||||
|
|
||||||
|
|
||||||
|
def test_parse_non_dict_entry_is_skipped() -> None:
|
||||||
|
data = {"data": ["not-a-dict", None]}
|
||||||
|
result = _parse_pricing_payload(data)
|
||||||
|
assert result == []
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# PricingCache.compute_cost
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_compute_cost_known_model() -> None:
|
||||||
|
cache = PricingCache()
|
||||||
|
cache.set(
|
||||||
|
[
|
||||||
|
ModelPrice(
|
||||||
|
model="deepseek/deepseek-chat",
|
||||||
|
input_per_1k_usd=0.001,
|
||||||
|
output_per_1k_usd=0.002,
|
||||||
|
context_length=32768,
|
||||||
|
)
|
||||||
|
]
|
||||||
|
)
|
||||||
|
cost = cache.compute_cost("deepseek/deepseek-chat", input_tokens=1000, output_tokens=500)
|
||||||
|
assert cost == pytest.approx(0.001 * 1.0 + 0.002 * 0.5)
|
||||||
|
|
||||||
|
|
||||||
|
def test_compute_cost_openrouter_prefix_stripped() -> None:
|
||||||
|
cache = PricingCache()
|
||||||
|
cache.set(
|
||||||
|
[
|
||||||
|
ModelPrice(
|
||||||
|
model="deepseek/deepseek-chat",
|
||||||
|
input_per_1k_usd=0.001,
|
||||||
|
output_per_1k_usd=0.002,
|
||||||
|
context_length=32768,
|
||||||
|
)
|
||||||
|
]
|
||||||
|
)
|
||||||
|
# Should strip "openrouter:" prefix when looking up
|
||||||
|
cost = cache.compute_cost(
|
||||||
|
"openrouter:deepseek/deepseek-chat", input_tokens=1000, output_tokens=0
|
||||||
|
)
|
||||||
|
assert cost == pytest.approx(0.001)
|
||||||
|
|
||||||
|
|
||||||
|
def test_compute_cost_unknown_model_returns_zero() -> None:
|
||||||
|
cache = PricingCache()
|
||||||
|
cost = cache.compute_cost("unknown/model", input_tokens=1000, output_tokens=1000)
|
||||||
|
assert cost == 0.0
|
||||||
|
|
||||||
|
|
||||||
|
def test_compute_cost_zero_tokens_returns_zero() -> None:
|
||||||
|
cache = PricingCache()
|
||||||
|
cache.set(
|
||||||
|
[ModelPrice(model="m/x", input_per_1k_usd=1.0, output_per_1k_usd=2.0, context_length=1000)]
|
||||||
|
)
|
||||||
|
assert cache.compute_cost("m/x", input_tokens=0, output_tokens=0) == 0.0
|
||||||
|
|
||||||
|
|
||||||
|
def test_pricing_cache_get_strips_openrouter_prefix() -> None:
|
||||||
|
cache = PricingCache()
|
||||||
|
cache.set(
|
||||||
|
[ModelPrice(model="a/b", input_per_1k_usd=0.5, output_per_1k_usd=1.0, context_length=0)]
|
||||||
|
)
|
||||||
|
assert cache.get("openrouter:a/b") is not None
|
||||||
|
assert cache.get("a/b") is not None
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# fetch_openrouter_pricing (respx mock)
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_fetch_openrouter_pricing_success() -> None:
|
||||||
|
payload = {
|
||||||
|
"data": [
|
||||||
|
{
|
||||||
|
"id": "deepseek/deepseek-chat",
|
||||||
|
"pricing": {"prompt": "0.000001", "completion": "0.000002"},
|
||||||
|
"context_length": 64000,
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
with respx.mock:
|
||||||
|
respx.get("https://openrouter.ai/api/v1/models").mock(
|
||||||
|
return_value=httpx.Response(200, json=payload)
|
||||||
|
)
|
||||||
|
result = await fetch_openrouter_pricing(
|
||||||
|
api_key="sk-or-test", base_url="https://openrouter.ai/api/v1"
|
||||||
|
)
|
||||||
|
assert len(result) == 1
|
||||||
|
assert result[0].model == "deepseek/deepseek-chat"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_fetch_openrouter_pricing_http_error_raises_recoverable() -> None:
|
||||||
|
with respx.mock:
|
||||||
|
respx.get("https://openrouter.ai/api/v1/models").mock(
|
||||||
|
return_value=httpx.Response(401, json={"error": "unauthorized"})
|
||||||
|
)
|
||||||
|
with pytest.raises(MyDeepAgentError) as exc_info:
|
||||||
|
await fetch_openrouter_pricing(
|
||||||
|
api_key="bad-key", base_url="https://openrouter.ai/api/v1"
|
||||||
|
)
|
||||||
|
assert exc_info.value.code == "network_blip"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_fetch_openrouter_pricing_connect_error_raises_recoverable() -> None:
|
||||||
|
with respx.mock:
|
||||||
|
respx.get("https://openrouter.ai/api/v1/models").mock(
|
||||||
|
side_effect=httpx.ConnectError("connection refused")
|
||||||
|
)
|
||||||
|
with pytest.raises(MyDeepAgentError) as exc_info:
|
||||||
|
await fetch_openrouter_pricing(
|
||||||
|
api_key="sk-or-test", base_url="https://openrouter.ai/api/v1"
|
||||||
|
)
|
||||||
|
assert exc_info.value.code == "network_blip"
|
||||||
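Review note: the behaviour that test_pricing.py pins down can be summarised in a small sketch. The real code lives in src/my_deepagent/monitoring/pricing.py and is not shown in this part of the diff, so everything below is an assumption derived from the assertions only: per-token price strings are scaled to USD per 1k tokens, malformed entries are skipped, and an `openrouter:` prefix is stripped on lookup. `PriceSketch`, `parse_pricing_sketch`, and `PricingCacheSketch` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class PriceSketch:
    """Hypothetical stand-in for ModelPrice."""

    model: str
    input_per_1k_usd: float
    output_per_1k_usd: float
    context_length: int


def parse_pricing_sketch(payload: dict[str, Any]) -> list[PriceSketch]:
    """Convert a models payload into prices, skipping malformed entries."""
    entries = payload.get("data")
    if not isinstance(entries, list):
        return []  # missing key or non-list "data" is treated as empty
    prices: list[PriceSketch] = []
    for entry in entries:
        if not isinstance(entry, dict):
            continue
        model_id = entry.get("id")
        if not isinstance(model_id, str) or not model_id:
            continue
        pricing = entry.get("pricing") or {}  # pricing=None behaves like {}
        if not isinstance(pricing, dict):
            continue
        try:
            # Assumed: the payload quotes USD per token; scale to USD per 1k tokens.
            prompt = float(pricing.get("prompt", "0")) * 1000.0
            completion = float(pricing.get("completion", "0")) * 1000.0
            context_length = int(entry.get("context_length") or 0)
        except (TypeError, ValueError):
            continue
        prices.append(PriceSketch(model_id, prompt, completion, context_length))
    return prices


class PricingCacheSketch:
    """Hypothetical in-memory price table keyed by bare model id."""

    def __init__(self) -> None:
        self._prices: dict[str, PriceSketch] = {}

    def set(self, prices: list[PriceSketch]) -> None:
        self._prices = {price.model: price for price in prices}

    def get(self, model: str) -> PriceSketch | None:
        return self._prices.get(model.removeprefix("openrouter:"))

    def compute_cost(self, model: str, *, input_tokens: int, output_tokens: int) -> float:
        price = self.get(model)
        if price is None:
            return 0.0  # unknown models cost nothing in this sketch
        return (
            price.input_per_1k_usd * input_tokens / 1000.0
            + price.output_per_1k_usd * output_tokens / 1000.0
        )
```

Returning 0.0 for an unknown model mirrors test_compute_cost_unknown_model_returns_zero; whether that is the right default for budget enforcement is a design question the tests leave open.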
454
my-deepagent/tests/unit/test_session.py
Normal file
@@ -0,0 +1,454 @@
|
|||||||
|
"""Unit tests for src/my_deepagent/session.py.
|
||||||
|
|
||||||
|
Tests verify the dataclass-based deepagents API (FilesystemPermission attributes,
|
||||||
|
build_backend backend type dispatch, _map_operations deduplication, etc.).
|
||||||
|
No real API calls are made.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from deepagents import FilesystemPermission
|
||||||
|
from deepagents.backends import (
|
||||||
|
CompositeBackend,
|
||||||
|
FilesystemBackend,
|
||||||
|
LocalShellBackend,
|
||||||
|
)
|
||||||
|
from langchain_openai import ChatOpenAI
|
||||||
|
from langgraph.graph.state import CompiledStateGraph
|
||||||
|
|
||||||
|
from my_deepagent.config import load_config
|
||||||
|
from my_deepagent.errors import MyDeepAgentError
|
||||||
|
from my_deepagent.persona import FilesystemPermissionSpec, Persona, PersonaSubagent
|
||||||
|
from my_deepagent.session import (
|
||||||
|
_map_operations,
|
||||||
|
_resolve_openrouter_api_key,
|
||||||
|
_spec_to_permission,
|
||||||
|
_subagent_to_dict,
|
||||||
|
build_agent,
|
||||||
|
build_backend,
|
||||||
|
default_safety_permissions,
|
||||||
|
resolve_model_instance,
|
||||||
|
)
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Helpers
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def _minimal_persona(**overrides: Any) -> Persona:
|
||||||
|
base: dict[str, Any] = {
|
||||||
|
"name": "test-persona",
|
||||||
|
"version": 1,
|
||||||
|
"backend": "openrouter",
|
||||||
|
"model": "openrouter:anthropic/claude-sonnet-4-6",
|
||||||
|
"provider_origin": "US/Anthropic",
|
||||||
|
"capabilities": ["spec_write"],
|
||||||
|
"max_risk_level": "low",
|
||||||
|
"system_prompt": "You are a test assistant for unit tests.",
|
||||||
|
}
|
||||||
|
base.update(overrides)
|
||||||
|
return Persona.model_validate(base)
|
||||||
|
|
||||||
|
|
||||||
|
def _minimal_permission_spec(
|
||||||
|
operations: list[str] | None = None,
|
||||||
|
paths: list[str] | None = None,
|
||||||
|
mode: str = "allow",
|
||||||
|
) -> FilesystemPermissionSpec:
|
||||||
|
return FilesystemPermissionSpec(
|
||||||
|
operations=tuple(operations or ["read"]),
|
||||||
|
paths=tuple(paths or ["/**"]),
|
||||||
|
mode=mode, # type: ignore[arg-type]
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _minimal_subagent(**overrides: Any) -> PersonaSubagent:
|
||||||
|
base: dict[str, Any] = {
|
||||||
|
"name": "test-sub",
|
||||||
|
"description": "A test subagent description.",
|
||||||
|
"system_prompt": "You are a subagent for unit tests.",
|
||||||
|
}
|
||||||
|
base.update(overrides)
|
||||||
|
return PersonaSubagent.model_validate(base)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# default_safety_permissions — dataclass attribute access
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_default_safety_permissions_returns_two_entries() -> None:
|
||||||
|
perms = default_safety_permissions()
|
||||||
|
assert len(perms) == 2
|
||||||
|
|
||||||
|
|
||||||
|
def test_default_safety_permissions_returns_filesystem_permission_instances() -> None:
|
||||||
|
perms = default_safety_permissions()
|
||||||
|
for p in perms:
|
||||||
|
assert isinstance(p, FilesystemPermission)
|
||||||
|
|
||||||
|
|
||||||
|
def test_default_safety_permissions_allow_is_first() -> None:
|
||||||
|
perms = default_safety_permissions()
|
||||||
|
assert perms[0].mode == "allow"
|
||||||
|
assert "/**" in perms[0].paths
|
||||||
|
|
||||||
|
|
||||||
|
def test_default_safety_permissions_allow_has_both_operations() -> None:
|
||||||
|
perms = default_safety_permissions()
|
||||||
|
assert "read" in perms[0].operations
|
||||||
|
assert "write" in perms[0].operations
|
||||||
|
|
||||||
|
|
||||||
|
def test_default_safety_permissions_deny_is_second() -> None:
|
||||||
|
perms = default_safety_permissions()
|
||||||
|
assert perms[1].mode == "deny"
|
||||||
|
deny_paths = perms[1].paths
|
||||||
|
assert any("env" in p for p in deny_paths)
|
||||||
|
assert any("ssh" in p for p in deny_paths)
|
||||||
|
|
||||||
|
|
||||||
|
def test_default_safety_permissions_deny_covers_secrets() -> None:
|
||||||
|
perms = default_safety_permissions()
|
||||||
|
deny_paths = perms[1].paths
|
||||||
|
assert any("secret" in p for p in deny_paths)
|
||||||
|
assert any("token" in p for p in deny_paths)
|
||||||
|
assert any("pem" in p for p in deny_paths)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# _map_operations: 8 cases
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_map_operations_read() -> None:
|
||||||
|
assert _map_operations(("read",)) == ["read"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_map_operations_write() -> None:
|
||||||
|
assert _map_operations(("write",)) == ["write"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_map_operations_edit_maps_to_write() -> None:
|
||||||
|
assert _map_operations(("edit",)) == ["write"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_map_operations_ls_maps_to_read() -> None:
|
||||||
|
assert _map_operations(("ls",)) == ["read"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_map_operations_deduplicates_all_four() -> None:
|
||||||
|
result = _map_operations(("read", "write", "edit", "ls"))
|
||||||
|
assert result == ["read", "write"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_map_operations_ls_and_edit() -> None:
|
||||||
|
assert _map_operations(("ls", "edit")) == ["read", "write"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_map_operations_preserves_order_write_then_read() -> None:
|
||||||
|
result = _map_operations(("write", "read"))
|
||||||
|
assert result == ["write", "read"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_map_operations_empty_returns_empty() -> None:
|
||||||
|
assert _map_operations(()) == []
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# _spec_to_permission — dataclass attribute + mapping
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_spec_to_permission_returns_filesystem_permission() -> None:
|
||||||
|
spec = _minimal_permission_spec(operations=["read"], paths=["/**"], mode="allow")
|
||||||
|
result = _spec_to_permission(spec)
|
||||||
|
assert isinstance(result, FilesystemPermission)
|
||||||
|
|
||||||
|
|
||||||
|
def test_spec_to_permission_maps_read_write_correctly() -> None:
|
||||||
|
spec = _minimal_permission_spec(operations=["read", "write"], paths=["/**"], mode="allow")
|
||||||
|
result = _spec_to_permission(spec)
|
||||||
|
assert result.operations == ["read", "write"]
|
||||||
|
assert result.paths == ["/**"]
|
||||||
|
assert result.mode == "allow"
|
||||||
|
|
||||||
|
|
||||||
|
def test_spec_to_permission_maps_edit_to_write() -> None:
|
||||||
|
spec = _minimal_permission_spec(operations=["edit"], paths=["/src/**"], mode="allow")
|
||||||
|
result = _spec_to_permission(spec)
|
||||||
|
assert result.operations == ["write"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_spec_to_permission_maps_ls_to_read() -> None:
|
||||||
|
spec = _minimal_permission_spec(operations=["ls"], paths=["/data/**"], mode="allow")
|
||||||
|
result = _spec_to_permission(spec)
|
||||||
|
assert result.operations == ["read"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_spec_to_permission_deduplicates_read_edit_ls() -> None:
|
||||||
|
spec = _minimal_permission_spec(
|
||||||
|
operations=["read", "edit", "ls"], paths=["/workspace/**"], mode="allow"
|
||||||
|
)
|
||||||
|
result = _spec_to_permission(spec)
|
||||||
|
# read=read, edit=write, ls=read → ["read", "write"]
|
||||||
|
assert result.operations == ["read", "write"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_spec_to_permission_deny_mode_passthrough() -> None:
|
||||||
|
spec = _minimal_permission_spec(operations=["read"], paths=["/.env*"], mode="deny")
|
||||||
|
result = _spec_to_permission(spec)
|
||||||
|
assert result.mode == "deny"
|
||||||
|
assert "/.env*" in result.paths
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# _subagent_to_dict
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_to_dict_required_fields() -> None:
|
||||||
|
sub = _minimal_subagent()
|
||||||
|
d = _subagent_to_dict(sub)
|
||||||
|
assert d["name"] == "test-sub"
|
||||||
|
assert d["description"] == "A test subagent description."
|
||||||
|
assert d["system_prompt"] == "You are a subagent for unit tests."
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_to_dict_optional_tools_included_when_set() -> None:
|
||||||
|
sub = _minimal_subagent(allowed_tools=["read_file", "write_file"])
|
||||||
|
d = _subagent_to_dict(sub)
|
||||||
|
assert "tools" in d
|
||||||
|
assert d["tools"] == ["read_file", "write_file"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_to_dict_no_tools_key_when_empty() -> None:
|
||||||
|
sub = _minimal_subagent()
|
||||||
|
d = _subagent_to_dict(sub)
|
||||||
|
assert "tools" not in d
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_to_dict_optional_model_included_when_set() -> None:
|
||||||
|
sub = _minimal_subagent(model="openrouter:deepseek/deepseek-chat")
|
||||||
|
d = _subagent_to_dict(sub)
|
||||||
|
assert "model" in d
|
||||||
|
assert d["model"] == "openrouter:deepseek/deepseek-chat"
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_to_dict_no_model_key_when_none() -> None:
|
||||||
|
sub = _minimal_subagent()
|
||||||
|
d = _subagent_to_dict(sub)
|
||||||
|
assert "model" not in d
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_to_dict_permissions_included_when_set() -> None:
|
||||||
|
sub = _minimal_subagent(
|
||||||
|
permissions=[{"operations": ["read"], "paths": ["/**"], "mode": "allow"}]
|
||||||
|
)
|
||||||
|
d = _subagent_to_dict(sub)
|
||||||
|
assert "permissions" in d
|
||||||
|
assert len(d["permissions"]) == 1
|
||||||
|
# Each item inside permissions is also a FilesystemPermission instance
|
||||||
|
assert isinstance(d["permissions"][0], FilesystemPermission)
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_to_dict_permissions_empty_not_included() -> None:
|
||||||
|
sub = _minimal_subagent()
|
||||||
|
d = _subagent_to_dict(sub)
|
||||||
|
assert "permissions" not in d
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_to_dict_interrupt_on_included_when_set() -> None:
|
||||||
|
sub = _minimal_subagent(interrupt_on={"write_file": {"allowed_decisions": ["approve"]}})
|
||||||
|
d = _subagent_to_dict(sub)
|
||||||
|
assert "interrupt_on" in d
|
||||||
|
|
||||||
|
|
||||||
|
def test_subagent_to_dict_no_interrupt_on_when_empty() -> None:
|
||||||
|
sub = _minimal_subagent()
|
||||||
|
d = _subagent_to_dict(sub)
|
||||||
|
assert "interrupt_on" not in d
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# _resolve_openrouter_api_key
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_resolve_api_key_from_config() -> None:
|
||||||
|
config = load_config(openrouter_api_key="sk-or-from-config")
|
||||||
|
key = _resolve_openrouter_api_key(config)
|
||||||
|
assert key == "sk-or-from-config"
|
||||||
|
|
||||||
|
|
||||||
|
def test_resolve_api_key_from_mydeepagent_env(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
|
monkeypatch.delenv("MYDEEPAGENT_OPENROUTER_API_KEY", raising=False)
|
||||||
|
monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
|
||||||
|
monkeypatch.setenv("MYDEEPAGENT_OPENROUTER_API_KEY", "sk-or-env-mydeepagent")
|
||||||
|
config = load_config(openrouter_api_key=None)
|
||||||
|
key = _resolve_openrouter_api_key(config)
|
||||||
|
assert key == "sk-or-env-mydeepagent"
|
||||||
|
|
||||||
|
|
||||||
|
def test_resolve_api_key_fallback_to_openrouter_env(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
|
monkeypatch.delenv("MYDEEPAGENT_OPENROUTER_API_KEY", raising=False)
|
||||||
|
monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
|
||||||
|
monkeypatch.setenv("OPENROUTER_API_KEY", "sk-or-env-fallback")
|
||||||
|
config = load_config(openrouter_api_key=None)
|
||||||
|
key = _resolve_openrouter_api_key(config)
|
||||||
|
assert key == "sk-or-env-fallback"
|
||||||
|
|
||||||
|
|
||||||
|
def test_resolve_api_key_raises_when_missing(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
|
monkeypatch.delenv("MYDEEPAGENT_OPENROUTER_API_KEY", raising=False)
|
||||||
|
monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
|
||||||
|
config = load_config(openrouter_api_key=None)
|
||||||
|
with pytest.raises(MyDeepAgentError) as exc_info:
|
||||||
|
_resolve_openrouter_api_key(config)
|
||||||
|
assert exc_info.value.code == "backend_auth_failed"
|
||||||
|
|
||||||
|
|
||||||
|
def test_resolve_api_key_config_takes_priority_over_env(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
|
monkeypatch.setenv("MYDEEPAGENT_OPENROUTER_API_KEY", "sk-or-env")
|
||||||
|
config = load_config(openrouter_api_key="sk-or-config-wins")
|
||||||
|
key = _resolve_openrouter_api_key(config)
|
||||||
|
assert key == "sk-or-config-wins"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# resolve_model_instance
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_resolve_model_openrouter_returns_chat_openai() -> None:
|
||||||
|
config = load_config(openrouter_api_key="sk-or-test")
|
||||||
|
persona = _minimal_persona(model="openrouter:anthropic/claude-sonnet-4-6")
|
||||||
|
instance = resolve_model_instance(persona, config)
|
||||||
|
assert isinstance(instance, ChatOpenAI)
|
||||||
|
assert instance.openai_api_base == config.openrouter_base_url
|
||||||
|
|
||||||
|
|
||||||
|
def test_resolve_model_openrouter_uses_model_params() -> None:
|
||||||
|
config = load_config(openrouter_api_key="sk-or-test")
|
||||||
|
persona = _minimal_persona(
|
||||||
|
model="openrouter:anthropic/claude-sonnet-4-6",
|
||||||
|
model_params={"max_tokens": 1024, "temperature": 0.5},
|
||||||
|
)
|
||||||
|
instance = resolve_model_instance(persona, config)
|
||||||
|
assert isinstance(instance, ChatOpenAI)
|
||||||
|
assert instance.max_tokens == 1024
|
||||||
|
|
||||||
|
|
||||||
|
def test_resolve_model_non_openrouter_returns_string() -> None:
|
||||||
|
config = load_config()
|
||||||
|
persona = _minimal_persona(
|
||||||
|
backend="anthropic",
|
||||||
|
model="anthropic:claude-3-5-sonnet-20241022",
|
||||||
|
)
|
||||||
|
result = resolve_model_instance(persona, config)
|
||||||
|
assert isinstance(result, str)
|
||||||
|
assert result == "anthropic:claude-3-5-sonnet-20241022"
|
||||||
|
|
||||||
|
|
||||||
|
def test_resolve_model_with_override_openrouter() -> None:
|
||||||
|
config = load_config(openrouter_api_key="sk-or-test")
|
||||||
|
persona = _minimal_persona(model="openrouter:anthropic/claude-sonnet-4-6")
|
||||||
|
instance = resolve_model_instance(
|
||||||
|
persona, config, model_override="openrouter:deepseek/deepseek-chat"
|
||||||
|
)
|
||||||
|
assert isinstance(instance, ChatOpenAI)
|
||||||
|
assert "deepseek-chat" in instance.model_name
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# build_backend: 5 cases
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_build_backend_local_shell(tmp_path: Path) -> None:
|
||||||
|
persona = _minimal_persona(deepagents_backend="local_shell")
|
||||||
|
result = build_backend(persona, tmp_path)
|
||||||
|
assert isinstance(result, LocalShellBackend)
|
||||||
|
|
||||||
|
|
||||||
|
def test_build_backend_filesystem(tmp_path: Path) -> None:
|
||||||
|
persona = _minimal_persona(deepagents_backend="filesystem")
|
||||||
|
result = build_backend(persona, tmp_path)
|
||||||
|
assert isinstance(result, FilesystemBackend)
|
||||||
|
|
||||||
|
|
||||||
|
def test_build_backend_state_returns_none(tmp_path: Path) -> None:
|
||||||
|
persona = _minimal_persona(deepagents_backend="state")
|
||||||
|
result = build_backend(persona, tmp_path)
|
||||||
|
assert result is None
|
||||||
|
|
||||||
|
|
||||||
|
def test_build_backend_composite(tmp_path: Path) -> None:
|
||||||
|
persona = _minimal_persona(deepagents_backend="composite")
|
||||||
|
result = build_backend(persona, tmp_path)
|
||||||
|
assert isinstance(result, CompositeBackend)
|
||||||
|
|
||||||
|
|
||||||
|
def test_build_backend_langsmith_raises_config_invalid(tmp_path: Path) -> None:
|
||||||
|
persona = _minimal_persona(deepagents_backend="langsmith")
|
||||||
|
with pytest.raises(MyDeepAgentError) as exc_info:
|
||||||
|
build_backend(persona, tmp_path)
|
||||||
|
assert exc_info.value.code == "config_invalid"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# build_agent
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_build_agent_returns_compiled_state_graph(tmp_path: Path) -> None:
|
||||||
|
"""build_agent should construct a CompiledStateGraph without calling the LLM API."""
|
||||||
|
config = load_config(openrouter_api_key="sk-or-test")
|
||||||
|
persona = _minimal_persona(deepagents_backend="state")
|
||||||
|
graph = build_agent(persona, config, root_dir=tmp_path)
|
||||||
|
assert isinstance(graph, CompiledStateGraph)
|
||||||
|
assert hasattr(graph, "invoke")
|
||||||
|
assert hasattr(graph, "ainvoke")
|
||||||
|
|
||||||
|
|
||||||
|
def test_build_agent_with_middleware_list(tmp_path: Path) -> None:
|
||||||
|
"""Extra middleware is accepted without error.
|
||||||
|
|
||||||
|
build_agent automatically prepends SafetyShellMiddleware. Callers should pass
|
||||||
|
*other* middleware here; passing a second SafetyShellMiddleware would hit
|
||||||
|
deepagents' duplicate-name guard.
|
||||||
|
"""
|
||||||
|
from my_deepagent.middleware.audit import AuditToolMiddleware
|
||||||
|
|
||||||
|
config = load_config(openrouter_api_key="sk-or-test")
|
||||||
|
persona = _minimal_persona(deepagents_backend="state")
|
||||||
|
graph = build_agent(
|
||||||
|
persona,
|
||||||
|
config,
|
||||||
|
root_dir=tmp_path,
|
||||||
|
middleware=[AuditToolMiddleware()],
|
||||||
|
)
|
||||||
|
assert isinstance(graph, CompiledStateGraph)
|
||||||
|
|
||||||
|
|
||||||
|
def test_build_agent_filesystem_backend(tmp_path: Path) -> None:
|
||||||
|
"""build_agent works with filesystem backend."""
|
||||||
|
config = load_config(openrouter_api_key="sk-or-test")
|
||||||
|
persona = _minimal_persona(deepagents_backend="filesystem")
|
||||||
|
graph = build_agent(persona, config, root_dir=tmp_path)
|
||||||
|
assert isinstance(graph, CompiledStateGraph)
|
||||||
|
|
||||||
|
|
||||||
|
def test_build_agent_with_persona_permissions(tmp_path: Path) -> None:
|
||||||
|
"""build_agent merges persona permissions with default safety permissions."""
|
||||||
|
config = load_config(openrouter_api_key="sk-or-test")
|
||||||
|
persona = _minimal_persona(
|
||||||
|
deepagents_backend="state",
|
||||||
|
permissions=[{"operations": ["read"], "paths": ["/workspace/**"], "mode": "allow"}],
|
||||||
|
)
|
||||||
|
graph = build_agent(persona, config, root_dir=tmp_path)
|
||||||
|
assert isinstance(graph, CompiledStateGraph)
|
||||||
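Review note: the eight `_map_operations` cases in test_session.py describe an order-preserving collapse of the richer persona operations (read/write/edit/ls) onto the two operations deepagents accepts. A minimal sketch that satisfies those cases is below. It is not the implementation in session.py, which this diff section does not show; `map_operations_sketch` and `_OPERATION_MAP` are hypothetical names, and raising `KeyError` on an unknown operation is an assumption.

```python
from __future__ import annotations

from collections.abc import Iterable

# Assumed mapping, taken from the test expectations: edit counts as a write,
# ls counts as a read.
_OPERATION_MAP: dict[str, str] = {
    "read": "read",
    "write": "write",
    "edit": "write",
    "ls": "read",
}


def map_operations_sketch(operations: Iterable[str]) -> list[str]:
    """Collapse operations to read/write, keeping first-seen order."""
    seen: dict[str, None] = {}  # dict preserves insertion order
    for operation in operations:
        seen.setdefault(_OPERATION_MAP[operation], None)
    return list(seen)
```

Using a dict as an ordered set keeps the result deterministic, which matters because the tests compare against ordered lists such as `["write", "read"]`.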
55
my-deepagent/tests/unit/test_session_seed_integration.py
Normal file
@@ -0,0 +1,55 @@
|
|||||||
|
"""Seed persona integration tests for session.py model resolution."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from langchain_openai import ChatOpenAI
|
||||||
|
|
||||||
|
from my_deepagent.config import load_config
|
||||||
|
from my_deepagent.enums import Backend
|
||||||
|
from my_deepagent.persona import load_personas_from_dir
|
||||||
|
from my_deepagent.session import resolve_model_instance
|
||||||
|
|
||||||
|
PERSONAS_DIR = Path(__file__).parent.parent.parent / "docs" / "schemas" / "personas"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture
|
||||||
|
def seed_personas() -> list: # type: ignore[type-arg]
|
||||||
|
return load_personas_from_dir(PERSONAS_DIR)
|
||||||
|
|
||||||
|
|
||||||
|
def test_resolve_model_instance_seed_personas(seed_personas: list) -> None: # type: ignore[type-arg]
|
||||||
|
"""resolve_model_instance should return ChatOpenAI for openrouter personas, str otherwise."""
|
||||||
|
config = load_config(openrouter_api_key="sk-or-dummy")
|
||||||
|
for persona in seed_personas:
|
||||||
|
instance = resolve_model_instance(persona, config)
|
||||||
|
if persona.backend == Backend.OPENROUTER:
|
||||||
|
assert isinstance(instance, ChatOpenAI), (
|
||||||
|
f"persona {persona.name!r} with backend=openrouter should return ChatOpenAI, "
|
||||||
|
f"got {type(instance)}"
|
||||||
|
)
|
||||||
|
# base_url should point to openrouter
|
||||||
|
assert instance.openai_api_base is not None
|
||||||
|
base = instance.openai_api_base
|
||||||
|
assert "openrouter" in base or base == config.openrouter_base_url
|
||||||
|
else:
|
||||||
|
assert isinstance(instance, str), (
|
||||||
|
f"persona {persona.name!r} with backend={persona.backend} should return str, "
|
||||||
|
f"got {type(instance)}"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def test_all_seed_personas_have_non_empty_model(seed_personas: list) -> None: # type: ignore[type-arg]
|
||||||
|
for persona in seed_personas:
|
||||||
|
assert persona.model, f"persona {persona.name!r} has empty model"
|
||||||
|
|
||||||
|
|
||||||
|
def test_all_openrouter_seed_personas_have_openrouter_prefix(seed_personas: list) -> None: # type: ignore[type-arg]
|
||||||
|
for persona in seed_personas:
|
||||||
|
if persona.backend == Backend.OPENROUTER:
|
||||||
|
assert persona.model.startswith("openrouter:"), (
|
||||||
|
f"persona {persona.name!r} has backend=openrouter but model={persona.model!r} "
|
||||||
|
"does not start with 'openrouter:'"
|
||||||
|
)
|
||||||
335
my-deepagent/tests/unit/test_workflow.py
Normal file
@@ -0,0 +1,335 @@
|
|||||||
|
"""Unit tests for src/my_deepagent/workflow.py."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import re
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from pydantic import ValidationError
|
||||||
|
|
||||||
|
from my_deepagent.workflow import (
|
||||||
|
ExpectedArtifact,
|
||||||
|
WorkflowTemplate,
|
||||||
|
load_workflow_yaml,
|
||||||
|
load_workflows_from_dir,
|
||||||
|
)

# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------

WORKFLOWS_DIR = Path(__file__).parent.parent.parent / "docs" / "schemas" / "workflows"


def _minimal_role(**overrides: object) -> dict[str, object]:
    base: dict[str, object] = {
        "id": "spec_writer",
        "required_capabilities": ["spec_write"],
    }
    base.update(overrides)
    return base


def _minimal_phase(**overrides: object) -> dict[str, object]:
    base: dict[str, object] = {
        "key": "spec",
        "title": "Write spec",
        "risk": "low",
        "role": "spec_writer",
        "instructions": "Write the specification document for the feature.",
    }
    base.update(overrides)
    return base


def _minimal_template(**overrides: object) -> dict[str, object]:
    base: dict[str, object] = {
        "name": "test-workflow",
        "version": 1,
        "roles": [_minimal_role()],
        "phases": [_minimal_phase()],
    }
    base.update(overrides)
    return base


# ---------------------------------------------------------------------------
# Seed yaml: all 3 load successfully
# ---------------------------------------------------------------------------


def test_all_seed_workflows_load() -> None:
    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
    assert len(workflows) == 3


def test_seed_workflow_names() -> None:
    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
    names = {w.name for w in workflows}
    assert names == {"spec-and-review", "bug-fix-with-reproduction", "code-investigation"}


def test_seed_workflow_roles_non_empty() -> None:
    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
    for w in workflows:
        assert len(w.roles) >= 1


def test_seed_workflow_phases_non_empty() -> None:
    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
    for w in workflows:
        assert len(w.phases) >= 1


def test_seed_workflow_phase_keys_unique() -> None:
    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
    for w in workflows:
        keys = [ph.key for ph in w.phases]
        assert len(keys) == len(set(keys)), f"{w.name}: duplicate phase keys"


# ---------------------------------------------------------------------------
# WorkflowTemplate validators
# ---------------------------------------------------------------------------


def test_phase_references_undefined_role_raises() -> None:
    data = _minimal_template(
        roles=[_minimal_role(id="spec_writer")],
        phases=[_minimal_phase(role="nonexistent_role")],
    )
    with pytest.raises(ValidationError, match="unknown role"):
        WorkflowTemplate.model_validate(data)


def test_duplicate_phase_keys_raises() -> None:
    data = _minimal_template(
        roles=[_minimal_role(id="spec_writer")],
        phases=[
            _minimal_phase(key="spec"),
            _minimal_phase(key="spec"),
        ],
    )
    with pytest.raises(ValidationError, match="duplicate phase keys"):
        WorkflowTemplate.model_validate(data)


def test_duplicate_role_ids_raises() -> None:
    data = _minimal_template(
        roles=[_minimal_role(id="spec_writer"), _minimal_role(id="spec_writer")],
        phases=[_minimal_phase(role="spec_writer")],
    )
    with pytest.raises(ValidationError, match="duplicate role ids"):
        WorkflowTemplate.model_validate(data)


def test_phase_key_uppercase_raises() -> None:
    data = _minimal_template(phases=[_minimal_phase(key="SPEC")])
    with pytest.raises(ValidationError):
        WorkflowTemplate.model_validate(data)


def test_phase_key_with_hyphen_raises() -> None:
    """Hyphens are not allowed in phase keys (only a-z, 0-9, _)."""
    data = _minimal_template(phases=[_minimal_phase(key="spec-one")])
    with pytest.raises(ValidationError):
        WorkflowTemplate.model_validate(data)


def test_phase_key_leading_digit_raises() -> None:
    data = _minimal_template(phases=[_minimal_phase(key="1spec")])
    with pytest.raises(ValidationError):
        WorkflowTemplate.model_validate(data)


def test_phase_key_snake_case_ok() -> None:
    data = _minimal_template(phases=[_minimal_phase(key="spec_write_phase")])
    wt = WorkflowTemplate.model_validate(data)
    assert wt.phases[0].key == "spec_write_phase"


def test_role_id_pattern_invalid_raises() -> None:
    data = _minimal_template(
        roles=[_minimal_role(id="Spec-Writer")],
        phases=[_minimal_phase(role="spec_writer")],
    )
    with pytest.raises(ValidationError):
        WorkflowTemplate.model_validate(data)
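
Review note: the key-format tests above only pin behaviour (uppercase, hyphens, and a leading digit are rejected; a snake_case key is accepted). A pattern consistent with those cases might look like the sketch below. The regex and the `is_valid_key` helper are assumptions made for illustration; the pattern actually declared in workflow.py is not shown in this diff and may differ.

```python
import re

# Hypothetical pattern: one lowercase letter, then lowercase letters, digits, or underscores.
KEY_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_key(key: str) -> bool:
    """Return True if the whole key matches the assumed snake_case pattern."""
    return KEY_PATTERN.fullmatch(key) is not None
```

Using `fullmatch` rather than `match` matters here: `match` would accept a valid prefix of an otherwise invalid key such as `spec-one`.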


# ---------------------------------------------------------------------------
# ExpectedArtifact: alias mapping
# ---------------------------------------------------------------------------


def test_expected_artifact_schema_alias() -> None:
    """yaml uses 'schema' key; Python attribute is schema_id."""
    art = ExpectedArtifact.model_validate({"path": "artifacts/spec.json", "schema": "dev/spec@1"})
    assert art.schema_id == "dev/spec@1"
    assert art.path == "artifacts/spec.json"


def test_expected_artifact_extra_field_raises() -> None:
    with pytest.raises(ValidationError):
        ExpectedArtifact.model_validate({"path": "x.json", "schema": "dev/spec@1", "unknown": True})


def test_expected_artifact_missing_schema_raises() -> None:
    with pytest.raises(ValidationError):
        ExpectedArtifact.model_validate({"path": "x.json"})


# ---------------------------------------------------------------------------
# WorkflowTemplate frozen + extra="forbid"
# ---------------------------------------------------------------------------


def test_template_frozen() -> None:
    wt = WorkflowTemplate.model_validate(_minimal_template())
    with pytest.raises((TypeError, ValidationError)):
        wt.name = "mutated"  # type: ignore[misc]


def test_template_extra_field_raises() -> None:
    data = _minimal_template(extra_unknown_field="oops")
    with pytest.raises(ValidationError):
        WorkflowTemplate.model_validate(data)


# ---------------------------------------------------------------------------
# compute_hash: determinism
# ---------------------------------------------------------------------------


def test_compute_hash_deterministic() -> None:
    wt = WorkflowTemplate.model_validate(_minimal_template())
    hashes = [wt.compute_hash() for _ in range(20)]
    assert len(set(hashes)) == 1


def test_compute_hash_returns_64_char_hex() -> None:
    wt = WorkflowTemplate.model_validate(_minimal_template())
    h = wt.compute_hash()
    assert re.fullmatch(r"[0-9a-f]{64}", h)


def test_compute_hash_different_templates_differ() -> None:
    wt1 = WorkflowTemplate.model_validate(_minimal_template(name="wf1"))
    wt2 = WorkflowTemplate.model_validate(_minimal_template(name="wf2"))
    assert wt1.compute_hash() != wt2.compute_hash()
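
Review note: the commit message describes the hash helper as "canonical JSON + sha256", and the tests above check determinism, a 64-character hex digest, and sensitivity to content. The sketch below shows one common way to get those properties with the standard library. The function name `canonical_hash` and the exact `json.dumps` options are assumptions; the project's real canonicalisation rules are not visible in this diff.

```python
import hashlib
import json
from typing import Any


def canonical_hash(payload: Any) -> str:
    """Hash a JSON-serialisable payload via a key-sorted, whitespace-free encoding."""
    encoded = json.dumps(
        payload,
        sort_keys=True,          # key order no longer affects the digest
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

Sorting keys is what makes two dicts with the same content but different insertion order hash identically; list order is still significant, which is usually the desired behaviour for ordered fields such as phases.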


# ---------------------------------------------------------------------------
# load_workflow_yaml: file not found
# ---------------------------------------------------------------------------


def test_load_workflow_yaml_missing_file(tmp_path: Path) -> None:
    with pytest.raises(FileNotFoundError):
        load_workflow_yaml(tmp_path / "no.yaml")


# ---------------------------------------------------------------------------
# load_workflows_from_dir: duplicate detection + missing dir
# ---------------------------------------------------------------------------


def test_load_workflows_from_dir_duplicate_raises(tmp_path: Path) -> None:
    import yaml

    data = _minimal_template()
    for fname in ("wf-a@1.yaml", "wf-b@1.yaml"):
        (tmp_path / fname).write_text(yaml.dump(data), encoding="utf-8")

    with pytest.raises(ValueError, match="duplicate workflow"):
        load_workflows_from_dir(tmp_path)


def test_load_workflows_from_dir_missing_dir() -> None:
    result = load_workflows_from_dir(Path("/nonexistent_wf_dir_xyz"))
    assert result == []


# ---------------------------------------------------------------------------
# Snapshot: seed hashes are stable
# ---------------------------------------------------------------------------


def test_spec_and_review_hash_prefix() -> None:
    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
    w = next(x for x in workflows if x.name == "spec-and-review")
    assert w.compute_hash().startswith("1c94587647b16f0d")


def test_bug_fix_hash_prefix() -> None:
    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
    w = next(x for x in workflows if x.name == "bug-fix-with-reproduction")
    assert w.compute_hash().startswith("a137c9656f10e88a")


# ---------------------------------------------------------------------------
# Step 2 patch: Counter-based duplicate role ids report is sorted
# ---------------------------------------------------------------------------


def test_workflow_duplicate_role_ids_reported_sorted() -> None:
    """Multiple duplicated role ids must be reported in sorted order."""
    with pytest.raises(ValidationError, match=r"duplicate role ids: \['a', 'b'\]"):
        WorkflowTemplate.model_validate(
            {
                "name": "x",
                "version": 1,
                "roles": [
                    {"id": "b", "required_capabilities": ["spec_write"]},
                    {"id": "a", "required_capabilities": ["spec_write"]},
                    {"id": "a", "required_capabilities": ["spec_write"]},
                    {"id": "b", "required_capabilities": ["spec_write"]},
                ],
                "phases": [
                    {
                        "key": "x",
                        "title": "x",
                        "risk": "low",
                        "role": "a",
                        "instructions": "x" * 20,
                    }
                ],
            }
        )
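
Review note: the test above expects the duplicated ids to appear as `['a', 'b']` even though `b` is seen first in the input. A Counter-based check that yields that ordering could be sketched as follows. The helper name `find_duplicates` is hypothetical; only the "Counter-based" and "sorted" aspects come from the section comment, and the real validator in workflow.py may be structured differently.

```python
from collections import Counter
from collections.abc import Iterable


def find_duplicates(ids: Iterable[str]) -> list[str]:
    """Return the ids that occur more than once, sorted for stable error messages."""
    counts = Counter(ids)
    return sorted(item for item, n in counts.items() if n > 1)
```

Sorting the result keeps the error message independent of input order, which is what lets the test match on an exact list representation.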


def test_code_investigation_hash_prefix() -> None:
    workflows = load_workflows_from_dir(WORKFLOWS_DIR)
    w = next(x for x in workflows if x.name == "code-investigation")
    assert w.compute_hash().startswith("5b80ea2e248d5232")


# ---------------------------------------------------------------------------
# Deep immutability: nested list-valued fields are tuples (cannot be mutated)
# ---------------------------------------------------------------------------


def test_workflow_phases_immutable() -> None:
    """phases is a tuple, so .append() must raise AttributeError."""
    wt = WorkflowTemplate.model_validate(_minimal_template())
    with pytest.raises((AttributeError, TypeError)):
        wt.phases.append(None)  # type: ignore[attr-defined]


def test_workflow_roles_immutable() -> None:
    """roles is a tuple, so .append() must raise AttributeError."""
    wt = WorkflowTemplate.model_validate(_minimal_template())
    with pytest.raises((AttributeError, TypeError)):
        wt.roles.append(None)  # type: ignore[attr-defined]


def test_workflow_role_required_capabilities_immutable() -> None:
    """required_capabilities is a tuple, so .append() must raise AttributeError."""
    from my_deepagent.workflow import WorkflowRole

    role = WorkflowRole.model_validate(
        {"id": "spec_writer", "required_capabilities": ["spec_write"]}
    )
    with pytest.raises((AttributeError, TypeError)):
        role.required_capabilities.append(None)  # type: ignore[attr-defined]
2501
my-deepagent/uv.lock
generated
Normal file
File diff suppressed because it is too large