feat(my-deepagent): v0.1.0 Steps 0-5, scaffolding through deepagent + OpenRouter

Python rewrite of the agent harness on top of deepagents 0.6.1 + langchain
1.x, replacing the abandoned TS attempt in packages/. 388 unit/integration
tests pass.

Steps
-----
0. Scaffolding: uv workspace, ruff/mypy/pre-commit/alembic, src/tests/docs
   trees with docs/schemas/ seeded from my-deepagent-seed/.
1. Core: config (pydantic-settings with MYDEEPAGENT_ env prefix and TOML
   source), enums (Backend, Capability, RiskLevel, ApprovalDecisionAction,
   ApprovalState, RunState, RunPhaseState, SessionState, ErrorClass), errors
   (MyDeepAgentError + BudgetExhaustedError with PEP-3134 cause + context
   suppression), hash (canonical JSON + sha256).
2. Persona/Workflow/Binding: pydantic v2 schemas with tuple-based deep
   immutability (post-construction hash drift prevented), YAML loaders,
   deterministic auto-select (preferred_backends → version → name → hash),
   override resolution with ineligibility diagnostics, PersonaConsentStore
   with fcntl.flock + tmp+fsync+rename atomic write.
3. Artifact schema registry: Draft202012Validator, multi-root resolution,
   structured ValidationFinding output.
4. Persistence: 18 SQLAlchemy 2.0 async ORM models with FK CASCADE/RESTRICT,
   WAL + busy_timeout + foreign_keys PRAGMA, alembic baseline +
   ux_active_run_repo_base partial unique index, LangGraph SqliteSaver as
   context manager only (lifecycle safety).
5. DeepAgent session: build_agent wires Persona → create_deep_agent with
   LocalShellBackend / FilesystemBackend / StateBackend / CompositeBackend,
   ChatOpenAI(base_url=openrouter) for openrouter: model strings, and 4
   middleware classes (cost / audit-tool / safety-shell / fallback-model).

Critical workarounds
--------------------
- deepagents 0.6.1 rejects FilesystemPermission together with backends that
  implement SandboxBackendProtocol (LocalShellBackend).
  SafetyShellMiddleware enforces destructive-command and secret-path policy
  at the tool layer instead, and build_agent strips the permissions kwarg
  when the persona's deepagents_backend is local_shell.
- FilesystemOperation in deepagents is Literal['read', 'write'] only;
  _map_operations collapses our richer schema (read/write/edit/ls) safely.

Real OpenRouter smoke
---------------------
test_openrouter_deepagents_local_shell_smoke calls DeepSeek via deepagents +
LocalShellBackend + SafetyShellMiddleware end-to-end. PASS, ~$0.000001 cost,
input=9 / output=1 tokens with content "OK".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
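
Step 2 of the message names the PersonaConsentStore write pattern only as "fcntl.flock + tmp+fsync+rename". The sketch below shows one way that sequence is commonly put together; it is not the code from this commit. The function name `atomic_write_json`, the sidecar `.lock` file, and the JSON payload format are all assumptions made for illustration, and `fcntl` limits it to POSIX systems.

```python
import fcntl
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, payload: dict) -> None:
    """Replace `path` with `payload` so readers never see a partial file.

    Hypothetical helper: the real PersonaConsentStore API is not shown
    in this diff.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumed convention: a sidecar lock file next to the target.
    lock_path = path.with_name(path.name + ".lock")
    with open(lock_path, "w") as lock_file:
        # Exclusive advisory lock serializes concurrent writers.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            # The temp file must live in the same directory so the final
            # rename stays on one filesystem and is atomic.
            fd, tmp_name = tempfile.mkstemp(
                dir=path.parent, prefix=path.name + ".", suffix=".tmp"
            )
            try:
                with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                    json.dump(payload, tmp, sort_keys=True)
                    tmp.flush()
                    # Push the bytes to disk before the rename publishes them.
                    os.fsync(tmp.fileno())
                os.replace(tmp_name, path)
            except BaseException:
                # On failure, remove the orphaned temp file and re-raise.
                if os.path.exists(tmp_name):
                    os.unlink(tmp_name)
                raise
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

This sketch does not fsync the parent directory after the rename. Whether the commit's implementation does is not stated in the message.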
2026-05-15 19:40:02 +09:00
parent 1fe59d16ca
commit 17ba5d723b
100 changed files with 12408 additions and 0 deletions
--- a/my-deepagent/tests/unit/test_enums.py
+++ b/my-deepagent/tests/unit/test_enums.py
@@ -0,0 +1,235 @@
+"""Unit tests for src/my_deepagent/enums.py."""
+
+import pytest
+
+from my_deepagent.enums import (
+    ApprovalDecisionAction,
+    ApprovalState,
+    Backend,
+    Capability,
+    ErrorClass,
+    RiskLevel,
+    RunPhaseState,
+    RunState,
+    SessionState,
+)
+
+# ---------------------------------------------------------------------------
+# Backend
+# ---------------------------------------------------------------------------
+
+
+def test_backend_openrouter_value() -> None:
+    assert Backend.OPENROUTER == "openrouter"
+
+
+def test_backend_anthropic_value() -> None:
+    assert Backend.ANTHROPIC == "anthropic"
+
+
+def test_backend_openai_value() -> None:
+    assert Backend.OPENAI == "openai"
+
+
+def test_backend_google_value() -> None:
+    assert Backend.GOOGLE == "google"
+
+
+def test_backend_fake_value() -> None:
+    assert Backend.FAKE == "fake"
+
+
+def test_backend_str_equality() -> None:
+    # StrEnum members compare equal to their string values
+    assert Backend.OPENROUTER == "openrouter"
+    assert str(Backend.OPENROUTER) == "openrouter"
+
+
+# ---------------------------------------------------------------------------
+# Capability
+# ---------------------------------------------------------------------------
+
+
+def test_capability_count() -> None:
+    assert len(list(Capability)) == 13
+
+
+def test_capability_spec_write() -> None:
+    assert Capability.SPEC_WRITE == "spec_write"
+
+
+def test_capability_code_edit() -> None:
+    assert Capability.CODE_EDIT == "code_edit"
+
+
+def test_capability_final_report_compose() -> None:
+    assert Capability.FINAL_REPORT_COMPOSE == "final_report_compose"
+
+
+def test_capability_all_are_str() -> None:
+    for cap in Capability:
+        assert isinstance(cap, str)
+
+
+# ---------------------------------------------------------------------------
+# RiskLevel
+# ---------------------------------------------------------------------------
+
+
+def test_risk_level_values() -> None:
+    assert RiskLevel.LOW == "low"
+    assert RiskLevel.MEDIUM == "medium"
+    assert RiskLevel.HIGH == "high"
+
+
+# ---------------------------------------------------------------------------
+# ApprovalDecisionAction
+# ---------------------------------------------------------------------------
+
+
+def test_approval_decision_action_approve() -> None:
+    assert ApprovalDecisionAction.APPROVE == "approve"
+
+
+def test_approval_decision_action_reject() -> None:
+    assert ApprovalDecisionAction.REJECT == "reject"
+
+
+def test_approval_decision_action_request_changes() -> None:
+    assert ApprovalDecisionAction.REQUEST_CHANGES == "request_changes"
+
+
+def test_approval_decision_action_abort() -> None:
+    assert ApprovalDecisionAction.ABORT == "abort"
+
+
+# ---------------------------------------------------------------------------
+# ApprovalState
+# ---------------------------------------------------------------------------
+
+
+def test_approval_state_all_values() -> None:
+    expected = {"pending", "approved", "rejected", "changes_requested", "aborted", "paused"}
+    actual = {s.value for s in ApprovalState}
+    assert actual == expected
+
+
+# ---------------------------------------------------------------------------
+# RunState
+# ---------------------------------------------------------------------------
+
+
+def test_run_state_all_values() -> None:
+    expected = {
+        "created",
+        "bound",
+        "planning",
+        "awaiting_approval",
+        "executing",
+        "paused",
+        "completed",
+        "failed",
+        "aborted",
+    }
+    actual = {s.value for s in RunState}
+    assert actual == expected
+
+
+def test_run_state_count() -> None:
+    assert len(list(RunState)) == 9
+
+
+# ---------------------------------------------------------------------------
+# RunPhaseState
+# ---------------------------------------------------------------------------
+
+
+def test_run_phase_state_all_values() -> None:
+    expected = {
+        "pending",
+        "running",
+        "awaiting_artifact",
+        "validating",
+        "awaiting_approval",
+        "completed",
+        "failed",
+        "skipped",
+    }
+    actual = {s.value for s in RunPhaseState}
+    assert actual == expected
+
+
+def test_run_phase_state_count() -> None:
+    assert len(list(RunPhaseState)) == 8
+
+
+# ---------------------------------------------------------------------------
+# SessionState
+# ---------------------------------------------------------------------------
+
+
+def test_session_state_all_values() -> None:
+    expected = {
+        "CREATED",
+        "BOOTSTRAPPING",
+        "READY",
+        "BUSY",
+        "WAITING_FOR_APPROVAL",
+        "ARTIFACT_TIMEOUT",
+        "HUNG",
+        "CRASHED",
+        "RESUMING",
+        "REBOOTSTRAPPED",
+        "FAILED_NEEDS_HUMAN",
+    }
+    actual = {s.value for s in SessionState}
+    assert actual == expected
+
+
+def test_session_state_count() -> None:
+    assert len(list(SessionState)) == 11
+
+
+# ---------------------------------------------------------------------------
+# ErrorClass
+# ---------------------------------------------------------------------------
+
+
+def test_error_class_recoverable() -> None:
+    assert ErrorClass.RECOVERABLE == "recoverable"
+
+
+def test_error_class_human_required() -> None:
+    assert ErrorClass.HUMAN_REQUIRED == "human_required"
+
+
+def test_error_class_fatal() -> None:
+    assert ErrorClass.FATAL == "fatal"
+
+
+def test_error_class_count() -> None:
+    assert len(list(ErrorClass)) == 3
+
+
+# ---------------------------------------------------------------------------
+# StrEnum serialization / deserialization
+# ---------------------------------------------------------------------------
+
+
+def test_str_enum_from_value() -> None:
+    assert Backend("openrouter") is Backend.OPENROUTER
+
+
+def test_str_enum_in_dict() -> None:
+    # StrEnum should work as dict key and compare with string
+    d = {Backend.OPENROUTER: "openrouter backend"}
+    assert d["openrouter"] == "openrouter backend"
+
+
+@pytest.mark.parametrize(
+    "state",
+    list(RunState),
+)
+def test_run_state_parametrize(state: RunState) -> None:
+    assert isinstance(state, str)
+    assert RunState(state.value) is state