feat: ESCALATE verdict, issue tracker, onboarding commands

Add 3-verdict system (PASS/FAIL/ESCALATE) with priority handling across simple and phased pipelines. Senior reviewers can now escalate issues requiring human intervention, immediately breaking the review loop.

- ESCALATE verdict extraction with highest priority over PASS/FAIL
- Issue Tracker tables (ISS-NNN) carried across iterations
- Auto-escalate heuristic using (file, keyword) composite fingerprints
- Report restructuring: executive view first (verdict → tracker → metrics)
- Onboarding: `doctor`, `demo`, `init --guided` commands
- Exit codes: PASS=0, FAIL=1, ESCALATE=2
- 87 tests passing (54 config + 25 onboarding + 8 integration)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 18:19:05 +09:00
parent ee4f1a07ef
commit 204e071b74
15 changed files with 3032 additions and 156 deletions
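
The verdict priority and exit codes listed in the commit message can be illustrated with a minimal sketch. This is not the code from this commit: the names `Verdict`, `VERDICT_RE`, and `extract_verdict` are hypothetical, the `VERDICT: ...` line format is borrowed from the `default:review` prompt description in DEVELOPMENT.md, and ranking FAIL above PASS is an assumption (the message only states that ESCALATE ranks highest).

```python
import re
from enum import IntEnum
from typing import Optional


class Verdict(IntEnum):
    """Hypothetical enum: each value doubles as exit code and priority."""
    PASS = 0
    FAIL = 1
    ESCALATE = 2


# Assumed output format, modelled on the `VERDICT: PASS|FAIL` line of the review prompt.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL|ESCALATE)\b")


def extract_verdict(review_text: str) -> Optional[Verdict]:
    """Return the highest-priority verdict found in the text, or None if absent."""
    found = [Verdict[name] for name in VERDICT_RE.findall(review_text)]
    # ESCALATE outranks FAIL, which outranks PASS (the FAIL > PASS order is assumed).
    return max(found) if found else None


if __name__ == "__main__":
    text = "Reviewer A\nVERDICT: PASS\nSenior reviewer\nVERDICT: ESCALATE\n"
    verdict = extract_verdict(text)
    print(verdict.name, int(verdict))
```

Because the enum values match the documented exit codes, a CLI entry point could pass `int(verdict)` straight to `sys.exit`; how a missing verdict should be handled is left open here.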
--- a/DEVELOPMENT.md
+++ b/DEVELOPMENT.md
@@ -41,7 +41,7 @@ inputs:
  checklist: checklist.md

 agents:
-  generator:
+  coder:
    command: claude
    args: ["-p", "--model", "sonnet", "--permission-mode", "auto"]
    system_prompt: "You are a senior software engineer. Follow the plan precisely."
@@ -53,14 +53,16 @@ agents:
 # Option 1: use a preset (no need to write the pipeline YAML by hand)
 pipeline: preset:simple          # "A generates → B reviews" (default)
 # pipeline: preset:cross-review  # "both generate → review each other"
+# pipeline: preset:plan-review   # "review documents/plans before implementation"
+# pipeline: preset:coding-review-fix  # "one initial coding pass → repeated review/fix"

 # Option 2: custom steps (for advanced users)
 # pipeline:
-#   - name: generate
-#     agent: generator
-#     role: generate
-#     prompt_template: "default:generate"
-#     output_key: generated_code
+#   - name: coding
+#     agent: coder
+#     role: coding
+#     prompt_template: "default:coding"
+#     output_key: coding_output
 #   - name: review
 #     agent: reviewer
 #     role: review
@@ -73,8 +75,10 @@ pipeline: preset:simple          # "A generates → B reviews" (default)

 | Preset | Description | Auto-generated steps |
 |--------|------|-------------------|
-| `simple` | A generates → B reviews | generate(agent1) → review(agent2) |
-| `cross-review` | Both generate, then review each other | gen_a → gen_b → review_of_b(agent_a) → review_of_a(agent_b) |
+| `simple` | A codes → B reviews | coding(agent1) → review(agent2) |
+| `cross-review` | Both code, then review each other | coding_a → coding_b → review_of_b(agent_a) → review_of_a(agent_b) |
+| `plan-review` | Document review before implementation | parallel plan_review_* → senior_review(optional) |
+| `coding-review-fix` | Initial coding, then repeated review/fix | initial_coding(coding) → review_fix(review* → aggregate → coding → verify) |

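 Each preset automatically configures the appropriate pipeline steps and context_override internally. agent1 and agent2 are assigned in the order the agents are defined under `agents`. If a preset is not sufficient, you can write the steps yourself.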

@@ -109,11 +113,11 @@ cross_eval/
 - whether verdict_pattern is a valid regular expression

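 **prompts.py** - two default prompts plus the pipeline preset definitions:
-- `default:generate` - "Implement only what the plan specifies; no over-optimization" + plan/checklist/feedback + the instruction **"Explore the existing code in the project directory to understand the context"**
+- `default:coding` - "Implement only what the plan specifies; no over-optimization" + plan/checklist/feedback + the instruction **"Explore the existing code in the project directory to understand the context"**
 - `default:review` - reviews against three criteria (over-optimization, false positives, omissions) + outputs `VERDICT: PASS|FAIL` + the instruction **"Explore the project directory directly to verify the code"**
 - `{variable}` placeholders; a missing key renders as `(no {key} provided)`
 - Users can override the defaults with custom .md files
-- `PIPELINE_PRESETS` dict: defines the StepConfig list for each preset (`simple`, `cross-review`, etc.)
+- `PIPELINE_PRESETS` dict: defines the StepConfig list for each preset (`simple`, `cross-review`, `plan-review`, etc.)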

 **agent.py** — `invoke_agent(agent_config, prompt, cwd)`:
 - The `cwd` parameter sets the project directory → the agent can explore files in that directory
@@ -141,7 +145,7 @@ Generate final-report.md
 - Final verdict

 **cli.py** - subcommands:
-- `cross-eval init [--dir .] [--preset simple|cross-review]` - scaffolding (does not overwrite existing files)
+- `cross-eval init [--dir .] [--preset simple|cross-review|plan-review]` - scaffolding (does not overwrite existing files)
 - `cross-eval run [-c config] [--max-iter N] [--dry-run] [--output-dir path] [--input key=path ...]`
 - `--input key=path`: overrides or adds to the config's inputs
 - `--dry-run`: prints only the rendered prompts without calling any agent
@@ -167,3 +171,17 @@ Generate final-report.md
 3. Check prompt rendering with `cross-eval run --dry-run` (no agent calls)
 4. Put some simple content in plan.md/checklist.md and do a real run with `cross-eval run --max-iter 2`
 5. Confirm that v1/ and final-report.md are created in the `output/` directory
+
+
+  cross-eval run \
+    --docs /Users/chungyeong/Desktop/Dev/new-alpha-foundry/plans/TO_CLICKHOUSE \
+    --preset coding-review-fix \
+    --coder claude \
+    --reviewer codex \
+    --reviewer codex \
+    --reviewer codex \
+    --senior codex \
+    --coder-effort high \
+    --reviewer-effort high \
+    --senior-effort xhigh \
+    --max-iter 10
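
The `{variable}` placeholder rule described for prompts.py in the diff above (a missing key renders as `(no {key} provided)`) could work roughly as follows. This is a sketch under assumptions, not the project's implementation: `render_prompt` and `PLACEHOLDER_RE` are hypothetical names, and restricting placeholder names to word characters is an assumption.

```python
import re
from typing import Dict

# Assumed placeholder syntax: a word-character name wrapped in single braces.
PLACEHOLDER_RE = re.compile(r"\{(\w+)\}")


def render_prompt(template: str, context: Dict[str, str]) -> str:
    """Fill {key} placeholders from context; mark missing keys explicitly."""

    def substitute(match: "re.Match[str]") -> str:
        key = match.group(1)
        # A missing key becomes a visible marker instead of raising KeyError.
        return context.get(key, f"(no {key} provided)")

    # A function replacement keeps backslashes in the values from being
    # interpreted as regex escape sequences.
    return PLACEHOLDER_RE.sub(substitute, template)


if __name__ == "__main__":
    template = "Plan:\n{plan}\n\nFeedback:\n{feedback}"
    print(render_prompt(template, {"plan": "Migrate the tables."}))
```

Unlike `str.format`, this approach leaves unrelated braces in a prompt (for example JSON snippets containing spaces or quotes) untouched, which is one reason to prefer a regex here.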