feat: propagate execution evidence across iterations and enhance reports

- Carry execution evidence forward so reviewer/senior prompts in
  subsequent iterations can inspect prior transcript and command data
- Add {execution_evidence} to REVIEW_ONLY templates (en/ko)
- Add evidence summary table to iteration reports
- Fix test_agentic to match stdin-based prompt delivery for Claude
- Add expanded claim/no-change marker tests and cross-iteration
  evidence propagation tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
chungyeong
2026-03-13 23:36:28 +09:00
parent c467222a2a
commit 87bc0ffbfb
5 changed files with 591 additions and 10 deletions
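
As a rough illustration of the change described above, the sketch below shows how evidence recorded in one iteration might be rendered and substituted into a template's `{execution_evidence}` placeholder for the next review prompt. Only the placeholder names (`{iteration}`, `{max_iterations}`, `{feedback}`, `{execution_evidence}`) and the section headings come from the diff; `CommandRecord`, `render_evidence`, `build_review_prompt`, and the shortened `TEMPLATE` are hypothetical names invented for this example and are not the project's actual code.

```python
from dataclasses import dataclass
from typing import List

# Shortened, hypothetical stand-in for the REVIEW_ONLY template.
# Only the placeholders and headings are taken from the diff.
TEMPLATE = (
    "## Previous Review (iteration {iteration} of {max_iterations})\n"
    "{feedback}\n\n"
    "## Execution Evidence\n"
    "{execution_evidence}\n"
)


@dataclass
class CommandRecord:
    """Hypothetical record of one command run by an agent."""
    command: str
    exit_code: int
    output: str


def render_evidence(records: List[CommandRecord]) -> str:
    """Render command records as plain text for inclusion in a prompt."""
    if not records:
        return "(no execution evidence recorded)"
    lines = []
    for record in records:
        lines.append(f"$ {record.command} (exit {record.exit_code})")
        lines.append(record.output.rstrip())
    return "\n".join(lines)


def build_review_prompt(iteration: int, max_iterations: int,
                        feedback: str,
                        prior_records: List[CommandRecord]) -> str:
    """Fill the template with evidence carried over from the prior iteration."""
    return TEMPLATE.format(
        iteration=iteration,
        max_iterations=max_iterations,
        feedback=feedback,
        execution_evidence=render_evidence(prior_records),
    )


if __name__ == "__main__":
    carried = [CommandRecord("pytest -q", 1, "1 failed, 12 passed\n")]
    print(build_review_prompt(2, 3, "Tests were reported as passing.", carried))
```

Because `str.format` parses only the template and not the substituted values, command output that itself contains braces is inserted verbatim in this sketch.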


@@ -243,9 +243,14 @@ You are tasked with reviewing existing code against a plan and checklist.
 ## Previous Review (iteration {iteration} of {max_iterations})
 {feedback}
+## Execution Evidence
+{execution_evidence}
 ## Review Instructions
 Explore the project directory thoroughly to understand the full codebase, \
-then evaluate the EXISTING code against ONLY the plan and checklist above.
+then evaluate the EXISTING code against ONLY the plan and checklist above. \
+Use the execution evidence above to verify agent claims against actual \
+command outputs and exit codes.
 You are NOT generating or modifying code. You are auditing what already exists.
@@ -314,9 +319,13 @@ REVIEW_ONLY_TEMPLATE_KO = """\
 ## 이전 리뷰 결과 ({max_iterations}회 중 {iteration}번째)
 {feedback}
+## 실행 증거
+{execution_evidence}
 ## 검토 지침
 프로젝트 디렉토리를 직접 탐색하여 전체 코드베이스를 파악한 뒤, \
-위 기획서와 체크리스트 기준으로 **기존 코드**를 평가하세요.
+위 기획서와 체크리스트 기준으로 **기존 코드**를 평가하세요. \
+위 실행 증거를 활용하여 에이전트의 주장을 실제 명령어 출력과 종료 코드로 검증하세요.
 코드를 생성하거나 수정하지 마세요. 이미 존재하는 코드를 감사하는 것이 목적입니다.