Claude in agentic mode (interactive, no -p flag) commits its own changes,
advancing HEAD. As a result, `git diff --cached HEAD` came back empty and
raised a false EMPTY_DIFF error on every run. capture_diff now diffs
against the base commit SHA recorded at worktree creation, so changes are
captured whether or not the agent committed them.
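A minimal sketch of the base-SHA diff described above. The signature, the
`run` parameter (injected so the function can be exercised without a real
repository), and the `git add -A` staging step are assumptions for
illustration, not the actual capture_diff implementation.

```python
import subprocess


def capture_diff(worktree, base_sha, run=subprocess.run):
    """Return the diff between base_sha and the current worktree contents.

    Hypothetical signature; `run` defaults to subprocess.run.
    """
    # Assumption: stage everything first so new files appear in the diff.
    run(["git", "add", "-A"], cwd=worktree, check=True)
    # Compare the index against the recorded base commit rather than HEAD,
    # so commits the agent made on top of base_sha are included as well.
    result = run(
        ["git", "diff", "--cached", base_sha],
        cwd=worktree,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Diffing against HEAD only shows what is staged since the last commit, which
is nothing once the agent has committed; diffing against the base SHA
covers both cases.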
Also adds UX_IMPROVEMENT_PLAN.md for guided message improvements.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
_finalize_worktree was returning None and deleting the branch when the
final commit was empty, even though _commit_iteration had already
committed changes during the pipeline. It now checks git log for any
commits on the branch before deciding to clean up.
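A rough sketch of the git log check, assuming the branch is compared
against the base SHA recorded at worktree creation. The helper name
`branch_has_commits`, its parameters, and the `base_sha..HEAD` range are
hypothetical; the real check may use a different range.

```python
import subprocess


def branch_has_commits(worktree, base_sha, run=subprocess.run):
    """Return True if HEAD has any commit that base_sha does not.

    Hypothetical helper; `run` defaults to subprocess.run.
    """
    # base_sha..HEAD selects commits reachable from HEAD but not base_sha.
    result = run(
        ["git", "log", "--oneline", f"{base_sha}..HEAD"],
        cwd=worktree,
        check=True,
        capture_output=True,
        text=True,
    )
    # Any output line means at least one iteration commit exists.
    return bool(result.stdout.strip())
```

With a check like this, an empty final commit alone is not enough to
trigger cleanup: the branch is deleted only when it carries no commits.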
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add -p flag to _CLAUDE_REVIEW_ARGS so the reviewer uses print mode
  (stdin→stdout) instead of interactive mode, which conflicts with plan
  permission mode
- Copy input files (plan, checklist) into worktree .cross-eval-inputs/ so
  agents in plan mode can access them without escaping the sandbox
- Simplify _snapshot_repo_state to use only git diff HEAD + untracked hashes,
  eliminating false positives from staging state changes (git diff --cached)
  and git status index drift during long-running pipelines
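A sketch of a snapshot built only from `git diff HEAD` plus hashes of
untracked files, as the last bullet describes. The function signature, the
injected `run` parameter, and the choice of SHA-256 are assumptions for
illustration, not the actual _snapshot_repo_state code.

```python
import hashlib
import os
import subprocess
from pathlib import Path


def snapshot_repo_state(repo, run=subprocess.run):
    """Return a hex digest summarizing the working tree (illustrative)."""
    # Tracked changes relative to HEAD, staged or not, as raw bytes.
    diff = run(
        ["git", "diff", "HEAD"], cwd=repo, check=True, capture_output=True
    ).stdout
    # Untracked, non-ignored paths; -z gives NUL-separated, unquoted names.
    raw = run(
        ["git", "ls-files", "--others", "--exclude-standard", "-z"],
        cwd=repo,
        check=True,
        capture_output=True,
    ).stdout
    digest = hashlib.sha256(diff)
    for name in sorted(p for p in raw.split(b"\0") if p):
        # Mix in both the path and the file contents.
        digest.update(name)
        content = (Path(repo) / os.fsdecode(name)).read_bytes()
        digest.update(hashlib.sha256(content).digest())
    return digest.hexdigest()
```

Because `git diff HEAD` compares the working tree to HEAD directly, moving
a change between staged and unstaged does not alter the digest.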
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Carry execution evidence forward so reviewer/senior prompts in
  subsequent iterations can inspect prior transcript and command data
- Add {execution_evidence} to REVIEW_ONLY templates (en/ko)
- Add evidence summary table to iteration reports
- Fix test_agentic to match stdin-based prompt delivery for Claude
- Add expanded claim/no-change marker tests and cross-iteration
  evidence propagation tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>