feat(ultraplan-local): v1.7.0 — self-verifying plan chain

Wave 1 of a 6-session parallel build revealed three failure modes: (1) hallucinated completion (status=completed after 2/5 steps, last tool call was an arbitrary file review), (2) fail-late bash (3/6 sessions had push blocked inside sub-agent sandbox after all work was done), (3) no objective verification (plans were prose). v1.7 closes all three by making the plan an executable contract. Per-step YAML manifest (expected_paths, commit_message_pattern, bash_syntax_check, forbidden_paths, must_contain) is the objective completion predicate. Plan-critic dimension 10 (Manifest quality) is a hard gate. Session decomposer propagates manifests verbatim and emits an obligatory Step 0 pre-flight (git push --dry-run, exit 77 sentinel) in every session spec. ultraexecute-local gets Phase 7.5 (independent manifest audit from git log + filesystem, ignoring agent bookkeeping) and Phase 7.6 (bounded recovery dispatch, recovery_depth ≤ 2). Hard Rule 17 forbids marking a step passed without manifest verification. Hard Rule 18 forbids ending on an arbitrary tool call before reporting. Division of labor is made explicit: - /ultraresearch-local gathers context (no build decisions) - /ultraplan-local produces an executable contract (manifests, plan-critic gate) - /ultraexecute-local executes disciplined (does NOT compensate for weak plans — escalates) Code complete. Docs partial (Arbeidsdeling table + manifest section added to plugin + marketplace READMEs). Verification tests (10-sequence) pending — see REMEMBER.md. Backward compat: v1.6 plans without plan_version marker get legacy mode with synthesized manifests and legacy_plan: true in progress file. Plan-critic emits advisory, not block. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-12 07:38:16 +02:00 · 2026-04-12 07:38:16 +02:00 · d1befac35a
commit d1befac35a
parent 72f2e8f6c9
11 changed files with 651 additions and 27 deletions
--- a/plugins/ultraplan-local/templates/headless-launch-template.md
+++ b/plugins/ultraplan-local/templates/headless-launch-template.md
@ -47,6 +47,27 @@ if [ -n "$(git status --porcelain)" ]; then
  exit 1
 fi

+# Pre-flight: verify remote push permissions (catches credential/auth issues
+# BEFORE spawning sessions). Sub-agent bash sandbox may have different
+# credentials than the launching shell — Step 0 in each session spec handles
+# the sandbox-side detection. Set ULTRAEXECUTE_SKIP_PREFLIGHT=1 for offline
+# or air-gapped testing.
+if [ "${ULTRAEXECUTE_SKIP_PREFLIGHT:-0}" != "1" ]; then
+  if ! git push --dry-run origin HEAD >/tmp/push-dryrun-launch.log 2>&1; then
+    echo "ERROR: git push --dry-run failed. Sessions will be unable to push."
+    cat /tmp/push-dryrun-launch.log
+    echo ""
+    echo "Fix remote credentials before running parallel execution, or set"
+    echo "ULTRAEXECUTE_SKIP_PREFLIGHT=1 to bypass (offline/air-gapped only)."
+    exit 1
+  fi
+  if grep -qE "(rejected|denied|forbidden|permission)" /tmp/push-dryrun-launch.log; then
+    echo "ERROR: git push --dry-run reports rejection. Sessions will fail at commit time."
+    cat /tmp/push-dryrun-launch.log
+    exit 1
+  fi
+fi
+
 echo "=== Ultraplan Headless Execution (Worktree-Isolated) ==="
 echo "Plan: {plan_path}"
 echo "Sessions: {total_sessions}"
--- a/plugins/ultraplan-local/templates/plan-template.md
+++ b/plugins/ultraplan-local/templates/plan-template.md
@ -2,7 +2,7 @@

 > **Plan quality: {grade}** ({score}/100) — {APPROVE | APPROVE_WITH_NOTES | REVISE | REPLAN}
 >
-> Generated by ultraplan-local v{version} on {YYYY-MM-DD}
+> Generated by ultraplan-local v{version} on {YYYY-MM-DD} — `plan_version: 1.7`

 ## Context

@ -56,6 +56,17 @@ when the project has tests.
 - **Verify:** `{exact command}` → expected: `{output}`
 - **On failure:** {revert | retry | skip | escalate} — {specific instructions}
 - **Checkpoint:** `git commit -m "{conventional commit message}"`
+- **Manifest:**
+  ```yaml
+  manifest:
+    expected_paths:
+      - path/to/file.ts
+    min_file_count: 1
+    commit_message_pattern: "^feat\\(scope\\):"
+    bash_syntax_check: []
+    forbidden_paths: []
+    must_contain: []
+  ```

 ### Step 2: {description}

@ -69,10 +80,43 @@ when the project has tests.
 - **Verify:** `{exact command}` → expected: `{output}`
 - **On failure:** {revert | retry | skip | escalate} — {specific instructions}
 - **Checkpoint:** `git commit -m "{conventional commit message}"`
+- **Manifest:**
+  ```yaml
+  manifest:
+    expected_paths:
+      - path/to/file.ts
+    min_file_count: 1
+    commit_message_pattern: "^feat\\(scope\\):"
+    bash_syntax_check: []
+    forbidden_paths: []
+    must_contain:
+      - path: path/to/file.ts
+        pattern: "expected content marker"
+  ```

 *For projects without tests: omit "Test first" and keep "Verify" with a
 concrete command (e.g., run the app, check output, curl an endpoint).*

+### Manifest — objective completion predicate
+
+Every step MUST have a Manifest block. This is the machine-checkable contract
+that ultraexecute-local verifies after the Verify command passes. A step is
+not considered complete until its manifest verifies — regardless of Verify
+command exit code.
+
+- **expected_paths** — files that must exist after this step. Existing files
+  must be present in repo; new files must be marked `(new file)` in prose.
+- **min_file_count** — minimum number of expected_paths that must exist.
+  Typically equal to `len(expected_paths)`.
+- **commit_message_pattern** — regex that MUST match the HEAD commit message
+  after Checkpoint runs. Use escaped regex syntax (e.g., `\\(scope\\)`).
+- **bash_syntax_check** — list of `.sh` files that must pass `bash -n`.
+  Auto-include any `.sh` in expected_paths.
+- **forbidden_paths** — files this step must NOT modify (defense-in-depth
+  beyond Scope Fence).
+- **must_contain** — optional grep assertions: `path` + `pattern` pairs that
+  must match in created/modified files.
+
 ### Failure recovery rules

 - **On failure: revert** — undo this step's changes (`git checkout -- {files}`), do NOT proceed
@ -121,7 +165,10 @@ before execution.*

 ## Verification

-End-to-end checks that prove the plan was implemented correctly.
+*Per-step manifest verification runs automatically during execution (every
+step's Manifest block is objectively checked by ultraexecute-local before the
+step is marked passed). This section is for end-to-end integration checks
+that cross step boundaries — complete workflows, system-level behavior.*

 - [ ] `{exact command}` → expected: `{exact output or behavior}`
 - [ ] `{exact command}` → expected: `{exact output or behavior}`
@ -179,7 +226,8 @@ later waves depend on earlier waves completing first.*
 | Coverage completeness | 0.20 | {0–100} | {spec → steps, no gaps} |
 | Specification quality | 0.15 | {0–100} | {no placeholders, clear criteria} |
 | Risk & pre-mortem | 0.15 | {0–100} | {failure modes addressed} |
-| Headless readiness | 0.15 | {0–100} | {On failure + Checkpoint per step} |
+| Headless readiness | 0.10 | {0–100} | {On failure + Checkpoint per step} |
+| Manifest quality | 0.05 | {0–100} | {all steps have valid, checkable manifests} |
 | **Weighted total** | **1.00** | **{score}** | **Grade: {A/B/C/D}** |

 **Adversarial review:**
--- a/plugins/ultraplan-local/templates/session-spec-template.md
+++ b/plugins/ultraplan-local/templates/session-spec-template.md
@ -20,8 +20,61 @@ the purpose and make judgment calls.}
 - **Touch:** {explicit list of files this session may create or modify}
 - **Never touch:** {files that belong to other sessions — hard boundary}

+## Session Manifest
+
+Machine-readable aggregate of all step manifests in this session. Used by
+ultraexecute-local for independent Phase 7.5 audit.
+
+```yaml
+session_manifest:
+  plan_version: "1.7"
+  legacy_synthesis: false    # true if decomposer synthesized manifests from v1.6 plan
+  expected_paths:            # union across all steps (deduplicated)
+    - {path from step N}
+    - {path from step M}
+  commit_count: {N}          # number of implementation steps (excludes Step 0)
+  commit_message_patterns:   # in step order; Step 0 omitted
+    - "^feat\\(scope\\):"
+    - "^fix\\(scope\\):"
+  bash_syntax_check: []      # union of step bash_syntax_check
+  scope_touch: []            # from Scope Fence Touch
+  scope_forbidden: []        # Never touch + union of step forbidden_paths
+```
+
 ## Steps

+### Step 0: Sandbox pre-flight (auto-generated — do not modify)
+
+- **Files:** none (read-only test)
+- **Changes:** verify git push permissions are available in this sandbox
+- **Verify:**
+  ```
+  git push --dry-run origin HEAD 2>&1 | tee /tmp/push-dryrun-$$.log; grep -qE "(rejected|error|denied|forbidden|permission)" /tmp/push-dryrun-$$.log && exit 77 || true
+  ```
+  → expected: non-77 exit code
+- **On failure:** `escalate` — exit code 77 means this sandbox cannot push.
+  Abort immediately; do not attempt any work. Main orchestrator will
+  re-spawn with correct permissions.
+- **Checkpoint:** none (no file changes)
+- **Manifest:**
+  ```yaml
+  manifest:
+    expected_paths: []
+    min_file_count: 0
+    commit_message_pattern: ""
+    bash_syntax_check: []
+    forbidden_paths: []
+    must_contain: []
+    sandbox_preflight: true
+  ```
+
+*Step 0 runs in the same sandbox as all real work. If it exits 77,
+ultraexecute-local marks the session `blocked` and does NOT proceed. This
+catches the fail-late push-denial mode observed in Wave 1.*
+
+*Escape hatch:* set `ULTRAEXECUTE_SKIP_PREFLIGHT=1` in the environment to
+bypass Step 0 (use only for offline/air-gapped testing).
+
 ### Step 1: {description}

 - **Files:** `{path}`
@ -31,10 +84,21 @@ the purpose and make judgment calls.}
 - **Verify:** `{exact command}` → expected: `{output}`
 - **On failure:** {revert | retry | skip | escalate} — {specific instructions}
 - **Checkpoint:** `git commit -m "{message}"`
+- **Manifest:**
+  ```yaml
+  manifest:
+    expected_paths:
+      - {path}
+    min_file_count: 1
+    commit_message_pattern: "^feat\\(scope\\):"
+    bash_syntax_check: []
+    forbidden_paths: []
+    must_contain: []
+  ```

 ### Step 2: {description}

-{same structure as Step 1}
+{same structure as Step 1, including Manifest block}

 ## Exit Condition

@ -78,3 +142,14 @@ introduced. This section bridges sessions — it's the "baton" in a relay race.}
 - **Steps from plan:** {step N}–{step M}
 - **Estimated complexity:** {low | medium | high}
 - **Model recommendation:** {opus | sonnet} — {rationale}
+
+## Recovery Metadata
+
+*This section is populated only when this session spec was generated by the
+ultraexecute-local Phase 7.6 recovery dispatcher. Omit for normal sessions.*
+
+- **Recovery of:** `{original session spec path}`
+- **Recovery depth:** {1 | 2}
+- **Missing steps (reason for recovery):** {step numbers + drift summary}
+- **Entry condition override:** {e.g., "previous partial session committed at {sha}"}
+- **Parent progress file:** `{path to .ultraexecute-progress-*.json}`