feat(ultraplan-local): v1.7.0 — self-verifying plan chain

Wave 1 of a 6-session parallel build revealed three failure modes: (1) hallucinated completion (status=completed after 2/5 steps, last tool call was an arbitrary file review), (2) fail-late bash (3/6 sessions had push blocked inside sub-agent sandbox after all work was done), (3) no objective verification (plans were prose). v1.7 closes all three by making the plan an executable contract. Per-step YAML manifest (expected_paths, commit_message_pattern, bash_syntax_check, forbidden_paths, must_contain) is the objective completion predicate. Plan-critic dimension 10 (Manifest quality) is a hard gate. Session decomposer propagates manifests verbatim and emits an obligatory Step 0 pre-flight (git push --dry-run, exit 77 sentinel) in every session spec. ultraexecute-local gets Phase 7.5 (independent manifest audit from git log + filesystem, ignoring agent bookkeeping) and Phase 7.6 (bounded recovery dispatch, recovery_depth ≤ 2). Hard Rule 17 forbids marking a step passed without manifest verification. Hard Rule 18 forbids ending on an arbitrary tool call before reporting. Division of labor is made explicit: - /ultraresearch-local gathers context (no build decisions) - /ultraplan-local produces an executable contract (manifests, plan-critic gate) - /ultraexecute-local executes disciplined (does NOT compensate for weak plans — escalates) Code complete. Docs partial (Arbeidsdeling table + manifest section added to plugin + marketplace READMEs). Verification tests (10-sequence) pending — see REMEMBER.md. Backward compat: v1.6 plans without plan_version marker get legacy mode with synthesized manifests and legacy_plan: true in progress file. Plan-critic emits advisory, not block. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-12 07:38:16 +02:00 · 2026-04-12 07:38:16 +02:00 · d1befac35a
commit d1befac35a
parent 72f2e8f6c9
11 changed files with 651 additions and 27 deletions
--- a/plugins/ultraplan-local/.claude-plugin/plugin.json
+++ b/plugins/ultraplan-local/.claude-plugin/plugin.json
@ -1,7 +1,7 @@
 {
  "name": "ultraplan-local",
  "description": "Deep implementation planning and research with interview, specialized agent swarms, external research, triangulation, adversarial review, session decomposition, and headless execution support.",
-  "version": "1.6.0",
+  "version": "1.7.0",
  "author": {
    "name": "Kjell Tore Guttormsen"
  },
--- a/plugins/ultraplan-local/CHANGELOG.md
+++ b/plugins/ultraplan-local/CHANGELOG.md
@ -4,6 +4,90 @@ All notable changes to this project will be documented in this file.

 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

+## [1.7.0] - 2026-04-12
+
+### The self-verifying plan chain
+
+Wave 1 of a parallel 6-session build revealed three failure modes: (1) a
+session reported `status=completed` after only 2/5 steps — last tool call
+was an arbitrary file review, not a completion check; (2) 3/6 sessions
+had push blocked inside the sub-agent bash sandbox *after* all work was
+done; (3) plans and blueprints were prose, so the orchestrator had no
+machine-readable way to verify completion. v1.7.0 closes all three by
+making the plan itself an executable contract.
+
+### Added
+
+- **Per-step verification manifest** in plan format (`plan_version: 1.7`).
+  Every step now ends with a YAML `manifest:` block declaring
+  `expected_paths`, `min_file_count`, `commit_message_pattern`,
+  `bash_syntax_check`, `forbidden_paths`, `must_contain`. The manifest is
+  the objective completion predicate — the Verify command is necessary but
+  not sufficient.
+- **Plan-critic dimension 10 — Manifest quality (hard gate).** Missing
+  or invalid manifest (unparseable regex, path contradiction, missing
+  block) is a `major` finding. v1.6 plans get a legacy-mode warning
+  instead of a block.
+- **Session Manifest aggregate** in session specs — synthesized by
+  `session-decomposer` as the union of per-step manifests. Gives
+  `ultraexecute-local` a single YAML block per session to audit against.
+- **Step 0: Sandbox pre-flight** — obligatory first step in every
+  generated session spec. Runs `git push --dry-run origin HEAD`; exit 77
+  = sandbox cannot push, session status becomes `blocked` (not `failed`),
+  no real work attempted. Escape hatch: `ULTRAEXECUTE_SKIP_PREFLIGHT=1`.
+- **Launch script pre-flight** — `headless-launch-template.md` adds a
+  `git push --dry-run` check outside the sandbox, before any session
+  spawns, catching credential issues at the earliest possible point.
+- **Phase 7.5 — Manifest audit (independent).** After all steps complete,
+  `ultraexecute-local` re-verifies expected paths, commit count, commit
+  message patterns, bash syntax, and forbidden-path untouched-ness from
+  git log and filesystem. Agent's own bookkeeping is ignored. Disagreement
+  with progress file → status overridden to `partial`.
+- **Phase 7.6 — Recovery dispatch (bounded).** When Phase 7.5 detects
+  drift in multi-session parent context, synthesize a temp session spec
+  containing only missing steps and dispatch via existing
+  `claude -p "/ultraexecute-local --session N"`. `recovery_depth ≤ 2`
+  hard cap — third drift escalates to user.
+- **Hard Rule 17: Manifest is the completion predicate.** A step may
+  not be marked passed if its manifest does not verify, regardless of
+  Verify's exit code.
+- **Hard Rule 18: Last-activity rule.** Executor's final tool call
+  before Phase 8 must be a manifest check, never an arbitrary file
+  review. Prevents hallucinated completion.
+
+### Changed
+
+- **Plan template (`templates/plan-template.md`)** — adds
+  `plan_version: 1.7` metadata line, `Manifest:` field on every step,
+  "Manifest — objective completion predicate" section.
+- **Plan-critic scoring** rebalanced: Headless readiness 0.15 → 0.10,
+  Manifest quality 0.05 added. Legacy v1.6 plans skip the Manifest
+  dimension and keep Headless readiness at 0.15.
+- **Planning-orchestrator Phase 5** adds "Manifest generation rules
+  (REQUIRED for every step)" with mechanical derivation from `Files:`
+  and Checkpoint. Validates regex compilation and path existence before
+  handoff to plan-critic.
+- **Session-decomposer** parses plan manifests and propagates them
+  verbatim into session specs. For v1.7+ plans with missing manifests:
+  abort with pointer to failing step. For legacy v1.6 plans: synthesize
+  minimal manifests and flag `legacy_synthesis: true`.
+- **ultraexecute-local Phase 2** parses manifest YAML. Ugyldig YAML =
+  abort with pointer to step. v<1.7 plans: synthesize + log
+  `legacy_plan: true`.
+- **ultraexecute-local Phase 6** — sub-step D renamed to D1 "Command
+  verification"; new D2 "Manifest verification" runs after D1 with 5
+  checks. F "Checkpoint" adds `checkpoint_drift` logging when HEAD
+  message doesn't match `commit_message_pattern` (non-fatal).
+- **Phase 8 report** — table gets Manifest column; JSON summary adds
+  `plan_version`, `manifest_audit`, `drift_details`, `recovery_dispatched`,
+  `recovery_depth`, `legacy_plan`. Result vocabulary strict:
+  `completed | partial | blocked | failed | stopped`.
+- **Division of labor clarified** in README — `/ultraresearch-local`
+  gathers context (no decisions), `/ultraplan-local` transforms intent
+  into an executable contract (manifests, plan-critic gate),
+  `/ultraexecute-local` executes the contract disciplined (does NOT
+  compensate for weak plans — escalates).
+
 ## [1.6.0] - 2026-04-08

 ### Added
--- a/plugins/ultraplan-local/README.md
+++ b/plugins/ultraplan-local/README.md
@ -1,6 +1,6 @@
 # ultraplan-local — Plan, Research, Execute

-![Version](https://img.shields.io/badge/version-1.6.0-blue)
+![Version](https://img.shields.io/badge/version-1.7.0-blue)
 ![License](https://img.shields.io/badge/license-MIT-green)
 ![Platform](https://img.shields.io/badge/platform-Claude%20Code-purple)

@ -14,6 +14,22 @@ A [Claude Code](https://docs.anthropic.com/en/docs/claude-code) plugin for deep

 Research first, then plan, then execute. Or skip research and plan directly. Or plan and execute in one flow. The plan is the contract between planning and execution. The research brief is the contract between research and planning.

+### Division of labor
+
+| Command | Responsibility | Output |
+|---|---|---|
+| `/ultraresearch-local` | **Gather context** — code state, external docs, community, risk. Makes NO build decisions. | Research brief (markdown) |
+| `/ultraplan-local` | **Transform intent into an executable contract** — per-step YAML manifest, regex-validated checkpoints, verifiable paths. Plan-critic is a hard gate. | `plan.md` with Manifest blocks + `plan_version: 1.7` |
+| `/ultraexecute-local` | **Execute the contract disciplined** — fresh verification, independent manifest audit, honest reporting. Does NOT compensate for weak plans — escalates. | Progress file + structured report + manifest-audit status |
+
+**Principle:** Execute trusts the plan. If execute has to guess or interpret, the plan is weak and must be revised upstream — not patched downstream.
+
+### Self-verifying plan chain (v1.7)
+
+Every step in a v1.7 plan ends with a YAML `manifest:` block declaring `expected_paths`, `commit_message_pattern`, `bash_syntax_check`, `forbidden_paths`, `must_contain`. This makes the plan the **objective completion predicate**: a step may not be marked passed if its manifest does not verify, regardless of the Verify command's exit code (Hard Rule 17).
+
+After all steps complete, `ultraexecute-local` runs **Phase 7.5 — Manifest audit (independent)**: re-verifies every expected path from git log + filesystem, ignoring the agent's own bookkeeping. Drift → status `partial`, **Phase 7.6** auto-dispatches a bounded recovery session with only the missing steps (`recovery_depth ≤ 2`). Step 0 pre-flight (`git push --dry-run`) runs inside every session sandbox before any real work — exit 77 sentinel catches sandbox push-denial before the agent wastes the whole budget.
+
 No cloud dependency. No GitHub requirement. Works on **Mac, Linux, and Windows**.

 ## Quick start
--- a/plugins/ultraplan-local/agents/plan-critic.md
+++ b/plugins/ultraplan-local/agents/plan-critic.md
@ -113,6 +113,42 @@ Steps missing On failure or Checkpoint clauses are **major** findings
 (not blockers — the plan is still valid for interactive use, but it
 cannot be decomposed into headless sessions).

+### 10. Manifest quality (hard gate)
+
+Manifests are the objective completion predicate. ultraexecute-local uses
+them to determine whether a step is actually done — not just whether the
+Verify command returned 0. A plan without valid manifests cannot drive
+deterministic execution.
+
+Check plans with `plan_version: 1.7` (or later) against these rules:
+
+- Does EVERY step have a `Manifest:` block with YAML content?
+- Are `expected_paths` entries all either existing files OR explicitly marked
+  `(new file)` in the step's Changes prose?
+- Is `expected_paths` a subset of `Files:` (no orphan paths)?
+- Does `commit_message_pattern` compile as a valid regex? (check with a
+  mental regex-parse — e.g., unbalanced `(`, `[` is invalid)
+- Does the `commit_message_pattern` actually match the literal Checkpoint
+  commit message declared in the step?
+- Are all `bash_syntax_check` entries `.sh` files that appear in
+  `expected_paths` (not references to external scripts)?
+- Do `forbidden_paths` avoid overlap with `expected_paths` (contradiction)?
+- Does the step create shell scripts that are NOT listed in
+  `bash_syntax_check`? (minor finding — suggests incomplete manifest)
+
+**Severity:**
+- Missing Manifest block on any step → **major** (same tier as missing On failure)
+- Invalid regex in commit_message_pattern → **major**
+- Pattern doesn't match declared Checkpoint → **major**
+- `expected_paths` references non-existent path not marked new → **major**
+- `forbidden_paths` overlaps `expected_paths` → **blocker** (contradiction)
+- Missing bash_syntax_check for declared `.sh` files → **minor**
+
+**Backward compat:** For plans without `plan_version: 1.7` (legacy), emit
+a single advisory note ("Plan is v1.6 legacy format — manifests will be
+synthesized by ultraexecute with reduced audit precision") and skip this
+dimension's scoring.
+
 ## Rating system

 Rate each finding:
@ -131,10 +167,15 @@ After reviewing all findings, produce a quantitative score:
 | Coverage completeness | 0.20 | Spec-to-steps mapping, no gaps |
 | Specification quality | 0.15 | No placeholders, clear criteria |
 | Risk & pre-mortem | 0.15 | Failure modes addressed, mitigations realistic |
-| Headless readiness | 0.15 | On failure clauses, checkpoints, circuit breakers |
+| Headless readiness | 0.10 | On failure clauses, checkpoints, circuit breakers |
+| Manifest quality | 0.05 | Every step has a valid, checkable manifest (v1.7+) |

 Score each dimension 0–100, then compute the weighted total.

+**Weighting note (v1.7):** Headless readiness reduced 0.15→0.10, Manifest
+quality added at 0.05. Total still 1.00. For legacy v1.6 plans, Manifest
+quality is not scored and Headless readiness returns to 0.15.
+
 **Grade thresholds:**
 - **A** (90–100): APPROVE
 - **B** (75–89): APPROVE_WITH_NOTES
@ -166,7 +207,8 @@ Score each dimension 0–100, then compute the weighted total.
 | Coverage completeness | 0.20 | {0–100} | {assessment} |
 | Specification quality | 0.15 | {0–100} | {assessment} |
 | Risk & pre-mortem | 0.15 | {0–100} | {assessment} |
-| Headless readiness | 0.15 | {0–100} | {assessment} |
+| Headless readiness | 0.10 | {0–100} | {assessment} |
+| Manifest quality | 0.05 | {0–100} | {assessment — omit for legacy v1.6} |
 | **Weighted total** | **1.00** | **{score}** | **Grade: {A/B/C/D}** |

 ## Summary
--- a/plugins/ultraplan-local/agents/planning-orchestrator.md
+++ b/plugins/ultraplan-local/agents/planning-orchestrator.md
@ -186,6 +186,50 @@ Write a comprehensive implementation plan including:
 - Test Strategy (if test-strategist was used)
 - Verification (concrete commands), Estimated Scope

+**Plan-version header:** Include `plan_version: 1.7` in the metadata line below
+the title. This signals to ultraexecute-local that the plan includes per-step
+verification manifests and enables strict audit mode. Plans without this
+marker are treated as legacy v1.6 with synthesized minimal manifests.
+
+### Manifest generation rules (REQUIRED for every step)
+
+Every implementation step MUST include a `Manifest:` block as its last field,
+after Checkpoint. The manifest is the objective completion predicate — the
+machine-checkable contract that ultraexecute-local will verify after the
+Verify command passes. A step cannot be marked passed if its manifest does
+not verify.
+
+Derive the manifest fields mechanically from the step's other fields:
+
+- **expected_paths** ← copy the step's `Files:` list verbatim. Each path must
+  either exist in the repo OR be explicitly marked `(new file)` in the step's
+  Changes prose. Do not list paths that neither exist nor are declared new.
+- **min_file_count** ← default to `len(expected_paths)`. Lower only when the
+  step explicitly allows partial creation (rare).
+- **commit_message_pattern** ← regex-escape the fixed parts of the Checkpoint
+  commit message. Preserve Conventional Commit structure. Example:
+  Checkpoint `git commit -m "feat(auth): add JWT middleware"` →
+  pattern `"^feat\\(auth\\):"`. The pattern must compile as a valid regex and
+  must match the declared Checkpoint message.
+- **bash_syntax_check** ← auto-include every `.sh` file appearing in
+  expected_paths. Add other shell scripts the step creates transitively.
+- **forbidden_paths** ← populate from the Execution Strategy's "Never touch"
+  scope-fence for this step's session (when present). Defense-in-depth.
+- **must_contain** ← optional. Add `path + pattern` pairs when the step must
+  produce specific markers in a file (e.g., a new config section, a required
+  export, a migration boundary).
+
+**Validation before writing plan:**
+1. Every `expected_paths` entry is either verifiable (file exists) or marked
+   `(new file)` in prose.
+2. Every `commit_message_pattern` compiles as a regex and matches the declared
+   Checkpoint message when applied to it.
+3. Every `bash_syntax_check` entry has a `.sh` suffix and appears in
+   `expected_paths`.
+4. No `forbidden_paths` overlaps with `expected_paths` (contradiction).
+
+If any validation fails, fix the plan before handing to Phase 6 review.
+
 ### Failure recovery (REQUIRED for every step)

 Each implementation step MUST include:
@ -237,12 +281,19 @@ Create directories if needed.
 Launch two review agents **in parallel**:

 - `plan-critic` — find missing steps, wrong ordering, fragile assumptions,
-  missing error handling, scope creep, underspecified steps
+  missing error handling, scope creep, underspecified steps, AND manifest
+  quality (dimension 10: every step has a valid, regex-compilable,
+  path-verified manifest). Missing or invalid manifest = **major** finding.
 - `scope-guardian` — verify plan matches spec requirements, find scope
  creep and scope gaps, validate file/function references

 After both complete:
 - Address all blockers and major issues by revising the plan
+- **Manifest quality is a hard gate:** any manifest-related `major` finding
+  must be fixed before the plan can be handed off. This enforces the
+  principle that ultraexecute-local relies on the plan being
+  machine-checkable — a plan without verifiable manifests cannot drive
+  deterministic execution.
 - Add a "Revisions" note at the bottom documenting changes

 ### Phase 7 — Completion
--- a/plugins/ultraplan-local/agents/session-decomposer.md
+++ b/plugins/ultraplan-local/agents/session-decomposer.md
@ -50,9 +50,23 @@ Extract from the plan:
 3. Per-step dependencies (explicit or implicit from step ordering)
 4. Per-step verification commands
 5. Per-step failure recovery (if present)
-6. The overall verification section
-7. Context and codebase analysis sections
-8. Check for an existing `## Execution Strategy` section
+6. **Per-step verification manifest (v1.7+)** — the `Manifest:` YAML block
+   following Checkpoint. Parse it as YAML. Preserve all fields:
+   `expected_paths`, `min_file_count`, `commit_message_pattern`,
+   `bash_syntax_check`, `forbidden_paths`, `must_contain`.
+7. The overall verification section
+8. Context and codebase analysis sections
+9. The `plan_version` marker (if present in the header line)
+10. Check for an existing `## Execution Strategy` section
+
+**Manifest handling:**
+- If `plan_version: 1.7` or later AND any step is missing a Manifest block:
+  STOP with error "Plan claims v1.7 but step N lacks Manifest. Re-run
+  planning-orchestrator." Do not attempt to synthesize.
+- If no `plan_version` marker is present: treat as legacy v1.6. Synthesize
+  minimal manifests from `Files:` (expected_paths) and the Checkpoint commit
+  message (commit_message_pattern escaped). Mark output session specs with
+  `legacy_synthesis: true` in their Session Manifest.

 **If an Execution Strategy already exists:**
 - Log: "Existing Execution Strategy detected — using as primary input."
@ -135,6 +149,59 @@ For each session, write a spec file to the output directory:
 5. **Failure handling** — what to do on failure (copied from plan's On failure fields,
   or default to "stop and report")
 6. **Handoff state** — what this session produces that other sessions need
+7. **Per-step Manifest blocks** — copy each plan step's Manifest YAML verbatim
+   into the corresponding session-spec step. Do NOT edit or summarize.
+8. **Session Manifest aggregate** — synthesize a top-level `## Session Manifest`
+   block aggregating all per-step manifests in the session:
+   - `expected_paths`: union of all steps' expected_paths (deduplicated)
+   - `commit_count`: number of implementation steps in this session (excludes Step 0)
+   - `commit_message_patterns`: list of per-step patterns, in step order
+   - `bash_syntax_check`: union of all steps' bash_syntax_check
+   - `scope_touch`: from Scope Fence Touch (already present)
+   - `scope_forbidden`: from Scope Fence Never Touch + union of step
+     forbidden_paths
+   - `plan_version`: from the source plan
+   - `legacy_synthesis`: true/false based on Step 1's handling
+
+### Step 5.5 — Emit obligatory Step 0 pre-flight
+
+Every generated session spec MUST begin its `## Steps` list with a synthetic
+**Step 0: Sandbox pre-flight** that validates the subagent bash sandbox can
+reach the remote before any real work is done. This catches the fail-late
+push-denial observed in Wave 1 (3/6 sessions all lost their pushes at the
+very end).
+
+The Step 0 block to prepend verbatim:
+
+```markdown
+### Step 0: Sandbox pre-flight (auto-generated — do not modify)
+
+- **Files:** none (read-only test)
+- **Changes:** verify git push permissions are available in this sandbox
+- **Verify:**
+  ```
+  git push --dry-run origin HEAD 2>&1 | tee /tmp/push-dryrun-$$.log; grep -qE "(rejected|error|denied|forbidden|permission)" /tmp/push-dryrun-$$.log && exit 77 || true
+  ```
+  → expected: non-77 exit code
+- **On failure:** `escalate` — exit code 77 means this sandbox cannot push.
+  Abort immediately; do not attempt any work. Main orchestrator will
+  re-spawn with correct permissions.
+- **Checkpoint:** none (no file changes)
+- **Manifest:**
+  ```yaml
+  manifest:
+    expected_paths: []
+    min_file_count: 0
+    commit_message_pattern: ""
+    bash_syntax_check: []
+    forbidden_paths: []
+    must_contain: []
+    sandbox_preflight: true
+  ```
+```
+
+Do NOT skip Step 0 for any session. It is the only early-detection mechanism
+for sandbox-blocked bash.

 ### Step 6 — Generate the dependency diagram

--- a/plugins/ultraplan-local/commands/ultraexecute-local.md
+++ b/plugins/ultraplan-local/commands/ultraexecute-local.md
@ -87,10 +87,48 @@ Extract every `### Step N: {description}` heading (in order). For each step, ext
 - **Verify** — command to run after implementation
 - **On failure** — recovery action (revert/retry/skip/escalate)
 - **Checkpoint** — git commit command after success
+- **Manifest** — YAML block following Checkpoint (v1.7+)

 If a step is missing `On failure`, default to `escalate` and record a parse warning.
 If a step is missing `Verify`, record a parse warning.

+### Parse the Manifest block (v1.7+)
+
+Read the plan header for a `plan_version` marker. Treat ≥ `1.7` as strict
+mode; absence or <1.7 as legacy mode.
+
+**Strict mode (plan_version ≥ 1.7):**
+- Every step MUST have a `Manifest:` block with a YAML fenced payload.
+- Parse the YAML. Required keys: `expected_paths` (list), `min_file_count`
+  (integer), `commit_message_pattern` (string), `bash_syntax_check` (list),
+  `forbidden_paths` (list), `must_contain` (list of `{path, pattern}` dicts).
+- Validate each `commit_message_pattern` compiles as a regex (use a tolerant
+  parser — do not execute untrusted input against shell).
+- If any step is missing its Manifest, or YAML is malformed, STOP with:
+  ```
+  Error: plan_version=1.7 but step {N} has invalid/missing Manifest.
+  Re-run planning-orchestrator — plan is not executable.
+  ```
+
+**Legacy mode (no plan_version, or < 1.7):**
+- Synthesize a minimal manifest per step:
+  - `expected_paths` ← step's `Files:` list
+  - `min_file_count` ← `len(expected_paths)`
+  - `commit_message_pattern` ← regex-escape first 3 words of Checkpoint msg
+  - `bash_syntax_check` ← auto-detect `.sh` in Files
+  - `forbidden_paths` ← []
+  - `must_contain` ← []
+- Record `legacy_plan: true` in the progress file.
+- Emit warning: `Legacy plan (v1.6 or earlier) — manifests synthesized with
+  reduced audit precision.`
+
+### Parse Session Manifest (session specs only)
+
+If the file is a session spec (v1.7+), parse the `## Session Manifest` block
+into a YAML dict. Preserve `session_manifest.*` fields for Phase 7.5 audit.
+If missing and `plan_version ≥ 1.7`, record a parse warning but continue —
+the per-step manifests are still available for audit.
+
 ### Parse session spec fields (if applicable)

 - **Entry condition** from `## Dependencies`
@ -721,7 +759,7 @@ Read the step's `Files:` and `Changes:` fields. Implement exactly as described.
 - If `Reuses:` references existing code, read that code first for context
 - Only touch files listed in `Files:` — nothing else

-#### Sub-step D — Verification
+#### Sub-step D1 — Command verification

 **Security check (mandatory):** Before running the Verify command, check it against
 the executor security denylist. If the command matches ANY of these patterns,
@ -763,7 +801,50 @@ Result: {PASS | FAIL} (exit code {N})
 {if FAIL}: Output (first 10 lines): {output}
 ```

-If **PASS**: proceed to Sub-step F (checkpoint).
+**Step 0 sentinel:** if the Verify command exits `77` AND the step's manifest
+has `sandbox_preflight: true`, do NOT treat as a normal failure. Set step
+status = `blocked` and apply `On failure: escalate`. Skip D2 manifest
+verification (nothing to verify for a read-only sandbox test). Commit none —
+jump straight to Phase 7 with a structured "sandbox-blocked" reason.
+
+If **PASS**: proceed to Sub-step D2 (manifest verification).
+
+#### Sub-step D2 — Manifest verification
+
+After the Verify command passes, verify the step's Manifest block. This is
+the objective completion predicate: a step is not passed until its manifest
+holds, regardless of Verify exit code.
+
+**Checks to run (in order):**
+
+1. **Expected paths exist:** for each `expected_paths` entry, verify the
+   file exists in the repo. Count how many exist.
+2. **min_file_count satisfied:** count from (1) must be ≥ `min_file_count`.
+3. **Forbidden paths untouched:** for each `forbidden_paths` entry, run
+   `git diff --name-only HEAD~{attempts} HEAD -- {path}` (since this step
+   began). Any modified forbidden path = fail.
+4. **Bash syntax check:** for each entry in `bash_syntax_check` AND any
+   `.sh` file that appears in `git diff --name-only HEAD~1 HEAD` (safety
+   net for unlisted scripts), run `bash -n {path}`. Non-zero exit = fail.
+5. **must_contain patterns:** for each `{path, pattern}` pair, run
+   `grep -E "{pattern}" {path}`. No match = fail.
+
+**On manifest failure:**
+
+```
+Manifest verification FAIL for step {N}:
+- {which check failed with detail}
+```
+
+Apply the step's `On failure:` clause (same as Sub-step D1 failure). Increment
+attempts. Manifest failures count against the retry cap equally with command
+failures.
+
+If all manifest checks pass: proceed to Sub-step F (checkpoint).
+
+Record per-step manifest audit result in progress file:
+`steps.{N}.manifest_audit = "pass" | "fail"`,
+`steps.{N}.manifest_drift = [{check: reason, ...}]` on fail.

 #### Sub-step E — On failure handling

@ -816,8 +897,15 @@ The step's verification already passed — the commit is bookkeeping.
 Step {N}: PASS (committed: {hash})
 ```

+**Commit-message-pattern check:** after the commit succeeds, read the HEAD
+commit message (`git log -1 --pretty=%s`) and match it against the step's
+`commit_message_pattern`. Mismatch does NOT fail the step (verification
+already passed) but is recorded as `checkpoint_drift: {expected_pattern,
+actual_message}` in the progress file. Phase 7.5 audit reports drift as
+an advisory.
+
 Update progress: `steps.{N}.status = "passed"`, `steps.{N}.commit = {hash}`,
-`steps.{N}.completed_at = now`.
+`steps.{N}.completed_at = now`, `steps.{N}.manifest_audit = "pass"`.

 ### Step mode exit

@ -839,6 +927,96 @@ Exit condition check:
 If all pass: `exit_condition_checked: true` in progress file.
 If any fail: record which failed. Include in final report.

+## Phase 7.5 — Manifest audit (independent)
+
+**Runs for all modes except dry-run.** This is the last-line-of-defense
+check: it ignores the executor's own per-step bookkeeping and re-verifies
+session-wide state directly from the filesystem and git log. If the audit
+disagrees with the executor's self-report, the audit wins.
+
+This phase exists because agents can hallucinate completion. Phase 7.5
+produces an independent truth based on objective state, not self-narrative.
+
+**Steps:**
+
+1. **Enumerate expected paths:** aggregate `expected_paths` across every
+   step manifest (and the session_manifest, if present). Deduplicate.
+
+2. **Filesystem check:** for each expected_path, confirm the file exists.
+
+3. **Commit-count check:** count commits since session-start
+   (`git rev-list --count {start_sha}..HEAD`). Expected count =
+   number of steps with `status=passed` (excludes Step 0 if sandbox_preflight).
+
+4. **Commit-message pattern sweep:** walk `git log {start_sha}..HEAD` and
+   confirm each commit matches one of the declared
+   `commit_message_patterns` (in any order — order is advisory).
+
+5. **Bash syntax sweep:** for every `.sh` file in
+   `git diff --name-only {start_sha} HEAD`, run `bash -n`. Collect failures.
+
+6. **Forbidden-path sweep:** for each `scope_forbidden` entry, check
+   `git diff --name-only {start_sha} HEAD -- {path}` is empty.
+
+Compare the audit result against the executor's `progress.status`:
+
+- **Audit pass + progress=completed:** status stays `completed`.
+- **Audit fail + progress=completed:** OVERRIDE to `partial`. The executor
+  believed it was done; the filesystem says otherwise. This is the Wave 1
+  hallucination case — the audit is the defense.
+- **Audit fail + progress=failed/stopped:** status stays as-is; drift is
+  informational.
+
+Record in progress file:
+- `manifest_audit.status = "pass" | "drift"`
+- `manifest_audit.drift_details = [{check, expected, actual}, ...]`
+
+## Phase 7.6 — Recovery dispatch (multi-session parent context only)
+
+**Preconditions:**
+- This is the parent ultraexecute invocation (not a child `--session N`)
+- Phase 7.5 reported `drift`
+- `recovery_depth < 2` (hard cap to prevent infinite loops)
+
+**Skip otherwise.** Recovery in child context or at depth 2+ escalates to
+the user for manual resolution.
+
+**Synthesize a recovery session spec:**
+
+1. Determine the missing step numbers from `manifest_audit.drift_details`
+   (steps whose expected_paths are absent or whose commits are missing).
+2. Read the original session spec. Copy only the missing steps (preserving
+   their Manifest blocks) into a new file:
+   `{output_dir}/session-{N}-recovery-{depth}.md`
+3. Populate `## Recovery Metadata`:
+   - `recovery_of: {original session spec path}`
+   - `recovery_depth: {current depth + 1}`
+   - `missing_steps: [N, M, ...]`
+   - `entry_condition_override: "previous partial session committed at {sha}"`
+   - `parent_progress_file: {path}`
+4. Prepend the synthetic Step 0 pre-flight (same as normal session specs).
+5. Set `recovery_dispatched = true` in parent progress file.
+
+**Invoke the recovery session:**
+
+```bash
+cd "$WORKTREE_PATH" && claude -p "/ultraexecute-local --session {N} {recovery spec path}" \
+  --allowedTools "Read,Write,Edit,Bash,Glob,Grep" \
+  --permission-mode bypassPermissions \
+  > "$LOG_DIR/session-{N}-recovery-{depth}.log" 2>&1
+```
+
+Wait for the recovery session to complete. After it returns, re-run Phase
+7.5 audit one more time. If it still drifts at `recovery_depth=2`:
+
+```
+RECOVERY EXHAUSTED: session {N} drifted after 2 recovery attempts.
+Missing: {steps and details}
+Status: partial (recovery_depth=2, escalated to user)
+```
+
+Do NOT dispatch a third recovery. Report to the user.
+
 ## Phase 8 — Final report

 Always produce a final report.
@ -855,11 +1033,20 @@ Update progress file: `status` to `completed`/`failed`/`stopped`, `updated_at`,

 ### Step Results

-| Step | Description | Result | Attempts | Commit |
-|------|-------------|--------|----------|--------|
-| 1 | {desc} | PASS | 1 | abc1234 |
-| 2 | {desc} | FAIL | 3 | — |
-| 3 | {desc} | — | 0 | — |
+| Step | Description | Result | Attempts | Commit | Manifest |
+|------|-------------|--------|----------|--------|----------|
+| 0 | Sandbox pre-flight | PASS | 1 | — | n/a |
+| 1 | {desc} | PASS | 1 | abc1234 | pass |
+| 2 | {desc} | FAIL | 3 | — | — |
+| 3 | {desc} | — | 0 | — | — |
+
+### Manifest Audit (Phase 7.5)
+
+- **Status:** {pass | drift}
+- **Drift details:** {enumerated; empty on pass}
+- **Recovery dispatched:** {true | false}
+- **Recovery depth:** {N}
+- **Legacy plan:** {true | false}

 ### Summary

@ -867,6 +1054,7 @@ Update progress file: `status` to `completed`/`failed`/`stopped`, `updated_at`,
 - Skipped: {N}
 - Failed: {N}
 - Not reached: {N}
+- Blocked (sandbox): {N}

 {if all passed + exit condition passed}:
 All steps completed. Exit condition: PASS.
@ -886,6 +1074,14 @@ Attempts: {N}
 To resume: /ultraexecute-local --resume {path}
 ```

+**Result vocabulary (v1.7, strict):**
+- `completed` — all steps passed AND Phase 7.5 manifest audit passed
+- `partial` — steps passed per executor but Phase 7.5 found drift, OR
+  Phase 7.6 recovery incomplete
+- `blocked` — Step 0 sandbox pre-flight exited 77; no real work attempted
+- `failed` — a step failed and On failure was revert/retry (retry cap hit)
+- `stopped` — On failure: escalate triggered
+
 **JSON summary block** (always at the end, machine-parseable):

 ```json
@ -893,14 +1089,21 @@ To resume: /ultraexecute-local --resume {path}
  "ultraexecute_summary": {
    "plan": "{path}",
    "plan_type": "{plan | session-spec}",
-    "result": "{completed | failed | stopped | partial}",
+    "plan_version": "{1.7 | 1.6 | legacy}",
+    "result": "{completed | partial | blocked | failed | stopped}",
    "steps_total": 0,
    "steps_passed": 0,
    "steps_failed": 0,
    "steps_skipped": 0,
    "steps_not_reached": 0,
+    "steps_blocked": 0,
    "failed_at_step": null,
    "exit_condition": "{pass | fail | skipped | n/a}",
+    "manifest_audit": "{pass | drift | n/a}",
+    "drift_details": [],
+    "recovery_dispatched": false,
+    "recovery_depth": 0,
+    "legacy_plan": false,
    "progress_file": "{path}"
  }
 }
@ -995,3 +1198,18 @@ Never let stats failures block the workflow.
    (git hook injection), `~/.ssh/`, `~/.aws/`, `~/.gnupg/`, `.env` files,
    shell configs (`~/.zshrc`, `~/.bashrc`, `~/.profile`), or
    `.claude/settings.json` / `.claude/hooks/` (infrastructure self-modification).
+
+17. **Manifest is the completion predicate.** A step may not be marked passed
+    if its manifest does not verify, regardless of the Verify command's exit
+    code. The manifest is the objective contract — Verify is necessary but
+    not sufficient. For v1.7+ plans: a Verify pass with a manifest fail sets
+    step result to `failed` and triggers the On-failure clause. For legacy
+    v1.6 plans: synthesized manifests apply with the same force, but
+    `legacy_plan: true` is logged in progress.
+
+18. **Last-activity rule.** The executor's final tool call before writing
+    Phase 8 must be a manifest check (Phase 7.5 audit), never an arbitrary
+    file review. This prevents the "hallucinated completion" failure mode
+    where a transcript ends on an unrelated Read and the agent self-reports
+    `completed` without verifying. If Phase 7.5 has not run, the executor
+    may not emit `result: completed` under any circumstances.
--- a/plugins/ultraplan-local/templates/headless-launch-template.md
+++ b/plugins/ultraplan-local/templates/headless-launch-template.md
@ -47,6 +47,27 @@ if [ -n "$(git status --porcelain)" ]; then
  exit 1
 fi

+# Pre-flight: verify remote push permissions (catches credential/auth issues
+# BEFORE spawning sessions). Sub-agent bash sandbox may have different
+# credentials than the launching shell — Step 0 in each session spec handles
+# the sandbox-side detection. Set ULTRAEXECUTE_SKIP_PREFLIGHT=1 for offline
+# or air-gapped testing.
+if [ "${ULTRAEXECUTE_SKIP_PREFLIGHT:-0}" != "1" ]; then
+  if ! git push --dry-run origin HEAD >/tmp/push-dryrun-launch.log 2>&1; then
+    echo "ERROR: git push --dry-run failed. Sessions will be unable to push."
+    cat /tmp/push-dryrun-launch.log
+    echo ""
+    echo "Fix remote credentials before running parallel execution, or set"
+    echo "ULTRAEXECUTE_SKIP_PREFLIGHT=1 to bypass (offline/air-gapped only)."
+    exit 1
+  fi
+  if grep -qE "(rejected|denied|forbidden|permission)" /tmp/push-dryrun-launch.log; then
+    echo "ERROR: git push --dry-run reports rejection. Sessions will fail at commit time."
+    cat /tmp/push-dryrun-launch.log
+    exit 1
+  fi
+fi
+
 echo "=== Ultraplan Headless Execution (Worktree-Isolated) ==="
 echo "Plan: {plan_path}"
 echo "Sessions: {total_sessions}"
--- a/plugins/ultraplan-local/templates/plan-template.md
+++ b/plugins/ultraplan-local/templates/plan-template.md
@ -2,7 +2,7 @@

 > **Plan quality: {grade}** ({score}/100) — {APPROVE | APPROVE_WITH_NOTES | REVISE | REPLAN}
 >
-> Generated by ultraplan-local v{version} on {YYYY-MM-DD}
+> Generated by ultraplan-local v{version} on {YYYY-MM-DD} — `plan_version: 1.7`

 ## Context

@ -56,6 +56,17 @@ when the project has tests.
 - **Verify:** `{exact command}` → expected: `{output}`
 - **On failure:** {revert | retry | skip | escalate} — {specific instructions}
 - **Checkpoint:** `git commit -m "{conventional commit message}"`
+- **Manifest:**
+  ```yaml
+  manifest:
+    expected_paths:
+      - path/to/file.ts
+    min_file_count: 1
+    commit_message_pattern: "^feat\\(scope\\):"
+    bash_syntax_check: []
+    forbidden_paths: []
+    must_contain: []
+  ```

 ### Step 2: {description}

@ -69,10 +80,43 @@ when the project has tests.
 - **Verify:** `{exact command}` → expected: `{output}`
 - **On failure:** {revert | retry | skip | escalate} — {specific instructions}
 - **Checkpoint:** `git commit -m "{conventional commit message}"`
+- **Manifest:**
+  ```yaml
+  manifest:
+    expected_paths:
+      - path/to/file.ts
+    min_file_count: 1
+    commit_message_pattern: "^feat\\(scope\\):"
+    bash_syntax_check: []
+    forbidden_paths: []
+    must_contain:
+      - path: path/to/file.ts
+        pattern: "expected content marker"
+  ```

 *For projects without tests: omit "Test first" and keep "Verify" with a
 concrete command (e.g., run the app, check output, curl an endpoint).*

+### Manifest — objective completion predicate
+
+Every step MUST have a Manifest block. This is the machine-checkable contract
+that ultraexecute-local verifies after the Verify command passes. A step is
+not considered complete until its manifest verifies — regardless of Verify
+command exit code.
+
+- **expected_paths** — files that must exist after this step. Existing files
+  must be present in repo; new files must be marked `(new file)` in prose.
+- **min_file_count** — minimum number of expected_paths that must exist.
+  Typically equal to `len(expected_paths)`.
+- **commit_message_pattern** — regex that MUST match the HEAD commit message
+  after Checkpoint runs. Use escaped regex syntax (e.g., `\\(scope\\)`).
+- **bash_syntax_check** — list of `.sh` files that must pass `bash -n`.
+  Auto-include any `.sh` in expected_paths.
+- **forbidden_paths** — files this step must NOT modify (defense-in-depth
+  beyond Scope Fence).
+- **must_contain** — optional grep assertions: `path` + `pattern` pairs that
+  must match in created/modified files.
+
 ### Failure recovery rules

 - **On failure: revert** — undo this step's changes (`git checkout -- {files}`), do NOT proceed
@ -121,7 +165,10 @@ before execution.*

 ## Verification

-End-to-end checks that prove the plan was implemented correctly.
+*Per-step manifest verification runs automatically during execution (every
+step's Manifest block is objectively checked by ultraexecute-local before the
+step is marked passed). This section is for end-to-end integration checks
+that cross step boundaries — complete workflows, system-level behavior.*

 - [ ] `{exact command}` → expected: `{exact output or behavior}`
 - [ ] `{exact command}` → expected: `{exact output or behavior}`
@ -179,7 +226,8 @@ later waves depend on earlier waves completing first.*
 | Coverage completeness | 0.20 | {0–100} | {spec → steps, no gaps} |
 | Specification quality | 0.15 | {0–100} | {no placeholders, clear criteria} |
 | Risk & pre-mortem | 0.15 | {0–100} | {failure modes addressed} |
-| Headless readiness | 0.15 | {0–100} | {On failure + Checkpoint per step} |
+| Headless readiness | 0.10 | {0–100} | {On failure + Checkpoint per step} |
+| Manifest quality | 0.05 | {0–100} | {all steps have valid, checkable manifests} |
 | **Weighted total** | **1.00** | **{score}** | **Grade: {A/B/C/D}** |

 **Adversarial review:**
--- a/plugins/ultraplan-local/templates/session-spec-template.md
+++ b/plugins/ultraplan-local/templates/session-spec-template.md
@ -20,8 +20,61 @@ the purpose and make judgment calls.}
 - **Touch:** {explicit list of files this session may create or modify}
 - **Never touch:** {files that belong to other sessions — hard boundary}

+## Session Manifest
+
+Machine-readable aggregate of all step manifests in this session. Used by
+ultraexecute-local for independent Phase 7.5 audit.
+
+```yaml
+session_manifest:
+  plan_version: "1.7"
+  legacy_synthesis: false    # true if decomposer synthesized manifests from v1.6 plan
+  expected_paths:            # union across all steps (deduplicated)
+    - {path from step N}
+    - {path from step M}
+  commit_count: {N}          # number of implementation steps (excludes Step 0)
+  commit_message_patterns:   # in step order; Step 0 omitted
+    - "^feat\\(scope\\):"
+    - "^fix\\(scope\\):"
+  bash_syntax_check: []      # union of step bash_syntax_check
+  scope_touch: []            # from Scope Fence Touch
+  scope_forbidden: []        # Never touch + union of step forbidden_paths
+```
+
 ## Steps

+### Step 0: Sandbox pre-flight (auto-generated — do not modify)
+
+- **Files:** none (read-only test)
+- **Changes:** verify git push permissions are available in this sandbox
+- **Verify:**
+  ```
+  git push --dry-run origin HEAD 2>&1 | tee /tmp/push-dryrun-$$.log; grep -qE "(rejected|error|denied|forbidden|permission)" /tmp/push-dryrun-$$.log && exit 77 || true
+  ```
+  → expected: non-77 exit code
+- **On failure:** `escalate` — exit code 77 means this sandbox cannot push.
+  Abort immediately; do not attempt any work. Main orchestrator will
+  re-spawn with correct permissions.
+- **Checkpoint:** none (no file changes)
+- **Manifest:**
+  ```yaml
+  manifest:
+    expected_paths: []
+    min_file_count: 0
+    commit_message_pattern: ""
+    bash_syntax_check: []
+    forbidden_paths: []
+    must_contain: []
+    sandbox_preflight: true
+  ```
+
+*Step 0 runs in the same sandbox as all real work. If it exits 77,
+ultraexecute-local marks the session `blocked` and does NOT proceed. This
+catches the fail-late push-denial mode observed in Wave 1.*
+
+*Escape hatch:* set `ULTRAEXECUTE_SKIP_PREFLIGHT=1` in the environment to
+bypass Step 0 (use only for offline/air-gapped testing).
+
 ### Step 1: {description}

 - **Files:** `{path}`
@ -31,10 +84,21 @@ the purpose and make judgment calls.}
 - **Verify:** `{exact command}` → expected: `{output}`
 - **On failure:** {revert | retry | skip | escalate} — {specific instructions}
 - **Checkpoint:** `git commit -m "{message}"`
+- **Manifest:**
+  ```yaml
+  manifest:
+    expected_paths:
+      - {path}
+    min_file_count: 1
+    commit_message_pattern: "^feat\\(scope\\):"
+    bash_syntax_check: []
+    forbidden_paths: []
+    must_contain: []
+  ```

 ### Step 2: {description}

-{same structure as Step 1}
+{same structure as Step 1, including Manifest block}

 ## Exit Condition

@ -78,3 +142,14 @@ introduced. This section bridges sessions — it's the "baton" in a relay race.}
 - **Steps from plan:** {step N}–{step M}
 - **Estimated complexity:** {low | medium | high}
 - **Model recommendation:** {opus | sonnet} — {rationale}
+
+## Recovery Metadata
+
+*This section is populated only when this session spec was generated by the
+ultraexecute-local Phase 7.6 recovery dispatcher. Omit for normal sessions.*
+
+- **Recovery of:** `{original session spec path}`
+- **Recovery depth:** {1 | 2}
+- **Missing steps (reason for recovery):** {step numbers + drift summary}
+- **Entry condition override:** {e.g., "previous partial session committed at {sha}"}
+- **Parent progress file:** `{path to .ultraexecute-progress-*.json}`