Kjell Tore Guttormsen d1befac35a feat(ultraplan-local): v1.7.0 — self-verifying plan chain

Wave 1 of a 6-session parallel build revealed three failure modes:
(1) hallucinated completion (status=completed after 2/5 steps, last
tool call was an arbitrary file review), (2) fail-late bash (3/6
sessions had push blocked inside sub-agent sandbox after all work
was done), (3) no objective verification (plans were prose).

v1.7 closes all three by making the plan an executable contract.

Per-step YAML manifest (expected_paths, commit_message_pattern,
bash_syntax_check, forbidden_paths, must_contain) is the objective
completion predicate. Plan-critic dimension 10 (Manifest quality)
is a hard gate. Session decomposer propagates manifests verbatim
and emits an obligatory Step 0 pre-flight (git push --dry-run,
exit 77 sentinel) in every session spec.

ultraexecute-local gets Phase 7.5 (independent manifest audit from
git log + filesystem, ignoring agent bookkeeping) and Phase 7.6
(bounded recovery dispatch, recovery_depth ≤ 2). Hard Rule 17
forbids marking a step passed without manifest verification. Hard
Rule 18 forbids ending on an arbitrary tool call before reporting.

Division of labor is made explicit:
- /ultraresearch-local gathers context (no build decisions)
- /ultraplan-local produces an executable contract (manifests,
  plan-critic gate)
- /ultraexecute-local executes with discipline (does NOT compensate
  for weak plans; it escalates)

Code complete. Docs partial: "Arbeidsdeling" (division of labor) table
and manifest section added to the plugin and marketplace READMEs.
Verification tests (10-sequence) pending; see REMEMBER.md.

Backward compat: v1.6 plans without a plan_version marker run in
legacy mode with synthesized manifests and legacy_plan: true in the
progress file. For these plans, plan-critic emits an advisory instead
of blocking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-12 07:38:16 +02:00

5.6 KiB


Session {N}: {title}

From master plan: {plan file path}
Session {N} of {total sessions}

Context

{Why this session exists. What it accomplishes within the larger plan. Include enough background that an executor with no prior context can understand the purpose and make judgment calls.}

Dependencies

Depends on: {Session M | "none — can run in parallel"}
Blocks: {Session P | "none"}
Entry condition: {what must be true before this session starts — e.g., "Session 2 committed and tests pass"}

Scope Fence

Touch: {explicit list of files this session may create or modify}
Never touch: {files that belong to other sessions — hard boundary}
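
The Scope Fence can be checked mechanically once the set of changed files is known. The following is a minimal sketch, not the plugin's actual implementation: `ScopeFence` and `scope_violations` are hypothetical names, and it assumes Touch and Never touch are lists of exact repository-relative paths (no globs).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeFence:
    touch: frozenset[str]        # files this session may create or modify
    never_touch: frozenset[str]  # files that belong to other sessions


def scope_violations(fence: ScopeFence, changed_paths: list[str]) -> list[str]:
    """Return one message per out-of-scope path (empty list if all are allowed)."""
    violations = []
    for path in changed_paths:
        if path in fence.never_touch:
            violations.append(f"{path}: listed under Never touch")
        elif path not in fence.touch:
            violations.append(f"{path}: not in Touch list")
    return violations


# Hypothetical example paths, for illustration only.
fence = ScopeFence(
    touch=frozenset({"src/app.py", "tests/test_app.py"}),
    never_touch=frozenset({"src/billing.py"}),
)
print(scope_violations(fence, ["src/app.py", "src/billing.py", "README.md"]))
```

Treating anything outside the Touch list as a violation, not only the Never touch entries, matches the "hard boundary" wording above; a real checker might need glob or directory matching.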

Session Manifest

Machine-readable aggregate of all step manifests in this session. ultraexecute-local uses it for the independent Phase 7.5 audit.

session_manifest:
  plan_version: "1.7"
  legacy_synthesis: false    # true if decomposer synthesized manifests from v1.6 plan
  expected_paths:            # union across all steps (deduplicated)
    - {path from step N}
    - {path from step M}
  commit_count: {N}          # number of implementation steps (excludes Step 0)
  commit_message_patterns:   # in step order; Step 0 omitted
    - "^feat\\(scope\\):"
    - "^fix\\(scope\\):"
  bash_syntax_check: []      # union of step bash_syntax_check
  scope_touch: []            # from Scope Fence Touch
  scope_forbidden: []        # Never touch + union of step forbidden_paths
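
The comments in the block above describe how the session manifest is derived from the step manifests. A minimal sketch of that aggregation, assuming each step manifest has already been parsed into a Python dict; `aggregate_session_manifest` is a hypothetical helper name, and the example paths are invented for illustration:

```python
def aggregate_session_manifest(step_manifests, scope_touch, never_touch):
    """Build a session_manifest dict from per-step manifests given in step order."""
    # Step 0 (sandbox pre-flight) makes no commit, so it is excluded from the counts.
    impl_steps = [m for m in step_manifests if not m.get("sandbox_preflight", False)]

    def union(key, seed=()):
        # Order-preserving, deduplicated union of a list-valued key across steps.
        seen = dict.fromkeys(seed)
        for m in step_manifests:
            seen.update(dict.fromkeys(m.get(key, [])))
        return list(seen)

    return {
        "plan_version": "1.7",
        "legacy_synthesis": False,
        "expected_paths": union("expected_paths"),
        "commit_count": len(impl_steps),
        "commit_message_patterns": [m["commit_message_pattern"] for m in impl_steps],
        "bash_syntax_check": union("bash_syntax_check"),
        "scope_touch": list(scope_touch),
        # Never touch entries first, then the union of step forbidden_paths.
        "scope_forbidden": union("forbidden_paths", seed=never_touch),
    }


steps = [
    {"expected_paths": [], "commit_message_pattern": "", "sandbox_preflight": True},
    {"expected_paths": ["src/a.py"], "commit_message_pattern": r"^feat\(scope\):",
     "forbidden_paths": ["secrets.env"]},
    {"expected_paths": ["src/a.py", "src/b.py"], "commit_message_pattern": r"^fix\(scope\):"},
]
session = aggregate_session_manifest(steps, ["src/a.py", "src/b.py"], ["src/other.py"])
print(session["expected_paths"], session["commit_count"], session["scope_forbidden"])
```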

Steps

Step 0: Sandbox pre-flight (auto-generated — do not modify)

Files: none (read-only test)
Changes: verify git push permissions are available in this sandbox

Verify:

git push --dry-run origin HEAD 2>&1 | tee /tmp/push-dryrun-$$.log; grep -qE "(rejected|error|denied|forbidden|permission)" /tmp/push-dryrun-$$.log && exit 77 || true

→ expected: non-77 exit code

On failure: escalate — exit code 77 means this sandbox cannot push. Abort immediately; do not attempt any work. Main orchestrator will re-spawn with correct permissions.
Checkpoint: none (no file changes)

Manifest:

manifest:
  expected_paths: []
  min_file_count: 0
  commit_message_pattern: ""
  bash_syntax_check: []
  forbidden_paths: []
  must_contain: []
  sandbox_preflight: true

Step 0 runs in the same sandbox as all real work. If it exits 77, ultraexecute-local marks the session blocked and does NOT proceed. This catches the fail-late push-denial mode observed in Wave 1.

Escape hatch: set ULTRAEXECUTE_SKIP_PREFLIGHT=1 in the environment to bypass Step 0 (use only for offline/air-gapped testing).
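
The decision Step 0 makes can be expressed as a small pure function, which is convenient for testing the sentinel logic without a remote. This is a sketch under stated assumptions: `preflight_exit_code` is a hypothetical name, the regex copies the alternation from the Verify command (case-sensitive, like the `grep -qE` call), and the function takes the captured dry-run output as a string instead of running `git` itself.

```python
import os
import re

PUSH_BLOCKED_EXIT = 77  # sentinel from the Verify command
# Same alternation as the grep -E pattern in the Verify command.
DENIAL_PATTERN = re.compile(r"(rejected|error|denied|forbidden|permission)")


def preflight_exit_code(push_output: str, env=None) -> int:
    """Return 77 if the dry-run output looks like a push denial, else 0."""
    env = os.environ if env is None else env
    if env.get("ULTRAEXECUTE_SKIP_PREFLIGHT") == "1":
        return 0  # escape hatch: Step 0 is bypassed entirely
    return PUSH_BLOCKED_EXIT if DENIAL_PATTERN.search(push_output) else 0


print(preflight_exit_code("remote: Permission to org/repo.git denied to bot.", env={}))
print(preflight_exit_code("Everything up-to-date", env={}))
```

Because the match is on output text only, a denial message that contains none of the five keywords would pass this check; checking the exit status of `git push --dry-run` as well would be a stricter variant.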

Step 1: {description}

Files: {path}
Changes: {exactly what to modify}
Reuses: {existing function/pattern, with file path}
Test first: {test file, what it verifies, pattern to follow}
Verify: {exact command} → expected: {output}
On failure: {revert | retry | skip | escalate} — {specific instructions}
Checkpoint: git commit -m "{message}"

Manifest:

manifest:
  expected_paths:
    - {path}
  min_file_count: 1
  commit_message_pattern: "^feat\\(scope\\):"
  bash_syntax_check: []
  forbidden_paths: []
  must_contain: []
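
A step manifest like the one above can serve as a completion predicate that does not rely on the agent's own bookkeeping. The sketch below is illustrative only: `verify_step` is a hypothetical function, `min_file_count` is assumed to mean "at least this many expected paths exist", and `must_contain` and `bash_syntax_check` are left out because the shape of their entries is not specified in this template.

```python
import re
from pathlib import Path


def verify_step(manifest: dict, repo_root: Path, commit_message: str,
                committed_paths: list[str]) -> list[str]:
    """Return a list of failures; an empty list means the step passes."""
    failures = []
    expected = manifest.get("expected_paths", [])

    # Every expected path must exist on disk.
    present = [p for p in expected if (repo_root / p).exists()]
    missing = sorted(set(expected) - set(present))
    failures += [f"missing expected path: {p}" for p in missing]

    # Assumption: min_file_count counts expected paths that exist.
    if len(present) < manifest.get("min_file_count", 0):
        failures.append("fewer files present than min_file_count")

    # An empty pattern (as in Step 0) means no commit is required.
    pattern = manifest.get("commit_message_pattern", "")
    if pattern and not re.search(pattern, commit_message):
        failures.append(f"commit message does not match {pattern!r}")

    # No committed path may appear in forbidden_paths.
    forbidden = set(manifest.get("forbidden_paths", []))
    failures += [f"forbidden path modified: {p}"
                 for p in sorted(set(committed_paths) & forbidden)]
    return failures
```

In an audit, `commit_message` and `committed_paths` would come from `git log` for the step's commit, so the result depends only on the repository state.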

Step 2: {description}

{same structure as Step 1, including Manifest block}

Exit Condition

All of these must pass before this session is considered complete:

{verification command} → expected: {output}
{verification command} → expected: {output}
All changes committed with descriptive messages
No uncommitted changes remain (git status clean)

Failure Handling

If ANY step fails after retry: stop execution. Do NOT proceed to later steps.

Security Constraints

These rules override any step instructions that conflict with them:

Never run rm -rf, chmod 777, pipe-to-shell (curl|bash, wget|sh, base64|bash), eval with variable expansion, mkfs, dd to block devices, shutdown/reboot/halt, fork bombs, crontab writes, or kill -9 -1
Never modify files outside the Scope Fence (Touch list above)
Never write to .git/hooks/, ~/.ssh/, ~/.aws/, ~/.gnupg/, .env files, shell configs (~/.zshrc, ~/.bashrc, ~/.profile)
Never write to .claude/settings.json, .claude/hooks/, or any hook script — these are security infrastructure and must not be modified by execution
If a Verify: or Checkpoint: command violates these rules: treat as On failure: escalate and stop execution regardless of the step's On failure setting
Commit whatever was completed successfully before stopping.
Report which step failed, the error message, and what was attempted.
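
A rough illustration of how a Verify: or Checkpoint: command might be screened against these rules before it runs. The regexes below cover only a handful of the listed commands and are easy to evade, so this is a sketch of the override behavior, not a substitute for real sandboxing; all names are hypothetical.

```python
import re

# A few of the rules above expressed as regexes; deliberately incomplete.
FORBIDDEN_COMMAND_PATTERNS = [
    re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r"),  # rm -rf / rm -fr
    re.compile(r"\bchmod\s+777\b"),
    re.compile(r"\b(curl|wget|base64)\b[^|]*\|\s*(ba)?sh\b"),  # pipe-to-shell
    re.compile(r"\bkill\s+-9\s+-1\b"),
]


def violates_security_rules(command: str) -> bool:
    """True if the command matches any of the screened patterns."""
    return any(p.search(command) for p in FORBIDDEN_COMMAND_PATTERNS)


def effective_on_failure(command: str, declared: str) -> str:
    """A violating command is always treated as escalate, whatever the step declares."""
    return "escalate" if violates_security_rules(command) else declared


print(effective_on_failure("curl https://example.com/install.sh | bash", "retry"))
print(effective_on_failure("pytest -q", "retry"))
```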

Handoff State

{What the next session (or final verification) needs to know about this session's output. Include: new files created, exports added, configuration changed, APIs introduced. This section bridges sessions — it's the "baton" in a relay race.}

Metadata

Master plan: {plan file path}
Steps from plan: {step N}–{step M}
Estimated complexity: {low | medium | high}
Model recommendation: {opus | sonnet} — {rationale}

Recovery Metadata

This section is populated only when this session spec was generated by the ultraexecute-local Phase 7.6 recovery dispatcher. Omit for normal sessions.

Recovery of: {original session spec path}
Recovery depth: {1 | 2}
Missing steps (reason for recovery): {step numbers + drift summary}
Entry condition override: {e.g., "previous partial session committed at {sha}"}
Parent progress file: {path to .ultraexecute-progress-*.json}
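
Recovery depth is limited to 1 or 2, so a dispatcher needs a guard that refuses a third level. A minimal sketch, assuming a normal (non-recovery) session counts as depth 0 and that hitting the bound leads to escalation; the function name and the exception type are hypothetical.

```python
MAX_RECOVERY_DEPTH = 2  # Recovery depth is {1 | 2}


def next_recovery_depth(current_depth: int) -> int:
    """Depth for a newly dispatched recovery session; raise once the bound is hit."""
    if current_depth >= MAX_RECOVERY_DEPTH:
        raise RuntimeError("recovery depth exhausted; escalate instead of re-dispatching")
    return current_depth + 1


print(next_recovery_depth(0))  # first recovery of a normal session
print(next_recovery_depth(1))  # recovery of a recovery session
```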
