feat(ultraplan-local): v1.7.0 — self-verifying plan chain

Wave 1 of a 6-session parallel build revealed three failure modes:
(1) hallucinated completion (status=completed after 2/5 steps, last
tool call was an arbitrary file review), (2) fail-late bash (3/6
sessions had push blocked inside sub-agent sandbox after all work
was done), (3) no objective verification (plans were prose).

v1.7 closes all three by making the plan an executable contract.

Per-step YAML manifest (expected_paths, commit_message_pattern,
bash_syntax_check, forbidden_paths, must_contain) is the objective
completion predicate. Plan-critic dimension 10 (Manifest quality)
is a hard gate. Session decomposer propagates manifests verbatim
and emits an obligatory Step 0 pre-flight (git push --dry-run,
exit 77 sentinel) in every session spec.

ultraexecute-local gets Phase 7.5 (independent manifest audit from
git log + filesystem, ignoring agent bookkeeping) and Phase 7.6
(bounded recovery dispatch, recovery_depth ≤ 2). Hard Rule 17
forbids marking a step passed without manifest verification. Hard
Rule 18 forbids ending on an arbitrary tool call before reporting.

Division of labor is made explicit:
- /ultraresearch-local gathers context (no build decisions)
- /ultraplan-local produces an executable contract (manifests,
  plan-critic gate)
- /ultraexecute-local executes disciplined (does NOT compensate
  for weak plans — escalates)

Code complete. Docs partial (Arbeidsdeling table + manifest section
added to plugin + marketplace READMEs). Verification tests
(10-sequence) pending — see REMEMBER.md.

Backward compat: v1.6 plans without plan_version marker get
legacy mode with synthesized manifests and legacy_plan: true in
progress file. Plan-critic emits advisory, not block.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Kjell Tore Guttormsen 2026-04-12 07:38:16 +02:00
commit d1befac35a
11 changed files with 651 additions and 27 deletions

View file

@ -59,15 +59,17 @@ Key commands: `/config-audit posture`, `/config-audit discover`, `/config-audit
---
### [Ultra {research | plan | execute} - local](plugins/ultraplan-local/) `v1.6.0`
### [Ultra {research | plan | execute} - local](plugins/ultraplan-local/) `v1.7.0`
Deep research, implementation planning, and autonomous execution with specialized agent swarms, adversarial review, and failure recovery.
Deep research, implementation planning, and self-verifying execution with specialized agent swarms, adversarial review, and failure recovery.
Three commands, one pipeline: research first, then plan, then execute.
Three commands, one pipeline with clear division of labor:
- **`/ultraresearch-local`** — Deep multi-source research with triangulation: 5 local agents + 4 external agents + Gemini bridge, producing structured briefs with confidence ratings
- **`/ultraplan-local`** — Interview, 6-8 specialized agents explore the codebase in parallel, adversarial review by plan-critic and scope-guardian. Accepts research briefs via `--research`
- **`/ultraexecute-local`** — Step-by-step implementation with git checkpoints, automatic failure recovery, and parallel session decomposition
- **`/ultraresearch-local`** — Gather context. Deep multi-source research with triangulation: 5 local agents + 4 external agents + Gemini bridge, producing structured briefs with confidence ratings. Makes no build decisions.
- **`/ultraplan-local`** — Transform intent into an executable contract. Per-step YAML manifests (`expected_paths`, `commit_message_pattern`, `bash_syntax_check`). Plan-critic is a hard gate on manifest quality. Accepts research briefs via `--research`.
- **`/ultraexecute-local`** — Execute the contract disciplined. Manifest-based verification, independent Phase 7.5 audit from git log + filesystem (ignores agent bookkeeping), Phase 7.6 bounded recovery dispatch for missing steps. Step 0 pre-flight catches sandbox push-denial before any work.
v1.7 self-verifying chain: a step may not be marked `completed` unless its manifest verifies. The executor's last tool call before reporting must be a manifest check, not an arbitrary file review — preventing hallucinated completion.
Defense-in-depth security: plugin hooks block destructive commands and sensitive path writes, prompt-level denylist works in headless sessions, pre-execution plan scan catches dangerous commands before they run, scoped `--allowedTools` replaces `--dangerously-skip-permissions` in parallel sessions.